I'm a machine learning engineer.
- 🌱 I’m currently learning Rust, and using Go and Python in my daily work.
- 📫 How to reach me: twitter
- 😄 Pronouns: He/His
- ⚡ Fun fact: I enjoy playing guitar and have composed some original songs.
Pure Python implementation of machine learning algorithms
License: Apache License 2.0
Thank you for sharing.
However, in real development environments there are always missing values.
It seems that none of the algorithms handle missing values.
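One common way to deal with this before calling any of the algorithms is to impute missing entries. The following is only a minimal pure-Python sketch of column-mean imputation, not something the library provides: `impute_column_means` is a hypothetical helper, and it assumes missing entries are marked with `None` and that a column with no observed values should fall back to 0.0.

```python
from statistics import mean


def impute_column_means(X, missing=None):
    """Return a copy of X where each `missing` entry is replaced by
    the mean of the observed values in the same column.

    Hypothetical helper: assumes X is a non-empty 2d list and that
    missing values are marked with the `missing` sentinel (None).
    """
    n_cols = len(X[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in X if row[j] is not missing]
        # Assumption: a column with no observed values falls back to 0.0.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if value is missing else value
         for j, value in enumerate(row)]
        for row in X
    ]


X = [[1.0, None], [3.0, 4.0]]
X_filled = impute_column_means(X)
```

The filled list can then be passed to a model in place of the original data. Whether mean imputation is appropriate depends on the dataset.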
Hello, I have a question: at line 237 of https://github.com/tushushu/imylu/blob/master/imylu/tree/regression_tree.py, the depth+1 seems unnecessary.
n_elements = sum(map(len, ratings.values()))
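The number of elements is too large: it includes the entries that were given the default value 0 during your matrix computation.
As a result, the size of ratings becomes m*n, which makes the RMSE value too small.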
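The point raised above can be illustrated with a small sketch that divides the squared error by the number of observed ratings only. This is not the library's code. It assumes `ratings` is a dict mapping each user to a dict of item ratings, that a stored 0 is a placeholder rather than a real rating, and that `predict` is some callable returning a predicted rating. The names `rmse_observed` and `predict` are hypothetical.

```python
from math import sqrt


def rmse_observed(ratings, predict):
    """RMSE over observed ratings only.

    Assumptions (not taken from the library): `ratings` maps
    user -> {item: rating}, a rating of 0 is a placeholder for
    "not rated", and `predict(user, item)` returns a float.
    """
    squared_error = 0.0
    n_observed = 0
    for user, items in ratings.items():
        for item, rating in items.items():
            if rating == 0:
                # Skip placeholders so they do not inflate the denominator.
                continue
            squared_error += (rating - predict(user, item)) ** 2
            n_observed += 1
    return sqrt(squared_error / n_observed) if n_observed else 0.0


ratings = {1: {1: 0, 2: 4.0}, 2: {1: 2.0, 2: 0}}
print(rmse_observed(ratings, lambda user, item: 3.0))
```

If the placeholders were counted, the same squared error would be divided by 4 instead of 2, which gives a smaller RMSE.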
When I run the ALS example file, the prediction function is not executed; after ten iterations the window simply closes. What is going on?
def gen_data(low, high, n_rows, n_cols=None):
Returns:
list -- 1d or 2d list with int
But in imylu/imylu/utils/preprocessing.py, min_max_scale(X) expects X to be an ndarray.
See lines 51 to 53 of imylu/examples/kd_tree_example.py:
X = gen_data(low, high, n_rows, n_cols)
y = gen_data(low, high, n_rows)
Xi = gen_data(low, high, n_cols)
I think adding one line to gen_data() would fix this:
from random import randint  # assumed import, not shown in the excerpt
from numpy import array  # assumed import, needed for the added line

def gen_data(low, high, n_rows, n_cols=None):
    """Generate dataset randomly.
    Arguments:
        low {int} -- The minimum value of element generated.
        high {int} -- The maximum value of element generated.
        n_rows {int} -- Number of rows.
        n_cols {int} -- Number of columns.
    Returns:
        ndarray -- 1d or 2d array with int
    """
    if n_cols is None:
        # 1d case: a single row of random integers.
        ret = [randint(low, high) for _ in range(n_rows)]
    else:
        # 2d case: n_rows rows of n_cols random integers each.
        ret = [[randint(low, high) for _ in range(n_cols)]
               for _ in range(n_rows)]
    ret = array(ret)  # This is my addition.
    return ret
Maybe I overlooked something somewhere. Thanks!
In _get_split_mse, when computing the MSE, shouldn't split_sqr_sum[0] - split_sum[0] * split_avg[0] be split_sqr_sum[0] - split_avg[0] * split_avg[0]? After all, you are computing the square of the actual Y value minus the mean of Y.
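For reference, expanding the sum of squared deviations gives sum((y - avg)^2) = sum(y^2) - 2 * avg * sum(y) + n * avg^2, and because n * avg = sum(y), this reduces to sum(y^2) - sum(y) * avg. The sketch below checks that identity numerically for one side of a split. It does not reproduce the library's `_get_split_mse`; the function names are hypothetical, and the variables only mirror the `split_sqr_sum`, `split_sum` and `split_avg` terms quoted above.

```python
def sum_sq_dev_direct(y):
    """Sum of squared deviations from the mean, computed directly."""
    avg = sum(y) / len(y)
    return sum((value - avg) ** 2 for value in y)


def sum_sq_dev_shortcut(y):
    """Same quantity via sum(y^2) - sum(y) * avg."""
    sqr_sum = sum(value * value for value in y)  # plays the role of split_sqr_sum[0]
    total = sum(y)                               # plays the role of split_sum[0]
    avg = total / len(y)                         # plays the role of split_avg[0]
    return sqr_sum - total * avg


y = [1.0, 2.0, 4.0]
print(sum_sq_dev_direct(y), sum_sq_dev_shortcut(y))
```

Both functions agree up to floating-point rounding, so the form with split_sum[0] * split_avg[0] matches the sum of squared deviations, whereas subtracting split_avg[0] * split_avg[0] from a sum of squares generally does not.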
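Thank you for your article on KD-trees; it is the best one I could find in the Chinese-language community! I have a question: as the amount of training data grows, the nearest-neighbor search of the KD-tree in your code does not seem to be any faster than exhaustive search, and the complexity does not drop to O(log(N)). Where does the problem come from? https://zhuanlan.zhihu.com/p/45346117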
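Hello, and thank you for sharing your code. It is neatly written and very well suited for learning.
I found your code while searching for ALS, and I would like to ask you a question.
When the two matrices produced by the ALS factorization are multiplied together, many negative values appear. How should these negative values be interpreted? Here is an example: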
userId,movieId,rating,timestamp
1,2,4.5,964982703
1,3,2.0,964981247
2,1,4.0,964982224
2,3,3.5,964983815
3,2,5.0,964982931
3,4,2.0,964982400
4,2,3.5,964980868
4,3,4.0,964982176
4,4,1.0,964984041
This corresponds to the rating matrix
{{0, 4.5, 2, 0},
{4.0, 0, 3.5, 0},
{0, 5.0, 0, 2.0},
{0, 3.5, 4.0, 1.0}}
The result computed with your code is as follows
model = ALS()
model.fit(X, k=2, max_iter=10)
## ===>
user_matrix =
[[0.7570282336382094, 0.03844973056986965, 0.8341276635923996, 0.7159946957782793],
[0.2987763898981603, 1.2236650397020115, -0.09214284357731845, 0.661286003351013]]
item_matrix =
[[-0.9352925817978953, 5.779445419172115, 1.301780265376408, 1.4251036233547738],
[2.7179214250950294, -0.33085416701657533, 3.272648111651669, -0.2344265756021138]]
user_matrix.transpose x item_matrix =
[[0.10400786028337483, 4.276351943480331, 1.9632744030892986, 1.0088025527850888],
[3.2898636807717296, -0.1826365582074784, 4.054678181939848, -0.23206475458923115],
[-1.0305904247583555, 4.851281148112146, 0.7842998282335908, 1.2103190870120526],
[1.1276588690551161, 3.919263034868889, 3.0962241552067207, 0.865343621997239]]
I also computed the same result with a Mathematica program
R = {{0, 4.5, 2, 0}, {4.0, 0, 3.5, 0}, {0, 5.0, 0, 2.0}, {0, 3.5, 4.0,1.0}};
X = RandomReal[1, {4, 2}]
(* {{0.242376,0.511595}, {0.661005,0.105123}, {0.00835057,0.137917}, {0.980405,0.334949}} *)
Do[
(* compute Y *)
Xt = Transpose[X];
Y = Transpose[LinearSolve[Xt.X, Xt.R]];
(* compute X *)
Yt = Transpose[Y];
X = Transpose[LinearSolve[Yt.Y, Yt.Transpose[R]]],
{i, 10}
]
It is easy to see that the product of the two matrices contains negative values. How should this be understood? Thanks.
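For readers who want to reproduce the Mathematica loop in Python, here is a rough NumPy port. It is a sketch under the same assumptions as that loop, not the library's ALS class: zeros are treated as ordinary observed values, there is no regularization, and `als_dense` is a hypothetical name. Each update is a plain unconstrained least-squares solve, and nothing in such a solve forces the factors or their product to be non-negative, so negative entries can appear.

```python
import numpy as np


def als_dense(R, k=2, n_iter=10, seed=0):
    """Alternating least squares on a dense matrix, so that R ~ X @ Y.T.

    Hypothetical port of the Mathematica loop: zeros in R are fitted
    like any other value, and there is no regularization or
    non-negativity constraint.
    """
    rng = np.random.default_rng(seed)
    X = rng.random((R.shape[0], k))  # random start, like RandomReal[1, {4, 2}]
    for _ in range(n_iter):
        # Fix X, solve the normal equations (X^T X) Y^T = X^T R for Y.
        Y = np.linalg.solve(X.T @ X, X.T @ R).T
        # Fix Y, solve the normal equations (Y^T Y) X^T = Y^T R^T for X.
        X = np.linalg.solve(Y.T @ Y, Y.T @ R.T).T
    return X, Y


R = np.array([[0, 4.5, 2, 0],
              [4.0, 0, 3.5, 0],
              [0, 5.0, 0, 2.0],
              [0, 3.5, 4.0, 1.0]])
X, Y = als_dense(R, k=2, n_iter=10)
print(X @ Y.T)  # rank-2 approximation of R; entries are not constrained in sign
```

If non-negative predictions are required, one option is to clip the product to the valid rating range afterwards; another is to use a factorization that constrains the factors to be non-negative.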