xixiaoyao / cs224n-winter-together
An Open Course Platform for Stanford CS224n (2020 Winter)
Home Page: https://mp.weixin.qq.com/s/GsnhifWkd_lh88d3---4RQ
License: Apache License 2.0
Thanks to group member @风雪夜归人语
https://www.cnblogs.com/peghoty/p/3857839.html
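Is there an intuitive explanation for predicting the context from the center word? With CBOW, predicting the center word from its context is easy to picture and understand, but I cannot come up with an intuitive picture for the skip-gram model. Any ideas or references?
Recommended: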
https://zhuanlan.zhihu.com/p/59011576
https://zhuanlan.zhihu.com/p/68502016
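One way to picture the difference is to look at the training examples each model builds from the same window. The sketch below is only an illustration, not code from the course: the helper names `skipgram_pairs` and `cbow_examples`, the toy sentence, and the window size are all assumptions. Skip-gram turns every (center, context) pair into its own example, which can be read as asking "given this word, which words tend to appear near it?", while CBOW builds one example per position from the whole context.

```python
def context_indices(n_tokens, i, window):
    """Indices within `window` of position i, excluding i itself."""
    lo = max(0, i - window)
    hi = min(n_tokens, i + window + 1)
    return [j for j in range(lo, hi) if j != i]


def skipgram_pairs(tokens, window):
    """One (center, context_word) example per neighbour of each center word."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in context_indices(len(tokens), i, window):
            pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window):
    """One (context_words, center) example per position."""
    examples = []
    for i, center in enumerate(tokens):
        context = [tokens[j] for j in context_indices(len(tokens), i, window)]
        if context:
            examples.append((context, center))
    return examples


tokens = "the quick brown fox".split()  # toy sentence (assumption)
sg = skipgram_pairs(tokens, window=1)
cb = cbow_examples(tokens, window=1)
print(len(sg), sg[:3])
print(len(cb), cb[:2])
```

On this toy sentence skip-gram yields more examples than CBOW, because each center word is paired with every neighbour separately.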
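We hope this helps everyone follow CS224N, but even more we hope that everyone will, like the experts here, produce plenty of high-quality course notes. Everyone has a different perspective and runs into different problems, and all of that adds up to valuable experience. We look forward to lively discussion in the group and on GitHub so that we can all improve together!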
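Something odd: even though I never change the random seed, every from-scratch training run in A5 converges to a different result. In principle, once the same seed is set for torch, numpy, and random, the result should be identical. I hope someone can explain this.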
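One possible cause, not confirmed for A5, is that seeding the three generators does not by itself make GPU kernels deterministic. The sketch below is a common seeding helper written under that assumption; `seed_everything` is a hypothetical name, and the numpy and torch imports are optional so the snippet still runs where they are not installed.

```python
import random


def seed_everything(seed):
    """Seed Python, and numpy / torch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        # Silently ignored when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
        # Ask cuDNN for deterministic kernels and disable autotuning,
        # which may otherwise pick different algorithms between runs.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything(224)
first = random.random()
seed_everything(224)
second = random.random()
print(first == second)
```

Even with these settings, some operations may remain nondeterministic on GPU, so comparing a CPU-only run is a useful check.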
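While working on assignment 5 I ran into the problem described in the title. The main cause is that at inference time, after the word-level LSTM produces its prediction, the character-level LSTM is used, but its prediction for the very first word is already […]. Has anyone else seen this? I am not sure whether I made a mistake or whether this is simply how it behaves, since an example set as an assignment normally should not involve this much detail.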
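Since skip-gram predicts the context from the center word, it intuitively seems "harder" than CBOW, so it should need more data to converge well. Why, then, does skip-gram actually perform better on small corpora?
Reference: http://licstar.net/archives/620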
Could you please share the Stanford CS224n NLP lecture links, if possible? Thanks, @xixiaoyao
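The n-gram lecture notes say that a start and an end marker (<start>, <end>) must be added to every sentence when processing it. Take the following two sentences under a bigram model as an example: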
(1). <start> I am Sam <end>
(2). <start> Sam I am <end>
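I have three specific questions:
(1). For the end symbol <end>, the notes explain: "To make the bigram grammar a true probability distribution. Without an end-symbol, the sentence probabilities for all sentences of a given length would sum to one. This model would define an infinite set of probability distributions, with one distribution per sentence length." I do not quite follow this. Is there a more intuitive explanation or a reference?
(2). For the start symbol <start>, the notes say it is "to give us the bigram context of the first word." Does the start symbol not play a role in the probability distribution the way the end symbol does?
(3). For a general n-gram model, do we need to add n-1 start and end symbols at each end, or is one of each enough?
Any help would be greatly appreciated.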
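The quoted statement in question (1) can be checked numerically on the two example sentences. The sketch below is only an illustration under stated assumptions: the helper names (`train_bigram`, `sentence_prob`, `mass_of_length`) are hypothetical, estimates are plain maximum-likelihood counts without smoothing, and the per-length result relies on every word in this toy corpus having at least one observed successor.

```python
from collections import Counter, defaultdict
from itertools import product


def train_bigram(sentences, use_end):
    """Maximum-likelihood bigram probabilities P(cur | prev)."""
    counts = defaultdict(Counter)
    for s in sentences:
        tokens = ["<start>"] + s.split() + (["<end>"] if use_end else [])
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return {prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for prev, ctr in counts.items()}


def sentence_prob(model, words, use_end):
    """Product of bigram probabilities along the (padded) sentence."""
    tokens = ["<start>"] + list(words) + (["<end>"] if use_end else [])
    p = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        p *= model.get(prev, {}).get(cur, 0.0)
    return p


def mass_of_length(model, vocab, n, use_end):
    """Total probability assigned to all word sequences of length n."""
    return sum(sentence_prob(model, ws, use_end)
               for ws in product(vocab, repeat=n))


corpus = ["I am Sam", "Sam I am"]
vocab = ["I", "am", "Sam"]
no_end = train_bigram(corpus, use_end=False)
with_end = train_bigram(corpus, use_end=True)

# Without <end>: each sentence length gets its own distribution summing to 1.
per_length_no_end = [mass_of_length(no_end, vocab, n, False) for n in range(1, 7)]
# With <end>: the mass is shared across lengths and the running total approaches 1.
cumulative_with_end = sum(mass_of_length(with_end, vocab, n, True) for n in range(1, 10))
print(per_length_no_end)
print(cumulative_with_end)
```

Without <end>, the model never decides when a sentence stops, so the sentences of each length form a separate distribution. With <end>, stopping is itself a predicted event, so a single distribution covers sentences of all lengths.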
Thanks to group members for their support:
https://www.bilibili.com/video/av24132174?p=1
A lecture on HMMs, in Chinese, the most detailed tutorial
For decoding one sentence, the time complexity of the beam search stage should be k·step·|V|
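The count can be checked with a small beam search that tallies how many candidates it scores. This is a minimal sketch, assuming k is the beam size, step the number of decoding steps, and |V| the vocabulary size; `beam_search` and the toy model `toy_next_log_probs` are hypothetical, end-of-sentence handling is omitted, and the cost of the top-k selection itself is not counted.

```python
import heapq
import math


def beam_search(next_log_probs, vocab, k, max_steps):
    """Keep the k best prefixes per step; count every scored candidate."""
    beams = [(0.0, ())]  # (log-probability, prefix)
    scored = 0
    for _ in range(max_steps):
        candidates = []
        for score, prefix in beams:
            log_probs = next_log_probs(prefix)
            for w in vocab:
                candidates.append((score + log_probs[w], prefix + (w,)))
                scored += 1
        beams = heapq.nlargest(k, candidates)
    return beams, scored


def toy_next_log_probs(prefix):
    """Toy model (assumption): the same distribution at every step."""
    probs = {"a": 0.5, "b": 0.3, "c": 0.2}
    return {w: math.log(p) for w, p in probs.items()}


vocab = ["a", "b", "c"]
k, steps = 2, 4
beams, scored = beam_search(toy_next_log_probs, vocab, k, steps)
print(beams[0][1], scored, k * steps * len(vocab))
```

The first step expands a single empty prefix, so the exact count is |V| + (step - 1)·k·|V|, which is bounded by k·step·|V|.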
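Training iterations are very slow. Is there a way to speed them up?
Could we modify the objective function, as skip-gram does with negative sampling?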
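For reference, the skip-gram objectives the question alludes to can be written as follows, in the usual word2vec notation (an assumption here: $v_c$ is the center word vector, $u_o$ the outside word vector, and $u_1, \dots, u_K$ the vectors of $K$ sampled negative words). Whether the same trick carries over to the model in question depends on its objective, which the post does not state.

```latex
% Naive softmax loss: the normalizer sums over the whole vocabulary V.
J_{\text{softmax}}(v_c, o, U)
  = -\log \frac{\exp(u_o^\top v_c)}{\sum_{w \in V} \exp(u_w^\top v_c)}

% Negative-sampling loss: only K sampled words are touched per example.
J_{\text{neg}}(v_c, o, U)
  = -\log \sigma(u_o^\top v_c) - \sum_{k=1}^{K} \log \sigma(-u_k^\top v_c)
```

The first form costs $O(|V|)$ per training example, while the second costs $O(K)$ with $K \ll |V|$, which is where the speed-up comes from.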
Thanks to a group member for the Baidu Netdisk link; it contains the already-extracted glove.6B.200d.txt
https://pan.baidu.com/s/1TD7K059UVcxuPazWgtY8Dw