doc2vec's People

Contributors

hiyijian, tangbogreat

doc2vec's Issues

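Training method

Hello, I would like to use this tool on Ubuntu to train on the Chinese Wikipedia corpus. I have already downloaded the Chinese Wikipedia dump and converted it to plain text. What should I do next? Any guidance would be appreciated.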

PV-DBOW or PV-DM?

Is this implementation the distributed bag of words (PV-DBOW) model or the distributed memory (PV-DM) model?

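How do I get the vector for each doc?

Hello, I have a few questions:
After running make, I ran the train program. It finished without printing any feedback at all. Where are the vectors for each doc stored? If I want to find the words most similar to "苹果" (apple), how should I write the test code? Do I have to add it to train.cpp myself? That seems a bit awkward.

How are Chinese documents handled? What I get displayed is garbled text.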
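
For the nearest-word part of the question, one possible approach is to load the trained word vectors into memory and rank them by cosine similarity against the query word. The following is a minimal sketch under stated assumptions, not this project's API: the `Embeddings` map and the `cosine` and `nearest` functions are hypothetical names introduced here, and loading the vectors from the trained model is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container (word -> vector); not the project's own data structure.
using Embeddings = std::unordered_map<std::string, std::vector<float>>;
using Scored = std::pair<std::string, float>;

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
float cosine(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += static_cast<double>(a[i]) * b[i];
        norm_a += static_cast<double>(a[i]) * a[i];
        norm_b += static_cast<double>(b[i]) * b[i];
    }
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0f;
    return static_cast<float>(dot / (std::sqrt(norm_a) * std::sqrt(norm_b)));
}

// Returns up to k words most similar to `query`, best first.
// Returns an empty list when `query` is not in the vocabulary.
std::vector<Scored> nearest(const Embeddings& emb, const std::string& query,
                            std::size_t k) {
    std::vector<Scored> scored;
    auto it = emb.find(query);
    if (it == emb.end()) return scored;
    for (const auto& entry : emb) {
        if (entry.first == query) continue;  // skip the query word itself
        scored.emplace_back(entry.first, cosine(it->second, entry.second));
    }
    k = std::min(k, scored.size());
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(k),
                      scored.end(),
                      [](const Scored& x, const Scored& y) {
                          return x.second > y.second;
                      });
    scored.resize(k);
    return scored;
}
```

Calling `nearest(emb, "苹果", 10)` would return up to ten (word, similarity) pairs, assuming `emb` was filled with the model's word vectors and the query string uses the same encoding as the training corpus.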

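A few questions

Hello, I have a few questions. What does the first column of each line in the training corpus mean, for example _*23134? Is each document's text placed on a single line? And why is the output of similar documents not word-segmented?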
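
For reference, the `_*23134` token in the question looks like a per-document tag followed by the document's words. Below is a minimal sketch of reading and writing such lines. It assumes (this is not confirmed in the thread) one document per line, a tag in the first column, and whitespace-separated, already segmented tokens after it. `TaggedDoc`, `parse_line`, and `make_line` are hypothetical names, not part of the project.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Assumed line layout: "<tag> <token> <token> ...", one document per line.
struct TaggedDoc {
    std::string tag;                  // e.g. "_*23134"
    std::vector<std::string> tokens;  // pre-segmented words
};

// Splits a corpus line on whitespace: first field is the tag, the rest are tokens.
TaggedDoc parse_line(const std::string& line) {
    TaggedDoc doc;
    std::istringstream in(line);
    in >> doc.tag;
    std::string token;
    while (in >> token) doc.tokens.push_back(token);
    return doc;
}

// Joins the tag and tokens back into a single space-separated line.
std::string make_line(const TaggedDoc& doc) {
    assert(!doc.tag.empty());
    std::string line = doc.tag;
    for (const std::string& token : doc.tokens) {
        line += ' ';
        line += token;
    }
    return line;
}
```

Under this assumption, Chinese text would need to be run through a word segmenter first, so that each token in the line is a word rather than a whole sentence.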

doc2vec vs lda

Both reduce a document to a vector. How well does doc2vec actually perform in practice?

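Suggestions

A few suggestions:
1. Using this tool currently requires writing your own code. Wouldn't a command-line interface be better?
2. The makefile is not well written. Not everyone has the gtest library installed, and library locations differ from machine to machine, which causes build errors.
3. The usage documentation is too sparse. It should at least mention that the command line can be used. I suggest taking the usage docs of Facebook's and Google's open-source projects as a reference.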
