marekrei / convertvec
Convert word2vec vectors between binary and plain text format
License: Apache License 2.0
Hi,
When I tried to convert a binary file to text, I got the following error:
"Segmentation fault (core dumped)".
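The report does not say what caused the crash, so the following is only a general defensive measure, not a diagnosis. A word2vec vectors file starts with a header line of the form "<vocab_size> <dim>", and a reader can check that header before allocating any memory. The helper name read_w2v_header is hypothetical and is not part of convertvec.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from convertvec.
 * Reads the "<vocab_size> <dim>\n" header found at the start of a word2vec
 * vectors file. Returns 0 on success, -1 if the header is missing,
 * malformed, or contains non-positive sizes. */
int read_w2v_header(FILE *f, long long *vocab_size, long long *dim) {
    if (f == NULL || vocab_size == NULL || dim == NULL) {
        return -1;
    }
    /* Both numbers must parse; otherwise the file is not in the expected format. */
    if (fscanf(f, "%lld %lld", vocab_size, dim) != 2) {
        return -1;
    }
    /* Reject sizes that would lead to zero-length or negative allocations. */
    if (*vocab_size <= 0 || *dim <= 0) {
        return -1;
    }
    /* The header line ends with a newline before the first word begins. */
    if (fgetc(f) != '\n') {
        return -1;
    }
    return 0;
}
```

A caller would open the file with fopen, stop with an error message if the pointer is NULL or the helper returns -1, and only then allocate buffers sized from the two values.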
Nice little tool. When I first tried to build it with make on my Mac (10.10), I received this error: fatal error: 'malloc.h' file not found. I changed the line #include <malloc.h> to #include <malloc/malloc.h>.
Figured this might help someone.
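A sketch of how the include could be made conditional, so the same source builds on both macOS and Linux. It relies on the fact that malloc, calloc, and free are declared in <stdlib.h> by the C standard, so the platform-specific header is only needed for non-standard extras. The helper alloc_vector is hypothetical and only there to make the example complete.

```c
#include <stdlib.h>  /* standard home of malloc, calloc and free */

/* Pull in the platform-specific allocator header only where it exists. */
#if defined(__APPLE__)
#include <malloc/malloc.h>  /* macOS location */
#elif defined(__linux__) || defined(_WIN32)
#include <malloc.h>         /* glibc and Windows C runtimes */
#endif

/* Hypothetical helper: allocate a zero-initialised vector of `dim` floats.
 * Returns NULL if the allocation fails; the caller must free() the result. */
float *alloc_vector(size_t dim) {
    return (float *)calloc(dim, sizeof(float));
}
```

If the code only calls the standard allocation functions, including <stdlib.h> alone may be enough and the conditional block can be dropped.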
I want to collect some useful word2vec-related projects in one place and make them easy for everyone to use.
This is my repo:
https://github.com/papower1/word2vec_addons
If you would like to take part, please reply.
Regards.
Hi,
I think I found an issue with the code.
1.) Run the word2vec/demo-word.sh script, which generates vectors.bin.
2.) Run the following (a modified version of the command in that script, with the output mode set to text instead of binary):
time ./word2vec -train text8 -output vectors.txt -cbow 1 -size 200 -window 8 -negative 25 -hs 0 -sample 1e-4 -threads 20 -binary 0 -iter 15
3.) Convert the original binary into text with convertvec
./convertvec bin2txt vectors.bin vectors-converted.txt
4.) Diff vectors.txt against vectors-converted.txt. The files are not the same. Both contain the same words, and each word has a 200-dimensional vector, but the values do not match the vectors that word2vec generated in text mode.
Am I crazy? :)
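One possible explanation, offered as an assumption rather than a diagnosis: steps 1 and 2 are two separate training runs, and multi-threaded word2vec training is generally not reproducible from run to run, so the two models would differ even with a perfect converter. A text diff is also a strict test, because the decimal formatting of floats must match exactly. A numeric comparison with a tolerance, applied to vectors that come from the same training run, is a gentler check. The sketch below shows such a comparison; the function name vectors_close and the tolerance value are hypothetical choices, not part of convertvec.

```c
#include <stddef.h>

/* Hypothetical helper, not part of convertvec or word2vec.
 * Returns 1 if every pair of components differs by at most `tol`,
 * and 0 otherwise (including when a difference is NaN). */
int vectors_close(const float *a, const float *b, size_t n, float tol) {
    for (size_t i = 0; i < n; i++) {
        float diff = a[i] - b[i];
        if (diff < 0.0f) {
            diff = -diff;  /* absolute value without needing <math.h> */
        }
        /* Written as !(diff <= tol) so that NaN counts as a mismatch. */
        if (!(diff <= tol)) {
            return 0;
        }
    }
    return 1;
}
```

To test the converter alone, one option is to train once, convert vectors.bin to text and back to binary, and compare the vectors of each word in the result with those in the original using a check like this one.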
How can I write the output in UTF-8?
I have a binary file with special characters, and the output does not encode them correctly!