Comments (1)
Certainly, you can reformat the data however you want.
One thing I've found is that it's impractical to maintain downloadable releases of every format someone might need. It's also expensive: each separate download has to remain stored on a server for a long time so that links don't break. So when people want just the vectors, I provide them as the lowest common denominator, the word2vec/fastText format.
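Since the released vectors are plain text, reformatting them takes only a few lines. The following is a minimal sketch, assuming the usual word2vec/fastText text layout: a header line with the row count and dimension, then one line per term with space-separated values. The function names `read_word2vec_text` and `write_tsv`, the TSV target format, and the tiny sample data are all made up for illustration; none of them come from the conceptnet5 code.

```python
import io


def read_word2vec_text(stream):
    """Parse word2vec-style text into a dict mapping term -> list of floats.

    Assumes the first line is "<row count> <dimension>" and every later
    line is "<term> <v1> <v2> ... <vN>" separated by single spaces.
    """
    header = stream.readline().split()
    n_rows, n_dims = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        line = line.rstrip("\r\n ")
        if not line:
            continue  # tolerate a trailing blank line
        # Split from the right so the last n_dims fields are the numbers.
        parts = line.rsplit(" ", n_dims)
        if len(parts) != n_dims + 1:
            raise ValueError("unexpected number of fields: %r" % line)
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != n_rows:
        raise ValueError("header promised %d rows, found %d" % (n_rows, len(vectors)))
    return vectors


def write_tsv(vectors, stream):
    """Write the vectors as tab-separated text (a hypothetical target format)."""
    for term, values in vectors.items():
        stream.write(term + "\t" + "\t".join(repr(v) for v in values) + "\n")


# Made-up sample data in the assumed layout, not real Numberbatch values.
SAMPLE = "2 3\ncoffee_pot 0.1 0.2 0.3\ntea 0.0 -0.5 0.25\n"

vectors = read_word2vec_text(io.StringIO(SAMPLE))
out = io.StringIO()
write_tsv(vectors, out)
```

For a real download you would pass an open file (for example from `gzip.open(path, "rt", encoding="utf-8")`) instead of the `io.StringIO` sample. A plain dict of lists is fine for a one-off conversion, but it will use a lot of memory on the full vocabulary.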
If you use the vectors via the conceptnet5 repository, you'll be working with them in the efficient HDF5 format. (You'll also get the benefit of using the ConceptNet graph to extend the vocabulary, which you can't get from the vectors alone.) But there isn't yet a good tutorial on how to work with the data in this form.
The best place for questions that are not bug reports, by the way, is the Gitter chat: https://gitter.im/commonsense/conceptnet5
from conceptnet-numberbatch.