pythia's Introduction
For more info contact [email protected]
pythia's People
Forkers
erogol malab aurora1625 kiran9k blade091shenwei fyhe8082 agasani chrisemoulton nihaofuyue0617 tyler-po
pythia's Issues
Improve the algorithm for sentiment.
Currently we use Alchemy. It is too slow and the results are rubbish. Perhaps use our own trained classifiers with nltk.
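A rough idea of what "our own trained classifier" could look like. This is only a sketch: it is a tiny Naive Bayes written in plain Python as a stand-in (nltk ships a NaiveBayesClassifier that would be the library route), and the class name, the labels and the training sentences are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Tiny multinomial Naive Bayes with add-one smoothing (hypothetical helper)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Lowercase and keep alphabetic tokens only.
        return re.findall(r"[a-z']+", text.lower())

    def train(self, labelled_texts):
        # labelled_texts is an iterable of (text, label) pairs.
        for text, label in labelled_texts:
            self.doc_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        # Pick the label with the highest log posterior.
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, far too small for real use.
TRAINING = [
    ("i love this great phone", "pos"),
    ("what a great happy day", "pos"),
    ("i hate this awful phone", "neg"),
    ("what an awful sad day", "neg"),
]
classifier = NaiveBayesSentiment()
classifier.train(TRAINING)
```

Real tweets would need a much larger labelled set and probably better features (negation, emoticons), but classifying locally would at least avoid the slow remote call.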
Implement centroid based summarizer
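One possible shape for this, as a sketch only: build a term-count vector per tweet, sum them into a centroid, and return the tweets closest to that centroid by cosine similarity. Plain term counts are an assumption here (a real version would likely use tf-idf), and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid_summary(documents, top_n=1):
    """Return the top_n documents closest to the centroid of the collection."""
    vectors = [Counter(tokenize(d)) for d in documents]
    centroid = Counter()
    for vector in vectors:
        centroid.update(vector)  # summed counts; scaling does not affect cosine
    scored = sorted(
        zip(documents, vectors),
        key=lambda pair: cosine(pair[1], centroid),
        reverse=True,
    )
    return [doc for doc, _ in scored[:top_n]]
```

Calling centroid_summary(tweets_in_cluster, top_n=3) would then give three representative tweets for a cluster.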
Problem with jpeg support in PIL
How to fix:
- Uninstall PIL
- sudo apt-get install libjpeg-dev libpng-dev zlib1g-dev liblcms1-dev python-dev
- $ cd jpeg-8c
  $ ./configure --enable-shared
  $ make
  $ make install
- Install PIL again
More info:
http://ygamretuta.me/2011/05/27/install-pil-in-ubuntu-natty-python27-virtualen/
EDIT:
A MUCH BETTER APPROACH IS
1/ Call 'pip install -I pil --no-install' to download and unpack the PIL source into your build directory;
2/ Get into your build directory and edit setup.py;
3/ Find the line that says 'add_directory(library_dirs, "/usr/lib")' (line 214 here);
4/ Add the line 'add_directory(library_dirs, "/usr/lib/i386-linux-gnu")' afterwards;
5/ Call 'pip install -I pil --no-download' to finish the installation.
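Steps 3 and 4 above can also be scripted instead of edited by hand. The helper below is a hypothetical sketch, not part of PIL or pip: it takes the text of setup.py and returns it with the extra add_directory line inserted after the "/usr/lib" one. It assumes the anchor line appears exactly as quoted in step 3.

```python
def add_library_dir(setup_source, new_dir="/usr/lib/i386-linux-gnu"):
    """Return setup.py source with an extra add_directory() line inserted.

    The new line goes right after the existing "/usr/lib" line and copies
    its indentation. The source is returned unchanged if the new line is
    already present or the "/usr/lib" line cannot be found.
    """
    anchor = 'add_directory(library_dirs, "/usr/lib")'
    new_call = f'add_directory(library_dirs, "{new_dir}")'
    if new_call in setup_source:
        return setup_source
    patched = []
    for line in setup_source.splitlines(keepends=True):
        patched.append(line)
        if anchor in line:
            indent = line[: len(line) - len(line.lstrip())]
            # Add a line break first if the anchor is the last, unterminated line.
            newline = "" if line.endswith("\n") else "\n"
            patched.append(f"{newline}{indent}{new_call}\n")
    return "".join(patched)
```

To use it, read setup.py from the build directory, pass the text through add_library_dir, and write the result back before running step 5.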
Implement user classification
Visualize topic growth
Implement evaluation framework
Consider adding porter stemming to improve clustering
Next session
--> I added a threshold value in online clustering. Find out what the best value is
--> Add tests for online clustering. Suspicion of documents getting lost.
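A sketch of how the threshold and the "documents getting lost" check could fit together. This is not the project's clustering code: the class, the cosine measure and the default threshold of 0.5 are all assumptions. A document joins the most similar cluster if the similarity reaches the threshold, otherwise it starts a new one, and document_count() gives a number a test can compare against how many documents went in.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class OnlineClusterer:
    """Single-pass clustering with a similarity threshold (hypothetical sketch)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.centroids = []  # one Counter of summed term counts per cluster
        self.members = []    # one list of document ids per cluster

    def add(self, doc_id, tokens):
        """Assign a document to a cluster and return the cluster index."""
        vector = Counter(tokens)
        best_index, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vector, centroid)
            if sim > best_sim:
                best_index, best_sim = i, sim
        if best_index is None or best_sim < self.threshold:
            # Nothing similar enough: the document starts its own cluster.
            self.centroids.append(vector)
            self.members.append([doc_id])
            return len(self.centroids) - 1
        self.centroids[best_index].update(vector)
        self.members[best_index].append(doc_id)
        return best_index

    def document_count(self):
        """Total number of documents held across all clusters."""
        return sum(len(ids) for ids in self.members)
```

A test along these lines feeds N documents and asserts document_count() == N. Running the same documents with different threshold values and counting the clusters is one way to look for a good value.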
Implement breadth first search to find all followers
See book example
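A minimal breadth-first sketch, assuming some function that returns the followers of a user. get_followers here is a hypothetical callable standing in for whatever Twitter API wrapper ends up being used; it is not a real library call.

```python
from collections import deque


def collect_followers(start_user, get_followers, max_depth=None):
    """Breadth-first walk over the follower graph.

    get_followers(user) must return an iterable of that user's followers.
    Returns every user reached, in the order they were discovered,
    excluding start_user itself.
    """
    visited = {start_user}
    queue = deque([(start_user, 0)])
    found = []
    while queue:
        user, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand users beyond the depth limit
        for follower in get_followers(user):
            if follower not in visited:
                visited.add(follower)
                found.append(follower)
                queue.append((follower, depth + 1))
    return found
```

The visited set matters because follower graphs contain cycles. With a real API the max_depth limit (and rate limiting inside get_followers) is what keeps the crawl bounded.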
Document experiments
What's next with visualizations
First of all, after implementing LexRank we will be able to show either the top tweets or a text summary of what happened. Put this summary either on the left-hand side of the timeline or right below it.
Then we have to create the main timeline with the most important events. Coding this will be challenging but you are awesome so don't worry! And you have D3.js by your side.
Then prettify the timeline (shades and stuff)
Text analyser should return only a feature vector; the rest of its methods must be refactored into a Cluster class
Allow text analyser to ignore Arabic
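"Ignore" could mean dropping Arabic tokens or skipping mostly-Arabic tweets altogether, so this sketch covers both. It assumes that matching the Arabic and Arabic Supplement Unicode blocks is good enough; the function names and the 0.5 ratio are hypothetical.

```python
import re

# Arabic (U+0600-U+06FF) and Arabic Supplement (U+0750-U+077F) blocks.
ARABIC_CHARS = re.compile(r"[\u0600-\u06FF\u0750-\u077F]")


def strip_arabic_tokens(tokens):
    """Drop every token that contains at least one Arabic character."""
    return [token for token in tokens if not ARABIC_CHARS.search(token)]


def is_mostly_arabic(text, ratio=0.5):
    """True when at least `ratio` of the letters in text are Arabic."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = sum(1 for ch in letters if ARABIC_CHARS.match(ch))
    return arabic / len(letters) >= ratio
```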
Make urls appear in term vector
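One way to do this is to let the tokenizer match URLs before ordinary words, so a link survives as a single term instead of being split on punctuation. A sketch, with a hypothetical term_vector function and a deliberately simple URL pattern:

```python
import re
from collections import Counter

# URLs are tried first so they are not broken up by the word pattern.
TOKEN_PATTERN = re.compile(r"https?://\S+|[a-z0-9']+", re.IGNORECASE)


def term_vector(text):
    """Return term counts for text, keeping each URL as one term."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(text):
        token = match.group(0)
        if token.lower().startswith(("http://", "https://")):
            # Keep the URL's case (paths can be case-sensitive) but trim
            # punctuation that trails it in running text.
            tokens.append(token.rstrip(".,;:!?)"))
        else:
            tokens.append(token.lower())
    return Counter(tokens)
```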
Try to create a feature vector to be used only with Jaccard
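Jaccard only looks at which terms are present, so the feature vector can simply be a set of terms. A minimal sketch with hypothetical function names:

```python
import re


def feature_set(text):
    """Binary features: the set of distinct lowercase tokens in text."""
    return frozenset(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two collections of terms."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty documents
    return len(set_a & set_b) / len(set_a | set_b)
```

For example, jaccard(feature_set("storm hits coast"), feature_set("storm hits town")) shares two of four distinct terms and gives 0.5.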
Implement LexRank
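A simplified sketch of the LexRank idea: link sentences whose cosine similarity reaches a threshold, then run a PageRank-style power iteration over that graph and keep the highest-scoring sentences. It uses raw term counts instead of tf-idf weights, and the threshold, damping and iteration count are assumed defaults, so treat it as a starting point rather than a faithful implementation.

```python
import math
import re
from collections import Counter


def _cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def lexrank_scores(sentences, threshold=0.3, damping=0.85, iterations=50):
    """Return one centrality score per sentence."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(re.findall(r"[a-z0-9']+", s.lower())) for s in sentences]
    # Unweighted graph: link two different sentences when similar enough.
    linked = [
        [i != j and _cosine(vectors[i], vectors[j]) >= threshold for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in linked]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence gets a teleport share plus votes from its neighbours.
        # A sentence with no neighbours only ever receives the teleport share.
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] / degree[j] for j in range(n) if linked[j][i])
            for i in range(n)
        ]
    return scores


def summarize(sentences, top_n=3, **kwargs):
    """Return the top_n highest-scoring sentences in their original order."""
    scores = lexrank_scores(sentences, **kwargs)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_n])]
```

With tweets as the "sentences", summarize(tweets, top_n=5) would give the top tweets for an event.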
Add Twitter OAuth keys to a config file. They must not be hardcoded
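A possible approach using the standard library configparser. The file name twitter.cfg, the [twitter] section and the four key names are assumptions for illustration; the config file itself should stay out of version control.

```python
import configparser

# Hypothetical key names, following the usual OAuth 1.0a naming.
REQUIRED_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def load_twitter_credentials(path="twitter.cfg"):
    """Read the OAuth keys from the [twitter] section of an INI-style file.

    Raises FileNotFoundError if the file cannot be read and KeyError if
    the section or any required key is missing.
    """
    parser = configparser.ConfigParser()
    if not parser.read(path):
        raise FileNotFoundError(f"config file not found: {path}")
    section = parser["twitter"]
    return {key: section[key] for key in REQUIRED_KEYS}
```

The matching file would contain a [twitter] section with one "name = value" line per key.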
Index should have methods to remove terms, not docs
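A sketch of what term removal could look like on a simple inverted index. The class and method names are hypothetical and not taken from the project's Index; the point is that removing a term drops its postings while the set of known documents is left alone.

```python
from collections import defaultdict


class InvertedIndex:
    """Minimal inverted index: term -> set of document ids (hypothetical)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.doc_ids = set()  # tracked separately so term removal keeps docs

    def add_document(self, doc_id, tokens):
        self.doc_ids.add(doc_id)
        for token in tokens:
            self.postings[token].add(doc_id)

    def remove_term(self, term):
        """Drop a term and return the ids of the documents that contained it."""
        return self.postings.pop(term, set())

    def remove_terms(self, terms):
        """Drop several terms at once, e.g. a stop-word list."""
        for term in terms:
            self.remove_term(term)

    def terms(self):
        return set(self.postings)
```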