uwnetlab / nate
Research at the intersection of natural language processing and social network analysis.
Home Page: http://networkslab.org/
License: MIT License
** Network SimSummary Visualizations
** Semantic Visualizations
** Semantic Network
This should be a method of the socnet, simnet, and docnet objects, as all three have similarity analyses.
Sasha also wants to be able to produce a histogram of all similarity scores calculated. This could be an argument for the method.
A method that returns basic summary information about a Nate network object.
To include in output:
number of nodes
number of edges
number of isolates
number of subcomponents
number of subcomponents with more than one node
average size of subcomponents
average size of subcomponents that are not the giant component
modularity score
number of communities
directed / undirected
n-mode
if directed: average in degree, out degree
if undirected: average degree
list of node attributes
list of edge attributes
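As a rough illustration of the summary described above, here is a minimal, standard-library-only sketch. It is not nate's implementation: the function name `summarize_network` and the plain node/edge-list input are assumptions made for the example. It covers only the size, component, and degree figures; modularity, community counts, n-mode, and attribute listings are left out because they depend on the underlying graph library.

```python
from collections import deque


def summarize_network(nodes, edges, directed=False):
    """Return basic summary statistics for a graph (hypothetical helper).

    nodes: iterable of hashable node IDs
    edges: iterable of (source, target) pairs
    """
    nodes = list(nodes)
    edges = list(edges)

    # Undirected neighbour sets, used for isolates and (weakly) connected components.
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Breadth-first search from every unvisited node to collect component sizes.
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in neighbours[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)

    sizes.sort(reverse=True)
    non_giant = sizes[1:]  # everything except the largest component
    summary = {
        "nodes": len(nodes),
        "edges": len(edges),
        "isolates": sum(1 for n in nodes if not neighbours[n]),
        "components": len(sizes),
        "components_larger_than_1": sum(1 for s in sizes if s > 1),
        "mean_component_size": sum(sizes) / len(sizes) if sizes else 0.0,
        "mean_non_giant_component_size": (
            sum(non_giant) / len(non_giant) if non_giant else 0.0
        ),
        "directed": directed,
    }
    if directed:
        # Each edge adds one in-degree and one out-degree, so the two means are equal.
        mean = len(edges) / len(nodes) if nodes else 0.0
        summary["mean_in_degree"] = summary["mean_out_degree"] = mean
    else:
        summary["mean_degree"] = 2 * len(edges) / len(nodes) if nodes else 0.0
    return summary


example = summarize_network(
    nodes=["a", "b", "c", "d", "e", "f"],
    edges=[("a", "b"), ("b", "c"), ("d", "e")],
)
```

Returning a dictionary keeps the result easy to print or to load into a dataframe. A real method would presumably read these values from the network object itself rather than from raw lists.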
Currently, all of the 'columns' passed into a nate object are stored in self.data, which can be very memory-inefficient.
If we keep the special columns (text, time, ID) separate, and package all of the non-special columns into the self.data object, we'll get the best of both worlds (easy compartmentalization in namespaces, memory efficiency).
Not critical unless we encounter memory bottlenecks.
Currently, they return ranges or individual records (based on a slice or integer subscript, respectively). It's more useful to return a random sample.
John said:
"It would be amazing to have, for example, a tidy dataframe with a datetime index for doing some simple time series stuff. Or the time stamps in a column that we can make a datetime object."
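One way the request might look in practice is sketched below with pandas. The record layout (`id`, `time`, `text`) and the sample values are invented for illustration and do not reflect how nate actually stores its data.

```python
import pandas as pd

# Hypothetical records standing in for whatever a nate object holds.
records = [
    {"id": 1, "time": "2020-01-01 09:00", "text": "first post"},
    {"id": 2, "time": "2020-01-01 17:30", "text": "second post"},
    {"id": 3, "time": "2020-01-03 12:15", "text": "third post"},
]

df = pd.DataFrame(records)
df["time"] = pd.to_datetime(df["time"])   # parse strings into datetime64 values
df = df.set_index("time").sort_index()    # tidy frame with a DatetimeIndex

# Simple time series: number of records per calendar day (empty days count as 0).
daily_counts = df.resample("D").size()
```

If the timestamps should stay in an ordinary column instead, skipping the `set_index` call leaves a `time` column of datetime values that can be grouped or filtered directly.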
Quick visualization of degree distribution
Ur bad at code get better
histogram of edgewise shared partners
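For reference, the edgewise shared partner count of an edge is the number of nodes adjacent to both of its endpoints. The sketch below tabulates those counts for an undirected edge list using only the standard library; the function name `esp_histogram` is an assumption for the example, and plotting is left to whichever charting library the project prefers.

```python
from collections import Counter


def esp_histogram(edges):
    """Map 'number of shared partners' -> 'number of edges' (undirected, hypothetical helper)."""
    edges = list(edges)

    # Build undirected neighbour sets.
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    counts = Counter()
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            continue  # skip self-loops and duplicate edges
        seen.add(key)
        # Shared partners: nodes adjacent to both endpoints, excluding the endpoints.
        shared = (neighbours[u] & neighbours[v]) - {u, v}
        counts[len(shared)] += 1
    return dict(sorted(counts.items()))


# A triangle a-b-c with a pendant edge c-d.
histogram = esp_histogram([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```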
Currently, it's necessary to run nate.preprocess to get the spaCy data necessary to instantiate a nate pipeline.
First, pipelines should elegantly check to see if the necessary preprocessing has been completed. This should be simple and is a logical endpoint.
For further user friendliness, though, it would be prudent to enable each of the pipeline-returning functions to also run preprocessing using defaults that will configure the preprocess function to meet their requirements.
Low priority.
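The check-then-fall-back behaviour described above could be sketched roughly as follows. Everything here is hypothetical: `Corpus`, `post_nlp`, `default_preprocess`, and `require_preprocessing` are stand-ins invented for the example, and the whitespace tokenizer merely takes the place of the real spaCy step.

```python
class Corpus:
    """Hypothetical stand-in for a nate object; not the library's real class."""

    def __init__(self, texts):
        self.texts = list(texts)
        self.post_nlp = None  # filled in once preprocessing has run


def default_preprocess(corpus):
    """Placeholder for the real spaCy step: lower-cased whitespace tokens."""
    corpus.post_nlp = [text.lower().split() for text in corpus.texts]


def require_preprocessing(corpus, auto=True):
    """Return the corpus, making sure preprocessing output is available."""
    if corpus.post_nlp is not None:
        return corpus  # already preprocessed, nothing to do
    if not auto:
        raise RuntimeError(
            "Preprocessing has not been run; run the preprocess step first "
            "or pass auto=True to use the defaults."
        )
    default_preprocess(corpus)  # fall back to defaults that satisfy this pipeline
    return corpus


corpus = require_preprocessing(Corpus(["Dogs bite men", "Men bite back"]))
```

Each pipeline-returning function could call such a guard first, so that users who skipped preprocessing get either sensible defaults or a clear message.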
Alluvial flow diagram for temporal networks
When working with end-point classes in the pipeline (e.g. svo_bursts), users are often called upon to submit a filtering term (either an SVO or a token pair) for visualization steps.
Currently, if users submit an SVO or token pair that isn't present in the svo_bursts class, an uninformative KeyError will result. Consider replacing these with informative errors.
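A minimal sketch of what a more informative error might look like is given below. The helper `lookup_bursts` and the plain dictionary it searches are assumptions for illustration; the real svo_bursts class may store its terms differently. The suggestion step uses `difflib.get_close_matches` from the standard library.

```python
import difflib


def lookup_bursts(bursts, term):
    """Fetch burst data for a term, raising a descriptive KeyError if it is absent."""
    try:
        return bursts[term]
    except KeyError:
        # Compare string forms so that tuples such as SVOs can be fuzzy-matched.
        known = [str(key) for key in bursts]
        close = difflib.get_close_matches(str(term), known, n=3)
        hint = f" Did you mean one of: {', '.join(close)}?" if close else ""
        raise KeyError(
            f"{term!r} was not found among the {len(bursts)} available terms.{hint}"
        ) from None


# Hypothetical burst data keyed by SVO triples.
bursts = {("dog", "bite", "man"): [(3, 7)], ("man", "bite", "dog"): [(9, 12)]}

try:
    lookup_bursts(bursts, ("dog", "bites", "man"))
except KeyError as err:
    message = err.args[0]
```

Keeping the exception type as KeyError means existing code that catches it continues to work, while the message now tells the user what was searched and what is available.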
Histogram of MGD
Right now, if a user runs a nate pipeline from the command line, they won't know what's going on: there's no outward indication of what nate is working on, or even if it is still working.
We should include a judicious number of informative print statements designed to let users know what nate is up to.
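One lightweight way to add such messages is a small context manager that announces each step and reports how long it took. This is only a sketch: the name `step`, the `[nate]` prefix, and the example label are assumptions, not part of the library.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def step(label, stream=sys.stderr):
    """Print a start message, run the wrapped block, then print the elapsed time."""
    print(f"[nate] {label}...", file=stream, flush=True)
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    # Reached only if the wrapped block finished without raising.
    print(f"[nate] {label} done in {elapsed:.1f}s", file=stream, flush=True)


with step("Detecting bursts"):
    total = sum(range(1000))  # placeholder for real pipeline work
```

Writing to stderr keeps progress messages out of any output that users pipe elsewhere, and the `stream` argument makes the helper easy to silence or redirect.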
edge width by weight OR similarity score
with community detection