corex's People

Contributors

bryant1410, gregversteeg, sbagri

corex's Issues

Many 'print' statements in the code raise SyntaxError

One of the failing messages, for line 264 of corex.py, is copied below. I am using Python 3.5.1.

print "Warning: Data matrix values should be consecutive integers starting with 0,1,..."
^
SyntaxError: Missing parentheses in call to 'print'
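
A minimal sketch of how the failing line could be written so that it runs on both Python 2 and Python 3. The helper name `warn_if_not_consecutive` and the check it performs are assumptions made for illustration; they are not taken from corex.py, and only the warning text comes from the message above.

```python
from __future__ import print_function  # makes print a function on Python 2 as well

import numpy as np


def warn_if_not_consecutive(column):
    """Return True if the distinct values are exactly 0, 1, ..., k.

    Hypothetical helper: prints the warning quoted in the issue otherwise.
    """
    values = np.unique(column)
    ok = bool(np.array_equal(values, np.arange(len(values))))
    if not ok:
        # print(...) with parentheses is valid syntax on Python 2 and 3.
        print("Warning: Data matrix values should be consecutive integers starting with 0,1,...")
    return ok
```

Calling `warn_if_not_consecutive([0, 1, 2, 1])` passes silently, while `warn_if_not_consecutive([1, 3])` prints the warning.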

Continuous variables?

Hi,

Is there any progress on CorEx with continuous variables? It would be great, as most of my data is continuous.

Thanks for the great work!
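
Until continuous inputs are supported, one possible workaround is to bin each continuous column into a small number of integer codes before fitting. The sketch below is an assumption-laden illustration, not the author's method: `discretize_columns` and `n_bins` are hypothetical names, quantile binning is just one choice, and binning discards information.

```python
import numpy as np


def discretize_columns(X, n_bins=3):
    """Map each continuous column to integer codes 0..n_bins-1 via quantile edges."""
    X = np.asarray(X, dtype=float)
    codes = np.empty(X.shape, dtype=int)
    # Interior quantiles only, e.g. the 1/3 and 2/3 quantiles for n_bins=3.
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], qs)
        # digitize returns the index of the bin each value falls into.
        codes[:, j] = np.digitize(X[:, j], edges)
    return codes
```

If a column has many tied values, some codes may go unused, so the result is worth checking against the "consecutive integers starting with 0" requirement before use.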

About the paper

Hi, CorEx is great work! I have recently been reading the paper, and I am having trouble understanding the optimization part in Sec. A.
The original objective is like this:
[image 01]
And because of:
[image 02]
Then:
[image 03]
Why can the red-boxed part be extracted? I think $p(x)p(y|x)$ is different from $p(x_i)p(y|x_i)$. Can you offer more details? Thanks!
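
One possible reading, offered as a sketch rather than as the paper's own argument: the two products are indeed different distributions, but they give the same expectation for any term that depends on $x$ only through a single coordinate $x_i$. Here $f$ is a hypothetical stand-in for such a term and $x_{\setminus i}$ denotes all coordinates of $x$ except $x_i$; neither symbol is taken from the paper.

```latex
\begin{aligned}
\sum_{x}\sum_{y} p(x)\,p(y \mid x)\, f(x_i, y)
  &= \sum_{x_i}\sum_{y} f(x_i, y) \sum_{x_{\setminus i}} p(x)\,p(y \mid x) \\
  &= \sum_{x_i}\sum_{y} f(x_i, y)\, p(x_i, y) \\
  &= \sum_{x_i}\sum_{y} p(x_i)\,p(y \mid x_i)\, f(x_i, y)
\end{aligned}
```

The middle step only uses the fact that summing the joint $p(x, y) = p(x)\,p(y \mid x)$ over the remaining coordinates yields the marginal $p(x_i, y)$. Whether the boxed term really has this form depends on the equations in the missing images.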

triangulation library missing

Working on a Mac with Anaconda. I did not have sfdp,
so I ran conda install graphviz.
Now I get the "triangulation library missing" error shown below.
Suggestions appreciated.

(base) katherinepaseman@Bills-MacBook-Pro-3 bio_corex % python vis_corex.py data/test_data.csv
vis_corex.py:719: DeprecationWarning: 'U' mode is deprecated
with open(filename, 'rU') as csvfile:
Time for first layer: 0.05
TC at layer 0 is: 2.078
TC at layer 1 is: -0.000
Groups in sorted_groups.txt
Pairwise plots among high TC variables in "relationships"
/opt/anaconda3/lib/python3.7/site-packages/seaborn/distributions.py:306: UserWarning: Dataset has 0 variance; skipping density estimate.
warnings.warn(msg, UserWarning)
/opt/anaconda3/lib/python3.7/site-packages/seaborn/distributions.py:306: UserWarning: Dataset has 0 variance; skipping density estimate.
warnings.warn(msg, UserWarning)
/opt/anaconda3/lib/python3.7/site-packages/seaborn/distributions.py:306: UserWarning: Dataset has 0 variance; skipping density estimate.
warnings.warn(msg, UserWarning)
/opt/anaconda3/lib/python3.7/site-packages/seaborn/distributions.py:306: UserWarning: Dataset has 0 variance; skipping density estimate.
warnings.warn(msg, UserWarning)
/opt/anaconda3/lib/python3.7/site-packages/seaborn/distributions.py:306: UserWarning: Dataset has 0 variance; skipping density estimate.
warnings.warn(msg, UserWarning)
/opt/anaconda3/lib/python3.7/site-packages/seaborn/distributions.py:306: UserWarning: Dataset has 0 variance; skipping density estimate.
warnings.warn(msg, UserWarning)
weight threshold is -0.000000 for graph with max of 12.000000 edges
non-isolated nodes,edges 6 4
Error: remove_overlap: Graphviz not built with triangulation library
Error: remove_overlap: Graphviz not built with triangulation library
non-isolated nodes,edges 6 4
Error: remove_overlap: Graphviz not built with triangulation library
Error: remove_overlap: Graphviz not built with triangulation library
(base) katherinepaseman@Bills-MacBook-Pro-3 bio_corex %
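
The log above suggests that sfdp is now installed but that this particular Graphviz build lacks the triangulation support used for overlap removal (GTS, as far as the Graphviz documentation describes it). The sketch below separates those two situations; `diagnose_graphviz` is a hypothetical helper, and it only inspects a path and captured stderr text without invoking Graphviz itself.

```python
import shutil

TRIANGULATION_MESSAGE = "Graphviz not built with triangulation library"


def diagnose_graphviz(sfdp_path, stderr_text):
    """Classify the problem from the sfdp location and captured stderr text."""
    if sfdp_path is None:
        return "sfdp not found on PATH"
    if TRIANGULATION_MESSAGE in stderr_text:
        return "sfdp found, but built without the triangulation library"
    return "no known problem detected"


# Example call: shutil.which returns None when the executable is not on PATH.
status = diagnose_graphviz(shutil.which("sfdp"), "")
```

If the second case is reported, a Graphviz package from a different channel or build may be needed; which one works on a given machine is not something this sketch can determine.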

MemoryError

Hi,
I'd like to apply your method to my research problem (computational systems biology).
I'm trying to run the fit method on my data (2D array, dimensions 30 x 13800) and keep getting the following error:

Warning: Data matrix values should be consecutive integers starting with 0,1,...
Traceback (most recent call last):
  File "projectrepo/green/corex_fernandez/test.py", line 17, in <module>
    lv31.fit(arx)
  File "/home/fewpills/projectrepo/green/corex_fernandez/corex.py", line 202, in fit
    self.fit_transform(X)
  File "/home/fewpills/projectrepo/green/corex_fernandez/corex.py", line 229, in fit_transform
    self.update_marginals(X_event, self.p_y_given_x)  # Eq. 8
  File "/home/fewpills/projectrepo/green/corex_fernandez/corex.py", line 289, in update_marginals
    self.log_marg = self.calculate_p_y_xi(X_event, p_y_given_x) - self.log_p_y
  File "/home/fewpills/projectrepo/green/corex_fernandez/corex.py", line 299, in calculate_p_y_xi
    pseudo_counts = 0.001 + np.dot(X_event, p_y_given_x).transpose((1,0,2))  # n_hidden, n_events, dim_hidden
MemoryError

I've tried running it as a script with both Python 2.7 and 3.4.
I'd be grateful for any suggestions.
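
A rough way to see how large the array on the failing line can become, based only on the shape comment in the traceback (`n_hidden, n_events, dim_hidden`). The assumption that `n_events` equals the number of variables times the number of distinct values per variable is mine, as are the function name and the example values of `n_hidden` and `dim_hidden`.

```python
def estimate_pseudo_counts_bytes(n_visible, dim_visible, n_hidden, dim_hidden, itemsize=8):
    """Estimate the size in bytes of an (n_hidden, n_events, dim_hidden) float array."""
    # Assumption: one "event" per (variable, value) pair.
    n_events = n_visible * dim_visible
    return n_hidden * n_events * dim_hidden * itemsize


# 13800 variables with binary values versus values ranging up to 999,
# using hypothetical settings n_hidden=10 and dim_hidden=2.
small = estimate_pseudo_counts_bytes(13800, 2, 10, 2)
large = estimate_pseudo_counts_bytes(13800, 1000, 10, 2)
```

Under these assumptions the estimate grows linearly with the number of distinct values, so the "consecutive integers" warning printed just before the traceback may be related: data that is not coded as 0, 1, ... could inflate `dim_visible` considerably.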

Mean of empty slice

I get the following warning when running your example with four patterns having five variables:

/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
warnings.warn("Mean of empty slice.", RuntimeWarning)

Wondering if this is serious or can be safely ignored.

And by the way, does this implementation support sparse matrices as input?
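
For context on the "Mean of empty slice" warning quoted above: NumPy emits it whenever a mean is taken over zero elements, and the result is NaN. The sketch below reproduces that behaviour in isolation and shows one hedged way to guard against it; `mean_or_default` is a hypothetical helper, and whether the warning is harmless inside CorEx is not established here.

```python
import warnings

import numpy as np


def mean_or_default(values, default=0.0):
    """Return the mean, or `default` when there is nothing to average."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return default
    return float(values.mean())


# Reproduce the warning: the mean of an empty array is NaN.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = np.mean(np.array([]))
```

A NaN produced this way propagates through later arithmetic, so it is worth checking where the empty slice arises rather than only silencing the warning.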

S&P500 data

Re the 2015 paper: I don't see the S&P 500 components in any data directory, and quantquote does not seem to have them. A pointer to a data source would be helpful.
Thanks.
