activation_sparsity's People

Contributors

adamskiij99, ndaultryball1, samuel-chlam

Forkers

adamskiij99

activation_sparsity's Issues

TO DO (coding tasks in italics) from Jared

  • The recursion function for hard and soft thresholding: Simplify the formulae so that they are easy to understand, avoiding having both $\phi$ and $\tilde\Phi$; make it as easy to interpret as possible, and remember you are writing for someone who wants to read this. Then plot this function, $q$ vs $V(q)$, for a few values of $(\sigma_w,\sigma_b,\tau)$.
  • Formula for $\chi$: Setting $\chi=1$ links $\sigma_w$ and $\sigma_b$, so at $\chi=1$ you can express $V(q)$ as a function $V(q; \tau,\sigma_w)$ alone and illustrate what this looks like. Plot it: say, fix $\sigma_w$ and make a surface plot with $x$ being $q$, $y$ being $\tau$, and $z$ being $V(q,\tau)$, for each of hard and soft thresholding, and on that surface draw the line of fixed points.
  • Prove that $\frac{d}{dq}V(q)$ at the fixed points is less than one, and consequently that the map has a stable fixed point. Compute the fixed points, and plot the fixed point $q^*$ as a function of $\sigma_w$ and $\tau$: say, one-dimensional plots of $q^*$ as a function of $\tau$ for a few values of $\sigma_w$, then $q^*$ as a function of $\sigma_w$ for a few values of $\tau$. Think about how to give a reader insight into what you computed.
  • The correlation map may be harder to compute, but it should be computed, as it is an important part of the story. Michael Murray is part of the team and has code that can do this. Write to him and ask for the code; if you have trouble using it, ask him for guidance on adapting it to the new activations.
  • You can also repeat the above for FAT ReLU. There is a lot one can do here to build this up, and it would be good to get started on these things.
  • Great to see the loss function for the network. Be sure to explain what actually is being done here: this is a paper about initialisation, so how were the networks initialised, and what was $\tau$? Write this with all of the parameters explained so that a reader could reproduce what was done. The experiment should be conducted with $\chi=1$ and a few values of $\tau$ and $\sigma_w$. In addition to showing the loss function, you should show the training and test accuracy. Once you have code for this kind of experiment, run it for a few different choices of network depth and show a table/plot of the final/asymptotic training and test accuracy as a function of $\tau$ and/or the fraction of nonzeros in the hidden layers. For a fixed $\sigma_w$ and $\tau$, show the fraction of nonzeros in the hidden layers as a function of training time, and include a few values of $\tau$.
  • Once you can get good results for soft and hard thresholding, one needs to compare with other methods. Repeat the experiments for FAT ReLU, and for $\ell_1$ regularisation with a more traditional nonlinear activation such as ReLU or hard tanh. Be sure to initialise them as in the edge-of-chaos theory. Include curves for these methods alongside your results, in the hope of showing an improvement over their approaches. Some thought will need to go into deciding what experiments to do here. Mostly one will aim to have their approaches give a similar fractional sparsity as yours (for some given $\tau$), determine the parameters in their approach that achieve this, and then plot things like training and test accuracy, final test accuracy, etc.
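The recursion-map items above can be prototyped quickly. Below is a minimal sketch, assuming the standard mean-field length map $V(q) = \sigma_w^2\,\mathbb{E}_{z\sim\mathcal{N}(0,1)}[\phi(\sqrt{q}\,z)^2] + \sigma_b^2$ and the usual definition $\chi = \sigma_w^2\,\mathbb{E}[\phi'(\sqrt{q}\,z)^2]$; all function names here are illustrative, not taken from the repository.

```python
import math
import numpy as np

def soft_threshold(x, tau):
    # Soft thresholding: shrink each entry towards zero by tau.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def hard_threshold(x, tau):
    # Hard thresholding: zero out entries with |x| <= tau.
    return np.where(np.abs(x) > tau, x, 0.0)

def V(q, phi, sigma_w, sigma_b, tau):
    # Length map V(q) = sigma_w^2 E_{z~N(0,1)}[phi(sqrt(q) z)^2] + sigma_b^2,
    # with the Gaussian expectation evaluated by Gauss-Hermite quadrature.
    z, w = np.polynomial.hermite_e.hermegauss(80)  # nodes for weight exp(-z^2/2)
    w = w / np.sqrt(2.0 * np.pi)                   # normalise to the N(0,1) density
    return sigma_w**2 * np.sum(w * phi(np.sqrt(q) * z, tau) ** 2) + sigma_b**2

def chi_soft(q, sigma_w, tau):
    # For soft thresholding phi'(x) = 1{|x| > tau}, so
    # chi = sigma_w^2 P(|sqrt(q) z| > tau) = sigma_w^2 erfc(tau / sqrt(2 q)).
    return sigma_w**2 * math.erfc(tau / math.sqrt(2.0 * q))

def fixed_point(phi, sigma_w, sigma_b, tau, q0=1.0, n_iter=500):
    # Iterate q <- V(q); the iteration converges when |V'(q*)| < 1.
    q = q0
    for _ in range(n_iter):
        q = V(q, phi, sigma_w, sigma_b, tau)
    return q
```

Plotting `V(q, ...)` over a grid of `q` reproduces the first bullet, and scanning `chi_soft` over $(\sigma_w, \tau)$ at the fixed point locates the $\chi=1$ curve for the second.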
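For the experiment items, a short numpy sketch (hypothetical names; the real experiments would use a training framework) of the quantity the last two bullets ask to track: propagate one input through a randomly initialised network at a given $(\sigma_w, \sigma_b, \tau)$ and record the fraction of nonzero hidden units per layer.

```python
import numpy as np

def forward_sparsity(depth=20, width=500, sigma_w=0.9, sigma_b=0.5,
                     tau=0.5, seed=0):
    # Propagate a random input through a randomly initialised network with a
    # soft-thresholding activation, recording the fraction of nonzero units
    # in each hidden layer (the sparsity the TODO asks to report).
    rng = np.random.default_rng(seed)
    soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
    h = rng.standard_normal(width)
    fractions = []
    for _ in range(depth):
        # Mean-field scaling: W_ij ~ N(0, sigma_w^2 / width), b_i ~ N(0, sigma_b^2).
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        b = rng.normal(0.0, sigma_b, size=width)
        h = soft(W @ h + b)
        fractions.append(float(np.mean(h != 0.0)))
    return fractions
```

Sweeping `tau` (and `sigma_w` along the $\chi=1$ curve) and plotting these fractions against depth gives the "fraction of nonzeros in the hidden layers" panels before any training is involved.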
