btel / spikesort
Spike sorting library implemented in Python/NumPy/PyTables
Home Page: http://spike-sort.readthedocs.org
License: Other
Spike sorting library implemented in Python/NumPy/PyTables
----------------------------------------------------------

Project website: spikesort.org

Requirements:

* Python >= 2.6
* PyTables
* NumPy
* matplotlib

Optional:

* scikits.learn -- clustering algorithms
* neurotools -- spike train analysis

Test dependencies:

* all the above
* hdf5-tools

To see the library in action, see the examples folder.
The names of the features required by the components are currently hard-coded, like so:
class SpikeDetector(base.Component):
    """Detect spikes with alignment"""
    waveform_src = base.RequiredFeature("SignalSource",
                                        base.HasAttributes("signal"))
which imposes some limitations on how those components may be interconnected. For example, using a different filter for spike extraction is not possible without inheriting from the current SpikeExtractor and modifying the code.
One possible way to overcome this limitation without changing the API:
class SpikeDetector(base.Component):
    """Detect spikes with alignment"""
    def __init__(self, ..., waveform_src="SignalSource"):
        self.waveform_src = base.RequiredFeature(waveform_src,
                                                 base.HasAttributes("signal"))
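A self-contained sketch of the idea, using a minimal stand-in for the feature broker (the dictionary, property, and component names below are illustrative, not the actual spike_beans API):

```python
# minimal stand-in for a feature broker; illustrative only
features = {"SignalSource": "raw signal",
            "FilteredSource": "filtered signal"}

class SpikeDetector(object):
    def __init__(self, waveform_src="SignalSource"):
        # the required feature name is now a constructor argument, so an
        # alternative source can be plugged in without subclassing
        self._waveform_src_name = waveform_src

    @property
    def waveform_src(self):
        # resolve the feature by name at access time
        return features[self._waveform_src_name]

print(SpikeDetector().waveform_src)                               # raw signal
print(SpikeDetector(waveform_src="FilteredSource").waveform_src)  # filtered signal
```

With this shape, swapping the filter only requires passing a different feature name at construction time.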
Extracellular recordings are stored in a 2-D array, where the first dimension indexes the samples and the second the channels (contacts). However, from the point of view of optimisation it is better to store the fastest-changing index in the second dimension (C ordering). Rewrite IO, tests and extract to use this ordering.
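A quick numpy illustration of why this helps (shapes below are arbitrary): with the default C ordering, a single channel's trace is only one contiguous block of memory when samples are the last axis:

```python
import numpy as np

cur = np.zeros((10000, 4))   # current layout: (samples, channels)
new = np.zeros((4, 10000))   # proposed layout: (channels, samples)

# per-channel operations (filtering, extraction) walk along one channel;
# only the proposed layout makes that walk contiguous in memory
print(cur[:, 0].flags['C_CONTIGUOUS'])  # False
print(new[0, :].flags['C_CONTIGUOUS'])  # True
```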
The API changes we agreed on are:

* the f_filter argument in GenericSource is not used anymore
* filtering moved from core.extract.py to core.filters.py
Just like "+" and "-" allow scaling of the Y-axis, it would be great if e.g., "Shift-+" and "Shift--" scaled the X-axis. That way one wouldn't have to restart the GUI every time one wanted to see a different time scale.
highlighted spikes should come up in the foreground
I often find myself spending quite some time scrolling around in the SpikeBrowser to locate a cell which has few spikes. It would be nice to have a kind of selective browsing. One possible way is to have an additional attribute, something like:

>>> browser.browse_label
'all'
>>> browser.browse_label = 1

Pressing the "next" button would then scroll only through the cells with label=1.
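A sketch of how the selection could work internally (the helper below is hypothetical, not existing code): the "next"/"prev" buttons would step only through spike indices whose label matches browse_label:

```python
import numpy as np

def browsable_indices(labels, browse_label='all'):
    """Indices the "next"/"prev" buttons should step through."""
    labels = np.asarray(labels)
    if browse_label == 'all':
        return np.arange(len(labels))
    # only spikes carrying the selected cluster label
    return np.nonzero(labels == browse_label)[0]

print(browsable_indices([1, 2, 1, 3, 1], browse_label=1))  # [0 2 4]
```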
On a clean Debian system the unit tests complain about a missing module: patterns. Remove the import.
Fix spike_sort.core.extract.remove_doubles().
The Dashboard can sometimes throw an exception in components.py, line 665:

UnboundLocalError: local variable 'spt' referenced before assignment
mask = features.get('is_valid')
if mask is not None:
    valid_data = data[mask, :]  # <-- here it fails
    cl = cluster_func(valid_data, *args, **kwargs)
labels = np.zeros(data.shape[0], dtype='int') - 1
The 'mask' array is full-length, while 'data' has been truncated by ClusterAnalyzer to hold features only for the selected cells.
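The mismatch can be reproduced with a few lines of numpy (shapes are arbitrary, chosen only to show the failure mode):

```python
import numpy as np

data = np.random.randn(5, 2)    # features truncated to the selected cells
mask = np.ones(8, dtype=bool)   # validity mask kept at full length

try:
    valid_data = data[mask, :]  # boolean mask longer than the indexed axis
except IndexError as exc:
    print('fails as in the Dashboard:', exc)
```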
add LICENSE file to the root directory
When running the tutorial I get the following exception:
Traceback (most recent call last):
  File "examples/sorting/cluster_beans.py", line 58, in <module>
    browser.show()
  File "/usr/local/lib/python2.7/dist-packages/SpikeSort-0.1-py2.7.egg/spike_beans/components.py", line 475, in show
    self._draw()
  File "/usr/local/lib/python2.7/dist-packages/SpikeSort-0.1-py2.7.egg/spike_beans/components.py", line 466, in _draw
    self._set_data()
  File "/usr/local/lib/python2.7/dist-packages/SpikeSort-0.1-py2.7.egg/spike_beans/components.py", line 488, in _set_data
    labels = self.label_src.labels
  File "/usr/local/lib/python2.7/dist-packages/SpikeSort-0.1-py2.7.egg/spike_beans/base.py", line 111, in __get__
    self.result = self.Request(obj)
  File "/usr/local/lib/python2.7/dist-packages/SpikeSort-0.1-py2.7.egg/spike_beans/base.py", line 130, in Request
    % (obj, self.feature)
AssertionError: The value <spike_beans.components.ClusterAnalyzer object at 0x39f2550> of 'LabelSource' does not match the specified criteria
Using SpikeBrowser() works fine. Not sure exactly what's going on here. It looks like the labels attribute of SpikeBrowser instances doesn't exist, but the RequiredFeature class makes it a necessity to pass the (currently failing) assertion.
It would be nice to have electrode labels on the spike browsing plot, since it is fairly common for electrodes to not be linearly ordered on their respective shanks, e.g., the first shank (going from left to right) from top to bottom is numbered 1, 3, 2, 6. It would be nice to see these numbers in the spike browsing plot.
is_masked is currently counterintuitive: the name suggests that it is True for masked (invalid) spikes, but in fact it is False for invalid and True for valid spikes. The documentation describes the intuitive behaviour, which does not match the implementation!
This method is currently in the SpikeDetector component. Applying it updates all the observers, including ClusterAnalyzer, which wipes out all previous clustering results in the current session.
Support for new data formats can be added via the neo library.
It's getting not so easy to get matplotlib 1.0.1 running with new libraries. In particular, it is already impossible to compile it against libpng >= 1.1 and the new tk module in Python 2.7.2.
I've generated a couple of patches based on the new mpl releases, so we can probably mention them in the documentation. I faced this problem in a rolling-release distro (Arch).
Patches are here: https://gist.github.com/2294220
It would be extremely useful to be able to see the detected spikes highlighted on the "features" plot simultaneously as one is scrolling through them in the spike browser.
if type(thresh) is str or type(thresh) is unicode:
    # ...

should be changed to

if isinstance(thresh, basestring):
    # ...

Also, type(someInstance) is TypeObject should be avoided.
Running the browse_data.py example brings up the spike browser window, but with no buttons. Clicking in the area where the buttons should appear works, though!
Some of the code in, for example, src/spike_sort/core/evaluate.py uses peak_to_peak = avg_spike.max() - avg_spike.min(). NumPy has this functionality built in as the ptp function.
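For reference, the two forms are equivalent:

```python
import numpy as np

avg_spike = np.array([-1.5, 0.2, 3.0, -0.4])
manual = avg_spike.max() - avg_spike.min()   # peak-to-peak, spelled out
print(manual, np.ptp(avg_spike))             # 4.5 4.5
```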
I personally find it easier to grok code quickly if the classes are separated into their own individual files (together with related support functions if necessary). When they aren't in their own files, what I do is run the code and grep until I find what I'm looking for. So really, this issue could be called "Reduce the baseline amount of grep necessary to understand the code".
A little background, just in case. With the advent of OOP came one of its most useful behaviours: polymorphism. It allows repetitive code to be abstracted into a base class from which related classes inherit and can (or must) override one or more methods. In this case, many of the filtering classes perform the same action of checking the coefficient cache and then updating it if necessary. This allows one to create a method, call it m, that calls the method that must be overridden, call it om. Since m is inherited by all subclasses, a given subclass, call it Sub, can call m, and m will know to call the om method of Sub. I've written this code, but I need to write a test (and pass it!) before pushing it.
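This is the classic template-method pattern; a minimal sketch of the shape described above (BaseFilter, _design, and the coefficient cache are illustrative names, not the actual spikesort classes):

```python
class BaseFilter(object):
    def __init__(self):
        self._coefs = None

    def get_coefs(self):
        """The inherited method "m": check the cache, rebuild if empty."""
        if self._coefs is None:
            self._coefs = self._design()  # dispatches to the subclass "om"
        return self._coefs

    def _design(self):
        """The method "om" each subclass must override."""
        raise NotImplementedError

class MovingAverage(BaseFilter):
    def __init__(self, n):
        super(MovingAverage, self).__init__()
        self.n = n

    def _design(self):
        return [1.0 / self.n] * self.n

f = MovingAverage(4)
print(f.get_coefs())  # [0.25, 0.25, 0.25, 0.25]
```

The cache check lives once in the base class, while each concrete filter only supplies its own coefficient design.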
Instead of reading whole spike waveforms into memory, read them in chunks and copy them to a pytables/memmap array.
User-defined classes should (ultimately) inherit from Python's root object, object.
Some weird plotting artefacts can be observed when using spikesort with matplotlib 1.1.0. Among them: absence of text in the legend plot, and dark (almost black) coloured feature plots.
Sphinx 1.1 complains about not being able to import the *_src attributes of SpikeBeans components. These attributes are instantiated at runtime and raise an exception when the source is not defined.
Solution:
make the *_src attributes private by prefixing them with an underscore, that is: _*_src
This is a VERY rough idea, but I've run into an issue where I need to threshold each channel separately (we have many recordings that are not performed with tetrodes). It might be nice to have a parameter that allows one to run an arbitrary Python callable that conforms to the interface of detect_spikes.
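A sketch of what that could look like, assuming a detector callable applied channel by channel (detect_crossings, detect_per_channel, and their signatures are hypothetical, not the real detect_spikes API):

```python
import numpy as np

def detect_crossings(trace, thresh):
    """Sample indices where the trace crosses thresh upwards."""
    above = trace > thresh
    # upward crossing: below at sample i-1, above at sample i
    return np.nonzero(above[1:] & ~above[:-1])[0] + 1

def detect_per_channel(data, thresholds, detector=detect_crossings):
    # data: (n_channels, n_samples); one threshold per channel;
    # any callable with the detector interface can be plugged in
    return [detector(ch, th) for ch, th in zip(data, thresholds)]

data = np.array([[0.0, 0.0, 5.0, 0.0],
                 [0.0, 3.0, 0.0, 0.0]])
print(detect_per_channel(data, [1.0, 1.0]))  # [array([2]), array([1])]
```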
cluster_manual takes forever to run and does not show any windows.
add functions for basic analysis of spike trains:
Use the numpy docs convention for docstrings: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#docstring-standard
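For example, remove_doubles documented in the numpy convention could look like this (the body is only an illustrative implementation, not the library's actual code):

```python
import numpy as np

def remove_doubles(spt, tolerance):
    """Remove spike times that follow the previous spike too closely.

    Parameters
    ----------
    spt : array_like
        Sorted spike times (ms).
    tolerance : float
        Minimum allowed inter-spike interval (ms).

    Returns
    -------
    ndarray
        Spike times with doubles removed.
    """
    spt = np.asarray(spt)
    # always keep the first spike; drop spikes closer than tolerance
    keep = np.concatenate(([True], np.diff(spt) > tolerance))
    return spt[keep]

print(remove_doubles([0.0, 0.1, 5.0], 0.5))  # [ 0.  5.]
```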
class dictproperty(object):
    """Implements collection properties with dictionary-like access.

    Adapted from `Active State Recipe 440514
    <http://code.activestate.com/recipes/440514-dictproperty-properties-for-dictionary-attributes/>`_
    """
is, imho, rather a copy of the code from there, so please be nice and at least list the original author/license (not sure yet how PSF would play with your BSD-2; it would probably put the whole project under PSF...).
Make code conform to PEP8 style for easier reading.
PyTables is a very specialised dependency and may be hard to install on some systems, which raises the "entry level" for a prospective user. Replacing PyTables with memmapped numpy arrays should not affect performance too much, so pytables could be dropped in future versions.
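A sketch of what the memmap-based storage could look like (file name and array shape are arbitrary here):

```python
import os
import tempfile
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'waveforms.dat')
shape = (1000, 32)  # e.g. n_spikes x n_samples

# create the file-backed array and use it like a normal ndarray
mm = np.memmap(path, dtype='float32', mode='w+', shape=shape)
mm[:, :] = 1.0
mm.flush()          # data lives on disk, not in RAM
del mm

# reopen read-only later, as one would reopen an HDF5 file
ro = np.memmap(path, dtype='float32', mode='r', shape=shape)
print(ro.shape, float(ro[0, 0]))  # (1000, 32) 1.0
```

Unlike PyTables this needs no extra C dependency, at the cost of HDF5 features such as compression and self-describing metadata.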
cdist does the same thing as _metric_euclidean in cluster.py and has a more efficient C implementation. It also offers different distance metrics, should the need for that ever arise in the future. A quick benchmark using IPython's timeit functionality reveals the following:
import numpy as np
from numpy.random import randn
from scipy.spatial.distance import cdist

def euc(a, b):
    m, n = a.shape
    p, k = b.shape
    if n != k:
        raise TypeError('a and b must have the same number of columns')
    delta = np.zeros((m, p))
    for d in xrange(n):
        delta += np.subtract.outer(a[:, d], b[:, d]) ** 2
    return np.sqrt(delta)

a = randn(100, 100)
b = randn(100, 100)

# IPython magic:
timeit euc(a, b)    # 100 loops, best of 3: 7.72 ms per loop
timeit cdist(a, b)  # 1000 loops, best of 3: 1.45 ms per loop
Around line 305 in src/spike_sort/core/extract.py the following code spits out an exception:

spWave[:, i, :] = sp_data[contacts, sp + win[0]:sp + win[1]].T

The reason is that spWave[:, i, :] has shape (n, 1) while sp_data[contacts, sp + win[0]:sp + win[1]].T has shape (n,). My version of NumPy doesn't allow such arrays to be assigned to one another. I'll submit a fix for this: change sp_data[...].T to np.atleast_2d(sp_data[...]).T. This adds a dimension to any 0- or 1-D array to make it 2-D, and does nothing for arrays with ndim >= 2.
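The behaviour of the proposed fix, in isolation:

```python
import numpy as np

a = np.zeros(5)                  # 1-D, shape (5,)
print(np.atleast_2d(a).shape)    # (1, 5)
print(np.atleast_2d(a).T.shape)  # (5, 1) -- now assignable to an (n, 1) slice

b = np.zeros((3, 4))
print(np.atleast_2d(b).shape)    # (3, 4) -- ndim >= 2 input is untouched
```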
Personally, I find Tk to be very ugly (no judgment on the choice to use it though, it can be MUCH easier to work with Tk than other GUI toolkits in Python), and it doesn't jive with Gnome or OS X very well.
Plotting a large number of spikes (>10000) with the PlotSpikes component takes a couple of seconds and significantly affects the spike sorting experience.
extract_spikes and align_spikes are brutally slow right now. I think it would be useful to attempt to speed up the loops in these functions using Cython. Of course, if there's something I'm missing about how to make this faster using, e.g., PyTables' (somewhat) fast I/O, then please let me know.
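Before reaching for Cython, part of the per-spike loop can often be vectorized with fancy indexing; a sketch for fixed-window extraction around spike samples (extract_windows is hypothetical, not the real extract_spikes):

```python
import numpy as np

def extract_windows(signal, spike_idx, win=(-2, 3)):
    """Extract a fixed window around each spike sample, no Python loop.

    signal : 1-D trace; spike_idx : sample indices of detected spikes.
    Assumes every window lies fully inside the signal.
    """
    offsets = np.arange(win[0], win[1])
    # build an (n_spikes, win_length) index matrix by broadcasting
    idx = np.asarray(spike_idx)[:, None] + offsets[None, :]
    return signal[idx]

sig = np.arange(20.0)
print(extract_windows(sig, [5, 10]))
# [[  3.   4.   5.   6.   7.]
#  [  8.   9.  10.  11.  12.]]
```

Cython would still help for the alignment step, but the extraction itself is largely an indexing problem.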