clusdoc's Introduction

ClusDoC package for co-clustering analysis for single molecule localization microscopy (SMLM) data

This software was used in the following publications:

SV Pageon, PR Nicovich, M Mollazade, T Tabarin, K Gaus. "Clus-DoC: a combined cluster detection and colocalization analysis for single-molecule localization microscopy data." Molecular Biology of the Cell 27 (22), 3627-3636.

Pageon, Sophie V., et al. "Functional role of T-cell receptor nanoclusters in signal initiation and antigen discrimination." Proceedings of the National Academy of Sciences (2016): 201607436.

Requirements

  • MATLAB 2014b or later
  • Distributed Computing, Image Processing, and Statistics toolboxes
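
A quick check of these requirements from the MATLAB prompt might look like the sketch below. The license feature names are an assumption about how the three toolboxes are identified on a typical installation, and license('test', ...) only reports whether a license exists, not whether the toolbox is installed.

```matlab
% Sketch: check the MATLAB release and toolbox licenses before running ClusDoC.
% MATLAB R2014b corresponds to version 8.4.
if verLessThan('matlab', '8.4')
    warning('ClusDoC needs MATLAB R2014b or later.');
end

% Assumed license feature names for the three required toolboxes.
features = {'Distrib_Computing_Toolbox', 'Image_Toolbox', 'Statistics_Toolbox'};
hasToolbox = false(size(features));
for k = 1:numel(features)
    % license('test', ...) returns 1 when a license for the feature exists.
    hasToolbox(k) = logical(license('test', features{k}));
    fprintf('%-28s %s\n', features{k}, mat2str(hasToolbox(k)));
end
```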

Compiled MEX dependencies for 64-bit Windows are included in the repository. Their source files are in the .\private\mexSource folder. To run on any other architecture, you will need to compile these functions yourself and replace the binaries in the .\private\ folder.
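
The recompilation step could be scripted roughly as follows. This is a sketch only: it assumes that every .c or .cpp file in the source folder builds as a standalone MEX function and that a supported compiler has already been configured with mex -setup. Neither assumption is confirmed by the repository description.

```matlab
% Sketch: rebuild every MEX source file and place the binaries in .\private\.
% Run from the root of the cloned repository.
srcDir = fullfile(pwd, 'private', 'mexSource');
outDir = fullfile(pwd, 'private');

% Assumption: the sources are plain C or C++ files, one MEX function per file.
sources = [dir(fullfile(srcDir, '*.c')); dir(fullfile(srcDir, '*.cpp'))];
for k = 1:numel(sources)
    srcFile = fullfile(srcDir, sources(k).name);
    fprintf('Compiling %s\n', sources(k).name);
    mex(srcFile, '-outdir', outDir);   % writes the binary into .\private\
end
```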

Quick start

  • Copy all files into the desired folder, either by downloading the package or with git clone https://github.com/PRNicovich/ClusDoC.git.

  • In MATLAB, navigate to the local cloned repository.

  • Start the program by calling 'ClusDoC' at the MATLAB command prompt.

  • Once the GUI window opens, click the 'Select Input File(s)' button. In the pop-up that follows, select the file .\Test dataset\1.txt. The file .\Test dataset\coordinates.txt should also load.

  • Select the output folder by clicking the 'Set Output Path' button. The default choice of .\Test dataset\ is sufficient.

  • Proceed with choosing ROIs or downstream analysis.

clusdoc's People

Contributors

bioturbonick, prnicovich, thibaultunsw

clusdoc's Issues

Note for reduced memory usage

I found that the check at the entry to DBSCANHandler.m is the main reason why a large number of points can make memory usage blow up and block the analysis. The modification below gets past it in most situations on my machine.

    if size(Data, 1) <= 20000
        % pdist returns all N*(N-1)/2 pairwise distances in one vector
        distRow = pdist(Data(:,1:2));
        nPossibleClustering = sum(distRow < DBSCANParams.epsilon);
        checkClusterTest = nPossibleClustering >= DBSCANParams.minPts;
    else
        checkClusterTest = true; % too many points to check with above method, just assume it works
        % confirmed that this one test is the largest barrier to memory. No
        % other method choked.
    end
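
For scale: pdist on N points returns N(N-1)/2 doubles, so 20,000 points already need about 2e8 values, or roughly 1.6 GB. If skipping the check entirely is not acceptable, the same test could in principle be run in row blocks with an early exit. The function below is a hypothetical helper, not part of ClusDoC, and its name, arguments and default block size are my own choices.

```matlab
function tf = hasEnoughClosePairs(xy, epsilon, minPts, blockSize)
% Hypothetical helper: same test as sum(pdist(xy) < epsilon) >= minPts,
% but computed in row blocks so only blockSize-by-N values are held at once.
    n = size(xy, 1);
    if nargin < 4
        % Default: keep each temporary array near 5e6 elements (about 40 MB).
        blockSize = max(1, floor(5e6 / max(n, 1)));
    end
    nClosePairs = 0;
    for first = 1:blockSize:n
        rows = first:min(first + blockSize - 1, n);
        % Squared distances between the rows of this block and all points.
        dx = bsxfun(@minus, xy(rows, 1), xy(:, 1)');
        dy = bsxfun(@minus, xy(rows, 2), xy(:, 2)');
        isClose = (dx.^2 + dy.^2) < epsilon^2;
        % Count each unordered pair once: keep only columns j > row index i.
        isUpper = bsxfun(@gt, 1:n, rows');
        nClosePairs = nClosePairs + nnz(isClose & isUpper);
        if nClosePairs >= minPts
            break   % enough pairs found, no need to scan the rest
        end
    end
    tf = nClosePairs >= minPts;
end
```

A call such as hasEnoughClosePairs(Data(:,1:2), DBSCANParams.epsilon, DBSCANParams.minPts) would then stand in for the pdist branch. The worst case still costs on the order of N^2 comparisons, so this trades memory for time rather than removing the work.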

Note for clusters with no overlaps

Replace lines 8-20 of ExportDBSCANDataToExcelFiles.m with:

    notemptyA = ~cellfun('isempty', A); % empty array cells are not ROIs, they were unused placeholders in the matrix
    notmissingA = ~cellfun(@(x) isstructmissing(x), A); % missing cells are ROIs that had no interactions, and do not appear in the cellROIPair table; can't use ismissing directly because it fails with structs
    not_empty_or_missingA = notemptyA & notmissingA;
    
    Percent_in_Cluster_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.Percent_in_Cluster, A(not_empty_or_missingA), 'UniformOutput', false));
    Number_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.Number, A(not_empty_or_missingA), 'UniformOutput', false));
    Area_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.Area, A(not_empty_or_missingA), 'UniformOutput', false));
    Density_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.Density, A(not_empty_or_missingA), 'UniformOutput', false));
    RelativeDensity_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.RelativeDensity, A(not_empty_or_missingA), 'UniformOutput', false));
    TotalNumber(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.TotalNumber, A(not_empty_or_missingA), 'UniformOutput', false));
    Circularity_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.Mean_Circularity, A(not_empty_or_missingA),'UniformOutput', false));
    Number_Cluster_column(not_empty_or_missingA) = cell2mat(cellfun(@(x) x.Number_Cluster, A(not_empty_or_missingA), 'UniformOutput', false));

and then a few lines down:

    Matrix_Result = [Percent_in_Cluster_column(notemptyA)'*100 , Number_column(notemptyA)' , Area_column(notemptyA)' , Density_column(notemptyA)'*1e6 ,...
        RelativeDensity_column(notemptyA)', TotalNumber(notemptyA)', Circularity_column(notemptyA)', Number_Cluster_column(notemptyA)', Number_Cluster_column(notemptyA)'./(1e-6*cellROIPair(:,5))];

The problem is that A contains empty entries for clusters that are listed in cellROIPair, but those entries are filtered out of all the parameter vectors. This change preserves a zero-value entry for each one, so that the code can continue.

This also depends on returning a missing for varargout{4} in DBSCANHandler if checkClusterTest is false, so that unprocessed ROIs and array placeholders can be differentiated.
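
The isstructmissing helper used above is not shown in this note. One possible definition, given here purely as a guess at what such a helper would need to do, is the following. It relies on the missing value, which as far as I know only exists in MATLAB releases newer than the R2014b minimum.

```matlab
function tf = isstructmissing(x)
% Guessed definition, not taken from the ClusDoC sources.
% Returns true only when x is a "missing" placeholder. Structs and empty
% arrays return false without ever being passed to ismissing.
    if isstruct(x) || isempty(x)
        tf = false;
    else
        tf = all(ismissing(x(:)));
    end
end
```

With this definition, a cell holding a result struct or an empty placeholder counts as not missing, while a cell holding the missing value returned for an unprocessed ROI counts as missing.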

Clus-DoC with thunderSTORM tables

Hi

I'm looking to use this tool in my research, as keeping our own tools up to date, customizable and error-free has become extremely cumbersome. However, I acquire images on a Nikon camera and process them with ThunderSTORM. My issue is that I cannot run the colocalization analysis, because each of my files contains a single colour and I would have to feed in two files to get multi-colour images of the same cell.

Is there a way around this that I am missing, or would it be possible to implement this?

Potentially an absolutely wonderful tool though.

Error when zero clusters identified

When I run DBSCAN for All with clustering parameters that result in 0 clustered points, I get the following error message:

Output argument "SumofContour" (and maybe others) not assigned during call to "DBSCANHandler".

Error in ClusDoC>DBSCAN_All (line 2346)
                            [~, ClusterSmoothTable{roiInc, c}, ~, classOut, ~, ~, ~, Result{roiInc, c}] = ...
 
Error while evaluating UIControl Callback.

Is it possible to output empty instead of stopping the entire batch?
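
One general MATLAB pattern for this kind of error is to give every output a default value before any early return. The function below is a hypothetical stand-in that only illustrates the pattern. It is not the real DBSCANHandler, and its name, arguments and placeholder body are invented for the example.

```matlab
function [clusterTable, sumOfContour] = handlerSketch(points, minPts)
% Hypothetical stand-in for a handler function, not the real DBSCANHandler.
% Every output gets a default first, so an early return hands back empties
% instead of raising "Output argument ... not assigned".
    clusterTable = [];
    sumOfContour = [];

    if size(points, 1) < minPts
        return   % too few points to form any cluster: keep the empty defaults
    end

    % Placeholder standing in for the real clustering work (illustration only).
    clusterTable = points;
    sumOfContour = sum(points(:));
end
```

The calling loop would still need to accept empty results for an ROI, for example by skipping it when the outputs are empty.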

Swapped columns (H, I) in Excel export

My colleague (@BioTurboNick) and I found that columns H and I in the Excel export are swapped. The problem code is the two marked items below, from EvalStatisticsOnDBSCANandDoCResults.m:

Matrix_Result=[DensityDofC...
                    Density2...
                    AreaDofC...
                    Area2...
                    CircularityDofC...
                    Circularity2, ...
                    cell2mat(MeanNumMolsPerColocCluster(:)), ...
                    cell2mat(NumColocClustersPerROI(:)), ... ####
                    cell2mat(MeanNumMolsPerNonColocCluster(:)), ... ####
                    cell2mat(NumNonColocClustersPerROI(:))];
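
A mismatch like this is possible because the header row and the data matrix are assembled separately. As an illustration only, with made-up values and not the way ClusDoC currently writes its output, building a table keeps each column name attached to its data:

```matlab
% Hypothetical per-ROI values, for illustration only.
MeanNumMolsPerColocCluster    = [12; 15];
NumColocClustersPerROI        = [3; 4];
MeanNumMolsPerNonColocCluster = [8; 9];
NumNonColocClustersPerROI     = [20; 25];

% table() takes its column names from the input variable names, so the
% header for each column cannot drift away from the data it labels.
T = table(MeanNumMolsPerColocCluster, NumColocClustersPerROI, ...
          MeanNumMolsPerNonColocCluster, NumNonColocClustersPerROI);
disp(T.Properties.VariableNames);
```

Such a table could then be written out with writetable, which emits the variable names as the header row.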

Lr threshold calculation may be in error; though impact may be minimal

I may have discovered an error in the Lr calculation and threshold. The result is that all localizations with any neighbors are deemed above the threshold, and only those with no neighbors are deemed below the threshold.

I discovered it as I'm rewriting ClusDoC in Julia, and was struggling with understanding the Lr function.

I don't know what the full effect is yet; I'm still investigating. It may just mean more noise ends up in the final numbers than intended. It may also end up having no real effect (see last sentence).

In brief, I believe what the Lr function is intended to do is to calculate the radius at which one would expect to find the number of neighbors actually observed around a localization, given an even distribution across the ROI.

However, the current implementation of the Lr function includes an extra SizeROI factor, which I think inflates the calculated value of Lr. (was SizeROI at one point expected to be the side of a square instead of an area?)

The threshold as currently implemented is also calculating the number of neighbors expected within Lr_radius. However, there are traces of an old definition which says the threshold calculation was essentially equal to Lr_radius, which is exactly what I would expect the threshold to be based on the above understanding of what Lr is calculating: a radius.

Thresholding on the number of neighbors within Lr_radius does make sense if you don't need something robust to different populations of localizations. It seems that latter part was the intention, but it's only used for all-vs-all within the same ROI, so such normalization may not be necessary.

And it may well be that in most cases, localizations are sparse enough that the result of a corrected calculation would end up being the same. I'm seeing Lr_threshold values of <1 in some test datasets, which would suggest that's true in large, disperse ROIs. Very dense ROIs are more likely to be affected.
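
To make the radius interpretation above concrete, here is a small sketch of the calculation as I understand it. The variable names and numbers are illustrative assumptions and are not taken from the ClusDoC code.

```matlab
% Sketch of the radius interpretation described above; all names and numbers
% are illustrative assumptions, not values from ClusDoC.
roiArea    = 1e6;              % ROI area in nm^2 (hypothetical)
nPoints    = 1000;             % localizations in the ROI (hypothetical)
nNeighbors = [0; 3; 10; 40];   % neighbors observed around four example points

density = nPoints / roiArea;   % points per nm^2 if evenly spread over the ROI

% Radius of the circle expected to hold nNeighbors points at that density:
%   density * pi * r^2 = nNeighbors  =>  r = sqrt(nNeighbors / (pi * density))
expectedRadius = sqrt(nNeighbors ./ (pi * density));
disp([nNeighbors, expectedRadius]);
```

With a single factor of area inside the square root, the result has units of length, as a radius should. An additional area-valued factor inside the square root would not give a length.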

questions

I am a PhD student from Hainan University, China. We recently developed software for processing SMLM data and wrote an article for submission, and we used the test data (1.txt) to test our software. First, I would like to ask which two biomolecules were imaged by SMLM to obtain this two-color data. Second, we will mention the use of that data in our article, so how would you like us to thank you for your help (acknowledgement or authorship)?
Best wishes.

Error in bulk processing of ClusDoC

When processing data in bulk with the full ClusDoC tool, I get the following message after the program finishes DBSCAN on the first channel and is supposed to move on to the next:

DoC exited with errors
Matrix index is out of range for deletion.

Error in ExportDBSCANDataToExcelFiles (line 8)
    cellROIPair(cellfun('isempty', A), :) = []; % filter out empty ones

Error in DBSCANonDoCResults (line 139)
        ExportDBSCANDataToExcelFiles(cellROIPair, ResultCell, strcat(Path_name, '\DBSCAN Results'), Ch);

Error in ClusDoC>DoC_All (line 2515)
            [ClusterTableCh1, ClusterTableCh2, clusterIDOut, handles.ClusterTable] =
            DBSCANonDoCResults(handles.CellData, handles.ROICoordinates, ...
 
Error while evaluating UIControl Callback.
