
libre's People

Contributors

ambert15, antonio-ionaton, kaczmarj, p-koo, rohitkt10, shtoneyan


libre's Issues

improve speed of one-hot encoding

I have an implementation for one-hot encoding sequences that is 3x faster than the current implementation:

https://github.com/kaczmarj/rotation-koo-lab/blob/310bdf589c59f4ef9ef8511a479e69925c4bbcf3/chip-seq-to-hdf5-dataset/chipseq_utils.py#L317-L363

This is the current implementation:

https://github.com/p-koo/cipher/blob/48b1dccdd42576b2bf556edbc7243a838d9deaf0/cipher/preprocess/singletask.py#L160-L226

My implementation does not strip or pad the sequences, but I can add that.
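
For discussion, here is a minimal sketch of one common way to vectorize one-hot encoding with a NumPy lookup table. It is not the code at either link above. The function name `one_hot_encode`, the `ACGT` alphabet, and the choice to map unknown characters to all-zero rows are assumptions made for illustration. Like the implementation described in this issue, it does not strip or pad, so it requires equal-length sequences.

```python
import numpy as np


def one_hot_encode(sequences, alphabet="ACGT"):
    """One-hot encode equal-length sequences.

    Returns an array of shape (n_sequences, length, len(alphabet)).
    Characters outside `alphabet` (for example "N") become all-zero rows;
    this is an assumption of the sketch, not behavior taken from cipher.
    """
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("expected a non-empty list of equal-length sequences")

    # Lookup table indexed by byte value: row ord(ch) holds the one-hot
    # vector for character ch, and every other row stays zero.
    lookup = np.zeros((256, len(alphabet)), dtype=np.float32)
    for i, ch in enumerate(alphabet):
        lookup[ord(ch), i] = 1.0

    # View all sequences as one 2D array of byte values, then index the
    # lookup table once instead of looping over characters in Python.
    codes = np.frombuffer("".join(sequences).encode("ascii"), dtype=np.uint8)
    codes = codes.reshape(len(sequences), -1)
    return lookup[codes]


encoded = one_hot_encode(["ACGT", "NNAA"])
```

The lookup is case-sensitive, so lowercase input would need to be uppercased first. Stripping or padding could be added as a separate step before the call.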

discuss hard-coded queries in `filter_encode_metatable`

`filter_encode_metatable` has some hard-coded queries that might be better exposed as parameters. For example, the function keeps only rows where File assembly is GRCh38, but I can imagine cases where a different assembly is wanted.

Perhaps we can rework this function to let the user pass in a query? `pandas.DataFrame.query()` might be useful here.

https://github.com/p-koo/cipher/blob/03e184824cc64d38bb626e80ecc326f1f45653c7/cipher/preprocess/wrangle.py#L59-L60
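
As a rough sketch of what this could look like, the function below takes the query as a parameter and defaults to the GRCh38 filter mentioned above. The name `filter_metatable`, the default query string, and the example rows are hypothetical and not taken from cipher. Column names containing spaces are quoted with backticks, which `DataFrame.query()` supports.

```python
import pandas as pd


def filter_metatable(df, query='`File assembly` == "GRCh38"'):
    """Return the rows of `df` that match a user-supplied pandas query.

    Hypothetical helper: the default reproduces the GRCh38 filter from
    this issue, but any valid `DataFrame.query()` expression can be passed.
    """
    return df.query(query).reset_index(drop=True)


# Made-up rows for illustration only.
metatable = pd.DataFrame(
    {
        "File accession": ["A1", "A2", "A3"],
        "File assembly": ["GRCh38", "hg19", "GRCh38"],
    }
)

default_rows = filter_metatable(metatable)
hg19_rows = filter_metatable(metatable, '`File assembly` == "hg19"')
```

Keeping the current filter as the default would leave existing callers unaffected while letting others choose a different assembly.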

discuss behavior of enforce_constant_size

The function `enforce_constant_size` accepts a path to a bed file and writes a bed file. This feels more like command-line behavior, where input and output are files. I suggest that this function instead take a pandas dataframe representation of a bed file and return a modified dataframe. The processing script that uses this function should take care of loading and saving files, if necessary.

I have an implementation of this here: https://github.com/kaczmarj/rotation-koo-lab/blob/310bdf589c59f4ef9ef8511a479e69925c4bbcf3/chip-seq-to-hdf5-dataset/chipseq_utils.py#L102-L132

https://github.com/p-koo/cipher/blob/bbfaaf454eff4c700a0d6eaec50997a072c2e2ce/cipher/preprocess/wrangle.py#L7
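
To make the proposed interface concrete, here is a minimal sketch of a dataframe-in, dataframe-out version. It is not the code at either link above. The column names `chrom`, `start`, and `end`, the `window` parameter, and the resizing rule (recenter each interval on its midpoint and drop intervals that would start before coordinate 0) are all assumptions made for illustration.

```python
import pandas as pd


def enforce_constant_size(bed, window):
    """Return a copy of `bed` with every interval resized to `window` bases.

    Assumes `bed` has integer `start` and `end` columns. Each interval is
    recentered on its midpoint; intervals whose new start would be negative
    are dropped. Both rules are assumptions of this sketch.
    """
    bed = bed.copy()  # leave the caller's dataframe untouched
    midpoint = (bed["start"] + bed["end"]) // 2
    bed["start"] = midpoint - window // 2
    bed["end"] = bed["start"] + window
    return bed[bed["start"] >= 0].reset_index(drop=True)


# Made-up intervals for illustration only.
peaks = pd.DataFrame(
    {"chrom": ["chr1", "chr1"], "start": [100, 0], "end": [300, 20]}
)
resized = enforce_constant_size(peaks, window=100)
```

With this shape, a processing script would read the bed file into a dataframe, call the function, and write the result back out, so the function itself never touches the filesystem.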
