
datagrok-ai / public


Public package repository for the Datagrok.ai platform

License: MIT License

JavaScript 17.45% TypeScript 65.72% Dockerfile 0.12% Batchfile 0.11% Shell 0.35% Java 8.29% Kotlin 0.04% Dart 0.02% CSS 2.18% HTML 0.09% Python 4.96% C 0.02% R 0.26% Julia 0.04% MATLAB 0.03% HCL 0.02% PLSQL 0.12% HiveQL 0.08% Cypher 0.08% SCSS 0.01%

public's Introduction

Datagrok package repository

This is a public repository for the API, tools, and packages available for Datagrok™, a next-generation web-based data analytics platform. The platform is highly extensible, and almost anything can be implemented as a package.

These open-source packages are free for anyone to use, although the public environment imposes some restrictions tied to the server's computational capacity. Organizations that deploy Datagrok on their premises can also access public packages. In addition, enterprises typically establish their own private repositories that contain proprietary extensions.

For developers: check out getting started and contributor's guide.

Academia

Datagrok grants a free license to academic institutions to use it in any context, either research or educational. Moreover, publishing scientific methods as Datagrok packages provides a number of benefits that are particularly important to academia.

For academic collaborations, please email [email protected].

Ideas for contributions

If you want to get familiar with the platform, here are some ideas. Pick whatever interests you, and reach out to Andrew ([email protected]) or post on our community forum.

  • Visualizations
    • Gantt chart
    • Port visjs-based network diagram from Dart to JavaScript
    • WebGL-based rendering of the 2D scatter plot to work with 10M+ points
    • Event drops
  • Scientific methods
    • Statistical hypothesis testing
    • Bayesian statistics
    • Computer vision
    • NLP
  • File editors and viewers
  • File metadata extractors (see Apache Tika)
  • WASM-based support for digital signal processing
  • Domain-specific algorithms
  • Connectors to web services and open datasets
  • Bioinformatics
  • Telecom
  • Fintech


public's People

Contributors

adrkn, aleksashka11, alex-aprm, annamuza, aparamonov-datagrok, aufarzakiev, chopovsky, dependabot[bot], dnillovna, drizhina, dskatov, github-actions[bot], illarionow, laykdimon, lbankurova, mariadolotova, nikolay-alemasov, nuradilk, onuf, osakhniuk, pavlopolovyi, simpleprofi, skalkin, sssavenko, stleonidas, tanas80, vadymkovadlo, vasner, vdyma, vmakarichev

public's Issues

ClinicalCase: Timelines view

The ability to view specified events (including time intervals) for a specified cohort on one chart, with the X axis set to the event date aligned to the beginning of the study.
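The alignment step could be sketched as follows. This is a minimal illustration rather than the platform's implementation: the record fields (`subject`, `label`, `start`, `end`) and the per-subject study start lookup are assumptions made for the example, and dates are assumed to be ISO `YYYY-MM-DD` strings.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Convert an ISO date (YYYY-MM-DD) to a whole-day offset from the study start.
function studyDay(dateIso, studyStartIso) {
  return Math.round((Date.parse(dateIso) - Date.parse(studyStartIso)) / MS_PER_DAY);
}

// Hypothetical event shape: {subject, label, start, end?}.
// Point events (no end date) get endDay === startDay.
function alignEvents(events, studyStartBySubject) {
  return events.map((e) => {
    const studyStart = studyStartBySubject[e.subject];
    const startDay = studyDay(e.start, studyStart);
    const endDay = e.end ? studyDay(e.end, studyStart) : startDay;
    return {subject: e.subject, label: e.label, startDay, endDay};
  });
}
```

With offsets like these, every subject shares the same X axis (days since study start), so intervals from different subjects can be compared on one chart.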

Charts: Histogram: mark quartiles range

Mark the quartiles Q1 and Q3 on the Histogram axis, and highlight the area outside this range with filled rectangles. Set the default filter markers to these values.
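One way to obtain the two markers is sketched below. The issue does not say which quartile definition to use, so this example assumes linear interpolation between neighboring sorted values; the function names are hypothetical.

```javascript
// Quantile of an ascending-sorted array using linear interpolation (an assumed convention).
function quantile(sorted, p) {
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Returns {q1, q3}, or null when there are no finite values.
function quartileRange(values) {
  const sorted = values.filter((v) => Number.isFinite(v)).sort((a, b) => a - b);
  if (sorted.length === 0) return null;
  return {q1: quantile(sorted, 0.25), q3: quantile(sorted, 0.75)};
}
```

The areas below `q1` and above `q3` are the ones the histogram would shade, and the same two values could seed the filter markers.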

datagrok-tools: package content validation: check scripts location

Scripts meant to be executed on the server have a dedicated place in a package: the /scripts directory. If script files are located anywhere else, they may not be loaded properly into the platform. To prevent this, the location of these files can be checked during the package publication step (grok publish).
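A sketch of what such a check might look like is given below. It is not the datagrok-tools implementation: the function name and the list of extensions treated as server scripts are assumptions for illustration, and the input is assumed to be file paths relative to the package root.

```javascript
// Assumed set of server-side script extensions; the real list may differ.
const SCRIPT_EXTENSIONS = ['.py', '.r', '.jl', '.m'];

// Returns the script files that are not located under the scripts/ directory.
function findMisplacedScripts(relativePaths) {
  return relativePaths.filter((p) => {
    // Normalize Windows separators and strip a leading "./" or "/".
    const normalized = p.replace(/\\/g, '/').replace(/^\.?\//, '');
    const dot = normalized.lastIndexOf('.');
    const ext = dot === -1 ? '' : normalized.slice(dot).toLowerCase();
    if (!SCRIPT_EXTENSIONS.includes(ext)) return false;
    return !normalized.startsWith('scripts/');
  });
}
```

A publish step could print the returned paths as warnings, or refuse to publish when the list is not empty.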

#98: Core: Dataframe view

Implement a way to create a view of a DataFrame.
The view creation function should accept a BitSet or a list of row indices, along with a list of column IDs.

df = DG.FromJson({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]})
indices = [0, 2, 3]
cols = ['b']
v = df.view(indices, cols)

v => 'b'
      5
      7
      8
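The proposed behavior can be sketched in plain JavaScript without the platform. The example below is only an illustration under stated assumptions: `createView` is a hypothetical helper, the table is modeled as an object of equal-length arrays, and the BitSet variant is left out.

```javascript
// Columnar table: a plain object mapping column names to equal-length arrays.
// The view stores only row indices and column names; reads go to the source arrays.
function createView(columns, rowIndices, columnNames) {
  for (const name of columnNames)
    if (!(name in columns)) throw new Error(`Unknown column: ${name}`);

  return {
    rowCount: rowIndices.length,
    columnNames: [...columnNames],
    // Value of column `name` at the view's row `viewRow`.
    get(name, viewRow) {
      if (!columnNames.includes(name)) throw new Error(`Column not in view: ${name}`);
      return columns[name][rowIndices[viewRow]];
    },
    // Materialize the view as a new columnar object (this does copy the data).
    toObject() {
      const out = {};
      for (const name of columnNames) out[name] = rowIndices.map((i) => columns[name][i]);
      return out;
    },
  };
}

const source = {a: [1, 2, 3, 4], b: [5, 6, 7, 8]};
const view = createView(source, [0, 2, 3], ['b']);
console.log(view.toObject()); // { b: [ 5, 7, 8 ] }
```

Because the view keeps references rather than copies, a later change to the source column is visible through `view.get`.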

Chem: malformed molString


chem_error throws error: malformed molString for arguments that are seemingly good

`
Mrv1902 02241904102D

11 10 0 0 0 0 999 V2000
0.9818 0.5834 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
0.9818 -0.5834 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
0.3984 0.0000 0.0000 Pt 0 0 0 0 0 0 0 0 0 0 0 0
0.0286 1.3803 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.9818 -0.3699 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.9818 0.3698 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0285 -1.3803 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.1849 -0.5834 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.3984 -1.3803 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.1849 0.5834 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.9818 0.7969 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
3 1 1 0 0 0 0
3 2 1 0 0 0 0
3 8 1 0 0 0 0
3 10 1 0 0 0 0
10 6 1 0 0 0 0
10 4 1 0 0 0 0
8 5 1 0 0 0 0
8 7 1 0 0 0 0
8 9 1 0 0 0 0
10 11 1 0 0 0 0
M END
`

Full-screen modal dialog works incorrectly

If a dialog is set to be modal and full screen with .showModal(true), clicking outside of the dialog will result in closing it.

An example of this behavior can be found in the JS API Examples: ui/dialogs/full-screen.js.

#96: Core: input loses focus in Train model

When trying to change a string or numeric input value in ML -> Train model... after opening the feature selection menu, the input loses focus and the value is not changed. This happens only once, immediately after the feature selection menu is used; changing the value a second time works as expected.

Tutorials package

A collection of interactive tutorials

  • Interactive hints
  • Tracks (collections of tutorials)
    • Exploratory data analysis
    • Machine learning
    • Cheminformatics
  • Badges

Help in dialogs doesn't work

When the question mark is clicked in a dialog window, nothing happens, even if helpUrl is set and valid.

An example of this behavior can be found in the JS API Examples: ui/dialogs/dialogs.js.

Bio: port Clustal algorithm to TypeScript

This will be used for the client-based multiple sequence alignments across our bioinformatics solutions when it's computationally feasible (examples: peptides).

@StLeonidas, which algorithms are we implementing exactly (ClustalV / ClustalW / ClustalOmega / etc.)?

OpenChemLib package

  • Integrate OpenChemLib sketcher as one of the molecular sketchers
  • Other cheminformatics functionality

#70: PowerPack: Power Search

Ability to search for anything from the start screen, with special support for the following:

  • Widgets
  • Functions
  • Applications
  • User-defined external apps (via iframe)
  • Entities (connections, queries, etc)

Core: Charts: Histogram: reference distributions

It is useful to be able to render pre-calculated reference distributions on top of the real, calculated ones. It would make sense to keep them with the dataframe, in column tags. This would enable many scenarios, such as constructing a dataframe on a server or in a Python script.

Once a reference distribution is defined for a column, the histogram (and filter) would automatically pick it up and show it.

I propose the following approach:

// Tag values are strings, so the distributions are serialized as JSON
demog.weight.tags['.ref-distributions'] = JSON.stringify({
  healthy: [2, 4, 45, 76, 45, 5, 5, 6, 2, 1],
  sick: [3, 4, 56, 76, 45, 5, 5, 6, 2, 1],
});
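On the reading side, a histogram would need to turn the tag value back into per-bin numbers. The sketch below assumes the tag holds valid JSON (double-quoted keys, no trailing comma) and that each array lists bin counts; `parseRefDistributions` is a hypothetical helper, not a platform function. It normalizes counts to fractions so that a reference curve can be overlaid regardless of how many rows the actual column has.

```javascript
// Parse a '.ref-distributions' tag value into {name: fractions-per-bin}.
function parseRefDistributions(tagValue) {
  const parsed = JSON.parse(tagValue);
  const result = {};
  for (const [name, counts] of Object.entries(parsed)) {
    const total = counts.reduce((sum, c) => sum + c, 0);
    // Guard against an all-zero distribution to avoid division by zero.
    result[name] = counts.map((c) => (total === 0 ? 0 : c / total));
  }
  return result;
}

const tagValue = JSON.stringify({healthy: [1, 1, 2], sick: [0, 0, 0]});
console.log(parseRefDistributions(tagValue));
// { healthy: [ 0.25, 0.25, 0.5 ], sick: [ 0, 0, 0 ] }
```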

#89: Add min, max, step options to viewer's numeric properties

Currently it is not possible to restrict int and float viewer properties to a specific range; they all default to a slider range of 0 to 100.

It would be useful to set min, max, and step options for int and float properties like this:
this.myIntProperty = this.int('myIntProperty', 0, {min: -10, max: 10, step: 2});
this.myFloatProperty = this.float('myFloatProperty', 0, {min: -1, max: 1, step: 0.2});
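How a slider might apply these options to a raw value can be sketched as follows. This is an illustration, not the viewer's actual logic: `constrain` is a hypothetical helper, the defaults mirror the 0 to 100 range mentioned above, and `min` and `step` are assumed to be written in plain decimal notation.

```javascript
// Number of digits after the decimal point (assumes plain decimal notation, not 1e-7).
function decimalsOf(x) {
  return (String(x).split('.')[1] || '').length;
}

// Clamp a value to [min, max] and snap it to the nearest step counted from min.
function constrain(value, {min = 0, max = 100, step = 1} = {}) {
  const clamped = Math.min(max, Math.max(min, value));
  const snapped = min + Math.round((clamped - min) / step) * step;
  // Round away floating-point noise such as 0.40000000000000013.
  const decimals = Math.max(decimalsOf(step), decimalsOf(min));
  return Math.min(max, Number(snapped.toFixed(decimals)));
}

console.log(constrain(3, {min: -10, max: 10, step: 2}));     // 4
console.log(constrain(25, {min: -10, max: 10, step: 2}));    // 10
console.log(constrain(0.33, {min: -1, max: 1, step: 0.2}));  // 0.4
```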

#76: Tutorials: Badges

As users go through the tutorials, we'd like to track their progress and give rewards for their achievements. For example, they can be shown in a separate pane of the user's profile.

Ketcher package

Integrate Ketcher as one of the molecule sketchers supported by the platform.

https://lifescience.opensource.epam.com/ketcher/index.html

Example of a similar integration with the OpenChemLib's sketcher: https://github.com/datagrok-ai/public/blob/master/packages/OpenChemLib/src/ocl-sketcher.ts

Existing prototype (works but does not synchronize with smiles): https://github.com/datagrok-ai/public/tree/master/packages/Ketcher

To open a dialog with the sketcher:
grok.chem.showSketcherDialog();

Peptides: SAR

Wiki: Structure-Activity Relationship (SAR) is an approach designed to find relationships between chemical structure (or structural-related properties) and biological activity (or target property) of studied compounds.

We need to build an interactive analysis tool to help identify point mutations that cause a large change in activity.

Input is a dataframe with the following columns:

  1. peptide: aligned sequences (max length N, number of unique amino acids A)
  2. activity: measured activity

Possible outputs:

An N*A dataframe where the (n, a) element contains a measure of the change in activity caused by mutating the amino acid at position n to a.
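The issue leaves the choice of measure open. As one possible starting point, the sketch below uses the mean activity of peptides carrying amino acid a at position n minus the overall mean activity. That measure, the function name, and the one-character-per-residue encoding are all assumptions made for the example.

```javascript
// peptides: aligned sequences as strings, one character per residue (assumed encoding).
// activities: measured activity for each peptide, same order.
// Returns an array of length N; element n maps amino acid -> (mean activity with it) - (overall mean).
function mutationEffects(peptides, activities) {
  const n = Math.max(...peptides.map((p) => p.length));
  const overallMean = activities.reduce((s, v) => s + v, 0) / activities.length;

  // For every position, accumulate the activity sum and count per amino acid.
  const stats = Array.from({length: n}, () => new Map());
  peptides.forEach((seq, row) => {
    for (let pos = 0; pos < seq.length; pos++) {
      const s = stats[pos].get(seq[pos]) || {sum: 0, count: 0};
      s.sum += activities[row];
      s.count += 1;
      stats[pos].set(seq[pos], s);
    }
  });

  return stats.map((perAminoAcid) => {
    const row = {};
    for (const [aa, s] of perAminoAcid) row[aa] = s.sum / s.count - overallMean;
    return row;
  });
}

console.log(mutationEffects(['AC', 'AD', 'GC'], [1, 2, 6]));
// [ { A: -1.5, G: 3 }, { C: 0.5, D: -1 } ]
```

Cells with a large absolute value point to positions where a substitution coincides with a shift in activity; amino acids never observed at a position are simply absent from that row.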

Peptides: peptide space

Visualize a collection of peptides in 2-dimensional space, using t-SNE with the edit distance function. Implement a Peptide space feature similar to the Chemical space feature in the Chem package:

  • Implement an algorithm to calculate similarities for a pair of sequences (semtype - alignedSequence)
  • Use SPE algorithm to get 2d map of peptide sequences
  • Implement a new feature in the property panel of each column with alignedSequence values
  • Implement embedding as a web worker to not slow down the main thread.
  • Benchmark UMAP performance on dataset of 40k sequences.
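The distance part of the list above can be sketched independently of the embedding algorithm. The example below computes the Levenshtein edit distance between two sequences and a full pairwise matrix; it treats each character as one residue, which is an assumption, and it is the kind of pure function that could be moved into a web worker. Note that a full matrix grows quadratically, so it would likely not be practical as-is for a 40k-sequence dataset.

```javascript
// Levenshtein edit distance using two rolling rows (O(|a| * |b|) time, O(|b|) memory).
function editDistance(a, b) {
  let prev = Array.from({length: b.length + 1}, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Symmetric pairwise distance matrix; the diagonal stays 0.
function distanceMatrix(sequences) {
  const n = sequences.length;
  const matrix = Array.from({length: n}, () => new Float32Array(n));
  for (let i = 0; i < n; i++)
    for (let j = i + 1; j < n; j++)
      matrix[i][j] = matrix[j][i] = editDistance(sequences[i], sequences[j]);
  return matrix;
}
```

The resulting matrix (or the distance function itself) is what an embedding method such as t-SNE, SPE, or UMAP would consume to place the sequences on a 2D map.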

#71: PowerPack: Power Widgets

A start page that contains widgets (annotated with the dashboard tag) that are dynamically discovered from the packages available to the current user.
