intermine / intermine

A powerful open source data warehouse system

Home Page: http://intermine.org

License: Other

Java 64.88% Perl 0.36% Python 0.25% Shell 0.08% HTML 4.99% PHP 0.01% CSS 1.90% JavaScript 26.57% XSLT 0.06% GAP 0.39% Ruby 0.03% Groovy 0.44% SCSS 0.01% Sass 0.04%
data-warehouse java tomcat genomics open-source opensource lgplv3 postgresql clojure clojurescript

intermine's Introduction

InterMine

A powerful open source data warehouse system. InterMine allows users to integrate diverse data sources with a minimum of effort, providing powerful web-services and an elegant web-application with minimal configuration. InterMine powers some of the largest data-warehouses in the life sciences.

For the full list of InterMines, please see the registry

For details, please visit: InterMine Documentation

If you run an InterMine, or use InterMine or parts of it in your research, please let us know: this improves the chance of continued funding for the project.

Getting Started With InterMine

For a guide on getting started with InterMine, please visit: tutorial

3min bootstrap

As long as you have the prerequisites installed (Java, PostgreSQL), you can get a working data-warehouse and associated web-application by running an automated bootstrap script:

# For the testmodel
./testmine/setup.sh

For a genomic application, with test data from Malaria, see BioTestMine

Docker

You can build InterMine using Docker. See https://github.com/intermine/docker-intermine-gradle

Copyright and Licence

Copyright (C) 2002-2022 FlyMine

See LICENSE file for licensing information.

This product includes software developed by the Apache Software Foundation

InterMine Development Roadmap

For more information about upcoming releases, please see the InterMine Development Roadmap.

Please cite

InterMine: extensive web services for modern biology.
Kalderimis A, Lyne R, Butano D, Contrino S, Lyne M, Heimbach J, Hu F, Smith R, Stěpán R, Sullivan J, Micklem G.
Nucleic Acids Res. 2014 Jul; 42 (Web Server issue): W468-72

InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data.
Smith RN, Aleksic J, Butano D, Carr A, Contrino S, Hu F, Lyne M, Lyne R, Kalderimis A, Rutherford K, Stepan R, Sullivan J, Wakeling M, Watkins X, Micklem G.
Bioinformatics (2012) 28 (23): 3163-3165.

See zotero for the full list of InterMine publications.

intermine's People

Contributors

ahmedihafez, alexkalderimis, arremb, arunans23, asherpasha, boboppie, cmdcolin, danielabutano, david-sharkey, flymine, heralden, jervenbolleman, joecarlson, joelrichardson, jogoodma, joshkh, justinccdev, mlyne, nauer, nikhil-vats, nuin, rachellyne, radekstepan, rnsmith, sammyjava, sergiocontrino, sneuhaus, vivekkrish, widdowquinn, yochannah


intermine's Issues

reportdisplayer graphs clickable to slices of data

I was wondering if it is possible to provide links from the bars of a histogram (to slices of data) that appear on the report page? This is possible for a list widget where you get the option to view results or create list.

Integrated Region Searching

Would be really nice if region searches could be fully integrated with queries/templates/lists.

  • Use case: want to find candidate genes in the chromosomal regions defined by QTL. Right now user can easily query for the QTL and get their coordinate ranges. But they cannot feed those ranges directly into the region search. (The user would have to reformat the results from the first query in order to paste the ranges into the region search box.)
  • Use case: want to keep a list of regions that I can use to drive the region search.
  • Use case: want to incorporate region search into a query (analogous to how Lookup is integrated?)
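The reformatting step that the first use case describes can be sketched as below. This is only an illustration, not InterMine code: `RegionFormatter` and `Range` are hypothetical names, and the `chromosome:start..end` output syntax is an assumption about what the region search box accepts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of InterMine: formats query results as region-search input. */
final class RegionFormatter {

    /** One row of a QTL query result: chromosome identifier plus start and end coordinates. */
    static final class Range {
        final String chromosome;
        final int start;
        final int end;

        Range(String chromosome, int start, int end) {
            if (start > end) {
                throw new IllegalArgumentException("start must not exceed end");
            }
            this.chromosome = chromosome;
            this.start = start;
            this.end = end;
        }
    }

    /** Formats a single range as "chromosome:start..end" (assumed input syntax). */
    static String format(Range range) {
        return range.chromosome + ":" + range.start + ".." + range.end;
    }

    /** Formats every range; the entries can then be joined with newlines and pasted. */
    static List<String> formatAll(List<Range> ranges) {
        List<String> lines = new ArrayList<>();
        for (Range range : ranges) {
            lines.add(format(range));
        }
        return lines;
    }
}
```

An integrated region search would make this glue step unnecessary, since the ranges could flow from the first query straight into the search.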

Export sequence for provided coordinates

For given genome coordinates, export sequence.
Or export sequences of features with extensions/flankings set by a user (add an input box in the results table export popup).
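As a rough sketch of the flanking idea, the coordinate arithmetic might look like this. `FlankingRegion` is a hypothetical name, and the code assumes 1-based inclusive coordinates and a known chromosome length; neither is stated in the request.

```java
/** Hypothetical helper, not part of InterMine: extends a feature's coordinates by a flank. */
final class FlankingRegion {

    /**
     * Returns {start, end} widened by flank bases on each side, clamped to
     * the range 1..chromosomeLength. Coordinates are assumed 1-based and inclusive.
     */
    static int[] extend(int start, int end, int flank, int chromosomeLength) {
        if (start < 1 || end < start || end > chromosomeLength || flank < 0) {
            throw new IllegalArgumentException("invalid coordinates or flank");
        }
        int newStart = Math.max(1, start - flank);
        // Use long arithmetic so that end + flank cannot overflow an int.
        int newEnd = (int) Math.min((long) chromosomeLength, (long) end + flank);
        return new int[] {newStart, newEnd};
    }
}
```

Clamping rather than rejecting keeps features near a chromosome end exportable, at the cost of a shorter flank on that side.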

Move docs to github

The docs on intermine.org are in a trac wiki which is partially broken now. Move to github:

  1. reorganise
  2. identify missing bits
  3. docs "closer" to code, more likely to keep updated
  4. run a script to update markup

Keyword search - organism facet should be on parent class

Joel is creating protein coding genes for mouse but genes for human and rat. The keyword search is creating an organism facet for the most specific term (protein coding) instead of the most inclusive (gene or even sequence feature).

Multi-select feature types for export

On the Genomic Regions Search Results page, users may find it useful to be able to pick multiple types or the whole list when exporting. This is a feature request.

Recommendations - what is like my gene(s)? what is the same about my gene(s)/lists?

If I have a gene (or a list of genes) I would like to see genes that are similar to my gene, e.g. genes that have a similar expression pattern, GO annotation, pathways, etc.

Also, given multiple lists, show me where they intersect through a given element eg. Pathways. This is similar, in principle, to the InterMod tool which does this across species. Here the lists may come from one (or more) species.
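The list-intersection part of this request can be pictured with a small sketch. It assumes the element of interest (for example, pathway names) has already been collected per list; `ListIntersection` is a hypothetical name and nothing here reflects how InterMod or InterMine implement it.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of InterMine: finds elements common to several lists. */
final class ListIntersection {

    /**
     * Each entry of elementsPerList holds the elements (e.g. pathway names)
     * reachable from one gene list. Returns the elements present for every list.
     */
    static Set<String> sharedElements(List<Set<String>> elementsPerList) {
        if (elementsPerList.isEmpty()) {
            return Collections.emptySet();
        }
        // Start from a copy of the first set and keep only what the others also contain.
        Set<String> shared = new TreeSet<>(elementsPerList.get(0));
        for (Set<String> elements : elementsPerList.subList(1, elementsPerList.size())) {
            shared.retainAll(elements);
        }
        return shared;
    }
}
```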

Regions Tab not highlighted

The Regions Tab is not highlighted after clicking on it. All other tabs are highlighted when clicked on; discovered with Firefox.

Table element style 'max width' too small

Source: Flymine

Problem: In-line table text truncated - element style 'max-width' too small.

To reproduce:
Gene report page - D. mel Cks30A

Interactions --> Toggle table. All table text truncated at 2nd character.

<a class="im-cell-link" href="http://www.flymine.org/release-34.0/report.do?id=103034350" data-original-title="" style="max-width: 9.285714285714285px; ">

        cdc2
    </a>

GOMine: "with" depends on publication and evidence code

The current model assumes (wrongly) that there is a single unique "withText" for a given GO annotation, and that it holds pipe-separated "with" objects that need to be added to the "with" collection.

This is probably the way to model it (check the GAF 2.0 format guide):

  • A GO ID can have one or more evidence codes that can come from one or more publications.
  • Each evidence code can have one or more "with" values.
  • The evidence code + "with" combination is specific to that publication.

example data:

gene_association.goa_human:UniProtKB Q13426 XRCC4 GO:0005515 PMID:16275660 IPI UniProtKB:O00499 F DNA repair protein XRCC4 XRCC4_HUMAN|XRCC4 protein taxon:9606 20120723 IntAct
gene_association.goa_human:UniProtKB Q13426 XRCC4 GO:0005515 PMID:16439205 IPI UniProtKB:Q7Z2E3 F DNA repair protein XRCC4 XRCC4_HUMAN|XRCC4 protein taxon:9606 20060203 UniProtKB
gene_association.goa_human:UniProtKB Q13426 XRCC4 GO:0005515 PMID:16439205 IPI UniProtKB:Q9H9Q4 F DNA repair protein XRCC4 XRCC4_HUMAN|XRCC4 protein taxon:9606 20120723 IntAct
gene_association.goa_human:UniProtKB Q13426 XRCC4 GO:0005515 PMID:16439205 IPI UniProtKB:Q9H9Q4 F DNA repair protein XRCC4 XRCC4_HUMAN|XRCC4 protein taxon:9606 20061120 UniProtKB
gene_association.goa_human:UniProtKB Q13426 XRCC4 GO:0005515 PMID:17396150 IPI UniProtKB:Q8IW19 F DNA repair protein XRCC4 XRCC4_HUMAN|XRCC4 protein taxon:9606 20090814 UniProtKB
gene_association.goa_human:UniProtKB Q13426 XRCC4 GO:0005515 PMID:9809069 IPI UniProtKB:P49917 F DNA repair protein XRCC4 XRCC4_HUMAN|XRCC4 protein taxon:9606 20070820 UniProtKB
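A minimal sketch of the proposed grouping, assuming tab-separated GAF 2.0 lines (object ID in column 2, GO ID in column 5, reference in column 6, evidence code in column 7, With/From in column 8) and without the file-name prefix shown in the example data. `GafWithGrouper` is a hypothetical name, not a class in the InterMine code base, and splitting "with" values on the pipe only is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: groups "with" values per object, GO ID, evidence code and reference. */
final class GafWithGrouper {
    // 0-based indexes of the GAF 2.0 columns used here.
    static final int OBJECT_ID = 1;
    static final int GO_ID = 4;
    static final int REFERENCE = 5;
    static final int EVIDENCE_CODE = 6;
    static final int WITH = 7;

    /** Keys look like "Q13426 GO:0005515 IPI PMID:16439205"; values are the "with" entries. */
    static Map<String, Set<String>> group(List<String> lines) {
        Map<String, Set<String>> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("!")) {
                continue; // skip blank lines and GAF header/comment lines
            }
            String[] cols = line.split("\t", -1);
            if (cols.length <= WITH) {
                continue; // too few columns to carry a "with" value
            }
            String key = cols[OBJECT_ID] + " " + cols[GO_ID] + " "
                    + cols[EVIDENCE_CODE] + " " + cols[REFERENCE];
            Set<String> withValues = grouped.computeIfAbsent(key, k -> new LinkedHashSet<>());
            for (String with : cols[WITH].split("\\|")) {
                if (!with.isEmpty()) {
                    withValues.add(with);
                }
            }
        }
        return grouped;
    }
}
```

Applied to the example rows, the two PMID:16439205 annotations would share one key holding both UniProtKB:Q7Z2E3 and UniProtKB:Q9H9Q4, while each other publication keeps its own "with" value.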

Interactions remodel

The goal is to make interaction data:

  1. easy to query
  2. easy to understand results
    • reduce duplication of results

Solution:

Moved all duplicate data to the "details" collection, e.g. Gene.interactions.experiment moved to Gene.interactions.details.experiment. Now each gene1 and gene2 pair is unique.

See http://preview.flymine.org for an example.
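The effect of the remodel can be illustrated with a small grouping sketch: flat rows that repeat a gene pair collapse into one entry per pair, with the repeated data collected as details. `InteractionGrouper` and `Row` are hypothetical names, only the experiment is carried as a detail, and the pair is treated as ordered; the real model holds more fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the "details" remodel, not InterMine code. */
final class InteractionGrouper {

    /** One flat interaction row as it looked before the remodel. */
    static final class Row {
        final String gene1;
        final String gene2;
        final String experiment;

        Row(String gene1, String gene2, String experiment) {
            this.gene1 = gene1;
            this.gene2 = gene2;
            this.experiment = experiment;
        }
    }

    /** Returns one entry per "gene1|gene2" pair, holding that pair's experiments as details. */
    static Map<String, List<String>> groupByPair(List<Row> rows) {
        Map<String, List<String>> detailsByPair = new LinkedHashMap<>();
        for (Row row : rows) {
            String pair = row.gene1 + "|" + row.gene2;
            detailsByPair.computeIfAbsent(pair, k -> new ArrayList<>()).add(row.experiment);
        }
        return detailsByPair;
    }
}
```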

Further work:

Moving to "details" has removed duplication but our other problems are not solved.

galaxy export not working in YeastMine production (0.98 version)

09/08/12,11:49:44 ERROR org.apache.struts.action.RequestProcessor - Invalid path was requested /galaxyExportOptions

I did not change any galaxy related settings. I see the above error in logs and 'Oops page not found' error when I export to galaxy. Not sure when it stopped working.

Same thing in 1.0 beta as well.

Allow users to share lists

Currently users can create and save lists in their own personal user account. It would be nice if users could share their lists with others.

Javascript query builder

A simple javascript query builder would be nice, something similar to accordb (accordb-intermine.rhcloud.com) for a single data set. Maybe pathways? With other data sets to follow.

The current querybuilder allows the querying of all tables in the database. This querybuilder would only query some fields in some data sets, and would be only a little more complex than a template query form.

The difficult part would be that this query builder should have the option to query across all of the MODs' intermines.

  1. start with a single data set, eg. pathways
  2. query across all mines (optionally)
  3. looks like accordb
  4. put on crossmodel.org
  5. simple interface, like template query forms
  6. use results tables
    • results / creating lists should link back to the mines though
  7. eventually this will be part of the intermine webapp but not any time soon, very beta
  8. limited functionality (compared to current querybuilder)

Jumpy display: new table controls on report pages

If you have a template displaying on a report page (as a new-style result table), clicking any of the column controls on that table causes the browser to jump to the top of the page.

To reproduce: On FlyMine, go to the gene report page for adh. Scroll down to "Gene --> Orthologues + GO terms of these orthologues." and open the table display. Then sort or toggle a column; when you click, the display jumps to the top of the page.

Special characters (in source data) causing problems with results pages

Source: Flymine preview

Special character (é) - present in source data - breaks results page filtering.

Problem:
Flymine 'Interactions Experiment '
Externally curated molecular interaction network in Drosophila melanogaster.

contains field 'contact-comment' with non-ASCII characters
CNRS, Equipe Bioinformatique des réseaux de régulation, Institut de Biologie du Développement de Marseille-Luminy, 13288 Marseille Cedex 9 (http://www.ibdm.univ-mrs.fr/)

To reproduce:

<query name="" model="genomic" view="Gene.symbol Gene.interactions.details.confidenceText Gene.interactions.details.type Gene.interactions.details.role2 Gene.interactions.details.role1 Gene.interactions.details.confidence Gene.interactions.details.name Gene.interactions.details.experiment.name Gene.interactions.details.experiment.description Gene.interactions.details.experiment.comments.type Gene.interactions.details.experiment.comments.description" longDescription="" sortOrder="Gene.symbol asc">
  <constraint path="Gene.symbol" op="=" value="cdc2"/>
</query>

Column summary on 'Gene.interactions.details.experiment.comments.description' and apply filter on 'CNRS, Equipe Bioinformatique des... ' etc.

No rows are displayed but filtering on other values works fine.

Remove "Choose" from Columns dialog

In the Manage Columns dialog, when you browse for and add a column to the results, you have to click "Choose" and then click "Apply". I persistently forget to click "Choose" (despite the repeated pain it causes). Is it necessary? Could we omit that button? Thanks!

Queries with outer joins fail

Go to the report page for a chromosome. From the on-page table of located features, click "Show all in a table". You get an "oops" error message. (Verified locally and at flymine.)

Here's the error log message:

Caused by: java.lang.IllegalAccessException: Couldn't get field "[interface org.intermine.model.bio.Location].primaryIdentifier" (available fields are [dataSets, end, feature, id, locatedOn, start, strand])
at org.intermine.util.TypeUtil.getFieldValue(TypeUtil.java:105)
at org.intermine.model.bio.LocationShadow.getFieldValue(LocationShadow.java:89)
at org.intermine.webservice.server.core.TableCell.getField(TableCell.java:37)
... 32 more
Caused by: java.lang.NullPointerException
at org.intermine.util.TypeUtil.getFieldValue(TypeUtil.java:96)
... 34 more

"Summarise" link next to templates is still active when webapp is released

  1. run summarise-objectstore postprocess
  2. release webapp
  3. log in as superuser
  4. navigate to my mine > saved templates

All of the "summarise" links next to each postprocess should be inactive / not-clickable.

This is working correctly in FlyMine but not modMine.

  1. Are the templates actually summarised but just the links are wrong?
  2. If not, did the "do-summarise" ant target (run as part of release-webapp) not run?
  3. If the do-summarise target ran, why did it fail?
  4. Did the summarise-objectstore postprocess fail for some reason?

Create list on widgets does not work

When you click on a widget and click on "create list", nothing happens.

Also, when you click on download at the top, nothing happens. Download works correctly when you click on the individual items; however, creating a list never works.

Customise export options in webapp

This concerns the 1.0 "Download Pop-Up" shown when viewing the Query Results page. Currently the download options display all formats (tab, xml, gff3, bed, fasta …), but given the context of the query results data some of the displayed formats don't make sense (i.e., if there is no sequence data, then selecting the fasta format produces meaningless results).

Also, is there a way to configure the format options so that "fasta" isn't displayed (we're not planning on providing sequence data anytime soon)?

Query for empty collections / references

In the query builder you can query for attributes being NULL but not references or collections.

I understand the code for this is ready to be tested but we need docs.

OverlappingFeaturesDisplayer may not display all features when selecting 'show all in a table'

We have configured our SequenceFeature class with the field config

<fieldconfig fieldExpr="sequenceOntologyTerm.name" label="Type"/>

This causes a javascript error* when GeneFlankingRegions are displayed in a table using the 'show all in a table' link from the OverlappingFeaturesDisplayer. The resulting table is empty or missing rows.

* From Firebug:
    error imjs.js (line 943)
    (an empty string) imjs.js (line 945)

Extra character escaping bug

It looks like the new result table display is applying an extra round of character escaping in some circumstances.

Here's the scenario. Mouse allele nomenclature uses angle brackets. These strings are encoded at load time, so that "Cftr<tm1Bay>" is stored as "Cftr&lt;tm1Bay>". For the most part, these display fine, i.e., showing the left angle bracket. For example, on the Cftr gene report page, its table of alleles looks fine, as do the allele report page, index search results, etc.

However there is one case where (it looks like) the strings are escaped again, so that the left angle brackets display as "&lt;". This is when I use the query builder. If I return allele symbols, they all come out like this: "Cftr&lt;tm1Bay>"
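One way the symptom could arise is sketched below, purely as an illustration: if a string that was already escaped at load time is passed through an HTML escaper again, the ampersand of the stored entity is itself escaped, and the browser then shows the entity text instead of the bracket. `EscapeDemo` is a hypothetical class; the issue does not confirm that this is the actual cause.

```java
/** Hypothetical demonstration of double escaping, not InterMine code. */
final class EscapeDemo {

    /** Escapes the three characters that matter for this example. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Escaping the stored value `Cftr&lt;tm1Bay>` a second time yields `Cftr&amp;lt;tm1Bay&gt;`, which a browser renders as the literal text `Cftr&lt;tm1Bay>`, matching what the query builder results show.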

Configure some attributes/references to hide in the QueryBuilder

Specifically, mitominer wants to hide Gene.ontologyAnnotations because this collection is a duplicate of Gene.goAnnotations. This confuses their users.

This didn't work:

  <class className="org.intermine.model.bio.SequenceFeature">
    <fields>
      <fieldconfig fieldExpr="primaryIdentifier" />
      <fieldconfig fieldExpr="organism.name" name="Organism"/>
      <fieldconfig fieldExpr="ontologyAnnotations" hideInQueryBuilder="true" />
    </fields>
  </class>

Suggestion: QueryBuilder, model browser tree

In the QueryBuilder, it would be very handy if the model browser (left hand) tree would expand to the corresponding point when a node is clicked in the query (right hand) tree. I realize the whole QueryBuilder is up for redesign, but I do think this simple enhancement would be extremely helpful in many circumstances.

Result Table popups extend beyond visible area

The popups that show additional information about an object sometimes extend beyond the visible area. Eg, the popup for the bottom row extends below the visible area. Sometimes (but not always) they extend off the right or left side.

Also, the text within the popup often extends out of the visible area. Long paths exacerbate this.

Column dialog freezes

From a result table, click on "Manage Columns".
Browse for a column and select it.
Click Choose but do NOT click Apply.
Click Browse for column again.
The button freezes.

You can close/reopen the dialog to get it working again.

Keyword search autocomplete enabled

Would it be possible to have autocomplete available on keyword search? Is this something that can be configured, or does it need to be implemented? If it needs to be implemented, would that be possible? We think it is a useful feature to have.

Multiple admins

Currently you can only have one user be the superuser but there should be a way to have more than one.

Error running query with "ONE OF" constraint

The following query (return basic gene info for (lookup) Fgf4 where species is one of mouse or human) does not display. The javascript console shows lots of 500 server errors. Same thing happens at FlyMine.

<query name="" model="genomic" view="SequenceFeature.symbol SequenceFeature.name SequenceFeature.primaryIdentifier SequenceFeature.sequenceOntologyTerm.name SequenceFeature.locations.locatedOn.primaryIdentifier SequenceFeature.locations.start SequenceFeature.locations.end SequenceFeature.locations.strand SequenceFeature.organism.shortName" longDescription="Return genome features that match a specified string. (ID, symbol, name, synonym) for the specified species." sortOrder="SequenceFeature.symbol asc" constraintLogic="A and B">
  <constraint path="SequenceFeature.organism.shortName" code="B" op="ONE OF">
    <value>H. sapiens</value>
    <value>M. musculus</value>
  </constraint>
  <constraint path="SequenceFeature" code="A" op="LOOKUP" value="Fgf4"/>
</query>

New widget type - scatterplot

Joel would like to have a scatterplot.

x-axis: human genome y-axis: mouse genome

dots would be where there is a mouse-human orthologue pair

getcode file empty?

From Results Table --> get code --> Javascript. The file that is saved, called code, appears to be empty.

GOMine: Store all publications not just ones with pubmed IDs

Few issues:

  1. It looks like the current GoConverter only stores publications where a pipe ("|") appears in the reference, i.e. only references like this one are stored:

SGD_REF:S000073372|PMID:6759872

If there is just a PMID I think it won't be stored, e.g. in the PomBase file:

PomBase SPCC1884.02 nic1 GO:0005887 PMID:10748059

  2. Non-PMID references are not stored, e.g.:

ZFIN:ZDB-PUB-020724-1
FB:FBrf0159398

So, for GOMine I want to be able to store non PMID references and PMID references. It looks like sometimes there is a database reference for PMID (mod files), and a given PMID might have different databasereference IDs in different GAF files. for example PMID x refers to MGI:Ref1 and SGD:Ref2
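The reference handling asked for here could be sketched as follows. `GafReferenceParser` and `ParsedReferences` are hypothetical names and do not reflect the actual GoConverter code; the sketch assumes the reference column holds one or more `DB:ID` entries separated by pipes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a GAF reference column into PubMed IDs and other references. */
final class GafReferenceParser {

    /** Result holder: PubMed IDs without the "PMID:" prefix, other references unchanged. */
    static final class ParsedReferences {
        final List<String> pubMedIds = new ArrayList<>();
        final List<String> otherReferences = new ArrayList<>();
    }

    /** Accepts a single reference or several separated by pipes; both kinds are kept. */
    static ParsedReferences parse(String referenceColumn) {
        ParsedReferences parsed = new ParsedReferences();
        for (String reference : referenceColumn.split("\\|")) {
            String trimmed = reference.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (trimmed.startsWith("PMID:")) {
                parsed.pubMedIds.add(trimmed.substring("PMID:".length()));
            } else {
                parsed.otherReferences.add(trimmed);
            }
        }
        return parsed;
    }
}
```

With this shape a lone `PMID:10748059` and a lone `ZFIN:ZDB-PUB-020724-1` are both retained, rather than only references that contain a pipe.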

I am trying to model like so...(xml below is not showing up...)

<class name="GOEvidence" is-interface="true">
  <reference name="code" referenced-type="GOEvidenceCode"/>
  <collection name="publications" referenced-type="Publication"/>  
  <collection name="dbreferences" referenced-type="DatabaseReference"/> 
  <collection name="pubdbrefs" referenced-type="PublicationDatabaseReference"/> 
</class>
 <class name="PublicationDatabaseReference" is-interface="true">
  <reference name="pubRefId" referenced-type="Publication"/>
  <reference name="dbRefId" referenced-type="DatabaseReference"/>
</class> 
<class name="DatabaseReference" is-interface="true">
  <attribute name="databaseReferenceId" type="java.lang.String"/>
</class> 

This requires modifying GoConverter (storeEvidence() method) to add the right collection and not just "publications"...

if (!publicationEvidence.isEmpty()) {
    goevidence.setCollection("publications", publicationEvidence);
}

Skyped Julie about this, making a ticket.
