
intermine-ws-python's Introduction

The InterMine Python Webservice Client

An implementation of a webservice client for InterMine webservices, written in Python

Who should use this software?

This software is intended for people who use InterMine data warehouses (e.g. biologists) and who want a quicker, more automated way to perform queries. Sites powered by InterMine software offer a compatible webservice API.

See the InterMine registry for the full list of available InterMines.

Queries here refer to database queries over the integrated data warehouse. Instead of SQL, InterMine services use a flexible and powerful subset of database query language that enables wide-ranging, arbitrary queries.

Requirements

This package is compatible with both Python 2.7 and 3.x. We plan to drop 2.7 support next year.

Downloading:

The easiest way to install is to use pip:

  pip install intermine

The client is also available via bioconda.

Running the Tests:

To run the test suite against a local mock server, execute the following command from the source directory:

  python setup.py test

To run the suite against a live server (e.g. as deployed by testmodel/setup.sh in the InterMine distribution), run:

  python setup.py livetest

By default this will use the location http://localhost:8080/intermine-demo/service. If you want it to use a different service, set the service URL in the TESTMODEL_URL shell environment variable.
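As a rough illustration of the fallback described above, a test helper might resolve the service URL as follows. The function name resolve_test_service_url is hypothetical and is not taken from the client's test suite; only the TESTMODEL_URL variable and the default location come from the text above.

```python
import os

# Default location quoted in the paragraph above.
DEFAULT_TESTMODEL_URL = "http://localhost:8080/intermine-demo/service"


def resolve_test_service_url(environ=None):
    """Return the service URL for live tests (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Fall back to the default when TESTMODEL_URL is unset or empty.
    return env.get("TESTMODEL_URL") or DEFAULT_TESTMODEL_URL
```

Passing a plain dict instead of os.environ makes the behaviour easy to check without touching the real shell environment.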

Installation:

Once downloaded, you can install the module with the command (from the source directory):

  python setup.py install

Further documentation:

We have detailed tutorials:

Extensive documentation is available through the "pydoc" command, e.g.:

  pydoc intermine.query.Query

Also see:

Before Making PRs:

Please run autopep8 on your files so your code will follow the pep8 conventions. Do it as follows:

  autopep8 <filenames> --in-place

For example:

  autopep8 file1.py file2.py  --in-place

Changes:

1.13.00   Added get_template_by_user method
          Added property view_types to the template
1.12.00   Added organism to list upload
          Support for Python 3.7
          Added getVersion for registry mines
          Added display method for list object
          Added get_anonymous_token method for service object
          Make registry getMines print all when no organism is given
          Added dataframe method for query object
1.11.00   Added Query Manager
1.10.00   Added registry features
1.09.09   Add Python 3 support
1.09.06   Dual license under BSD as well as LGPL
1.07.00   Provide ListManagers as context managers, where users need to create
          temporary lists and clean up after themselves.
          Added ID resolution.
1.05.00   Allowed constraints to be added on root paths implicitly, eg:
          q.where('LOOKUP', 'eve') or q.where('IN', 'My List')
1.01.00   Added widget listing requests.
1.00.00   Added widget enrichment requests.
0.99.08   Added simpler constraint definition with kwargs.
0.99.07   Fixed bugs with lazy reference fetching handling empty collections and null references.
0.99.06   Fixed bug whereby constraint codes in xml were being ignored when queries were deserialised.
0.99.05   Allow template parameters of the form 'A = "zen"', where only the value is being replaced.
0.99.04   Merged 'list.to_query' and 'list.to_attribute_query' in response to the changes in list upload behaviour.
0.99.03   Allow query construction from Columns with "where" and "filter"
          Allow list and query objects as the value in an add_constraint call with "IN" and "NOT IN" operators.
          Ensure lists and queries share the same overloading
0.99.02   Allow sort-orders which are not in the view but are on selected classes
0.99.01   Better representation of multiple sort-orders.
0.99.00   Fixed bug with subclasses not being included in clones 
          Added support for new json format for ws versions >= 8.
0.98.16   Fixed bug with XML parsing and subclasses where the subclass is mentioned in the first view.
          better result format documentation and tests
          added len() to results iterators
          added ability to parse xml from the service object (see new_query())
          improved service.select() - now accepts plain class names which work equally well for results and lists
          Allowed lists to be generated from queries with unambiguous selected classes.
          Fixed questionable constraint parsing bug which led to failed template parsing
0.98.15   Added lazy-reference fetching for result objects, and list-tagging support
0.98.14   Added status property to list objects
0.98.13   Added query column summary support

Copyright and Licence

Copyright (C) 2002-2018 InterMine

All code in this project is dual licensed under the LGPL version 3 license and the BSD 2-clause license.

Please cite

InterMine: extensive web services for modern biology.

Kalderimis A, Lyne R, Butano D, Contrino S, Lyne M, Heimbach J, Hu F, Smith R, Stěpán R, Sullivan J, Micklem G.

Nucleic Acids Res. 2014 Jul;42(Web Server issue):W468-72

intermine-ws-python's People

Contributors

alexkalderimis, arun-y99, asherpasha, barhenkro, heralden, mbasil09, niveditarufus, nkengawoh, nupurgunwant, sanat-mishra, svengato, theantisnipe, yochannah

intermine-ws-python's Issues

[documentation update - easy task] easy_install is deprecated - remove from readme instructions

Let's tell people to use bioconda or pip instead of easy_install.

Deprecation notice: https://setuptools.readthedocs.io/en/latest/easy_install.html

if you want to pick up this task

  1. Take a look at the InterMine contributing guidelines
  2. Comment on this issue stating that you intend to work on the task
  3. When you're ready, add your work to the repo and create a pull request.

What to do if you need help

Mention @yochannah, tweet @yoyehudi, pop by to say hi on chat or if needed email [email protected]. Don't forget we're usually only available during uk office hours and will not be able to respond at other times :)

Allow to execute a template modifying an existing editable constraint

It's not possible to run a template modifying the value of an editable constraint.
If a template has an editable constraint with code A, using the method template.add_constraint (same constraint field, same constraint operator, different constraint value, code="A") will add a new constraint with code B, and when you execute template.results it will return an error (which is correct because it's not possible to add a new constraint in a template)

List upload: don't create empty lists

The InterMine python client allows users to create lists of things (e.g. lists of genes). If the list size is zero, the list should not be created.

See https://mybinder.org/v2/gh/intermine/intermine-ws-python-docs/master?filepath=09-tutorial.ipynb for more info on the process to create a list, or look at the full set of tutorials here: https://github.com/intermine/intermine-ws-python-docs.

If someone uses intermine-ws-python to try to create a list with 0 items, we should prevent this and give them a sensible clear error, e.g. "Lists must have one or more elements - the current list has 0".
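A minimal sketch of the kind of guard this issue asks for is shown below. The helper name check_list_not_empty is hypothetical and does not exist in the client; where such a check would sit inside the list-creation code is left open. Only the error message is taken from the suggestion above.

```python
def check_list_not_empty(items):
    """Raise ValueError if there is nothing to upload (hypothetical helper)."""
    # Materialise the input so generators and other iterables can be counted.
    items = list(items)
    if len(items) == 0:
        raise ValueError(
            "Lists must have one or more elements - the current list has 0"
        )
    return items
```

Calling the check before any request is sent would mean that no empty list is ever created on the server.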

Update of Documentation

At present we have the docs hosted via pydocs, which will be in html files in a separate folder named 'docs' after the merge of #15.

Originally, the documentation was hosted here but it hasn't been updated with the current registry update. This is because the documentation aspect of PyPI is undergoing some changes and doesn't currently allow documentation to be updated. Refer to pypi/warehouse#582.

As soon as this gets fixed, pypi can again be used as a way to keep the documentation.

Registry methods - get all intermines

The registry methods should allow us to get a list of all intermines without requiring an organism name. Can we add this method? It would have been useful when we were working on a Galaxy tool for the registry.

:)

travis ci build - upgrade pypy version to be 2.9 or greater

Our travis ci build is failing on the two pypy settings because the default version of pypy for trusty is too old - it's 2.8 and needs to be at least 2.9.x in order to install pandas correctly. See https://travis-ci.org/intermine/intermine-ws-python/jobs/580760283 for an example of the failure.

This links to a list of different Python and PyPy versions: travis-ci/docs-travis-ci-com#1249 (comment)

To test this, you could enable travis ci for your fork of this repo, or create a WIP (work in progress) PR - either is fine.

List upload: add organism


I am currently trying to create lists with Python rather than by uploading them here, and I lose genes along the way. I found the problem by manually copying and pasting my list into these two pages:
- http://www.humanmine.org/humanmine/begin.do
- http://www.humanmine.org/humanmine/bag.do?subtab=upload
On the first page, only a few genes are picked up automatically when I create a human list, and I have to select the others manually, because it also offers genes from M. musculus and R. norvegicus. In Python everything is automatic, so I cannot make that selection and end up with a list containing only a few genes.
On the second page, I get the proper list because I can select the organism (H. sapiens) before submitting the list. Unfortunately, I cannot find a way to select the organism when creating a list with Python using .createlist().

List upload: permissions issue

From @rachellyne

I created two lists in humanmine using lm.create_list(file, list_type="Gene", name=listname, description=description), where file is the path to the file. It created two empty lists that I can't delete; it says "The query is well formatted but you do not have access to the following mentioned lists: PL_GenomicsEngland_GenePanel:Stickler_syndrome".

Issue with function query_to_barchart_log in intermine/bar_chart.py

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-ea8f991734d6> in <module>
----> 1 b.query_to_barchart_log('<query model="genomic" view="Gene.name Gene.symbol Gene.length" sortOrder="Gene.name ASC" ><constraint path="Gene.length" op="&lt;" value="450074" /><constraint path="Gene.name" op="=" value="translation initiation factor IF-2-like" /></query>','true')

~/virtualenv/python3.5.6/lib/python3.5/site-packages/intermine/bar_chart.py in query_to_barchart_log(xml, resp)
    237     ax = y.plot(kind='bar')
    238     ax.set_title(list[0][0])
--> 239     ax.set_xlabel(l[1])
    240     if resp == 'true':
    241         ax.set_ylabel('log(' + l[2] + ')')

NameError: name 'l' is not defined

Intermine and Python 3.11

Love the package! Noticed Intermine isn't compatible with Python 3.10 or 3.11 (same issue as in #96). Any chance of compatibility for both 3.11 and 3.10?
Thanks in advance.
Best Lucas

Web service change in 2.0: xml param --> query param

I changed my mind! See intermine/intermine#1676

To remind you, we have a parameter called xml that is for the InterMine query. However, we now accept JSON queries! I have changed the name of the parameter to be query instead.

However, this messes up your code as you are using the xml parameter. Instead, can you do this:

if version >= 27, then parameter = query
else parameter = xml

This will use xml for current mines, and switch to query for newer mines.
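The version switch above could be sketched in Python roughly as follows. The function names and the idea of building a parameter dict are assumptions made for illustration, not the client's actual code; only the threshold of 27 and the two parameter names come from the issue.

```python
# Threshold taken from the pseudocode in the issue.
QUERY_PARAM_MIN_VERSION = 27


def query_param_name(version):
    """Pick the request parameter name for a mine's web service version."""
    return "query" if version >= QUERY_PARAM_MIN_VERSION else "xml"


def build_query_params(version, serialized_query):
    """Build request parameters (hypothetical helper, not the client's API)."""
    return {query_param_name(version): serialized_query}
```

Keeping the decision in one small function means the rest of the request code does not need to know which name a given mine expects.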

I know it's not very elegant!

[python + travisCI] Automatically deploy and generate documentation

HTML documentation for our library can be generated using the pydoc -w command.

It would be nice to use travis or github actions to automatically generate docs and deploy them to the gh-pages branch of this repo.

This should only happen for the master branch.

Go right from query results to a data frame

Right now when we return query results, they're returned as an iterator. If you wish to work with query results in a dataframe, you have to manually build the dataframe step by step in a for loop, which can be a pain.

We should add a backwards-compatible way of getting query results in a dataframe - so our old tutorials and code should still work, but perhaps add another argument or use different methods to fetch the results as a dataframe.

Presumably this will make us require pandas by default.

We have a tutorial here that shows how to handle intermine query data: https://mybinder.org/v2/gh/intermine/intermine-ws-python-docs/master?filepath=02-tutorial.ipynb and there is more info in our other tutorials as well (scroll down and look at the list of tutorials in the readme: https://github.com/intermine/intermine-ws-python-docs/ ).
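To make the manual step concrete, here is a sketch of collecting rows from an iterator into columns. It assumes each row supports lookup by column name; the helper rows_to_columns and the sample rows (with made-up lengths) are purely illustrative and stand in for a real result iterator.

```python
def rows_to_columns(rows, column_names):
    """Collect an iterator of row mappings into a dict of column lists."""
    columns = {name: [] for name in column_names}
    for row in rows:
        for name in column_names:
            columns[name].append(row[name])
    return columns


# Stand-in rows with made-up values; real rows would come from a query.
sample_rows = iter([
    {"Gene.symbol": "eve", "Gene.length": 100},
    {"Gene.symbol": "zen", "Gene.length": 200},
])
columns = rows_to_columns(sample_rows, ["Gene.symbol", "Gene.length"])
```

A dict of equal-length lists like this is a shape that pandas.DataFrame accepts directly, so a dataframe-returning method could wrap this kind of loop and leave the existing iterator interface untouched.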

small import issues. collections.abc

I don't see any guidance on which Python version to use but, with Python 3.10.8, things don't work out of the box. A query I generated from the MouseMine website (which seems to produce Python 2-style print...) has:

from intermine.webservice import Service

This fails in its import block; I had to manually edit webservice.py to use "collections.abc" instead of "collections".

Not sure if developer intent is to use a specific python version, or if this should be fixed, or handled for different python versions.

Set up autopep8 as a task and add guidelines to the readme

Since we have the pep8speaks bot commenting on our issues, we ought to make it easy for people to format things in a way that makes pep8 happy. This project looks like it would do it: https://pypi.org/project/autopep8/

task

  • add autopep8 to requirements.txt
  • add instructions to the readme file telling people how to run it before making PRs.

how to download fasta file through python API?

Hi,

I would like to download the transcript sequences for a list of genes from PhytoMine.

Could anyone give me some suggestions on how to fetch the sequences from the server with the Python API?

My best,

Ming

Update README to include new ways of installing client

We want to update the README to have instructions on how to install the client.

  1. Link to bioconda docs
  2. Show how to install via PyPI: sudo easy_install intermine

Don't make people download directly. If they are at github, they'll know they can make a clone of the repo!

[easy task] Create getVersion function to return the version of a given InterMine

We have a method, getInfo, that fetches the version of a given InterMine from the registry here: https://github.com/intermine/intermine-ws-python/blob/dev/intermine/registry.py#L11

The getInfo method only prints results to the screen. It would be nice to have a similar method called getVersion that returns the result (rather than printing) so it can be used programmatically, and without the extra bits printed to the screen.

Labelled as easy since much of this method can be copied from getInfo.

Ideally the return response should contain keys for the following three version options:

API Version: 25
Release Version: 45.1 2017 August
InterMine Version: 1.8.5
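One possible shape for the returned value is sketched below. The input key names (api_version, release_version, intermine_version) and the helper name extract_versions are assumptions for illustration and may not match the registry's actual response or the client's code; the three output labels and sample values come from the list above.

```python
def extract_versions(mine_info):
    """Return the three version fields from a registry record.

    The input key names are assumptions made for this sketch.
    """
    return {
        "API Version": mine_info.get("api_version"),
        "Release Version": mine_info.get("release_version"),
        "InterMine Version": mine_info.get("intermine_version"),
    }


# Hypothetical registry record using the values quoted above.
example_record = {
    "name": "ExampleMine",
    "api_version": "25",
    "release_version": "45.1 2017 August",
    "intermine_version": "1.8.5",
}
versions = extract_versions(example_record)
```

Returning a plain dict keeps the result usable programmatically, and a printing method could format the same dict for the screen.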

Does python API support query with wildcard?

My code is like this:

from intermine.webservice import Service
service = Service("https://phytozome.jgi.doe.gov/phytomine/service")
query=service.new_query()
query.select("*")
query.add_constraint("Gene","LOOKUP","AT5G67180")

This code works fine, but if I use a wildcard, like:

query.add_constraint("Gene","LOOKUP","*5G67180")

the following error comes up:

Traceback (most recent call last):
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/site-packages/intermine/results.py", line 640, in open
return urlopen(req)
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/urllib/request.py", line 531, in open
response = meth(req, response)
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/urllib/request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/Users/myang/Ananconda/anaconda3/lib/python3.7/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error

Could anyone help to fix it? thanks a lot.

create a list using synonyms

There is no way to create a list using synonyms.
The method create_list(content=genes,list_type="Gene",name="a random name") in the list_manager only adds exact matches.
We should add an additional group 'SYNONYM' to the input param add.

intermine and Python 3.10

Using the Python API with Python 3.10 leads to the following error

from intermine.webservice import Service
Traceback (most recent call last):
  File "/Users/jmarie/opt/anaconda3/envs/intermine-test/lib/python3.10/site-packages/intermine/webservice.py", line 9, in <module>
    from urlparse import urlparse
ModuleNotFoundError: No module named 'urlparse'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jmarie/opt/anaconda3/envs/intermine-test/lib/python3.10/site-packages/intermine/webservice.py", line 16, in <module>
    from collections import MutableMapping as DictMixin
ImportError: cannot import name 'MutableMapping' from 'collections' (/Users/jmarie/opt/anaconda3/envs/intermine-test/lib/python3.10/collections/__init__.py)

No problem with Python version <3.10
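The traceback shows two import locations that differ between interpreter versions: urlparse moved to urllib.parse in Python 3, and the MutableMapping alias in collections was removed in Python 3.10 in favour of collections.abc. A common pattern, shown here as a sketch rather than as the patch applied to webservice.py, is to try the modern location first and fall back to the old one:

```python
try:
    # Python 3 location.
    from urllib.parse import urlparse
except ImportError:
    # Python 2 fallback.
    from urlparse import urlparse

try:
    # Available since Python 3.3 and required from 3.10 onwards.
    from collections.abc import MutableMapping as DictMixin
except ImportError:
    # Python 2 fallback.
    from collections import MutableMapping as DictMixin

# Quick check that both names are usable.
parsed = urlparse("http://localhost:8080/intermine-demo/service")
```

Trying the new location first means current interpreters never touch the removed names.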

Merge intermine-bio package into intermine-ws-python package

We have an intermine-bio package. Why?

https://github.com/intermine/intermine-ws-bio-python

I don't see a utility in keeping these separate. Just making another dependency. (@justinccdev may disagree?).

Let's merge the functionality into the main intermine package and delete this repo.

What does the bio package do? Here are the docs:

# Get all sequences for proteins on "h", "r", "eve", "bib" and "zen":

    from intermine.webservice import Service
    from interminebio import SequenceQuery

    s = Service("www.flymine.org/query")
    q = SequenceQuery(s, "Gene")

    syms = ["h", "r", "eve", "bib", "zen"] 

    print(q.select_sequence("proteins").where(s.model.Gene.symbol == syms).fasta())

# Process the locations of these genes one at a time:

    for line in q.select_sequence("Gene").where(s.model.Gene.symbol == syms).bed():
        process(line)

Here are the end points it uses:

query/fasta
query/gff
query/bed
  • region search (already available in main client?)
    LIST_PATH = "/regions/list"
    BED_PATH = "/regions/bed"
    FASTA_PATH = "/regions/fasta"
    GFF3_PATH = "/regions/gff3"

TODO

  1. copy over init and iterators files
  2. rename them something bio specific (I think we want a bio directory?)
  3. write tutorial
  4. test!

Update docs to indicate correct row formats available

I was messing about with pandas and plotly for fun at a study group and I thought I'd pull in some familiar data into a data frame. I wanted to get the results in a raw-ish format, so I delved into the massive tutorial-in-comments in query.py and thought I'd found what I wanted: a nice way to get the results as a simple string rather than an iterator, based on this comment here:

>>> for row in query.result("string")

         for row in query.result("string")
             print(row)

I think it was a trick, though! query.result("whatever") isn't a method and query.results("whatever") (note the additional s) IS a method but string isn't a valid arg.

Maybe we just need to remove that line ;)

[easier issue if you know python] apply pep8 formatting to all files

We have pep8 speaks set up to check files that are edited in PRs - but we still have some files that aren't checked because they haven't been edited in a while.

Task: Apply pep8 formatting to all files in this repo. You probably want to try a command something like autopep8 . -r --in-place.

Note that this command might not fix all long lines - some might need to be done manually.

calling `service.get_template()` triggers an error in Python 3.7

Executing service.get_template() in Python 3.7 results in RuntimeError: generator raised StopIteration. The example below is from accessing FlyMine; however, I have seen the same thing happen for YeastMine.
Demonstrating the issue:
Go to here and scroll down to the 'Tutorial 7: Templates' section. Click the launch binder badge there and try running the first cell after the session spins up. The error will come up, as the Binder instances currently default to Python 3.7.

Example output from the demonstration:
stopiteration_ error_with_template_example

I suspect it is related to the second item under 'Changes in Python Behavior' here. A possible fix to a related issue is shown here. It looks like incorporating that approach at line 477 of intermine/query.py may fix it.
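For context, since Python 3.7 (PEP 479) a StopIteration that escapes from inside a generator body is converted into RuntimeError. The sketch below reproduces that behaviour in isolation and shows the usual remedy of catching the exception and returning. Both functions are illustrative only and are not the code at the line mentioned above.

```python
def first_items_broken(iterables):
    """Yield the first item of each iterable; leaks StopIteration."""
    for it in iterables:
        # next() raises StopIteration on an empty iterable. Inside a
        # generator, Python 3.7+ turns that into RuntimeError.
        yield next(iter(it))


def first_items_fixed(iterables):
    """Same idea, but ends the generator with an explicit return."""
    for it in iterables:
        try:
            item = next(iter(it))
        except StopIteration:
            # Finish cleanly instead of letting the exception escape.
            return
        yield item
```

On Python 3.7 or later, consuming first_items_broken with an empty iterable in the input raises RuntimeError, while first_items_fixed simply stops at that point.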

Anonymous session tokens?

Does the Python client have any way to create anonymous session tokens? It could be useful for demo code where I didn't want to expose the token associated with my account and I didn't need the list operations I was performing to persist, long-term.

I couldn't see one, and ended up writing something like this (it's probably awful, given my python skills... ;)

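import requests
from intermine.webservice import Service

service_url = "http://www.humanmine.org/humanmine/service"
# Ask the /session endpoint for an anonymous session token
response = requests.get(url=service_url + "/session")
token = response.json()["token"]
service = Service(service_url, token=token)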

Take ownership of my repo?

I had an issue created against my repo since to the naive outsider it appears to be the main one. Would you like to take ownership of it (after merging back your progress of course)?
