anergictcell / pyhpo

A Python library to work with, analyze, filter and inspect the Human Phenotype Ontology

License: MIT License

Makefile 0.35% Python 99.65%

bioinformatics hpo hpo-similarity ontology

pyhpo's Issues

Translating existing strings into HPO terms

Hi,

I have sets of strings which are not yet in HPO form but should be translated into HPO terms. In many cases they are already very close to the HPO term name when compared as strings.

What would be the best way to map these strings to their corresponding HPO terms, ideally with an uncertainty estimate such as the number of character mismatches?

If I search via

for term in Ontology.search(MYOWNTERM):
    print(term.name)

will I get the best matches or are they sorted alphabetically?
Any pointers in general?
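
One way to get a ranked mapping with a rough confidence score is to compare each input string against the term names with the standard library's difflib. This is a minimal sketch and does not use pyhpo's own search: `rank_matches` is a hypothetical helper, the name list is placeholder data, and collecting the names with something like `[term.name for term in Ontology]` is an assumption about how you would feed it real data.

```python
import difflib


def rank_matches(query, names, n=3, cutoff=0.6):
    """Return up to n (name, score) pairs, best match first.

    The score is difflib's similarity ratio in [0, 1], compared case-insensitively.
    """
    scored = []
    for name in names:
        score = difflib.SequenceMatcher(None, query.lower(), name.lower()).ratio()
        if score >= cutoff:
            scored.append((name, round(score, 2)))
    # Highest similarity first, so the result is not alphabetical.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Placeholder names for illustration only.
hpo_names = ["Scoliosis", "Kyphoscoliosis", "Seizure", "Global developmental delay"]
matches = rank_matches("scoliosys", hpo_names)
print(matches)
```

The ratio is not a count of character mismatches, but it behaves similarly for short strings and gives a threshold (`cutoff`) below which a string can be flagged for manual review.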

Calculating information content from different datasets

Hello,

I have been using pyHPO to calculate similarity scores between patients within a dataset, using their clinical phenotype lists as HPOSets, i.e. HPOSet1.similarity(HPOSet2).
However, I am worried my analyses may be skewed because pyHPO calculates the information content used in the scoring algorithms based on the "kind" parameter: OMIM, orpha, decipher, or gene. Is there any way to create a "custom kind" of sorts, so that my patients' similarity scores are calculated using information content derived from my dataset of choice instead of these publicly available ones?

Any feedback would be greatly appreciated. Thanks!
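
For reference, information content derived from a custom cohort is usually defined as the negative log of the fraction of patients annotated with a term or any of its descendants. The sketch below computes that on toy data. It is not pyhpo code: `information_content` and the `ancestors` mapping are hypothetical, and how such values could be registered with pyhpo as a custom "kind" is not shown here.

```python
import math
from collections import Counter


def information_content(patient_terms, ancestors):
    """IC(t) = log(N / n_t), with n_t = patients annotated with t or a descendant."""
    counts = Counter()
    for terms in patient_terms:
        # Propagate every annotation to its ancestors; a patient counts once per term.
        expanded = set()
        for term in terms:
            expanded.add(term)
            expanded.update(ancestors.get(term, ()))
        counts.update(expanded)
    total = len(patient_terms)
    return {term: math.log(total / count) for term, count in counts.items()}


# Toy ontology: B and C are both children of A (placeholder IDs).
ancestors = {"B": {"A"}, "C": {"A"}}
patients = [{"B"}, {"C"}, {"B", "C"}, {"A"}]
ic = information_content(patients, ancestors)
print(ic)
```

Terms shared by every patient get an information content of 0, and rarer terms get higher values, which is what makes the scores dataset-specific.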

After install problem

Dear all.
I've just installed the package (via 'pip install pyhpo', also tried 'pip install pyhpo[all]') on two different platforms: Linux Mint with Python 3.8 in the Spyder IDE, and independently Windows 11 with Python 3.11.4 in the IDLE environment.
On both devices I get this error when trying to use the library (Ontology, HPOSet...)

Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    from pyhpo import Ontology
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyhpo\__init__.py", line 5, in <module>
    from pyhpo.term import HPOTerm
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyhpo\term.py", line 9, in <module>
    from pyhpo.similarity import SimScore
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyhpo\similarity\__init__.py", line 1, in <module>
    from pyhpo.similarity.base import SimScore
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyhpo\similarity\base.py", line 8, in <module>
    class _Similarity(BaseModel):
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pydantic\_internal\_model_construction.py", line 95, in __new__
    private_attributes = inspect_namespace(
  File "C:\Users\XXX\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pydantic\_internal\_model_construction.py", line 328, in inspect_namespace
    raise PydanticUserError(
pydantic.errors.PydanticUserError: A non-annotated attribute was detected: kind = 'omim'. All model fields require a type annotation; if kind is not meant to be a field, you may be able to resolve this error by annotating it as a ClassVar or updating model_config['ignored_types'].

For further information visit https://errors.pydantic.dev/2.0.1/u/model-field-missing-annotation

Thanks!
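
The error URL in the traceback points at pydantic 2.0.1, so a first diagnostic step is to confirm which pydantic major version is installed in the affected environment. The sketch below does that with the standard library only; `major_version` and `installed_major` are hypothetical helper names. Whether pinning pydantic below 2 or upgrading pyhpo resolves the import error is an assumption that this thread does not confirm.

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '2.0.1'."""
    return int(version.split(".")[0])


def installed_major(package):
    """Major version of an installed package, or None if it is not installed."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


pydantic_major = installed_major("pydantic")
if pydantic_major is None:
    print("pydantic is not installed in this environment")
elif pydantic_major >= 2:
    print("pydantic 2.x detected, the same major version as in the traceback")
else:
    print(f"pydantic {pydantic_major}.x detected")
```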

Unexpected values using JC and JC2 for hpo similarity

JC and JC2 return very different values in pyhpo, but similar values in hpo3. Also, the dataset I am using produces values that range from 0.1 to 134 with JC and from -0.1 to -9 with JC2. I've read that the JC score is supposed to be between 0 and 1, so these values wouldn't make sense. Are values outside of 0-1 expected?

I ran the code below using pyhpo versions 3.2.4, 3.2.5, 3.2.6 and hpo3 version 1.0.3.

I made the small example below to demonstrate the differences.

from pyhpo import HPOSet, Ontology

disease_name = "Neurodevelopmental disorder with central hypotonia and dysmorphic facies"
hpo_terms_to_compare = ['HP:0000218', 'HP:0000384', 'HP:0000842', 'HP:0001212', 'HP:0001274', 'HP:0009796']

# Load the ontology once before querying it
_ = Ontology()
omim_diseases = list(Ontology.omim_diseases)

# Collect the HPO terms annotated to the disease with the matching name
omim_disease_hpo = [list(x.hpo) for x in omim_diseases if disease_name == x.name]
omim_query = HPOSet.from_queries(omim_disease_hpo[0])
hpo_query = HPOSet.from_queries(hpo_terms_to_compare)

jc = hpo_query.similarity(omim_query, kind="omim", method="jc")
jc2 = hpo_query.similarity(omim_query, kind="omim", method="jc2")

print(jc)
print(jc2)

This code gives values of 139.57 and -4.31 when using pyhpo v3.2.6 and -4.31 and -4.31 when using hpo3. In addition, there seems to be a discrepancy between versions in pyhpo. Versions 3.2.4 and 3.2.5 both give values of 54.24 and -4.3.
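
As background for the question about ranges, the textbook Jiang-Conrath measure is a distance built from the information content (IC) of two terms and of their most informative common ancestor (MICA). Two common ways of turning it into a similarity are sketched below; the function names are hypothetical, and which convention pyhpo or hpo3 follows for "jc" and "jc2" is not something this sketch asserts.

```python
def jc_distance(ic_a, ic_b, ic_mica):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(MICA), always >= 0."""
    return ic_a + ic_b - 2 * ic_mica


def jc_similarity_bounded(ic_a, ic_b, ic_mica):
    """Similarity in (0, 1]: identical terms score 1, distant terms approach 0."""
    return 1 / (1 + jc_distance(ic_a, ic_b, ic_mica))


def jc_similarity_negated(ic_a, ic_b, ic_mica):
    """Similarity as negated distance: always <= 0, with 0 for identical terms."""
    return -jc_distance(ic_a, ic_b, ic_mica)


# Toy IC values, not taken from any real annotation set.
print(jc_distance(3.0, 5.0, 2.0))
print(jc_similarity_bounded(3.0, 5.0, 2.0))
print(jc_similarity_negated(3.0, 5.0, 2.0))
```

Under these definitions, negative scores are expected from a negated-distance variant, while large positive scores fit neither of the two conventions shown here.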

Different number of children for term than ebi HPO browser

pyhpo gives a different number of children for a term than the ebi browser.

For example, looking at term HP:0003674, the ebi HPO browser lists ~27 children and sub-children terms, but pyhpo appears to only list children with additional subchildren, and doesn't list the children terms:

for p in term.children:
    print(p)

HP:0003577 | Congenital onset
HP:4000040 | Puerpural onset
HP:0030674 | Antenatal onset
HP:0003623 | Neonatal onset
HP:0410280 | Pediatric onset
HP:0003581 | Adult onset

I'm not sure if I'm misusing pyhpo, or if it's perhaps using a different HPO version than the ebi browser?
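
One possible explanation is that `children` holds only the direct children, while the browser count includes all descendants. The sketch below shows the difference on a toy hierarchy. It relies only on each node exposing a `children` attribute, as in the loop above; the `Term` class and `all_descendants` helper are hypothetical stand-ins, not pyhpo classes.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Toy stand-in for an ontology term with direct children."""
    name: str
    children: list = field(default_factory=list)


def all_descendants(term):
    """Collect every term below `term`, visiting each one only once."""
    seen = {}
    stack = list(term.children)
    while stack:
        node = stack.pop()
        if node.name not in seen:
            seen[node.name] = node
            stack.extend(node.children)
    return list(seen.values())


# Toy hierarchy: root -> (child_a -> (leaf_1, leaf_2), child_b)
child_a = Term("child_a", [Term("leaf_1"), Term("leaf_2")])
root = Term("root", [child_a, Term("child_b")])

print(len(root.children))
print(sorted(t.name for t in all_descendants(root)))
```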

New version of HPO

Hello,

Last month HPO released an update. I tried to use the new version with this library, but it seems that it cannot parse the ontology.

I tried to update the files with

from pyhpo.update_data import download_data
download_data()

and I also created a directory with the new files manually downloaded, but in both cases I get an error:

Traceback (most recent call last):
  File "*/pruebahpo.py", line 2, in <module>
    _ = Ontology()
  File "*/lib/python3.10/site-packages/pyhpo/ontology.py", line 51, in __call__
    self._load_from_obo_file(data_folder)
  File "*/lib/python3.10/site-packages/pyhpo/ontology.py", line 380, in _load_from_obo_file
    for term in terms_from_file(data_folder):
  File "*/lib/python3.10/site-packages/pyhpo/parser/obo.py", line 121, in terms_from_file
    yield parse_obo_section(term_section)
  File "*/lib/python3.10/site-packages/pyhpo/parser/obo.py", line 137, in parse_obo_section
    key, value = line.split(':', 1)
ValueError: not enough values to unpack (expected 2, got 1)

This error appears when I try to do:

from pyhpo import Ontology
_ = Ontology()

term = Ontology.get_hpo_object('Scoliosis')
print(term)

It would be a pity if this library could not be used with the new versions of HPO. It could also be that I did something wrong, but this exact script worked with the previous version of the ontology.

Thank you for your help.
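
The ValueError in the traceback is what `line.split(':', 1)` produces when a line contains no colon, since the result then has only one element to unpack. A tolerant variant can be sketched with `str.partition`, which always returns three parts. The function names below are hypothetical and this is not the library's parser; which line in the new OBO file triggers the error is not known from this report.

```python
def parse_obo_line(line):
    """Split 'key: value' into a tuple; return None for lines without a colon."""
    key, sep, value = line.partition(":")
    if not sep:
        return None
    return key.strip(), value.strip()


def parse_section_tolerant(lines):
    """Parse the lines of one OBO stanza, skipping lines that are not key-value pairs."""
    pairs = []
    for line in lines:
        parsed = parse_obo_line(line)
        if parsed is not None:
            pairs.append(parsed)
    return pairs


# Sample stanza with one malformed line and one empty line.
section = ["id: HP:0000001", "name: All", "a line without a colon", ""]
parsed_section = parse_section_tolerant(section)
print(parsed_section)
```

Skipping malformed lines silently can hide real data problems, so logging the skipped lines would be a reasonable addition when debugging a new release.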
