negbio's People

Contributors

alistairewj, dependabot[bot], kaushikacharya, yfpeng

negbio's Issues

Addition of Jpype1 in requirements.txt

jpype is the preferred backend for StanfordDependencies:
https://github.com/ncbi-nlp/NegBio/blob/master/negbio/pipeline/ptb2ud.py#L78

But pip install -r requirements.txt doesn't install jpype.

Even though jpype1 is listed under extras_require in https://github.com/dmcc/PyStanfordDependencies/blob/master/setup.py#L61,
installing the PyStanfordDependencies package from PyPI doesn't install jpype.

Hence it's better to list jpype1 explicitly in requirements.txt.

jpype does make a significant improvement in processing speed.
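A minimal change, assuming the JPype1 pin mentioned elsewhere in these issues (0.6.3) is the desired version, would be one extra line in requirements.txt:

```text
# proposed addition to requirements.txt
JPype1==0.6.3
```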

Phrases for CheXpert

How were the phrases in /neg/phrases/mention/ and /unmention/ decided?
Were they extracted using keyword extraction, or chosen manually based on the reports' content and disorder relevance?

Run Stanford CoreNLP lemmatizer for jpype backend

JPypeBackend.py in StanfordDependencies provides an option to use the Stanford CoreNLP lemmatizer via the input parameter add_lemmas:
https://github.com/dmcc/PyStanfordDependencies/blob/master/StanfordDependencies/JPypeBackend.py#L86

But NegBio doesn't use this option when backend=jpype.

I compared NLTK WordNet vs CoreNLP lemmatization speed on a few sentences; CoreNLP is much faster (almost 10 times).
For this I made the following changes:

  1. passed add_lemmas=True
  2. populated ann.infons['lemma'] from dependency graph (https://github.com/ncbi-nlp/NegBio/blob/master/negbio/pipeline/ptb2ud.py#L107)

How about making this change to utilize CoreNLP lemmatization?
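A minimal sketch of step 2, assuming the token objects returned with add_lemmas=True expose a lemma attribute (the Token stand-in and populate_lemmas helper here are hypothetical illustrations, not NegBio's actual code):

```python
from collections import namedtuple

# Hypothetical stand-in for a dependency-graph token produced by
# StanfordDependencies with add_lemmas=True.
Token = namedtuple('Token', ['form', 'lemma'])

def populate_lemmas(annotations, tokens):
    """Copy each token's CoreNLP lemma onto the matching annotation."""
    for ann, token in zip(annotations, tokens):
        ann['lemma'] = token.lemma
    return annotations

# Example with plain dicts standing in for BioC annotation infons:
anns = populate_lemmas([{}, {}], [Token('lungs', 'lung'), Token('seen', 'see')])
```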

Install Error

ERROR: Could not find a version that satisfies the requirement negbio (from versions: none)
ERROR: No matching distribution found for negbio

I have tried different Ubuntu versions; all give this error.

Thanks for the help.

Make GitHub releases that match PyPI

Great to see this installable from PyPI. I just installed it but found there have been updates to NegBio since, and it wasn't clear to me when it was put on PyPI. It would be great to create GitHub releases tagged with the same version as on PyPI (or maybe you can point me to where that's already done, if so).

AttributeError: Java package 'edu' is not valid

I installed NegBio by cloning the repo and tried running main_chexpert text --output=examples examples/00000086.txt examples/00019248.txt, which downloads a .jar file from http://search.maven.org/remotecontent?filepath=edu/stanford/nlp/stanford-corenlp/3.5.2/stanford-corenlp-3.5.2.jar and puts it in /root/.local/share/pystanforddeps/. However, I get an error from StanfordDependencies/JPypeBackend.py:
AttributeError: Java package 'edu' is not valid
Please assist. Thanks.

Strange error occurs when I run python negbio/main_chexpert.py text --output=examples/test.neg.xml examples/00019248.txt examples/00000086.txt

{'--bllip-model': '~/.local/share/bllipparser/GENIA+PubMed',
'--mention_phrases_dir': 'negbio/chexpert/phrases/mention',
'--neg-patterns': 'negbio/chexpert/patterns/negation.txt',
'--newline_is_sentence_break': False,
'--output': 'examples/test.neg.xml',
'--post-negation-uncertainty-patterns': 'negbio/chexpert/patterns/post_negation_uncertainty.txt',
'--pre-negation-uncertainty-patterns': 'negbio/chexpert/patterns/pre_negation_uncertainty.txt',
'--split-document': False,
'--unmention_phrases_dir': 'negbio/chexpert/phrases/unmention',
'--verbose': False,
'SOURCE': None,
'SOURCES': ['examples/00019248.txt', 'examples/00000086.txt'],
'bioc': False,
'text': True}
/home/bo/.local/lib/python3.6/site-packages
/home/bo/.local/lib/python3.6/site-packages
/home/bo/.local/lib/python3.6/site-packages
/home/bo/.local/lib/python3.6/site-packages
/home/bo/.local/lib/python3.6/site-packages
/home/bo/conda/envs/negbio/lib/python3.6/site-packages/StanfordDependencies/JPypeBackend.py:160: UserWarning: This jar doesn't support universal dependencies, falling back to Stanford Dependencies. To suppress this message, call with universal=False
warnings.warn("This jar doesn't support universal "
ERROR:root:Cannot process sentence 0 in 00019248
Traceback (most recent call last):
File "/home/bo/.local/lib/python3.6/site-packages/negbio/pipeline/ptb2ud.py", line 120, in convert_doc
has_lemmas=self._backend == 'jpype')
File "/home/bo/.local/lib/python3.6/site-packages/negbio/pipeline/ptb2ud.py", line 171, in convert_dg
index = text.find(node_form, start)
TypeError: must be str, not java.lang.String

(The identical traceback repeats for sentences 72, 123, 142, and 201 in 00019248 and for sentences 0, 73, 94, 122, 139, and 204 in 00000086.)
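This TypeError means a java.lang.String from the JPype backend reached code that expects a Python str. One plausible workaround (an assumption, not a confirmed fix) is to coerce the token form before the find call, since str() converts a JPype string into a native one:

```python
def find_token(text, node_form, start=0):
    # node_form may be a java.lang.String when the jpype backend is used;
    # str() yields a native Python string that str.find accepts.
    return text.find(str(node_form), start)

index = find_token("There is no spinal canal hematoma.", "hematoma")  # -> 25
```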

setup.py not able to install packages from requirements.txt

@alistairewj @yfpeng @kaushikacharya
Python 3.7

A. The suggested method for setup.

  1. Cloned the repo.
  2. Ran setup.py using python setup.py install --user, but it doesn't install all the packages from requirements.txt, which is why I then tried installing them explicitly.
  3. Checked the list of installed packages using conda list.
  4. Explicitly ran pip install -r requirements.txt, but it failed to download bllipparser and JPype1==0.6.3 on macOS 10.14.3.

B. Different method

  1. Created a new conda environment from the environment3.7.yml file.
  2. Checked the packages in the environment using conda list.
  3. Tried running the CheXpert algorithm on the example dataset using: python negbio/main_chexpert.py text --output=examples/test.neg.xml examples/00000086.txt examples/00019248.txt
  4. It throws an error:
{'--bllip-model': 'None',
 '--mention_phrases_dir': 'negbio/chexpert/phrases/mention',
 '--neg-patterns': 'negbio/chexpert/patterns/negation.txt',
 '--newline_is_sentence_break': False,
 '--output': 'examples/test.neg.xml',
 '--post-negation-uncertainty-patterns': 'negbio/chexpert/patterns/post_negation_uncertainty.txt',
 '--pre-negation-uncertainty-patterns': 'negbio/chexpert/patterns/pre_negation_uncertainty.txt',
 '--split-document': False,
 '--unmention_phrases_dir': 'negbio/chexpert/phrases/unmention',
 '--verbose': False,
 'SOURCE': None,
 'SOURCES': ['examples/00000086.txt', 'examples/00019248.txt'],
 'bioc': False,
 'text': True}
Your Java version: 14
Traceback (most recent call last):
  File "/Users/kaushikjaiswal/anaconda3/envs/negbio3.7/lib/python3.7/site-packages/StanfordDependencies/JPypeBackend.py", line 46, in __init__
    self.acceptFilter = self.corenlp.util.Filters.acceptFilter()
  File "/Users/kaushikjaiswal/anaconda3/envs/negbio3.7/lib/python3.7/site-packages/jpype/_jpackage.py", line 62, in __call__
    raise TypeError("Package {0} is not Callable".format(self.__name))
TypeError: Package edu.stanford.nlp.util.Filters.acceptFilter is not Callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "negbio/main_chexpert.py", line 132, in <module>
    main()
  File "negbio/main_chexpert.py", line 86, in main
    ptb2dep = NegBioPtb2DepConverter(lemmatizer, universal=True)
  File "/Users/kaushikjaiswal/.local/lib/python3.7/site-packages/negbio/pipeline/ptb2ud.py", line 103, in __init__
    lemmatizer, representation, universal)
  File "/Users/kaushikjaiswal/.local/lib/python3.7/site-packages/negbio/pipeline/ptb2ud.py", line 70, in __init__
    self.__sd = StanfordDependencies.get_instance(backend=self._backend)
  File "/Users/kaushikjaiswal/anaconda3/envs/negbio3.7/lib/python3.7/site-packages/StanfordDependencies/StanfordDependencies.py", line 243, in get_instance
    return JPypeBackend(**extra_args)
  File "/Users/kaushikjaiswal/anaconda3/envs/negbio3.7/lib/python3.7/site-packages/StanfordDependencies/JPypeBackend.py", line 51, in __init__
    self._report_version_error(version)
  File "/Users/kaushikjaiswal/anaconda3/envs/negbio3.7/lib/python3.7/site-packages/StanfordDependencies/JPypeBackend.py", line 202, in _report_version_error
    raise JavaRuntimeVersionError()
StanfordDependencies.StanfordDependencies.JavaRuntimeVersionError: Your Java runtime is too old (must be 1.8+ to use CoreNLP version 3.5.0 or later and 1.6+ to use CoreNLP version 1.3.1 or later)

Conda list result for method A:

# Name                    Version                   Build  Channel
atomicwrites              1.3.0                     <pip>
attrs                     19.3.0                    <pip>
bioc                      1.3.1                     <pip>
bllipparser               2016.9.11                 <pip>
ca-certificates           2020.1.1                      0  
certifi                   2020.4.5.1               py37_0  
decorator                 4.4.2                     <pip>
docopt                    0.6.2                     <pip>
docutils                  0.14                      <pip>
future                    0.16.0                    <pip>
importlib-metadata        1.6.0                     <pip>
jsonlines                 1.2.0                     <pip>
libcxx                    4.0.1                hcfea43d_1  
libcxxabi                 4.0.1                hcfea43d_1  
libedit                   3.1.20181209         hb402a30_0  
libffi                    3.2.1                h475c297_4  
lxml                      4.2.5                     <pip>
more-itertools            8.2.0                     <pip>
ncurses                   6.2                  h0a44026_0  
networkx                  1.11                      <pip>
nltk                      3.4.5                     <pip>
openssl                   1.0.2u               h1de35cc_0  
pathlib2                  2.3.5                     <pip>
pip                       20.0.2                   py37_1  
pluggy                    0.13.1                    <pip>
ply                       3.10                      <pip>
py                        1.8.1                     <pip>
pymetamap                 0.1                       <pip>
PyStanfordDependencies    0.3.1                     <pip>
pytest                    4.4.1                     <pip>
python                    3.7.0                hc167b69_0  
readline                  7.0                  h1de35cc_5  
setuptools                46.1.3                   py37_0  
six                       1.14.0                    <pip>
sqlite                    3.31.1               ha441bb4_0  
tk                        8.6.8                ha441bb4_0  
tqdm                      4.19.5                    <pip>
wheel                     0.34.2                   py37_0  
xz                        5.2.4                h1de35cc_4  
zipp                      3.1.0                     <pip>
zlib                      1.2.11               h1de35cc_3

Conda list result for method B:

# Name                    Version                   Build  Channel
atomicwrites              1.3.0                    py37_1    anaconda
attrs                     19.3.0                     py_0    anaconda
bioc                      1.3.1                     <pip>
blas                      1.0                         mkl    anaconda
bllipparser               2016.9.11                 <pip>
ca-certificates           2020.1.1                      0    anaconda
certifi                   2020.4.5.1               py37_0    anaconda
decorator                 4.4.2                      py_0    anaconda
docopt                    0.6.2                    py37_0    anaconda
docutils                  0.14                     py37_0    anaconda
future                    0.16.0                    <pip>
importlib_metadata        1.5.0                    py37_0    anaconda
intel-openmp              2020.0                      166    anaconda
jpype1                    0.6.3           py37hbf1eeb5_1001    conda-forge
jsonlines                 1.2.0                     <pip>
libcxx                    10.0.0                        0    conda-forge
libedit                   3.1.20181209         hb402a30_0    anaconda
libffi                    3.2.1                h475c297_4    anaconda
libgfortran               3.0.1                h93005f0_2    anaconda
lxml                      4.2.5                     <pip>
mkl                       2019.5                      281    anaconda
mkl-service               2.3.0            py37hfbe908c_0    anaconda
mkl_fft                   1.0.15           py37h5e564d8_0    anaconda
mkl_random                1.1.0            py37ha771720_0    anaconda
more-itertools            8.2.0                      py_0    anaconda
ncurses                   6.2                  h0a44026_0    anaconda
networkx                  1.11                      <pip>
networkx                  2.2                      py37_1    anaconda
nltk                      3.4.5                    py37_0    anaconda
numpy                     1.15.4                    <pip>
numpy                     1.16.6           py37h81c90fd_0    anaconda
numpy-base                1.16.6           py37h6575580_0    anaconda
openssl                   1.1.1                h1de35cc_0    anaconda
pathlib2                  2.3.3                     <pip>
pip                       20.0.2                   py37_1    anaconda
pluggy                    0.13.1                   py37_0    anaconda
ply                       3.10                      <pip>
ply                       3.11                     py37_0    anaconda
py                        1.8.1                      py_0    anaconda
pymetamap                 0.1                       <pip>
PyStanfordDependencies    0.3.1                     <pip>
pytest                    4.4.1                     <pip>
pytest                    4.2.0                    py37_0    anaconda
python                    3.7.7           hc70fcce_0_cpython    anaconda
readline                  8.0                  h1de35cc_0    anaconda
setuptools                46.1.3                   py37_0    anaconda
six                       1.14.0                   py37_0    anaconda
sqlite                    3.31.1               ha441bb4_0    anaconda
tk                        8.6.8                ha441bb4_0    anaconda
tqdm                      4.19.5                    <pip>
tqdm                      4.31.1                   py37_1    anaconda
wheel                     0.34.2                   py37_0    anaconda
xz                        5.2.4                h1de35cc_4    anaconda
zipp                      2.2.0                      py_0    anaconda
zlib                      1.2.11               h1de35cc_3    anaconda

When loading NegBioParser() the session crashes

I learnt about NegBio from Coursera and tried to run the code on Colab.
I installed NegBio with pip and imported the required modules, including NegBioParser.
However, when I try to load parser = NegBioParser(), the session crashes. The first reason in the logs was NO CUDA detected, so I switched to a GPU runtime. After loading all dependencies and loading the parser, the session crashed again. It gives no message about what's wrong; I can just see logs such as:
gzip: stdout: Broken pipe.
OR
2022-07-13 17:54:43.937715: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
and then
Kernel restarted

negbio_pipeline parse fails silently

Whenever my BioC XML file contains any sentence that the BLLIP parser fails to parse, negbio_pipeline parse simply terminates, producing neither output files nor error messages. The only way I know to find the offending sentence is binary search: keep splitting the XML and running negbio_pipeline parse on the parts until the problematic sentence is isolated.

It would be a lot more convenient to get an error message with details and/or just skip the problematic sentence and parse the rest of the dataset.
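The requested behavior could be sketched like this (parse_fn is a placeholder for the per-sentence BLLIP call, not NegBio's actual API): log the offending sentence and continue rather than terminating silently:

```python
import logging

def parse_all(sentences, parse_fn):
    """Parse each sentence; on failure, log the offending text and skip it."""
    parsed = []
    for i, sentence in enumerate(sentences):
        try:
            parsed.append(parse_fn(sentence))
        except Exception:
            logging.exception("Cannot parse sentence %d: %r", i, sentence)
    return parsed

# Example with a toy parser that rejects one sentence:
def toy_parser(s):
    if "bad" in s:
        raise ValueError("no parse")
    return s.upper()

result = parse_all(["ok one", "bad two", "ok three"], toy_parser)  # ["OK ONE", "OK THREE"]
```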

Warning in environment3.7.yml about pip-installed dependencies without pip as a dependency

(base) mghenis@penguin:~/NegBio$ conda env create -f environment3.7.yml

Produces this warning:

Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.

Removing duplicates from locs (location range of CUI annotations)

https://github.com/ncbi-nlp/NegBio/blob/master/negbio/pipeline/negdetect.py#L73-L76

        locs = []
        for ann in passage.annotations:
            total_loc = ann.get_total_location()
            locs.append((total_loc.offset, total_loc.offset + total_loc.length))

Here the location ranges of the CUIs are collected, some of which can be duplicates. This happens because MetaMap creates multiple CUIs for the same text span.

Then detect() in neg_detector.py iterates over locs:
https://github.com/ncbi-nlp/NegBio/blob/master/negbio/neg/neg_detector.py#L44

for loc in locs:

Wouldn't it be better to remove the duplicates from locs in negdetect.py using

locs = list(set(locs))

An example of duplicate loc elements:

For the sentence:

There is no spinal canal hematoma.

the following two CUIs are generated over the same location span:

 <annotation id="2">
    <infon key="term">Hematoma</infon>
    <infon key="semtype">patf</infon>
    <infon key="CUI">C0018944</infon>
    <infon key="annotator">MetaMap</infon>
    <location length="8" offset="25"/>
    <text>hematoma</text>
  </annotation>
  
  <annotation id="4">
    <infon key="term">Hematoma Adverse Event</infon>
    <infon key="semtype">fndg</infon>
    <infon key="CUI">C1962958</infon>
    <infon key="annotator">MetaMap</infon>
    <location length="8" offset="25"/>
    <text>hematoma</text>
  </annotation>
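With the duplicate spans above, a set drops the repeats; sorting afterwards keeps iteration order deterministic, which list(set(locs)) alone would not guarantee:

```python
# Both MetaMap annotations above cover offset 25, length 8,
# so the collected (start, end) pairs contain a duplicate.
locs = [(25, 33), (25, 33)]
locs = sorted(set(locs))  # -> [(25, 33)]
```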

jpype fails when using negbio in flask

Hey guys, I wrapped NegBio in a Flask app and had JPype fail within the StanfordDependencies Python library. I had to modify the JPypeBackend.py file to attach the current thread to the JVM. I know you don't maintain this source code, but just a heads up. The changes start on line 45:

if not jpype.isThreadAttachedToJVM():
    jpype.attachThreadToJVM()

JPypeBackend.py.zip
The modified file is attached here.

Error downloading model (400 Bad Request)

main_mm text --metamap=$METAMAP_BIN --output=../test/test.neg.xml ../mimic_cxr/val/p10_p10003502_s51180958_1fa79752-9ddaf5b5-2120ae82-9fec50d6-51f48d1f.txt
{'--bllip-model': None,
'--cuis': 'examples/cuis-cvpr2017.txt',
'--metamap': '/GPUFS/nsccgz_ywang_zfd/caojindong/MetaMap/public_mm_main_2020v2/public_mm/bin/metamap20',
'--neg-patterns': 'negbio/patterns/neg_patterns.txt',
'--newline_is_sentence_break': False,
'--output': '../test/test.neg.xml',
'--split-document': False,
'--uncertainty-patterns': 'negbio/patterns/uncertainty_patterns.txt',
'--verbose': False,
'--word_sense_disambiguation': False,
'SOURCE': [],
'SOURCES': ['../mimic_cxr/val/p10_p10003502_s51180958_1fa79752-9ddaf5b5-2120ae82-9fec50d6-51f48d1f.txt'],
'bioc': False,
'text': True}
Error downloading model (400 Bad Request)

Highlighting need for giving META_MAP_HOME as absolute path

Though this is not a bug as such, I would suggest highlighting the importance of specifying META_MAP_HOME as an absolute path on the Getting Started page.

NegBio uses pymetamap, which has a discussion thread about an issue caused by giving the path as something like ~/username/path. I myself spent quite some time on this issue, so it would be better to highlight it so future users don't have to struggle with it.

Detecting negation for one CUI but failing to detect negation for other CUIs

Environment: Using MetaMap 2016v2
Sentence:

There is no spinal canal hematoma.

Among other CUIs, these are the ones I am focusing on:

<annotation id="2">
        <infon key="term">Hematoma</infon>
        <infon key="semtype">patf</infon>
        <infon key="CUI">C0018944</infon>
        <infon key="annotator">MetaMap</infon>
        <location length="8" offset="25"/>
        <text>hematoma</text>
      </annotation>
      <annotation id="3">
        <infon key="term">spinal hematoma</infon>
        <infon key="semtype">inpo</infon>
        <infon key="CUI">C0856150</infon>
        <infon key="annotator">MetaMap</infon>
        <location length="6" offset="12"/>
        <text>spinal</text>
      </annotation>

NegBio negates the term "hematoma" but fails to negate "spinal hematoma".

Here's the parse tree:
<infon key="parse tree">(S1 (S (S (NP (EX There)) (VP (VBZ is) (NP (DT no) (JJ spinal) (JJ canal) (NN hematoma)))) (. .)))</infon>

There's an amod dependency edge between "spinal" and "hematoma":

<relation id="R2">
          <infon key="dependency">amod</infon>
          <node refid="T3" role="dependant"/>
          <node refid="T5" role="governor"/>
        </relation>

where T3 represents the word "spinal" and T5 represents the word "hematoma".

How should we handle this issue?
"no spinal canal hematoma" is identified as a noun phrase beginning with "no".
Shouldn't both "hematoma" and "spinal hematoma" come up as negated?

An XML dump of the collection just before executing negdetect.detect(document, neg_detector), i.e. after the parse tree and dependency tree have been formed, is shared here: http://collabedit.com/b2e33

AttributeError: 'BioCAnnotation' object has no attribute 'get_total_location'

Extracting mentions...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10000/10000 [00:08<00:00, 1171.10it/s]
Classifying mentions...
0%| | 0/10000 [00:00<?, ?it/s]BioCSentence[offset=0,text='no acute cardiopulmonary process.',infons=[parse tree=(S1 (S (S (NP (DT no) (JJ acute)) (VP (VBP cardiopulmonary) (NP (NN process)))) (. .)))],annotations=[BioCAnnotation[id=T0,text='no',infons=[tag=DT,lemma=no],locations=[BioCLocation[offset=0,length=2]],],BioCAnnotation[id=T1,text='acute',infons=[tag=JJ,lemma=acute],locations=[BioCLocation[offset=3,length=5]],],BioCAnnotation[id=T2,text='cardiopulmonary',infons=[tag=VBP,lemma=cardiopulmonary,ROOT=True],locations=[BioCLocation[offset=9,length=15]],],BioCAnnotation[id=T3,text='process',infons=[tag=NN,lemma=process],locations=[BioCLocation[offset=25,length=7]],],BioCAnnotation[id=T4,text='.',infons=[tag=.,lemma=.],locations=[BioCLocation[offset=32,length=1]],]],relations=[BioCRelation[id=R0,infons=[dependency=neg],nodes=[BioCNode[refid=T0,role=dependant],BioCNode[refid=T1,role=governor]],],BioCRelation[id=R1,infons=[dependency=nsubj],nodes=[BioCNode[refid=T1,role=dependant],BioCNode[refid=T2,role=governor]],],BioCRelation[id=R2,infons=[dependency=dobj],nodes=[BioCNode[refid=T3,role=dependant],BioCNode[refid=T2,role=governor]],],BioCRelation[id=R3,infons=[dependency=punct],nodes=[BioCNode[refid=T4,role=dependant],BioCNode[refid=T2,role=governor]],]],]
ERROR:stages.classify:Cannot parse dependency graph [offset=0]
Traceback (most recent call last):
File "/home/mayt/chexpert-labeler/stages/classify.py", line 39, in detect
g = semgraph.load(sentence)
File "/home/mayt/.local/lib/python3.6/site-packages/negbio/neg/semgraph.py", line 26, in load
loc = ann.get_total_location()
AttributeError: 'BioCAnnotation' object has no attribute 'get_total_location'

It happened in:
4e36fbd01533ff0db6ed91965a31e21
a0ed7b13cf83ef879a4cc0cd2514e8a
0b4f0ed261aa5c89ad7e9ce79645277

Do you know the cause of the error? Thank you very much.
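This usually indicates a version mismatch between NegBio and the installed bioc package. A defensive shim along these lines (a sketch; everything except get_total_location is an assumed stand-in) computes the span directly from the annotation's locations when the old method is missing:

```python
def total_location(ann):
    """Return (offset, length) covering all of an annotation's locations."""
    if hasattr(ann, 'get_total_location'):
        loc = ann.get_total_location()
        return loc.offset, loc.length
    # Fallback for bioc versions without get_total_location:
    start = min(loc.offset for loc in ann.locations)
    end = max(loc.offset + loc.length for loc in ann.locations)
    return start, end - start

# Example with minimal stand-in objects:
class Loc:
    def __init__(self, offset, length):
        self.offset, self.length = offset, length

class Ann:
    def __init__(self, locations):
        self.locations = locations

span = total_location(Ann([Loc(0, 2), Loc(3, 5)]))  # -> (0, 8)
```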

"pip install negbio" doesn't work

The installation instructions say

pip install negbio

but I get

ERROR: Could not find a version that satisfies the requirement negbio (from versions: none)
ERROR: No matching distribution found for negbio

and it looks like negbio isn't on PyPI?

Adding Phrases for CheXpert Analysis

Hi, love this tool, thanks so much for making it available! I was just wondering if there were any plans to make it possible to add additional types (e.g. tuberculosis) to the phrases folder.
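In the meantime, assuming the mention phrase files keep the one-phrase-per-line plain-text format of the existing files in the phrases folder, a new finding could likely be added with a file such as this hypothetical mention/tuberculosis.txt (contents illustrative only):

```text
tuberculosis
tb
```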

Problem loading bllipparser.

{'--bllip-model': '~/.local/share/bllipparser/GENIA+PubMed',
 '--mention_phrases_dir': 'negbio/chexpert/phrases/mention',
 '--neg-patterns': 'negbio/chexpert/patterns/negation.txt',
 '--newline_is_sentence_break': False,
 '--output': '/content/NegBio/examples/test.neg.xml',
 '--post-negation-uncertainty-patterns': 'negbio/chexpert/patterns/post_negation_uncertainty.txt',
 '--pre-negation-uncertainty-patterns': 'negbio/chexpert/patterns/pre_negation_uncertainty.txt',
 '--split-document': False,
 '--unmention_phrases_dir': 'negbio/chexpert/phrases/unmention',
 '--verbose': False,
 'SOURCE': None,
 'SOURCES': ['/content/NegBio/negbio/examples/00000086.txt', '/content/NegBio/examples/00019248.txt'],
 'bioc': False,
 'text': True}
/usr/local/lib/python3.6/dist-packages/jpype/_core.py:217: UserWarning: 
-------------------------------------------------------------------------------
Deprecated: convertStrings was not specified when starting the JVM. The default
behavior in JPype will be False starting in JPype 0.8. The recommended setting
for new code is convertStrings=False.  The legacy value of True was assumed for
this session. If you are a user of an application that reported this warning,
please file a ticket with the developer.
-------------------------------------------------------------------------------

  """)
Traceback (most recent call last):
  File "/content/NegBio/negbio/main_chexpert.py", line 132, in <module>
    main()
  File "/content/NegBio/negbio/main_chexpert.py", line 88, in main
    parser = NegBioParser(model_dir=argv['--bllip-model'])
  File "/usr/local/lib/python3.6/dist-packages/negbio/pipeline/parse.py", line 20, in __init__
    self.rrp = RerankingParser.from_unified_model_dir(self.model_dir)
  File "/usr/local/lib/python3.6/dist-packages/bllipparser/RerankingParser.py", line 864, in from_unified_model_dir
    reranker_weights_filename) = get_unified_model_parameters(model_dir)
  File "/usr/local/lib/python3.6/dist-packages/bllipparser/RerankingParser.py", line 931, in get_unified_model_parameters
    raise IOError("Model directory '%s' does not exist" % model_dir)
OSError: Model directory '/root/.local/share/bllipparser/GENIA+PubMed' does not exist
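Two things usually go wrong here: the ~ in --bllip-model is not expanded, and the GENIA+PubMed model was never downloaded. A pre-flight check along these lines (a sketch, not NegBio code) makes the failure explicit:

```python
import os

# Expand ~ so the path matches what bllipparser will actually look for.
model_dir = os.path.expanduser('~/.local/share/bllipparser/GENIA+PubMed')
if not os.path.isdir(model_dir):
    print("Model directory %r does not exist; download the GENIA+PubMed "
          "model before running NegBio." % model_dir)
```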
