seermedical / seer-py
Python SDK for the Seer data platform
License: MIT License
Possibly related to party_id not being available on the queries.
Steps to reproduce:
from seerpy import SeerConnect
client = SeerConnect()
bookings = client.get_all_bookings_dataframe("9bd42e23-3aee-46e9-81e6-a500aa1c074a",
1588636941143, 1588636941143)
Output:
TypeError Traceback (most recent call last)
<ipython-input-6-7f83369ec5e2> in <module>
1 bookings = client.get_all_bookings_dataframe("9bd42e23-3aee-46e9-81e6-a500aa1c074a",
----> 2 1588636941143, 1588636941143)
~/miniconda3/envs/watcher/lib/python3.7/site-packages/seerpy/seerpy.py in get_all_bookings_dataframe(self, organisation_id, start_time, end_time)
693 bookings_response = self.get_all_bookings(organisation_id, start_time, end_time)
694 bookings = json_normalize(bookings_response).sort_index(axis=1)
--> 695 studies = self.pandas_flatten(bookings, 'patient.', 'studies')
696 equipment = self.pandas_flatten(bookings, '', 'equipmentItems')
697 bookings = bookings.drop('patient.studies', errors='ignore', axis='columns')
~/miniconda3/envs/watcher/lib/python3.7/site-packages/seerpy/seerpy.py in pandas_flatten(parent, parent_name, child_name)
110 for i in range(len(parent)):
111 parent_id = parent[parent_name+'id'][i]
--> 112 child = json_normalize(parent[parent_name+child_name][i]).sort_index(axis=1)
113 child.columns = [child_name+'.' + str(col) for col in child.columns]
114 child[parent_name+'id'] = parent_id
~/miniconda3/envs/watcher/lib/python3.7/site-packages/pandas/io/json/_normalize.py in json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
256
257 if record_path is None:
--> 258 if any([isinstance(x, dict) for x in y.values()] for y in data):
259 # naive normalization, this is idempotent for flat records
260 # and potentially will inflate the data considerably for
TypeError: 'float' object is not iterable
get_all_bookings() still works. I believe the problem may be that pandas_flatten expects a pd.DataFrame column in which every cell holds a list of dicts, and it is instead hitting an np.NaN.
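A minimal sketch of the kind of guard that could fix this, based on the pandas_flatten loop shown in the traceback (the helper name flatten_child_rows and the skip-on-NaN behaviour are assumptions, not the library's actual fix):

```python
import numpy as np
import pandas as pd

def flatten_child_rows(parent, parent_name, child_name):
    """Like seerpy's pandas_flatten, but skip cells that hold NaN
    instead of a list of dicts before calling json_normalize."""
    frames = []
    for i in range(len(parent)):
        cell = parent[parent_name + child_name][i]
        # np.NaN is a float, not a list, so json_normalize would raise
        # TypeError: 'float' object is not iterable
        if not isinstance(cell, list):
            continue
        child = pd.json_normalize(cell).sort_index(axis=1)
        child.columns = [child_name + '.' + str(col) for col in child.columns]
        child[parent_name + 'id'] = parent[parent_name + 'id'][i]
        frames.append(child)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

With this guard, a booking whose patient has no studies simply contributes no rows, rather than crashing the whole call.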
Hello @levink1978, I have registered on epilepsyecosystem.org and SeerGP, and I have sent an email with the 2 secret phrases to [email protected], but I haven't received any reply yet. My Seer account is [email protected]. Could you please give me access to the datasets? Thanks a lot!
The start_time and end_time arguments should be accepted as a float, int, np.float or np.int.
For example, in the case of:
start = eeg_metadata['segments.startTime']
duration = eeg_metadata['segments.duration']
end = start + duration
data = client.get_channel_data(eeg_metadata, from_time=start, to_time=end)
the client.get_channel_data call fails, but:
data = client.get_channel_data(eeg_metadata, from_time=float(start), to_time=float(end))
works fine. The conversion to float should be handled in seerpy.
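One way seerpy could handle the conversion internally is a small coercion helper applied to the time arguments. This is a sketch; the name coerce_time is an assumption, not an existing seerpy function:

```python
import numpy as np
import pandas as pd

def coerce_time(value):
    """Coerce a start/end time to a plain Python float.

    Accepts int, float, numpy scalars, or a single-element pandas
    Series (as returned by indexing a metadata DataFrame column)."""
    item = getattr(value, "item", None)
    if callable(item):
        # numpy scalars and one-element Series expose .item()
        return float(item())
    return float(value)
```

The call site would then become, e.g., `client.get_channel_data(eeg_metadata, from_time=coerce_time(start), to_time=coerce_time(end))`.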
Hello,
I've been able to get the code from this example: https://github.com/seermedical/seer-py/blob/master/Examples/Example.ipynb to run up to the point where client.get_studies is called. However, when I print studies, I just get an empty list.
I am using the email/password approach to authenticating. I never specify a server URL to SeerConnect - is that something I need to do to make sure I'm getting the right data?
Here's my code:
import numpy as np
import pandas as pd
from pandas.io.json import json_normalize
import seerpy
from seerpy.utils import plot_eeg
client = seerpy.SeerConnect()
studies = client.get_studies()
print(studies)
for study in studies:
print(study)
Here's the output from a terminal:
(seer) Donnie:Research aaron$ python exploreSeer.py
Login Successful
[]
Thanks,
Aaron
PS - If there's a better way to address this, you can email me at the email address in my github profile.
Line 158 in ec2599f
Can I suggest that we add a parameter to client.get_paginated_response() that limits the total number of items it returns? Currently we have limit, which limits the number of items returned per page in the paginated request, but nothing to limit the total number of items returned.
I am happy to implement this. I just want to get people's input on this idea, and on what the argument should be called. Can I suggest calling it max_items?
NOTE: for queries such as the following, which contain nested lists of items, max_items would only affect the outer list (2 users, irrespective of how many surveys those users have).
query userSurveysLimit {
  users(offset: 82, limit: 2) {
    username
    surveys {
      id
    }
  }
}
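A minimal sketch of how a max_items cap could sit inside the pagination loop (fetch_page is a stand-in for the real GraphQL page fetch; the function name and signature here are assumptions, not seerpy's actual internals):

```python
def get_paginated(fetch_page, limit=50, max_items=None):
    """Accumulate pages of results, stopping early once max_items
    items have been collected.

    fetch_page(limit, offset) must return a list of up to `limit` items."""
    items = []
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items.extend(page)
        if max_items is not None and len(items) >= max_items:
            # trim any overshoot from the final page
            return items[:max_items]
        if len(page) < limit:
            # a short page means there is no more data
            return items
        offset += limit
```

With max_items=None the behaviour is unchanged, so existing callers would not be affected.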
Where it is useful
When getting studies, we currently have client.get_studies(), which gets ALL studies the user has access to. If a user or developer is just experimenting, they may not want to wait ten or more minutes to get ALL studies. It's an unpleasant experience.
Upstream changes
If this idea is accepted, I suggest also adding the same argument to all other functions that call get_paginated_response, such as get_studies(), get_studies_dataframe(), etc.
I am happy to implement this as well.
Besides the functions in seerpy.py, are there any other calls to client.get_paginated_response() that I should know about?
Hi,
After calling:
allData = None
allData = client.createMetaData(study)
I'm getting the following KeyError: 'id':
KeyError Traceback (most recent call last)
/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2441 try:
-> 2442 return self._engine.get_loc(key)
2443 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'id'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
in ()
3
4 allData = None
----> 5 allData = client.createMetaData(study)
/anaconda3/lib/python3.6/site-packages/seerpy/seerpy.py in createMetaData(self, study)
312 channelGroupsM = channelGroups.merge(segments, how='left', on='channelGroups.id', suffixes=('', '_y'))
313 channelGroupsM = channelGroupsM.merge(channels, how='left', on='channelGroups.id', suffixes=('', '_y'))
--> 314 allData = allData.merge(channelGroupsM, how='left', on='id', suffixes=('', '_y'))
315
316 return allData
/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in merge(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
4720 right_on=right_on, left_index=left_index,
4721 right_index=right_index, sort=sort, suffixes=suffixes,
-> 4722 copy=copy, indicator=indicator)
4723
4724 def round(self, decimals=0, *args, **kwargs):
/anaconda3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
51 right_on=right_on, left_index=left_index,
52 right_index=right_index, sort=sort, suffixes=suffixes,
---> 53 copy=copy, indicator=indicator)
54 return op.get_result()
55
/anaconda3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator)
556 (self.left_join_keys,
557 self.right_join_keys,
--> 558 self.join_names) = self._get_merge_keys()
559
560 # validate the merge keys dtypes. We may need to coerce
/anaconda3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in _get_merge_keys(self)
821 right_keys.append(rk)
822 if lk is not None:
--> 823 left_keys.append(left[lk]._values)
824 join_names.append(lk)
825 else:
/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
1962 return self._getitem_multilevel(key)
1963 else:
-> 1964 return self._getitem_column(key)
1965
1966 def _getitem_column(self, key):
/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
1969 # get column
1970 if self.columns.is_unique:
-> 1971 return self._get_item_cache(key)
1972
1973 # duplicate columns & possible reduce dimensionality
/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
1643 res = cache.get(item)
1644 if res is None:
-> 1645 values = self._data.get(item)
1646 res = self._box_item_values(item, values)
1647 cache[item] = res
/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
3588
3589 if not isnull(item):
-> 3590 loc = self.items.get_loc(item)
3591 else:
3592 indexer = np.arange(len(self.items))[isnull(self.items)]
/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2442 return self._engine.get_loc(key)
2443 except KeyError:
-> 2444 return self._engine.get_loc(self._maybe_cast_indexer(key))
2445
2446 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'id'
I used the msg_data_downloader.py script to download the data and have found that the Empatica accelerometry data is exactly the same across all channels for each patient. Not sure if this is a user error or an issue with the script?
The data provided by "My Seizure Gauge Data" does not appear to be in the correct format. E.g. for 'MSEL_01575', I checked the heart rate signal using code based on "Example.ipynb"; the resulting values were on the order of 1.0e+11.
Secondly, "msg_data_downloader.py" produces multiple parquet files. Providing example code to process these files would be beneficial.
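For the second point, a minimal sketch of combining the downloaded parquet files into one DataFrame (the helper name, the glob pattern, and the presence of a 'time' column are assumptions about the downloader's output, not documented behaviour):

```python
import glob
import pandas as pd

def load_parquet_dir(pattern, reader=pd.read_parquet):
    """Read every file matching `pattern` (e.g. 'MSEL_01575/*.parquet')
    and concatenate the pieces into a single DataFrame, sorted by a
    'time' column if one is present."""
    frames = [reader(path) for path in sorted(glob.glob(pattern))]
    if not frames:
        return pd.DataFrame()
    df = pd.concat(frames, ignore_index=True)
    if 'time' in df.columns:
        df = df.sort_values('time', ignore_index=True)
    return df
```

Reading parquet with pandas requires the pyarrow or fastparquet package to be installed; the `reader` argument just makes the helper easy to test with other formats.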
I have authenticated my account, and the NeuroVista download worked just fine. For MSG, the script does not seem to work.
BTW, I can access MSG data from the Seer website UI (34 patients). Is it the same data?
Thank you!
~/De/seer-py | on master ?1 python Examples/msg_data_downloader.py 1 err | took 7s | seer-py py | at 00:23:25
2022-12-02 00:23:46,615 Login Successful
2022-12-02 00:23:47,137 >>> {"query": "query getStudyLabelGroups($study_id: String!, $limit: PaginationAmount, $offset: Int) {\n study(id: $study_id) {\n id\n name\n labelGroups(limit: $limit, offset: $offset) {\n id\n name\n description\n labelType\n numberOfLabels\n }\n }\n}", "variables": {"study_id": "fff9aaa9-b104-46e8-9227-b1b76d6f333e", "limit": 50, "offset": 0}}
2022-12-02 00:23:48,467 <<< {"errors":[{"statusCode":400,"errorCode":"INVALID_PARAMETERS","message":"Cannot query field \"labelType\" on type \"StudyLabelGroup\". Did you mean \"labels\"?","locations":[{"line":9,"column":7}]}]}
Traceback (most recent call last):
File "/Users/mac/Desktop/seer-py/Examples/msg_data_downloader.py", line 91, in <module>
run(SeerConnect(), args.outpath)
File "/Users/mac/Desktop/seer-py/Examples/msg_data_downloader.py", line 58, in run
downloader = DataDownloader(client, study_id, output_dir)
File "/Users/mac/Desktop/seer-py/Examples/downloader/downloader.py", line 23, in __init__
self.label_groups = self.get_label_groups()
File "/Users/mac/Desktop/seer-py/Examples/downloader/downloader.py", line 37, in get_label_groups
return self.client.get_label_groups_for_studies([self.study_id])
File "/Users/mac/Desktop/seer-py/seerpy/seerpy.py", line 984, in get_label_groups_for_studies
_results = self.get_label_groups_for_study(study_id, limit=limit)
File "/Users/mac/Desktop/seer-py/seerpy/seerpy.py", line 956, in get_label_groups_for_study
results = self.get_paginated_response(graphql.GET_ALL_LABEL_GROUPS_FOR_STUDY_ID_PAGED,
File "/Users/mac/Desktop/seer-py/seerpy/seerpy.py", line 232, in get_paginated_response
response = self.execute_query(query_string, variable_values=variable_values,
File "/Users/mac/Desktop/seer-py/seerpy/seerpy.py", line 142, in execute_query
response = self.graphql_client(party_id).execute(gql(query_string),
File "/Users/mac/Desktop/seer-py/venv/lib/python3.9/site-packages/gql/client.py", line 193, in execute
return self.execute_sync(document, *args, **kwargs)
File "/Users/mac/Desktop/seer-py/venv/lib/python3.9/site-packages/gql/client.py", line 137, in execute_sync
return session.execute(document, *args, **kwargs)
File "/Users/mac/Desktop/seer-py/venv/lib/python3.9/site-packages/gql/client.py", line 447, in execute
raise TransportQueryError(
gql.transport.exceptions.TransportQueryError: {'statusCode': 400, 'errorCode': 'INVALID_PARAMETERS', 'message': 'Cannot query field "labelType" on type "StudyLabelGroup". Did you mean "labels"?', 'locations': [{'line': 9, 'column': 7}]}
I get this error when I download file 160 in the Test Files for Patient 3. Can anyone help?
I don't feel like there's any rush on this, but it's something that would certainly be valuable to do.
For some queries that retrieve lots of records (and take a long time to run), the user experience would be improved if we allowed the user to receive feedback on the progress made so far, e.g. by adding a verbose argument.
If the data is being retrieved in batches, then it could print out the number of items it has retrieved so far.
My current thought is to have the progress show up as a single line, which gets cleared and overwritten with every batch of data received. Something along the lines of:
>>> client.get_studies(verbose=True)
received 150 items
And if the user specifies an upper limit on the number of items to retrieve, using the max_items argument outlined in GitHub issue #130, then it could display something like this:
>>> client.get_studies(max_items=200, verbose=True)
received 150 items (out of a max of 200)
I am happy to implement this, but I want to get people's feedback and suggestions.
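The single-line display described above can be sketched with a carriage return that rewinds the cursor so each batch overwrites the previous message (the helper name report_progress is an assumption for illustration):

```python
import sys

def report_progress(received, max_items=None):
    """Print a single-line progress message, overwriting the previous
    one by rewinding with a carriage return instead of a newline."""
    if max_items is not None:
        msg = f"received {received} items (out of a max of {max_items})"
    else:
        msg = f"received {received} items"
    sys.stdout.write('\r' + msg)
    sys.stdout.flush()
    return msg  # returned to make the formatting easy to test
```

The pagination loop would call this once per batch, and print a final newline when done so the shell prompt isn't overwritten.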
When I tried to import the seerpy module, the following error appeared.
ModuleNotFoundError Traceback (most recent call last)
in ()
----> 1 import seerpy
2 from seerpy.plotting import seerPlot
~\Downloads\seer-py-0.1.1-beta\seer-py-0.1.1-beta\seerpy\seerpy.py in ()
1 # Copyright 2017 Seer Medical Pty Ltd, Inc. or its affiliates. All Rights Reserved.
----> 2 from gql import gql, Client as GQLClient
3 from gql.transport.requests import RequestsHTTPTransport
4 import numpy as np
5 import pandas as pd
~\Anaconda3\lib\site-packages\gql\__init__.py in ()
----> 1 from .gql import gql
2 from .client import Client
3
4 __all__ = ['gql', 'Client']
~\Anaconda3\lib\site-packages\gql\gql.py in ()
1 import six
----> 2 from graphql.language.parser import parse
3 from graphql.language.source import Source
4
5
ModuleNotFoundError: No module named 'graphql.language'; 'graphql' is not a package
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-07d729194a8a> in <module>
      8 from scipy.io import savemat  # pylint: disable=unused-import
      9
---> 10 import seerpy
     11
     12

ModuleNotFoundError: No module named 'seerpy'
I downloaded the latest version of the seerpy repository.