arch-ontology's Issues

dbo.stringpart function doesn't use @delimiter arg

This is a minor issue, I suppose, but the dbo.stringpart function in the ontology_utils_mssql.sql script doesn't make use of its @delimiter argument. It seems that the hard-coded '\' should be replaced with @delimiter in the following lines:
WHILE @num!=@el and CHARINDEX('\', @stringToSplit) > 0
SELECT @pos = CHARINDEX('\', @stringToSplit)
Please reject this issue if this assumption is incorrect. Thanks.
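
To make the intended behaviour concrete, here is a minimal Python sketch of a split-and-pick helper that actually honours its delimiter argument. `string_part` is a hypothetical stand-in, not the real dbo.stringpart; in particular the 0-based index is an assumption, since the T-SQL function's indexing convention is not shown in this issue.

```python
def string_part(string_to_split: str, delimiter: str, index: int) -> str:
    """Return the index-th (0-based, an assumption) piece of string_to_split.

    Hypothetical analogue of dbo.stringpart that uses the delimiter
    argument instead of a hard-coded backslash.
    """
    # str.find plays the role of CHARINDEX(@delimiter, @stringToSplit)
    pos = string_to_split.find(delimiter)
    current = 0
    while current != index and pos >= 0:
        # Drop everything up to and including the delimiter just found
        string_to_split = string_to_split[pos + len(delimiter):]
        pos = string_to_split.find(delimiter)
        current += 1
    if current != index:
        return ""  # fewer pieces than requested
    return string_to_split if pos < 0 else string_to_split[:pos]


print(string_part("PCORI\\LAB_RESULT_CM\\Version", "\\", 1))  # LAB_RESULT_CM
print(string_part("a|b|c", "|", 2))                            # c
```

With a hard-coded '\' the second call could not work at all, which is the behaviour the issue describes.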

Same Dx and PX c_basecodes

Both the Pcornet_Proc and Pcornet_Diag ontology tables have the following c_basecodes:
ICD10:B00, ICD10:B01, ICD10:B02, ICD10:B03, ICD10:B04, ICD10:B20, ICD10:B30, ICD10:B33, ICD10:B34, ICD10:B40, ICD10:B41, ICD10:B42, ICD10:B43, ICD10:B44, ICD10:B50, ICD10:B51, ICD10:B52, ICD10:B53, ICD10:B54, ICD10:B70, ICD10:B80, ICD10:B82, ICD10:B83, ICD10:B90, ICD10:B91, ICD10:B92, ICD10:C01, ICD10:C02, ICD10:C03, ICD10:C05, ICD10:C21, ICD10:C22, ICD10:C23, ICD10:C25, ICD10:C51, ICD10:C71, ICD10:C72, ICD10:C75, ICD10:C76, ICD10:C81, ICD10:C91, ICD10:D00, ICD10:D01, ICD10:D02, ICD10:D70, ICD10:D71, ICD10:D72, ICD10:D80, ICD10:D81, ICD10:D82, ICD10:F01, ICD10:F02, ICD10:F06, ICD10:F07, ICD10:F09, ICD10:F13, ICD10:F14, ICD10:F15

This may cause incorrect patient counts in query results. For example, if diagnosis code 'ICD10:F15' is queried, both diagnosis and procedure records will be included in the patient count. The same happens if procedure code 'ICD10:F15' is queried.

In general, diagnosis codes should always be different from procedure codes in the ICD coding system. Please look into this.
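
One way to list the overlapping codes is an INTERSECT over the two tables. The sketch below uses an in-memory SQLite database with a few made-up rows so that it runs anywhere; the 'ICD10:A00' row and the 'ICD10PCS:' prefix are hypothetical sample data, not values from the released ontology.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pcornet_diag (c_basecode TEXT);
    CREATE TABLE pcornet_proc (c_basecode TEXT);
    -- Sample rows only; the real tables hold many more columns and codes
    INSERT INTO pcornet_diag VALUES ('ICD10:F15'), ('ICD10:B00'), ('ICD10:A00');
    INSERT INTO pcornet_proc VALUES ('ICD10:F15'), ('ICD10:B00'), ('ICD10PCS:0016070');
""")

# Codes present in both ontologies are the ones that can inflate counts
shared = [row[0] for row in conn.execute("""
    SELECT c_basecode FROM pcornet_diag
    INTERSECT
    SELECT c_basecode FROM pcornet_proc
    ORDER BY c_basecode
""")]
print(shared)  # ['ICD10:B00', 'ICD10:F15']
```

The same INTERSECT query should carry over to the real metadata schema with only the table names qualified.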

Loading medication ontology with Oracle sqlldr results in error "column C_TOOLTIP. second enclosure string not present"

Command:
ORACLE_SID=ORCL sqlldr <user>/<password> control=PCORNET_MED.ctl data=PCORNET_MED.TXT bad=PCORNET_MED.bad log=PCORNET_MED.log errors=10000

Errors from sqlldr log file (nearly all the rows):
Record 1: Rejected - Error on table BLUEHERONMETADATA.PCORNET_MED, column C_TOOLTIP. second enclosure string not present

The control file I'm using is attached (renamed to .txt, since GitHub only allows specific extensions).

PCORNET_MED_CTL.txt
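
The cause is not confirmed in this issue. One common trigger for "second enclosure string not present" is a field that opens with the enclosure character but is not closed on the same physical record, for example because of an embedded quote or line break. Assuming the data file uses double quotes as the enclosure, a quick pre-check could look like this (the sample lines are invented, not taken from PCORNET_MED.TXT):

```python
def lines_with_unbalanced_quotes(lines, quote='"'):
    """Return 1-based numbers of lines holding an odd number of quote characters."""
    return [n for n, line in enumerate(lines, start=1)
            if line.count(quote) % 2 == 1]


# Invented sample records in a pipe-delimited, quote-enclosed layout
sample = [
    '1|"\\PCORI\\MEDICATION\\"|"Medications"|"tooltip ok"',
    '2|"\\PCORI\\MEDICATION\\X\\"|"Drug X"|"tooltip broken',  # closing quote missing
]
print(lines_with_unbalanced_quotes(sample))  # [2]
```

If nearly every record is flagged, the mismatch is more likely between the control file's enclosure or record terminator settings and the file format than in individual rows.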

Umlauts Replaced by ? inside Diag Ontology

20 ICD10 codes in the txt file in the ForUpgradeOnly folder have ? where an umlaut (or its 'oe' transliteration) belongs. In the pcornet_diag.txt v2.1.2 file they appear to have been only partially fixed (c_tooltip still has ?). Was that part of the 2.1.2 fix you applied earlier?

Example:
Inside ForUpgradeOnly folder - ICD10-CM-SCILHS-2015AA.zip:
6|"\PCORI\DIAGNOSIS\10(C00-D49) Neop~k3n8(C81-C96) Mali~u9p7(C88) Malignan~9aq7(C88.0) Walden~cfpb\"|"**Waldenstr?m macroglobulinemia**"|"N"|"FAE"|0|"ICD10:C88.0"||"concept_cd"|"CONCEPT_DIMENSION"|"concept_path"|"T"|"LIKE"|"\PCORI\DIAGNOSIS\10(C00-D49) Neop~k3n8(C81-C96) Mali~u9p7(C88) Malignan~9aq7(C88.0) Walden~cfpb\"||"Diagnoses \ Neoplasms (c00-d49) \ Malignant neoplasms of lymphoid, hematopoietic and related tissue (c81-c96) \ Malignant immunoproliferative diseases and certain other b-cell lymphomas \ **Waldenstr?m macroglobulinemia**"|"@"|2015/01/01 12:00:00 AM|2016/06/29 04:46:17 PM|2016/06/29 04:46:18 PM|"RPDR_2015"|||"\PCORI\DIAGNOSIS\10(C00-D49) Neop~k3n8(C81-C96) Mali~u9p7(C88) Malignan~9aq7\"|"(C88.0) Walden~cfpb"|"C88.0"

Inside pcornet_diag.zip in ontology folder
6|"\PCORI\DIAGNOSIS\10(C00-D49) Neop~k3n8(C81-C96) Mali~u9p7(C88) Malignan~9aq7(C88.0) Walden~cfpb\"|**"Waldenstroem macroglobulinemia"**|"N"|"FAE"|0|"ICD10:C88.0"||"concept_cd"|"CONCEPT_DIMENSION"|"concept_path"|"T"|"LIKE"|"\PCORI\DIAGNOSIS\10(C00-D49) Neop~k3n8(C81-C96) Mali~u9p7(C88) Malignan~9aq7(C88.0) Walden~cfpb\"||"Diagnoses \ Neoplasms (c00-d49) \ Malignant neoplasms of lymphoid, hematopoietic and related tissue (c81-c96) \ Malignant immunoproliferative diseases and certain other b-cell lymphomas \ **Waldenstr?m macroglobulinemia**"|"@"|2015/01/01 12:00:00 AM|2016/06/29 04:46:17 PM|2016/06/29 04:46:18 PM|"RPDR_2015"|||"\PCORI\DIAGNOSIS\10(C00-D49) Neop~k3n8(C81-C96) Mali~u9p7(C88) Malignan~9aq7\"|"(C88.0) Walden~cfpb"|"C88.0"
This affects the following fields: c_name, c_tooltip, but not c_dimcode, so the other bug I found for 2.1.2 milestone doesn't seem to address this issue.

For consistency's sake, it would make sense to apply the change to the ForUpgradeOnly file as well.

Question: should this fix be applied consistently to both c_name and c_tooltip?
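
To find the remaining rows, a rough heuristic is to flag any field where a ? sits directly between two letters. This is only a sketch: the naive split on | assumes no field contains the separator, and the shortened sample row is illustrative rather than a verbatim line from the file.

```python
import re

# A question mark squeezed between two letters is unlikely to be real punctuation
SUSPECT = re.compile(r"[A-Za-z]\?[A-Za-z]")


def fields_with_suspect_qmark(row: str, sep: str = "|"):
    """Return 0-based indexes of fields that look like they lost an umlaut."""
    return [i for i, field in enumerate(row.split(sep)) if SUSPECT.search(field)]


# Shortened sample: c_name already fixed, tooltip still damaged
row = ('6|"\\PCORI\\DIAGNOSIS\\x\\"|"Waldenstroem macroglobulinemia"|"N"|'
       '"Diagnoses \\ Waldenstr?m macroglobulinemia"')
print(fields_with_suspect_qmark(row))  # [4]
```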

Minor cleanup

Some c_symbols have '/'
Some pcori basecodes have prefixes
Doesn't affect anything, just for cleanliness

c_dimcode needs a fix for several more pcornet_agetree.txt rows

@jklann, per our discussion Monday, I checked and HIPAA requires obscuring age if greater than 89, so my original comment was wrong. However, while I was checking for that, I noticed some more bad c_dimcodes in pcornet_agetree.txt for v2.1.1 of the ontology. (https://github.com/SCILHS/scilhs-ontology/tree/master/Ontology/ForUpgradingOnly/pcornet_agetree.txt)

  1. Date query in c_dimcode reversed.
    C_FULLNAME|C_OPERATOR|C_DIMCODE
    \PCORI\ENCOUNTER\Age at visit>= 65 years old\80| BETWEEN|((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 81)-1) AND ((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 80)-1)

\PCORI\ENCOUNTER\Age at visit>= 65 years old\81| BETWEEN|((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 82)-1) AND ((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 81)-1)

  2. Date query in c_dimcode reversed and wrong numbers
    C_FULLNAME|C_OPERATOR|C_DIMCODE
    \PCORI\ENCOUNTER\Age at visit>= 65 years old\89|BETWEEN|((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 83)-1) AND ((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 82)-1)

Please review and make corrections accordingly. Thank you!
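
In standard SQL, x BETWEEN a AND b is equivalent to x >= a AND x <= b, so a reversed pair of bounds matches no rows. The sketch below checks a c_dimcode for both problems listed above. It assumes the intended pattern for age N is a lower bound built from (365.25 * N) and an upper bound from (365.25 * (N+1)); the birth-date subquery is abbreviated to `birth` to keep the sample strings short.

```python
import re

# Captures N from every "(365.25 * N)" term, in order of appearance
MULTIPLIER = re.compile(r"\(365\.25\s*\*\s*(\d+)\)")


def check_age_dimcode(age: int, c_dimcode: str):
    """Return a list of problems found in a BETWEEN-style age c_dimcode."""
    found = [int(m) for m in MULTIPLIER.findall(c_dimcode)]
    if len(found) != 2:
        return ["expected exactly two (365.25 * N) terms"]
    lower, upper = found
    problems = []
    if lower > upper:
        problems.append("bounds reversed")
    if sorted(found) != [age, age + 1]:
        problems.append(f"multipliers {sorted(found)} do not match age {age}")
    return problems


print(check_age_dimcode(80, "(birth + (365.25 * 81)-1) AND (birth + (365.25 * 80)-1)"))
# ['bounds reversed']
print(check_age_dimcode(89, "(birth + (365.25 * 83)-1) AND (birth + (365.25 * 82)-1)"))
# ['bounds reversed', 'multipliers [82, 83] do not match age 89']
```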

Hiddens and inactives cleanup

Hiddens and inactives could be confusing. Consider making hidden anything in the data model and inactive anything for a future release. Also consider releasing without hiddens.

PCORI_BASECODE incorrect for some principal discharge diagnosis flags

The PCORI_BASECODE values are inconsistent with the [CDM v3 specification](http://www.pcornet.org/wp-content/uploads/2014/07/2015-07-29-PCORnet-Common-Data-Model-v3dot0-RELEASE.pdf) for "Principal", "Secondary", and "Unable to classify" for the principal discharge diagnosis flag (PDX).

When running SCILHS/i2p-transform, I get values like '1' and '2' in my Diagnosis table rather than 'P' or 'S'.

The following ...

select c_fullname, c_name, c_basecode, pcori_basecode 
from "&&i2b2_meta_schema".pcornet_diag 
where c_fullname like '\PCORI_MOD\PDX\%'
and c_name in ('Principal', 'Secondary', 'Unable to classify');

...yields

c_fullname          c_name              c_basecode             pcori_basecode
\PCORI_MOD\PDX\S\   Secondary           2                      2
\PCORI_MOD\PDX\X\   Unable to classify  0                      0
\PCORI_MOD\PDX\P\   Principal           DiagObs:PRIMARY_DX_YN  1
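
A small check makes the mismatch explicit. The expected letters below are an assumption: 'P' and 'S' come from the report above, and 'X' for "Unable to classify" is inferred from the last segment of the c_fullname path.

```python
# Expected CDM letters (assumption, see note above)
EXPECTED_PDX = {"Principal": "P", "Secondary": "S", "Unable to classify": "X"}

# Rows as returned by the query: c_fullname, c_name, c_basecode, pcori_basecode
rows = [
    ("\\PCORI_MOD\\PDX\\S\\", "Secondary", "2", "2"),
    ("\\PCORI_MOD\\PDX\\X\\", "Unable to classify", "0", "0"),
    ("\\PCORI_MOD\\PDX\\P\\", "Principal", "DiagObs:PRIMARY_DX_YN", "1"),
]

# Collect (c_name, actual, expected) for every row that disagrees
mismatches = [
    (c_name, pcori_basecode, EXPECTED_PDX[c_name])
    for _, c_name, _, pcori_basecode in rows
    if pcori_basecode != EXPECTED_PDX[c_name]
]
for c_name, actual, expected in mismatches:
    print(f"{c_name}: pcori_basecode is {actual!r}, expected {expected!r}")
```

All three rows are reported, which matches the '1' and '2' values seen in the Diagnosis table.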

c_fullname in pcornet_lab fix. Connies issue #3.

In pcornet_lab c_fullname for Version row is actually "\PCORI\LAB_RESULT_CM\Version", so the setversion() function doesn't work on it, because it's case-sensitive (defined in ontology-utils-mssql.sql). The two solutions would be a) make the c_fullname consistent with other tables ('VERSION'), or b) adapt the ontology-utils-mssql script to disregard case when updating the Version c_name.
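
Option b) can be illustrated with SQLite, whose `=` comparison is case-sensitive by default. The `set_version` helper below is a hypothetical stand-in; the real setversion() in the MSSQL utilities script is not shown here, and on SQL Server the behaviour of a plain comparison depends on the column's collation. The version label is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pcornet_lab (c_fullname TEXT, c_name TEXT)")
conn.execute("INSERT INTO pcornet_lab VALUES (?, ?)",
             ("\\PCORI\\LAB_RESULT_CM\\Version", "old version label"))


def set_version(conn, table, path, label, ignore_case):
    """Update the version row's c_name and return how many rows changed."""
    # Table name is interpolated only because this toy example passes a trusted value
    condition = "UPPER(c_fullname) = UPPER(?)" if ignore_case else "c_fullname = ?"
    cur = conn.execute(f"UPDATE {table} SET c_name = ? WHERE {condition}",
                       (label, path))
    return cur.rowcount


target = "\\PCORI\\LAB_RESULT_CM\\VERSION"
strict = set_version(conn, "pcornet_lab", target, "placeholder label", ignore_case=False)
relaxed = set_version(conn, "pcornet_lab", target, "placeholder label", ignore_case=True)
print(strict, relaxed)  # 0 1
```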

If the assumptions above are correct, we will go ahead and make the corrections on our side without waiting for official corrections on GitHub. Down the road, though, we would like these to be made official so that we stay consistent with other sites.

Also, could you please recap what the version of each pcornet table is going to be after the latest upgrade? Here's how we believe it should look in the end:

pcornet_demo     2.1.1
pcornet_diag     2.1.1
pcornet_enc      2.1.1
pcornet_enroll   2.1
pcornet_lab      2.1
pcornet_med      2.2
pcornet_proc     2.1
pcornet_vital    2.1

'Hispanic' demographic fix. Connies issue #2.

The insert statement for '\PCORI\DEMOGRAPHIC\HISPANIC\R' has some values that are probably erroneous.
line 79: INSERT INTO [PCORI_Dev].[dbo].[pcornet_demo]([C_HLEVEL], [C_FULLNAME], [C_NAME], [C_SYNONYM_CD], [C_VISUALATTRIBUTES], [C_TOTALNUM], [C_BASECODE], [C_METADATAXML], [C_FACTTABLECOLUMN], [C_TABLENAME], [C_COLUMNNAME], [C_COLUMNDATATYPE], [C_OPERATOR], [C_DIMCODE], [C_COMMENT], [C_TOOLTIP], [M_APPLIED_PATH], [UPDATE_DATE], [DOWNLOAD_DATE], [IMPORT_DATE], [SOURCESYSTEM_CD], [VALUETYPE_CD], [M_EXCLUSION_CD], [C_PATH], [C_SYMBOL], [PCORI_BASECODE])
VALUES(3, '\PCORI\DEMOGRAPHIC\HISPANIC\R', 'Refuse to Answer', 'N', 'LAE', NULL, 'ETHNICITY:R', '', 'concept_cd', 'PATIENT_DIMENSION', 'RACE_CD', 'T', 'IN', '''07'',''r'',''refused''', '', 'Non-Hispanic', '@', '20140509 11:12:04.0', '20140509 11:12:04.0', '20140509 11:12:04.0', 'PCORNET_CDM', '', '', '\PCORI\DEMOGRAPHIC\HISPANIC', 'R', 'R')

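Should that instead be the following (C_FACTTABLECOLUMN changed from 'concept_cd' to 'PATIENT_NUM', and C_TOOLTIP changed from 'Non-Hispanic' to 'Refuse to Answer'):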
INSERT INTO [PCORI_Dev].[dbo].[pcornet_demo]([C_HLEVEL], [C_FULLNAME], [C_NAME], [C_SYNONYM_CD], [C_VISUALATTRIBUTES], [C_TOTALNUM], [C_BASECODE], [C_METADATAXML], [C_FACTTABLECOLUMN], [C_TABLENAME], [C_COLUMNNAME], [C_COLUMNDATATYPE], [C_OPERATOR], [C_DIMCODE], [C_COMMENT], [C_TOOLTIP], [M_APPLIED_PATH], [UPDATE_DATE], [DOWNLOAD_DATE], [IMPORT_DATE], [SOURCESYSTEM_CD], [VALUETYPE_CD], [M_EXCLUSION_CD], [C_PATH], [C_SYMBOL], [PCORI_BASECODE])
VALUES(3, '\PCORI\DEMOGRAPHIC\HISPANIC\R', 'Refuse to Answer', 'N', 'LAE', NULL, 'ETHNICITY:R', '', 'PATIENT_NUM', 'PATIENT_DIMENSION', 'RACE_CD', 'T', 'IN', '''07'',''r'',''refused''', '', 'Refuse to Answer', '@', '20140509 11:12:04.0', '20140509 11:12:04.0', '20140509 11:12:04.0', 'PCORNET_CDM', '', '', '\PCORI\DEMOGRAPHIC\HISPANIC', 'R', 'R')

pcornet_agetree weird c_dimcode for Age 24

@jklann , we noticed that inside https://github.com/SCILHS/scilhs-ontology/blob/master/Ontology/ForUpgradingOnly/pcornet_agetree.txt file, c_dimcode value for '\PCORI\ENCOUNTER\Age at visit\18-34 years old\24 years old' is not consistent with the other calculation formulas. For example, for 23 years old it's
((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 23)-1) AND ((select birth_date from PCORI_Dev.dbo.patient_dimension where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num) + (365.25 * 24)-1)
But for 24 years old, it's:
getdate() - (365.25 *25) + 1 AND getdate() - (365.25 * 24) + 1

Could you please look into that and if necessary adjust the released file? Thank you!
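
If the 24-year-old row is meant to follow the same formula as its neighbours, the expected value can be generated from the pattern of the 23-year-old row. This is a sketch under that assumption; `age_at_visit_dimcode` is a hypothetical helper, not part of the released scripts.

```python
# Birth-date subquery exactly as it appears in the 23-year-old row
BIRTH = ("(select birth_date from PCORI_Dev.dbo.patient_dimension "
         "where patient_num = PCORI_Dev.dbo.visit_dimension.patient_num)")


def age_at_visit_dimcode(age: int) -> str:
    """Build the BETWEEN bounds for a single year of age at visit."""
    return (f"({BIRTH} + (365.25 * {age})-1) AND "
            f"({BIRTH} + (365.25 * {age + 1})-1)")


print(age_at_visit_dimcode(24))
```

Calling it with 23 reproduces the string quoted above for 23 years old, so the output for 24 is what a consistent row would contain.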

ICD-10 needs PCS and c_name fixes

The ICD-10 ontology for procedures needs a different basecode prefix to differentiate it from diagnoses. Also, the names have '@' signs strewn throughout, which clutters them.

How are the "D~s1uh" parts of Diagnosis paths computed?

I'm thinking about adopting the SCILHS pcornet diagnosis hierarchy, but I'd like to retain our ability to update our ontologies directly from UMLS. How are the D~s1uh parts of Diagnosis paths computed?

<key>\\PCORI_DIAG\PCORI\DIAGNOSIS\09\(001-999.99) D~qlur\(460-519.99) D~s1uh\(480-488.99) P~xhpo\(486) Pneumoni~ido6\</key>
<dimcode>\PCORI\DIAGNOSIS\09\(001-999.99) D~qlur\(460-519.99) D~s1uh\(480-488.99) P~xhpo\(486) Pneumoni~ido6\</dimcode>
<tooltip>ICD9CM \ Diseases and injuries \ Diseases of the respiratory system \ Pneumonia and influenza \ Pneumonia, organism unspecified</tooltip>

Release new pcornet_med ontology with pcori_ndc and pcori_rxnorm

Release new pcornet_med ontology with pcori_ndc and pcori_rxnorm already populated, to obviate the need to run my complex script that fills these in. Perhaps also release a simpler propagation script that can pull the columns forward into local children.

Diagnosis: remove ICD-9 from ICD-10 tree?

There is a feeling that ICD-9 should be removed from the ICD-10 tree (there is still a separate ICD-9 tree). Please comment if you have feelings on this!

E-mail thread:

Shawn Murphy said:
Thinking [removal of combined tree] would make it less mapping dependent for this audience. Clearly it depends on the type of user we are trying to satisfy.

From the transforms point of view, it seems that repeating codes in various ontologies is causing problems.

Thanks,
Shawn.

From: "Weber, Griffin M" [email protected]

My opinion on this is:

Partners created the mixed ICD10 and ICD9 ontology to make queries easier for investigators. This is useful in a stand alone i2b2 instance. However, I don't think this is appropriate for a federated network. There is no official ICD10-ICD9 mapping. So, different institutions might handle this differently. It is also challenging to map data across institutions. Mixing ontologies makes this more complicated. My suggestion would be to have an ICD10 ontology and a separate ICD9 ontology in SCILHS. If they have to be mixed, then update the ETL to exclude ICD9 codes within the ICD10 ontology. This can probably be done using the c_fullname in combination with a regular expression on the pcori_basecode field.
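
The last suggestion could be sketched roughly as follows. Everything here is a heuristic under stated assumptions: pcori_basecode is assumed to hold the bare code without a prefix, the ICD-10 tree is assumed to start with \PCORI\DIAGNOSIS\10, and only purely numeric ICD-9 codes are recognised (ICD-9 V and E codes begin with a letter and would need separate handling). The sample codes and shortened paths are illustrative.

```python
import re

ICD10_TREE_PREFIX = "\\PCORI\\DIAGNOSIS\\10"
# Three digits with an optional one- or two-digit decimal part, e.g. 486 or 273.3
NUMERIC_ICD9 = re.compile(r"^\d{3}(\.\d{1,2})?$")


def is_icd9_inside_icd10_tree(c_fullname, pcori_basecode):
    """True when a row under the ICD-10 tree carries a numeric ICD-9 code."""
    return (c_fullname.startswith(ICD10_TREE_PREFIX)
            and bool(NUMERIC_ICD9.match(pcori_basecode or "")))


rows = [
    ("\\PCORI\\DIAGNOSIS\\10(C00-D49) Neop~k3n8\\", "C88.0"),  # ICD-10 code, keep
    ("\\PCORI\\DIAGNOSIS\\10(C00-D49) Neop~k3n8\\", "273.3"),  # ICD-9 code, exclude
    ("\\PCORI\\DIAGNOSIS\\09\\(001-999.99) D~qlur\\", "486"),  # separate ICD-9 tree
]
print([is_icd9_inside_icd10_tree(path, code) for path, code in rows])
# [False, True, False]
```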

c_tooltip in pcornet_lab update. Connies Issue #1.

The pcornet_lab table upgrade had this code:

line 47: update pcornet_lab set c_tooltip=replace(c_fullname,'Renal Function','Electrolytes') where c_fullname like '%\CREATININE%'

Should that instead be:
update pcornet_lab set c_tooltip=replace(c_tooltip,'Renal function','Electrolytes') where c_fullname like '%\CREATININE%'

We ran the original code, but on review it didn't make much sense to replace c_tooltip with data from c_fullname, because the resulting c_tooltip now looks different from the other c_tooltip values. Also, none of the c_fullname values contained 'Renal Function', and c_tooltip currently has lower-case 'function', so the intended replacement didn't occur anyway.
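
The difference between the two statements can be reproduced on a single made-up row. The sketch uses SQLite, where replace() is case-sensitive; on SQL Server the matching inside REPLACE follows the collation, so the case point may behave differently there. The sample c_fullname and c_tooltip values are hypothetical.

```python
import sqlite3


def run(update_sql):
    """Apply one UPDATE to a fresh one-row table and return the resulting c_tooltip."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pcornet_lab (c_fullname TEXT, c_tooltip TEXT)")
    conn.execute("INSERT INTO pcornet_lab VALUES (?, ?)",
                 ("\\PCORI\\LAB_RESULT_CM\\CREATININE\\",
                  "Lab \\ Renal function \\ Creatinine"))
    conn.execute(update_sql)
    return conn.execute("SELECT c_tooltip FROM pcornet_lab").fetchone()[0]


original = run(r"update pcornet_lab set c_tooltip=replace(c_fullname,'Renal Function','Electrolytes') "
               r"where c_fullname like '%\CREATININE%'")
proposed = run(r"update pcornet_lab set c_tooltip=replace(c_tooltip,'Renal function','Electrolytes') "
               r"where c_fullname like '%\CREATININE%'")
print(original)  # \PCORI\LAB_RESULT_CM\CREATININE\
print(proposed)  # Lab \ Electrolytes \ Creatinine
```

The original statement overwrites the tooltip with the unchanged path, while the proposed one edits the tooltip in place.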
