snomed-boot's Introduction

Snomed Boot

An extensible Java framework for loading SNOMED CT RF2 content into any implementation.

Key Features

  • Extensible factory based design
  • Supports component streaming for better overall performance and less pressure on memory
  • Ability to load only the latest version of each component when using a collection of snapshot archives
  • Multithreaded
    • Concepts load first
    • Then Relationships and Descriptions load in parallel
    • Then all reference set members load in parallel
  • Loading Profiles allow content to be filtered
    • by content type
    • by reference set filename pattern
    • by reference set identifier
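
The loading order listed above can be pictured with a plain ExecutorService. This is only a sketch under stated assumptions, not the library's code: the class name LoadOrderSketch, the method runPhases and the task labels are hypothetical, and each task returns its own name instead of reading an RF2 file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of phased loading; not part of snomed-boot.
final class LoadOrderSketch {

    // Runs three phases in order and returns the completed task names per phase.
    static List<List<String>> runPhases() {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<List<String>> completed = new ArrayList<>();
            // Phase 1: concepts load on their own.
            completed.add(runInParallel(executor, Arrays.asList("concepts")));
            // Phase 2: relationships and descriptions load side by side.
            completed.add(runInParallel(executor, Arrays.asList("relationships", "descriptions")));
            // Phase 3: every reference set file loads side by side (labels are made up).
            completed.add(runInParallel(executor, Arrays.asList("refset-file-a", "refset-file-b")));
            return completed;
        } finally {
            executor.shutdown();
        }
    }

    // Submits all tasks of one phase and blocks until each of them has finished.
    private static List<String> runInParallel(ExecutorService executor, List<String> names) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String name : names) {
            tasks.add(() -> name); // placeholder for reading one file
        }
        List<String> done = new ArrayList<>();
        try {
            for (Future<String> future : executor.invokeAll(tasks)) {
                done.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Loading failed", e.getCause());
        }
        return done;
    }
}
```

Because each phase blocks until its tasks finish, a relationship or description row is only processed once every concept has been seen.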

Component Factories

This project is oriented around the ComponentFactory and HistoryAwareComponentFactory interfaces. They allow a factory implementation to receive the properties of every component and reference set member. The HistoryAwareComponentFactory is useful when loading full files, which contain more than one release.
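
To illustrate why a factory might care about release history, here is a minimal sketch of a receiver that is handed every state of a concept and keeps only the most recent one. It does not implement the real interfaces: the class LatestStateSketch and the method onConceptRow, with its three string parameters, are hypothetical stand-ins for whatever the factory callbacks actually look like.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical receiver; names and parameters are illustrative, not the snomed-boot interface.
final class LatestStateSketch {

    // Most recent effectiveTime seen per concept id, for example "20170131".
    private final Map<String, String> latestEffectiveTime = new HashMap<>();
    // Active flag belonging to that most recent state.
    private final Map<String, Boolean> latestActive = new HashMap<>();

    // Called once per row; rows may arrive in any order.
    void onConceptRow(String id, String effectiveTime, String active) {
        String previous = latestEffectiveTime.get(id);
        // yyyyMMdd strings sort chronologically, so string comparison is sufficient.
        if (previous == null || effectiveTime.compareTo(previous) > 0) {
            latestEffectiveTime.put(id, effectiveTime);
            latestActive.put(id, "1".equals(active));
        }
    }

    // Returns true only if the concept is known and its latest state is active.
    boolean isActive(String id) {
        return Boolean.TRUE.equals(latestActive.get(id));
    }

    // Returns the effectiveTime of the latest state, or null for an unknown id.
    String effectiveTime(String id) {
        return latestEffectiveTime.get(id);
    }
}
```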

In-Memory Implementation

The default factory implementation loads content into memory. It builds a map of concepts with their descriptions and relationships connected. This is an extremely fast way to get hold of the transitive closure for every concept.

// Create release importer
ReleaseImporter releaseImporter = new ReleaseImporter();

// Load SNOMED CT components into memory
ComponentStore componentStore = new ComponentStore();
releaseImporter.loadSnapshotReleaseFiles("release/SnomedCT_RF2Release_INT_20170131", LoadingProfile.light, new ComponentFactoryImpl(componentStore));
Map<Long, ? extends Concept> conceptMap = componentStore.getConcepts();

// Get transitive closure for concept 285355007 | Blood blister (disorder) |
// The map is keyed by Long, so the lookup key must be a long literal rather than a String
Set<Long> transitiveClosure = conceptMap.get(285355007L).getInferredAncestorIds();

Contribute

We welcome your suggestions and contributions. Feel free to fork this project and submit a pull request of your changes.

snomed-boot's People

Contributors

codermchu, dependabot[bot], kaicode, pgwilliams, quyenly87

snomed-boot's Issues

UK release filenames are not matched for loading

@rorydavidson -
The UK edition uses file names of the following form for the description and language reference set files:
sct2_Description_Snapshot-en-GB_GB1000000_20181031.txt
The en-GB notation is not incorrect, but it is only used by the UK. It caused these files to be skipped on import.
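
As an illustration of the matching problem, the sketch below accepts a language code with an optional region part. The pattern is an assumption written for this example and is not the one used by the library; the class DescriptionFileNameSketch and the method languageTag are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the pattern is not taken from snomed-boot.
final class DescriptionFileNameSketch {

    // Language code, optional region (such as -GB), a namespace token, an 8-digit date.
    private static final Pattern DESCRIPTION_SNAPSHOT = Pattern.compile(
            "sct2_Description_Snapshot-([a-z]{2}(?:-[A-Za-z]{2})?)_([A-Za-z0-9]+)_(\\d{8})\\.txt");

    // Returns the language tag from the file name, or null when the name does not match.
    static String languageTag(String fileName) {
        Matcher matcher = DESCRIPTION_SNAPSHOT.matcher(fileName);
        return matcher.matches() ? matcher.group(1) : null;
    }
}
```

With this pattern, both the UK name quoted above and a name with a plain en language code are recognised as description files.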

ConceptImpl.java throws IllegalStateException

When I tested loading the SnomedCT_RF2Release_INT_20160731 data with the LoadingProfile.complete configuration, I got an IllegalStateException.
I then found this code in ConceptImpl.java:

private Set collectParentIds(ConceptImpl concept, Set ancestors, Stack stack) {
    for (Concept parentInt : concept.parents) {
        ConceptImpl parent = (ConceptImpl) parentInt;
        final Long parentId = parent.id;
        if (stack.contains(parentId)) {
            stack.push(parentId);
            throw new IllegalStateException("Ancestor loop detected: " + stack.toString());
        }
        ancestors.add(parentId);
        stack.push(parentId);
        collectParentIds(parent, ancestors, stack);
        stack.pop();
    }
    return ancestors;
}

Could you explain why an IllegalStateException is thrown there?
Why not use continue instead?
Thanks
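
For readers who want to experiment with the question, here is a self-contained sketch of the same kind of ancestor walk over a plain map of parent ids. It is not the ConceptImpl code and does not state the maintainers' rationale: the class AncestorLoopSketch and its map-based input are assumptions made for this example. Unlike the quoted method, it also places the starting concept on the path.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an ancestor walk with loop detection; not the ConceptImpl code.
final class AncestorLoopSketch {

    // Collects all ancestors of conceptId; parents maps a concept id to its direct parent ids.
    static Set<Long> ancestors(long conceptId, Map<Long, List<Long>> parents) {
        Deque<Long> path = new ArrayDeque<>();
        path.push(conceptId);
        return collect(conceptId, parents, new HashSet<>(), path);
    }

    private static Set<Long> collect(long id, Map<Long, List<Long>> parents,
                                     Set<Long> found, Deque<Long> path) {
        for (Long parentId : parents.getOrDefault(id, Collections.emptyList())) {
            // A parent that is already on the current path means the hierarchy contains a cycle.
            if (path.contains(parentId)) {
                throw new IllegalStateException("Ancestor loop detected: " + path + " -> " + parentId);
            }
            found.add(parentId);
            path.push(parentId);
            collect(parentId, parents, found, path);
            path.pop();
        }
        return found;
    }
}
```

In this sketch, removing the check entirely would let a cycle recurse until the call stack overflows. Replacing the throw with continue would end the walk quietly, returning a result while hiding the fact that the hierarchy is not a valid acyclic graph.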

File loading exceptions, other than in concept file, are logged but not thrown.

The concept file is loaded in the main thread. If there is an issue loading that file, for example a missing file header, an exception is thrown from the ReleaseImporter class. However, all other files are loaded in worker threads, and loading errors there are logged without causing an exception to be thrown. This means that an implementation has no way of knowing whether the complete content has been loaded.
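
One common way to surface such failures in Java is to keep the Future of every submitted task and call get() on each, which rethrows a worker's exception wrapped in an ExecutionException. The sketch below shows that idea in isolation. It is not a patch for ReleaseImporter: the class PropagateFailuresSketch and the method runAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; shows exception propagation from worker threads to the caller.
final class PropagateFailuresSketch {

    // Runs every load task on a pool and fails if any single task failed.
    static void runAll(List<Callable<Void>> loadTasks) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (Callable<Void> task : loadTasks) {
                futures.add(executor.submit(task));
            }
            for (Future<Void> future : futures) {
                try {
                    // get() blocks until the task ends and rethrows its failure, if any.
                    future.get();
                } catch (ExecutionException e) {
                    throw new IllegalStateException("A file failed to load", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while loading", e);
                }
            }
        } finally {
            executor.shutdown();
        }
    }
}
```

The caller then sees a single exception whose cause is the original error from the worker thread, so partial loads are no longer silent.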

Factory method names are misleading

The factory method names "createConcept", "addRelationship" and "addDescription" do not accurately reflect the meaning of a row in an RF2 file. A row may not be a new concept being created but a new state of an existing concept. Similarly for relationships and descriptions, the new state of these components could be updating or retiring an existing component.
Names like "newConceptState" and "newDescriptionState" would be much more accurate and not lead to incorrect assumptions by developers using or extending the library.

Null pointer if stated relationship or text definition files missing.

I am seeing this error if the text definition RF2 file is missing from the archive I am loading:
2018-02-13 10:09:05,893 [ERROR ] [pool-1-thread-4] org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun - Failed to read or process lines.
java.lang.NullPointerException
    at java.nio.file.Files.provider(Files.java:97)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at java.nio.file.Files.newBufferedReader(Files.java:2784)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.readLines(ReleaseImporter.java:412)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.access$200(ReleaseImporter.java:106)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun$6.call(ReleaseImporter.java:363)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun$6.call(ReleaseImporter.java:359)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Works only with International Release?

I tried this with the Swedish release and it seems it was not designed to handle extensions. Are national releases within the scope of this package?
Cheers,
Daniel

Loading RF2 with missing header row should fail

I've just been caught out loading a large RF2 reference set which had no header row. SNOMED Boot loaded the file without any error. It told me that the field names of the reference set were the values from the first row which was actually an OWL expression value!

The library should check that the first line of every RF2 file being loaded starts with the tab-separated fields "id", "effectiveTime", "active" and "moduleId".
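
A check along these lines could look like the following sketch. It assumes the first line has already been read as a string and that RF2 columns are tab-separated; the class Rf2HeaderCheckSketch and the method parseHeader are hypothetical and not part of the library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a header check; not part of snomed-boot.
final class Rf2HeaderCheckSketch {

    // The four leading columns named in the request above.
    private static final List<String> REQUIRED =
            Arrays.asList("id", "effectiveTime", "active", "moduleId");

    // Returns the header field names, or throws if the line is not an RF2 header.
    static String[] parseHeader(String firstLine) {
        String[] fields = firstLine == null ? new String[0] : firstLine.split("\t");
        boolean valid = fields.length >= REQUIRED.size()
                && Arrays.asList(fields).subList(0, REQUIRED.size()).equals(REQUIRED);
        if (!valid) {
            throw new IllegalArgumentException(
                    "First line is not an RF2 header, expected it to start with " + REQUIRED);
        }
        return fields;
    }
}
```

A file whose first line is a data row then fails immediately instead of having its values treated as field names.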
