Comments (7)
The Snowstorm "semantic index" is an index of all concept parents, ancestors, attributes and attribute groups. This is used to answer the findConceptParents call and other hierarchy and ECL queries.
Initial thoughts:
- The extension inferred relationship 739111000009126 will have been imported because a relationship with that identifier does not exist in the International Edition. Checks are based on the component id here.
- The semantic index may have been updated incorrectly because the triple 81260002 is-a 321351000009104 is active in one relationship and inactive using a different relationship identifier. Snowstorm does not always catch this sort of update when importing an extension.
There is a workaround for this scenario. Could you try rebuilding the semantic index please? This can be done using the rebuildBranchTransitiveClosure function under the Concepts area of Swagger. (This will move to the new Admin area of Swagger in v3.x.)
from snowstorm.
There are a few duplicate triples like this in the International Snapshot. When building the semantic index we sort by effectiveTime and active to get the most effective relationships in the right order for processing but it looks like avoiding duplicate triples with different relationship ids is not working when importing a delta. I would be interested to hear if rebuilding the semantic index solves this.
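To make the failure mode concrete, here is a minimal Python sketch. It is a simplified model and not Snowstorm's code: the `Rel` tuple, the relationship ids `r1` and `r2`, and the effective times are invented for illustration. Only the two concept ids come from the discussion above, and which of the two relationships is the inactive one is an assumption. It keys parent edges by triple and shows how a delta pass that handles rows one at a time can drop an edge that another relationship id still asserts.

```python
from collections import namedtuple

# Hypothetical, simplified relationship row (not the full RF2 column set).
Rel = namedtuple("Rel", "rel_id source type_id destination active effective_time")

IS_A = "116680003"  # SNOMED CT "Is a" attribute


def apply_rows_naively(edges, rows):
    """Apply rows in (effectiveTime, active) order; each row adds or removes its triple."""
    for row in sorted(rows, key=lambda r: (r.effective_time, r.active)):
        triple = (row.source, row.type_id, row.destination)
        if row.active:
            edges.add(triple)
        else:
            # Removes the edge without checking whether another
            # relationship id still asserts the same triple.
            edges.discard(triple)
    return edges


# Snapshot: the triple is active under relationship id r1 (assumed).
snapshot = [Rel("r1", "81260002", IS_A, "321351000009104", True, "20180401")]
# Delta: the same triple is inactive under a different id r2 (assumed).
delta = [Rel("r2", "81260002", IS_A, "321351000009104", False, "20190401")]

edges = apply_rows_naively(set(), snapshot)
edges_after_delta = apply_rows_naively(set(edges), delta)
```

In this model the parent edge is present after the snapshot and missing after the delta, even though `r1` was never inactivated.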
from snowstorm.
I just ran the rebuild. Now I do get back that parent when I query the MAIN/SNOMED-VET endpoint, but it is very slow to return (like 6-7 seconds). Before it was pretty much instantaneous. I also see this log message output when I call that endpoint now:
2019-04-30 12:04:32.967 WARN 2794 --- [/O dispatcher 1] org.elasticsearch.client.RestClient : request [GET http://localhost:9200/es-query/query-concept/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&search_type=dfs_query_then_fetch&batched_reduce_size=512] returned 1 warnings:
[299 Elasticsearch-6.4.2-04711c2 "Deprecated: the number of terms [699096] used in the Terms Query
request has exceeded the allowed maximum of [65536]. This maximum can be set by changing the
[index.max_terms_count] index level setting." "Tue, 30 Apr 2019 17:04:25 GMT"]
That seems surprising for just a single concept parent query.
from snowstorm.
Thanks for trying that. I'm glad you are getting the desired parents back now. The semantic index rebuild on this branch had quite a performance impact, didn't it!
It's slower now because we have two full semantic indexes sitting on top of each other: one on MAIN and the other on the MAIN/SNOMED-VET branch. The large query and the slower query time are because the query clause is excluding all the concepts in the MAIN semantic index after they were all replaced when it was rebuilt on SNOMED-VET. Just about the only weakness of Snowstorm is that if you replace tens of thousands of components on branches other than MAIN, things will start to slow down.
I've marked this down as a bug. It's going to take some thought to solve this without impacting the performance of the incremental semantic index update. Thanks for reporting the issue.
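The layering effect described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how Snowstorm stores data: the `Branch` class, its `replaced` set and the document ids are hypothetical. The idea is that a child branch inherits its parent's entries and keeps a list of parent entries it has superseded, and that list has to be sent as an exclusion clause with every query.

```python
class Branch:
    """Toy versioned branch: own entries plus a list of superseded parent entries."""

    def __init__(self, path, parent=None):
        self.path = path
        self.parent = parent
        self.docs = {}         # doc_id -> entry written on this branch
        self.replaced = set()  # parent doc ids superseded on this branch

    def write(self, doc_id, entry, replaces=None):
        self.docs[doc_id] = entry
        if replaces is not None:
            self.replaced.add(replaces)

    def visible(self):
        """Return (visible entries, number of ids the query must exclude)."""
        if self.parent is None:
            return dict(self.docs), 0
        inherited, _ = self.parent.visible()
        merged = {k: v for k, v in inherited.items() if k not in self.replaced}
        merged.update(self.docs)
        return merged, len(self.replaced)


main = Branch("MAIN")
for i in range(1000):
    main.write(f"main-{i}", {"concept": i})

vet = Branch("MAIN/SNOMED-VET", parent=main)

# Incremental update: one entry replaced, so the exclusion list stays tiny.
vet.write("vet-0", {"concept": 0}, replaces="main-0")
_, excluded_incremental = vet.visible()

# Full rebuild on the child branch: every parent entry is replaced.
for i in range(1000):
    vet.write(f"vet-{i}", {"concept": i}, replaces=f"main-{i}")
entries_after_rebuild, excluded_after_rebuild = vet.visible()
```

The number of visible entries is the same before and after the rebuild, but the exclusion list grows from one id to one per parent entry, which is consistent with the very large Terms Query reported in the log.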
from snowstorm.
@dkincaid If you would like this working now, another workaround you could try is to import the vet extension into MAIN, then rebuild the semantic index on MAIN and just not use the SNOMED-VET branch. That should give you fast, consistent results until this bug can be fixed.
from snowstorm.
Hi @dkincaid,
In version 4.1.0 of Snowstorm we have updated the semantic index update function to use all active triples (source, type and destination concept) when processing each relationship change. This was necessary because in the US Edition there are over one hundred cases of triples being made inactive in the US module straight after the same triple is made active in the International module. The inactivation in the US module is done using a different relationship id but Snowstorm was making the triple inactive until this fix.
This should also fix the issue you were seeing where relationships were going missing because I believe this was happening for the very same reason. This fix should give you accurate child/parent/ECL results straight after the RF2 import. The workaround we tried before gives me confidence that v4.1.0 (or later) will work for you without wrecking your performance.
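The triple-based approach described above can be illustrated roughly as follows. This is a sketch of the idea only, not the 4.1.0 implementation: the function name, the dictionary keys, the relationship ids `intl-1` and `us-1`, and the placeholder concepts `A` and `B` are all hypothetical. A triple counts as active if at least one relationship id currently asserts it as active.

```python
def active_triples(rows):
    """Return triples that have at least one currently active relationship id."""
    latest_by_id = {}
    # Effective times are YYYYMMDD strings, so string order is date order.
    for row in sorted(rows, key=lambda r: r["effective_time"]):
        latest_by_id[row["rel_id"]] = row  # newest version of each id wins
    return {
        (r["source"], r["type_id"], r["destination"])
        for r in latest_by_id.values()
        if r["active"]
    }


rows = [
    # Made active in one module (hypothetical id and date).
    {"rel_id": "intl-1", "source": "A", "type_id": "116680003",
     "destination": "B", "active": True, "effective_time": "20190131"},
    # Same triple made inactive under a different id shortly afterwards.
    {"rel_id": "us-1", "source": "A", "type_id": "116680003",
     "destination": "B", "active": False, "effective_time": "20190301"},
]

result = active_triples(rows)

# Only when every id asserting the triple is inactive does the triple drop out.
later_inactivation = {"rel_id": "intl-1", "source": "A", "type_id": "116680003",
                      "destination": "B", "active": False, "effective_time": "20190731"}
result_after_inactivation = active_triples(rows + [later_inactivation])
```

Here the inactive `us-1` row no longer hides the triple, because `intl-1` still asserts it.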
I just thought I should let you know in case you have time to try it again. I recommend deleting all your Snowstorm Elasticsearch indexes and starting afresh, because some of the index mappings have changed to support better non-English search and other features. We still require just the date in the effectiveTime field, so remember to simplify those if you do import the Vet Extension.
I hope you are tempted to try! 😄
Kind regards,
Kai
from snowstorm.
Closing this ticket because I believe it's fixed in 4.1.0.
Please add comments or reopen the ticket as required.
from snowstorm.