
Comments (7)

cgreich commented on July 17, 2024

@gkennos: Good point. But in the sources the convention is weak, and Roman and Arabic numerals are mixed. We explicitly cleaned that up when we put it in. The decision to use Arabic was very pragmatic: search is impossible with the Roman notation (try getting stage II and not stage III). However, we should make that very clear. If we had Athena development resources, we could teach the tool to accept both.

from oncologywg.

gkennos commented on July 17, 2024

Not sure if the search has been updated since that decision, but it works for me?
here

Screenshot 2024-03-15 at 4 46 01 pm


cgreich commented on July 17, 2024

Well, your screenshot is nicely cutting off the right margin with the vocabulary names. All those you listed are imported vocabularies. We are not fixing, say, SNOMED. In the Cancer Modifier vocabulary, which we authored ourselves, all stages are Arabic.


gkennos commented on July 17, 2024

Apologies, I was trying to crop out my user name, but see below for completeness (or in the provided link above).

The screenshot was only attached to show that search is functional, not to make any point about the included terms. The cancer modifiers are definitely Arabic numerals, though, so I am not sure what you mean? here

Screenshot 2024-03-15 at 11 16 01 pm


cgreich commented on July 17, 2024

What's wrong with your user name? :)

I see what you mean. Well, Athena is trying to be smart. Looks like in this case it is. But it generally is struggling to create a reasonable list, and we have been futzing around with the various weights for partial words, upper and lower case, edit distance, all that. It lacks Google's background information and search can only tinker with the search string.

But if you go to Atlas, which uses simple SQL, you get all the stage IIIs. Same is true for any other application that doesn't have a smart search engine.
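The substring effect described above can be reproduced with a small sketch. It assumes a toy in-memory SQLite table and a plain `LIKE '%term%'` filter; the helper `search_concepts` and the sample names are made up for illustration and are not Atlas's actual query.

```python
import sqlite3


def search_concepts(names, term):
    """Naive substring search, the way a plain SQL LIKE filter would do it."""
    conn = sqlite3.connect(":memory:")
    # Toy table for illustration only; not the real vocabulary schema.
    conn.execute("CREATE TABLE concept (concept_name TEXT)")
    conn.executemany("INSERT INTO concept VALUES (?)", [(n,) for n in names])
    rows = conn.execute(
        "SELECT concept_name FROM concept "
        "WHERE concept_name LIKE ? ORDER BY concept_name",
        (f"%{term}%",),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


roman = ["Stage I", "Stage II", "Stage III", "Stage IV"]
arabic = ["Stage 1", "Stage 2", "Stage 3", "Stage 4"]

# "stage II" is a substring of "Stage III", so both rows come back.
roman_hits = search_concepts(roman, "stage II")

# "stage 2" is not a substring of any other stage name.
arabic_hits = search_concepts(arabic, "stage 2")

print(roman_hits)
print(arabic_hits)
```

With Roman names the search for stage II also returns stage III; with Arabic names it returns only the one row.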

What do you have against the Arabic numbers? What use case is suffering?


gkennos commented on July 17, 2024

Nothing - just habit of cropping out anything that is my full legal name if not necessary.

The use case that is suffering, IMO, is that of someone who is creating a new mapping and searches for 'Stage IV', as that is the term in the actual vocabulary, but cannot find it because it's been changed to 'Stage 4' as a technical workaround known only to a few people.

I understand that when you say both of those terms out loud, they are both 'stage four' of course. I would, however, argue there is a semantic difference, as the roman numerals are the ones expressly used by the AJCC standard (which the concatenation rules explicitly state is used as the source), and this is done to some degree for the purposes of differentiating those terms from other common staging measures for clarity, as 'stage' is quite an overloaded term.

If it is required as a technical workaround, would the more appropriate way to handle it be to either insert 'Stage 4' as a synonym or create a non-standard term that maps to the standard (vocabulary-specified) term?


cgreich commented on July 17, 2024

All understood and agreed, @gkennos. But the Arabic/Roman craziness is something we inherited from the existing vocabularies. So the trouble you describe, not finding what you are looking for, is already there. For example, there are 798 "stage IV" records across all vocabularies except Cancer Modifier, and 207 records containing "stage 4". It's true that in oncology the Roman dominates over the Arabic, but even in NAACCR you have 6 "stage 4" in addition to 100 "stage IV". Now what?

In Cancer Modifier, we decided to at least have it one way. And yes, it might be surprising to someone used to the Roman as it is used more. But if your search returns no result for "stage IV" you will probably figure it out (e.g. trying just "stage").

Your idea with the synonym is good. However, then we may as well go Roman (and deal with not being able to distinguish between Roman I, II, III, IV, VI and VII when searching for "I"), because if we put them into the synonyms they will turn up in the search results of Athena. We could also add special code to the Athena search engine to do automatic search term modification for Roman numbers. To have it documented would also be a win, but where would that documentation help? Nobody reads the manual before conducting a search.
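As a rough sketch of what such automatic search term modification might look like: the snippet below rewrites a Roman stage numeral in the query to its Arabic form before the search runs. The function name `normalize_stage_query` and the restriction to stages I through IV directly after the word "stage" are assumptions made for illustration; this is not Athena code.

```python
import re

# Only stages I-IV are handled in this sketch (an assumption).
ROMAN_TO_ARABIC = {"I": "1", "II": "2", "III": "3", "IV": "4"}

# "stage" followed by a whole-word Roman numeral. Requiring the word "stage"
# in front avoids rewriting the pronoun "I" or other stray letters.
_STAGE_RE = re.compile(r"\b(stage\s+)(III|II|IV|I)\b", re.IGNORECASE)


def normalize_stage_query(query):
    """Rewrite e.g. 'Stage IV' to 'Stage 4'; leave everything else untouched."""
    def to_arabic(match):
        return match.group(1) + ROMAN_TO_ARABIC[match.group(2).upper()]

    return _STAGE_RE.sub(to_arabic, query)


print(normalize_stage_query("Stage IV"))           # Stage 4
print(normalize_stage_query("AJCC Stage III lung"))  # AJCC Stage 3 lung
print(normalize_stage_query("stage IIIA"))         # unchanged: sub-stages not handled
```

Sub-stage suffixes such as "IIIA" are deliberately left alone here; whether and how to map them would need a separate decision.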

Not sure what the best solution is other than letting people figure it out.

