
cord19q's People

Contributors

davidmezzetti


cord19q's Issues

including embeddings.py, scoring.py, pipeline.py etc. in cord19q

Hi David,
Thank you for helping us run paperai v1.2.1 successfully. Unfortunately, this version does not include embeddings.py, scoring.py, pipeline.py, etc., which significantly improve the quality of the results.
So I tried to install cord19q v2.3.0, because we need to reproduce the high-quality results you obtained with previous cord19q versions. However, I'm getting the following error, which might be due to the recent changes you made to neuml/magnitude:
[Screenshot from 2020-08-16 17-31-25: error output, image not included]
Do you think I should try to install cord19q v2.3.0 with a previous version of magnitude (i.e. neuml/magnitude v0.1.143)? Or would you be able to do something on your side that would allow me to install and run cord19q v2.3.0?

Clear output on print

It would be great if the cord19q package on GitHub cleared the previous output before printing each new line in the etl, vector, and index functions. The output would be much cleaner without thousands of lines scrolling past on the display. Thanks.
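One way to get this behavior, as a minimal sketch only: overwrite a single status line with a carriage return instead of printing a new line per item. The `progress` helper, its parameters, and the message text are hypothetical; they are not part of cord19q, and the actual etl, vector, and index code is not shown in this issue.

```python
import sys


def progress(count, total, stream=sys.stdout):
    """Overwrite the current terminal line with a status message (hypothetical helper)."""
    # "\r" moves the cursor back to the start of the line, so each
    # message replaces the previous one instead of scrolling.
    stream.write(f"\rProcessed {count} of {total} articles")
    stream.flush()
    if count == total:
        # Finish with a newline so later output starts on a fresh line
        stream.write("\n")
```

Calling `progress(i, total)` inside a processing loop would keep the display to one line. This works in a terminal; in a notebook or a log file the carriage return may be rendered differently.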

Optimization idea

In etl.execute.py:

    for _, text in sections:
        # Look for at least one keyword match
        if re.findall(regex, text.lower()):
            tags = "COVID-19"

If we find a "COVID-19" tag, shouldn't we return right away? There is no need to look through the remaining sentences.
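The early exit suggested here could look like the following sketch. The function name `covid_tag`, the sample keyword pattern, and the `(name, text)` shape of `sections` are assumptions for illustration; the project's real regex and surrounding code are not shown in this issue.

```python
import re

# Hypothetical keyword pattern, for illustration only
KEYWORDS = re.compile(r"covid|sars-cov-2|2019-ncov")


def covid_tag(sections):
    """Return "COVID-19" as soon as any text matches a keyword, else None."""
    for _, text in sections:
        # re.search stops at the first match, whereas re.findall scans
        # the whole string and builds a list of every match
        if KEYWORDS.search(text.lower()):
            return "COVID-19"
    return None
```

Returning inside the loop skips all remaining entries after the first hit, and swapping `re.findall` for `re.search` avoids collecting matches that are never used.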

Also, in execute.files, it would be good to add a check for the key not being present in the dictionary:

    if row.get('has_pmc_xml_parse'):
        if row["has_pmc_xml_parse"].lower() == "true":
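The two nested checks could also be folded into one expression. This is a sketch under the assumption that `row` is a plain dictionary (for example, one produced by `csv.DictReader`); the helper name `has_pmc_parse` is hypothetical.

```python
def has_pmc_parse(row):
    """True only when has_pmc_xml_parse is present and equals "true" (any case)."""
    # dict.get returns None for a missing key; "or ''" also covers a key
    # that is present with a None value, so .lower() always runs on a str
    return (row.get("has_pmc_xml_parse") or "").lower() == "true"
```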

Thanks so much for making this code public. Learning so much from going through this.
