

gregory's People

Contributors

anachaba, antoniolopes, brunoamaral, dippas


gregory's Issues

Move API to django rest framework

The endpoints to port are listed below; a minimal DRF sketch follows the list.

  • List all articles.

https://api.gregory-ms.com/articles/all

Example: https://api.gregory-ms.com/articles/all

  • List the article that matches the {ID} number.

https://api.gregory-ms.com/articles/id/{ID}

Example: https://api.gregory-ms.com/articles/id/19

  • List all articles by keyword.

https://api.gregory-ms.com/articles/keyword/{keyword}

Example: https://api.gregory-ms.com/articles/keyword/myelin

  • List related articles by keywords.

POST https://gregory-ms.com/articles/related/

Expects a JSON object of keywords in the POST body.

{ "keywords": ["trials", "gait rehabilitation", "multiple sclerosis"] }

Example: POST https://gregory-ms.com/articles/related/

  • List all relevant articles.

These are articles that we show on the home page because they appear to offer new courses of treatment.

https://api.gregory-ms.com/articles/relevant

Example: https://api.gregory-ms.com/articles/relevant

Articles’ Sources

  • List all articles from the specified {source}.

https://api.gregory-ms.com/articles/source/{source_id}

Example: https://api.gregory-ms.com/articles/source/1

  • List all available sources.

https://api.gregory-ms.com/articles/sources

Example: https://api.gregory-ms.com/articles/sources

Trials

  • List all trials.

https://api.gregory-ms.com/trials/all

Example: https://api.gregory-ms.com/trials/all

  • List all trials by keyword.

https://api.gregory-ms.com/trials/keyword/{keyword}

Example: https://api.gregory-ms.com/trials/keyword/myelin

Trials’ Sources

  • List all trials from the specified {source}.

https://api.gregory-ms.com/trials/source/{source}

Example: https://api.gregory-ms.com/trials/source/pubmed

  • List all available sources.

https://api.gregory-ms.com/trials/sources

Example: https://api.gregory-ms.com/trials/sources
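
Since the proposal is to port these endpoints to Django REST framework, here is a minimal sketch of what the articles endpoint could look like with DRF, reusing the Articles model shown further down this page; the gregory app label, the serializer fields and the router wiring are assumptions, not the final design.

from rest_framework import routers, serializers, viewsets

from gregory.models import Articles  # app label is an assumption


class ArticleSerializer(serializers.ModelSerializer):
	class Meta:
		model = Articles
		fields = ['article_id', 'title', 'summary', 'link', 'published_date', 'relevant']


class ArticleViewSet(viewsets.ReadOnlyModelViewSet):
	# Read-only list/detail endpoints, e.g. /articles/ and /articles/{id}/
	queryset = Articles.objects.all().order_by('-published_date')
	serializer_class = ArticleSerializer


router = routers.DefaultRouter()
router.register(r'articles', ArticleViewSet)
# urlpatterns = router.urls  # wired into the project's urls.py

Keyword, source and relevance lookups could then become query parameters handled by DRF filter backends instead of separate URL paths, though that is a design choice left to this issue.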

add more information about sources to the database

Example:

[
    {
        "source": "CUF",
        "link": "https://www.example.com"
    },
    {
        "source": "ClinicalTrials.gov",
        "link": "https://www.example.com"
    },
    {
        "source": "Novartis",
        "link": "https://www.example.com"
    }
]

Other relevant information to store: the link of the search page and the keywords we use.
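
A minimal sketch of where that extra information could live on the Sources model, assuming a URLField for the search page and a JSONField for the keywords; only the fields relevant to this issue are shown, and the new field names are assumptions.

from django.db import models

class Sources(models.Model):
	name = models.TextField(blank=True, null=True)
	link = models.TextField(blank=True, null=True)
	# assumed additions for this issue:
	search_page = models.URLField(blank=True, null=True, max_length=2000)
	keywords = models.JSONField(blank=True, null=True)  # e.g. ["myelin", "ocrelizumab"]

	class Meta:
		db_table = 'sources'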

error building the container on Ubuntu 21.04

$ sudo docker-compose up

Creating volume "gregory_flows" with local driver
Creating volume "gregory_python" with local driver
Creating node-red ... error

ERROR: for node-red  Cannot create container for service node-red: failed to mount local volume: mount ./docker-python:/var/lib/docker/volumes/gregory_python/_data, flags: 0x1000: no such file or directory

ERROR: for node-red  Cannot create container for service node-red: failed to mount local volume: mount ./docker-python:/var/lib/docker/volumes/gregory_python/_data, flags: 0x1000: no such file or directory
ERROR: Encountered errors while bringing up the project.

move database from SQLite to Postgres

reasons for it:

  • better handling of timestamp data
  • equal integration with Metabase
  • faster (?) response time

The best approach would be psql -d gregory -f ./docker-data/gregory.db, but it results in syntax errors because of the HTML values in some columns.
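
A workaround for those syntax errors is to copy rows with parameterised inserts instead of replaying raw SQL, so HTML stored in text columns never touches the statement syntax. A minimal sketch, assuming a local Postgres database named gregory and copying only a handful of columns (connection settings and column lists are assumptions):

import sqlite3

import psycopg2

# Source: the SQLite file used today; target: the new Postgres database.
sqlite_conn = sqlite3.connect('./docker-data/gregory.db')
pg_conn = psycopg2.connect(dbname='gregory', user='gregory', host='localhost')

def copy_table(table, columns):
	placeholders = ', '.join(['%s'] * len(columns))
	insert_sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
	rows = sqlite_conn.execute(f"SELECT {', '.join(columns)} FROM {table}").fetchall()
	with pg_conn, pg_conn.cursor() as cur:
		cur.executemany(insert_sql, rows)  # parameters keep the HTML intact

copy_table('articles', ['article_id', 'title', 'summary', 'link'])
copy_table('trials', ['trial_id', 'title', 'summary', 'link'])

pgloader is another common route for SQLite-to-Postgres migrations and handles type mapping automatically.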

deleting an article does not delete the relationship with the category(ies)

I must have missed something when I wrote the models; a diagnostic sketch for the category join table follows the models below.

from django.db import models

class Categories(models.Model):
	category_id = models.AutoField(primary_key=True)
	category_name = models.CharField(blank=True, null=True,max_length=200)
	category_description = models.TextField(blank=True, null=True)

	def __str__(self):
		return self.category_name

	class Meta:
		managed = True
		verbose_name_plural = 'categories'
		db_table = 'categories'

class Articles(models.Model):
	article_id = models.AutoField(primary_key=True)
	title = models.TextField(blank=False, null=False, unique=True)
	summary = models.TextField(blank=True, null=True)
	link = models.URLField(blank=False, null=False, max_length=2000)
	published_date = models.DateTimeField(blank=True, null=True)
	discovery_date = models.DateTimeField()
	source = models.ForeignKey('Sources', models.DO_NOTHING, db_column='source', blank=True, null=True)
	relevant = models.BooleanField(blank=True, null=True)
	ml_prediction_gnb = models.BooleanField(blank=True, null=True)
	ml_prediction_lr = models.BooleanField(blank=True, null=True)
	noun_phrases = models.JSONField(blank=True, null=True)
	categories = models.ManyToManyField(Categories)
	entities = models.ManyToManyField('Entities')
	sent_to_admin = models.BooleanField(blank=True, null=True)
	sent_to_subscribers = models.BooleanField(blank=True, null=True)
	sent_to_twitter = models.BooleanField(blank=True, null=True)
	doi = models.CharField(max_length=280, blank=True, null=True)

	def __str__(self):
		return str(self.article_id)

	class Meta:
		managed = True
		# unique_together = (('title', 'link'),)
		verbose_name_plural = 'articles'
		db_table = 'articles'


class Entities(models.Model):
	entity = models.TextField()
	label = models.TextField()


	class Meta:
		managed = True
		verbose_name_plural = 'entities'
		db_table = 'entities'


class Sources(models.Model):
	TABLES = [('articles', 'Articles'),('trials','Trials')]


	source_id = models.AutoField(primary_key=True)
	source_for = models.CharField(choices=TABLES, max_length=50, default='articles')
	name = models.TextField(blank=True, null=True)
	link = models.TextField(blank=True, null=True)
	language = models.TextField()
	subject = models.TextField()
	method = models.TextField()
	

	def __str__(self):
		return self.name

	class Meta:
		managed = True
		verbose_name_plural = 'sources'
		db_table = 'sources'


class Trials(models.Model):
	trial_id = models.AutoField(primary_key=True)
	discovery_date = models.DateTimeField(blank=True, null=True)
	title = models.TextField(blank=False,null=False, unique=True)
	summary = models.TextField(blank=True, null=True)
	link = models.URLField(blank=False, null=False, max_length=2000)
	published_date = models.DateTimeField(blank=True, null=True)
	source = models.ForeignKey('Sources', models.DO_NOTHING, db_column='source', blank=True, null=True)
	relevant = models.BooleanField(blank=True, null=True)
	sent = models.BooleanField(blank=True, null=True)
	sent_to_twitter = models.BooleanField(blank=True, null=True)
	sent_to_subscribers = models.BooleanField(blank=True, null=True)

	def __str__(self):
		return str(self.trial_id) 

	class Meta:
		managed = True
		verbose_name_plural = 'trials'
		db_table = 'trials'
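
As referenced above, a quick way to see whether stale category links are left behind is to query the auto-created join table directly. A minimal diagnostic sketch for the Django shell, assuming the models above live in a gregory app and the through model uses Django's default field names:

from gregory.models import Articles  # app label is an assumption

# Django's auto-created join model for Articles.categories
through = Articles.categories.through

existing_ids = Articles.objects.values_list('article_id', flat=True)
orphaned = through.objects.exclude(articles_id__in=existing_ids)

print(orphaned.count(), 'category links point at articles that no longer exist')
# orphaned.delete()  # uncomment to remove the stale links

If orphaned rows do show up, the deletion probably happened outside the ORM (raw SQL or node-red), since a normal Articles.delete() also removes the join rows.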

include spacy.io in node-red container

We are using https://github.com/explosion/spaCy to detect the noun phrases in the title of articles. This information is then used to list related articles on each page.

Half of the build process is running spaCy, so it should be included in the node-red flows to save that information in the database.

We could run it as a separate script, but I don't want to split the different processing steps between the container and the host server.
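
For reference, the noun-phrase step described above boils down to spaCy's noun_chunks; a minimal sketch, where the en_core_web_sm model and the example title are assumptions:

import spacy

nlp = spacy.load('en_core_web_sm')  # assumed English model

title = 'Gait rehabilitation outcomes in relapsing-remitting multiple sclerosis'
doc = nlp(title)
noun_phrases = [chunk.text for chunk in doc.noun_chunks]
print(noun_phrases)  # what would be stored in articles.noun_phrases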

running 3_predict.py returns an error using the scikit branch

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sklearn/multiclass.py", line 100, in _predict_binary
    score = np.ravel(estimator.decision_function(X))
AttributeError: 'GaussianNB' object has no attribute 'decision_function'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "3_predict.py", line 120, in <module>
    data = predictor(dataset)
  File "3_predict.py", line 110, in predictor
    prediction = pipelines[model].predict([input])
  File "/usr/local/lib/python3.7/dist-packages/sklearn/utils/metaestimators.py", line 113, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)  # noqa
  File "/usr/local/lib/python3.7/dist-packages/sklearn/pipeline.py", line 470, in predict
    return self.steps[-1][1].predict(Xt, **predict_params)
  File "/usr/local/lib/python3.7/dist-packages/sklearn/multiclass.py", line 457, in predict
    indices.extend(np.where(_predict_binary(e, X) > thresh)[0])
  File "/usr/local/lib/python3.7/dist-packages/sklearn/multiclass.py", line 103, in _predict_binary
    score = estimator.predict_proba(X)[:, 1]
  File "/usr/local/lib/python3.7/dist-packages/sklearn/naive_bayes.py", line 125, in predict_proba
    return np.exp(self.predict_log_proba(X))
  File "/usr/local/lib/python3.7/dist-packages/sklearn/naive_bayes.py", line 104, in predict_log_proba
    jll = self._joint_log_likelihood(X)
  File "/usr/local/lib/python3.7/dist-packages/sklearn/naive_bayes.py", line 489, in _joint_log_likelihood
    n_ij = -0.5 * np.sum(np.log(2.0 * np.pi * self.var_[i, :]))
AttributeError: 'GaussianNB' object has no attribute 'var_'
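
GaussianNB only gained the var_ attribute in scikit-learn 1.0 (it was sigma_ before), so one plausible cause is that the pickled model was fitted with a different scikit-learn release than the one running 3_predict.py. A minimal check, where the joblib file name is hypothetical:

import sklearn
from joblib import load

print('scikit-learn used for prediction:', sklearn.__version__)

pipeline = load('model_gnb.joblib')  # hypothetical file name
estimator = pipeline
if hasattr(estimator, 'steps'):        # unwrap a Pipeline
	estimator = estimator.steps[-1][1]
if hasattr(estimator, 'estimators_'):  # unwrap OneVsRestClassifier, as in the traceback
	estimator = estimator.estimators_[0]

print('has var_:', hasattr(estimator, 'var_'), 'has sigma_:', hasattr(estimator, 'sigma_'))

If only sigma_ is present, refitting the model with the same scikit-learn version used for prediction should clear both AttributeErrors.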

Automatic categorisation does not take synonyms into account

This is a caveat where the system fails to include articles in the corresponding category when a different noun is used for the same thing, for example Ocrelizumab and Ocrevus, or Natalizumab and Tysabri. Each pair refers to a single medication, but in the current state Gregory can only identify them as separate entities.
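
A minimal sketch of a manual synonym map applied before the category match; the drug pairs come from this issue, while the mapping and function names are illustrative only:

# Map brand names to the generic medication name before matching categories.
SYNONYMS = {
	'ocrevus': 'ocrelizumab',
	'tysabri': 'natalizumab',
}

def normalise(term):
	term = term.lower().strip()
	return SYNONYMS.get(term, term)

# Both spellings now resolve to the same category key.
assert normalise('Ocrevus') == normalise('ocrelizumab')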
