
inspire-magpie's Introduction

Inspirehep

Prerequisites

Python

Python 3.8

You can also use pyenv to manage your Python installations. Follow its instructions and set the global version to 3.8.
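
To confirm that the interpreter on your PATH matches the version named above, a minimal sketch like the following may help. The helper name `is_supported_python` is hypothetical, and accepting any 3.8.x patch release is an assumption.

```python
import sys

REQUIRED = (3, 8)  # major and minor version named in this README


def is_supported_python(version_info=None):
    """Return True when the major.minor version equals REQUIRED."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_supported_python() else "unexpected version"
    print(f"Python {current}: {status}")
```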

Debian / Ubuntu

$ sudo apt-get install python3 build-essential python3-dev

MacOS

$ brew install postgresql@14 libmagic openssl@3 openblas python

nodejs & npm using nvm

Please follow the instructions https://github.com/nvm-sh/nvm#installing-and-updating

We're using v16.16.0 (first version we install is the default)

$ nvm install 16.16.0
$ nvm use 16.16.0

yarn

Debian / Ubuntu

Please follow the instructions https://classic.yarnpkg.com/en/docs/install/#debian-stable

MacOS

$ brew install yarn

poetry

Install poetry: https://poetry.eustace.io/docs/

$ curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python -

pre-commit

Install pre-commit: https://pre-commit.com/

$ curl https://pre-commit.com/install-local.py | python -

And run

$ pre-commit install

Docker & Docker Compose

The topology of docker-compose

Follow the guide https://docs.docker.com/compose/install/

For MacOS users

General

Turn off the AirPlay Receiver under System Preferences -> Sharing -> AirPlay Receiver. Otherwise, you will run into problems with port 5000 already being in use.
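
To check whether something is still holding port 5000 before starting the services, a small sketch like the one below can be used. The function name `port_in_use` is hypothetical. It treats any failure to bind as "in use", which is an assumption: a permissions error would be reported the same way.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding host:port fails, i.e. something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
    return False


if __name__ == "__main__":
    if port_in_use(5000):
        print("Port 5000 is taken; check the AirPlay Receiver setting.")
    else:
        print("Port 5000 is free.")
```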

M1 users

Install Homebrew-file https://homebrew-file.readthedocs.io/en/latest/installation.html

$ brew install rcmdnk/file/brew-file

And run

$ brew file install

Run with docker

Make

This will prepare the whole inspire development environment with demo records:

make run
make setup

You can stop it by running

make stop

Alternatively you can follow the steps:

Step 1: In a terminal run

docker-compose up

Step 2: In another terminal run

docker-compose exec hep-web ./scripts/setup
docker-compose exec next-web inspirehep db create

Step 3: Import records

docker-compose exec hep-web inspirehep importer demo-records

Usage

inspirehep should now be available under http://localhost:8080
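
The containers can take a while to come up, so scripts that depend on the UI may want to wait for it. The following is a sketch only: `wait_for_http` is a hypothetical helper, the timeout values are arbitrary, and treating any HTTP response (even an error status) as "available" is an assumption.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=60.0, interval=2.0):
    """Poll url until it returns any HTTP response or the timeout expires."""
    # Bypass any configured proxy: the target is a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example: wait_for_http("http://localhost:8080", timeout=120)
```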


Run locally

Backend

$ cd backend
$ poetry install

UI

$ cd ui
$ yarn install

Editor

$ cd record-editor
$ yarn install

Setup

First you need to start all the services (PostgreSQL, Redis, Elasticsearch, RabbitMQ)

$ docker-compose -f docker-compose.services.yml up es mq db cache

Then initialize the database, Elasticsearch, RabbitMQ, Redis and S3

$ cd backend
$ ./scripts/setup

Note that the S3 configuration requires the default region to be set to us-east-1. If your AWS config (~/.aws/config) sets a different default region, you need to update it!
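
To see which default region your AWS config currently sets, a sketch along these lines may help. The helper names are hypothetical. It assumes the region is stored under the `[default]` profile of ~/.aws/config in INI format, and it ignores environment variables such as AWS_DEFAULT_REGION.

```python
import configparser

REQUIRED_REGION = "us-east-1"  # region named in the note above


def default_region(config_text):
    """Return the region of the [default] profile, or None if unset."""
    # Interpolation is disabled so that '%' in other values cannot raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("default"):
        return None
    return parser.get("default", "region", fallback=None)


def has_required_region(config_text):
    """True only when the default profile's region is REQUIRED_REGION."""
    return default_region(config_text) == REQUIRED_REGION
```

For example, `has_required_region(pathlib.Path("~/.aws/config").expanduser().read_text())` checks the real file, provided it exists.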

Also, to enable fulltext indexing & highlighting the following feature flags must be set to true:

FEATURE_FLAG_ENABLE_FULLTEXT = True
FEATURE_FLAG_ENABLE_FILES = True

Run

Backend

You can visit Backend http://localhost:8000

$ cd backend
$ ./scripts/server

UI

You can visit UI http://localhost:3000

$ cd ui
$ yarn start

Editor

$ cd record-editor
$ yarn start

You can also connect UI to another environment by changing the proxy in ui/setupProxy.js

proxy({
  target: 'http://A_PROXY_SERVER',
  ...
});

How to test

Backend

By default, local backend test runs use testmon so that, after the first run, only tests that depend on changed code are executed:

$ cd backend
$ poetry run ./run-tests.sh

If you pass the --all flag to the run-tests.sh script, all tests will be run (this is equivalent to the --testmon-noselect flag). All other flags passed to the script are forwarded to py.test, so you can do things like:

$ poetry run ./run-tests.sh --pdb -k test_failing

You'll need to run all tests or force test selection (e.g. with -k) in a few cases:

  • an external dependency has changed, and you want to make sure that it doesn't break the tests (as testmon doesn't track external deps)
  • you manually change a test fixture in a non-python file (as testmon only tracks python imports, not external data)
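
The two cases above follow from how change-based test selection works in general. The toy sketch below illustrates the idea only; it is not testmon's implementation, and every name in it is hypothetical. Tests are selected when a file they are known to depend on has a different fingerprint than on the previous run, so anything not recorded as a dependency (an external package, a JSON fixture) can change without triggering a rerun.

```python
import hashlib


def fingerprint(source):
    """Hash file contents so a change can be detected cheaply."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def select_tests(dependencies, old_hashes, new_sources):
    """Return the sorted tests whose tracked files changed.

    dependencies: test name -> set of file names it imported last run
    old_hashes:   file name -> fingerprint recorded on the last run
    new_sources:  file name -> current file contents
    """
    changed = {
        name
        for name, source in new_sources.items()
        if old_hashes.get(name) != fingerprint(source)
    }
    return sorted(
        test for test, files in dependencies.items() if files & changed
    )
```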

If you want to invoke py.test directly but still want to use testmon, you'll need to use the --testmon --no-cov flags:

$ poetry run py.test tests/integration/records --testmon --no-cov

If you want to disable testmon test selection but still perform collection (to update test dependencies), use --testmon-noselect --no-cov instead.

Note that testmon is only used locally to speed up tests and not in the CI to be completely sure all tests pass before merging a commit.

UI

$ cd ui
$ yarn test # runs everything (lint, bundlesize etc.) identical to CI
$ yarn test:unit # will open jest on watch mode

Note that jest automatically runs the tests affected by changed (unstaged) files.

cypress (e2e)

$ sh cypress-tests.sh # runs everything from scratch, identical to CI

$ cd e2e
$ yarn test:dev # opens the cypress runner GUI and runs tests against the local dev server (localhost:3000)
$ yarn test:dev --env inspirehep_url=<any url that serves inspirehep ui>

visual tests

Visual tests run only in headless mode, so yarn test:dev, which uses the headed browser, will ignore them. Running existing visual tests and updating or creating snapshots requires the cypress-tests.sh script.

For repeated runs (when the local DB is running and already has the required records), the script can be reduced to its last part: sh cypress-tests-run.sh.

If required, tests can be run against localhost:3000 by modifying the --host option in cypress-tests-run.sh.

working with (visual) tests more efficiently

(TODO: improve DX)

You may not always need to run tests exactly like on the CI environment.

  • To run a specific suite, temporarily change the test script in e2e/package.json to cypress run --spec cypress/integration/<spec.test.js>
  • To mount the backend code and enable live updates, use e2e/docker-compose.cypress.dev.yml instead.

How to import records

First make sure that you are running:

$ cd backend
$ ./scripts/server

The inspirehep importer records command accepts URLs (-u), a directory of JSON files (-d), and individual JSON files (-f). A selection of demo records can be found in the data directory, organized by record type (e.g. literature). Examples:

With url

# Local
$ poetry run inspirehep importer records -u https://inspirehep.net/api/literature/20 -u https://inspirehep.net/api/literature/1726642
# Docker
$ docker-compose exec hep-web inspirehep importer records -u https://inspirehep.net/api/literature/20 -u https://inspirehep.net/api/literature/1726642

# `--save` will save the imported record also to the data folder
$ <...> inspirehep importer records -u https://inspirehep.net/api/literature/20 --save

A valid --token or backend/inspirehep/config.py:AUTHENTICATION_TOKEN is required.

With directory

# Local
$ poetry run inspirehep importer records -d data/records/literature
# Docker
$ docker-compose exec hep-web inspirehep importer records -d data/records/literature

With files

# Local
$ poetry run inspirehep importer records -f data/records/literature/374836.json -f data/records/authors/999108.json
# Docker
$ docker-compose exec hep-web inspirehep importer records -f data/records/literature/374836.json -f data/records/authors/999108.json

All records

# Local
$ poetry run inspirehep importer demo-records
# Docker
$ docker-compose exec hep-web inspirehep importer demo-records
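
When importing records from a script, the command variants above can be assembled programmatically. This is a sketch under stated assumptions: `importer_command` is a hypothetical helper, it uses only the flags shown in this section (-u, -d, -f, --save), and it assumes --save may be combined with any of them.

```python
def importer_command(urls=(), directory=None, files=(), save=False, docker=False):
    """Build the argv list for `inspirehep importer records`."""
    prefix = (
        ["docker-compose", "exec", "hep-web"] if docker else ["poetry", "run"]
    )
    command = prefix + ["inspirehep", "importer", "records"]
    for url in urls:
        command += ["-u", url]  # -u may be repeated, as in the examples
    if directory is not None:
        command += ["-d", directory]
    for path in files:
        command += ["-f", path]  # -f may be repeated, as in the examples
    if save:
        command.append("--save")
    return command
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the backend directory.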

inspire-magpie's People

Contributors

eamonnmag, jalavik, jstypka

inspire-magpie's Issues

I got an error message when I operated UI

I got an error message when I operated UI.
Do you have any solutions?

Python 2.7.15
tensorflow 1.13.1
gensim-0.13.4.1 keras-1.2.2 magpie-1.0 scipy-0.19.1

Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python2.7/site-packages/inspire_magpie/ui.py", line 39, in extractor
labels = predict_labels(corpus, text)[0:20]
File "/usr/local/lib/python2.7/site-packages/inspire_magpie/api.py", line 100, in predict_labels
return model.predict_from_text(text)
File "/usr/local/lib/python2.7/site-packages/magpie/main.py", line 173, in predict_from_text
return self._predict(doc)
File "/usr/local/lib/python2.7/site-packages/magpie/main.py", line 200, in _predict
y_predicted = self.keras_model.predict(x)
File "/usr/local/lib/python2.7/site-packages/keras/models.py", line 724, in predict
return self.model.predict(x, batch_size=batch_size, verbose=verbose)
File "/usr/local/lib/python2.7/site-packages/keras/engine/training.py", line 1269, in predict
self._make_predict_function()
File "/usr/local/lib/python2.7/site-packages/keras/engine/training.py", line 798, in _make_predict_function
**kwargs)
File "/usr/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 1961, in function
return Function(inputs, outputs, updates=updates)
File "/usr/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 1919, in __init__
with tf.control_dependencies(self.outputs):
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 5036, in control_dependencies
return get_default_graph().control_dependencies(control_inputs)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 4536, in control_dependencies
c = self.as_graph_element(c)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3486, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3565, in _as_graph_element_locked
raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("Sigmoid:0", shape=(?, 10000), dtype=float32) is not an element of this graph.

Supported magpie version?

Thank you for providing useful software.
Does inspire-magpie support magpie v1.0 or magpie v2.0?

Getting ImportError: No module named evaluation.rank_metrics in Magpie2.0 Ubuntu 16 64 bit

I am trying to run inspire-magpie with magpie-2.0, but it fails because the current inspire-magpie code refers to magpie-1.0.

Changes I have made:

In api.py, I changed the imports to:

from magpie import Magpie  # removed Magpiemodel
from magpie.config import EPOCHS, BATCH_SIZE  # removed NB_EPOCHS

When running it, I get an error caused by the line below:

from magpie.evaluation.rank_metrics import mean_reciprocal_rank, r_precision, \
    mean_average_precision, ndcg_at_k, precision_at_k

The error message I get:

$ python wsgi.py 
/home/user/TrailGitClones/magpie_git/magpie_py2_webins/local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Traceback (most recent call last):
  File "wsgi.py", line 19, in <module>
    from inspire_magpie import application
  File "build/bdist.linux-x86_64/egg/inspire_magpie/__init__.py", line 20, in <module>
    from .api import get_cached_model
  File "build/bdist.linux-x86_64/egg/inspire_magpie/api.py", line 28, in <module>
    from magpie.evaluation.rank_metrics import mean_reciprocal_rank, r_precision, \
ImportError: No module named evaluation.rank_metrics

Please provide evaluation module for this.
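
One possible workaround, sketched here and not confirmed by the maintainers, is to guard the import and fall back to local implementations when the module is missing. Only two of the five functions are shown. Their signatures (a list of 0/1 relevance lists for `mean_reciprocal_rank`, a relevance list plus `k` for `precision_at_k`) and the standard information-retrieval definitions used below are assumptions about what magpie 1.0 provided.

```python
try:
    # Importable with magpie 1.0 according to the report above.
    from magpie.evaluation.rank_metrics import (
        mean_reciprocal_rank,
        precision_at_k,
    )
except ImportError:

    def mean_reciprocal_rank(rankings):
        """Average of 1/rank of the first relevant item per ranking."""
        total = 0.0
        for relevance in rankings:
            for position, hit in enumerate(relevance, start=1):
                if hit:
                    total += 1.0 / position
                    break
        return total / len(rankings) if rankings else 0.0

    def precision_at_k(relevance, k):
        """Fraction of the first k items that are relevant."""
        if k < 1:
            raise ValueError("k must be >= 1")
        top = relevance[:k]
        return sum(1 for hit in top if hit) / float(k)
```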
