cnn-re-tf's Introduction

Convolutional Neural Network for Relation Extraction

Note: This project is mostly based on https://github.com/yuhaozhang/sentence-convnet


Requirements

To download wikipedia articles (distant_supervision.py)

To visualize the results (visualize.ipynb)

Data

  • data directory includes preprocessed data:

    cnn-re-tf
    ├── ...
    ├── word2vec
    └── data
        ├── er              # binary-classification dataset
        │   ├── source.txt      #   source sentences
        │   └── target.txt      #   target labels
        └── mlmi            # multi-label multi-instance dataset
            ├── source.att      #   attention
            ├── source.left     #   left context
            ├── source.middle   #   middle context
            ├── source.right    #   right context
            ├── source.txt      #   source sentences
            └── target.txt      #   target labels
    

    To reproduce:

    python ./distant_supervision.py
    
  • word2vec directory is empty. Please download the Google News pretrained vector data from this Google Drive link, and unzip it to the directory. It will be a .bin file.
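The er directory pairs source.txt with target.txt. A minimal sketch of reading the two files side by side, assuming one example per line and space-separated 0/1 labels in target.txt (the "1 0" form quoted in an issue on this page). The helper name `load_pairs` and the demo sentences are made up for illustration and are not part of the repository.

```python
import os
import tempfile


def load_pairs(data_dir):
    """Read source.txt and target.txt in parallel, one example per line."""
    with open(os.path.join(data_dir, "source.txt"), encoding="utf-8") as f:
        sources = [line.split() for line in f]
    with open(os.path.join(data_dir, "target.txt"), encoding="utf-8") as f:
        targets = [[int(v) for v in line.split()] for line in f]
    if len(sources) != len(targets):
        raise ValueError("source.txt and target.txt have different line counts")
    return sources, targets


# Tiny made-up dataset so the sketch runs without the real data directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "source.txt"), "w", encoding="utf-8") as f:
    f.write("the fox chases a bunny\na cat sleeps\n")
with open(os.path.join(demo_dir, "target.txt"), "w", encoding="utf-8") as f:
    f.write("1 0\n0 1\n")

sources, targets = load_pairs(demo_dir)
print(sources[0], targets[0])
```

Checking the line counts up front catches a misaligned pair of files before training starts.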

Usage

Preprocess

python ./util.py

It creates vocab.txt, ids.txt and emb.npy files.
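For readers unfamiliar with this kind of preprocessing, here is a rough sketch of how a vocabulary and padded id sequences are commonly built. It is not taken from util.py: the special tokens `<pad>` and `<unk>`, the function names, and the demo sentences are all assumptions, and the embedding matrix (emb.npy) is not covered.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens, not from util.py


def build_vocab(sentences, max_size):
    """Keep the most frequent tokens, reserving two slots for PAD and UNK."""
    counts = Counter(tok for sent in sentences for tok in sent)
    words = [w for w, _ in counts.most_common(max_size - 2)]
    return [PAD, UNK] + words


def to_ids(sentence, vocab_index, sent_len):
    """Map tokens to ids, truncating or padding to exactly sent_len entries."""
    ids = [vocab_index.get(tok, vocab_index[UNK]) for tok in sentence[:sent_len]]
    return ids + [vocab_index[PAD]] * (sent_len - len(ids))


sentences = [["the", "fox", "chases", "a", "bunny"], ["the", "cat", "sleeps"]]
vocab = build_vocab(sentences, max_size=6)
vocab_index = {w: i for i, w in enumerate(vocab)}
ids = [to_ids(s, vocab_index, sent_len=6) for s in sentences]
print(vocab)
print(ids)
```

Every row of `ids` has the same length, which is the property a fixed `--sent_len` relies on.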

Training

  • Binary classification (ER-CNN):

    python ./train.py --sent_len=3 --vocab_size=11208 --num_classes=2 --train_size=15000 \
    --data_dir=./data/er --attention=False --multi_label=False --use_pretrain=False
  • Multi-label multi-instance learning (MLMI-CNN):

    python ./train.py --sent_len=255 --vocab_size=36112 --num_classes=23 --train_size=10000 \
    --data_dir=./data/mlmi --attention=True --multi_label=True --use_pretrain=True
  • Multi-label multi-instance Context-wise learning (MLMI-CONT):

    python ./train_context.py --sent_len=102 --vocab_size=36112 --num_classes=23 --train_size=10000 \
    --data_dir=./data/mlmi --attention=True --multi_label=True --use_pretrain=True

Caution: A wrong value for the input-data-dependent options (sent_len, vocab_size and num_classes) may cause an error. If you want to train the model on another dataset, check these values against your data first.
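A minimal sketch of how these values could be estimated from a new dataset before training. It assumes tokenized sentences and one fixed-width label row per example. The helper `dataset_stats` is hypothetical, and the real vocab_size may differ from the raw token count if preprocessing adds special tokens or caps the vocabulary.

```python
def dataset_stats(sources, targets):
    """Return the longest sentence length, distinct-token count and label width."""
    max_len = max(len(sent) for sent in sources)
    distinct_tokens = len({tok for sent in sources for tok in sent})
    label_widths = {len(row) for row in targets}
    if len(label_widths) != 1:
        raise ValueError("target rows have inconsistent widths")
    return max_len, distinct_tokens, label_widths.pop()


# Made-up examples standing in for source.txt / target.txt contents.
sources = [["the", "fox", "chases", "a", "bunny"], ["the", "cat", "sleeps"]]
targets = [[1, 0], [0, 1]]

max_len, distinct_tokens, num_classes = dataset_stats(sources, targets)
print(max_len, distinct_tokens, num_classes)
```

Here `max_len` is a lower bound for sent_len, `distinct_tokens` a starting point for vocab_size, and `num_classes` the label width.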

Evaluation

python ./eval.py --train_dir=./train/1473898241

Replace the --train_dir value with the output directory created by training.

Run TensorBoard

tensorboard --logdir=./train/1473898241

Architecture

CNN Architecture

Results

Model      P       R       F       AUC     init_lr  l2_reg
ER-CNN     0.9410  0.8630  0.9003  0.9303  0.005    0.05
MLMI-CNN   0.8205  0.6406  0.7195  0.7424  1e-3     1e-4
MLMI-CONT  0.8819  0.7158  0.7902  0.8156  1e-3     1e-4
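The F column agrees with the harmonic mean of P and R (the usual F1 definition) to four decimal places. A short check, using only the numbers reported in the table:

```python
# (precision, recall, reported F) copied from the results table.
results = {
    "ER-CNN": (0.9410, 0.8630, 0.9003),
    "MLMI-CNN": (0.8205, 0.6406, 0.7195),
    "MLMI-CONT": (0.8819, 0.7158, 0.7902),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for name, (p, r, reported) in results.items():
    print(name, round(f1(p, r), 4), reported)
```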

[Figures: F1, AUC, Loss, PR_Curve, ER-CNN Embeddings, MLMI-CNN Embeddings, MLMI-CONT Left Embeddings, MLMI-CONT Right Embeddings]

*As you see above, these models somewhat suffer from overfitting ...

References

cnn-re-tf's People

Contributors

may-

cnn-re-tf's Issues

Dataset format and input format for new predictions

Hi, can you please explain how I can build my own dataset for training MLMI-CNN? I'm confused by the source.att and target files and by some tokens in the other files, such as , , etc.
Also, is it possible to check the relation prediction for a single sentence after training is completed?

How to prepare the source.att file

Hi there, I am back again. This time I am trying to train a context CNN model using the sentence and some other attributes. For example, a source data entry would include the sentence itself and demographic attributes of the writer (age, income, education level, etc.): "The fox chase a bunny , 23 , 24000 , high school"

I am not quite sure whether I could just add these other attributes as the left and right files. Correct me if I am wrong:

source.left:
the fox chase a bunny
source.middle:
23,24000
source.right:
high school
target.txt
1 0

But how about source.att? How do I decide the values between 0.0 and 1.0?

TypeError: object of type 'NoneType' has no len() with #3 settings

(.venv) ub16hp@UB16HP:/ub16_prj/cnn-re-tf$ python distant_supervision.py
===== step 1 =====
[1/4] Downloading wiki articles ...
===== step 2 =====
[1/4] Downloading wiki articles ...
===== step 3 =====
[1/4] Downloading wiki articles ...
===== step 4 =====
[1/4] Downloading wiki articles ...
===== step 5 =====
[1/4] Downloading wiki articles ...
===== step 6 =====
[1/4] Downloading wiki articles ...
===== step 7 =====
[1/4] Downloading wiki articles ...
Traceback (most recent call last):
  File "distant_supervision.py", line 693, in <module>
    main()
  File "distant_supervision.py", line 681, in main
    positive_examples()
  File "distant_supervision.py", line 452, in positive_examples
    ret = loop(step, doc_id, limit, entities, relations, counter)
  File "distant_supervision.py", line 343, in loop
    docs = download_wiki_articles(doc_id, limit)
  File "distant_supervision.py", line 73, in download_wiki_articles
    pages = bs(r, "html.parser").findAll('page')
  File "/home/ub16hp/ub16_prj/cnn-re-tf/.venv/local/lib/python2.7/site-packages/bs4/__init__.py", line 246, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()
(.venv) ub16hp@UB16HP:/ub16_prj/cnn-re-tf$
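The traceback shows that the value `r` passed to BeautifulSoup in download_wiki_articles was None. Why the download returned nothing is not stated in this report. A minimal sketch of a guard that treats a missing response as "no pages" instead of crashing. The names `pages_or_empty` and `split_pages` are hypothetical, and the stand-in parser only mimics the shape of the real call.

```python
def pages_or_empty(markup, parse_pages):
    """Parse downloaded markup, treating a failed download (None) as no pages."""
    if markup is None:
        return []
    return parse_pages(markup)


# Stand-in parser for the demo; the script itself calls BeautifulSoup here.
def split_pages(markup):
    return markup.split("<page>")[1:]


print(pages_or_empty(None, split_pages))
print(len(pages_or_empty("<page>a</page><page>b</page>", split_pages)))
```

A guard like this only hides the symptom; the reason the download produced None would still need to be investigated.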

distant supervision script exits with error

Hello, Thank you for the code.

I have been trying to recreate the dataset using the instructions here (https://github.com/may-/cnn-re-tf/issues/3#issuecomment-309293662). It works great until the very end, then gives the following error:


Traceback (most recent call last):
  File "./distant_supervision.py", line 693, in <module>
    main()
  File "./distant_supervision.py", line 687, in main
    extract_negative()
  File "./distant_supervision.py", line 666, in extract_negative
    subj = '<' + entities[row['subj'].encode('utf-8')][0] + '>'
KeyError: 'Bell\xc3\xaame'


Any thoughts? Thanks in advance.
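The KeyError means the subject 'Bell\xc3\xaame' (the UTF-8 bytes of "Bellême") is not a key in `entities`. One possible workaround, sketched here in Python 3 with made-up data, is to skip rows whose subject is unknown. The helper `format_subject`, the id "E1", and the sample rows are assumptions for illustration, not the script's own code.

```python
def format_subject(entities, row):
    """Return '<id>' for the row's subject, or None if the entity is unknown."""
    entry = entities.get(row["subj"])
    if entry is None:
        return None
    return "<" + entry[0] + ">"


# Hypothetical entity table and rows; "Bellême" is deliberately missing.
entities = {"Paris": ["E1"]}
rows = [{"subj": "Paris"}, {"subj": "Bellême"}]

subjects = [format_subject(entities, row) for row in rows]
kept = [s for s in subjects if s is not None]
print(kept)
```

Skipping drops those examples from the negative set, so it is worth counting how many rows are affected before relying on it.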

Did you optimize F1 specifically

Hi,

I am working on a project similar to yours.
I took a look at your text_cnn; it looks like the loss function you use is cross-entropy.
I wonder what precision and recall look like when your loss starts to converge.
In my case the loss becomes very small but precision and recall are still high, and I'm not sure what needs to be done.
Did you optimize F1 specifically?
Thanks!

[Help] How do I specify the positive class? How to output the prediction results?

Dear all,

I need help understanding this code.

I would like to use this code to make predictions. My data contains two class labels, namely 'Cat' and 'Bunny'. If I would like to pick "Bunny" as the positive class, should I edit \er\target.txt and set all instances whose class is "Bunny" to "1 0" and the others (cat) to "0 1"?

Moreover, how can I get the actual predictions?

Thanks in advance for your time.
