hiergat's Introduction

HierGAT

This is the implementation of "Entity Resolution via Hierarchical Graph Attention Network".

Environment

  • Python 3.7
  • PyTorch 1.4
  • HuggingFace Transformers
  • NLTK (for the 1-N ER problem)

Install the dependencies first with pip install -r requirements.txt.

Datasets

The raw datasets can be found at

Train HierGAT

python train.py \
	--task Amazon \
	--batch_size 32 \
	--max_len 256 \
	--lr 1e-5 \
	--n_epochs 10 \
	--finetuning \
	--split \
	--lm bert

  • --task: the name of the task (see task.json)
  • --batch_size, --max_len, --lr, --n_epochs: the batch size, maximum sequence length, learning rate, and number of epochs
  • --split: whether to split the attributes; should always be turned on
  • --finetuning: whether to fine-tune the LM; should always be turned on
  • --lm: the language model. Supported values are bert, distilbert, xlnet, and roberta (bert by default)
    • To load the model from a local file, set --lm_path
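
The flags above map naturally onto an argparse interface. The following is a minimal sketch for illustration only, not the repository's actual train.py: the build_parser helper is hypothetical, and every default except --lm (documented as bert) is copied from the example command rather than confirmed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real train.py)."""
    parser = argparse.ArgumentParser(description="Sketch of the HierGAT training flags")
    parser.add_argument("--task", type=str, required=True,
                        help="task name, as listed in task.json")
    # Assumed defaults: taken from the example command, not confirmed.
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--max_len", type=int, default=256)
    parser.add_argument("--lr", type=float, default=1e-5)
    parser.add_argument("--n_epochs", type=int, default=10)
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--finetuning", action="store_true")
    parser.add_argument("--split", action="store_true")
    # "bert" is the documented default language model.
    parser.add_argument("--lm", type=str, default="bert",
                        choices=["bert", "distilbert", "xlnet", "roberta"])
    parser.add_argument("--lm_path", type=str, default=None,
                        help="optional path to a local model file")
    return parser


# Parse the same switches used in the example command.
args = build_parser().parse_args(["--task", "Amazon", "--finetuning", "--split"])
print(args.task, args.lm, args.finetuning, args.split)
```

Because --split and --finetuning are plain switches in this sketch, leaving them off the command line would silently disable them, which is why the README says to always pass both.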

Train HierGAT+

python train_n.py \
	--task N/Amazon \
	--su_len 10 \
	--finetuning \
	--split \
	--lm bert

Same as HierGAT, with one additional parameter:

  • --su_len: max entity-level context sequence length

hiergat's People

Contributors

gu18168, lassino

hiergat's Issues

Error when replicating the results

Hello,

I was running the code using provided datasets (e.g. ABT-BUY) but I ran into some errors:

File "HierGAT/model/layer.py", line 89, in forward
entitys_emb = entity_emb.repeat(1, attr_num, 1)
RuntimeError: Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor

Do you know of a fix, or could you point me in the right direction? Thank you!
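
For context on this message: torch.Tensor.repeat raises it when it is given fewer repeat counts than the tensor has dimensions. So one plausible reading, which the thread does not confirm, is that entity_emb had more than three dimensions when repeat(1, attr_num, 1) was called. The pure-Python sketch below models only that shape rule. The repeat_shape helper and the example shapes (32, 768, attr_num = 3) are hypothetical and are not taken from the repository.

```python
from typing import Sequence, Tuple


def repeat_shape(shape: Sequence[int], repeats: Sequence[int]) -> Tuple[int, ...]:
    """Model the output shape of torch.Tensor.repeat(*repeats) without PyTorch."""
    if len(repeats) < len(shape):
        # Same condition that produces the error quoted in the issue.
        raise RuntimeError(
            "Number of dimensions of repeat dims can not be smaller than "
            "number of dimensions of tensor"
        )
    # Extra repeat counts apply to implicit leading dimensions of size 1.
    padded = (1,) * (len(repeats) - len(shape)) + tuple(shape)
    return tuple(dim * rep for dim, rep in zip(padded, repeats))


attr_num = 3  # hypothetical attribute count

# A 3-D shape is compatible with three repeat counts.
ok = repeat_shape((32, 1, 768), (1, attr_num, 1))

# A 4-D shape with the same three counts hits the reported error.
try:
    repeat_shape((32, 1, 1, 768), (1, attr_num, 1))
    failed = False
except RuntimeError:
    failed = True

print(ok, failed)
```

If this reading applies, printing entity_emb.shape just before the failing line would show whether an extra dimension is present.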

No Initialization of masks in train_n.py

When running the code for HierGAT+, an error is thrown. I believe it is because there is no initialization of masks in train_n.py line 38: logits, y, _ = model(xs, zs, y, masks).

Thanks.
