albertkx / berkeley-crossword-solver

ACL 2022

License: MIT License

Shell 2.75% Python 97.25%

berkeley-crossword-solver's Issues

Crossword sample

Would you please release one or two formatted crosswords first if the whole dataset isn't ready yet? I tried to use your Crossword.py on a .json file from xwordinfo.com, but it failed: initialize_grids did not match the grid format in the .json file.

Alternatives for hard negatives

Have you tried using the bi-encoder itself for hard negative mining? For example, as a second stage of training the QA model after the TF-IDF negatives, or from the beginning (which would reduce source dependencies). It might converge to a better model, or it might be worse due to overfitting.

Thank you for the work and publishing the source code!
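
To make the suggestion concrete, here is a minimal sketch of what mining hard negatives with the bi-encoder's own scores could look like. It is not the authors' training code: the function name `mine_hard_negatives`, the toy embeddings, and the use of a plain dot product as the similarity score are all assumptions made for illustration.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def mine_hard_negatives(
    clue_vec: Sequence[float],
    answer_vecs: Sequence[Sequence[float]],
    gold_index: int,
    k: int = 2,
) -> List[int]:
    """Return the indices of the k highest-scoring answers that are NOT the gold answer.

    Hypothetical helper: in practice the vectors would come from the
    current bi-encoder checkpoint rather than being written by hand.
    """
    scored = [
        (dot(clue_vec, vec), idx)
        for idx, vec in enumerate(answer_vecs)
        if idx != gold_index
    ]
    # Highest similarity first: these are the "hardest" wrong answers.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-d embeddings (made up for the example).
clue = [1.0, 0.0]
answers = [[0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.7, 0.0]]
negatives = mine_hard_negatives(clue, answers, gold_index=0, k=2)
```

The returned indices would then replace or supplement the TF-IDF negatives in the next training round. Whether that helps or overfits is exactly the open question raised above.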

Please add a license to this repo

First, thank you for sharing this project with us!

Could you please add an explicit LICENSE file to the repo so that it's clear
under what terms the content is provided, and under what terms user
contributions are licensed?

Per GitHub docs on licensing:

[...] without a license, the default copyright laws apply, meaning that you
retain all rights to your source code and no one may reproduce, distribute,
or create derivative works from your work. If you're creating an open source
project, we strongly encourage you to include an open source license.

Thanks!

Dataset available?

Hello, in the paper you mention:

"we publicly release our code, models, and dataset:"

Is the dataset in the repo?

Thank you!

There is no drfill branch

The README mentions a drfill branch, but I only see a master branch on GitHub.

Running in colab

Hi,

I am trying to write a Colab notebook that makes it easy for people without a GPU to solve puzzles.
This is where I have got to so far: https://colab.research.google.com/drive/17SQJoHHT36t8fPOam-Kun35mNH4LoxSa?usp=sharing

I have cleared many hurdles, but I am now stuck on something I don't understand.
It fails with:

    234 
    235                 query_vectors.extend(out.cpu().split(1, dim=0))
--> 236         query_tensor = torch.cat(query_vectors, dim=0)
    237         assert query_tensor.size(0) == len(questions)
    238         return query_tensor

NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function.  Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Python, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, AutocastCPU, Autocast, Batched, VmapMode, Functionalize].

CPU: registered at aten/src/ATen/RegisterCPU.cpp:21063 [kernel]
CUDA: registered at aten/src/ATen/RegisterCUDA.cpp:29726 [kernel]
QuantizedCPU: registered at aten/src/ATen/RegisterQuantizedCPU.cpp:1258 [kernel]
BackendSelect: fallthrough registered at ../aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:47 [backend fallback]
Named: registered at ../aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at ../aten/src/ATen/ConjugateFallback.cpp:18 [backend fallback]
Negative: registered at ../aten/src/ATen/native/NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at ../aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:64 [backend fallback]
AutogradOther: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradCPU: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradCUDA: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradXLA: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradLazy: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradXPU: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradMLC: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradHPU: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradNestedTensor: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradPrivateUse1: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradPrivateUse2: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
AutogradPrivateUse3: registered at ../torch/csrc/autograd/generated/VariableType_3.cpp:11380 [autograd kernel]
Tracer: registered at ../torch/csrc/autograd/generated/TraceType_3.cpp:11220 [kernel]
AutocastCPU: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:461 [backend fallback]
Autocast: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:305 [backend fallback]
Batched: registered at ../aten/src/ATen/BatchingRegistrations.cpp:1059 [backend fallback]
VmapMode: fallthrough registered at ../aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
Functionalize: registered at ../aten/src/ATen/FunctionalizeFallbackKernel.cpp:52 [backend fallback]
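
The error text says that `torch.cat` received an empty list, so `query_vectors` was never filled. One plausible cause, assuming the surrounding loop iterates over batches of `questions` (the traceback does not show the loop itself), is that `questions` was empty when the encoder was called. The sketch below uses plain Python to show why an empty input produces zero batches, plus a guard that fails earlier with a clearer message. The helper names `make_batches` and `require_questions` are hypothetical and are not part of the repository.

```python
from typing import List, Sequence


def make_batches(questions: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split questions into consecutive batches of at most batch_size items."""
    return [
        list(questions[start:start + batch_size])
        for start in range(0, len(questions), batch_size)
    ]


def require_questions(questions: Sequence[str]) -> Sequence[str]:
    """Fail early with a readable message instead of an empty torch.cat later."""
    if len(questions) == 0:
        raise ValueError(
            "No questions to encode; check that the clues were parsed from the puzzle."
        )
    return questions


# With no questions, the batching loop body never runs, so nothing would
# ever be appended to a list like query_vectors.
empty_batches = make_batches([], batch_size=4)

# With some questions, batches are produced as expected.
some_batches = make_batches(["clue a", "clue b", "clue c"], batch_size=2)
```

Calling `require_questions(questions)` just before the encoding step would turn the dispatcher dump above into a one-line error pointing at the input.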

Training data issues

Hello, I ran into some file format issues while training the model. I have a batch of my own clue and answer data that I want to train on, but I don't know how to use it for training.

What format does the dataset need to be in for the following command?
bash train_scripts/biencoder/tfidf.sh path/to/dataset
What are the specific formats of answers.jsonl and docs.jsonl?

python3 train_scripts/biencoder/get_tfidf_negatives.py \
    --model path/to/dataset/tfidf/ \
    --fills path/to/dataset/answers.jsonl \
    --clues path/to/dataset/docs.jsonl \
    --out path/to/dataset/ \
    --no-len-filter

What data was used for train.json and validation.json? Are they the files posted on Hugging Face? The CSV on Hugging Face differs from the JSON required here.

CUDA_VISIBLE_DEVICES=0 bash train_scripts/biencoder/train_bert.sh \
    path/to/dataset/train.json \
    path/to/validation/validation.json \
    checkpoints/biencoder/

In summary, could you provide examples of the training files required for each step of the training process, so that we can convert our own training data to the right format?

Thank you very much indeed.
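
Until the expected schema is documented, the conversion itself can at least be prepared. The sketch below turns CSV rows into JSON Lines text using only the standard library. The column names `clue` and `answer` and the output keys are placeholders chosen for illustration; they are NOT the repository's confirmed format and would need to be renamed once the real field names are known.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Convert CSV text with 'clue' and 'answer' columns into JSON Lines text.

    The key names are hypothetical placeholders, not the schema that the
    training scripts actually expect.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = [
        json.dumps({"clue": row["clue"], "answer": row["answer"]})
        for row in reader
    ]
    return "\n".join(lines)


# Tiny made-up sample, only to exercise the function.
sample_csv = "clue,answer\nFeline pet,CAT\nOpposite of yes,NO\n"
jsonl_text = csv_to_jsonl(sample_csv)
records = [json.loads(line) for line in jsonl_text.splitlines()]
```

Each output line is one standalone JSON object, which is what the `.jsonl` extension in the commands above conventionally implies.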

The Python code in the README is missing imports?

The code in the "Running the solver" section is incomplete: it is missing imports. It would be great if you included a small self-contained Python script with a tiny demo crossword that works right after installation.
