Comments (6)
I think I've tracked that assertion down to here:
https://github.com/honnibal/thinc/blob/master/thinc/learner.pyx#L99
But I'm unclear as to why my class label is negative.
from redshift.
Hi,
Thanks for your patience and persistence! Sorry I haven't had much time to help yet.
How is the data in wsj.10.txt formatted? Are the tests passing for you?
This test shows passing a single training example to the train function: https://github.com/syllog1sm/redshift/blob/develop/tests/test_tagger.py
from redshift.
wsj.10.txt is PTB-formatted:
Why/WRB is/VBZ the/DT stock/NN market/NN suddenly/RB so/RB volatile/JJ ?/.
This seems to be the expected format for the Input.from_pos constructor.
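For reference, here is a minimal sketch of how a line in that word/TAG format can be split into pairs. It is not redshift's own reader; `parse_tagged_sentence` is a hypothetical helper written only to illustrate the format, and it assumes tokens are whitespace-separated with the tag after the last slash.

```python
def parse_tagged_sentence(line):
    """Split a 'word/TAG word/TAG ...' line into (word, tag) pairs.

    Hypothetical helper for illustration; not part of redshift.
    """
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST slash, so a word that itself
        # contains a slash (e.g. "1/2/CD") keeps its internal slash.
        word, sep, tag = token.rpartition('/')
        if not sep or not word or not tag:
            raise ValueError('malformed token: %r' % token)
        pairs.append((word, tag))
    return pairs


sentence = "Why/WRB is/VBZ the/DT stock/NN market/NN suddenly/RB so/RB volatile/JJ ?/."
for word, tag in parse_tagged_sentence(sentence):
    print(word, tag)
```

A token with no slash, or with an empty word or tag, raises ValueError, which makes a malformed line easy to spot before it reaches training.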
I ran the tests and two of them fail. As the output below shows, both failures come from the same AssertionError that I mentioned above:
➜ redshift git:(develop) ✗ py.test
========================================================= test session starts ==========================================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 24 items
tests/test_ae.py .............
tests/test_edit_ae.py .....
tests/test_lexicon.py .
tests/test_parser.py E
tests/test_tagger.py ...E
================================================================ ERRORS ================================================================
_____________________________________________________ ERROR at setup of test_parse _____________________________________________________
    @pytest.fixture
    def train_dir():
        import redshift.parser
>       redshift.parser.train(train_str, model_dir)
tests/test_parser.py:20:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
redshift/parser.pyx:111: in redshift.parser.train (redshift/parser.cpp:3039)
parser.tagger.train_sent(py_sent)
redshift/tagger.pyx:122: in redshift.tagger.Tagger.train_sent (redshift/tagger.cpp:4013)
self.guide.update(counts)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E AssertionError
thinc/learner.pyx:81: AssertionError
______________________________________________________ ERROR at setup of test_tag ______________________________________________________
    @pytest.fixture
    def train_dir():
        import redshift.tagger
        sent_strs = []
        for sent_str in train_str.strip().split('\n\n'):
            sent = []
            for tok_str in sent_str.strip().split('\n'):
                fields = tok_str.split()
                sent.append('%s/%s' % (fields[1], fields[3]))
            sent_strs.append(' '.join(sent))
        train_pos = '\n'.join(sent_strs)
>       redshift.tagger.train(train_pos, model_dir)
tests/test_tagger.py:27:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
redshift/tagger.pyx:43: in redshift.tagger.train (redshift/tagger.cpp:2391)
tagger.train_sent(sent)
redshift/tagger.pyx:122: in redshift.tagger.Tagger.train_sent (redshift/tagger.cpp:4013)
self.guide.update(counts)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E AssertionError
thinc/learner.pyx:81: AssertionError
================================================== 22 passed, 2 error in 1.71 seconds ==================================================
Are you able to reproduce this? I'm running on OS X 10.10, using Python 2.7.6.
from redshift.
Okay, I think I've fixed this.
The underlying problem is that I've broken the perceptron code out into its own module, thinc, and I'd been building redshift against my local version of that library instead of the one on pip.
Try pulling the new version and running "pip install -r requirements.txt" to get thinc 1.50. Then run "fab clean make test".
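If it is unclear which copy of a dependency is being picked up, a quick check along these lines can help. This is only a sketch: `describe_install` is a hypothetical helper, it uses `importlib.metadata` (Python 3.8+, so not the Python 2.7 setup in this thread), and it assumes the import name and the distribution name are the same, as they are for thinc.

```python
import importlib.util
from importlib import metadata


def describe_install(package):
    """Report the installed version and import location of a package.

    Hypothetical helper for illustration. Either value is None when the
    package has no installed metadata or cannot be found on the path.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec locates the module without importing it.
    spec = importlib.util.find_spec(package)
    location = spec.origin if spec is not None else None
    return {"version": version, "location": location}


print(describe_install("thinc"))
```

A location that points into a local checkout rather than site-packages suggests the build is using the development copy instead of the released one.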
from redshift.
Yay. Tests pass and I've trained a tagger. Thanks!
from redshift.
Great! Thanks for the bug reports. Let me know if you have any other problems.
from redshift.