Comments (8)
The problem is that AMRParser tries to read a tokenization file that doesn't exist. It seems that instead of raising an exception this results in an empty array. This happens in line 169 of AMRParser.scala:
val tokenized = fromFile(options('tokenized).asInstanceOf[String]).getLines/*.map(x => x)*/.toArray
Trying to access an element of this empty array in line 197 causes an exception which gets handled, but during handling there is another attempted access in line 307, which causes the ArrayIndexOutOfBoundsException.
As a simple workaround in case your input text is already whitespace tokenized, you can replace line 169 with this line, run ./compile again, and everything should work:
val tokenized = input
Alternatively, you could try to run the tokenize script manually and set the --tok environment variable in config.sh
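A less invasive variant of that workaround is to keep reading the tokenization file but fall back to the raw input when the file is unusable. The following is only a sketch under assumptions: `TokenizedInput` and `loadTokenized` are hypothetical names that do not exist in JAMR, and the fallback is only sensible if the input is already whitespace tokenized, as noted above. It treats a missing file and a file whose line count does not match the input (for example an empty file) the same way.

```scala
import java.io.File
import scala.io.Source

// Hypothetical helper, not part of JAMR: illustrates guarding against a
// missing or empty tokenization file instead of indexing into an empty array.
object TokenizedInput {
  def loadTokenized(path: String, input: Array[String]): Array[String] = {
    val file = new File(path)
    // Read all lines if the file exists; otherwise behave as if it were empty.
    val tokenized: Array[String] =
      if (file.isFile) {
        val source = Source.fromFile(file, "UTF-8")
        try source.getLines().toArray finally source.close()
      } else {
        Array.empty[String]
      }
    // One tokenized line is expected per input line. On a mismatch, warn and
    // fall back to the raw input (assumed to be whitespace tokenized already).
    if (tokenized.length == input.length) {
      tokenized
    } else {
      System.err.println(
        s"WARNING: expected ${input.length} tokenized lines in $path, " +
        s"found ${tokenized.length}; using the untokenized input instead")
      input
    }
  }
}
```

With a helper like this, the assignment in line 169 could become something along the lines of `val tokenized = TokenizedInput.loadTokenized(options('tokenized).asInstanceOf[String], input)`, so that a failed tokenization step produces a visible warning rather than an ArrayIndexOutOfBoundsException later on.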
from jamr.
@ritwikmishra I was experiencing a similar problem and the solution in #16 solved it for me. Just comment out line 149 of jamr/tools/cdec/corpus/support/quote-norm.pl to work around the crash, which appears to be a Perl bug similar to https://rt.perl.org/Public/Bug/Display.html?id=124109.
from jamr.
@ConstantineLignos I tried what you suggested and compiled again. Now the output is:
### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/ATS/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:48)
at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:43)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at edu.cmu.lti.nlp.amr.CorpusTool$.main(CorpusTool.scala:43)
at edu.cmu.lti.nlp.amr.CorpusTool.main(CorpusTool.scala)
By the way, I am using the CAMR parser now; it works better for my needs.
from jamr.
@calliwen I don't know how much this helps, but I am now seeing what others have reported: commenting out the line of Perl I suggested above is not always enough to fix it. We have two otherwise identical machines where one works and the other doesn't, and we haven't been able to sort out the difference.
However, in your case, I think this is the most important error:
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
If you comment out line 149 of that file, does the problem go away?
from jamr.
I followed what @bheinzerling suggested and it worked for parsing. But when I run
scripts/ALIGN.sh < output_file2 > aligned_output_file
Here output_file2 is the output of the parsing step. The same error occurs:
### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:48)
at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:43)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at edu.cmu.lti.nlp.amr.CorpusTool$.main(CorpusTool.scala:43)
at edu.cmu.lti.nlp.amr.CorpusTool.main(CorpusTool.scala)
I went to line 40 of CorpusTool.scala and commented it out, just as @bheinzerling suggested for AMRParser.scala. I added the following lines instead:
val input = stdin.getLines.toArray
val tokenized = input
I compiled it. Now the script ALIGN.sh runs without any exception and shows this:
### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
### Running aligner ###
but it produces no output. The file aligned_output_file is empty.
What can be done? (I am using the pre-trained models-2016.09.18.tgz only)
Thanks
from jamr.
@ritwikmishra I made the same change as you in CorpusTool.scala and got the following message:
### Tokenizing ###
/Users/gaoyong/jamr/tools/cdec/corpus/support/utf8-normalize.sh: Cannot find ICU uconv (http://site.icu-project.org/) ... falling back to iconv. Quality may suffer.
iconv: conversion from utf8 unsupported
iconv: try 'iconv -l' to get the list of supported encodings
### Running aligner ###
Were you able to solve your problem?
from jamr.
Thanks for the reply. On macOS I got the "ICU uconv" and "utf8 unsupported" errors, but on Linux I got the swash_fetch panic you quoted and solved it with your solution.
Thanks a lot.
from jamr.
@calliwen Glad it worked! You can probably get a working uconv from Homebrew on macOS. You may have to put the executables on your path manually; see https://apple.stackexchange.com/questions/201590/uconv-on-mac-os-x-anywhere .
from jamr.