Comments (8)

bheinzerling commented on July 20, 2024 3

The problem is that AMRParser tries to read a tokenization file that doesn't exist. It seems that instead of raising an exception this results in an empty array. This happens in line 169 of AMRParser.scala:

val tokenized = fromFile(options('tokenized).asInstanceOf[String]).getLines/*.map(x => x)*/.toArray

Trying to access an element of this empty array in line 197 causes an exception which gets handled, but during handling there is another attempted access in line 307, which causes the ArrayIndexOutOfBoundsException.
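The failure mode described above can be reproduced in isolation. This is a minimal sketch rather than JAMR code: `EmptyTokenizationDemo`, `readLines`, and the temporary file are hypothetical names introduced here. It assumes the tokenization file exists but is empty, for example because the tokenizer crashed before writing anything (a missing file would normally make `Source.fromFile` throw `java.io.FileNotFoundException` instead).

```scala
import java.nio.file.Files
import scala.io.Source
import scala.util.Try

object EmptyTokenizationDemo {
  // Hypothetical helper: read all lines of a file into an array,
  // closing the underlying source afterwards.
  def readLines(path: String): Array[String] = {
    val source = Source.fromFile(path)
    try source.getLines().toArray
    finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for a tokenization file that was created but never written to.
    val emptyFile = Files.createTempFile("tokenized", ".txt")
    val tokenized = readLines(emptyFile.toString)
    println(s"lines read: ${tokenized.length}")

    // Reading succeeded silently; it is the later index access that throws.
    val firstLine = Try(tokenized(0))
    println(s"access to element 0 failed: ${firstLine.isFailure}")

    Files.delete(emptyFile)
  }
}
```

Checking `tokenized.isEmpty` right after the read would surface the problem at its source instead of at a later index access.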

As a simple workaround in case your input text is already whitespace tokenized, you can replace line 169 with this line, run ./compile again, and everything should work:

val tokenized = input

Alternatively, you could try to run the tokenize script manually and set the --tok environment variable in config.sh
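If hard-coding `val tokenized = input` feels too blunt, the same workaround can be made conditional. The sketch below is one possible shape, assuming the input is already whitespace tokenized; `TokenizationFallback` and `tokenizedOrInput` are hypothetical names, not part of JAMR.

```scala
object TokenizationFallback {
  // Hypothetical helper, not part of JAMR: prefer the tokenizer output, but
  // fall back to the raw input when the tokenizer produced a different number
  // of lines (for example none at all, because it crashed).
  def tokenizedOrInput(input: Array[String], tokenized: Array[String]): Array[String] =
    if (tokenized.length == input.length) tokenized
    else {
      Console.err.println(
        s"warning: expected ${input.length} tokenized lines, got ${tokenized.length}; using raw input")
      input
    }

  def main(args: Array[String]): Unit = {
    val input = Array("The boy wants to go .", "She left .")
    // Simulate a tokenizer that produced no output.
    val result = tokenizedOrInput(input, Array.empty[String])
    result.foreach(println)
  }
}
```

The warning on stderr keeps the fallback visible, so a broken tokenizer is not silently masked.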

from jamr.

ConstantineLignos commented on July 20, 2024 1

@ritwikmishra I was experiencing a similar problem and the solution in #16 solved it for me. Just comment out jamr/tools/cdec/corpus/support/quote-norm.pl line 149 to work around the crash, which appears to be a Perl bug, similar to https://rt.perl.org/Public/Bug/Display.html?id=124109.

ritwikmishra commented on July 20, 2024 1

@ConstantineLignos I tried what you suggested and compiled again. Now the output is:

### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/ATS/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
   at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:48)
   at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:43)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
   at edu.cmu.lti.nlp.amr.CorpusTool$.main(CorpusTool.scala:43)
   at edu.cmu.lti.nlp.amr.CorpusTool.main(CorpusTool.scala)

By the way, I am using the CAMR parser now; it works better for my needs.

ConstantineLignos commented on July 20, 2024 1

@calliwen I don't know how much this helps, but I am now seeing what others are seeing: commenting out the line of Perl I suggested above is not enough to fix it. We have two otherwise identical machines where one works and the other doesn't, and we haven't been able to work out the difference.

However, in your case, I think this is the most important error:

panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.

If you comment out line 149 of that file, does the problem go away?

ritwikmishra commented on July 20, 2024

I followed what @bheinzerling suggested and it worked for parsing. But when I run
scripts/ALIGN.sh < output_file2 > aligned_output_file
(where output_file2 is the output of the parsing step), I hit the same error:

 ### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
	at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:48)
	at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:43)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
	at edu.cmu.lti.nlp.amr.CorpusTool$.main(CorpusTool.scala:43)
	at edu.cmu.lti.nlp.amr.CorpusTool.main(CorpusTool.scala)

I went to line 40 of CorpusTool.scala and commented it out, just as @bheinzerling suggested for AMRParser.scala. I added the following lines instead:

val input = stdin.getLines.toArray
val tokenized = input

I compiled it.
Now the ALIGN.sh script runs without any exception and shows this:

### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
 ### Running aligner ###

But it produces no output: the file aligned_output_file is empty.

What can be done? (I am using only the pre-trained models-2016.09.18.tgz.)
Thanks

calliwen commented on July 20, 2024

@ritwikmishra I made the same change to CorpusTool.scala as you did, and I got the following message:

### Tokenizing ###
/Users/gaoyong/jamr/tools/cdec/corpus/support/utf8-normalize.sh: Cannot find ICU uconv (http://site.icu-project.org/) ... falling back to iconv. Quality may suffer.
iconv: conversion from utf8 unsupported
iconv: try 'iconv -l' to get the list of supported encodings
 ### Running aligner ###

Were you able to solve your problem?

calliwen commented on July 20, 2024

@ConstantineLignos Thanks for the reply. On macOS I get the "ICU uconv" and "utf8 unsupported" errors. On Linux I got the swash_fetch panic you pointed out, and commenting out line 149 of quote-norm.pl as you suggested solved it.
Thanks a lot.

ConstantineLignos commented on July 20, 2024

@calliwen Glad it worked! You can probably get a working uconv from Homebrew on macOS. You may have to put the executables on your PATH manually; see https://apple.stackexchange.com/questions/201590/uconv-on-mac-os-x-anywhere.
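To check whether uconv is actually visible on the PATH before rerunning the scripts, something like the following sketch could be used. `FindUconv` and `findOnPath` are hypothetical names introduced here; the lookup only scans the directories in the PATH environment variable and does not know about any Homebrew-specific locations.

```scala
import java.io.File

object FindUconv {
  // Hypothetical helper: return the first executable file called `name`
  // found in the directories listed in `path` (defaults to the PATH variable).
  def findOnPath(name: String, path: String = sys.env.getOrElse("PATH", "")): Option[File] =
    path.split(File.pathSeparator).iterator
      .filter(_.nonEmpty)
      .map(dir => new File(dir, name))
      .find(f => f.isFile && f.canExecute)

  def main(args: Array[String]): Unit =
    findOnPath("uconv") match {
      case Some(f) => println(s"uconv found at ${f.getAbsolutePath}")
      case None    => println("uconv not found on PATH; utf8-normalize.sh will fall back to iconv")
    }
}
```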
