simptextalign's People

Contributors

neosyon, ponzius

simptextalign's Issues

Paragraph headings not recognised as sentences

Hi,
When running the tool with default parameters, I get alignments such as these:

11: ## Following In Some Generous Footsteps In 2010 , Bill Gates and Warren Buffett publicly launched the Giving Pledge to encourage billionaires to donate the bulk of their wealth to charity . ---(0.8202991224743326)---> 9: In 2010 , Bill Gates and Warren Buffett publicly launched the Giving Pledge to encourage billionaires to donate the bulk of their wealth to charity .

25: ## Using Technology To Change Learning In the open letter , Zuckerberg and Chan talked about the potential that technology offers to re-engineer the way children learn . ---(0.8480297902555115)---> 20: In the open letter , Zuckerberg and Chan talked about the potential that technology offers to re-engineer the way children learn .

As can be seen, the section headings "## Following In Some Generous Footsteps" and "## Using Technology To Change Learning" were not identified as separate sentences, even though each is on its own line in the Newsela article. Instead, they were merged into the sentence that follows them.

Is there a way to prevent this from happening? Perhaps by changing some property of the sentence splitter the tool uses internally? I haven't checked whether this happens with every section heading in every article, but it does happen in all the ones I've manually checked (around 10 original articles with their 4 versions).

Thank you for your help,
Fernando
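
One possible workaround, independent of the tool's internal sentence splitter, is to pre-process each article and drop the heading lines before alignment. The sketch below is only an illustration under assumptions: it assumes headings are lines starting with "##" (as in the examples above), and the class and method names (`HeadingFilter`, `dropHeadings`) are hypothetical and not part of simptextalign.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-processing helper; not part of simptextalign.
class HeadingFilter {

    /**
     * Returns the lines of an article without Markdown-style headings.
     * Assumption: a heading is any line whose trimmed text starts with "##".
     */
    static List<String> dropHeadings(List<String> lines) {
        List<String> body = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().startsWith("##")) {
                continue; // skip the heading so it cannot be glued to the next sentence
            }
            body.add(line);
        }
        return body;
    }

    public static void main(String[] args) {
        List<String> article = Arrays.asList(
                "## Following In Some Generous Footsteps",
                "In 2010 , Bill Gates and Warren Buffett publicly launched the Giving Pledge .",
                "## Using Technology To Change Learning",
                "In the open letter , Zuckerberg and Chan talked about technology .");
        // Print only the body lines that remain after filtering.
        for (String line : dropHeadings(article)) {
            System.out.println(line);
        }
    }
}
```

The filtered lines would then be written back to a file and passed to the tool in place of the original article. If the headings should be kept as sentences of their own rather than dropped, the same loop could collect them into a separate list instead of skipping them.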

Where to change the language?

If we want to apply this tool to another language (for example, I want to run it on Urdu),
where do we change or specify the language?

No outputs

Hi, I tried to run AlignNewselaDataset.jar, but it does not give any output. I then passed a non-existent path to -o; the tool does create it, but nothing is written to that folder.

Run the tool on Newsela

I am trying to run the tool on the Newsela dataset, but I think I have to change the baseDir in the code and rebuild the jar file. Is the baseDir the folder that contains the articles? Could you give me an example, please?

License

Hi,

I was just wondering, could you maybe add an open source license to the repo (e.g. MIT)? I would like to use the code for German but in order to do this, I need to be able to change some of the code, which I cannot do right now, as the standard GitHub license applies.

Thanks a lot!

Hi,

Hi,

Thanks for your useful tool. However, I get the same error as in the previous issue, which is now closed.

Calculating IDF...
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: basedir ../../data/Newsela/articles is not a directory
at org.apache.tools.ant.DirectoryScanner.scan(DirectoryScanner.java:797)
at simplifiedTextAlignment.Representations.NgramModel.buildNewselaNgramModel(NgramModel.java:42)
at simplifiedTextAlignment.DatasetAlignment.AlignNewselaDataset.main(AlignNewselaDataset.java:107)
... 5 more

My java version is
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

Can you please guide me on how to resolve this?

Thank you in advance.

Originally posted by @umauh in #2 (comment)
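
The exception above reports that `../../data/Newsela/articles` is not a directory. Because that path is relative, Java resolves it against the directory the jar is launched from, so the result depends on where the command is run. The following sketch shows one way to check what the path resolves to. The class name `BaseDirCheck` and its `resolve` helper are hypothetical and not part of simptextalign; only the relative path is taken from the stack trace.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of simptextalign.
class BaseDirCheck {

    /** Resolves a possibly relative base directory against a working directory. */
    static Path resolve(String workingDir, String baseDir) {
        return Paths.get(workingDir).resolve(baseDir).normalize();
    }

    public static void main(String[] args) {
        // Default to the path reported in the exception message.
        String baseDir = args.length > 0 ? args[0] : "../../data/Newsela/articles";
        // "user.dir" is the directory from which the JVM was started.
        Path resolved = resolve(System.getProperty("user.dir"), baseDir);
        System.out.println("Resolved path:  " + resolved);
        System.out.println("Is a directory: " + Files.isDirectory(resolved));
    }
}
```

Running this from the same directory as the jar prints the absolute location the relative path points to and whether a directory exists there. If it does not, either the articles need to be placed at that location or the jar needs to be started from a directory where the relative path is valid.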

java.lang.NullPointerException

Hi, thanks for your useful tool!

However, when I run the tool, I hit a bug with the following error log:

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NullPointerException
at simplifiedTextAlignment.DatasetAlignment.AlignNewselaDataset.main(AlignNewselaDataset.java:94)
... 5 more

My Java version is 8. Would you please take a look at this issue?

Thanks in advance!
