neosyon / simptextalign
Repo for the simplified text alignment tools.
License: MIT License
Hi,
When running the tool with default parameters, I get alignments such as these:
11: ## Following In Some Generous Footsteps In 2010 , Bill Gates and Warren Buffett publicly launched the Giving Pledge to encourage billionaires to donate the bulk of their wealth to charity . ---(0.8202991224743326)---> 9: In 2010 , Bill Gates and Warren Buffett publicly launched the Giving Pledge to encourage billionaires to donate the bulk of their wealth to charity .
25: ## Using Technology To Change Learning In the open letter , Zuckerberg and Chan talked about the potential that technology offers to re-engineer the way children learn . ---(0.8480297902555115)---> 20: In the open letter , Zuckerberg and Chan talked about the potential that technology offers to re-engineer the way children learn .
As can be seen, the section headings "## Following In Some Generous Footsteps" and "## Using Technology To Change Learning" were not split off as separate sentences but were merged into the sentence that follows, even though each heading sits on its own line in the Newsela article.
Is there a way to prevent this from happening? Maybe changing some property of the sentence splitter internally used by the tool? I haven't checked if this happens with every section heading in every article, but it does happen in all the ones I've manually checked (around 10 original articles with their 4 versions).
Thank you for your help,
Fernando
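One possible workaround, sketched here rather than taken from the tool itself, is to pre-process each article before alignment so that heading lines never reach the sentence splitter. The class name `HeadingSplitter` and the method `dropHeadings` are hypothetical, and the sketch assumes that every section heading is a line beginning with "##", as in the examples above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of simptextalign.
class HeadingSplitter {

    // Returns the non-empty lines of an article, skipping heading lines.
    // Assumption: a section heading is any line that starts with "##".
    static List<String> dropHeadings(String articleText) {
        List<String> kept = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, ...), available since Java 8.
        for (String line : articleText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("##")) {
                continue; // skip blank lines and headings
            }
            kept.add(trimmed);
        }
        return kept;
    }

    public static void main(String[] args) {
        String article = "## Following In Some Generous Footsteps\n"
                + "In 2010 , Bill Gates and Warren Buffett publicly launched the Giving Pledge .\n";
        for (String line : dropHeadings(article)) {
            System.out.println(line);
        }
    }
}
```

The cleaned lines could then be written back to a copy of the article files and that copy passed to the aligner. Whether dropping headings is acceptable, as opposed to keeping them as separate units, depends on the use case.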
How can this tool be applied to another language? For example, I want to run it on Urdu.
Where should I change or specify the language?
Hi, I tried to run AlignNewselaDataset.jar, but it does not give any output. I then tried pointing -o at a file that does not exist; the system does create one, but there is no output in that folder.
I am trying to run the tool on the Newsela dataset, but I think I have to change the baseDir in the code and rebuild the jar file. Is baseDir the folder that contains the articles? Can you give me an example, please?
Hi,
I was just wondering, could you maybe add an open source license to the repo (e.g. MIT)? I would like to use the code for German but in order to do this, I need to be able to change some of the code, which I cannot do right now, as the standard GitHub license applies.
Thanks a lot!
Hi,
Thanks for your useful tool. However, I get the same error as in the previous issue, which is now closed.
Calculating IDF...
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: basedir ../../data/Newsela/articles is not a directory
at org.apache.tools.ant.DirectoryScanner.scan(DirectoryScanner.java:797)
at simplifiedTextAlignment.Representations.NgramModel.buildNewselaNgramModel(NgramModel.java:42)
at simplifiedTextAlignment.DatasetAlignment.AlignNewselaDataset.main(AlignNewselaDataset.java:107)
... 5 more
My java version is
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
Can you please guide me on how to resolve this?
Thank you in advance.
Originally posted by @umauh in #2 (comment)
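The message "basedir ../../data/Newsela/articles is not a directory" names a relative path, and in Java a relative path is resolved against the working directory the JVM was started from. The following is a minimal diagnostic sketch, not part of the tool, that prints where that path ends up and whether it exists. The class name `BaseDirCheck` and the method `resolve` are hypothetical; the path string is copied from the stack trace above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic, not part of simptextalign.
class BaseDirCheck {

    // Resolves a relative path against a working directory and
    // collapses the ".." segments.
    static Path resolve(String workingDir, String relative) {
        return Paths.get(workingDir).resolve(relative).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // "user.dir" is the directory the JVM was launched from.
        String cwd = System.getProperty("user.dir");
        // Path taken verbatim from the IllegalStateException message.
        Path baseDir = resolve(cwd, "../../data/Newsela/articles");

        System.out.println("Working directory: " + cwd);
        System.out.println("Resolved basedir:  " + baseDir);
        System.out.println("Is a directory:    " + Files.isDirectory(baseDir));
    }
}
```

Running this from the same directory as the jar shows which absolute location would need to hold the articles. If that location is not convenient, starting the jar from a different working directory may be an alternative to editing the code, assuming the path is hard-coded as the trace suggests.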
Hi, thanks for your useful tool!
However, when I run the tool, I hit a bug with the following error log:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NullPointerException
at simplifiedTextAlignment.DatasetAlignment.AlignNewselaDataset.main(AlignNewselaDataset.java:94)
... 5 more
My Java version is 8. Would you please take a look at this issue?
Thanks in advance!