Comments (11)
I haven't seen that, at least on the test texts currently uploaded. Are these full articles you're running against, now?
from abbr.
Actually on both full articles and abstracts. I just checked and the thing that's causing problems right now is a single-letter abbreviation (in this case s). It's not even a true abbreviation. It's an optional pluralization: brain structure(s).
And the stall is coming from utils.replace!
In findall, is it returning 'structure' as the term? If so, should we set it so that abbreviations must be enclosed in parentheses and preceded by a space?
It's returning structure(s because we have a line that looks for ' (' in the full term, which doesn't exist in this case. When the substring isn't found in a string, the find method returns -1, which, used as an index, points at the last character.
Then, it gets stuck in the while loop in utils.replace. I think we need both an escape for the while loop (to prevent infinite loops) and a check for the space before the open parenthesis before trying to replace throughout the text.
A perhaps 'hack-y' way to do it would be to say that index cannot equal -1.
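To illustrate the -1 behaviour described above, here is a minimal sketch of a guarded split. The function name split_term and its return shape are hypothetical, made up for this example; this is not the actual abbr code, just one way the "index cannot equal -1" check might look.

```python
def split_term(full_term):
    """Split 'long form (ABBR)' into ('long form', 'ABBR').

    Hypothetical helper, not part of abbr. Returns None when the
    ' (' separator is missing, so a -1 from str.find never gets
    used as a slice index.
    """
    index = full_term.find(" (")
    if index == -1:
        # str.find signals "not found" with -1; as an index, -1 would
        # silently point at the last character instead of failing.
        return None
    term = full_term[:index]
    abbreviation = full_term[index + 2:].rstrip(")")
    return term, abbreviation


print(split_term("magnetic resonance imaging (MRI)"))
print(split_term("brain structure(s)"))
```

With this kind of check, 'brain structure(s)' is rejected up front because there is no space before the open parenthesis, so it never reaches the replacement loop.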
Okay maybe requiring that there be a space is enough. It looks like it fixed it for me. pytest isn't working for me because test_utils.py is empty. How do you run the tests?
As soon as you pushed the commit, the Travis CI build started; looks like both versions of Python still pass!
Oooh wow I totally forgot about CI. I need to stop directly committing and start doing PRs from a fork like you do. Anyway, looks like it's solved at the moment.
Yeah so that was one problem causing infinite loops. Another one just came up.
This is definitely a false positive, but the identified abbreviation is X and the full term is XX.
Testing on the string 'XX XX (X) X' causes an infinite loop.
I think it has something to do with keeping track of where to start searching for the full term after replacing it once here. Maybe when the "abbreviation" X is replaced with XX, it's extending past the new start_idx in text and so it finds a new X to replace with XX, etc.
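A rough sketch of the idea above, assuming the loop mutates text in place and tracks a start_idx: if the search resumes after the inserted full term rather than after the original abbreviation, the freshly inserted characters are never rescanned. The function replace_all is hypothetical and does plain substring matching; it is not necessarily what #12 does.

```python
def replace_all(text, abbreviation, full_term):
    """Replace every occurrence of abbreviation with full_term.

    Hypothetical sketch, not the abbr implementation. The search
    resumes after the inserted full_term, so the replacement text
    is never scanned again.
    """
    if not abbreviation:
        # An empty pattern matches everywhere and would never terminate.
        raise ValueError("abbreviation must be non-empty")
    start_idx = 0
    while True:
        index = text.find(abbreviation, start_idx)
        if index == -1:
            break  # explicit escape: nothing left to replace
        text = text[:index] + full_term + text[index + len(abbreviation):]
        # Skip the whole inserted term, not just the old abbreviation.
        start_idx = index + len(full_term)
    return text


print(replace_all("XX XX (X) X", "X", "XX"))
```

On the problem string this terminates, and for plain substring matching it gives the same result as the built-in str.replace. It still replaces every X, including the ones inside XX, so the false positive itself would need separate handling.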
I think I've managed to deal with the new problem in #12.
I think it's a reasonable fix, and the builds are still passing. I went ahead and merged #12 and will close this issue unless something else arises. Thanks for catching and fixing that!