mholtzscher / syllapy
Calculate syllable count for English words.
License: MIT License
I see this while running my tests.
I'm not importing syllapy directly; it is a dependency of a dependency of mine.
.venv/lib/python3.11/site-packages/syllapy/data_loader.py:3
/opt/clones/github/jalanb/zatsos/zatso/.venv/lib/python3.11/site-packages/syllapy/data_loader.py:3: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources
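The warning is triggered by the `import pkg_resources` line itself. One common replacement for reading bundled data files is the standard library's `importlib.resources` (Python 3.9+). The following is only a minimal sketch of that approach, not syllapy's actual loader: the helper name `read_package_text` is hypothetical, and it is demonstrated on the stdlib `json` package because the name of syllapy's data file is not shown in this report.

```python
from importlib import resources


def read_package_text(package: str, filename: str) -> str:
    """Read a text file that ships inside an importable package.

    Hypothetical helper for illustration; uses importlib.resources
    instead of the deprecated pkg_resources API.
    """
    return resources.files(package).joinpath(filename).read_text(encoding="utf-8")


# Demonstrated on a stdlib package so the snippet runs without syllapy installed.
text = read_package_text("json", "__init__.py")
print(len(text) > 0)
```

A library that loads its data this way no longer imports `pkg_resources`, so the DeprecationWarning would not be raised at import time.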
Hello there!
Thank you for writing syllapy! I have used it to help relaunch the NYT Haiku Twitter bot (see the GitHub repo for details), and it was a tremendous relief to be able to use your library instead of keeping a dictionary of syllable counts that I had to load myself. I would like to contribute some things back to this repo, and I wanted to check whether you are still interested in maintaining it, so that I'm not bothering you incessantly if you don't have the time or bandwidth. Thank you.
Words such as 'Ohio' and 'Norway' exist in the dict returned by load_dict(). However, when you call syllapy.count('Ohio'), the count() function converts the word to lowercase before the lookup, so the capitalized key is never found in the dictionary, and 2 is returned instead of the value of 3 stored in the dict.
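The mismatch described above can be reproduced without syllapy. The sketch below uses a small stand-in dictionary (the entries and the helper names `normalize_keys` and `lookup` are assumptions for illustration, not syllapy's code) and shows one possible fix: lowercasing the keys once at load time so they match the lowercased query.

```python
def normalize_keys(exceptions: dict) -> dict:
    """Return a copy of the exception table with every key lowercased."""
    return {word.lower(): count for word, count in exceptions.items()}


def lookup(word: str, exceptions: dict):
    """Look up a word case-insensitively; None means 'not an exception'."""
    return exceptions.get(word.lower())


# Stand-in for the dict returned by load_dict(); entries are illustrative.
raw = {"Ohio": 3, "Norway": 2}

# Without normalization, the lowercased query misses the capitalized key.
print(raw.get("Ohio".lower()))

# With normalized keys, the stored count is found regardless of input case.
table = normalize_keys(raw)
print(lookup("Ohio", table))
```

Normalizing at load time keeps the per-call cost at a single `lower()` and a dict lookup.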
"detective" = 3 syllables. Correct.
"detectives" = 4 syllables. Incorrect.
The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.
As I mentioned, I am running a haiku-finding bot against the New York Times, so I keep finding new exceptions to add to the list (for instance, I think today there were "falsehoods" and "reconciliation"). Although my intention is to make regular pull requests with syllable changes, I would like to complement the load_dict method with another method I can call to extend syllapy's exceptions with my own JSON (or CSV, if you decide that approach is okay). This way I can make immediate and incremental patches to behavior on my end and then roll them up into larger regular PRs.
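One possible shape for such a method is sketched below. This is an assumption about how the feature could look, not an existing syllapy API: `merge_exceptions` is a hypothetical name, the base table is a stand-in, and the syllable counts in the sample JSON are illustrative.

```python
import json


def merge_exceptions(base: dict, extra_json: str) -> dict:
    """Return a new table where user-supplied entries override the base ones.

    `extra_json` is a JSON object mapping words to syllable counts.
    Keys are lowercased so lookups can stay case-insensitive.
    """
    extra = json.loads(extra_json)
    merged = dict(base)  # copy, so the library's own table is left untouched
    for word, count in extra.items():
        if not isinstance(count, int) or count < 1:
            raise ValueError(f"invalid syllable count for {word!r}: {count!r}")
        merged[word.lower()] = count
    return merged


# Stand-in for the library's built-in exceptions.
base = {"ohio": 3}

# User-maintained overrides, e.g. the contents of a local JSON file.
user_json = '{"falsehoods": 2, "Reconciliation": 6}'

merged = merge_exceptions(base, user_json)
print(sorted(merged))
```

Returning a new dict rather than mutating the base table makes it easy to diff local overrides against upstream before rolling them into a PR.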
ERROR: Failed building wheel for ujson
Running setup.py clean for ujson
Failed to build ujson
Installing collected packages: ujson, syllapy
Running setup.py install for ujson ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\matt\appdata\local\programs\python\python39\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Matt\AppData\Local\Temp\pip-install-l0j0afpg\ujson_7ec301c0567b44aab36a8e2e995eace9\setup.py'"'"'; file='"'"'C:\Users\Matt\AppData\Local\Temp\pip-install-l0j0afpg\ujson_7ec301c0567b44aab36a8e2e995eace9\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Matt\AppData\Local\Temp\pip-record-eobpo6yh\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\matt\appdata\local\programs\python\python39\Include\ujson'
cwd: C:\Users\Matt\AppData\Local\Temp\pip-install-l0j0afpg\ujson_7ec301c0567b44aab36a8e2e995eace9
Complete output (14 lines):
Warning: 'classifiers' should be a list, got type 'filter'
running install
running build
running build_ext
building 'ujson' extension
creating build
creating build\temp.win-amd64-3.9
creating build\temp.win-amd64-3.9\Release
creating build\temp.win-amd64-3.9\Release\lib
creating build\temp.win-amd64-3.9\Release\python
d:\Data\apps\VS2019\VC\Tools\MSVC\14.26.28801\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I./python -I./lib -Ic:\users\matt\appdata\local\programs\python\python39\include -Ic:\users\matt\appdata\local\programs\python\python39\include -Id:\Data\apps\VS2019\VC\Tools\MSVC\14.26.28801\ATLMFC\include -Id:\Data\apps\VS2019\VC\Tools\MSVC\14.26.28801\include -IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um /Tc./lib/ultrajsondec.c /Fobuild\temp.win-amd64-3.9\Release./lib/ultrajsondec.obj -D_GNU_SOURCE
ultrajsondec.c
C:\Users\Matt\AppData\Local\Temp\pip-install-l0j0afpg\ujson_7ec301c0567b44aab36a8e2e995eace9\lib\ultrajson.h(56): fatal error C1083: Cannot open include file: 'stdio.h': No such file or directory
error: command 'd:\Data\apps\VS2019\VC\Tools\MSVC\14.26.28801\bin\HostX86\x64\cl.exe' failed with exit code 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\matt\appdata\local\programs\python\python39\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Matt\AppData\Local\Temp\pip-install-l0j0afpg\ujson_7ec301c0567b44aab36a8e2e995eace9\setup.py'"'"'; file='"'"'C:\Users\Matt\AppData\Local\Temp\pip-install-l0j0afpg\ujson_7ec301c0567b44aab36a8e2e995eace9\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Matt\AppData\Local\Temp\pip-record-eobpo6yh\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\matt\appdata\local\programs\python\python39\Include\ujson' Check the logs for full command output.
I have been doing some spot checks and have about 445 additional exceptions for syllable counts I can add to the file you have. I realize though that it might be a really frustrating experience to review as a PR, especially if you didn't want to add some of them. Is there a preferred way I should contribute some additions back to you:
I also wanted to share that there are a few cases that seem to repeat, in case it's useful for your algorithm (many of them seem like special cases):
- Words ending in -sed or -ked, like poised or marked, are often coded as 2 syllables
- Words ending in e that are pluralized, like graves or gives
- Words ending in -ism, like journalism or socialism, seem to undercount the last syllable
- Words ending in -ly seem to not count the adverb syllable
I also realize this is controversial, but I count hour as 2 syllables, for instance, though I don't know if everybody does.
>>> import syllapy
>>> syllapy.count('feature')
2
>>> syllapy.count('features')
3
Please add 'screeched' to the dictionary containing words that are incorrectly handled by this program.