rozbb / readtomyshoe
A webapp that reads your articles to you while you're on the subway
License: Other
Browser: All browsers
Description: Whenever playing an article from the queue, a warning shows up in the console, saying something like:
HTTP “Content-Type” of “application/x-unknown-content-type” is not supported. Load of media resource blob:<base_url>/75fe7bcf-6205-4786-9c1a-9725da54a14a failed.
This is because the fake_play() function attempts to play an empty blob. Safari requires an immediate play() action after a button click in order to permit future play() calls without user interaction, so the button click handler calls play() immediately on invalid audio before loading the real audio.
Notes: A fix for this might be to play a valid, but empty MP3 file. I've tried this and couldn't get it to work for the life of me.
Currently, RTMS assumes it has infinite space for audio files. It would be wise to consider a way to limit space usage (either globally or per-user), and consider what happens when it's hit.
For example, a user who hits their space limit might not be able to add any articles until they "archive" some. Archiving might amount to deleting the MP3 file but saving the metadata somewhere. In the library, there might be an extra section for archived articles. It shows their title and source, but no Add to Queue button.
Browser: Chrome for Desktop and Android.
Description: If you play an article, then reload, then play again, Chrome's media session will be missing all but the pause and stop buttons. Album art is missing too.
Notes: This is a bug in Chrome. It has been reported and assigned.
Currently you need to open RTMS in order to add an article to the library. It would be nice if there were a bookmarklet or share target that let you add it directly from the page. Push To Kindle is a good example of such a bookmarklet.
This has other benefits too. With a bookmarklet, you could possibly do text extraction client-side. This lets users with, e.g., Bloomberg accounts add Bloomberg articles to their library without having to give RTMS their login.
RTMS instances are being detected as bots, and receiving 403 (Forbidden) on pages hosted by Cloudflare and AWS CloudFront. The traffic volume of the instance is irrelevant to whether it gets banned.
Possible solutions:
Currently, the TTS voice speeds through the entire title, byline, and first sentence of the article. These should all be spoken as separate sentences. The reason they're not is because the text extraction has them separated by newlines. The TTS for web-derived articles should replace newlines with periods, so that the TTS breaks at the appropriate places.
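A minimal sketch of the newline-to-period idea, written in Python for brevity rather than in the language of the RTMS backend. The function name newlines_to_sentence_breaks and the set of punctuation it treats as sentence-ending are assumptions made for illustration, not part of RTMS.

```python
def newlines_to_sentence_breaks(text: str) -> str:
    """Turn line breaks into sentence breaks so a TTS engine pauses at them."""
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between paragraphs carry no text
        # Ignore trailing quotes/brackets when checking for end punctuation.
        core = line.rstrip("\"')]”’")
        if not core.endswith((".", "!", "?", ":", ";", "…")):
            line += "."  # title, byline, etc. become their own sentence
        sentences.append(line)
    return " ".join(sentences)


extracted = "My Title\nBy Jane Doe\n\nFirst sentence. Second one."
print(newlines_to_sentence_breaks(extracted))
```

Lines that already end in punctuation are left alone, so ordinary paragraphs do not pick up doubled periods.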
It'd be nice to see how long an article is before you try to add it to queue. This should maybe go in the italic text below the title.
Currently, if any TTS request fails, the whole article fails. Re-submitting the article is extremely wasteful. Make it so that individual TTS requests are auto-retried with exponential backoff by the backend.
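The retry behavior could look roughly like the sketch below. It is written in Python for brevity and is not taken from the RTMS codebase: retry_with_backoff, its parameters, and the default delays are all hypothetical, and the request argument stands in for whatever function performs a single TTS call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:
            # A real backend should only retry errors known to be transient
            # (timeouts, rate limits), not every exception.
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

With this shape, only the failed chunk is re-requested, so one bad TTS call does not force the whole article to be resubmitted. Adding random jitter to each delay is a common refinement when many requests fail at once.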
It would be nice if I could upload a PDF file and RTMS would extract and TTS the text for me. This is a fundamentally hard problem, but even a simple solution like pdf2txt would be fine as a first step.
Currently, audio_blobs/ lives inside the Docker image. That means 1) it is limited by whatever space constraints the Docker runner is using, and 2) whenever the image is torn down, so is all the data.
It would be preferable to have the Dockerfile (or docker-compose file?) specify an external location to persistently keep the audio_blobs/ data.
When adding by text, it would be nice to allow the manual addition of a source link (just like how adding by link includes the source link).
It would be nice to be able to see the total length of the downloaded queue. Since you can't see the length of articles without opening them (including before you download them), the only way to know how much time you have queued up is to manually open every article. Total queue length solves this problem with less clutter than displaying every article length individually.
Browser: All browsers
Description: If you add an article and an error occurs during TTS, the corrupt file will remain in the library.
Note: Simple fix is to do everything using a temp file and to move it to the library under the appropriate filename after everything succeeds.
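The temp-file approach can be sketched as follows. This is an illustration in Python, not the RTMS implementation: write_atomically and the produce callback (which stands in for the code that streams TTS audio to disk) are hypothetical names.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def write_atomically(final_path: str, produce: Callable[[BinaryIO], object]) -> None:
    """Write to a temp file, then move it into place only if produce() succeeds."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file in the same directory so the final rename stays
    # on one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            produce(tmp)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Any failure (including a TTS error) removes the partial file,
        # so nothing corrupt ever appears under the final name.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the library listing only ever looks at final filenames, a failed TTS run then leaves nothing behind for it to pick up.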
There's too much info in the README. The Wiki should have, at least:
Currently, text breaking is only written to work for English text (though it probably works on anything with newlines), and TTS is fixed to a single (American English) voice. It should be possible for a user to select the language of the article they downloaded (or have it autodetected), and select a TTS voice to go with it.
Running scripts/dev.sh (which calls trunk) runs a version of RTMS with broken images, broken content scripts, broken manifests, etc. This is because dev mode serves the index from /, while production serves all the static assets from /assets/ (aside from index.html, which is special-cased). So if I set the favicon URL to /assets/favicon.ico, then it'll load in prod mode but fail in dev mode. And if I set it to /favicon.ico it'll load in dev mode and fail in prod mode.
An obvious thing to try is to put everything inside an /assets folder in dev mode. But this doesn't work because 1) it needs to serve index.html from /, and 2) if you make index.html separate, and put everything else in /assets/, then the URLs in prod will be /assets/assets/, because Trunk doesn't distinguish dev and prod builds.
Another solution is to do the above, but also make a copy of every asset in / as well. But 1) Trunk doesn't have a notion of making 2 copies of everything (though maybe you could do this in post-build hooks), and 2) this is very ugly.
Currently, on every platform, if you get a notification while listening to an article, or you start to play something else, the playback will not stop. In the best case, the audio will duck and continue playing, and in the worst case it will appear to be playing from the controls, but no audio will come out. It would be nice if RTMS could detect when it lost audio focus and pause playback.
This may be impossible at the moment. The only API that seems to address this is the seemingly defunct Audio Focus API.
Currently, if trafilatura fails at extracting an article, the error presented in the Add Article view is a parsing error. This is because trafilatura exits with a 0 error code, even on failure.
Note: this is a trafilatura bug that was reported and fixed. Once the fix is upstreamed, this will be closed.
Browser: Safari for iOS
Description: If the readtomyshoe PWA is paused and backgrounded for 5 seconds, you cannot resume playback.
Notes: Bug in Safari. Reported.
Browser: Chrome
OS: macOS
This is probably the case on other browsers too. Whenever you select Add To Queue with VoiceOver, it leaves some unnamed group focused, and you hear the progress ticks. But it never says you're focused on a progress indicator, and it never says the percentage.
The example app in this Vue page does focusing correctly, at least for the button -> progress part. Maybe try to emulate that.
I ran into this issue. A filename had a > in the title, and the server would return a 400 (Bad Request) error when I attempted to Add to Queue. The reason: nginx was normalizing the percent-encoded URL before it got passed to readtomyshoe-server. It put > directly in the URL, making the URL invalid, and forcing hyper to choke and return a 400.
Solution: make sure your proxy_pass line in nginx is specified WITHOUT a URI. In my case, that means making sure there is no trailing slash in the proxy_pass line below:
location / {
    proxy_pass http://localhost:32148;
    ...
}
More info here
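To make the encoding problem concrete, the short Python sketch below shows what a > in a title looks like before and after percent-encoding. The title string is a made-up example, not the actual filename from this issue; urllib.parse.quote and unquote are standard-library functions.

```python
from urllib.parse import quote, unquote

# Hypothetical article title containing a character that is not allowed
# to appear raw in a URL path.
title = "a > b"

encoded = quote(title)      # what the browser sends: "a%20%3E%20b"
decoded = unquote(encoded)  # what a normalizing proxy forwards: "a > b"

print(encoded)
print(decoded)
```

A proxy that decodes the path and forwards the second form hands the upstream server a raw > and spaces, which a strict HTTP parser is entitled to reject.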
Browser: Safari for iOS
Description: Requires headphones to trigger. If you pause an article and background readtomyshoe, wait 2min, and click the play button on your headphones, the audio will resume but the media session controls will disappear from Notification Center.
Notes: Bug in Safari. Reported.
Currently there's no way to find out anything about readtomyshoe from the webpage. Put a link to the GitHub repo somewhere.
Currently, there's no way to read a paywalled New York Times article. If the person running the RTMS server has an NYT account, they should be able to use the login to fetch that article.
Ditto goes for Bloomberg, WSJ, and SEC EDGAR (needs user agent).
Browser: All browsers
Description: If you pause, then seek, then hit the play/pause button to resume, it will play from the position before the seek. This is because the play/pause button loads the last known state when playing. It seems the play button and the play/pause button have two mutually exclusive and useful functionalities.
Notes: One possible solution to make play/pause respect seeking/jumping is to trigger a save on every seek/jump. A better solution, I think, is to remove the play/pause button altogether. The reason it exists is that clicking the play button on the <audio> element will sometimes not play from the correct time, because the tab may have previously been unloaded and lost its place. This would be fixed if we had callbacks from the Page Lifecycle API that reload the player state whenever the player becomes visible again and was not playing something already. See visibilitychange.
It's an obviously terrible idea to have every article be public. Let users make accounts and optionally share their library with other users.
Also be sure to implement robust access controls, and prevent enumeration attacks where a malicious user might be able to discover the contents of another user's library.
It should be possible to make RTMS use a service account that can bill to subordinate accounts. This would allow me to run a persistent RTMS server and only pay for storage, rather than also paying the expensive TTS bills.
The flow would roughly be:
Browser: All browsers
Description: If you submit an article and click the back button, the article will not be added to the library. This is because the Add Article backend will terminate early if the connection terminates early.
Notes: A fix for this would be to spawn a separate task to do the Add Article operation.
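The idea of detaching the work from the connection can be sketched with Python's asyncio. This is only an illustration of the pattern, not the RTMS backend: add_article, handle_request, and the in-memory library list are hypothetical stand-ins.

```python
import asyncio

# Strong references to in-flight jobs so they are not garbage-collected.
background_tasks: set[asyncio.Task] = set()


async def add_article(library: list, title: str) -> None:
    await asyncio.sleep(0.05)  # stands in for slow extraction + TTS
    library.append(title)


async def handle_request(library: list, title: str) -> None:
    # Run the job as its own task instead of inside the request handler.
    task = asyncio.create_task(add_article(library, title))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    # shield() lets the handler be cancelled (client went away) without
    # cancelling the underlying job.
    await asyncio.shield(task)


async def demo() -> list:
    library: list = []
    handler = asyncio.create_task(handle_request(library, "Some Article"))
    await asyncio.sleep(0.01)
    handler.cancel()  # simulate the user clicking the back button
    try:
        await handler
    except asyncio.CancelledError:
        pass
    await asyncio.gather(*background_tasks)  # the job still finishes
    return library
```

In the demo, the handler is cancelled partway through, yet the article still lands in the library because the job runs independently of the request.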
This really, really needs keyboard bindings. You shouldn't need to use a screen reader just to find the pause button for the audio that's currently playing. Copy popular bindings, maybe from YouTube.
Add to Queue is pretty hard to use right now. There's a few things that should be done:
Currently, if you run scripts/prod.sh on your local network, offline mode will fail. The reason is that service workers can only be registered in "secure contexts", meaning localhost or https://..., and no more. This might be surprising for a developer trying out the server for the first time and finding that one of the core features doesn't work.
Make a prominent note somewhere saying that this is the expected behavior, and that you can only sample offline mode when you're on localhost or running a real HTTPS server.
It would be nice to have a way of reading the text along with the voice. This would be possible using the current API by adding an SSML <mark> tag to every sentence. The TTS will return timepoints that can tell the client which sentence they're reading at a given timestamp.
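A rough sketch of both halves of that idea, in Python for brevity. The function names, the s0, s1, ... mark naming scheme, and the (mark name, seconds) shape of the timepoints are assumptions for illustration; the actual response format depends on the TTS API in use.

```python
import bisect
from typing import List, Optional, Tuple
from xml.sax.saxutils import escape


def sentences_to_ssml(sentences: List[str]) -> str:
    """Wrap sentences in <speak>, placing a named <mark> before each one."""
    parts = ["<speak>"]
    for i, sentence in enumerate(sentences):
        # escape() keeps characters like & and < from breaking the SSML.
        parts.append(f'<mark name="s{i}"/>{escape(sentence)}')
    parts.append("</speak>")
    return " ".join(parts)


def mark_at(timepoints: List[Tuple[str, float]], t: float) -> Optional[str]:
    """Return the name of the last mark reached at playback time t (seconds).

    timepoints must be sorted by time, e.g. [("s0", 0.0), ("s1", 2.5)].
    """
    times = [seconds for _, seconds in timepoints]
    idx = bisect.bisect_right(times, t) - 1
    return timepoints[idx][0] if idx >= 0 else None


print(sentences_to_ssml(["Hi there.", "Tom & Jerry."]))
print(mark_at([("s0", 0.0), ("s1", 2.5)], 1.0))
```

The client would call something like mark_at on each playback time update and highlight the sentence whose mark name comes back.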
Make a selection box which, when selected, will autoplay the next article once the current one is done.