
readtomyshoe's People

Contributors

rozbb

readtomyshoe's Issues

Warning due to "fake play"

Browser: All browsers

Description: Whenever an article is played from the queue, a warning shows up in the console, saying something like

HTTP “Content-Type” of “application/x-unknown-content-type” is not supported. Load of media resource blob:<base_url>/75fe7bcf-6205-4786-9c1a-9725da54a14a failed.

This is because the fake_play() function attempts to play an empty blob. It does so because Safari requires an immediate play() action after a button click in order to permit future play() calls without user interaction. So the button click handler calls play() immediately on invalid audio before loading the real audio.

Notes: A fix for this might be to play a valid, but empty MP3 file. I've tried this and couldn't get it to work for the life of me.

Consider storage limits and what happens when those are hit

Currently, RTMS assumes it has infinite space for audio files. It would be wise to consider a way to limit space usage (either globally or per-user), and consider what happens when it's hit.

For example, a user who hits their space limit might not be able to add any articles until they "archive" some. Archiving might amount to deleting the MP3 file but saving the metadata somewhere. In the library, there might be an extra section for archived articles. It shows their title and source, but no Add to Queue button.
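
A rough sketch of the limit check and the "archive" idea, in Python for illustration only. The names Article, used_bytes, can_add, and archive, and the idea of a byte limit passed in by the caller, are hypothetical and not part of RTMS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Article:
    title: str
    source: str
    audio_bytes: Optional[int]  # None once the MP3 has been archived (deleted)


def used_bytes(library: List[Article]) -> int:
    """Total space taken by articles that still have audio."""
    return sum(a.audio_bytes for a in library if a.audio_bytes is not None)


def can_add(library: List[Article], new_size: int, limit: int) -> bool:
    """True if adding `new_size` bytes keeps the library within `limit`."""
    return used_bytes(library) + new_size <= limit


def archive(article: Article) -> Article:
    """Drop the audio but keep the metadata (title and source)."""
    return Article(article.title, article.source, None)
```

An archived entry keeps its title and source, so the library can still list it, just without an Add to Queue button.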

Support other ways to add to library

Currently you need to open RTMS in order to add an article to the library. It would be nice if there were a bookmarklet or share target that let you add it directly from the page. Push To Kindle is a good example of such a bookmarklet.

This has other benefits too. With a bookmarklet, you could possibly do text extraction client-side. This would let users with, e.g., Bloomberg accounts add Bloomberg articles to their library without having to give RTMS their login.

Blocked by reverse proxies

RTMS instances are being detected as bots and are receiving 403 (Forbidden) on pages hosted by Cloudflare and AWS CloudFront. The traffic volume of the instance is irrelevant to whether it gets banned.

Possible solutions:

  1. Register with Cloudflare as a friendly bot
  2. Write a bookmarklet that will submit articles directly to the server, rather than having the server make the request. This is a nice solution because it also solves #28 without any extra work.

Insert sentence breaks to titles

Currently, the TTS voice speeds through the entire title, byline, and first sentence of the article. These should all be spoken as separate sentences. They aren't because the text extraction separates them only by newlines. The TTS for web-derived articles should replace newlines with periods, so that the TTS breaks at the appropriate places.
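
A minimal sketch of the newline-to-period idea, in Python for illustration. The function name is hypothetical, and it assumes every non-empty line of the extracted text is its own unit (title, byline, paragraph); text that is hard-wrapped mid-sentence would get spurious periods.

```python
def newlines_to_sentence_breaks(text: str) -> str:
    """Join lines into one string, ending each line with sentence punctuation."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between paragraphs
        if line[-1] not in ".!?":
            line += "."  # force a sentence break where there was only a newline
        out.append(line)
    return " ".join(out)


print(newlines_to_sentence_breaks("My Title\nBy Jane Doe\n\nFirst sentence here."))
```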

Retry requests when Google TTS fails

Currently, if any TTS request fails, the whole article fails. Re-submitting the article is extremely wasteful. Make it so that individual TTS requests are auto-retried with exponential backoff by the backend.
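
A sketch of per-request retry with exponential backoff, in Python for illustration. The function name, the default attempt count, and the delays are assumptions; `request` stands in for whatever call performs a single TTS request. The `sleep` parameter exists only so the delays can be observed in tests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # 0.5s, 1s, 2s, ... capped at max_delay
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

With this shape, only the failed chunk is re-requested, so chunks that already succeeded are not paid for twice. A real version would likely retry only on transient errors and add random jitter to the delay.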

Support articles that are PDF files

It would be nice if I could upload a PDF file and RTMS would extract and TTS the text for me. This is a fundamentally hard problem, but even a simple solution like pdf2txt would be fine as a first step.

Make it so the Docker image can put the `audio_blobs/` directory on an external volume

Currently, audio_blobs/ lives inside the Docker container. That means 1) it is limited by whatever space constraints the docker runner is using, and 2) whenever the container is torn down, so is all the data.

It would be preferable to have the Dockerfile (or docker-compose file?) specify an external location to persistently keep the audio_blobs/ data.

Add source links

When adding by text, it would be nice to allow the manual addition of a source link (just like how adding by link includes the source link).

Add total queue length

It would be nice to be able to see the total length of the downloaded queue. Since you can't see the length of articles without opening them (including before you download them), the only way to know how much time you have queued up is to manually open every article. Total queue length solves this problem with less clutter than displaying every article length individually.
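
As a sketch, the total is just the sum of the per-article durations, formatted compactly. This is Python for illustration; the function name and the output format are assumptions, and it presumes each queued article's duration in seconds is already known.

```python
from typing import Iterable


def format_total_duration(durations_secs: Iterable[float]) -> str:
    """Sum article durations (in seconds) and format them for display."""
    total = int(round(sum(durations_secs)))
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes:02d}m"
    return f"{minutes}m {seconds:02d}s"
```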

Add Article creates corrupt file on error

Browser: All browsers

Description: If you add an article and an error occurs during TTS, the corrupt file will remain in the library.

Note: A simple fix is to do everything using a temp file and move it to the library under the appropriate filename only after everything succeeds.
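
A sketch of the temp-file approach, in Python for illustration. The function name and the `produce` callback (which stands in for the TTS step that writes audio) are hypothetical. The temp file is created in the same directory as the destination so that the final rename stays on one filesystem.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def write_atomically(final_path: str, produce: Callable[[BinaryIO], None]) -> None:
    """Write via a temp file; the final name only appears if `produce` succeeds."""
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            produce(tmp)
        os.replace(tmp_path, final_path)  # atomic rename on the same filesystem
    except BaseException:
        # Any failure: remove the partial file so nothing corrupt is left behind
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise
```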

Make a Wiki

There's too much info in the README. The Wiki should have, at least:

  • A high-level overview
  • Design goals and non-goals
  • Dev notes
  • Deployment notes. Cover Fly.io, Caddy config, nginx config, and GCP setup and cost
  • A quickstart guide

Support more languages and voices

Currently, text breaking is only written to work for English text (though it probably works on anything with newlines), and TTS is fixed to a single (American English) voice. It should be possible for a user to select the language of the article they downloaded (or have it autodetected), and select a TTS voice to go with it.

Images are broken in dev mode

Running scripts/dev.sh (which calls trunk) runs a version of RTMS with broken images, broken content scripts, broken manifests, etc. This is because dev mode serves the index from /, while production serves all the static assets from /assets/ (aside from index.html, which is special-cased). So if I set the favicon URL to /assets/favicon.ico, then it'll load in prod mode but fail in dev mode. And if I set it to /favicon.ico it'll load in dev mode and fail in prod mode.

An obvious thing to try is to put everything inside an /assets folder in dev mode. But this doesn't work because 1) it needs to serve index.html from /, and 2) if you make index.html separate, and put everything else in /assets/, then the URLs in prod will be /assets/assets/, because Trunk doesn't distinguish dev and prod builds.

Another solution is to do the above, but also make a copy of every asset in / as well. But 1) Trunk doesn't have a notion of making 2 copies of everything (though maybe you could do this in post-build hooks), and 2) this is very ugly.

Stop speech if other media plays

Currently, on every platform, if you get a notification while listening to an article, or you start to play something else, the playback will not stop. In the best case, the audio will duck and continue playing, and in the worst case it will appear to be playing from the controls, but no audio will come out. It would be nice if RTMS could detect when it lost audio focus and pause playback.

This may be impossible at the moment. The only API that seems to address this is the seemingly defunct Audio Focus API.

Article extraction errors are bad

Currently, if trafilatura fails at extracting an article, the error presented in the Add Article view is a parsing error. This is because trafilatura exits with exit code 0, even on failure.

Note: this is a trafilatura bug that was reported and fixed. Once the fix is upstreamed, this will be closed.

Clicking Add to Queue should shift focus to the progress indicator

Browser: Chrome
OS: macOS

This is probably the case on other browsers too. Whenever you select Add To Queue with VoiceOver, it leaves some unnamed group focused, and you hear the progress ticks. But it never says you're focused on a progress indicator, and it never says the percentage.

The example app in this Vue page does focusing correctly, at least for the button -> progress part. Maybe try to emulate that.

Add note about nginx URL normalization

I ran into this issue. A filename had a > in the title, and the server would return a 400 (Bad Request) error when I attempted to Add to Queue. The reason: nginx was normalizing the percent-encoded URL before it got passed to readtomyshoe-server. It put > directly in the URL, making the URL invalid, and forcing hyper to choke and return a 400.

Solution: make sure your proxy_pass line in nginx is specified WITHOUT a URI. In my case, that means making sure there is no trailing slash in the proxy_pass line below:

location / {
        proxy_pass http://localhost:32148;
        ...
}

More info here
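
To make the failure concrete, here is a small Python illustration of the two forms of such a title. It does not model nginx itself; the title and function names are made up. The encoded form is what the client sends, and the decoded form is what the server ends up seeing when the proxy normalizes the URL.

```python
from urllib.parse import quote, unquote


def encode_title(title: str) -> str:
    """Percent-encode a title for use as a single URL path segment."""
    return quote(title, safe="")


def normalize(encoded: str) -> str:
    """Decode percent-escapes, as a normalizing proxy would before forwarding."""
    return unquote(encoded)


encoded = encode_title("Profit > Loss")
print(encoded)             # the form the client sends
print(normalize(encoded))  # contains a raw '>' and spaces, invalid in a URL
```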

Let users enter credentials for paywalled content

Currently, there's no way to read a paywalled New York Times article. If the person running the RTMS server has an NYT account, they should be able to use the login to fetch that article.

The same goes for Bloomberg, WSJ, and SEC EDGAR (needs user agent).

Play/Pause <button> does not always act like a play/pause button

Browser: All browsers

Description: If you pause, then seek, then hit the play/pause button to resume, it will play from the position before the seek. This is because the play/pause button loads the last known state when playing. It seems the play button and the play/pause button have two mutually exclusive and useful functionalities.

Notes: One possible solution to make play/pause respect seeking/jumping is to trigger a save on every seek/jump. A better solution, I think, is to remove the play/pause button altogether. It exists because clicking the play button on the <audio> element will sometimes not play from the correct time, since the tab may have previously been unloaded and lost its place. This would be fixed if we had callbacks from the Page Lifecycle API that reload the player state whenever the player becomes visible again and was not playing something already. See visibilitychange.

Make user accounts

It's an obviously terrible idea to have every article be public. Let users make accounts and optionally share their library with other users.

Also be sure to implement robust access controls, and prevent enumeration attacks where a malicious user might be able to discover the contents of another user's library.

Let users use their own Google Cloud account for TTS

It should be possible to make RTMS use a service account that can bill to subordinate accounts. This would allow me to run a persistent RTMS server and only pay for storage, rather than the expensive TTS bills too.

The flow would roughly be:

  1. User makes a Google Cloud account just like in Getting Started
  2. User logs into RTMS and is presented with a special authentication link
  3. User clicks the link, which will take them to GCP and ask if they want to give access to their TTS API key. User clicks OK
  4. User returns to RTMS and can use it on their own dime.

Add Article terminates too early

Browser: All browsers

Description: If you submit an article and click the back button, the article will not be added to the library. This is because the Add Article backend will terminate early if the connection terminates early.

Notes: A fix for this would be to spawn a separate task to do the Add Article operation.
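
A sketch of the "separate task" idea using Python's asyncio, for illustration only. The names library, add_article, and handle_add_request are hypothetical, and the short sleep stands in for extraction plus TTS. The point is that the work runs in its own task, so cancelling the request handler (as happens when the client disconnects) does not cancel the work.

```python
import asyncio
from typing import List, Set

library: List[str] = []
_background: Set[asyncio.Task] = set()  # keep references so tasks aren't garbage-collected


async def add_article(title: str) -> None:
    await asyncio.sleep(0.05)  # stand-in for extraction + TTS
    library.append(title)


async def handle_add_request(title: str) -> str:
    task = asyncio.create_task(add_article(title))
    _background.add(task)
    task.add_done_callback(_background.discard)
    # shield() lets this handler be cancelled without cancelling `task`
    await asyncio.shield(task)
    return "added"
```

If the handler is cancelled while waiting, the article still lands in the library once the background task finishes.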

Keyboard bindings

This really, really needs keyboard bindings. You shouldn't need to use a screen reader just to find the pause button for the audio that's currently playing. Copy popular bindings, maybe from YouTube.

UI improvements for Add to Queue

Add to Queue is pretty hard to use right now. There are a few things that should be done:

  1. Some UI feedback that acknowledges you pressed the button to add the article to queue
  2. Adding an article to the queue might take a long time if you have a poor internet connection. So a progress indicator for the download would be helpful
  3. Something should prevent you from adding the same article to queue multiple times (or at least optimize it so that it doesn't have to download anything twice). On the flip side, deleting from queue should remove all instances of the deleted article.
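
For the third item, a minimal sketch of duplicate-free adds and remove-all deletes, in Python for illustration. It assumes the queue is a list of article IDs; the function names are hypothetical.

```python
from typing import List


def add_to_queue(queue: List[str], article_id: str) -> bool:
    """Append the article unless it is already queued. Returns True if added."""
    if article_id in queue:
        return False  # already queued: nothing to download again
    queue.append(article_id)
    return True


def remove_from_queue(queue: List[str], article_id: str) -> None:
    """Remove every instance of the article from the queue, in place."""
    queue[:] = [a for a in queue if a != article_id]
```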

Add note about "secure contexts" for offline mode

Currently, if you run scripts/prod.sh on your local network, offline mode will fail. This is because service workers can only be registered in "secure contexts", meaning localhost or https://..., and nothing else. This might surprise a developer who is trying out the server for the first time and finds that one of the core features doesn't work.

Make a prominent note somewhere saying that this is the expected behavior, and that you can only sample offline mode when you're on localhost or running a real HTTPS server.
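
As a rough illustration of which URLs qualify, here is a simplified Python approximation of the rule. It is only a sketch: the function name is made up, and real browsers apply more detailed rules than this.

```python
import ipaddress
from urllib.parse import urlsplit


def is_probably_secure_context(url: str) -> bool:
    """Approximate check: HTTPS, localhost, or a loopback IP address."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme in ("https", "wss"):
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a non-loopback hostname over plain HTTP
```

Under this approximation, a LAN address over plain HTTP (for example http://192.168.1.20:8080, a made-up address) does not qualify, which is the case described above.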

Implement a readalong UI

It would be nice to have a way of reading the text along with the voice. This would be possible using the current API by adding an SSML <mark> tag to every sentence. The TTS will return timepoints that can tell the client which sentence they're reading at a given timestamp.
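
A sketch of how the marked-up input could be built, in Python for illustration. The function names, the s0, s1, ... mark naming scheme, and the naive sentence splitter are all assumptions; RTMS's own text breaking would replace split_sentences.

```python
import re
from html import escape
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def to_ssml_with_marks(text: str) -> str:
    """Wrap text in <speak>, placing a named <mark/> before each sentence."""
    parts = [
        f'<mark name="s{i}"/>{escape(sentence)}'
        for i, sentence in enumerate(split_sentences(text))
    ]
    return "<speak>" + " ".join(parts) + "</speak>"
```

Each returned timepoint would then map a mark name such as s3 back to the fourth sentence, which the client can highlight when playback reaches that time.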

Improve error in Safari private mode

IndexedDB is not available in Safari's private mode, so RTMS fundamentally does not work. Make it error on add-to-queue, with error text nicer than the one in the attached screenshot.
[Screenshot: Screen Shot 2022-09-12 at 00 13 17]

Make an autoplay option

Make a selection box which, when selected, will autoplay the next article once the current one is done.
