
rss-parrot's Introduction

RSS Parrot

Source code of RSS Parrot, a service that lets you turn Mastodon into your feed reader.

Details on https://rss-parrot.net/, and in the Fediverse.

rss-parrot's People

Contributors

gitlimes, gugray

rss-parrot's Issues

Detect duplicate title and body

Some feeds (e.g. those provided by nitter.cz) duplicate the article title and body. Is it possible to detect this and remove one or the other?

Example:

[Screenshot: Screenshot_20240120-163311-226]
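
One possible way to detect this, as a rough sketch only: compare title and body after normalizing case, punctuation and whitespace, and treat them as duplicates when they are equal or one is a prefix of the other (which would also catch truncated titles). The function names here are hypothetical and not taken from the Parrot's code.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    without_punct = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(without_punct.split())


def is_duplicate(title: str, body: str) -> bool:
    """True when title and body carry the same text, or one is a prefix of the other."""
    t, b = normalize(title), normalize(body)
    if not t or not b:
        # An empty title or body cannot duplicate anything.
        return False
    return b.startswith(t) or t.startswith(b)
```

The prefix rule is deliberately blunt: a very short title that happens to start the body would also be flagged, so a real implementation might want a minimum length before applying it.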

Purge old posts

Periodically delete data from the Parrot's database to keep it lean: delete old posts from every feed. Keep the top X, and don't delete anything more recent than Y, but delete all the rest. The Parrot is a relay, not an archive.
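
The rule above (keep the top X, never delete anything more recent than Y) could be sketched like this. This is only an illustration: the `(post_id, published)` tuples and the parameter names `keep_top` and `min_age` are assumptions, not the Parrot's actual schema.

```python
from datetime import datetime, timedelta, timezone


def posts_to_purge(posts, keep_top, min_age, now=None):
    """Return the ids of posts that may be deleted for one feed.

    posts:    list of (post_id, published) tuples, published being a datetime
    keep_top: X, the number of newest posts that are always kept
    min_age:  Y as a timedelta; posts newer than this are never deleted
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - min_age
    newest_first = sorted(posts, key=lambda p: p[1], reverse=True)
    # Skip the top X, then delete only what is older than the cutoff.
    return [pid for pid, published in newest_first[keep_top:] if published < cutoff]
```

A post is deleted only if it fails both tests: it is outside the newest X and it is older than Y.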

language setting

If possible, the language of the feed should be set in the toot. That would create much less spam on the global timeline.

Allow verification via //link[@rel='me'] in feed's website link

Thanks for building this, it's great!

I've got the feed available at https://codeandsupply.co/jobs birb'd via @codeandsupply.co.jobs@rss-parrot.net. I've added //link[@rel='me'] tags to the site to indicate that RSS-parrot's profile is verified, the way Mastodon and other fediverse systems allow.

<link href='https://rss-parrot.net/@codeandsupply.co.jobs' rel='me'>
<link href='https://rss-parrot.net/web/feeds/codeandsupply.co.jobs' rel='me'>

It doesn't look like RSS-parrot supports this right now. It'd be cool if it did.

Google Alerts Feed does not work

The feed https://www.google.de/alerts/feeds/15238767306718225030/1632617747796897053 gets the response

"Hm, I can't find a feed for this site.

Is the address right? It can also happen that the site is temporarily down, or it doesn't have a valid RSS or Atom feed."

Masto account feed not detected

Tried to follow myself: https://neuromatch.social/@jonny/111709784531341231
got this error message: https://rss-parrot.net/u/birb/status/1704442136821027061

it seems like this was the one i was supposed to get, right?:

template = "reply_feed_mastodon.html"

something not working right in that "get account for feed" function or the control flow in there :)

cool thing!!! nice work!
edit: sorry that reads as sarcastic, i meant it sincerely

Send a toot with the last feed entry immediately after following a feed

Subscribing to a feed can be frustrating as the feed might not change for some time. It means that the user starts following the feed, and nothing happens for days or even weeks.

The onboarding experience would be much nicer if the parrot immediately sent a toot with the last entry of the feed.

GZIP body of POST requests if possible

Test if Mastodon servers accept POST requests from us where the body is Gzipped.
This would reduce outgoing data volume, which helps with hosting fees.
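
For the test, the request body could be prepared roughly as below. This is a sketch under assumptions: the helper name is made up, and whether receiving servers accept a `Content-Encoding: gzip` request body (and how that interacts with signature checks on the body) is exactly what the test would need to find out.

```python
import gzip
import json


def gzip_json_body(payload: dict) -> tuple[bytes, dict[str, str]]:
    """Serialize payload to JSON, gzip it, and return (body, headers) for a POST."""
    raw = json.dumps(payload).encode("utf-8")
    compressed = gzip.compress(raw)
    headers = {
        "Content-Type": "application/activity+json",
        # Tells the receiver that the body must be decompressed before parsing.
        "Content-Encoding": "gzip",
    }
    return compressed, headers
```

The returned body and headers can then be handed to whatever HTTP client sends the request.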

Mechanism for sites to indicate their preferred bridge

Hi again! Now that we have multiple feed-to-fediverse bridges - RSS Parrot, Bridgy Fed, MastoFeed, rss-to-activitypub, feed2toot, feed2fedi - we'll inevitably end up with multiple fediverse accounts for the same web site, each run by a different bridge. Not a big problem, but it'd still be nice to avoid.

One approach could be to let sites indicate their "preferred" bridge with eg <link rel="me bridge"> in their home page's HTML, similar to #16. If example.com wants to use RSS Parrot to get its posts into the fediverse, it would add something like this in its HTML:

<link rel="me bridge" href="https://rss-parrot.net/web/feeds/example.com">

Then, when other bridges start to create a fediverse account for example.com, they'd first check its home page HTML for a link rel=bridge. If one exists, and it doesn't point to them, they'd stop and give up.
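
To make the check concrete, here is a minimal sketch of what a bridge might do with the home page HTML. The `rel="bridge"` value is only the proposal from this issue, not an existing standard, and `may_bridge` and `own_prefix` are hypothetical names.

```python
from html.parser import HTMLParser


class BridgeLinkCollector(HTMLParser):
    """Collects href values of <link> tags whose rel list contains 'bridge'."""

    def __init__(self):
        super().__init__()
        self.bridge_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "bridge" in rels and attributes.get("href"):
            self.bridge_hrefs.append(attributes["href"])


def may_bridge(home_page_html: str, own_prefix: str) -> bool:
    """False when the site names a preferred bridge and none of the links point to us."""
    collector = BridgeLinkCollector()
    collector.feed(home_page_html)
    if not collector.bridge_hrefs:
        return True  # no preference stated, so any bridge may proceed
    return any(href.startswith(own_prefix) for href in collector.bridge_hrefs)
```

With the example tag above, `may_bridge(html, "https://rss-parrot.net/")` would allow RSS Parrot to proceed, while a bridge with a different prefix would stop and give up.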

One difficulty is that a site could want to use multiple bridges for different networks, eg RSS Parrot for the fediverse and https://atomstr.data.haus/ for Nostr. Maybe we'd allow multiple <link rel="me bridge">s to handle that, maybe differentiated by type, or maybe that differentiation isn't necessary.

Thoughts?

No response to request from GoToSocial

If I send a message to RSS parrot from a GoToSocial account, I never get a response from the bot.

Logs from the GTS side suggest that RSS Parrot doesn't like the formatting of the request. Messages from GTS to other platforms like Mastodon do work, so I'm guessing it's a parse issue and not a generating issue.

I'll see about getting some more detailed logs, including what GoToSocial is sending.

timestamp="11/01/2024 02:47:13.361" func=federation.(*federatingActor).Send level=INFO requestID=rtj6nxmc04001n481h4g msg="send activity Create via outbox https://gts.keysmash.xyz/users/kelsonv/outbox"
timestamp="11/01/2024 02:47:13.366" func=httpclient.(*Client).DoSigned level=INFO method=POST url=https://rss-parrot.net/inbox requestID=rtj6nxmc04001n481h4g pubKeyID=https://gts.keysmash.xyz/users/kelsonv/main-key msg="performing request"
timestamp="11/01/2024 02:47:13.581" func=workers.(*clientAPI).CreateStatus level=ERROR requestID=rtj6nxmc04001n481h4g msg="error federating status: CreateStatus: error sending Create activity via outbox https://gts.keysmash.xyz/users/kelsonv/outbox: func1: error delivering to https://rss-parrot.net/inbox: deliver: POST request to https://rss-parrot.net/inbox failed: status=\"400 Bad Request\" body=\"{\"error\":\"Request body is not valid JSON\",\"status\":400}\n\""

Delete accounts with 0 followers

In the background process that updates feeds, check if the account still has any followers. If it has no followers, delete the account and its related posts and toots.

This will bring the number of accounts from the current 13600+ (68 pages of 200 accounts) down to a lower value: only those that have actually been requested since the Parrot went live.

Opt out?

Hi @gugray! RSS Parrot is great, I love seeing this kind of work in the fediverse. Congrats on launching it!

I run a similar service, https://fed.brid.gy/, and one of its features that I've already used a handful of times is opt out, ie letting people ask to prevent my service from bridging their sites or accounts. Just curious, have you considered that for RSS Parrot yet?

Opting out

I'd like to opt out, because I believe creators should be in control of who's publishing what and where. That means it should be opt-in, especially because it's not hard to create another fediverse account for just publishing their own feed.

I don't think opting out by e-mail is reasonable, because it exposes an otherwise not exposed e-mail address.

I also don't think putting a hashtag on a profile is reasonable, because it litters the profile with unrelated info and won't work for many "services".

Having the automatic feed account deleted is important to achieve a full opt-out, because I don't want a feed with my name on it lingering around.

You can gather the feed in question from my GitHub profile.

Support sites with multiple feeds

It seems like rss-parrot creates one account per website, but that fails when a website has multiple rss feeds.

To reproduce
Message the bot with example.com/feed1.xml. An account should be created for example.com.

Message the bot with example.com/feed2.xml. You should get a reply with the previously created account which has the wrong feed.

Use RSS enclosure as media attachment

It would be nice to be able to attach media files; images at least would be a great feature. So this is a feature request to add RSS enclosures as media attachments when sending out new toots.

Some Feeds not Read

I use a service that turns my Steam achievements into an RSS feed. However, when I ask the bird to turn this feed into an account, the bird can't find anything. I'm not all that interested in getting this feed into Mastodon, but the feed is valid, and I suspect the server is blocking the bird from reading it. I thought I'd bring it up. Here is a link to the feed, which is clearly a feed, but which the bird is blind to.

Feed: https://truesteamachievements.com/friendfeedrss.aspx?gamerid=90878

Entities like &#39; showing up in descriptions

Quick fix: just unescape the entities that appear frequently

Deep fix: check a lot of retrieved descriptions, see to what extent they have HTML markup, come up with a good-enough approach to sanitize them if needed.
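
The quick fix could look roughly like this, using the standard library's `html.unescape`. The repeated passes are an assumption that some feeds double-escape their text (for example `&amp;#39;`); the function name is made up for illustration.

```python
import html


def unescape_description(text: str, max_passes: int = 3) -> str:
    """Unescape HTML entities, repeating a few times to handle double-escaped text."""
    for _ in range(max_passes):
        unescaped = html.unescape(text)
        if unescaped == text:
            break  # nothing left to unescape
        text = unescaped
    return text
```

This only covers entities. Stripping or sanitizing actual HTML markup would still be the job of the deep fix.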

Option to add a #Hashtag to a feed

Many RSS subscriptions would benefit from having one or more default hashtags associated with them. Could these be specified when the subscription is created?

Cannot view or follow new parrot accounts in Mastodon

Symptom A:

  1. Request a new feed by tooting at the birb
  2. In the response, click/tap the link to the new parrot account
    -> I get redirected to the account's web page in rss-parrot
    Expected: I get to see the account in my Mastodon client

Symptom B:

  1. In my Mastodon client I search for an existing but recent Parrot account, like @[email protected]. It's important that no account from my Mastodon instance is following this account yet.
    -> I get no results
    Expected: I get to see the account, and I can explore its details within my Mastodon client.

RSS Parrot post URLs on page?

On the server where I follow an RSS Parrot account, older posts are not always visible:

[screenshot]

When I click the page, I end up on the (nice) landing page, e.g. https://rss-parrot.net/web/feeds/nanocommons.github.io.erm-database

But for new RSS Parrots, I often like to "advertise" them by boosting an older post from the feed. For that I need the Mastodon post URL. Because I cannot get that from my server, I was wondering if it could be added to that page?

Toots from the Parrot can be too long

If a new item's description in the RSS feed is long (e.g., it contains the entire content of the post), then the toot that the Parrot sends about it can also be very long. Two examples of feeds with long content:

https://rss-parrot.net/web/feeds/dev.to
https://rss-parrot.net/web/feeds/ludic.mataroa.blog

It is desirable to truncate the description at something like 250 chars in the toot. Otherwise the toots can be unpleasant to read, and they can clutter up the public timeline of instances.
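
A minimal sketch of such truncation, assuming the description is already plain text. The 250 default comes from the suggestion above; breaking at a word boundary and appending an ellipsis are illustrative choices, not the Parrot's actual behavior.

```python
def truncate_description(text: str, limit: int = 250) -> str:
    """Collapse whitespace and cut text to at most `limit` characters, ellipsis included."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    cut = text[: limit - 1]  # leave room for the ellipsis
    last_space = cut.rfind(" ")
    if last_space > 0:
        cut = cut[:last_space]  # avoid cutting in the middle of a word
    return cut.rstrip() + "…"
```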

Support podcast feeds with no link

Often, items in podcast feeds have no <link> element. Instead, they have an <enclosure> element with a link to the episode's audio file.

If there is no <link> in an item, the birb should look for the link in <enclosure> and use that.
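
A rough sketch of that fallback, assuming standard RSS 2.0 items where the page link lives in `<link>` and the audio file in the `url` attribute of `<enclosure>`. It uses Python's `xml.etree.ElementTree`, and `item_link` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def item_link(item: ET.Element) -> Optional[str]:
    """Return the item's <link> text, falling back to the <enclosure> url."""
    link = item.findtext("link")
    if link and link.strip():
        return link.strip()
    enclosure = item.find("enclosure")
    if enclosure is not None:
        return enclosure.get("url")
    return None  # neither a link nor an enclosure is present
```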

Boost author's fedi post if available

This is a wild idea that popped into my head:

Would it be possible to detect whether the author of, e.g., a blog post has already shared it on their verified Fedi account, and boost that post instead?

How I'd imagine this to work would be that the parrot would look for a Fedi verification link on the followed website and follow the author. If a new link appears in the website's feed, the parrot would check whether the author has mentioned the new link in a post of theirs.
If they have, the parrot would boost that post. If they haven't, it would fall back on creating a new post of its own.

Now obviously, you'd have to wait a certain amount of time before checking as the author may need some time to type out the Fedi post.

This would allow me to boost the author's post that A. has their account linked to it and B. may provide better extra info such as a short summary.

(Why not follow the author directly? I usually don't want to see every single thing an author posts on their Fedi account in my feed but do want to see their higher effort articles.)
