lkiesow / python-feedgen

Python module to generate ATOM feeds, RSS feeds and Podcasts.

Home Page: https://feedgen.kiesow.be/

License: BSD 2-Clause "Simplified" License

python-feedgen's Issues

Exception while generating atom feed

  File "articles/feed.py", line 172, in create_feed_from_articles
    return feed.atom_str(1)
  File "local/lib/python2.7/site-packages/feedgen/feed.py", line 211, in atom_str
    feed, doc = self._create_atom(extensions=extensions)
  File "local/lib/python2.7/site-packages/feedgen/feed.py", line 195, in _create_atom
    entry = entry.atom_entry()
  File "local/lib/python2.7/site-packages/feedgen/entry.py", line 139, in atom_entry
    cat = etree.SubElement(feed, 'category', term=c['term'])
NameError: global name 'feed' is not defined

TIA

How to add arbitrary fields to items and the feed?

Is there a way to add arbitrary attributes to items or the feed itself? I'm especially trying to add media tags e.g. <media:thumbnail ...>.
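One possible workaround, sketched here with only the standard library rather than feedgen's own extension mechanism, is to post-process the generated RSS and attach a `<media:thumbnail>` element yourself. The helper name `add_thumbnail`, the sample feed, and the example URL are assumptions made for illustration; the namespace URI is the one used by Media RSS.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by Media RSS.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def add_thumbnail(rss_xml, item_index, url):
    """Return rss_xml with a <media:thumbnail> added to the chosen <item>.

    Hypothetical helper: it works on the serialized feed, not on feedgen objects.
    """
    root = ET.fromstring(rss_xml)
    items = root.findall("./channel/item")
    thumb = ET.SubElement(items[item_index], "{%s}thumbnail" % MEDIA_NS)
    thumb.set("url", url)
    return ET.tostring(root, encoding="unicode")


# Stand-in for the output of a feed generator (assumed sample data).
sample = (
    "<rss version='2.0'><channel><title>Demo</title>"
    "<item><title>First</title></item></channel></rss>"
)
patched = add_thumbnail(sample, 0, "http://example.org/thumb.jpg")
print(patched)
```

The same pattern should work for other namespaced elements, at the cost of parsing the feed a second time.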

Add new content to existing RSS file

Does this library offer a way to add entries to existing RSS files? I am able to generate new RSS files and append items to them during the initial setup of the file. However, I don't see a mechanism to read a file, add content, and then save it again. Am I missing something obvious? Thanks.
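If the library only generates feeds and does not read them back (an assumption here, not something confirmed in this thread), one way to append to an existing file is to edit the XML directly. The sketch below uses the standard library; `append_item` and the sample feed are hypothetical names and data.

```python
import xml.etree.ElementTree as ET


def append_item(rss_xml, title, link):
    """Parse an existing RSS document and append one <item> to its channel."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    return ET.tostring(root, encoding="unicode")


# Assumed sample of a previously saved feed.
existing = (
    "<rss version='2.0'><channel><title>Demo</title>"
    "<link>http://example.org</link><description>Demo feed</description>"
    "</channel></rss>"
)
updated = append_item(existing, "Second post", "http://example.org/2")
print(updated)
```

For a file on disk, the same idea applies with `ET.parse(path)` to read and `tree.write(path)` to save.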

fatal error when installing with pip

I'm trying to install feedgen on ubuntu 12.04 with

sudo pip install feedgen

but the compilation keeps terminating with the following error:

~/build/lxml/src/lxml/includes/etree_defs.h:14:31: fatal error: libxml/xmlversion.h: No such file or directory

'FeedEntry' object has no attribute 'source'

As the source attribute is part of the RSS 2.0 specification, please implement fe.source(...)

http://www.rssboard.org/rss-specification#ltsourcegtSubelementOfLtitemgt

Include tests in release tarball

Distributions that package Python modules would like the ability to verify that they actually work. Hence, it would be nice if the release tarballs included the test suite.

atom_str and rss_str return bytestrings when their docstrings indicate they return strings (Python3)

Under Python 3 atom_str and rss_str in feed.py both return a bytestring when their docstrings indicate they return strings. This seems to be because xml.etree.ElementTree.tostring returns a bytestring unless the encoding='unicode' argument is supplied. I did try passing in this parameter but it doesn't seem like lxml likes this: ValueError: Serialisation to unicode must not request an XML declaration
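The bytes-versus-text behaviour can be reproduced with the standard library alone. This is a small sketch of the distinction described above, not feedgen's code; the element names are arbitrary. Decoding the returned bytes is one way for a caller to get a `str`.

```python
import xml.etree.ElementTree as ET

feed = ET.Element("feed")
ET.SubElement(feed, "title").text = "Example"

# With a byte encoding, tostring() returns bytes under Python 3.
as_bytes = ET.tostring(feed, encoding="utf-8")

# Decoding yields text; this also works on bytes returned by other serializers.
as_text = as_bytes.decode("utf-8")

# The special encoding name "unicode" makes ElementTree return str directly.
direct_text = ET.tostring(feed, encoding="unicode")

print(type(as_bytes).__name__, type(as_text).__name__, type(direct_text).__name__)
```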

Dead Project?

I came across this library searching for something to help generate MRSS feeds, but it appears to be largely abandoned as there are PRs from last year with no movement.

Is development dead on it?

How can I add multiple lines into description

Hello, this is my first message here.

I'm trying to add multiple lines to the description but it's not working. I've tried with

, with <![CDATA[Hi Rss feed

Here is new line ]]>

But it's not showing multiple lines. Is this an issue with my reader, or am I doing something wrong?

Thank you so much.

pubDate is inconsistent between feed and entries

In the main feed class the method for the pubDate field is pubDate(); in the entry class, however, it's pubdate().

These should be consistent IMHO

Scraping web pages

I have seen the following:

$ python -m feedgen
Usage: python -m feedgen ( <file>.atom | atom | <file>.rss | rss | podcast )

  atom             -- Generate ATOM test output and print it to stdout.
  rss              -- Generate RSS test output and print it to stdout.
  <file>.atom      -- Generate ATOM test feed and write it to file.atom.
  <file>.rss       -- Generate RSS test teed and write it to file.rss.
  podcast          -- Generate Podcast test output and print it to stdout.
  dc.atom          -- Generate DC extension test output (atom format) and print it to stdout.
  dc.rss           -- Generate DC extension test output (rss format) and print it to stdout.
  syndication.atom -- Generate DC extension test output (atom format) and print it to stdout.
  syndication.rss  -- Generate DC extension test output (rss format) and print it to stdout.

I do not see a way to generate a feed from a web page (scraping), similar to how Feed Creator from FiveFilters.org does it: by submitting a URL, the content of a class, div, id etc. and/or the content of links.

Submitted parameters (
    [url] => 
    [in_id_or_class] => 
    [url_contains] => 
)

Note: user must submit either in_id_or_class or url_contains (or both) parameters, and a URL.

Example: python-feedgen --url= --attribute= --link=

Can't have entry-specific pubDate

It's always set to the feed's pubDate, which is less than optimal.

Improve "Add Feed Entries" example

Following your example for adding an entry is not enough to generate atom_str, because the entry lacks a link or content element:

ValueError: Entry must contain an alternate link or a content element.

Please mention this in the example, e.g.:

fe.link(href="http://example.org/somepath")

disable author name in parenthesis if author name missing or empty

The feed fails to render as RSS if the author 'name' key is missing from the dict (ValueError: Data contains not all required keys). However, as the docs point out, the author name isn't required in RSS, only the email address: Name is mandatory for ATOM, email is mandatory for RSS

Adding the name key but leaving it empty still results in parentheses being emitted: <author>[email protected] ()</author>

I understand the motivation behind requiring both, but not being able to disable the behaviour is annoying.

Only href gets rendered for links

python-feedgen/feedgen/entry.py

Line 139 in 97260ab

for link in self.__atom_link or []:

Due to this commit 966fea4

This is our self.__atom_link object

But because the link variable from the loop gets overwritten in this line, all the subsequent get() calls do nothing.

Changing the code back to this fixes our problem.

        for l in self.__atom_link or []:
            link = xml_elem('link', entry, href=l['href'])
            if l.get('rel'):
                link.attrib['rel'] = l['rel']
            if l.get('type'):
                link.attrib['type'] = l['type']
            if l.get('hreflang'):
                link.attrib['hreflang'] = l['hreflang']
            if l.get('title'):
                link.attrib['title'] = l['title']
            if l.get('length'):
                link.attrib['length'] = l['length']
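The effect of reusing the loop variable can be shown in isolation. The sketch below assumes the regression is the shadowing described above and uses the standard library's ElementTree instead of feedgen's `xml_elem` helper; the sample link data is made up.

```python
import xml.etree.ElementTree as ET

# Assumed sample input, shaped like the link dicts in the loop above.
links = [{"href": "http://example.org/a", "rel": "alternate", "type": "text/html"}]
OPTIONAL_KEYS = ("rel", "type", "hreflang", "title", "length")


def render_shadowed(link_dicts):
    entry = ET.Element("entry")
    for link in link_dicts:
        # Bug: `link` now names the new Element, not the source dict.
        link = ET.SubElement(entry, "link", href=link["href"])
        for key in OPTIONAL_KEYS:
            # Element.get() reads an XML attribute that was never set: None.
            if link.get(key):
                link.attrib[key] = link.get(key)
    return ET.tostring(entry, encoding="unicode")


def render_fixed(link_dicts):
    entry = ET.Element("entry")
    for source in link_dicts:
        # A distinct name keeps the source dict reachable.
        link = ET.SubElement(entry, "link", href=source["href"])
        for key in OPTIONAL_KEYS:
            if source.get(key):
                link.attrib[key] = source[key]
    return ET.tostring(entry, encoding="unicode")


shadowed = render_shadowed(links)
fixed = render_fixed(links)
print(shadowed)
print(fixed)
```

Only the second version carries `rel` and `type` into the output, which matches the "only href gets rendered" symptom.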

Undocumented / Unwanted? API change in Version 0.9

The latest published version (0.9) contains an unwanted? API change. The getter function summary() in FeedEntry is now returning a dict instead of a string.

python-feedgen/feedgen/entry.py

Line 462 in ffe3e4d

def summary(self, summary=None, type=None):

See the following example reading the Summary of an entry:

Expected behavior: Return Summary as String

Actual behavior: Summary is returned as dict -> {'summary': "content"}

Steps to reproduce:

Create test.py in current folder:

from feedgen.feed import FeedGenerator
from feedgen.version import version as feedgenversion
print('Feedgen Version: {}'.format(feedgenversion))
feed = FeedGenerator()
fe = feed.add_entry()
fe.summary('description')
print('Return type:     {}'.format(type(feed.entry()[0].summary())))
print("Summary:         {}".format(feed.entry()[0].summary()))

Execute the following command

pip3 install --user --upgrade feedgen==0.8 > /dev/null; python3 test.py; echo "----------"; pip3 install --user --upgrade feedgen==0.9 > /dev/null; python3 test.py

Receive the following output:

Feedgen Version: (0, 8, 0)
Return type:     <class 'str'>
Summary:         description
----------
Feedgen Version: (0, 9, 0)
Return type:     <class 'dict'>
Summary:         {'summary': 'description'}
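Until the getter's return type is settled, callers that must run against both versions could normalize the value themselves. This is a caller-side sketch based only on the two outputs shown above; `summary_text` is a hypothetical helper, not part of feedgen.

```python
def summary_text(value):
    """Return the summary as a plain string for either return shape.

    Handles the str returned by 0.8 and the {'summary': ...} dict returned by 0.9,
    as reported above.
    """
    if isinstance(value, dict):
        return value.get("summary")
    return value


print(summary_text("description"))
print(summary_text({"summary": "description"}))
```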

How to set title CDATA

I notice there is one method called content to generate description with CDATA quoted. However, I want to put the title into CDATA, can you do that?

Add support for extra values for rel attribute

We are constructing a paged Atom feed (https://tools.ietf.org/html/rfc5005) and would need to set the rel attribute to the values `prev-archive` and `next-archive`.

As per there are only 5 values allowed.

Is it possible to expand this list, ideally with the values found here: http://www.iana.org/assignments/link-relations/link-relations.xhtml?

If this is something you are willing to implement, I'd be happy to fork the project and make a pull request.

Is it possible to set custom timezone?

Hello, is it possible to configure feedgen so that it sets times in a different timezone? For example, I need the time to be formatted as GMT+3.
Or could you please give an example of how to properly set the published argument for an entry object?
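A fixed offset such as GMT+3 can be attached to a datetime with the standard library. The sketch below only builds the timezone-aware value; passing it to the entry's `published()` method is shown as a comment and assumes that feedgen accepts timezone-aware datetime objects there. The date itself is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# A fixed-offset timezone three hours ahead of UTC.
GMT_PLUS_3 = timezone(timedelta(hours=3))

# Timezone-aware timestamp; the concrete date is only an example.
published = datetime(2014, 12, 9, 14, 37, 23, tzinfo=GMT_PLUS_3)
print(published.isoformat())

# Assumed usage with a feedgen entry `fe` (not executed here):
# fe.published(published)
```

A named zone with daylight saving rules would need `zoneinfo` or a similar library instead of a fixed offset.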

Improper extension attribute creation

File "/Library/Python/2.7/site-packages/feedgen/feed.py", line 1055, in load_extension
    ext    = getattr(extmod, extname)
AttributeError: 'module' object has no attribute 'Podcast_entryExtension'

This is the error that is thrown. Based on the documentation, it looks like the attribute should be

'PodcastEntryExtension'

Called using

fg.load_extension('podcast_entry')

docs confusion

In https://github.com/lkiesow/python-feedgen#create-a-feed I'm confused about link appearing twice:

...
fg.link( href='http://example.com', rel='alternate' )
fg.logo('http://ex.com/logo.jpg')
fg.subtitle('This is a cool feed!')
fg.link( href='http://larskiesow.de/test.atom', rel='self' )
...

Is this another way to "set fields that can occur more than once?" in addition to the three bulleted points below that? I assumed so at first, but then was confused that the technique of calling the method again didn't appear in that list.

Fedora25 repo is missing

I was about to install via the COPR repo, but Fedora25 is explicitly missing:
https://copr.fedorainfracloud.org/coprs/lkiesow/python-feedgen/

There are packages for Rawhide, Fedora24 & Fedora26. Can this be fixed?

Adding description as summary fails to generate Atom

After updating to 0.9.0, generation of the Atom feed fails if the summary is created via description. Here is a minimal example:

from feedgen.feed import FeedGenerator

fg = FeedGenerator()
fg.id('http://lernfunk.de/media/654321')
fg.title('Some Testfeed')

fe = fg.add_entry()
fe.id('http://lernfunk.de/media/654321/1')
fe.title('The First Episode')
fe.link(href="http://lernfunk.de/feed")
fe.description("Some description", isSummary=True)

print(fg.atom_str(pretty=True).decode())

It fails with:

Traceback (most recent call last):
  File "feedfail.py", line 21, in <module>
    print(fg.atom_str(pretty=True).decode())
  File "/usr/lib/python3/dist-packages/feedgen/feed.py", line 222, in atom_str
    feed, doc = self._create_atom(extensions=extensions)
  File "/usr/lib/python3/dist-packages/feedgen/feed.py", line 198, in _create_atom
    entry = entry.atom_entry()
  File "/usr/lib/python3/dist-packages/feedgen/entry.py", line 152, in atom_entry
    _add_text_elm(entry, self.__atom_summary, 'summary')
  File "/usr/lib/python3/dist-packages/feedgen/entry.py", line 29, in _add_text_elm
    type_ = data.get('type')
AttributeError: 'str' object has no attribute 'get'

Adding as content (without isSummary=True) and adding it directly via summary method works as expected.

"ValueError: Required fields not set" when generating Atom feed

I'm setting the following fields on the FeedGenerator:

title
id
author
description
subtitle
language
link

and the following fields on the feed entries:

link
title
description
summary
author
id

The RSS feed generates fine, but the Atom feed gives the above error.

[request] Ability to remove entries

Add entry to existing feed

How do I perform the following action
fe = fg.add_entry()
for an existing feed XML file?

Say I created feed using the following code:

fg = FeedGenerator()
fg.id('http://lernfunk.de/media/654321')
fg.title('Some Testfeed')
fg.author({'name': 'John Doe', 'email': '[email protected]'})
fg.link(href='http://example.com', rel='alternate')
fg.logo('http://ex.com/logo.jpg')
fg.subtitle('This is a cool feed!')
fg.link(href='http://larskiesow.de/test.atom', rel='self')
fg.language('en')
fe = fg.add_entry()
fe.id('http://lernfunk.de/media/654321/1')
fe.link(href='http://example.com', rel='alternate')
fe.title('The First Episode')
atomfeed = fg.atom_str(pretty=True)
fg.atom_file(feed_local_feed_filepath)

Now that I have saved the XML file at a particular location, how do I add more entries to this existing XML file?

NameError: global name 'updated' is not defined

Using python-feedgen to create a RSS feed causes the following traceback:

lib/python2.7/site-packages/feedgen/feed.py", line 447, in lastBuildDate
return updated( lastBuildDate )
NameError: global name 'updated' is not defined

fg.register_extension('media', MediaExtension, MediaEntryExtension)

I'm trying to use the media ext from the mediarss branch, and for some reason I can't register it.

I have the file media.py in the root folder of my project.

  File "genRSS.py", line 6, in <module>
    fg.register_extension('media', MediaExtension, MediaEntryExtension)
NameError: name 'MediaExtension' is not defined

[Bug?] "Missing description" error when generating RSS feed

Hello.
This code works fine:

fg = FeedGenerator()
fg.id('http://example.com/')
fg.link(href='https://example.com', rel='self')
fg.title('My title')
fg.subtitle('My subtitle')

However when I remove subtitle, I get the following error: ValueError: Required fields not set (description)
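RSS 2.0 requires title, link, and description on the channel. The error above would be consistent with the description being filled from subtitle(), though that mapping is an assumption here. A pre-flight check like the following sketch can make the missing field visible before rendering; `missing_rss_fields` and the dicts are hypothetical and do not use feedgen.

```python
# Channel elements that RSS 2.0 marks as required.
REQUIRED_RSS_CHANNEL_FIELDS = ("title", "link", "description")


def missing_rss_fields(channel):
    """Return the required RSS channel fields that are absent or empty."""
    return [name for name in REQUIRED_RSS_CHANNEL_FIELDS if not channel.get(name)]


with_subtitle = {
    "title": "My title",
    "link": "https://example.com",
    "description": "My subtitle",  # assumed to come from subtitle()
}
without_subtitle = {"title": "My title", "link": "https://example.com"}

print(missing_rss_fields(with_subtitle))
print(missing_rss_fields(without_subtitle))
```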

Unable to set categories

I'm working on a script that reverses RSS feeds https://github.com/steinarb/feedreverser

I'm trying to reverse a WordPress feed so that I can use feediverse to post the feed entries in chronological order.

I parse the wordpress feed with feedparser 5.2.1
I'm using python-feedgen 0.9.0 to output the reversed feed.

Categories don't survive in the feed reversal.

This is what I do to set the categories:
https://github.com/steinarb/feedreverser/blob/28f53647bf78ee7015fdbcee52dd0bb8c437d496/feedreverser.py#L26

I've printed out the categories and they look OK to me.

Reading an RSS feed, the categories from feedparser look like this:

[{'term': 'Emacs', 'scheme': None, 'label': None}, {'term': 'editor', 'scheme': None, 'label': None}, {'term': 'emacs', 'scheme': None, 'label': None}, {'term': 'extension', 'scheme': None, 'label': None}, {'term': 'lisp', 'scheme': None, 'label': None}, {'term': 'programming', 'scheme': None, 'label': None}]

The resulting RSS entry produced by python-feedgen looks like this

<item><title>Installing debian “squeeze” with PXE boot on a Samsung N145 Plus netbook</title><description>Introduction This article describes the steps necessary to install debian 6 &amp;#8220;squeeze&amp;#8221; on a Samsung N145 Plus netbook, with the following specification: Intel Atom processor 10.1&amp;#8243; display 1GB RAM 340GB HDD Windows 7 preinstalled Setting up netboot of the debian installer DHCP requests in my home LAN network is provided by dnsmasq on a desktop PC &amp;#8230; &lt;a href="https://steinar.bang.priv.no/2012/06/11/installing-debian-squeeze-with-pxe-boot-on-a-samsung-n145-plus-netbook/" class="more-link"&gt;Continue reading &lt;span class="screen-reader-text"&gt;Installing debian &amp;#8220;squeeze&amp;#8221; with PXE boot on a Samsung N145 Plus netbook&lt;/span&gt; &lt;span class="meta-nav"&gt;&amp;#8594;&lt;/span&gt;&lt;/a&gt;</description><guid isPermaLink="false">http://steinar.bang.priv.no/?p=63</guid><category/><category/><category/><category/><category/><category/><category/><category/><category/><category/><category/><pubDate>Sat, 13 Jun 2020 11:02:56 +0000</pubDate></item>

Reading an Atom feed, the categories from feedparser look like this:

[{'term': 'Emacs', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'editor', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'emacs', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'extension', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'lisp', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'programming', 'scheme': 'https://steinar.bang.priv.no', 'label': None}]

The resulting RSS entry produced by python-feedgen looks like this

<item><title>Emacs and lisp</title><description>Introduction The Emacs text editor uses lisp as an extension language. This article will attempt to explain enough lisp to do basic emacs customization, to someone who knows imperative programming languages. Evaluating lisp Lisp consists of balanced pairs of parantheses, filled with tokens, separated by space, eg. like this: (somefun1 1 2 3 "four") (somefun2 &amp;#8230; &lt;a href="https://steinar.bang.priv.no/2012/05/03/emacs-and-lisp/" class="more-link"&gt;Continue reading &lt;span class="screen-reader-text"&gt;Emacs and lisp&lt;/span&gt; &lt;span class="meta-nav"&gt;&amp;#8594;&lt;/span&gt;&lt;/a&gt;</description><guid isPermaLink="false">http://steinar.bang.priv.no/?p=23</guid><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><pubDate>Sat, 13 Jun 2020 11:20:42 +0000</pubDate></item>

ImportError: No module named lxml

>>> from feedgen.feed import FeedGenerator
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "feedgen/feed.py", line 12, in <module>
    from lxml import etree
ImportError: No module named lxml

If someone knows what I am doing wrong, please let me know.

'deprected' misspelled

The word 'deprecated' is misspelled (as 'deprected') in both the docs and the module itself:

pubdate(…) is deprected and may be removed in feedgen ≥ 0.8. Use pubDate(…) instead.

guid permalink for rss

There should be a way to set the isPermaLink attribute in <guid isPermaLink="false"> to true for provided guids,
e.g. fg.id("http://domain.com/guid", permalink=True)

Image Tag not Displayed in RSS Feed

Summary: Attempting to write a commandline podcast generator. Everything seems to be working except for the image for RSS channel.

Problem: When calling the following functions on a feed fg that has yet to be rendered:

fg = FeedGenerator()
fg.load_extension('podcast')
#...
#fg configuration
#...
fg.logo( logo=logo_url )
fg.image( url=logo_url, title=title_string )
fg.podcast.itunes_image( logo_url )

Neither an RSS <image> tag nor an <itunes:image> tag comes out in the channel metadata.

Development Environment:

feedgen 0.2.8
Python 2.7.3
Ubuntu 12.04.4 LTS 64-bit

Add image in item entry.

Hello,
Can you explain how to add an image to an item?
I can't find any method for that in the FeedEntry class.

Please give a valid example of using the media extension

Hi,
I've been trying for hours, but nothing I do successfully loads the media example.

My code is essentially:
feed = FeedGenerator()
feed.load_extension('media')
feed.id('https://www.bomengids.nl/mastodon-homefeed-rss')  # give the id you want this rss feed to be for
feed.title('mastodon (re)-posts')
statuses = enriched_list
for status in statuses:
    created = status['published']
    item = feed.add_entry()
    item.id(url)
    item.media.media_content({"url" : media_url, "type": mediatype})

it then says:
item.media.media_content({"url" : media_url, "type": mediatype})
AttributeError: 'MediaEntryExtension' object has no attribute 'media_content'

What am I doing wrong?

Add typehints

Is the project dead?

If not, at least basic type hints would be desirable, as using a type checker is becoming more common and hints generally make libraries more usable.

basestring type doesn't exist in Python 3

In several places feedgen uses isinstance(foo, basestring) to guess whether the argument is a string or not, for instance in the pubDate method https://github.com/lkiesow/python-feedgen/blob/master/feedgen/feed.py#L833.

I think it's more pythonic to use try/except instead of isinstance.
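A common compatibility shim applies the try/except idea once, at import time, so the rest of the code can keep a single string check. This is a generic sketch, not code from feedgen; `string_types` and `is_string` are illustrative names.

```python
try:
    # Python 2: basestring covers both str and unicode.
    string_types = (basestring,)  # noqa: F821 (undefined on Python 3)
except NameError:
    # Python 3: basestring is gone and str is the only text type.
    string_types = (str,)


def is_string(value):
    """Report whether value is a text string on either Python version."""
    return isinstance(value, string_types)


print(is_string("2014-12-09"), is_string(42))
```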

What's the best way to determine subscriber counts?

Making "generator" element truly optional

It would be helpful if there were an option to exclude the generator element when creating the feed. It's currently possible to create an empty generator element, but not omit it entirely. I have no problem that the default behavior is to name feedgen as the generator; sometimes I just need to omit that element entirely for compatibility purposes.

Thanks!

Possible debug leftover

I think there is a debug leftover on line 384 in commit
4a7e7ad#L0R384

"Health & Fitness" incorrect &

When using the iTunes category Health & Fitness, the & symbol is replaced with &amp; in the output string


XML doctypes and stylesheet support

Right now using atom_file() will not generate any XML doctypes, i.e.

<?xml blabla ?>

It would be nice to have this as well as an optional setting to have a <?xml-stylesheet ?> linked.
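As a stopgap, the declaration and a stylesheet processing instruction can be prepended when serializing by hand. The sketch below uses the standard library rather than any feedgen option; `with_stylesheet` and `style.xsl` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def with_stylesheet(root, href):
    """Serialize root with an XML declaration and an xml-stylesheet instruction."""
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    # quoteattr() adds the surrounding quotes and escapes special characters.
    stylesheet = '<?xml-stylesheet type="text/xsl" href=%s?>\n' % quoteattr(href)
    body = ET.tostring(root, encoding="unicode")
    return declaration + stylesheet + body


feed = ET.Element("feed")
ET.SubElement(feed, "title").text = "Example"
document = with_stylesheet(feed, "style.xsl")
print(document)
```

The result is a `str`, so it should be encoded as UTF-8 when written to disk to match the declaration.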

AttributeError: 'PodcastExtension' object has no attribute 'itunes_duration'

I have the following code for rss feed generator:


import sys
from feedgen.feed import FeedGenerator


def generator():
    fg = FeedGenerator()
    fg.load_extension('podcast')
    fg.title('xyz')
    fg.link(href='https://something.coml', rel='alternate')
    fg.language('en')

    fg.podcast.itunes_category({"cat":"Leisure", "sub":"Games"})

    fe = fg.add_entry()
    fe = fg.podcast.itunes_duration('123')
    fe.title('The First Episode')

    fg.rss_str(pretty=True)
    fg.rss_file('podcast.xml')

generator()

Also, I deleted categories that are irrelevant to the problem.
For some reason, I get an AttributeError. I have literally no idea why this happens, the library is imported properly.

WebSub Support?

Hi there,

Thanks for such a nice feedgen. We are trying to add the link for WebSub with fg.link(rel='hub', href="https://pubsubhubbub.appspot.com/"), but it seems to be added only as https://pubsubhubbub.appspot.com/

I would appreciate any guidance on this.

Regards,

New release

Hi, is it possible to create a new release and publish it on PyPI?

We would need this fix.

Missing iTunes tags

Hello Lars,

There are some changes in the iTunes RSS feed specification. According to those changes, some tags are missing in the Podcast extension:

channel/itunes:type
item/itunes:title
item/itunes:episodeType
item/itunes:episode
item/content:encoded

Timezone format not working

fg.lastBuildDate(write_date.strftime(timezone_format))

The issue is that this default timezone format gets used:
https://github.com/lkiesow/python-feedgen/blob/master/feedgen/feed.py#L315

I want this: Tue, 09 December 2014 11:37:23 +0000
but
I got this: Tue, 09 Dec 2014 11:37:23 +0000

TRACEBACK
  File "/usr/lib/python2.7/site-packages/feedgen/feed.py", line 462, in lastBuildDate
    return self.updated( lastBuildDate )
  File "/usr/lib/python2.7/site-packages/feedgen/feed.py", line 435, in updated
    updated = dateutil.parser.parse(updated)
  File "/usr/lib/python2.7/site-packages/dateutil/parser.py", line 744, in parse
    timestr = timestr.decode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 9: ordinal not in range(128)

timezone_format = '%a, %d %b %Y %H:%M:%S %z'
can return non-ASCII characters because it is localized. On my computer (in French), %b returns Déc for "December".
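One way to sidestep the locale problem is to avoid `strftime` for this format. The standard library's `email.utils.format_datetime` always emits English day and month abbreviations. The sketch below uses the date from the report; `write_date` mirrors the variable name above, and its value is assumed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Timezone-aware timestamp matching the example in the report.
write_date = datetime(2014, 12, 9, 11, 37, 23, tzinfo=timezone.utc)

# RFC 2822 style date string, independent of the process locale.
rfc822 = format_datetime(write_date)
print(rfc822)
```

Passing the aware datetime object itself instead of a pre-formatted string may avoid the string parsing step seen in the traceback, assuming the setter accepts datetime objects.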

Non-Zero Integer for enclosure length causes type error

Attached is a file that causes a type error when rss_str is called.

Traceback (most recent call last):

  File "c:\Users\DavidMck\Documents\feedgen-test1.py", line 14, in <module>
    fg.rss_str(pretty=True)
  File "C:\Program Files\Python38\lib\site-packages\feedgen\feed.py", line 398, in rss_str
    feed, doc = self._create_rss(extensions=extensions)
  File "C:\Program Files\Python38\lib\site-packages\feedgen\feed.py", line 374, in _create_rss
    item = entry.rss_entry()
  File "C:\Program Files\Python38\lib\site-packages\feedgen\entry.py", line 247, in rss_entry
    enclosure.attrib['length'] = self.__rss_enclosure['length']
  File "src\lxml\etree.pyx", line 2429, in lxml.etree._Attrib.__setitem__
  File "src\lxml\apihelpers.pxi", line 593, in lxml.etree._setAttributeValue
  File "src\lxml\apihelpers.pxi", line 1538, in lxml.etree._utf8
TypeError: Argument must be bytes or unicode, got 'int'

The problem is this call to enclosure -
fe.enclosure('http://lernfunk.de/media/654321/1/file.mp3', 100, 'audio/mpeg')
if 100 is passed as a string - "100" - the error does not occur. The call in lxml.etree expects the value to be a string.

feedgen-test1.py.txt

Certainly one could change the call to enclosure, but it seems that ideally a "length" variable would allow an integer value.
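Until integer lengths are accepted, a small caller-side conversion avoids the TypeError. The sketch below is independent of feedgen. `enclosure_length` is a hypothetical helper, and the ElementTree part only shows that the attribute serializes once it is a string.

```python
import xml.etree.ElementTree as ET


def enclosure_length(value):
    """Validate a byte length and return it as the string XML attributes need."""
    length = int(value)
    if length < 0:
        raise ValueError("enclosure length must not be negative")
    return str(length)


enclosure = ET.Element(
    "enclosure",
    url="http://lernfunk.de/media/654321/1/file.mp3",
    type="audio/mpeg",
)
enclosure.set("length", enclosure_length(100))
xml = ET.tostring(enclosure, encoding="unicode")
print(xml)

# Assumed usage with a feedgen entry `fe` (not executed here):
# fe.enclosure('http://lernfunk.de/media/654321/1/file.mp3', enclosure_length(100), 'audio/mpeg')
```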
