mibe / feedwriter
PHP Universal Feed Generator
Home Page: http://ajaxray.com/blog/php-universal-feed-generator-supports-rss-10-rss-20-and-atom
License: GNU General Public License v3.0
Hello,
Passing an invalid version to the constructor results in an invalid feed. Adding this to the top of the __construct() method makes it default to RSS2 if the version is invalid.
//Ensure the version is valid, if not, default to RSS2
$this->version = $version;
if ($this->version != RSS1 && $this->version != RSS2 && $this->version != ATOM) {
    $this->version = RSS2;
    $version = RSS2;
}
Kind regards,
Scott
P.S. Great work :)
When calling the addGenerator() method in ATOM feeds, the <generator> element should contain the URI to this GitHub project as an XML attribute:
https://github.com/mibe/FeedWriter/blob/master/Feed.php#L138
This is not working because there is currently no way to add channel elements which contain attributes (besides the special atom:link element). See #18 for that problem.
The XML spec. clearly states that these chars have to be escaped, and FeedWriter does that. But the RSS Best Practices recommends using the hexadecimal character reference:
A publisher should encode "&" and "<" in plain text using hexadecimal character references. When encoding the ">" character, a publisher should use the hexadecimal reference &#x3E;.
This maximizes the compatibility with clients.
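A minimal sketch of the encoding that recommendation describes, assuming a hypothetical helper named encodeHexReferences() (it is not part of FeedWriter):

```php
<?php
/**
 * Hypothetical helper: replaces the three markup-significant characters
 * with hexadecimal character references instead of named entities.
 */
function encodeHexReferences($text)
{
    // strtr() with an array scans the input once, so the "&" inside an
    // already inserted reference is never encoded a second time.
    return strtr($text, array(
        '&' => '&#x26;',
        '<' => '&#x3C;',
        '>' => '&#x3E;',
    ));
}

echo encodeHexReferences('AT&T <b>rocks</b>'), "\n";
// prints: AT&#x26;T &#x3C;b&#x3E;rocks&#x3C;/b&#x3E;
```

Input that already contains references would be encoded again, so such a helper should only ever see raw text.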
When I put these lines into a function:
include '../addons/rss/Item.php';
include '../addons/rss/Feed.php';
include '../addons/rss/RSS2.php';
date_default_timezone_set('UTC');
use \FeedWriter\RSS2;
I get an Internal Server Error (500). If I remove the "use" line, it works fine. I have a suitable version of PHP.
When I validate a feed output by FeedWriter.php, I get this error: Missing atom:link with rel="self"
The docs on the issue can be found here: http://feed2.w3.org/docs/warning/MissingAtomSelfLink.html
I was able to fix the issue by adding xmlns:atom="http://www.w3.org/2005/Atom" to the $out .= on line 283, then adding the atom:link with rel="self" inside the channel on line 386:
echo '<atom:link href="http://example.com/feeds/myfeed.rss" rel="self" type="application/rss+xml" />';
Would love to see this added to core. Thanks :)
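The two changes described above can be sketched in isolation. The helper name buildAtomSelfLink() is hypothetical, and the fragment below is not a complete feed (title, link and description are left out):

```php
<?php
/**
 * Hypothetical helper: builds the <atom:link rel="self"> element that the
 * validator asks for. The feed URL is escaped for use in an attribute.
 */
function buildAtomSelfLink($feedUrl)
{
    $href = htmlspecialchars($feedUrl, ENT_QUOTES, 'UTF-8');
    return '<atom:link href="' . $href . '" rel="self" type="application/rss+xml" />';
}

// The atom prefix has to be declared on the root element, otherwise the
// atom:link element uses an undeclared namespace prefix.
$out  = '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">' . PHP_EOL;
$out .= '<channel>' . PHP_EOL;
$out .= buildAtomSelfLink('http://example.com/feeds/myfeed.rss') . PHP_EOL;
$out .= '</channel>' . PHP_EOL . '</rss>' . PHP_EOL;
echo $out;
```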
I think it is time to replace the internal methods makeHeader(), makeChannels(), makeItems(), makeFooter() and makeNode() with PHP's own functions to generate the XML output. The current methods have some quirks like the two described in #23 and #18 (last comment).
There are plenty of functions and classes to choose from. SimpleXML, XMLWriter or even the full DOM classes look promising. However, these classes are / were extensions and not always available on every PHP installation. This depends heavily on the server configuration. And that's the advantage of having the XML assembled in the library's own methods: it's very unlikely that plain string functions are deactivated. 😉
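For comparison, a rough sketch of what generating a channel with XMLWriter could look like. It assumes the xmlwriter extension is loaded, and the function name buildMinimalRss2() is made up for illustration:

```php
<?php
// Sketch only: requires the xmlwriter extension.
function buildMinimalRss2($title, $link, $description)
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->setIndent(true);
    $writer->startDocument('1.0', 'UTF-8');

    $writer->startElement('rss');
    $writer->writeAttribute('version', '2.0');

    $writer->startElement('channel');
    // writeElement() escapes markup characters in the text content.
    $writer->writeElement('title', $title);
    $writer->writeElement('link', $link);
    $writer->writeElement('description', $description);
    $writer->endElement(); // </channel>

    $writer->endElement(); // </rss>
    $writer->endDocument();

    return $writer->outputMemory();
}

echo buildMinimalRss2('News & Updates', 'http://example.com/?a=1&b=2', 'Example feed');
```

Escaping and tag balancing are handled by the extension here, which is the kind of quirk the hand-written methods have to get right themselves.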
Is 0.1 or 1.0 coming anytime soon? Also, can we have a branch alias in the composer.json too?
I'm wondering whether this could be left to the developer to decide?
Besides that, this causes a problem with items that are links. That is, if the link URL contains ampersands in the query string part (e.g.: http://example.com/?dont=borgat&that=join&my=friend), then the output URL will not match the original.
<atom:link> is a magic tag in the sense that it is the only tag created by Feed->setChannelElement() that is rendered with attributes instead of having its $content passed into it as sub elements...
eg: https://github.com/mibe/FeedWriter/blob/master/Feed.php#L719
I require the ability to set attributes for namespaced elements such as <itunes:image href="link/to/my/feed/image.jpg"/> as well as the ability to make the element self closing...
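To make the request concrete, here is a small sketch of rendering an element with attributes and an optional self-closing form. renderElement() is a hypothetical helper, not FeedWriter's API:

```php
<?php
/**
 * Hypothetical helper: renders one element with optional attributes.
 * A null $content yields a self-closing tag.
 */
function renderElement($tagName, array $attributes = array(), $content = null)
{
    $out = '<' . $tagName;
    foreach ($attributes as $name => $value) {
        $out .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
    }
    if ($content === null) {
        return $out . '/>';
    }
    return $out . '>' . htmlspecialchars($content, ENT_NOQUOTES, 'UTF-8') . '</' . $tagName . '>';
}

echo renderElement('itunes:image', array('href' => 'link/to/my/feed/image.jpg')), "\n";
// prints: <itunes:image href="link/to/my/feed/image.jpg"/>
```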
Pull request #23 introduced a filter to remove characters which are invalid in the XML context. This is implemented with a regular expression replace operation, which is done by the PCRE library.
The problem here is that the result of that operation is not checked. The preg_replace() function returns NULL in an error condition. NULL is then cast to a string, which results in an empty string.
This behaviour was first noticed by @NeoCsatornaja in issue #28 by setting the feed encoding to ISO-8859-2 and supplying data with this encoding.
The best solution is probably to use the regular expression functions from the Multibyte String extension mbstring. The problem is that this extension is not enabled by default, which would make FeedWriter incompatible with installations lacking it. I don't know how common this is, but I could imagine this is the case on cheap shared web hosts.
So a compatible solution is IMHO to check whether preg_replace() failed and, in that case, use a regular expression without multibyte chars.
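A minimal sketch of that fallback, assuming a hypothetical function name and a byte-oriented pattern of my own choosing that only strips the ASCII control characters forbidden in XML 1.0:

```php
<?php
/**
 * Hypothetical helper: strips characters that are not allowed in XML.
 * If the UTF-8 aware pattern fails (preg_replace() returns NULL, e.g. on
 * input that is not valid UTF-8), fall back to a byte-oriented pattern.
 */
function filterInvalidXmlChars($string, $replacement = '_')
{
    $utf8Pattern = '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u';
    $result = preg_replace($utf8Pattern, $replacement, $string);

    if ($result === null) {
        // No /u modifier: the subject is treated as raw bytes.
        $bytePattern = '/[\x00-\x08\x0B\x0C\x0E-\x1F]+/';
        $result = preg_replace($bytePattern, $replacement, $string);
    }

    return $result;
}

// ISO-8859-2 bytes taken from the reproduction case; not valid UTF-8.
$latin2 = "\x54\x65\x73\x74\x09\xc1\xe9\x75\xc3";
var_dump(filterInvalidXmlChars($latin2) === $latin2); // bool(true)
```

With this check the ISO-8859-2 input passes through unchanged instead of collapsing into an empty string.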
If you want to reproduce the problem by yourself, here's the code:
header("Content-Type: text/plain");
$string = "\x54\x65\x73\x74\x09\xc1\xe9\x75\xc3";
mb_regex_encoding('UTF-8');
$after = mb_ereg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '_', $string);
var_dump($after);
$after = preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '_', $string);
var_dump($after);
var_dump(preg_last_error());
Result is
string(9) "Test ÁéuÃ"
NULL
int(4)
As you can see the regex is identical, but preg_replace exited with a PREG_BAD_UTF8_ERROR error.
It would be useful to add more xmlns declarations. I suggest changing this
if($this->version == Feed::RSS2) {
    $out .= '<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/">';
}
to become
// Note the trailing space after version="2.0": without it the first xmlns attribute runs into the version attribute.
$out .= '<rss version="2.0" ';
$out .= 'xmlns:content="http://purl.org/rss/1.0/modules/content/" ';
$out .= 'xmlns:wfw="http://wellformedweb.org/CommentAPI/" ';
$out .= 'xmlns:dc="http://purl.org/dc/elements/1.1/" ';
$out .= 'xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">';
While researching for PR #33 I noticed that for ATOM IDs the code is using the urn URI scheme. This is not wrong per se, but IMHO the tag scheme is better. I've checked the ATOM feeds of some big sites, and all were using the tag scheme. It doesn't need a registration at IANA and also contains more information than a simple UUID.
Here are some opinions and information:
What do you guys think about that?
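For reference, a tag URI as defined in RFC 4151 has the form tag:authority,date:specific. A small sketch of how such an ID could be assembled; the function name and the example values are assumptions, not FeedWriter code:

```php
<?php
/**
 * Hypothetical helper: builds a tag URI (RFC 4151) of the form
 *   tag:<authority>,<date>:<specific>
 * $authority is a domain name or e-mail address the publisher controlled
 * on $date; $specific is any string unique within that authority.
 */
function buildTagUri($authority, $date, $specific)
{
    // RFC 4151 allows YYYY, YYYY-MM or YYYY-MM-DD for the date part.
    if (!preg_match('/^\d{4}(-\d{2}(-\d{2})?)?$/', $date)) {
        throw new InvalidArgumentException('Date must be YYYY, YYYY-MM or YYYY-MM-DD.');
    }
    return 'tag:' . $authority . ',' . $date . ':' . $specific;
}

echo buildTagUri('example.com', '2013-05-01', '/blog/first-post'), "\n";
// prints: tag:example.com,2013-05-01:/blog/first-post
```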
Is it possible to set an image to an item?
Something like setImage() is missing!
Is there interest for the sfeed format, a TAB-separated, plaintext output?
The current method of error handling is to terminate the script by calling die(). This isn't really good.
But since we have method chaining, returning TRUE / FALSE is not possible. Better possibilities would be:
trigger_error()?
Just expanding on the previous patch. A better idea would be to remove from the writer class the responsibility of validating the format. So in the class below I have moved that responsibility to a new format class, which can then make use of type hinting to enforce that the correct type is given to the constructor of the writer class.
Not tested this code and it's almost 1am for me, but it's just to show you a rough idea of how it would work and how it would make adding new formats a little easier in the future. Hope it helps or at least gives you more ideas.
/**
 * use this class to enforce that a valid instance of feed_format is passed to the feed_writer
 * this removes the responsibility of the feed_writer checking the feed format and encapsulates the formats
 * might want to consider refactoring some other features out of the single writer class and use inheritance
 * to give each format specific features (i.e. feed_writer = abstract class, feed_writer_rss1 +
 * feed_writer_atom + feed_writer_rss2 all extend feed_writer making it easy to add future feed formats) -
 * but one step at a time ;)
 */
class feed_format {
    /**
     * formats available
     */
    const ATOM = 'ATOM';
    const RSS1 = 'RSS 1';
    const RSS2 = 'RSS 2';

    /**
     * default format to use if given an invalid format on construct
     * (named DEFAULT_FORMAT because DEFAULT is a reserved word before PHP 7)
     */
    const DEFAULT_FORMAT = 'RSS 2';

    /**
     * currently selected format
     */
    private $format;

    /**
     * what format to request
     */
    public function __construct($format) {
        //uppercase so it will still match the constants used
        $this->format = strtoupper($format);

        //check it's valid, else use default
        switch ($this->format) {
            case self::ATOM:
            case self::RSS1:
            case self::RSS2:
                break;
            default:
                trigger_error("Incorrect feed format used. Defaulting to: " . self::DEFAULT_FORMAT, E_USER_NOTICE);
                $this->format = self::DEFAULT_FORMAT;
        }
    }

    /**
     * returns the currently selected format so the feed_writer class can ask questions about its type via methods
     */
    public function get_format() {
        return $this->format;
    }
}

/**
 * writer class construct signature
 */
class feed_writer {
    /**
     * now you can't give it an incorrect format as it will only accept an instance of feed_format
     */
    public function __construct(feed_format $format) {
        //...code here
    }
}

//example (the class defines RSS1 and RSS2, there is no RSS constant)
$feed = new feed_writer(new feed_format(feed_format::RSS2));
Just a thought.
Kind regards,
Scott
Hi!
Is it possible to set attributes for a content element in the channel in RSS2? I'd like to generate something like this:
<itunes:category text="Technology">
<itunes:category text="Tech News"/>
</itunes:category>
But I cannot figure out how to pass the attributes to the inner element (if it is possible at all).
Thank you
Each feed type should be a separate class, instead of checking all the time what type of feed it is:
if($type == 'RSS 1')
...
elseif($type == 'RSS 2')
elseif($type == 'Atom')
This is not the OOP way.
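A rough sketch of the class-per-format idea. All class and method names below are hypothetical and only illustrate the shape, not FeedWriter's actual layout:

```php
<?php
// Sketch with made-up class names.
abstract class FeedSketch
{
    // Each format knows its own root element, so no type checks are needed.
    abstract protected function openingTag();
    abstract protected function closingTag();

    public function render()
    {
        return $this->openingTag() . $this->closingTag();
    }
}

class Rss2Sketch extends FeedSketch
{
    protected function openingTag() { return '<rss version="2.0">'; }
    protected function closingTag() { return '</rss>'; }
}

class AtomSketch extends FeedSketch
{
    protected function openingTag() { return '<feed xmlns="http://www.w3.org/2005/Atom">'; }
    protected function closingTag() { return '</feed>'; }
}

// The caller picks a class once; render() never branches on the feed type.
foreach (array(new Rss2Sketch(), new AtomSketch()) as $feed) {
    echo $feed->render(), "\n";
}
```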
Howdy. I was thinking about forking this and making a couple of key changes:
\VendorName\Namespace\ClassName convention (obviously requires PHP 5.3+)
I'd be happy to issue a pull request, and do the work with that in mind, if there's interest in having those changes on this project. Otherwise I'd probably make it a subset of sparkfun/SparkLib.
If you would like a pull request, what should I use for "vendor name"?
Thanks!
There is no built in way to add images to the items
The Atom standard supports more than plain text in the elements' titles when the type is set to html instead of the default text.
I don't know whether RSS or RSS2 supports HTML elements in titles.
Would FeedWriter be interested in supporting this? I could work on it.
Hi,
Would you mind publishing FeedWriter on Packagist? It’s the main Composer repository, and allows one to install your lib using:
composer require mibe/feedwriter
I wanted to generate an ATOM feed with the library.
I checked the documentation here :
https://mibe.github.io/FeedWriter/namespaces/FeedWriter.html
But when clicking on ATOM, I got a 404 error. Can you fix it so that I can learn how to generate an ATOM feed?
Thank you in advance.