pelican-plugins / seo
Pelican plugin to improve search engine optimization (SEO)
Regarding Structured Data / Article Schema: could it be useful to add a :figure: field, for sites where :figure: is used instead of :image:?
For example, the following settings should probably be prefixed with SEO_[…] in order to prevent potential collisions with other plugins.
ARTICLES_LIMIT = 10
PAGES_LIMIT = 10
I really love the project. But when I first started using it, I had problems getting it to work. I tried pip installing and it seemed like nothing happened since I couldn't import the plugin (maybe I used the wrong alias?). I ended up git cloning the repo into my plugins folder and then it finally worked. It would be very helpful for future new users if there is a better step-by-step guide on how to use it.
Is there a need to create the metadata field description while there is already summary in Pelican?
In Pelican's docs, the description of summary is "Brief description of content for index pages".
See #33
https://github.com/pelican-plugins/seo/blob/master/pelican/plugins/seo/seo_enhancer/__init__.py#L37 should not call the CanonicalURLCreator.create_url() method in the case of an external canonical, as the URL is already built.
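The requested behavior could look roughly like the sketch below. The helper name resolve_canonical and its parameters are hypothetical and are not taken from the plugin; the only point illustrated is that an external canonical is returned untouched while an internal one is built from the site URL.

```python
from urllib.parse import urljoin


def resolve_canonical(siteurl, fileurl, external_canonical=None):
    """Return the canonical URL for a page (hypothetical helper, not plugin code)."""
    if external_canonical:
        # The author supplied a full URL already: use it verbatim, build nothing.
        return external_canonical
    # Otherwise build the URL from the site root and the page's relative URL.
    return urljoin(siteurl.rstrip("/") + "/", fileurl)


print(resolve_canonical("https://example.com", "posts/a.html"))
print(resolve_canonical("https://example.com", "posts/a.html", "https://other.org/x"))
```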
Given SITEURL = 'mysite.com/blog' in the Pelican settings file and a fileurl of 'posts/my_article/index.html', the _create_absolute_fileurl method in open_graph.py will return a file_url missing the SITEURL's subdirectory.
This is due to the way that urllib's parse.urljoin works.
My quick workaround for now is to append a slash to the siteurl. If it's redundant, it will be stripped out by urljoin:
def __init__(
    self, siteurl, fileurl, file_type, title, description, image, locale
) -> None:
    self.siteurl = siteurl + "/"
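The urljoin behavior described above can be reproduced in isolation. This is a minimal sketch; the https:// scheme is an assumption added for the example, since the SITEURL quoted in the report has none.

```python
from urllib.parse import urljoin

fileurl = "posts/my_article/index.html"

# Without a trailing slash, "blog" is treated as a file name and replaced.
without_slash = urljoin("https://mysite.com/blog", fileurl)

# With a trailing slash, "blog/" is treated as a directory and kept.
with_slash = urljoin("https://mysite.com/blog/", fileurl)

# A redundant extra slash is collapsed when the path is re-joined.
double_slash = urljoin("https://mysite.com/blog//", fileurl)

print(without_slash)  # https://mysite.com/posts/my_article/index.html
print(with_slash)     # https://mysite.com/blog/posts/my_article/index.html
print(double_slash)   # https://mysite.com/blog/posts/my_article/index.html
```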
See #33
Location: https://github.com/pelican-plugins/seo/blob/master/pelican/plugins/seo/tests/test_seo_enhancer.py
The test should be parametrized and test the HTML generation for the canonical feature in the following cases:
- external_canonical
- save_as
- external_canonical AND save_as
Salut @MaevaBrunelles !
Clearly I caught this in the works. But here is a proposal.
Could we add <meta> tags to improve search engine optimization? See examples below!
For further reference, I think jekyll does this pretty well.
I'm rebuilding my website and coding up some jinja logic to get this working. Can share some code 😄
IMO these are essential for great SEO results!
Thanks for the great work!
Some examples:
<meta name="twitter:card" content="summary">
<meta name="twitter:site" content="@site_account">
<meta name="twitter:creator" content="@individual_account">
<meta name="twitter:url" content="https://example.com/page.html">
<meta name="twitter:title" content="Content Title">
<meta name="twitter:description" content="Content description less than 200 characters">
<meta name="twitter:image" content="https://example.com/image.jpg">
<meta name="twitter:image:alt" content="A text description of the image conveying the essential nature of an image to users who are visually impaired. Maximum 420 characters.">
<meta property="fb:app_id" content="123456789">
<meta property="og:url" content="https://example.com/page.html">
<meta property="og:type" content="website">
<meta property="og:title" content="Content Title">
<meta property="og:image" content="https://example.com/image.jpg">
<meta property="og:image:alt" content="A description of what is in the image (not a caption)">
<meta property="og:description" content="Description Here">
<meta property="og:site_name" content="Site Name">
<meta property="og:locale" content="en_US">
<meta property="article:author" content="">
Perhaps google verification as well?
<meta name="google-site-verification" content="your verification string">
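As a rough illustration of how such tags could be generated, here is a small sketch. The function render_meta_tags is hypothetical and not part of the plugin; it only shows that Twitter tags use the name attribute, Open Graph tags use property, and that values should be escaped.

```python
from html import escape


def render_meta_tags(tags):
    """Render (attribute, key, content) triples as <meta> elements (hypothetical helper)."""
    lines = []
    for attr, key, content in tags:
        lines.append(
            f'<meta {attr}="{escape(key, quote=True)}" '
            f'content="{escape(content, quote=True)}">'
        )
    return "\n".join(lines)


tags = [
    ("name", "twitter:card", "summary"),
    ("property", "og:title", 'Content "Title"'),
]
rendered = render_meta_tags(tags)
print(rendered)
```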
Hi!
I am using this plugin on one of my websites, and the Google Console reported an error related to the Schema-based structured data. At first I thought it might be related to this note in the README:
Note that schemas generated by default are compliant with Schema.org but not (by default) Google-compliant.
However, after checking the console and then the generated HTML files, I discovered that the structured data is being added to existing <script> tags, and that two new <script type="application/ld+json"> tags are created but left empty. So in my <head> I have:
<script src="/theme/navbar.js">
{"@context": "https://schema.org", "@type": "BreadcrumbList", "itemListElement": [{"@type": "ListItem", "position": 1, "name": "Coruja Digital", "item": "https://corujadigital.tech"}, {"@type": "ListItem", "position": 2, "name": "Blog", "item": "https://corujadigital.tech/blog"}, {"@type": "ListItem", "position": 3, "name": "Rediseno sitio web", "item": "https://corujadigital.tech/blog/rediseno-sitio-web.html"}]}
</script>
<script src="/theme/js/fontawesome-all.min.js">
{"@context": "https://schema.org", "@type": "Article", "author": {"@type": "Person", "name": "Iván Hernández Cazorla"}, "publisher": {"@type": "Organization", "name": "Coruja Digital", "logo": {"@type": "ImageObject", "url": "https://corujadigital.tech/theme/logo_coruja_digital.png"}}, "headline": "Rediseño del sitio web de Coruja Digital", "about": "corujadigital.tech", "datePublished": "2020-10-29 00:00"}
</script>
<script type="application/ld+json">
</script>
<script type="application/ld+json">
</script>
I upgraded to the latest version (1.0.1) but it did not fix this. Have you got any idea of what could be happening?
Thanks in advance!
I'm using pelican-seo - The resulting LD+JSON is good, I'm only lacking the canonical url, but all other tags work (image, description, etc).
However, it is not generating any meta property tags.
My configuration:
SEO_REPORT = True
SEO_ENHANCER = True
SEO_ENHANCER_OPEN_GRAPH = True
SEO_ENHANCER_TWITTER_CARDS = True
I have all these tags in my articles (I use .rst):
:category:
:tags:
:image:
:description:
:og_description:
:og_image:
:date:
:summary:
No tw_author, though, but the documentation says it's optional.
What am I missing? Is this a bug or user error?
I am attempting to enable this plugin with the Flex theme. Unfortunately, as soon as I enable the plugin, I get the following error when generating my output.
$ pelican -s pelicanconf.py -lrv --debug
[08:49:05] DEBUG Pelican version: 4.8.0 __init__.py:531
DEBUG Python version: 3.10.8 __init__.py:532
DEBUG Adding current directory to system path __init__.py:66
DEBUG Finding namespace plugins _utils.py:81
DEBUG Namespace plugins found: _utils.py:84
pelican.plugins.seo
pelican.plugins.series
DEBUG Loading plugin `series` _utils.py:90
DEBUG Loading plugin `extract_toc` _utils.py:90
DEBUG Loading plugin `neighbors` _utils.py:90
DEBUG Loading plugin `extended_sitemap` _utils.py:90
DEBUG Loading plugin `seo` _utils.py:90
DEBUG Registering plugin `pelican.plugins.series` __init__.py:73
DEBUG Registering plugin `extract_toc` __init__.py:73
DEBUG Registering plugin `neighbors` __init__.py:73
DEBUG Registering plugin `extended_sitemap` __init__.py:73
DEBUG Registering plugin `pelican.plugins.seo` __init__.py:73
INFO SEO plugin initialized seo.py:43
DEBUG Found generator: ArticlesGenerator (internal) __init__.py:209
DEBUG Found generator: PagesGenerator (internal) __init__.py:209
DEBUG Found generator: SitemapGenerator (extended_sitemap) __init__.py:209
DEBUG Found generator: StaticGenerator (internal) __init__.py:209
[reading files that should be generated]
INFO SEO plugin - SEO Report: seo_report.html file created __init__.py:273
CRITICAL AttributeError: 'SitemapGenerator' object has no attribute 'output_path' __init__.py:552
The seo_report.html is generated. Nothing else is generated.
When I disable the plugin, site generation works as expected.
My config file has the relevant entries:
OUTPUT_PATH ="output/"
PLUGINS = [
...
"seo",
...
]
SEO_REPORT = False # Odd that the seo_report is generated with this false
SEO_ENHANCER = True
SEO_ENHANCER_OPEN_GRAPH = False
SEO_ENHANCER_TWITTER_CARDS = False
How do I resolve the CRITICAL error in the logs? My goal is to add structured data to my output.
I followed the installation steps, but I am not seeing any output like the one you show when running pelican content --verbose.
If I run pelican --print-settings, it shows that my settings are present.
Any further tips to debug?
One of my pages (home.md) has save_as metadata set to index.html. For that page, the canonical URL given is for /pages/.html.
The plugin should check what the correct page URL is.
Would it be useful if the various settings (e.g., SEO_REPORT, etc.) were allowed to be set in the pelicanconf.py file? Currently, to change these, you have to edit the settings.py file (see #51). I can try to work on this if you think it would be a good idea!
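One possible shape for this is sketched below, assuming the plugin receives the Pelican settings as a dictionary. The DEFAULTS values and the load_seo_settings helper are hypothetical and are not the plugin's actual defaults or API.

```python
# Hypothetical defaults; the plugin's real default values may differ.
DEFAULTS = {
    "SEO_REPORT": True,
    "SEO_ENHANCER": False,
    "SEO_ENHANCER_OPEN_GRAPH": False,
    "SEO_ENHANCER_TWITTER_CARDS": False,
}


def load_seo_settings(pelican_settings):
    """Take each SEO_* value from the user's settings, falling back to the default."""
    return {key: pelican_settings.get(key, default) for key, default in DEFAULTS.items()}


# Values a user might have put in pelicanconf.py.
user_settings = {"SITEURL": "https://example.com", "SEO_ENHANCER": True}
effective = load_seo_settings(user_settings)
print(effective)
```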
Hello, the plugin looks good.
Would you please place the project under a specific open source license?
Thanks.
I set everything to True:
SEO_REPORT = True
SEO_ENHANCER = True
SEO_ENHANCER_OPEN_GRAPH = True
SEO_ENHANCER_TWITTER_CARDS = True
But I'm not seeing any twitter:card tags in the HTML. Using Python 3.8.5 on Mac. I can work up a simple example if necessary (are there examples somewhere already?)
Installing the plugin via pip and enabling it in pelicanconf.py results in this error:
CRITICAL: 'charmap' codec can't decode byte 0x9d in position 5425: character maps to <undefined>
I'm sure it's this plugin, since it started when I added the plugin to a site which generated fine, and it stops if I remove the plugin.
Running Windows 10 and Python 3.8.5; this was the only plugin added when the error occurred. I'm guessing it might be something inside Beautiful Soup scraping a file with an unexpected encoding. To be sure, I tried this on an empty project created with pelican-quickstart and got the same error.
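The error message is consistent with UTF-8 bytes being decoded with the Windows default code page. The following sketch reproduces that mechanism outside the plugin. Whether this is the actual cause here is an assumption; the example only shows why byte 0x9d fails under cp1252 and how passing an explicit encoding to open() avoids it.

```python
import os
import tempfile

# A typographic closing quote is encoded in UTF-8 as the bytes e2 80 9d.
raw = "“quoted”".encode("utf-8")

try:
    raw.decode("cp1252")  # cp1252 has no character assigned to byte 0x9d
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Reading the same bytes back with an explicit encoding works on any platform.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.html")
    with open(path, "wb") as handle:
        handle.write(raw)
    with open(path, encoding="utf-8") as handle:
        text = handle.read()

print(cp1252_failed, text)
```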
pelicanconf.py: https://gist.github.com/jwodder/35d570ca8710779af6138786b78f64da
Attempting to use this plugin on my site produces the following error:
-> Writing /Users/jwodder/work/GITHUB/kbits/site/build/posts/pypkg-mistakes/index.html
CRITICAL: local variable 'max_index' referenced before assignment
Traceback (most recent call last):
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/bin/pelican", line 8, in <module>
sys.exit(main())
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/__init__.py", line 512, in main
pelican.run()
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/__init__.py", line 121, in run
p.generate_output(writer)
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/generators.py", line 686, in generate_output
self.generate_pages(writer)
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/generators.py", line 595, in generate_pages
self.generate_articles(write)
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/generators.py", line 466, in generate_articles
write(article.save_as, self.get_template(article.template),
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/writers.py", line 269, in write_file
_write_file(template, localcontext, self.output_path, name,
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/writers.py", line 216, in _write_file
signals.content_written.send(path, context=localcontext)
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/blinker/base.py", line 266, in send
return [(receiver, receiver(sender, **kwargs))
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/blinker/base.py", line 266, in <listcomp>
return [(receiver, receiver(sender, **kwargs))
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/plugins/seo/seo.py", line 104, in run_html_enhancer
html_enhancements = seo_enhancer.launch_html_enhancer(
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/plugins/seo/seo_enhancer/__init__.py", line 30, in launch_html_enhancer
"breadcrumb_schema": html_enhancer.breadcrumb_schema.create_schema(),
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/plugins/seo/seo_enhancer/html_enhancer/breadcrumb_schema_creator.py", line 100, in create_schema
breadcrumb_items = self._create_paths()
File "/Users/jwodder/work/GITHUB/kbits/site/.nox/publish/lib/python3.8/site-packages/pelican/plugins/seo/seo_enhancer/html_enhancer/breadcrumb_schema_creator.py", line 51, in _create_paths
del split_path[0:max_index]
UnboundLocalError: local variable 'max_index' referenced before assignment
It's clear that this condition failed to be true, and so no value was assigned to max_index, and yet the code used max_index anyway. I'm not sure what that piece of code is meant to be doing, however, so I can't recommend a specific fix.
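The failure pattern itself can be shown with a small stand-alone sketch. Both functions below are hypothetical illustrations and are not the plugin's code; the default of 0 in the second one is only one possible choice, since the intended behavior is not known.

```python
def strip_prefix_unsafe(split_path, marker):
    """Assign max_index only when the marker is found (illustrative, buggy)."""
    for index, part in enumerate(split_path):
        if part == marker:
            max_index = index
    # Raises UnboundLocalError when the marker never matched.
    del split_path[0:max_index]
    return split_path


def strip_prefix_safe(split_path, marker):
    """Same logic with a default, so a missing marker removes nothing."""
    max_index = 0
    for index, part in enumerate(split_path):
        if part == marker:
            max_index = index
    return split_path[max_index:]


parts = ["site", "build", "posts", "pypkg-mistakes"]
print(strip_prefix_safe(list(parts), "posts"))
print(strip_prefix_safe(list(parts), "output"))
```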
Example link:
https://www.kentoseth.com/posts/2022/apr/17/python-pelican-how-to-view-draft-content-on-a-website/
Metadata from article:
Title: Python Pelican: How to view draft content on a website
Tags: pelican, blog
Author: Mohamed H.
Summary: Finding draft content on Pelican, a static site generator written in the Python programming language.
There is no metadata info for the things being requested in the report: https://docs.getpelican.com/en/stable/content.html
How do we add these things to the article metadata? How much do they impact SEO with/without having them?
The path to the CSS file may have to be interpreted by a browser on a Windows machine. The hardcoded / characters here cause an error:
The CSS file cannot be found.
Replacing the line with the following code works on Windows:
css_file = "file:///" + os.path.join(plugin_path, "static", "seo_report.css")
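An alternative to concatenating the prefix by hand is pathlib, whose as_uri() method produces a file URI with forward slashes on every platform. This is a sketch only; plugin_path here is a stand-in set to the current directory, not the plugin's real variable.

```python
from pathlib import Path

# Stand-in for the plugin directory; as_uri() requires an absolute path.
plugin_path = Path.cwd()

css_file = (plugin_path / "static" / "seo_report.css").as_uri()
print(css_file)
```

On Windows the result has the form file:///C:/..., and on POSIX systems file:///home/..., so the same line would serve both.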
When adding settings for pelican-seo to pelicanconf.py, it gives an error:
Exception: You must fill in SITEURL variable in pelicanconf.py __init__.py:566
to use SEO plugin.
But this will break the site for local testing, which uses localhost. If you add a SITEURL, localhost stops working for rendering the content locally.
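One common arrangement, sketched below, keeps a local SITEURL in pelicanconf.py and the production one in a separate publishconf.py, as pelican-quickstart generates. That this satisfies the plugin's SITEURL check is an assumption, and example.com is a placeholder.

```python
# pelicanconf.py: settings used for local development (assumed layout)
SITEURL = "http://localhost:8000"  # 8000 is the default port of `pelican --listen`
RELATIVE_URLS = True

# publishconf.py would then override the value for deployment, for example:
#     from pelicanconf import *
#     SITEURL = "https://example.com"
#     RELATIVE_URLS = False
```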
I see in the following line that the content of the article is used to count the h1 tags:
You suggest (like everyone) the following Markdown structure in your README:
Title: Page Title
Description: Page Description
# Heading Content
Nevertheless, most (all?) templates already encapsulate the "Title" metadata in an h1 tag.
This processing is independent of the rest of the article, which is written in Markdown and stored in the content attribute of the objects. That content is inserted as is in the HTML template.
Therefore such an example poses 2 problems:
- duplication of the h1 tag (that of the template + that of the article content)
- duplication not detected by the current plugin
Currently, as far as I know, the only simple way to get a compliant HTML page is to write articles starting the heading level at h2 via ##, although it is semantically wrong in Markdown and can disturb some plugins (table of contents rendering, etc.).
I personally use a homemade plugin to modify the final HTML without modifying the original Markdown.
In this case, the SEO report plugin misleads the user by not detecting any h1 title.
I would like to mention that your plugin is very welcome because it allowed me to highlight a problem that I had totally missed.
The plugin should be refactored to read finalized pages, like the SEOEnhancer part called after the content_written signal.
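The difference between counting on the article content and counting on the finalized page can be sketched with the standard library. The H1Counter class and the sample HTML are illustrative assumptions, not the plugin's implementation.

```python
from html.parser import HTMLParser


class H1Counter(HTMLParser):
    """Count opening <h1> tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.count += 1


def count_h1(html):
    parser = H1Counter()
    parser.feed(html)
    return parser.count


# Article body as rendered from Markdown, then wrapped by a theme template
# that puts the Title metadata in its own h1.
content = "<h1>Heading Content</h1><p>Body</p>"
page = f"<html><body><article><h1>Page Title</h1>{content}</article></body></html>"

print(count_h1(content))  # 1
print(count_h1(page))     # 2
```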
Pandoc has implemented an option to automatically shift the heading level: jgm/pandoc#5615
HTML5 allows nesting of independent units in tags like <section>, <article>, etc., which allows multiple h1 titles to coexist in a page (outline algorithm). However, Mozilla is very clear about this: it is a non-standard practice and not recommended.
Cf
https://developer.mozilla.org/fr/docs/orphaned/Web/Guide/HTML/Using_HTML_sections_and_outlines#lalgorithme_outline_html5
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements#multiple_h1_elements_on_one_page
Discussion on Hugo's side:
https://discourse.gohugo.io/t/option-to-shift-headings/6136
I get the following error:
[...]
File "...seo\seo_enhancer\html_enhancer\article_schema_creator.py", line 17, in __init__
self._author = author.name
AttributeError: 'NoneType' object has no attribute 'name'
The reason for the error is:
The implicit assumption that the object author is not empty is wrong because there's the option to have no authors for an article.
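A defensive variant could look like the sketch below. The build_article_schema function is hypothetical and only mirrors the shape of the schema shown earlier on this page; omitting the author key when no author exists is one possible choice, not necessarily the right one for the plugin.

```python
from types import SimpleNamespace


def build_article_schema(headline, author=None):
    """Build an Article schema dict, skipping the author when none is set."""
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
    }
    # getattr with a default tolerates author being None.
    name = getattr(author, "name", None)
    if name:
        schema["author"] = {"@type": "Person", "name": name}
    return schema


with_author = build_article_schema("Hello", SimpleNamespace(name="Jane Doe"))
without_author = build_article_schema("Hello", None)
print(with_author)
print(without_author)
```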