blawgDawg

Python 3.6.5 tools for HubSpot blog objects

externalBlawgDawg.py - Turn any external blog into a HubSpot importable XML file
blogFeaturedImageSoup.py - Soup featured images from an external blog, upload them to the File Manager, and set each one as the featured image of the post's HubSpot equivalent

externalBlawgDawg.py

A Python script to turn any external blog into an XML file that you can import into HubSpot
REQUIRES
BeautifulSoup
lxml

USAGE
This Python script requires manually setting a few variables so that it can scrub the external content and collect all the data needed to import a blog into HubSpot. It grabs post titles, URLs, meta descriptions, authors, tags and post bodies, and turns each post into an importable <item>

blogRootUrl - Root URL of the external blog
blogPosts - Python list of blog post URLs to import
soup. - There are 5 soup. statements, each of which requires a selector for the HTML element(s) you want to grab content from. See the BeautifulSoup docs to learn more about modifying these selectors to suit your needs

soups

[html from scrubbed post]
>>>>>
[xml output from externalBlawgDawg.py]

soup.find('title') - you should not need to touch this soup. Finds and prints the <item> <title> to blog.xml

<title>This is the post title</title>
>>>>>
<title>This is the post title</title> 

soup.find('meta', attrs={'name':'description'}) - you should not need to touch this soup. Finds and prints the <item> <excerpt:encoded> to blog.xml

<meta name="description" content="This is the meta description"> 
>>>>>
<excerpt:encoded><![CDATA[This is the meta description]]></excerpt:encoded>

soup.find('a', attrs={'rel':'author'}) - you likely need to change this soup. Finds the author and prints the <item> <dc:creator>, as well as creates top level <wp:author> tags in blog.xml

<a href="link" rel="author">Author</a>
>>>>>
<dc:creator>Author</dc:creator>
&
<wp:author>
    <wp:author_display_name><![CDATA[Author]]></wp:author_display_name>
    <wp:author_login><![CDATA[Author]]></wp:author_login>
</wp:author>
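
As a rough illustration of the author mapping above, the top-level <wp:author> blocks could be built like this. The helper names (cdata, author_block, author_blocks) are hypothetical and not taken from externalBlawgDawg.py, and emitting each author only once is an assumption about the script's behaviour.

```python
def cdata(text):
    """Wrap text in a CDATA section, splitting any embedded ']]>' terminator."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def author_block(name):
    """Build one <wp:author> block, using the name as display name and login."""
    return (
        "<wp:author>\n"
        f"    <wp:author_display_name>{cdata(name)}</wp:author_display_name>\n"
        f"    <wp:author_login>{cdata(name)}</wp:author_login>\n"
        "</wp:author>"
    )


def author_blocks(names):
    """Return one block per distinct author, keeping first-seen order (assumed)."""
    seen = []
    for name in names:
        if name not in seen:
            seen.append(name)
    return [author_block(name) for name in seen]


if __name__ == "__main__":
    for block in author_blocks(["Author", "Author", "Other Author"]):
        print(block)
```

Splitting ']]>' inside the text keeps the CDATA section valid even if an author name or post body happens to contain that sequence.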

soup.find_all('a', attrs={'rel':'category tag'}) - you likely need to change this soup. Finds the tags of a post and prints the <item> <category> elements (sets nicename, etc.) to blog.xml

<a href="link" rel="category tag">Tag 1</a>
<a href="link" rel="category tag">Tag 2</a>
>>>>>
<category domain="category" nicename="tag-1"><![CDATA[Tag 1]]></category>
<category domain="category" nicename="tag-2"><![CDATA[Tag 2]]></category>
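
The nicename values in the output above look like lowercased, hyphenated versions of the tag text. A minimal sketch of that transformation follows; the exact rule the script applies is not documented here, so the regular expression and the function names nicename and category_element are assumptions.

```python
import re


def nicename(tag):
    """Lowercase the tag and collapse runs of non-alphanumerics into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", tag.lower())
    return slug.strip("-")


def category_element(tag):
    """Render one <category> element for a tag name."""
    return '<category domain="category" nicename="{}"><![CDATA[{}]]></category>'.format(
        nicename(tag), tag
    )


if __name__ == "__main__":
    for tag in ["Tag 1", "Tag 2"]:
        print(category_element(tag))
```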

soup.select('.entry-content')[0] - you likely need to change this soup. Finds the content of a post and prints the <item> <content:encoded> to blog.xml

<div class="entry-content">This is the post body</div>
>>>>>
<content:encoded><![CDATA[This is the post body]]></content:encoded>
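
To see the five soups working together, here is a small sketch that runs the same selectors against a made-up page. The sample HTML and the extract_post helper are illustrative assumptions, and the built-in html.parser is used only so the example runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up page containing the elements the five soups look for.
SAMPLE_HTML = """
<html><head>
<title>This is the post title</title>
<meta name="description" content="This is the meta description">
</head><body>
<a href="link" rel="author">Author</a>
<a href="link" rel="category tag">Tag 1</a>
<a href="link" rel="category tag">Tag 2</a>
<div class="entry-content">This is the post body</div>
</body></html>
"""


def extract_post(html):
    """Apply the five selectors and return the pieces needed for one <item>."""
    soup = BeautifulSoup(html, "html.parser")
    tag_links = soup.find_all("a", attrs={"rel": "category tag"})
    return {
        "title": soup.find("title").get_text(),
        "description": soup.find("meta", attrs={"name": "description"})["content"],
        "author": soup.find("a", attrs={"rel": "author"}).get_text(),
        "tags": [link.get_text() for link in tag_links],
        # decode_contents() keeps the inner HTML of the post body.
        "body": soup.select(".entry-content")[0].decode_contents(),
    }


if __name__ == "__main__":
    print(extract_post(SAMPLE_HTML))
```

If a selector does not match the external blog's markup, find returns None and the lookup fails, which is usually the sign that the soup needs changing for that blog.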

XML setup which happens on its own:

<?xml version='1.0' encoding='UTF-8'?>
<rss>
  <channel>
  <link>!!blog root url link from blogRootUrl variable!!</link>
  <
    !!<wp:author>s build here!!
  >
    <item>
        <link>!!post link from blogPosts list!!</link>
        <wp:post_id>!!set automatically!!</wp:post_id>
        <wp:status>publish</wp:status>
        <wp:post_type>post</wp:post_type>
        <
            !!The above soups fill in XML here!!
        >
    </item>
    ... !!for every post in the blogPosts list, the above <item> is created!!
  </channel>
</rss>
$ python3 externalBlawgDawg.py
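
The XML setup above could be assembled roughly as follows. This is a sketch, not the script's actual code: build_item and build_feed are hypothetical names, numbering wp:post_id sequentially from 1 is an assumption about what "set automatically" means, and the namespace declarations a full export file would normally carry are left out, as in the outline above.

```python
from xml.sax.saxutils import escape


def build_item(post_id, link, inner_xml):
    """Wrap the soup-generated XML for one post in an <item> element."""
    return (
        "    <item>\n"
        f"        <link>{escape(link)}</link>\n"
        f"        <wp:post_id>{post_id}</wp:post_id>\n"
        "        <wp:status>publish</wp:status>\n"
        "        <wp:post_type>post</wp:post_type>\n"
        f"{inner_xml}\n"
        "    </item>"
    )


def build_feed(blog_root_url, author_xml, posts):
    """Assemble the feed; posts is a list of (link, inner_xml) pairs."""
    items = [
        build_item(post_id, link, inner_xml)
        for post_id, (link, inner_xml) in enumerate(posts, start=1)
    ]
    parts = [
        "<?xml version='1.0' encoding='UTF-8'?>",
        "<rss>",
        "  <channel>",
        f"  <link>{escape(blog_root_url)}</link>",
        author_xml,
    ]
    parts.extend(items)
    parts.extend(["  </channel>", "</rss>"])
    return "\n".join(parts)


if __name__ == "__main__":
    feed = build_feed(
        "https://blog.example.com/",
        "  <wp:author></wp:author>",
        [("https://blog.example.com/post-1", "        <title>Post 1</title>")],
    )
    print(feed)
```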

blogFeaturedImageSoup.py

A Python script to find the featured image of each post on an external blog, upload it to the HubSpot File Manager, and then set the featuredImage of the HubSpot-hosted version of the post to the newly uploaded File Manager asset
REQUIRES
BeautifulSoup
requests

USAGE
This Python script requires manually setting 4 variables: accessToken, blogRootUrl, featuredImageSelector & postsToSoupScrubKitten

accessToken - An access_token for the portal you want to blogFeaturedImageSoup
blogRootUrl - The external blog's root URL
featuredImageSelector - The CSS selector which selects the external blog's featured image (ex. .featured-image img)
postsToSoupScrubKitten - A Python list of the slugs of all of the external posts to soup (ex. ["slug/post/1", "slug/post/2", "slug/post/3"])
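
As an illustration of how featuredImageSelector might be applied, the sketch below pulls the image URL out of a post's HTML. The function name find_featured_image and the sample markup are assumptions, and the upload to the File Manager is deliberately left out.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_featured_image(html, page_url, selector):
    """Return the absolute URL of the first image matching selector, or None."""
    soup = BeautifulSoup(html, "html.parser")
    img = soup.select_one(selector)
    if img is None or not img.get("src"):
        return None
    # Resolve relative src values against the URL of the post itself.
    return urljoin(page_url, img["src"])


if __name__ == "__main__":
    sample = '<div class="featured-image"><img src="/img/hero.png"></div>'
    print(find_featured_image(
        sample, "https://blog.example.com/slug/post/1", ".featured-image img"
    ))
```

Returning None for posts without a matching image lets the caller skip them instead of failing partway through the list.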

It is important that the post slugs of the external and HubSpot posts are the same. blogRootUrl + postsToSoupScrubKitten[slug] should equal the actual URL of each post. Slugs should not start with a /; instead, blogRootUrl should end with a /
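
The slash rule above can be checked before any requests are made. A minimal sketch, assuming a hypothetical helper named post_urls that is not part of blogFeaturedImageSoup.py:

```python
def post_urls(blog_root_url, slugs):
    """Join the root URL and each slug, enforcing the slash convention."""
    if not blog_root_url.endswith("/"):
        raise ValueError("blogRootUrl should end with a /")
    urls = []
    for slug in slugs:
        if slug.startswith("/"):
            raise ValueError("slug should not start with a /: " + slug)
        urls.append(blog_root_url + slug)
    return urls


if __name__ == "__main__":
    blogRootUrl = "https://blog.example.com/"  # placeholder URL
    postsToSoupScrubKitten = ["slug/post/1", "slug/post/2", "slug/post/3"]
    for url in post_urls(blogRootUrl, postsToSoupScrubKitten):
        print(url)
```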

$ python3 blogFeaturedImageSoup.py

blawgdawg's People

Contributors

williamspiro

Forkers

adesignl
