auger's People

Contributors: liminalcrab, maxdeviant, thomasorus

auger's Issues

URLs being rejected by the pull scraper (and date)

Need to figure out why the following URLs are being rejected and implement a fix:

URL https://longest.voyage/index.xml is fucked up.
URL https://kokorobot.ca/links/rss.xml is fucked up.
URL https://ameyama.com/blog/rss.xml is fucked up.
URL http://npisanti.com/rss.xml is fucked up.
URL https://phse.net/post/index.xml is fucked up.
URL https://rosano.ca/feed is fucked up.
URL https://teknari.com/feed.xml is fucked up.
URL https://serocell.com/feeds/serocell.xml is fucked up.
URL https://eli.li/feed.rss is fucked up.
URL https://resevoir.net/rss.xml is fucked up.
URL https://sixey.es/feed.xml is fucked up.
URL https://royniang.com/rss.xml is fucked up.
URL https://0xff.nu/feed.xml is fucked up.
URL https://system32.simone.computer/rss.xml is fucked up.
URL https://simply.personal.jenett.org/feed/ is fucked up.
URL https://q.pfiffer.org/feed.xml is fucked up.
URL https://www.edwinwenink.xyz/index.xml is fucked up.
URL https://materialfuture.net/rss.xml is fucked up.
URL https://travisshears.com/index.xml is fucked up.
URL https://www.juliendesrosiers.com/feed.xml is fucked up.
URL https://metasyn.pw/rss.xml is fucked up.
URL https://wolfmd.me/feed.xml is fucked up.
URL https://darch.dk/feed/page:feed.xml is fucked up.
URL https://natehn.com/index.xml is fucked up.
URL https://www.gr0k.net/blog/feed.xml is fucked up.
URL https://wiki.xxiivv.com/links/rss.xml is fucked up.

This line of code fails to split tags when the feed's root element is an un-namespaced 'rss':
links = [x for x in root if x.tag.split("}")[1] in ("entry", "item")]
https://github.com/LiminalCrab/auggar/blob/main/data/pull.py#L63
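A quick illustration of why that comprehension raises for plain RSS 2.0 documents (the feeds below are minimal hypothetical examples, not the actual responses): Atom tags carry a `{namespace}` prefix, so `split("}")` yields two parts, while an un-namespaced `<rss>` root yields tags with no `}` to split on.

```python
import xml.etree.ElementTree as ET

# Atom feeds are namespaced, so child tags look like "{...}entry".
atom = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom"><entry/></feed>'
)
print(atom[0].tag.split("}")[1])  # "entry"

# Plain RSS 2.0 has no namespace: "channel".split("}") returns
# ["channel"], so indexing [1] raises IndexError.
rss = ET.fromstring("<rss><channel><item/></channel></rss>")
try:
    [x.tag.split("}")[1] for x in rss]
except IndexError:
    print("IndexError: un-namespaced 'rss' root")
```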

failed
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7eb810>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7ebcc0>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079d0e8400>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7b4db0>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7eba90>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7a7c70>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7a8540>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7ad400>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079d0c5900>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c8020e0>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c802360>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c774e00>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c691360>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7e1ae0>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7ad040>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7ad810>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7edd60>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c801950>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7b4f90>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c659810>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7ef8b0>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7b4220>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c7a8090>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c801bd0>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c659220>
EXCEPTION_ROOT:<Element 'rss' at 0x7f079c774b30>

accepted
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079d0c54f0>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c7e15e0>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079d0e8810>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c7a8e00>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c7ed2c0>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c802bd0>
TRY2_ROOT:<Element '{http://www.sitemaps.org/schemas/sitemap/0.9}urlset' at 0x7f079c659ef0>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c673770>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c7a8d60>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c801090>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c673e50>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c6c4cc0>
TRY2_ROOT:<Element '{http://www.w3.org/2005/Atom}feed' at 0x7f079c6c3e00>

PG/SL - Replace any string manipulation done in SQL.

Updating the database this way is dangerous and prone to SQL injection. Data was originally manipulated with SQL itself, which slowed the application to a grinding halt and also opened a route for SQL injection.
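A minimal sketch of the intended fix, using sqlite3 from the standard library (the table and column names here are placeholders, not the app's real schema): do string manipulation in Python, then pass values as query parameters so the driver escapes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (site TEXT, title TEXT, url TEXT)")

# Do any string clean-up in Python, not in SQL ...
title = "  Example Post  ".strip()

# ... and bind values as parameters instead of formatting them into
# the statement string.
conn.execute(
    "INSERT INTO posts (site, title, url) VALUES (?, ?, ?)",
    ("example.com", title, "https://example.com/post"),
)

# A malicious value is stored verbatim rather than executed as SQL.
conn.execute(
    "INSERT INTO posts (site, title, url) VALUES (?, ?, ?)",
    ("evil.com", "'); DROP TABLE posts; --", "https://evil.com"),
)
print(conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0])  # 2
```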

Refactor modules, reimplement Asyncio.

Coroutines aren't implemented correctly, and the asyncio library is not being used to its full potential; it's surprising the application worked at all. Breaking each main function into smaller functions with separate responsibilities should let asyncio overlap the I/O-bound work and significantly speed up the application.
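One possible shape for that refactor (the function names below are illustrative stand-ins, not the actual functions in pull.py): split fetching and parsing into separate coroutines and run them concurrently with asyncio.gather, so slow feeds no longer serialize the whole run.

```python
import asyncio

# Stand-in for a real network fetch.
async def fetch_feed(url: str) -> str:
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"<rss from {url}>"

# Stand-in for real feed parsing.
async def parse_feed(raw: str) -> int:
    return len(raw)

async def process(url: str) -> int:
    raw = await fetch_feed(url)
    return await parse_feed(raw)

async def main(urls: list[str]) -> list[int]:
    # gather() runs every fetch concurrently instead of awaiting
    # each one inside a sequential loop.
    return await asyncio.gather(*(process(u) for u in urls))

results = asyncio.run(main(["https://a.example/rss", "https://b.example/rss"]))
print(results)
```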

urls.py - Consolidate lists into dictionary.

There are two cases for this. Since the website URL is simply an identifier, the dictionary could be formatted as follows:

"site": "feedurl"

In the case of adding instance usernames to the list, a nested dictionary might be a better solution:

"site": {
    "username": "user",
    "feed": "feedurl"
}

Fix yr shit

From 242f5517d73a073dc5fcee4bd0d940426d2c647b Mon Sep 17 00:00:00 2001
From: Quinlan Pfiffer <[email protected]>
Date: Thu, 8 Apr 2021 09:16:10 -0700
Subject: [PATCH] Fix yr shit.

---
 data/pull.py | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/data/pull.py b/data/pull.py
index 9789755..e237de8 100644
--- a/data/pull.py
+++ b/data/pull.py
@@ -65,15 +65,7 @@ async def main():
             try:
                 links = [x for x in root if x.tag.split("}")[1] in ("entry", "item")]
             except IndexError:
-                links = [x for x in root if x.tag in ("entry", "item")]
-                for match in re.findall('mlns:[^=]+="(?P<url>[^"]+)', response.text):
-                    print("REGEX:", match)
-                    
-                
-                print("LINKS:", links)
-            
-                #print("URL {} is fucked up.".format(url))
-                continue
+                links = [x for x in root[0] if x.tag in ("entry", "item")]
 
             for link in links:
                 title = [x.text for x in link if x.tag.split("}")[1] == "title"]
-- 
2.25.1

This fixes the fucked up ones.
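The patch works because RSS 2.0 nests its items under a `<channel>` element, so when the namespace split raises IndexError, `root[0]` is the channel and the items sit directly beneath it. A minimal sketch of that fallback, using hypothetical feed documents:

```python
import xml.etree.ElementTree as ET

def extract_links(root):
    # Atom: children are namespaced ("{...}entry"), so the split works.
    try:
        return [x for x in root if x.tag.split("}")[1] in ("entry", "item")]
    except IndexError:
        # RSS 2.0: <rss><channel><item/>...</channel></rss>, so the
        # items live under root[0], the channel element.
        return [x for x in root[0] if x.tag in ("entry", "item")]

rss = ET.fromstring("<rss><channel><title/><item/><item/></channel></rss>")
print(len(extract_links(rss)))  # 2
```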

Implement - FastAPI

FastAPI will act as a bridge between data processing on the backend and content rendering on the frontend, retrieving and modifying data through periodic API calls. This will replace both the inefficient nested for loop the application is wrapped in and the cron job. It will also provide a cleaner way of transporting data to separate components and make future changes easier without affecting how the entire application works.
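Independent of the web framework chosen, the periodic refresh that replaces cron can be sketched as an asyncio background task updating a cache the API endpoints would serve from. Everything here is a hypothetical stand-in (the cache, the refresh logic, and the bounded cycle count, which a real service would replace with an endless loop):

```python
import asyncio

CACHE: dict[str, str] = {}  # data the API endpoints would serve

async def refresh(urls: list[str]) -> None:
    # Stand-in for the real pull: fetch and parse each feed, then
    # update the cache/database.
    for url in urls:
        CACHE[url] = f"refreshed {url}"

async def periodic_refresh(urls: list[str], interval: float, cycles: int) -> None:
    # Bounded here so the sketch terminates; a real service would
    # loop forever instead of counting cycles.
    for _ in range(cycles):
        await refresh(urls)
        await asyncio.sleep(interval)

asyncio.run(periodic_refresh(["https://a.example/rss"], 0.01, 2))
print(CACHE)
```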
