Git Product home page Git Product logo

tailoredscoop's Introduction

tailoredscoop

Automated customized daily news letter service.

https://tailoredscoops.com/


Overview

This application creates daily newsletters curated using user's keywords. If no keywords are selected, the newsletter contains general top headlines. To retrieve most important and relevant stories, this service uses news APIs to get articles' URLs then scrapes each URL to get article texts. Then, huggingface text summarization models condense each article. And lastly, Open AI models are used to format the output into a nicely readable newsletter.

I'm using AWS Batch for serverless scheduled computing, which run Docker containerized services to create newsletters and AWS SES to send emails. Various news APIs such as NewsAPI and Google News RSS feeds are used to fetch stories. Text models include facebook/bart-large-cnn downloaded from huggingface and gpt-3.5.turbo and gpt-4 from OpenAI. The frontend code for the webapp is built on Django and code can be found in the apps.chansoos repository. The backend uses two databases: postgresql and mongodb.

Ongoing Challenges

Though there were initial challenges with hallucinations, this problem has mostly been solved. While the app no longer makes up stories (which though rare did happen at first), it does sometimes get some details wrong. As an example, it recently put in the headline that the Miami Heat were up 3-0 against the Boston Celtics, instead of 3-2. The story got it right, but the headline was wrong. This is something that remains to be solved.

Scalability challenges depend on the distribution of keywords. If all users select the same keyword, only one newsletter needs to be generated each day. In the worst case, all users would request different keywords. A quick solution would be to restrict the number of available keywords by providing a list of options.

Example Email

tailoredscoop's People

Contributors

chansooligans avatar

Stargazers

 avatar M̵̞̗̝̼̅̏̎͝Ȯ̴̝̻̊̃̋̀Õ̷̼͋N̸̩̿͜ ̶̜̠̹̼̩͒ avatar

Watchers

James Cloos avatar  avatar

Forkers

bitar-zaaz

tailoredscoop's Issues

Classify keywords always

Prompt

User
I have a personalized daily newsletter tool that creates newsletters based on keywords submitted by a user. Instead of querying the keywords, I want to classify the keyword as one of 100 news categories. Please create a table containing 100 news categories. In a second column contain key words that would identify articles in that category

Tune down temperature

e.g. fake news: "Boston Celtics coach Joe Mazzulla says he should have called a timeout during Game 4 against the Philadelphia 76ers. The series is tied 2-2, with Game 5 taking place on Tuesday night in Boston. The winner will face the winner of the Washington Wizards-Indiana Pacers series."

aws to run email service tasks

Create a Docker container with your Hugging Face model and all necessary dependencies. You can use the Hugging Face-provided Docker images as a base (found here: https://github.com/huggingface/transformers/tree/master/docker) and customize it with your ML job specifics.
Push the Docker container to Amazon Elastic Container Registry (ECR).
Set up an AWS Batch Compute Environment. You can use the "spot" environment to save on cost, as your job is not time-sensitive. Choose an instance type with at least 8GB RAM, such as c5.xlarge. Set up a Job Queue associated with the Compute Environment.
Create a Job Definition in AWS Batch, specifying the ECR repository URI of your Docker container, the instance type, and any required environment variables.
Schedule your job to run each morning. You can use Amazon EventBridge (formerly known as CloudWatch Events) to create a rule that triggers your AWS Batch job on a schedule, such as every day at a specific time.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.