I've written a Node.js module that scrapes news articles from the news aggregator Inshorts. It uses cheerio to parse the HTML and node-fetch to make the HTTP requests.
I'm Harshit Sharma, also known as harshitethic online.
To use this module in your Node.js project, you can install it via npm:
npm install inshorts-news-scraper
Then import it into your Node.js file:
const inshortsScraper = require('inshorts-news-scraper');
The getNews function takes in two arguments: options and callback. options is an object that contains the language and category of the news to be scraped. callback is a function that will be called with the scraped news data and the news_offset parameter (used for pagination).
const options = {
  lang: 'en',
  category: 'national',
};

inshortsScraper.getNews(options, (news, news_offset) => {
  console.log(news);
  console.log(news_offset);
});
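Because getNews reports its result through a callback, it can be handy to wrap it in a Promise for use with async/await. The following is a minimal sketch and not part of the module: getNewsAsync is a hypothetical helper name, and it assumes only the callback signature described above, (news, news_offset). The scraper is passed in as an argument so the helper does not depend on how the module is loaded. Since no error argument is described for the callback, the sketch does not handle failures.

```javascript
// Hypothetical helper (not part of inshorts-news-scraper): wraps the
// callback-style getNews in a Promise.
// Assumption: the callback receives (news, news_offset), as described above.
function getNewsAsync(scraper, options) {
  return new Promise((resolve) => {
    scraper.getNews(options, (news, news_offset) => {
      // Bundle both callback arguments into a single resolved value.
      resolve({ news, news_offset });
    });
  });
}

// Usage sketch:
// const inshortsScraper = require('inshorts-news-scraper');
// getNewsAsync(inshortsScraper, { lang: 'en', category: 'national' })
//   .then(({ news, news_offset }) => console.log(news, news_offset));
```

The resolved news_offset can then be passed on as options.news_offset when requesting the next page.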
The getMoreNews function is similar to getNews, but it is used for pagination. It takes in an additional parameter, options.news_offset, which is used to request the next page of news articles. The function sends an HTTP POST request to the Inshorts website with the given options, then scrapes the HTML using cheerio to extract the relevant news data. The scraped data is stored in an array of objects, where each object represents a single news article. The function then calls the callback function with the scraped data and the new news_offset parameter.
const options = {
  lang: 'en',
  category: 'national',
  news_offset: 'jvn36k2y-1',
};

inshortsScraper.getMoreNews(options, (news, news_offset) => {
  console.log(news);
  console.log(news_offset);
});
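The two functions can be chained to walk through several pages: getNews supplies the first news_offset, and each getMoreNews call supplies the next one. The following is a rough sketch under stated assumptions, not part of the module: fetchPages, maxPages and done are hypothetical names, and it assumes that both functions call back with (news, news_offset) and that news is an array, as described above. It also assumes that a missing news_offset means there are no further pages, which the module may or may not guarantee.

```javascript
// Hypothetical helper (not part of inshorts-news-scraper): collects up to
// maxPages pages by feeding each returned news_offset into getMoreNews.
// Assumption: both functions call back with (news, news_offset).
function fetchPages(scraper, options, maxPages, done) {
  const all = [];

  // Nothing to fetch if the caller asked for fewer than one page.
  if (maxPages < 1) {
    done(all, undefined);
    return;
  }

  // Builds the callback for one page; pagesFetched counts the pages before it.
  function handlePage(pagesFetched) {
    return (news, news_offset) => {
      all.push(...(news || []));
      const count = pagesFetched + 1;
      // Stop at the page limit, or when no further offset is returned.
      if (count >= maxPages || !news_offset) {
        done(all, news_offset);
        return;
      }
      scraper.getMoreNews({ ...options, news_offset }, handlePage(count));
    };
  }

  scraper.getNews(options, handlePage(0));
}

// Usage sketch:
// const inshortsScraper = require('inshorts-news-scraper');
// fetchPages(inshortsScraper, { lang: 'en', category: 'national' }, 3,
//   (news, news_offset) => console.log(news.length, news_offset));
```

Passing the scraper in as an argument keeps the helper easy to try out with a stub before pointing it at the live site.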
This module was written by Harshit Sharma, also known as harshitethic online. You can learn more about him and his work on his website, harshitethic.in.
This module is licensed under the MIT License.