
Comments (2)

dataabc commented on June 13, 2024

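There is currently no feature for crawling Weibo posts by time range; it may be added later. For now, you can restrict which pages are crawled.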

for page in tqdm(range(1, page_count + 1), desc=u"进度"):

This line controls the crawl range, running from page 1 through page page_count. For example,

for page in tqdm(range(10, 20 + 1), desc=u"进度"):

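crawls pages 10 through 20 of the posts.
Because posts run from newest on the first page to oldest on the last, you can control the time range indirectly by choosing page numbers, though it is a bit cumbersome.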
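If you only want posts from a given date up to the present, that is simple to implement: check whether each post's publish time falls within the range. If it does, add the post to the results; if it does not, stop. Posts get older the further you go, so once the current post is out of range, every later post is out of range too, and the crawler can stop and write its results to a file.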
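A minimal sketch of this early-stop idea, assuming posts arrive newest first and each post is a dict with a hypothetical `publish_time` field holding a `datetime`. The function `collect_since` and the sample data are illustrative only and are not part of weibo-crawler.

```python
from datetime import datetime


def collect_since(posts, since):
    """Collect posts published on or after `since`, stopping at the first older one.

    Assumes `posts` is ordered newest first.
    """
    results = []
    for post in posts:
        if post["publish_time"] < since:
            # Every later post is older still, so nothing is left to collect.
            break
        results.append(post)
    return results


# Hypothetical sample data, newest first.
sample_posts = [
    {"id": 3, "publish_time": datetime(2019, 3, 1)},
    {"id": 2, "publish_time": datetime(2019, 2, 1)},
    {"id": 1, "publish_time": datetime(2019, 1, 1)},
]
recent = collect_since(sample_posts, datetime(2019, 1, 15))
```

Here `recent` holds the posts with ids 3 and 2; the loop exits at the post dated 2019-01-01 without looking any further.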
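The hard case is crawling posts between time A and time B. Because A is not the present and may be far in the past, the approach above is too slow, so a time-range option has not been added for now.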
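To illustrate why a window between two past dates is slower, here is a rough sketch under the same assumptions as before: newest-first ordering and a hypothetical `publish_time` field. The function `collect_between` and the sample pages are made up for illustration and are not part of weibo-crawler. The point is that every page newer than the window still has to be read before the first matching post appears.

```python
from datetime import datetime


def collect_between(pages, start, end):
    """Collect posts with start <= publish_time <= end from newest-first pages.

    Returns the matching posts and the number of pages that had to be read.
    """
    results = []
    pages_read = 0
    for page in pages:
        pages_read += 1
        for post in page:
            published = post["publish_time"]
            if published > end:
                # Newer than the window: skipped, but its page was still read.
                continue
            if published < start:
                # Older than the window: everything after this is older too.
                return results, pages_read
            results.append(post)
    return results, pages_read


# Hypothetical sample pages, newest first, two posts per page.
sample_pages = [
    [{"id": 6, "publish_time": datetime(2019, 6, 1)},
     {"id": 5, "publish_time": datetime(2019, 5, 1)}],
    [{"id": 4, "publish_time": datetime(2019, 4, 1)},
     {"id": 3, "publish_time": datetime(2019, 3, 1)}],
    [{"id": 2, "publish_time": datetime(2019, 2, 1)},
     {"id": 1, "publish_time": datetime(2019, 1, 1)}],
]
window_posts, pages_read = collect_between(
    sample_pages, datetime(2019, 2, 15), datetime(2019, 4, 15)
)
```

In this toy data the first page contributes nothing to the result but is read anyway; with a window far in the past, the number of such wasted pages grows accordingly.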

from weibo-crawler.

mixixibimlkrv commented on June 13, 2024

Wow! Thank you!

