Comments (7)

dataabc commented on June 14, 2024

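I can think of two possibilities:
1. The user may not have posted anything on Weibo during the skipped dates;
2. The user did post, but only reposts, and the program was set to fetch original posts only, so they were skipped.

I'm not sure whether either of these applies. If not, could you provide the user_id so I can debug? Thanks.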

from weibo-crawler.

tangyuekun commented on June 14, 2024

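The user_id is 2803301701, which is People's Daily (人民日报), so neither of those two cases should apply. Could this be related to anti-scraping measures?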

dataabc commented on June 14, 2024

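I just tried it and ran into a similar problem. When I then visited the page manually, it reported that requests were too frequent. The crawl is probably running too fast. You can modify the sleep-related code in the get_pages method so that it pauses more often and sleeps longer; slowing down reduces the chance of being rate-limited.

I also just tried an entertainment account, and its posts were fetched normally. Accounts like the one above may be restricted more strictly.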
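
The slowdown suggested above can be sketched roughly as follows. This is not the actual get_pages code from weibo-crawler; the function name `crawl_pages`, the `fetch_page` callback, and all parameter names and default values are assumptions chosen for illustration. The idea is simply to pause after a small random number of pages and to sleep for a random, fairly long interval each time.

```python
import random
import time


def crawl_pages(page_count, fetch_page, min_pause=6.0, max_pause=10.0,
                pages_per_pause=(1, 5), sleep=time.sleep):
    """Call fetch_page(page) for pages 1..page_count, pausing in between.

    Hypothetical helper: after every N pages (N drawn at random from
    pages_per_pause) it sleeps for a random time in [min_pause, max_pause].
    Returns the number of pauses taken.
    """
    pages_since_pause = 0
    next_pause_after = random.randint(*pages_per_pause)
    pauses = 0
    for page in range(1, page_count + 1):
        fetch_page(page)
        pages_since_pause += 1
        # Skip the pause after the final page: nothing is left to fetch.
        if pages_since_pause >= next_pause_after and page < page_count:
            sleep(random.uniform(min_pause, max_pause))
            pauses += 1
            pages_since_pause = 0
            next_pause_after = random.randint(*pages_per_pause)
    return pauses
```

Narrowing `pages_per_pause` makes the pauses more frequent, and raising `min_pause` and `max_pause` makes each one longer. Which values are enough to avoid the "requests too frequent" response is not stated in this thread and likely varies by account.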

tangyuekun commented on June 14, 2024

OK, I'll try again. Thank you very much!

tangyuekun commented on June 14, 2024

Hello, I've now run into this problem:
Error: list index out of range
Traceback (most recent call last):
  File "D:/weibo1/weibo-crawler-master/weibo.py", line 1007, in start
    self.get_pages()
  File "D:/weibo1/weibo-crawler-master/weibo.py", line 942, in get_pages
    self.get_user_info()
  File "D:/weibo1/weibo-crawler-master/weibo.py", line 224, in get_user_info
    card_list = cards[0]['card_group'] + cards[1]['card_group']
IndexError: list index out of range

Both user_id 2212518065 and 6914257879 fail this way. I'm not sure how to fix it.

dataabc commented on June 14, 2024

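Thanks for the feedback; the problem has been fixed.

I tried several times and could not reproduce the problem, so the fix was written based on the error message above. If you still have problems or suggestions, please keep reporting them.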
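
The actual fix is not shown in this thread. As a minimal sketch of one way to avoid the IndexError in the traceback above, assuming `cards` is a list of dicts that may contain fewer than two entries and that any entry may lack a `card_group` key, the two hard-coded indexes can be replaced with a slice. The helper name `collect_card_groups` and the `max_cards` parameter are hypothetical.

```python
def collect_card_groups(cards, max_cards=2):
    """Concatenate the 'card_group' lists of the first max_cards cards.

    Slicing never raises IndexError, so an empty or one-element `cards`
    list yields a shorter (possibly empty) result instead of crashing.
    """
    card_list = []
    for card in cards[:max_cards]:
        # Assumption: a card without 'card_group' contributes nothing.
        card_list.extend(card.get('card_group', []))
    return card_list
```

With two or more well-formed cards this returns the same list as `cards[0]['card_group'] + cards[1]['card_group']`; with fewer it returns whatever is available, and the caller would still need to decide how to handle missing user information.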

tangyuekun commented on June 14, 2024

It works now! Thanks!
