Comments (4)
Given the JSON data for a single Weibo post, which fields do you need exported?
{
"visible": {
"type": 0,
"list_id": 0
},
"created_at": "Mon Oct 24 19:50:29 +0800 2011",
"id": 3372109543844864,
"idstr": "3372109543844864",
"mid": "3372109543844864",
"mblogid": "xufTIg8dW",
"user": {
"id": 1659452041,
"idstr": "1659452041",
"pc_new": 7,
"screen_name": "吃菠萝的萝卜皮",
"profile_image_url": "https://tvax1.sinaimg.cn/crop.0.0.1080.1080.50/62e93a89ly8gm4hum1perj20u00u0whp.jpg?KID=imgbed,tva&Expires=1666369149&ssig=xr6pUjGTfA",
"profile_url": "/u/1659452041",
"verified": true,
"verified_type": 0,
"domain": "",
"weihao": "",
"verified_type_ext": 0,
"avatar_large": "https://tvax1.sinaimg.cn/crop.0.0.1080.1080.180/62e93a89ly8gm4hum1perj20u00u0whp.jpg?KID=imgbed,tva&Expires=1666369149&ssig=3jaDedbTZR",
"avatar_hd": "https://tvax1.sinaimg.cn/crop.0.0.1080.1080.1024/62e93a89ly8gm4hum1perj20u00u0whp.jpg?KID=imgbed,tva&Expires=1666369149&ssig=SZSHH8gfYn",
"follow_me": false,
"following": false,
"mbrank": 3,
"mbtype": 2,
"planet_video": false,
"icon_list": []
},
"can_edit": false,
"text_raw": "你要热爱生活,但是没有必要完全体验生活,生命是二叉树,想往前只能不断的作单项选择,舍弃然后得到,你只能看见与你不同的生活,但是如果你妄想得到的话,一般来说后果很糟糕。 ",
"text": "你要热爱生活,但是没有必要完全体验生活,生命是二叉树,想往前只能不断的作单项选择,舍弃然后得到,你只能看见与你不同的生活,但是如果你妄想得到的话,一般来说后果很糟糕。 ",
"source": "uc浏览器ios",
"favorited": false,
"pic_ids": [],
"geo": null,
"pic_num": 0,
"is_paid": false,
"mblog_vip_type": 0,
"number_display_strategy": {
"apply_scenario_flag": 3,
"display_text_min_number": 1000000,
"display_text": "100万+"
},
"reposts_count": 0,
"comments_count": 0,
"attitudes_count": 0,
"attitudes_status": 0,
"isLongText": false,
"mlevel": 0,
"content_auth": 0,
"is_show_bulletin": 0,
"comment_manage_info": {
"comment_permission_type": -1,
"approval_comment_type": 0,
"comment_sort_type": 0
},
"share_repost_type": 0,
"title": {
"text": "公开",
"base_color": 1,
"icon_url": "http://h5.sinaimg.cn/upload/2015/07/14/34/timeline_title_public.png"
},
"mblogtype": 0,
"showFeedRepost": false,
"showFeedComment": false,
"pictureViewerSign": false,
"showPictureViewer": false,
"rcList": [],
"customIcons": [],
"ok": 1
}
from weibo-crawler.
I need id, user_id, screen_name, created_at, reposts_count, comments_count, attitudes_count, text, and geo. Thanks.
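For reference, the requested fields map onto the sample JSON above roughly as follows. This is a minimal sketch, not the project's own code: `extract_fields` and `parse_created_at` are hypothetical helper names, and it assumes that user_id and screen_name come from the nested "user" object, as they do in the sample.

```python
from datetime import datetime


def extract_fields(post: dict) -> dict:
    """Flatten one post dict into the nine requested fields (hypothetical helper)."""
    user = post.get("user") or {}  # user_id and screen_name live in the nested object
    return {
        "id": post.get("id"),
        "user_id": user.get("id"),
        "screen_name": user.get("screen_name"),
        "created_at": post.get("created_at"),
        "reposts_count": post.get("reposts_count"),
        "comments_count": post.get("comments_count"),
        "attitudes_count": post.get("attitudes_count"),
        "text": post.get("text"),
        "geo": post.get("geo"),  # null in the sample, so this becomes None
    }


def parse_created_at(value: str) -> datetime:
    """Parse the timestamp format seen in the sample, e.g. 'Mon Oct 24 19:50:29 +0800 2011'."""
    # %a and %b are locale-dependent; this assumes English day and month names.
    return datetime.strptime(value, "%a %b %d %H:%M:%S %z %Y")
```

Missing keys come back as None rather than raising, which keeps every exported row the same shape even when a post lacks a field.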
First create mids.txt in the repository root, with one 16-digit Weibo ID per line. Running the following script then starts fetching the JSON for those posts.
python tool/get_content.py
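Before running the fetch, it may help to confirm that mids.txt really contains one 16-digit ID per line. The check below is only a sketch: `load_mids` is a hypothetical helper and is not part of weibo-crawler.

```python
import re
from pathlib import Path

# One Weibo ID per line, exactly 16 ASCII digits, as described above.
MID_PATTERN = re.compile(r"[0-9]{16}")


def load_mids(path: str = "mids.txt") -> list[str]:
    """Read IDs from the file, skipping blank lines and rejecting malformed ones."""
    mids = []
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for line_no, raw in enumerate(lines, start=1):
        mid = raw.strip()
        if not mid:
            continue  # tolerate empty lines
        if not MID_PATTERN.fullmatch(mid):
            raise ValueError(f"line {line_no}: expected a 16-digit ID, got {mid!r}")
        mids.append(mid)
    return mids
```

IDs are kept as strings so that they are passed along exactly as written in the file.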
Once fetching has finished, the following script exports the posts listed in mids.txt.
python tool/export_json_by_mids.py
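Opened in WPS, the exported data looks like this:
[screenshot]
These usage instructions have also been added to the "Advanced usage" section of the project home page.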
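The thread does not show how tool/export_json_by_mids.py writes its output. Purely as an illustration of the export step, the sketch below writes rows with the requested fields to a CSV file. `write_rows`, `FIELDS`, and the file name export.csv are assumptions made for this example. The utf-8-sig encoding adds a byte order mark, which spreadsheet applications commonly use to detect UTF-8, so Chinese text is less likely to show up garbled.

```python
import csv

# Column order follows the field list requested earlier in the thread.
FIELDS = [
    "id", "user_id", "screen_name", "created_at",
    "reposts_count", "comments_count", "attitudes_count", "text", "geo",
]


def write_rows(rows, path: str = "export.csv") -> None:
    """Write dict rows to a CSV file that spreadsheet tools can open (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        # extrasaction="ignore" drops any keys that are not in FIELDS.
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
```

Note that Excel keeps only 15 significant digits for numeric cells, so a 16-digit id may be displayed altered there; other spreadsheet tools may behave similarly.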
Many thanks to the author 👍