
Comments (4)

Wenzhi-Ding avatar Wenzhi-Ding commented on August 10, 2024

Given the JSON data for a Weibo post, such as the example below, which fields do you need exported?

{
    "visible": {
        "type": 0,
        "list_id": 0
    },
    "created_at": "Mon Oct 24 19:50:29 +0800 2011",
    "id": 3372109543844864,
    "idstr": "3372109543844864",
    "mid": "3372109543844864",
    "mblogid": "xufTIg8dW",
    "user": {
        "id": 1659452041,
        "idstr": "1659452041",
        "pc_new": 7,
        "screen_name": "吃菠萝的萝卜皮",
        "profile_image_url": "https://tvax1.sinaimg.cn/crop.0.0.1080.1080.50/62e93a89ly8gm4hum1perj20u00u0whp.jpg?KID=imgbed,tva&Expires=1666369149&ssig=xr6pUjGTfA",
        "profile_url": "/u/1659452041",
        "verified": true,
        "verified_type": 0,
        "domain": "",
        "weihao": "",
        "verified_type_ext": 0,
        "avatar_large": "https://tvax1.sinaimg.cn/crop.0.0.1080.1080.180/62e93a89ly8gm4hum1perj20u00u0whp.jpg?KID=imgbed,tva&Expires=1666369149&ssig=3jaDedbTZR",
        "avatar_hd": "https://tvax1.sinaimg.cn/crop.0.0.1080.1080.1024/62e93a89ly8gm4hum1perj20u00u0whp.jpg?KID=imgbed,tva&Expires=1666369149&ssig=SZSHH8gfYn",
        "follow_me": false,
        "following": false,
        "mbrank": 3,
        "mbtype": 2,
        "planet_video": false,
        "icon_list": []
    },
    "can_edit": false,
    "text_raw": "你要热爱生活,但是没有必要完全体验生活,生命是二叉树,想往前只能不断的作单项选择,舍弃然后得到,你只能看见与你不同的生活,但是如果你妄想得到的话,一般来说后果很糟糕。 ​​​",
    "text": "你要热爱生活,但是没有必要完全体验生活,生命是二叉树,想往前只能不断的作单项选择,舍弃然后得到,你只能看见与你不同的生活,但是如果你妄想得到的话,一般来说后果很糟糕。 ​​​",
    "source": "uc浏览器ios",
    "favorited": false,
    "pic_ids": [],
    "geo": null,
    "pic_num": 0,
    "is_paid": false,
    "mblog_vip_type": 0,
    "number_display_strategy": {
        "apply_scenario_flag": 3,
        "display_text_min_number": 1000000,
        "display_text": "100万+"
    },
    "reposts_count": 0,
    "comments_count": 0,
    "attitudes_count": 0,
    "attitudes_status": 0,
    "isLongText": false,
    "mlevel": 0,
    "content_auth": 0,
    "is_show_bulletin": 0,
    "comment_manage_info": {
        "comment_permission_type": -1,
        "approval_comment_type": 0,
        "comment_sort_type": 0
    },
    "share_repost_type": 0,
    "title": {
        "text": "公开",
        "base_color": 1,
        "icon_url": "http://h5.sinaimg.cn/upload/2015/07/14/34/timeline_title_public.png"
    },
    "mblogtype": 0,
    "showFeedRepost": false,
    "showFeedComment": false,
    "pictureViewerSign": false,
    "showPictureViewer": false,
    "rcList": [],
    "customIcons": [],
    "ok": 1
}

from weibo-crawler.

yixin-mei avatar yixin-mei commented on August 10, 2024

I need id, user_id, screen_name, created_at, reposts_count, comments_count, attitudes_count, text, and geo. Thanks.
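For reference, the requested fields could be pulled out of a record shaped like the JSON above roughly as follows. This is only a sketch, not the repository's export code: `extract_fields`, `parse_created_at`, and `REQUESTED_FIELDS` are hypothetical names, and it assumes that `user_id` and `screen_name` come from the nested `user` object and that `created_at` always uses the format seen in the sample.

```python
from datetime import datetime

# Column order taken from the request above.
REQUESTED_FIELDS = [
    "id", "user_id", "screen_name", "created_at",
    "reposts_count", "comments_count", "attitudes_count", "text", "geo",
]

# Assumed from the sample value "Mon Oct 24 19:50:29 +0800 2011".
WEIBO_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Convert Weibo's timestamp string to ISO 8601; None if it is missing."""
    if not value:
        return None
    return datetime.strptime(value, WEIBO_TIME_FORMAT).isoformat()


def extract_fields(post):
    """Flatten one Weibo post dict into the requested columns."""
    user = post.get("user") or {}  # user_id and screen_name are nested here
    return {
        "id": post.get("id"),
        "user_id": user.get("id"),
        "screen_name": user.get("screen_name"),
        "created_at": parse_created_at(post.get("created_at")),
        "reposts_count": post.get("reposts_count"),
        "comments_count": post.get("comments_count"),
        "attitudes_count": post.get("attitudes_count"),
        "text": post.get("text"),
        "geo": post.get("geo"),  # null in the sample, so this becomes None
    }
```

Using `.get()` throughout means a post that lacks a field yields `None` for that column instead of raising an error.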


Wenzhi-Ding avatar Wenzhi-Ding commented on August 10, 2024

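First, create mids.txt in the repository root and put one 16-digit numeric Weibo ID on each line. Running the following script then starts fetching the JSON for those posts.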

python tool/get_content.py
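Before running the fetch script, it may help to check that mids.txt matches the format described above. The sketch below is a standalone pre-check, not part of weibo-crawler: `parse_mids` and `load_mids` are hypothetical helpers, and skipping blank lines is this sketch's own choice rather than a documented behavior of the tool.

```python
import re
from pathlib import Path

# One Weibo ID per line, exactly 16 ASCII digits, as described above.
MID_PATTERN = re.compile(r"[0-9]{16}")


def parse_mids(text):
    """Return the IDs found in the text, raising ValueError on a malformed line."""
    mids = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        mid = line.strip()
        if not mid:
            continue  # tolerate blank lines (a choice made for this sketch)
        if not MID_PATTERN.fullmatch(mid):
            raise ValueError(f"line {lineno}: expected a 16-digit ID, got {mid!r}")
        mids.append(mid)
    return mids


def load_mids(path="mids.txt"):
    """Read and validate the ID list from a file."""
    return parse_mids(Path(path).read_text(encoding="utf-8"))
```

Calling `load_mids()` from the repository root would raise an error that names the first offending line, for example a 9-character `mblogid` such as `xufTIg8dW` pasted in place of the numeric `id`.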

Once fetching has finished, use the following script to export the posts listed in mids.txt.

python tool/export_json_by_mids.py
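The output format of the export script is not stated in this thread. Purely as an illustration, and assuming a CSV file is wanted, flattened post rows could be written as below. `rows_to_csv` and `write_csv` are hypothetical helpers rather than functions from the repository, and the `utf-8-sig` encoding is a choice made here so that spreadsheet applications are more likely to detect UTF-8 and display Chinese text correctly.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in fieldnames.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def write_csv(path, rows, fieldnames):
    """Write the CSV to disk with a UTF-8 byte order mark."""
    # newline="" prevents a second newline translation on top of the csv module's own.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(rows_to_csv(rows, fieldnames))
```

Fields that contain commas or line breaks, which is common in post text, are quoted automatically by the `csv` module.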

Opened in WPS, the resulting data looks like this:

image

These usage instructions have also been added to the "Advanced Usage" section of the project's home page.


yixin-mei avatar yixin-mei commented on August 10, 2024

Many thanks to the author 👍

