douyincomments's Introduction

Alphaply

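Current focus of study: "artificial stupidity" (a pun on artificial intelligence)
Learning goal: use Python to solve practical, everyday problems
Personal blog: http://blog.a152.top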

douyincomments's Issues

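At 00:42:45 on 27 February 2024, the script sent a GET request to https://www.douyin.com/aweme/v1/web/comment/list/ for video ID <7306797205535378738>. The request carried the full cookie string along with the other request headers, to imitate a browser environment so that the Douyin API could be accessed normally. The server answered with HTTP status code 200, so the request itself succeeded. However, the script found no comment data in the response it received and printed "Found 0 comments".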
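A 200 status with zero comments can mean several different things in the decoded body: the `comments` key may be missing, present but null, or a genuinely empty list. The script in this issue treats all three the same way. As a rough diagnostic sketch (the helper name `describe_comment_payload` and the sample payloads are made up for illustration, and nothing here is confirmed behavior of the Douyin API), the cases can be told apart like this:

```python
from typing import Any


def describe_comment_payload(payload: dict[str, Any]) -> str:
    """Classify a decoded comment-list response (hypothetical helper)."""
    if "comments" not in payload:
        # Listing the keys that did arrive often hints at what the server sent instead.
        return f"no 'comments' key; top-level keys: {sorted(payload)}"
    comments = payload["comments"]
    if comments is None:
        return "'comments' is null"
    if not isinstance(comments, list):
        return f"'comments' has unexpected type {type(comments).__name__}"
    if not comments:
        return "'comments' is an empty list"
    return f"{len(comments)} comments"


# Made-up payloads, only to exercise each branch.
print(describe_comment_payload({"a": 1}))
print(describe_comment_payload({"comments": None, "has_more": 0}))
print(describe_comment_payload({"comments": [{"text": "hi"}]}))
```

Calling it on the result of `response.json()` before counting comments would show which of these cases the request actually hit.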

import httpx
import asyncio
from datetime import datetime
import pandas as pd
import argparse
from typing import Any
import logging

# Initialize the logging module
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')

url = "https://www.douyin.com/aweme/v1/web/comment/list/"
reply_url = url + "reply/"
headers = {
    "authority": "www.douyin.com",
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Cookie": '',  # Fill in the real Cookie here
    "Referer": "https://www.douyin.com/",
    "Sec-Ch-Ua": '"Not A(Brand";v="99", "Google Chrome";v="121", "Chromium";v="121"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": '"Windows"',
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
}

async def get_comments_async(client: httpx.AsyncClient, aweme_id: str, cursor: str = "0", count: str = "50") -> dict[str, Any]:
    params = {
        "device_platform": "webapp",
        "aid": "6383",
        "channel": "channel_pc_web",
        "aweme_id": aweme_id,
        "cursor": cursor,
        "count": count,
    }

    # Log the URL, parameters, and headers of the outgoing request
    # (note that this writes the Cookie header to the log as well)
    logging.info(f"Sending request to '{url}' with params: {params}, headers: {headers}")

    response = await client.get(url, params=params, headers=headers)

    # Log the status code and headers of the response
    logging.info(f"Received response with status code: {response.status_code}, headers: {response.headers}")

    # If needed, also log the full response body (debug mode only)
    if logging.getLogger().isEnabledFor(logging.DEBUG):
        logging.debug(f"Response body: {response.text}")

    comments_dict = response.json()  # Decode the response body into a dict
    return comments_dict

async def fetch_all_comments_async(aweme_id: str) -> list[dict[str, Any]]:
    async with httpx.AsyncClient() as client:
        cursor = 0
        all_comments = []
        has_more = 1
        while has_more:
            response = await get_comments_async(client, aweme_id, cursor=str(cursor))
            # Use .get() to avoid a KeyError; "or []" also covers a null "comments" value
            comments = response.get("comments") or []
            if isinstance(comments, list):
                all_comments.extend(comments)
            has_more = response.get("has_more", False)
            if has_more:
                cursor = response["cursor"]
        return all_comments

# Other function definitions...

def create_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description='Douyin comment and reply scraper')
    parser.add_argument('--aweme_id', type=str, required=True, help='ID of the Douyin video')
    parser.add_argument('--cookies', type=str, help='Cookies for the Douyin website')
    return parser

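async def main():
    # Parse the command-line arguments
    parser = create_parser()
    args = parser.parse_args()
    aweme_id = args.aweme_id
    cookies = args.cookies
    # Only override the Cookie header when --cookies was supplied;
    # a None header value would make httpx raise an error
    if cookies:
        headers['Cookie'] = cookies

    # Fetch all comments
    all_comments = await fetch_all_comments_async(aweme_id)
    print(f"Found {len(all_comments)} comments.")

    # ... the remaining functionality continues below ...

# Run the main function
if __name__ == "__main__":
    asyncio.run(main())
    print('done!')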
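
The `has_more`/`cursor` loop in the script can be tried without touching the network by passing in a stand-in fetch function. The sketch below is only an illustration under assumptions: `collect_pages`, `fake_fetch`, and the two-page data set are hypothetical names and data, and the page cap and stuck-cursor check are additions that the original loop does not have.

```python
import asyncio
from typing import Any, Awaitable, Callable

Page = dict[str, Any]


async def collect_pages(
    fetch_page: Callable[[str], Awaitable[Page]],
    max_pages: int = 100,
) -> list[Any]:
    """Follow has_more/cursor until the end, a page cap, or a stuck cursor."""
    items: list[Any] = []
    cursor = "0"
    for _ in range(max_pages):
        page = await fetch_page(cursor)
        # "or []" treats a missing or null "comments" value as an empty page.
        items.extend(page.get("comments") or [])
        next_cursor = str(page.get("cursor", cursor))
        # Stop when the server reports no more pages or the cursor did not move.
        if not page.get("has_more") or next_cursor == cursor:
            break
        cursor = next_cursor
    return items


async def fake_fetch(cursor: str) -> Page:
    # Made-up two-page data set standing in for the real endpoint.
    pages = {
        "0": {"comments": ["a", "b"], "has_more": 1, "cursor": 2},
        "2": {"comments": ["c"], "has_more": 0, "cursor": 3},
    }
    return pages[cursor]


print(asyncio.run(collect_pages(fake_fetch)))  # prints ['a', 'b', 'c']
```

To use it against the real endpoint, `fetch_page` could be a small wrapper around `get_comments_async` with the client and video ID already bound.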
