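Currently studying: "artificial unintelligence" (a pun on artificial intelligence)
Learning goal: use Python to solve real-life problems
Personal blog: http://blog.a152.top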
alphaply / douyincomments
Fetches Douyin comments and their second-level comments (replies).
import httpx
import asyncio
from datetime import datetime
import pandas as pd
import argparse
from typing import Any
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
url = "https://www.douyin.com/aweme/v1/web/comment/list/"
reply_url = url + "reply/"
headers = {
    "authority": "www.douyin.com",
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Cookie": '',  # fill in your actual Cookie here
    "Referer": "https://www.douyin.com/",
    "Sec-Ch-Ua": '"Not A(Brand";v="99", "Google Chrome";v="121", "Chromium";v="121"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": "Windows",
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
}
async def get_comments_async(
    client: httpx.AsyncClient, aweme_id: str, cursor: str = "0", count: str = "50"
) -> dict[str, Any]:
    params = {
        "device_platform": "webapp",
        "aid": "6383",
        "channel": "channel_pc_web",
        "aweme_id": aweme_id,
        "cursor": cursor,
        "count": count,
    }
    # Log the URL, parameters, and headers of the request about to be sent
    # (note that the headers include the Cookie)
    logging.info(f"Sending request to '{url}' with params: {params}, headers: {headers}")
    response = await client.get(url, params=params, headers=headers)
    # Log the response status code and headers
    logging.info(f"Received response with status code: {response.status_code}, headers: {response.headers}")
    # Optionally log the full response body (only when DEBUG logging is enabled)
    if logging.getLogger().isEnabledFor(logging.DEBUG):
        logging.debug(f"Response body: {response.text}")
    comments_dict = response.json()  # parse the JSON response into a dict
    return comments_dict
async def fetch_all_comments_async(aweme_id: str) -> list[dict[str, Any]]:
    async with httpx.AsyncClient() as client:
        cursor = 0
        all_comments = []
        has_more = 1
        # Keep requesting pages until the API reports that no more remain
        while has_more:
            response = await get_comments_async(client, aweme_id, cursor=str(cursor))
            comments = response.get("comments", [])  # use .get() to avoid a KeyError
            if isinstance(comments, list):
                all_comments.extend(comments)
            has_more = response.get("has_more", False)
            if has_more:
                cursor = response["cursor"]
        return all_comments
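def create_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description='Douyin comment and reply scraper')
    # The script cannot build a request without a video ID, so require it
    parser.add_argument('--aweme_id', type=str, required=True, help='ID of the Douyin video')
    parser.add_argument('--cookies', type=str, help='Cookies for the Douyin website')
    return parser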
async def main():
    global headers
    # Parse command-line arguments
    parser = create_parser()
    args = parser.parse_args()
    aweme_id = args.aweme_id
    cookies = args.cookies
    # Fall back to an empty string so the header value is never None
    headers['Cookie'] = cookies or ''
    # Fetch all comments
    all_comments = await fetch_all_comments_async(aweme_id)
    print(f"Found {len(all_comments)} comments.")
    # ... the remaining features continue below ...


if __name__ == "__main__":
    asyncio.run(main())
    print('done!')
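The script above pages through results with `cursor` and `has_more`, and it imports `datetime` without using it yet. As a rough sketch only, the same pagination loop can be separated from the network call so it can be tried offline, with a cap on the number of pages and a check that the cursor actually advances. The names `collect_pages`, `to_row`, and `max_pages` are hypothetical and do not come from the repository. The comment fields `cid`, `text`, `create_time`, and `digg_count` are assumptions about the response shape and should be checked against a real response.

```python
from datetime import datetime, timezone
from typing import Any, Callable

Page = dict[str, Any]


def collect_pages(fetch_page: Callable[[int], Page], max_pages: int = 100) -> list[dict[str, Any]]:
    """Follow cursor/has_more pagination; fetch_page(cursor) returns one parsed page."""
    cursor = 0
    items: list[dict[str, Any]] = []
    for _ in range(max_pages):  # hard cap so a misbehaving API cannot loop forever
        page = fetch_page(cursor)
        items.extend(page.get("comments") or [])  # field may be missing or null
        if not page.get("has_more"):
            break
        next_cursor = page.get("cursor")
        if next_cursor is None or next_cursor == cursor:
            break  # no progress: stop instead of re-requesting the same page
        cursor = next_cursor
    return items


def to_row(comment: dict[str, Any]) -> dict[str, Any]:
    """Flatten one comment into a table row (field names are assumptions)."""
    created = comment.get("create_time")  # assumed to be a Unix timestamp in seconds
    return {
        "cid": comment.get("cid"),
        "text": comment.get("text", ""),
        "created_at": (
            datetime.fromtimestamp(created, tz=timezone.utc).isoformat()
            if created is not None
            else None
        ),
        "likes": comment.get("digg_count", 0),
    }


# Offline demo with two fake pages standing in for API responses
fake_pages = {
    0: {"comments": [{"cid": "1", "text": "hi", "create_time": 0}], "has_more": 1, "cursor": 20},
    20: {"comments": [{"cid": "2", "text": "yo", "create_time": 60}], "has_more": 0},
}
rows = [to_row(c) for c in collect_pages(lambda cursor: fake_pages[cursor])]
```

The resulting `rows` list could then be passed to `pd.DataFrame(rows)` if pandas is the intended output path, which the unused `import pandas as pd` in the script suggests but does not confirm.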