5ime / video_spider

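Short-video watermark removal: Douyin (抖音), Pipixia (皮皮虾), Huoshan (火山), Weishi (微视), Weibo (微博), Oasis (绿洲), Zuiyou (最右), Qingshipin (轻视频), Kuaishou (快手), Quanmin Xiaoshipin (全民小视频), 巴塞电影, Momo (陌陌), Before避风, Kaiyan (开眼), Vue Vlog, Xiaokaxiu (小咖秀), Pipi Gaoxiao (皮皮搞笑), Quanmin K Ge (全民K歌), Xigua Video (西瓜视频), Doupai (逗拍), Huya (虎牙), 6间房, Pear Video (梨视频), Xinpianchang (新片场), AcFun, Meipai (美拍)...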

Home Page: https://lab.5ime.cn/video

PHP 100.00%
spider video php

video_spider's Introduction

video_spider

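Watermark-free video download is currently supported for 23 platforms. Stars are welcome. When submitting an issue, please include the video link.

Supported platforms

Platform  Status  Platform  Status  Platform  Status  Platform  Status  Platform  Status
Pipixia (皮皮虾)  Douyin (抖音短视频)  Huoshan (火山短视频)  Pipi Gaoxiao (皮皮搞笑)  Quanmin K Ge (全民K歌)
Weishi (微视短视频)  Weibo (微博)  Zuiyou (最右)  vuevlog  Xiaokaxiu (小咖秀)
Qingshipin (轻视频)  Kuaishou (快手短视频)  Quanmin Xiaoshipin (全民小视频)  Momo (陌陌)  Before避风
Xigua Video (西瓜视频)  Doupai (逗拍)  Huya (虎牙)  6间房  Pear Video (梨视频)
Xinpianchang (新片场)  Acfun  Meipai (美拍)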

Request example

Both GET and POST are supported, and the url parameter is required. Prefer POST requests; for GET requests, URL-encode the url value yourself.
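
A minimal sketch of both request styles, written in Python like the repo's demo.py. The endpoint `https://example.com/video/` is a placeholder for your own deployment, the helper names are made up for illustration, and the sketch assumes the endpoint reads `url` from ordinary form data (POST) or the query string (GET).

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the address of your own deployment.
API_BASE = "https://example.com/video/"


def build_post_request(video_url, api_base=API_BASE):
    """Build a POST request that carries the required `url` parameter as form data."""
    body = urllib.parse.urlencode({"url": video_url}).encode("utf-8")
    return urllib.request.Request(api_base, data=body, method="POST")


def build_get_url(video_url, api_base=API_BASE):
    """Build a GET URL; urlencode percent-encodes the share link."""
    return api_base + "?" + urllib.parse.urlencode({"url": video_url})


def fetch(request_or_url, timeout=10):
    """Send the request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch(build_post_request(share_link))` would then return the decoded response, provided the endpoint behaves as assumed.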

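Response data

Because many platforms are supported, the returned fields vary, but title, cover, and url are always present.

Field    Description
author   Video author
avatar   Author avatar
like     Number of likes
time     Publish time
title    Video title
cover    Video cover
url      Watermark-free video URL
sex      Author gender
age      Author age
city     City
uid      Author ID
code     Status code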
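
A minimal sketch of reading such a response on the client side. It assumes the `code` / `msg` / `data` envelope that appears in the issue reports further down this page; the `summarize` helper and the choice of optional fields are illustrative only. The three guaranteed fields are checked explicitly because a response can carry code 200 while every field is null.

```python
import json

# Fields the README says are always present.
REQUIRED = ("title", "cover", "url")


def summarize(response_text):
    """Return the guaranteed fields plus a few optional ones, or raise on failure."""
    payload = json.loads(response_text)
    data = payload.get("data") or {}
    missing = [key for key in REQUIRED if not data.get(key)]
    if payload.get("code") != 200 or missing:
        raise ValueError(f"parse failed, missing fields: {missing}")
    return {
        "title": data["title"],
        "cover": data["cover"],
        "url": data["url"],
        # Optional fields differ per platform, so read them defensively.
        "author": data.get("author"),
        "like": data.get("like"),
    }
```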

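Usage examples

If you are not sure how to call the API, the demo directory contains two basic calling examples:

  • demo.html: change line 98 to your API address
  • demo.py: change line 7 to your API address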

FAQ

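Why does the demo site look different from the files in the demo folder?

Because I rewrote it with Vue (https://github.com/5ime/vue-page).

Special characters in the URL prevent a GET request from passing the correct parameter value

If the parameter value contains characters such as #, & or =, it may not be passed correctly. Use a POST request, or URL-encode the value before sending a GET request.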
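
To see why the encoding matters, the sketch below parses a GET request the way a typical query-string parser would. Both addresses are hypothetical, and `received_url` is a stand-in for whatever the server reads from the `url` parameter.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical share link containing ?, &, = and #.
share_link = "https://example.com/v/123?from=app&id=9#top"

# Appending the link as-is lets & and # cut the value short.
raw = "https://api.example.com/?url=" + share_link
# Percent-encoding every reserved character keeps the value intact.
encoded = "https://api.example.com/?url=" + quote(share_link, safe="")


def received_url(request_url):
    """Return the `url` parameter as a query-string parser would see it."""
    return parse_qs(urlsplit(request_url).query)["url"][0]


print(received_url(raw))      # https://example.com/v/123?from=app
print(received_url(encoded))  # https://example.com/v/123?from=app&id=9#top
```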

Parsing fails for some video platforms

Some platforms require a cookie. Update the cookie manually; if parsing still fails, please submit an issue.

Watermark removal for short-video image galleries

https://github.com/5ime/images_spider

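Douyin (抖音) X-Bogus verification

Currently the service provided by https://github.com/B1gM8c/X-Bogus is used.

You can also deploy your own instance to Vercel in one click from my template https://github.com/5ime/Tiktok_Signature. The required changes are as follows: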

$url = 'https://tiktok.iculture.cc/X-Bogus';
$data = json_encode(array('url' => 'https://www.douyin.com/aweme/v1/web/aweme/detail/?aweme_id=' . $id[0] . '&aid=1128&version_name=23.5.0&device_platform=android&os_version=2333','userAgent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'));
$header = array('Content-Type: application/json');
$url = json_decode($this->curl($url, $data, $header), true)['param'];
// change to
$url = 'your Vercel URL';
$data = json_encode(array('url' => 'https://www.douyin.com/aweme/v1/web/aweme/detail/?aweme_id=' . $id[0] . '&aid=1128&version_name=23.5.0&device_platform=android&os_version=2333','userAgent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'));
$header = array('Content-Type: application/json');
$url = json_decode($this->curl($url, $data, $header), true)['data']['url'];

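Disclaimer

This repository is for learning and research only. If it infringes on the interests of any individual or group, please contact me and I will promptly remove all related material. Thank you!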

video_spider's People

Contributors

5ime, b1gm8c, mtmin, nizerin, yuanzhihai

video_spider's Issues

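I have a few questions.

1. Is the parsing API that gets substituted in the demo supposed to be an already-deployed site? (I replaced it with my site's address and it occasionally reports errors, e.g. https://qsy.jyiwbb.ml/?url=)
2. Is this project purely a web-page parsing program? After deploying it, when I append parameters to my own site's endpoint, I still land on the site's home page with no change.
3. If what I deployed is not a parsing API, how should the parsing APIs for the individual platforms be set up?
Thanks in advance; I hope you can clear up my confusion.

Quanmin video (全民视频) no longer works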


// Quanmin Xiaoshipin (全民小视频)
    public function quanmin($url)
    {
        $client = new Client();
        $html   = $client->get(
            $url,
            [
                'headers' => [
                    'User-Agent'                => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36',
                    'upgrade-insecure-requests' => '1'
                ]
            ]
        )->getBody()->getContents();
        preg_match('/<meta property=\"og:title\" content=\"(.*?)\">/', $html, $title);
        preg_match('/<meta property=\"og:image\" content=\"(.*?)\">/', $html, $cover);
        preg_match('/<meta property=\"og:videosrc\" content=\"(.*?)\">/', $html, $video);
        preg_match('/<div class=\"author-main\"><p class=\"name\">(.*?)<\/p>/', $html, $author);
        return [
            'code' => 200,
            'msg'  => '解析成功',
            'data' => [
                "title"  => $title[1],
                "cover"  => $cover[1],
                "url"    => $video[1],
                'author' => $author[1]
            ]
        ];
    }

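Is Xigua Video (西瓜视频) parsing no longer working?

I tried many different videos and cookies and the result is always the same. Could you please take a look?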

{ "code": 200, "msg": "解析成功", "data": { "author": null, "avatar": "", "like": null, "time": null, "title": null, "cover": null, "url": null, "music": { "url": null } } }

Zuiyou (最右) has a bug

After the fix:

public function zuiyou($url)
    {
        $text = $this->curl($url);
        preg_match('/"urlsrc":"(.*?)"/', $text, $video);
        preg_match('/:<\/span><h1>(.*?)<\/h1><\/div><\/div><div class=\"ImageBoxII\">/', $text, $video_title);
        preg_match('/<img alt=\"\" src=\"(.*?)\/id\/(.*?)\?w=540/', $text, $video_cover);
        $video_url = str_replace('\\', '/', str_replace('u002F', '', $video[1]));
        preg_match('/<span class=\"SharePostCard__name\">(.*?)<\/span>/', $text, $video_author);
        if (!empty($video_url)) {
            return [
                'code' => 200,
                'msg'  => '解析成功',
                'data' => [
                    'author' => $video_author[1],
                    'title'  => $video_title[1],
                    'cover'  => 'https://file.izuiyou.com/img/png/id/' . $video_cover[2] . '/sz/600',
                    'url'    => $video_url,
                ]
            ];
        }
    }

Xigua Video (西瓜视频) parsing

// Xigua Video (西瓜视频)
    public function xigua($url)
    {
        if (strpos($url, 'v.ixigua.com') !== false) {
            $loc = get_headers($url, true)['Location'];
            preg_match('/video\/(.*)\//', $loc, $id);
            $url = 'https://www.ixigua.com/' . $id[1];
        }
        $headers = [
            "User-Agent" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36 ",
            // "cookie"     => "wafid=aa79e1cd-16dc-421b-b94d-b20a5ebfe91c; wafid.sig=D1-hFWUnCB8JJJTV-R1e_Cdx9uI; _ga=GA1.2.1690086969.1589782861; tt_webid=6865999761311827469; __ac_nonce=060860ee900a3c46c139c; __ac_signature=_02B4Z6wo00f01BqANnQAAIDDqnHav4zlIoQaoTLAAGYs20; ttcid=f76179215341423fa72abd8cab05464915; MONITOR_WEB_ID=251a33dc-3fa9-413d-b16f-ea312209737e; ttwid=1|gG2sxNEtarKxMKxuCZDzAK7r5j_05sBv8Nd-5HXNUhY|1619398378|501f3788898a34f2c2a210c982bc82b2a22f0e365b65d7c2ff3ba9ce44be6b50; ixigua-a-s=0;",
            "cookie"     => "MONITOR_WEB_ID=7892c49b-296e-4499-8704-e47c1b150c18; ixigua-a-s=1; ttcid=af99669b6304453480454f150701d5c226; BD_REF=1; __ac_nonce=060d88ff000a75e8d17eb; __ac_signature=_02B4Z6wo00f01kX9ZpgAAIDAKIBBQUIPYT5F2WIAAPG2ad; ttwid=1%7CcIsVF_3vqSIk4XErhPB0H2VaTxT0tdsTMRbMjrJOPN8%7C1624806049%7C08ce7dd6f7d20506a41ba0a331ef96a6505d96731e6ad9f6c8c709f53f227ab1"

        ];
        try {
            $client   = new client();
            $response = $client->get(
                $url,
                [
                    'headers' => $headers
                ]
            );
            $html     = $response->getBody()->getContents();
            preg_match('/<script id=\"SSR_HYDRATED_DATA\">window._SSR_HYDRATED_DATA=(.*?)<\/script>/', $html, $jsondata);
            $data         = json_decode(str_replace('undefined', 'null', $jsondata[1]), 1);
            $result       = $data["anyVideo"]["gidInformation"]["packerData"]["video"];
            $title        = $result["title"];
            $video        = $result["videoResource"]["dash"]["dynamic_video"]["dynamic_video_list"];
            $music        = $result["videoResource"]["dash"]["dynamic_video"]["dynamic_audio_list"];
            $video_url    = $video[3]['main_url'] . $video[3]['backup_url_1'];
            $music_url    = $music[0]['main_url'] . $music[0]['backup_url_1'];
            $wm_video     = $result["videoResource"]["normal"]["video_list"];
            $wm_video_url = $wm_video['video_4']['main_url'] . $wm_video['video_4']['backup_url_1'];
            $author       = $result['user_info']['name'];
            $avatar_url   = $result['user_info']['avatar_url'];
            try {
                // Most videos have no cover, so a check is added here
                $cover = $data["anyVideo"]["gidInformation"]["packerData"]["pSeries"]["firstVideo"]["middle_image"]["url"];
                return [
                    'code' => 200,
                    'msg'  => '解析成功',
                    'data' => [
                        'author'    => $author,
                        'avatar'    => $avatar_url,
                        'like'      => $result['video_like_count'],
                        'time'      => $result['video_publish_time'],
                        'title'     => $title,
                        'cover'     => $cover,
                        'wm_video'  => [
                            'url' => base64_decode($wm_video_url)
                        ],
                        'video_url' => [
                            'url' => base64_decode($video_url)
                        ],
                        'music'     => [
                            'url' => base64_decode($music_url)
                        ]
                    ]
                ];
            } catch (\Exception $e) {
                return [
                    'code' => 200,
                    'msg'  => '解析成功',
                    'data' => [
                        'author'    => $author,
                        'avatar'    => $avatar_url,
                        'like'      => $result['video_like_count'],
                        'time'      => $result['video_publish_time'],
                        'title'     => $title,
                        'wm_video'  => [
                            'url' => base64_decode($wm_video_url)
                        ],
                        'video_url' => [
                            'url' => base64_decode($video_url)
                        ],
                        'music'     => [
                            'url' => base64_decode($music_url)
                        ]
                    ]
                ];
            }
        } catch (\Throwable $e) {
            return json(
                [
                    'code' => 100,
                    'msg'  => '暂无相关数据,请检查相关数据' . $e->getMessage()
                ]
            );
        }
    }

Video URL:
https://www.ixigua.com/6837727489259733518/?app=video_article&timestamp=1602058436&utm_source=copy_link&utm_medium=android&utm_campaign=client_share

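The Kuaishou (快手) regex seems to have a problem; the fix is posted below. Also looking for a way to extract the Kuaishou m4a audio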

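The regex in the open-source PHP code for Kuaishou needs a small change:
preg_match('/srcNoMark":"(.*?)"}/', $text, $video_url);
The } needs to be removed.
Could someone write a regex for the m4a audio? I spent a long time on it and could not get it right.

Without removing the }, what gets extracted is not a clean URL.
The extracted value looks like this:
"url": "https://txmov2.a.yximgs.com/upic/2021/04/26/17/BMjAyMTA0MjYxNzQ1MDJfMTE3NzMyMTc4NF80ODQ4OTk1ODYyMF8xXzM=_b_B4c389db6327921fb6037aa228c25d805.mp4?clientCacheKey=3xv8y6t3uvgpxpi_b.mp4&tt=b&di=31eaddfa&bp=13380\",\"tagShowBottom\":{\"bizId\":\"5xzefnycavcaphu\",\"bannerType\":2,\"usedCount\":\"5\",\"type\":3,\"name\":\"启大大•每晚9点户外的作品原声\",\"postScheme\":\"kwai://post?musicId=5924032058&musicType=9",
After the change, that trailing part no longer appears. Give it a try.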
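
The effect of the trailing } can be reproduced on a small, made-up input. The sketch below uses Python's `re` module only so that it runs on its own; the lazy quantifier `(.*?)` behaves the same way in PHP's `preg_match`. The sample text is a simplified stand-in, not real Kuaishou page data.

```python
import re

# Simplified, hypothetical page fragment: the URL is followed by another object.
text = '{"srcNoMark":"https://example.com/v.mp4?tt=b","tagShowBottom":{"bizId":"x"}}'

ORIGINAL = r'srcNoMark":"(.*?)"}'  # lazy match runs on until the first "}
FIXED = r'srcNoMark":"(.*?)"'      # lazy match stops at the first closing quote


def extract(pattern, source):
    """Return the first capture group, or None when the pattern does not match."""
    match = re.search(pattern, source)
    return match.group(1) if match else None


print(extract(ORIGINAL, text))  # https://example.com/v.mp4?tt=b","tagShowBottom":{"bizId":"x
print(extract(FIXED, text))     # https://example.com/v.mp4?tt=b
```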

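Please standardize the JSON output (the post linked below includes a JSON tutorial)

The JSON that your parser outputs is non-standard and cannot be parsed by the free PHP parsers available online. I hope this can be fixed.

You can take a look at this post:
PHP基础之解析json数据 (PHP Basics: Parsing JSON Data), 余熙钰的博客, CSDN博客
https://blog.csdn.net/yubo_725/article/details/44980899

This is your output:
data{ }
Theirs is:
data[ { } ]

The JSON in that post is: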
{
"name": "zhangsan",
"age": 21,
"sex": "male",
"books": [
{
"name": "Chinese",
"price": "50"
},
{
"name": "History",
"price": "60"
},
{
"name": "Music",
"price": "30"
}
]
}

Yours is:

{
"name": "zhangsan",
"age": 21,
"sex": "male",
"books":
{
"name": "Chinese",
"price": "50"
},
{
"name": "History",
"price": "60"
},
{
"name": "Music",
"price": "30"
}
}

Weibo (微博) parsing no longer works

Weibo parsing has stopped working.
After the fix:

        $client   = new Client();
        $page     = explode('com', $url)[1];
        // Strict comparison, so a match at offset 0 is not mistaken for "not found"
        if (strpos($url, 'show?fid=') !== false) {
            preg_match('/fid=(.*)/', $url, $id);
        } else {
            preg_match('/\d+\:\d+/', $url, $id);
        }
        $headers  = [
            "origin"       => "https://weibo.com",
            "content-type" => "application/x-www-form-urlencoded",
            "page-referer" => $page,
            "referer"      => $url,
            # The cookie is required
            "cookie"       => "SUHB=0BLZaSr2TMJsnD; ALF=1630575988; SUB=_2AkMoI0ICf8NxqwJRmP0cz2PmZIpxyg3EieKef7PZJRMxHRl-yT9jqkwLtRB6A6Ns7c9KCXoCZlg8Wo5TqfsqS3tQRXQE; SINAGLOBAL=8972953107322.951.1621395320195; _s_tentry=-; Apache=5923720387618.461.1624770149518; ULV=1624770149558:2:1:1:5923720387618.461.1624770149518:1621395320299; TC-V-WEIBO-G0=b09171a17b2b5a470c42e2f713edace0; XSRF-TOKEN=MbK5goZ_U1CEHfx5frHzVlZ5",
            "user-agent"   => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36 ",
        ];
        $data     = ['data' => json_encode(['Component_Play_Playinfo' => ['oid' => $id[0]]])];
        $response = $client->post(
            'https://weibo.com/tv/api/component?page=' . $page,
            [
                'headers'     => $headers,
                'form_params' => $data
            ]
        );
        $arr      = json_decode($response->getBody()->getContents(), 1, 512, JSON_THROW_ON_ERROR);
        if (!empty($arr['data']['Component_Play_Playinfo'])) {
            return [
                'code' => 200,
                'msg'  => '解析成功',
                'data' => [
                    'author' => $arr['data']['Component_Play_Playinfo']['author'],
                    'avatar' => $arr['data']['Component_Play_Playinfo']['avatar'],
                    'time'   => $arr['data']['Component_Play_Playinfo']['real_date'],
                    'title'  => $arr['data']['Component_Play_Playinfo']['title'],
                    'cover'  => $arr['data']['Component_Play_Playinfo']['cover_image'],
                    'url'    => $arr['data']['Component_Play_Playinfo']['urls']
                ]
            ];
        }
        return [
            'code' => 100,
            'msg'  => '解析失败',
        ];
    }

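Video URL: https://weibo.com/tv/show/1034:4573084129361948?mid=4573086780751909

Xigua Video (西瓜视频) cookies

2021/3/27: Xigua Video support added. Fill in your Xigua Video cookies at line 426 of video_spider.php; no login is required.

How should this cookie be filled in?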

    $cookies = "MONITOR_WEB_ID=7892c49b-296e-4499-8704-e47c1b150c18; ixigua-a-s=1; ttcid=af99669b6304453480454f150701d5c226; BD_REF=1; __ac_nonce=060d88ff000a75e8d17eb; __ac_signature=_02B4Z6wo00f01kX9ZpgAAIDAKIBBQUIPYT5F2WIAAPG2ad; ttwid=1%7CcIsVF_3vqSIk4XErhPB0H2VaTxT0tdsTMRbMjrJOPN8%7C1624806049%7C08ce7dd6f7d20506a41ba0a331ef96a6505d96731e6ad9f6c8c709f53f227ab1";

        $headers = [
            "cookie:{$cookies}"
        ];

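Writing it this way has no effect and the result comes back empty, so I cannot tell whether the cookie I grabbed is wrong or whether there is some other cause.

Doupai (逗拍) video parsing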

public function doupai($url)
    {
        preg_match('/(http[s]?:\/\/[^\s]+)/', $url, $deal_url);
        preg_match("/topic\/(.*?).html/", $deal_url[1], $d_url);
        $vid      = $d_url[1];
        $base_url = "https://v2.doupai.cc/topic/" . $vid . ".json";
        $client   = new client();
        $response = $client->get($base_url);
        $content  = json_decode($response->getBody()->getContents(), 1);
        $url      = $content["data"]["videoUrl"];
        $title    = $content["data"]["name"];
        $cover    = $content["data"]["imageUrl"];
        $time     = $content['data']['createdAt'];
        $author   = $content['data']['userId'];
        if (!empty($url)) {
            return [
                'code' => 200,
                'msg'  => '解析成功',
                'data' => [
                    "title"     => $title,
                    "cover"     => $cover,
                    'time'      => $time,
                    'author'    => $author['name'],
                    'avatar'    => $author['avatar'],
                    "video_url" => $url
                ]
            ];
        }
    }

doupai('出国证 https://p.doupai.cc/#/topic/5fa20d8c8f71d10031f27abb.html');

Video URL: https://p.doupai.cc/#/topic/5fa20d8c8f71d10031f27abb.html
