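Preface
The text and images in this article come from the internet and are for learning and exchange only; they serve no commercial purpose. If there is any problem, please contact us promptly so we can deal with it.
Bilibili is a well-known Chinese video site with danmaku (bullet comments). It offers the latest anime releases, an ACG atmosphere, and highly creative uploaders (Up主). On this site, each video's data is split into a video (picture) stream and an audio stream.
Today we will download both streams of a Bilibili video and merge them.
Python data analysis: introductory case walkthrough
https://www.bilibili.com/video/BV1LX4y1u7VA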
Environment:
- Python 3.6
- PyCharm
- requests
- re
- json
- subprocess
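Parsing the page
Analyzing the target page
Bilibili serves video and audio separately. The audio URL and the video URL are both in the JSON that follows <script>window.__playinfo__= and ends at </script>.
Extracting the data
1. Use a regular expression to match the data.
2. The regex match returns a list; take the element out by index.
3. Convert the string to JSON data (a Python dict).
4. Use dictionary lookups to extract the video URL and the audio URL.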
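The four steps above can be sketched on a tiny stand-in page. The HTML string, its placeholder URLs, and the helper name extract_playinfo are made up for illustration; a real Bilibili page is far larger and its JSON layout may differ.

```python
import json
import re

# A made-up stand-in for the page source; the URLs are placeholders.
SAMPLE_HTML = (
    '<html><head>'
    '<script>window.__playinfo__={"data": {"dash": {'
    '"video": [{"backupUrl": ["https://example.com/v.m4s"]}], '
    '"audio": [{"backupUrl": ["https://example.com/a.m4s"]}]}}}</script>'
    '</head></html>'
)


def extract_playinfo(html):
    """Return the dict assigned to window.__playinfo__ in the page."""
    # Steps 1 and 2: findall returns a list of matches; take the first one.
    matches = re.findall(
        r"<script>window\.__playinfo__=(.*?)</script>", html, re.S
    )
    if not matches:
        raise ValueError("window.__playinfo__ not found in page")
    # Step 3: turn the JSON string into a Python dict.
    return json.loads(matches[0])


playinfo = extract_playinfo(SAMPLE_HTML)
# Step 4: plain dictionary and list lookups.
video_url = playinfo["data"]["dash"]["video"][0]["backupUrl"][0]
audio_url = playinfo["data"]["dash"]["audio"][0]["backupUrl"][0]
print(video_url)
print(audio_url)
```

Raising an error when nothing matches gives a clearer message than the IndexError that indexing an empty list would produce.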
Crawler code
Importing the tools
import requests
import re  # regular expressions
import pprint
import json
import subprocess
Request headers
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36"
}
Requesting the data
def send_request(url):
    response = requests.get(url=url, headers=headers)
    return response
Parsing the video data
def get_video_data(html_data):
    """Parse the video data."""
    # Extract the video title
    title = re.findall('<span class="tit">(.*?)</span>', html_data)[0]
    # print(title)
    # Extract the JSON data that belongs to the video
    json_data = re.findall('<script>window.__playinfo__=(.*?)</script>', html_data)[0]
    # print(json_data)  # json_data is still a string here
    json_data = json.loads(json_data)
    pprint.pprint(json_data)
    # Extract the audio URL
    audio_url = json_data["data"]["dash"]["audio"][0]["backupUrl"][0]
    print("Parsed audio URL:", audio_url)
    # Extract the video (picture) URL
    video_url = json_data["data"]["dash"]["video"][0]["backupUrl"][0]
    print("Parsed video URL:", video_url)
    video_data = [title, audio_url, video_url]
    return video_data
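If the extracted title is later used to build file names, characters that the file system rejects can make the save step fail. A minimal sketch of a cleanup helper follows; safe_filename is a hypothetical name that is not part of the original script, and the character set covers the printable characters Windows does not allow in file names.

```python
import re


def safe_filename(title, replacement="_"):
    """Replace characters that are not allowed in Windows file names."""
    # \ / : * ? " < > | are rejected by Windows.
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title)
    # Trailing dots and spaces also cause trouble on Windows.
    cleaned = cleaned.strip().rstrip(". ")
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or "untitled"


print(safe_filename('Python: data/analysis "intro"?'))
```

The cleaned title could then be passed on as the file name in place of the raw title.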
Saving the data
def save_data(file_name, audio_url, video_url):
    # Request the data
    print("Requesting audio data")
    audio_data = send_request(audio_url).content
    print("Requesting video data")
    video_data = send_request(video_url).content
    with open(file_name + ".mp3", mode="wb") as f:
        f.write(audio_data)
        print("Saving audio data")
    with open(file_name + ".mp4", mode="wb") as f:
        f.write(video_data)
        print("Saving video data")
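save_data holds each whole file in memory through .content before writing it. For long videos, writing the download in chunks may be gentler on memory. The sketch below shows only the writing half, using the standard library; write_chunks is a hypothetical helper, and the in-memory chunks stand in for a real download.

```python
import os
import tempfile


def write_chunks(chunks, path):
    """Write an iterable of bytes objects to path; return bytes written."""
    total = 0
    with open(path, mode="wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


# Stand-in data: three small chunks instead of a real network response.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
written = write_chunks([b"abc", b"", b"defg"], demo_path)
print(written, os.path.getsize(demo_path))
```

With requests, the chunks could come from requests.get(url, headers=headers, stream=True).iter_content(chunk_size=1024 * 1024), so that only one chunk is held in memory at a time.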
Merging the data
def merge_data(video_name):
    print("Merge started:", video_name)
    # ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac -strict experimental output.mp4
    COMMAND = f"ffmpeg -i {video_name}.mp4 -i {video_name}.mp3 -c:v copy -c:a aac -strict experimental output.mp4"
    # subprocess.run waits for ffmpeg to exit, so the message below is not printed too early
    subprocess.run(COMMAND, shell=True)
    print("Merge finished:", video_name)
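Passing the command as one shell string can break when the video name contains spaces or quotes. One alternative, sketched here, is to build the arguments as a list and run them without a shell. build_merge_command, merge_with_list_args, and the output_name parameter are hypothetical names; the ffmpeg flags are the ones already used in the command above.

```python
import subprocess


def build_merge_command(video_name, output_name="output.mp4"):
    """Build the ffmpeg argument list for merging picture and audio."""
    return [
        "ffmpeg",
        "-i", f"{video_name}.mp4",  # video (picture) stream
        "-i", f"{video_name}.mp3",  # audio stream
        "-c:v", "copy",             # copy the picture without re-encoding
        "-c:a", "aac",              # encode the audio as AAC
        "-strict", "experimental",
        output_name,
    ]


def merge_with_list_args(video_name):
    """Run ffmpeg and wait for it; raises CalledProcessError on failure."""
    subprocess.run(build_merge_command(video_name), check=True)


# Only the command is printed here; ffmpeg itself is not run.
print(build_merge_command("my video"))
```

Because each argument is its own list element, a name such as "my video" reaches ffmpeg as a single file name without any manual quoting.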
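Merging video and audio
This step uses a tool called FFmpeg. FFmpeg is an open-source suite of programs for recording and converting digital audio and video and for turning them into streams.
After downloading it, just unzip it, but you also need to add it to your environment variables.
1. Right-click My Computer and choose Properties.
2. Choose Advanced system settings.
3. Choose Environment Variables.
4. Add the environment variable: copy the path of the unzipped files, choose New, and add it.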
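After setting the environment variable, it can help to confirm that Python is able to find the executable. A minimal check with the standard library is sketched below; ffmpeg_available is a hypothetical helper name.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the executable can be found on PATH."""
    return shutil.which(executable) is not None


if ffmpeg_available():
    print("ffmpeg found at:", shutil.which("ffmpeg"))
else:
    print("ffmpeg is not on PATH; check the environment variable "
          "and reopen the terminal")
```

A terminal or IDE that was already open before the variable was changed usually needs to be restarted before it sees the new PATH.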