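Preface
The text and images in this article come from the internet and are intended only for learning and discussion, not for any commercial use. If there is any problem, please contact us promptly so we can address it.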
Basic environment setup
- Python 3.6
- PyCharm
- requests
- parsel
Install the required modules (requests and parsel) with pip.
Determine the target URL
<code class="language-python">https://www.huya.com/g/2168</code>
Request the page
<code class="language-python">
import requests

url = "https://www.huya.com/g/2168"
# Send a browser-like user-agent so the request looks like a normal visit
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"
}
response = requests.get(url=url, headers=headers)
print(response.text)
</code>
Parse the page data
<code class="language-python">
import parsel

selector = parsel.Selector(response.text)
# Collect the cover image URLs and the matching titles from the live list
urls = selector.css(".live-list .game-live-item a img::attr(data-original)").getall()
titles = selector.css(".live-list .game-live-item a img::attr(title)").getall()
info_data = zip(urls, titles)
for i in info_data:
    # Drop the query string so only the plain image URL remains
    img_url = i[0].split("?")[0]
    title = i[1]
</code>
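The loop above removes the query string with split("?")[0]. As a minimal sketch of the same step using only the standard library, the helper below (strip_query is a hypothetical name, and the sample URL is made up for illustration) parses the URL and rebuilds it without the query string or fragment. It is one possible alternative, not the method used in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return url without its query string or fragment."""
    parts = urlsplit(url)
    # Rebuild the URL from scheme, host and path only
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical cover URL, used only for illustration
sample = "https://example.com/covers/room1.jpg?imageview/4/0/w/338/h/190"

print(strip_query(sample))
print(sample.split("?")[0])
```

For URLs like the sample, both calls give the same result. The parsed version also removes a trailing #fragment, which a plain split on "?" would leave in place when the URL has no query string.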
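Save the data
<code class="language-python">
    # Continues inside the for loop from the parsing step
    img_url_response = requests.get(url=img_url, headers=headers)
    # The target folder must already exist; adjust the path for your machine
    path = "D:\\python\\demo\\虎牙\\img\\" + title + ".jpg"
    with open(path, mode="wb") as f:
        f.write(img_url_response.content)
    print(title)
</code>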
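Stream titles can contain characters that are not valid in file names, and open() fails if the target folder does not exist. The sketch below shows one way to handle both cases with the standard library. The helper names build_image_path and save_image, the "untitled" fallback, and the placeholder bytes are assumptions made for illustration; in the real loop the bytes would come from img_url_response.content.

```python
import re
import tempfile
from pathlib import Path

# Characters that Windows does not allow in file names
_INVALID = re.compile(r'[\\/:*?"<>|]')


def build_image_path(folder, title):
    """Create folder if needed and return a .jpg path with a sanitized name."""
    safe_title = _INVALID.sub("_", title).strip() or "untitled"
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    return target_dir / (safe_title + ".jpg")


def save_image(folder, title, content):
    """Write raw image bytes to disk and return the path that was written."""
    path = build_image_path(folder, title)
    path.write_bytes(content)
    return path


# Usage with a temporary folder and placeholder bytes instead of a real download
demo_dir = tempfile.mkdtemp()
saved = save_image(demo_dir, "LOL: ranked/solo?", b"\xff\xd8\xff")
print(saved.name)
```

Replacing invalid characters with an underscore keeps the title readable. Note that two different titles can map to the same file name, in which case the later file overwrites the earlier one.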