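The text and images in this article come from the internet and are for study and discussion only, with no commercial use. Copyright belongs to the original authors; if there is any problem, please contact us promptly so we can handle it.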
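I. Introduction
I hear that many of you enjoy Honor of Kings (王者荣耀), a MOBA game. In this walkthrough we scrape the skin images of every hero in the game so you can browse them all.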
II. Course highlights
1. Analyzing how the URL is constructed
2. Extracting the data fields
3. Recording the program's running time
III. Libraries used
import requests  # third-party module
import time  # time module
import pprint  # pretty-printing module
IV. Environment
Python 3.6, PyCharm, requests
V. Locating the data URL:
# Record the program's start time (timestamp)
start_time = time.time()

# Locate the data URL
url = "https://pvp.qq.com/web201605/js/herolist.json"
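The start timestamp recorded above is only useful once it is subtracted from a second reading taken when the work is done. A minimal sketch of that pattern follows; `timed` is a hypothetical helper introduced here for illustration, and it uses `time.perf_counter()` rather than the article's `time.time()`, since `perf_counter` is intended for measuring short intervals.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()            # reading before the work
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start  # reading after, minus the start
    return result, elapsed


# Usage: time a cheap built-in call
result, seconds = timed(sum, range(1000))
print(result, f"{seconds:.6f} s")
```

Wrapping the measurement in a function keeps the start and end readings next to each other, which makes it harder to forget one of them.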
VI. Sending the network request
# Send the network request
response = requests.get(url=url)
json_data = response.json()
# pprint.pprint(json_data)
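The response is treated as a list of hero records, from which this article reads the `ename`, `cname` and `skin_name` fields. The sketch below shows one way to pull those fields out of a single record while tolerating a missing `skin_name`. The function `parse_hero` and the sample records are made up for illustration; they are not real output from the endpoint.

```python
def parse_hero(record):
    """Return (ename, cname, skin_names) for one hero record.

    skin_names is an empty list when the record has no "skin_name" field.
    """
    ename = record["ename"]            # hero id
    cname = record["cname"]            # hero name
    raw = record.get("skin_name", "")  # skin names separated by "|"
    skin_names = [name for name in raw.split("|") if name]
    return ename, cname, skin_names


# Hypothetical sample records (illustrative only, not real API data)
sample = [
    {"ename": 106, "cname": "HeroA", "skin_name": "SkinOne|SkinTwo"},
    {"ename": 107, "cname": "HeroB"},  # no skin_name field
]

for rec in sample:
    print(parse_hero(rec))
```

Returning an empty list for a missing field means a later loop over the skins simply does nothing for that hero, instead of raising an error or reusing stale data.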
VII. Complete code (data extraction and download):
import os  # needed for the output directory below

# Assumption: images are saved into a "pic" directory; the original path
# ("pic" + cname) appears to have lost its path separator
os.makedirs("pic", exist_ok=True)

# Extract the fields: hero id (ename), hero name (cname), skin names (skin_name)
for data in json_data:
    cname = data["cname"]  # hero name
    ename = data["ename"]  # hero id (ename)
    try:
        skin_name = data["skin_name"].split("|")  # list of skin names
    except KeyError:
        # Skip heroes without a skin_name field; a bare "pass" here would
        # reuse the previous hero's skin list
        continue
    # print(cname, ename, skin_name)

    # Loop over this hero's skins. Example URL and its pattern:
    # http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/106/106-bigskin-7.jpg
    # http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/ + hero id + "/" + hero id + -bigskin- + skin number + ".jpg"
    for skin_num in range(1, len(skin_name) + 1):
        skin_url = ("http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/"
                    + str(ename) + "/" + str(ename) + "-bigskin-" + str(skin_num) + ".jpg")
        # print(skin_url)

        # Request the data for each image URL
        skin_data = requests.get(skin_url).content

        # Save the image: hero name + skin name + file extension
        file_name = cname + "-" + skin_name[skin_num - 1] + ".jpg"
        with open(os.path.join("pic", file_name), mode="wb") as f:
            f.write(skin_data)
        print("Saved:", cname + "-" + skin_name[skin_num - 1])

all_time = time.time() - start_time
print("Total time (seconds): ", all_time)
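The URL pattern noted in the code above (base path, hero id, "/", hero id, "-bigskin-", skin number, ".jpg") can be isolated into a small function and checked without any network access. This is only a sketch: `build_skin_urls` and `BASE_URL` are names introduced here, and the placeholder skin names are invented.

```python
BASE_URL = "http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/"


def build_skin_urls(ename, skin_names):
    """Pair each skin name with its image URL; skin numbers start at 1."""
    return [
        (name, f"{BASE_URL}{ename}/{ename}-bigskin-{num}.jpg")
        for num, name in enumerate(skin_names, start=1)
    ]


# Usage with placeholder skin names for hero id 106
pairs = build_skin_urls(106, ["s1", "s2", "s3", "s4", "s5", "s6", "s7"])
for name, url in pairs:
    print(name, url)
```

With seven skins, the last URL ends in `106/106-bigskin-7.jpg`, which matches the example URL quoted in the article. Using `enumerate(..., start=1)` keeps the skin name and its number together, so no `skin_num - 1` indexing is needed.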
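Conclusion
Web scraping is a lot of fun: the results are immediate and visual, and finishing a scraper is satisfying. Powerful as scrapers are, never use them to collect private information.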