
Which videos trend on Bilibili each day? Scraping Bilibili's trending videos

python · 搞java代码 · 2022-05-21

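The text and images in this article come from the internet and are provided for learning and discussion only, with no commercial purpose. Copyright belongs to the original author; if there is any problem, please contact us promptly so we can address it.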

 

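This article shows how to use Python to scrape Bilibili's trending videos and export them to Excel. The example code fetches the ranking page, extracts each video's fields with XPath, and writes them to a spreadsheet, so it may serve as a reference for study or work.

The code is as follows: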

import os

import requests
import xlwt
from lxml import etree


# Scrape information about Bilibili's trending videos.
# Note: the URL and class names below match the ranking page as it was when
# this article was written; they may need updating if the page has changed.
def spider():
    video_list = []
    url = "https://www.bilibili.com/ranking?spm_id_from=333.851.b_7072696d61727950616765546162.3"
    html = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}).text
    html = etree.HTML(html)
    infolist = html.xpath('//li[@class="rank-item"]')
    for item in infolist:
        rank = "".join(item.xpath('./div[@class="num"]/text()'))
        video_link = "".join(item.xpath('.//div[@class="info"]/a/@href'))
        title = "".join(item.xpath('.//div[@class="info"]/a/text()'))
        # The detail spans hold the play count and the comment count back to
        # back. This split assumes the play count is always shown in units of
        # 万 (ten thousand); otherwise payinfo[1] raises IndexError.
        payinfo = "".join(item.xpath('.//div[@class="detail"]/span/text()')).split("万")
        play = payinfo[0] + "万"
        comment = payinfo[1]
        if not comment.isdigit():
            # The comment count was also given in 万; restore the unit.
            comment += "万"
        upname = "".join(item.xpath('.//div[@class="detail"]/a/span/text()'))
        # Drop any leading slashes from the href before adding the scheme.
        uplink = "http://" + "".join(item.xpath('.//div[@class="detail"]/a/@href')).lstrip("/")
        hot = "".join(item.xpath('.//div[@class="pts"]/div/text()'))
        video_list.append({
            "rank": rank,
            "videolink": video_link,
            "title": title,
            "play": play,
            "comment": comment,
            "upname": upname,
            "uplink": uplink,
            "hot": hot
        })
    return video_list


def write_Excel():
    # Write the scraped information to an Excel file.
    video_list = spider()
    workbook = xlwt.Workbook()  # create the workbook
    sheet = workbook.add_sheet("b站热门视频")  # sheet name: "Bilibili trending videos"
    xstyle = xlwt.XFStyle()  # cell style object
    xstyle.alignment.horz = 0x02  # center horizontally
    xstyle.alignment.vert = 0x01  # center vertically
    # Headers: title, uploader, rank, score, plays, comments
    head = ["视频名", "up主", "排名", "热度", "播放量", "评论数"]
    for h in range(len(head)):
        sheet.write(0, h, head[h], xstyle)
    i = 1
    for item in video_list:
        # A double quote inside the title would break the formula string,
        # so keep only the text between the first pair of quotes.
        if '"' in item["title"]:
            item["title"] = item["title"].split('"')[1]
        # Put the video's hyperlink in the title cell.
        title_data = 'HYPERLINK("' + item["videolink"] + '";"' + item["title"] + '")'
        sheet.col(0).width = int(256 * len(title_data) * 3 / 5)  # column width, from the current row
        sheet.write(i, 0, xlwt.Formula(title_data), xstyle)
        name_data = 'HYPERLINK("' + item["uplink"] + '";"' + item["upname"] + '")'
        sheet.col(1).width = int(256 * len(name_data) * 3 / 5)
        sheet.write(i, 1, xlwt.Formula(name_data), xstyle)
        sheet.write(i, 2, item["rank"], xstyle)
        sheet.write(i, 3, item["hot"], xstyle)
        sheet.write(i, 4, item["play"], xstyle)
        sheet.write(i, 5, item["comment"], xstyle)
        i += 1
    # If the output file already exists, delete it first.
    file = "b站热门视频信息.xls"
    if os.path.exists(file):
        os.remove(file)
    workbook.save(file)


if __name__ == "__main__":
    write_Excel()
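
The script above stores play and comment counts as display strings such as "123.4万". If numeric values are wanted, for sorting or charting in Excel, a conversion step could look like the sketch below. The helper name parse_count and the UNITS table are hypothetical and not part of the original script; the sketch assumes a count is either a plain integer or a decimal followed by 万 (ten thousand) or 亿 (one hundred million).

```python
from decimal import Decimal

# Hypothetical helper, not part of the original script.
# Assumed display units: 万 = 10,000 and 亿 = 100,000,000.
UNITS = {"万": 10_000, "亿": 100_000_000}


def parse_count(text):
    """Convert a display count such as '123.4万' or '5678' to an int."""
    text = text.strip()
    if not text:
        raise ValueError("empty count")
    unit = UNITS.get(text[-1])
    if unit is None:
        # No unit suffix: the text is expected to be a plain integer.
        return int(text)
    # Decimal avoids float rounding on values like 123.4 * 10000.
    return int(Decimal(text[:-1]) * unit)


if __name__ == "__main__":
    print(parse_count("123.4万"))  # 1234000
    print(parse_count("5678"))     # 5678
```

Writing the integer instead of the string to the sheet would let Excel treat the column as numbers, at the cost of losing the rounded display form shown on the page.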


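Result: running the script saves b站热门视频信息.xls, a sheet with one row per ranked video (title and uploader as clickable links, followed by rank, score, plays, and comments).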
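
The title and uploader cells are written as HYPERLINK formulas, and the script works around double quotes in a title by keeping only the text between the first pair of quotes. An alternative that keeps the whole title is to double each embedded quote, which is how Excel formula syntax escapes quotes inside a string literal. The helper below is a hypothetical sketch, not the author's code, and it is an assumption to verify that xlwt's formula parser accepts doubled quotes the same way Excel does.

```python
def hyperlink_formula(url, label):
    """Build a HYPERLINK formula string, doubling any embedded quotes.

    Hypothetical helper, not part of the original script. It keeps the
    original script's ';' argument separator.
    """
    def quote(s):
        # Excel string literals escape a double quote by doubling it.
        return '"' + s.replace('"', '""') + '"'

    return "HYPERLINK(" + quote(url) + ";" + quote(label) + ")"


if __name__ == "__main__":
    print(hyperlink_formula("https://example.com/v/1", 'A "quoted" title'))
```

The returned string could be passed to xlwt.Formula(...) in place of the hand-concatenated title_data and name_data strings.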

