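Preface
The text and images in this article come from the internet and are intended only for study and discussion, not for any commercial use. If there is a problem, please contact us promptly so that we can address it.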
Project goal
Scrape the Mipang (米胖) weather site and save the real-time data.
Target URL
https://weather.mipang.com/
The code
Import the tools
import requests
import parsel
import csv
import time
Request the site
page = 1  # placeholder value; the original does not show how page is set
url = "https://weather.mipang.com/changsha/{}yuefen.html".format(page)
# Send a browser-like User-Agent header with the request
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"
}
response = requests.get(url=url, headers=headers)
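The snippet above fetches a single page. As a minimal sketch, assuming the `{}` placeholder takes a month number from 1 to 12 (suggested by `yuefen`, "month", in the path, but not confirmed by the source), the URLs for a full year could be built as follows. `month_urls` and `URL_TEMPLATE` are hypothetical names introduced for illustration.

```python
URL_TEMPLATE = "https://weather.mipang.com/changsha/{}yuefen.html"


def month_urls(months=range(1, 13), template=URL_TEMPLATE):
    """Return one URL per month number.

    Assumption: the placeholder in the template is a month number (1-12).
    """
    return [template.format(month) for month in months]


if __name__ == "__main__":
    for url in month_urls():
        print(url)
```

Each URL could then be passed to `requests.get` with the same headers, with a short `time.sleep` between requests to keep the load on the site low.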
Scrape the site data
# csv_writer is the csv.DictWriter from the save step; it must be created before this loop runs.
selector = parsel.Selector(response.text)
trs = selector.css(".tb tr")
for tr in trs:
    dit = {}
    date = tr.css("td:nth-child(1)::text").get()
    dit["日期"] = date  # date
    max_temperature = tr.css("td:nth-child(2)::text").get()
    dit["最高温度"] = max_temperature  # maximum temperature
    min_temperature = tr.css("td:nth-child(3)::text").get()
    dit["最低温度"] = min_temperature  # minimum temperature
    weather = tr.css("td:nth-child(4)::text").get()
    dit["天气"] = weather  # weather
    wind = tr.css("td:nth-child(5)::text").get()
    dit["风向"] = wind  # wind direction
    wind_power = tr.css("td:nth-child(6)::text").get()
    dit["风力"] = wind_power  # wind force
    # .get() returns None when nothing matches, so rows without a date cell are skipped
    if dit["日期"] is not None:
        csv_writer.writerow(dit)
        print(dit)
    else:
        print(None)
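The six selector calls in the loop differ only in the column index and the dictionary key, so the mapping can be written once. The sketch below uses only the standard library and works on a plain list of cell texts. `row_to_record` and `FIELDNAMES` are hypothetical names, the sample values are made up, and the code assumes the six columns appear in the order used above.

```python
# date, max temperature, min temperature, weather, wind direction, wind force
FIELDNAMES = ["日期", "最高温度", "最低温度", "天气", "风向", "风力"]


def row_to_record(cells):
    """Map one table row's cell texts to a record, or None for rows to skip.

    cells: list of str-or-None, one entry per column, in table order.
    """
    if not cells or cells[0] is None:
        return None  # no date cell: treat as a header or empty row
    # Truncate long rows and pad short ones so every record has all six keys.
    padded = list(cells[:len(FIELDNAMES)])
    padded += [None] * (len(FIELDNAMES) - len(padded))
    return dict(zip(FIELDNAMES, padded))


if __name__ == "__main__":
    # Made-up sample row, for illustration only.
    print(row_to_record(["2020-05-01", "28", "19", "多云", "南风", "2级"]))
    print(row_to_record([]))
```

Inside the loop, the list of cells could be built with `[tr.css("td:nth-child({})::text".format(i)).get() for i in range(1, 7)]`, which keeps the same selectors as the code above.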
Save the data
# Run this before the scraping loop so that csv_writer exists when rows are written.
f = open("天气.csv", mode="a", encoding="utf-8-sig", newline="")  # 天气 = weather
csv_writer = csv.DictWriter(f, fieldnames=["日期", "最高温度", "最低温度", "天气", "风向", "风力"])
csv_writer.writeheader()
# Call f.close() only after the scraping loop has written all rows.
f.close()
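Because the file is opened in append mode, `writeheader()` adds another header line on every run, and the file has to stay open until the last row is written. A minimal sketch of one way to handle both, assuming the rows are first collected into a list of dicts. `append_rows` is a hypothetical helper, not part of the original script.

```python
import csv
import os

# date, max temperature, min temperature, weather, wind direction, wind force
FIELDNAMES = ["日期", "最高温度", "最低温度", "天气", "风向", "风力"]


def append_rows(path, rows, fieldnames=FIELDNAMES):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # The with-block closes the file after all rows have been written.
    with open(path, mode="a", encoding="utf-8-sig", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if need_header:
            writer.writeheader()
        writer.writerows(rows)
```

A call such as `append_rows("天气.csv", records)` after the scraping loop would then replace the separate open, write, and close steps.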
Run the code; the result is shown in the figure below.