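Hello, everyone! Today I'm bringing you a very practical tip: how to save scraped data to Excel.
First, the libraries commonly used for working with Excel files: xlrd, xlwt, xlwings, openpyxl, xlsxwriter, and many more. The one I use to save data to Excel is openpyxl.
openpyxl is a library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.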
Installing openpyxl
Press Windows + R, open cmd, and enter the following command:
pip install openpyxl
Next, here is how to create an Excel file:
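import openpyxl

# Create a new workbook
wb = openpyxl.Workbook()
# Use the active worksheet that the workbook provides by default
sheet1 = wb.active
# Write a row into sheet1. append() takes a list; the first appended row
# lands in row 1 of the sheet, so it serves as the table header
sheet1.append(['Name', 'Gender'])  # and so on
# create_sheet() adds a new sheet, placed last by default
sheet2 = wb.create_sheet('title')
# Rename the sheet
sheet2.title = 'new sheet'
# Set the tab color
sheet1.sheet_properties.tabColor = '000000'
# Save the workbook
wb.save('filename.xlsx')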
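To check that the rows really end up where expected, one option is to save the workbook and load it back. The following is only a minimal sketch: it writes to an in-memory buffer instead of a file, and the sample row ('Alice', 'F') is made up for illustration.

```python
from io import BytesIO

import openpyxl

# Build a small workbook the same way as above.
wb = openpyxl.Workbook()
sheet1 = wb.active
sheet1.append(['Name', 'Gender'])  # header row
sheet1.append(['Alice', 'F'])      # made-up sample row

# Save to an in-memory buffer instead of a file on disk.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Load the workbook back and collect each row as a tuple of cell values.
loaded = openpyxl.load_workbook(buffer)
rows = list(loaded.active.iter_rows(values_only=True))
print(rows)
```

Passing a file name to wb.save() and load_workbook() instead of the buffer works the same way when the file should be kept on disk.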
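For example, I scraped a dating site, 我主良缘.
First, we need to analyze its page URL.
The information we need sits under list, and list in turn sits under data, so we can walk through it with a for loop. The code is as follows: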
# json here is the parsed response data, not the json module
for item in json['data']['list']:
    username = item['username']
    gender = item['gender']
    userid = item['userid']
    province = item['province']
    height = item['height']
    city = item['city']
    astro = item['astro']
    birthdayyear = item['birthdayyear']
    salary = item['salary']
    avatar = item['avatar']
    monolog = item['monolog']
    print("ID:"+userid, "Name:"+username, "Gender:"+gender, "Province:"+province, "City:"+city, "Birth year:"+birthdayyear, "Height:"+height, "Salary:"+salary, "Photo:"+avatar, "Zodiac sign:"+astro, "Monologue:"+monolog)
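If some entries lack a field, indexing with item['key'] raises a KeyError. A small sketch of the same traversal with a fallback value is shown below. The payload is a hypothetical sample that only mimics the data/list nesting described above, and extract() is a helper name introduced here for illustration.

```python
# Hypothetical sample shaped like the response described above.
payload = {
    "data": {
        "list": [
            {"userid": 1001, "username": "Alice", "city": "Beijing"},
            {"userid": 1002, "username": "Bob"},  # no "city" key
        ]
    }
}


def extract(item, keys, default=""):
    """Return the values for keys, using default for any missing key."""
    return [item.get(key, default) for key in keys]


keys = ["userid", "username", "city"]
rows = [extract(item, keys) for item in payload["data"]["list"]]
for row in rows:
    print(row)
```

Using dict.get() keeps one incomplete record from stopping the whole loop, at the cost of leaving a blank in that record's row.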
Since we want to save the information to Excel, the loop above has to go inside the code that creates the Excel sheet.
The complete code is as follows:
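import openpyxl

# get_data() and the search parameters (startage, endage, gender, startheight,
# endheight, salary) are assumed to be defined earlier in the script.

# Create a new workbook
wb = openpyxl.Workbook()
# Use the active worksheet
sheet1 = wb.active
# Write the header row into sheet1
sheet1.append(['ID', 'Name', 'Gender', 'Province', 'City', 'Birth year', 'Height (cm)', 'Salary', 'Photo', 'Zodiac sign', 'Monologue'])
for page in range(1, 10):  # pages 1 through 9; use range(1, 11) to include page 10
    # Fetch the data the server returns for the user's search criteria
    json = get_data(page, startage, endage, gender, startheight, endheight, salary)
    # print(json['data']['list'])
    for item in json['data']['list']:
        username = item['username']
        # Named user_gender and user_salary so they do not overwrite the
        # gender and salary search parameters passed to get_data()
        user_gender = item['gender']
        userid = item['userid']
        province = item['province']
        height = item['height']
        city = item['city']
        astro = item['astro']
        birthdayyear = item['birthdayyear']
        user_salary = item['salary']
        avatar = item['avatar']
        monolog = item['monolog']
        print("ID:"+userid, "Name:"+username, "Gender:"+user_gender, "Province:"+province, "City:"+city, "Birth year:"+birthdayyear, "Height:"+height, "Salary:"+user_salary, "Photo:"+avatar, "Zodiac sign:"+astro, "Monologue:"+monolog)
        print('Writing to Excel, please wait...', end='')
        xx_info = [userid, username, user_gender, province, city, birthdayyear, height, user_salary, avatar, astro, monolog]
        sheet1.append(xx_info)
        print('written successfully\n')
# Save the workbook
wb.save('相亲网站数据抓取.xlsx')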
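With eleven columns it is easy for the header row and the data rows to drift out of order. One possible arrangement, sketched below, drives both from a single list of (key, header) pairs. Everything here is an assumption made for illustration: fake_get_data() is a hypothetical stand-in that returns canned data without any network request, write_pages() is a helper name invented here, and only four of the fields are shown.

```python
from io import BytesIO

import openpyxl

# (response key, column header) pairs; the keys are ones used in the code above.
FIELDS = [
    ("userid", "ID"),
    ("username", "Name"),
    ("gender", "Gender"),
    ("city", "City"),
]


def fake_get_data(page):
    """Hypothetical stand-in for get_data(): canned data, no network access."""
    return {"data": {"list": [
        {"userid": page * 10 + 1, "username": f"user{page}a",
         "gender": "F", "city": "Beijing"},
        {"userid": page * 10 + 2, "username": f"user{page}b", "gender": "M"},
    ]}}


def write_pages(sheet, pages, fetch):
    """Append a header row, then one row per item; return the item count."""
    sheet.append([header for _, header in FIELDS])
    count = 0
    for page in pages:
        for item in fetch(page)["data"]["list"]:
            sheet.append([item.get(key, "") for key, _ in FIELDS])
            count += 1
    return count


wb = openpyxl.Workbook()
sheet1 = wb.active
written = write_pages(sheet1, range(1, 3), fake_get_data)
wb.save(BytesIO())  # pass a file name such as 'output.xlsx' to write to disk
print(written, sheet1.max_row)
```

Because the loop never assigns to names such as gender or salary, the search parameters used for the next page stay untouched.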
All right, that's all for today.