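For a competitive analysis, I wanted to start by scraping user reviews of a few apps from an app store as raw material. This time the target is the 360 Mobile Assistant (360手机助手) website; the scraped user review files for Baidu Maps and Amap (高德地图) are attached at the end.
Page link: http://zhushou.360/detail/index/soft_id/7655?recrefer=SE_D_%E7%99%BE%E5%BA%A6%E5%9C%B0%E5%9B%BE#nogo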
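Taking neutral reviews as the example: open the F12 developer tools and click "查看更多评论" (View more comments). A getComments request URL shows up (see below). Looking at its parameters:
start is the index of the first review to return; count is the number of reviews loaded per request (in my tests it can be raised to at most count = 50); type takes three values, best, good, and bad, which stand for positive, neutral, and negative reviews; level corresponds to type and takes the values 1, 2, and 3 respectively. The remaining parameters do not affect the data returned.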
https://comment.mobilem.360/comment/getComments?callback=jQuery1720035670320680676326_1571120471046&baike=%E7%99%BE%E5%BA%A6%E6%89%8B%E6%9C%BA%E5%9C%B0%E5%9B%BE+for+android&c=message&a=getmessage&start=10&count=10&type=good&level=2&_=1571120557227
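To make the parameter breakdown concrete, here is a minimal sketch that assembles the same kind of URL with the standard library. The names `build_comment_url` and `LEVEL_BY_TYPE` are illustrative and not part of the site; the endpoint is copied as it appears above, and `callback` and `_` are left out on the assumption, per the note above, that they do not affect the data returned.

```python
from urllib.parse import urlencode

# Endpoint copied exactly as it appears in the article.
BASE_URL = "https://comment.mobilem.360/comment/getComments"

# type -> level pairs as described above: best=1, good=2, bad=3.
LEVEL_BY_TYPE = {"best": 1, "good": 2, "bad": 3}


def build_comment_url(baike, review_type="good", start=0, count=50):
    """Return a getComments URL for one page of reviews (hypothetical helper)."""
    if review_type not in LEVEL_BY_TYPE:
        raise ValueError("review_type must be one of: best, good, bad")
    if not 1 <= count <= 50:
        # 50 is the largest value the article reports as working.
        raise ValueError("count must be between 1 and 50")
    params = {
        "baike": baike,
        "c": "message",
        "a": "getmessage",
        "start": start,
        "count": count,
        "type": review_type,
        "level": LEVEL_BY_TYPE[review_type],
    }
    # urlencode percent-encodes the UTF-8 bytes and turns spaces into "+".
    return BASE_URL + "?" + urlencode(params)
```

Calling `build_comment_url("百度手机地图 for android", "good", start=10, count=10)` gives the same query string as the captured URL, minus `callback` and `_`.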
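The response contains the app version, review time, score, review text, and so on. With a small change (stripping the jQuery callback wrapper around it) it can be parsed as JSON to pull out the fields we want: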
try{jQuery1720035670320680676326_1571120471046({"errno":0,"error":"","data":{"total":"1550","messages":[{"likes":"9","replies":"0","weight":"0","create_time":"2019-07-27 20:57:03","version_name":"10.17.2","score":"2","text_score":"0","m_type":"0","puid":"0","pid":"0","support_type":"0","content":"\u4e00\u8d77\u5f88\u597d\u7528\u7684\uff0c\u73b0\u5728\u5bfc\u822a\u8def\u7ebf\u90fd\u4e0d\u52a8\u4e86\uff0cGPS\u4fe1\u53f7\u5dee\uff01","imgs":"","username":"\u514b\u4ec0\u7c73\u5c14\u56fd\u738b","image_url":"http:\/\/p1.qhmsg\/dm\/50_50_100\/t01be171c8c069b324b.jpg","msgid":"58675464","type":"good","qid":"223943267","isadmin":"","liked":"0"}
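One way to do that "small change" is sketched below: skip past the callback's opening parenthesis and let `json.JSONDecoder.raw_decode` read exactly one JSON value, ignoring whatever follows it. The helper names `parse_jsonp` and `extract_reviews` are illustrative. `SAMPLE` is an abbreviated copy of the response above, and its closing `);}catch(e){}` is an assumption, since the captured response is cut off before the end.

```python
import json

# Abbreviated sample modelled on the response above. The trailing
# ");}catch(e){}" is assumed; the captured response is truncated.
SAMPLE = (
    'try{jQuery1720035670320680676326_1571120471046({"errno":0,"error":"",'
    '"data":{"total":"1550","messages":[{"create_time":"2019-07-27 20:57:03",'
    '"version_name":"10.17.2","score":"2",'
    '"content":"GPS\\u4fe1\\u53f7\\u5dee\\uff01"}]}});}catch(e){}'
)


def parse_jsonp(text):
    """Return the JSON object wrapped inside a callback(...) response."""
    paren = text.index("(")          # end of the callback name
    start = text.index("{", paren)   # first brace of the JSON payload
    # raw_decode parses one JSON value and ignores the trailing wrapper.
    payload, _end = json.JSONDecoder().raw_decode(text, start)
    return payload


def extract_reviews(payload):
    """Keep only the fields mentioned in the article for each review."""
    if payload.get("errno") != 0:
        raise RuntimeError("server reported an error: %r" % payload.get("error"))
    wanted = ("create_time", "version_name", "score", "content")
    return [
        {key: message.get(key) for key in wanted}
        for message in payload["data"]["messages"]
    ]


if __name__ == "__main__":
    for review in extract_reviews(parse_jsonp(SAMPLE)):
        print(review)
```

Note that `total` and `score` arrive as strings in the response, so convert them with `int()` before doing arithmetic on them.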
The next step is to take that URL, vary its parameters to crawl the reviews, and save the results to a local file:
import requests
import re
import json
import time
headers = {
"Accept":

Article title: Python爬取360手机助手评论——以百度地图为例 (Scraping 360 Mobile Assistant reviews with Python, using Baidu Maps as the example). Contributed by a site user; the views expressed are the author's own. To repost, contact the author and credit the source: https://it.en369.cn/jiaocheng/1763747381a2959890.html


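Coming back to the crawl-and-save step described earlier, a minimal sketch of the paging loop might look like the following. Everything here is an assumption rather than the author's code: `fetch_page` is a hypothetical callable that returns the list of reviews for a given `start` and `count`, `start` is assumed to begin at 0, and the JSON Lines output format is simply one convenient choice.

```python
import json
import time


def crawl_reviews(fetch_page, total, count=50, delay=0.0):
    """Collect reviews page by page.

    fetch_page(start, count) is a hypothetical callable returning a list of
    review dicts; total is the review count reported by the server.
    """
    reviews = []
    for start in range(0, total, count):
        page = fetch_page(start, count)
        if not page:
            break  # the server returned fewer reviews than advertised
        reviews.extend(page)
        if delay > 0:
            time.sleep(delay)  # pause between requests to go easy on the site
    return reviews


def save_reviews(reviews, path):
    """Write one JSON object per line, keeping Chinese text readable."""
    with open(path, "w", encoding="utf-8") as handle:
        for review in reviews:
            handle.write(json.dumps(review, ensure_ascii=False) + "\n")
```

In a real run, `fetch_page` could wrap `requests.get(url, params=..., headers=...)` and pass the response text through whatever parsing step you use. Keeping the network call outside the loop makes the paging logic easy to test with a fake.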