1
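linzhi OP I have a question for everyone. I wrote a simulated Zhihu login modeled on https://github.com/xchaoinfo/fuck-login, but it keeps returning a msg saying the captcha is invalid, even when I deliberately use a wrong password. The code is at https://github.com/linzhi/minerva/blob/master/minerva/zhihu.py. Does anyone know why?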
|
2
linzhi OP Found the cause. With the captcha URL https://www.zhihu.com/captcha.gif?r=1492661961962&type=login, every request returns a different captcha, even when the r parameter (a timestamp) is identical.
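If every request to that URL yields a new image, then the captcha the user reads presumably has to come from a single download rather than a second request with the same r. Below is a minimal sketch of that idea, not the code from either repository. The function names are hypothetical, `session` is assumed to be a requests.Session-style object, and reusing the same session for the later login POST is an assumption about how the server matches the captcha to the client.

```python
import time

# URL taken from the post above; whether the endpoint still behaves
# this way is not something this sketch can confirm.
CAPTCHA_URL = "https://www.zhihu.com/captcha.gif"


def build_captcha_url(now=None):
    """Return the captcha URL with r set to a millisecond timestamp."""
    if now is None:
        now = time.time()
    return "{0}?r={1}&type=login".format(CAPTCHA_URL, int(now * 1000))


def fetch_captcha_once(session, now=None):
    """Download the captcha image exactly once and return its bytes.

    `session` is any object with a requests.Session-style get() method.
    The returned bytes should be saved and shown to the user; requesting
    the URL again would produce a different captcha.
    """
    response = session.get(build_captcha_url(now))
    return response.content


# Hypothetical usage (requires the third-party requests package):
#   session = requests.Session()
#   image_bytes = fetch_captcha_once(session)
#   ... show image_bytes to the user, then log in with the SAME session
```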
|
3
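linzhi OP Another update: scraping Zhihu questions and answers does not require a simulated login at all. I was heading in the wrong direction before.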
|
4
creatorYC 2017-04-23 16:11:15 +08:00
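Could I ask why my crawler, after running for a while, fails with requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))? I am using Python and the requests library, with no multithreading, and I call time.sleep(0.5) before every request, so the requests should not be too frequent. How can I fix this? Thanks. I have searched for an answer for a long time without success.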
|
5
linzhi OP |
6
creatorYC 2017-04-23 16:33:27 +08:00
@linzhi I looked at that solution, and I don't think that is the problem. Should I post the exception traceback, or the program code?
Traceback (most recent call last):
  File "zhihuSprider.py", line 306, in <module>
    sprider.bfs_search()
  File "zhihuSprider.py", line 286, in bfs_search
    self.analyze_user(user_url, followee_url, follower_url)
  File "zhihuSprider.py", line 155, in analyze_user
    result = json.loads(self.get_user_data(user_url))
  File "zhihuSprider.py", line 144, in get_user_data
    response = self.session.get(url, headers=self.headers)
  File "D:\Python27\lib\site-packages\requests\sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "D:\Python27\lib\site-packages\requests\sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\Python27\lib\site-packages\requests\sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "D:\Python27\lib\site-packages\requests\adapters.py", line 473, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))
|
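BadStatusLine("''") generally means the client read an empty status line, which usually happens when the server closes the connection without sending a response. One common way to cope with this in a long-running crawler is to retry the failed call after a growing pause. The following is a minimal sketch under that assumption, not a confirmed fix for this traceback. The helper name `call_with_retry` and its parameters are hypothetical. The default `retry_on=(IOError,)` relies on requests' exceptions deriving from IOError; passing requests.exceptions.ConnectionError explicitly is narrower.

```python
import time


def call_with_retry(fetch, retries=3, backoff=1.0,
                    retry_on=(IOError,), sleep=time.sleep):
    """Call fetch() and retry on the given exceptions.

    The pause doubles after each failed attempt: backoff, 2 * backoff,
    4 * backoff, and so on. After `retries` retries, the last exception
    is re-raised to the caller.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == retries:
                raise  # out of retries, let the caller see the error
            sleep(backoff * (2 ** attempt))


# Hypothetical usage inside get_user_data (requires requests):
#   response = call_with_retry(
#       lambda: self.session.get(url, headers=self.headers),
#       retry_on=(requests.exceptions.ConnectionError,),
#   )
```

If the error keeps recurring even with retries, the server may be deliberately dropping connections, in which case a longer delay between requests is more likely to help than further retries.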