A Handy Tool for Web Scraping: Converting cURL Commands

Preface

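When writing a crawler, you often need to add request headers, parameters, cookies, and similar information to your program. Normally you have to look these up by hand in the browser and copy and paste them one at a time, which is slow and tedious. This post introduces a website that solves the problem and frees you from that pointless manual work.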

About the Site

URL: https://curl.trillworks.com

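As the screenshot above shows, the site includes a short tutorial. Follow its three steps and you can quickly add the corresponding request information to your code.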

Demo

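Copy the request you want to scrape as a cURL command, paste it into the site to convert it, then copy the result into PyCharm. Running it fetches the full HTML source of the page, and from there you can start writing your own processing logic on top of it.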
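To make the conversion step more concrete, here is a minimal sketch of the kind of translation such a converter performs. It is not the site's actual implementation. The helpers `curl_to_requests_args` and `parse_cookie_string` are hypothetical names introduced for illustration, and the sketch assumes a single-line command that uses only a URL, `-H`/`--header`, `-b`/`--cookie`, and flags that take no value (such as `--compressed`).

```python
import shlex


def parse_cookie_string(raw):
    """Split a 'name=value; name2=value2' string into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that have no '='
            cookies[name] = value
    return cookies


def curl_to_requests_args(command):
    """Turn a simple one-line curl command into url/headers/cookies."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")

    url = None
    headers = {}
    cookies = {}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header"):
            # Header values look like "Name: value".
            name, _, value = tokens[i + 1].partition(":")
            if name.strip().lower() == "cookie":
                cookies.update(parse_cookie_string(value))
            else:
                headers[name.strip()] = value.strip()
            i += 2
        elif token in ("-b", "--cookie"):
            cookies.update(parse_cookie_string(tokens[i + 1]))
            i += 2
        elif token.startswith("-"):
            # Assumption: any other flag takes no value in this sketch.
            i += 1
        else:
            url = token
            i += 1
    return {"url": url, "headers": headers, "cookies": cookies}


example = (
    "curl 'https://github.com/NickCarneiro/curlconverter/' "
    "-H 'Accept-Language: zh-CN,zh;q=0.8' "
    "-H 'Cookie: logged_in=no; _gat=1' --compressed"
)
args = curl_to_requests_args(example)
print(args)
```

The resulting dictionary has the same shape as the arguments passed to `requests.get`, so it could be used as `requests.get(args["url"], headers=args["headers"], cookies=args["cookies"])`. A real converter handles many more curl options (request method, form data, authentication, and so on).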

Code generated by the site:

import requests

# Cookies captured from the browser session.
cookies = {
    '_octo': 'GH1.1.681056136.1509806877',
    '_gat': '1',
    'logged_in': 'no',
    '_ga': 'GA1.2.70269906.1509806877',
    'tz': 'Asia%2FShanghai',
    'has_recent_activity': '1',
    '_gh_sess': 'cGpmdExmZUZpckZ0R1pSQlFxZlpsS2ZvT3NZbUU0YW1qTVloSzdFeWNxeWdNaGxsNzVveTJ3Vndrc2ZaN3ZoRDNYMm10TW9OdUdGVHhwbVRmMEU3ZWVwTUx4dUpZTUgrbHdKZkV0RnpzN3hodG12TGdLbHpSemVaQ0ZMM201MGdxMlkxdk5JNUZ6em1SWGp5ZEJUYTNQMjRFcCtqUDZaWVVFNXl3VDJRRUU4MFpqYkpvekY1VmZpY2t1R01ZcGRPQlZBUEJUOTJaWnNESjVnMnlkcncyWWhCVDl1OE5aVDhpR2Z4Z1NYVkFVNk5ReDRtTVphOXFXQWJNSVZYcnEyVktLTERLMHBTYjNwa2tUQUJaaWREQ0N4NzJYTG9sM1dpUktPaWFETFVpWGZlWFNvb2ZxazU1OUxMazVjZ3VNNTJteEdENzJPQlFKeDV3YXZCbmdHSGdGVmx5OVNjU2VaZXh3eEVwSlptczZXV3lQZXgrOGEyVGFwcUpPcFhIZTRWaDIwZExMRWhDRE8yMUdJT2xmS1grQ3I3bEYySGJvWFhNTFR3VmNpRnlLTT0tLXlRMmJZanl4Z0tUU0c0N1ZrRHpqbkE9PQ%3D%3D--1899440138004359a97b156d0ac8941135684ab5',
}

# Request headers copied from the browser so the request looks the same.
headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Referer': 'https://ghbtns.com/github-btn.html?user=NickCarneiro&repo=curlconverter&type=watch&count=true&size=large',
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
}

# Send the GET request; the page's HTML is then available as response.text.
response = requests.get('https://github.com/NickCarneiro/curlconverter/', headers=headers, cookies=cookies)

As you can see, the code the site generates is very tidy. Convenient, isn't it?
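As one possible next step once the page source has been fetched, the sketch below pulls the page title out of an HTML string using only the standard library. `TitleParser` and `extract_title` are hypothetical names introduced for illustration, and the sample HTML is made up so that the example runs without a network connection.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded; later ones are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the stripped title text, or None if there is no <title>."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip() if parser.title is not None else None


# Made-up sample document standing in for a fetched page.
sample_html = "<html><head><title> Example page </title></head><body></body></html>"
print(extract_title(sample_html))
```

With the generated request code, the same function could be applied to the fetched page by calling `extract_title(response.text)`.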
