The handy requests library for Python 3

Date: 2023-12-18 08:58:32



What is requests?

requests is an HTTP library built on top of urllib. It supports Python 3 and is simpler and more pleasant to use than urllib. I used to write HTTP request and crawler scripts with urllib to fetch HTML; later I found requests far more convenient.
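For a quick sense of the difference, here is the same GET request written with both libraries (a minimal sketch; python.org is just an example target):

# urllib: you decode the response body yourself
from urllib.request import urlopen
html = urlopen('https://www.python.org').read().decode('utf-8')

# requests: one call, decoding handled for you via r.text
import requests
html = requests.get('https://www.python.org').text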

Installation

1. Install with pip

pip install requests

2. Install from source

Download the source code from GitHub (https://github.com/kennethreitz/requests) (or clone it locally, which requires git), then run the following in the source directory:

python setup.py install

3. Install through an IDE; many IDEs, such as PyCharm, provide a package-installation feature.
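Whichever method you use, you can verify the installation from the interpreter (the version number shown here is just an example; yours will differ):

>>> import requests
>>> requests.__version__
'2.31.0'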

Usage

First, have a look at the built-in help, which contains the most basic documentation:

import requests
help(requests)

The help text gives the simplest examples of sending GET and POST requests:

Requests is an HTTP library, written in Python, for human beings.

Basic GET usage:

    >>> import requests
    >>> r = requests.get('https://www.python.org')
    >>> r.status_code
    200
    >>> b'Python is a programming language' in r.content
    True

... or POST:

    >>> payload = dict(key1='value1', key2='value2')
    >>> r = requests.post('http://httpbin.org/post', data=payload)
    >>> print(r.text)
    {
      ...
      "form": {
        "key1": "value1",
        "key2": "value2"
      },
      ...
    }

A simple request example:

import requests

# POST example
url = 'xxx'
FormData = {'key1': 'value1', 'key2': 'value2'}
RequestHeaders = {"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
                  "Accept-Encoding": "gzip, deflate",
                  "Accept-Language": "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3",
                  "Connection": "keep-alive",
                  "Host": "xxx",
                  "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0"}
test = requests.post(url, data=FormData, headers=RequestHeaders)

# GET example
url = 'xxx'
RequestHeaders = {"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
                  "Accept-Encoding": "gzip, deflate",
                  "Accept-Language": "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3",
                  "Connection": "keep-alive",
                  "Host": "xxx",
                  "Referer": "xxx",
                  "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0",
                  "Cookie": "sid=xxx"}
test = requests.get(url, headers=RequestHeaders, params={'key1': 'value1'})
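For the GET example, requests encodes the params dict into the URL query string automatically; you can confirm the final URL on the response (httpbin.org and the key/value pair are just for illustration):

>>> r = requests.get('http://httpbin.org/get', params={'key1': 'value1'})
>>> r.url
'http://httpbin.org/get?key1=value1'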

Sending a request returns a requests Response object:

>>> type(test)
<class 'requests.models.Response'>
>>> dir(test)
['__attrs__', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_content', '_content_consumed', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']

Pick whichever of these attributes and methods fits your need to process the response.
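For example, the members used most often (a quick sketch; json() assumes the server actually returned JSON):

test.status_code          # HTTP status code, e.g. 200
test.headers              # response headers (a case-insensitive dict)
test.encoding             # encoding used to decode test.text
test.text                 # body decoded to str
test.content              # raw body as bytes
test.json()               # parse the body as JSON (raises if the body is not JSON)
test.raise_for_status()   # raise an HTTPError for 4xx/5xx responses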

PS: it works even better together with BeautifulSoup.
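A minimal sketch of that combination (assumes the beautifulsoup4 package is installed; python.org is just an example target):

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.python.org')
soup = BeautifulSoup(r.text, 'html.parser')
print(soup.title.string)          # page title
for a in soup.find_all('a'):      # every link target on the page
    print(a.get('href'))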

Downloading files with requests

# stream=True defers the download: the body is not fetched immediately after
# the request, so a large file will not exhaust memory; chunk_size below then
# fetches it piece by piece
a = requests.get('http://192.168.1.245:8080/static/stage/headImg/test.jpg', stream=True)
with open('C:\\test.jpg', 'wb') as f:
    # iter_content iterates chunk by chunk; iter_lines iterates line by line
    for i in a.iter_content(chunk_size=512):
        if i:
            f.write(i)
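A slightly more defensive variant of the same download (a sketch: in recent requests versions the Response object supports the with statement, which guarantees the connection is released, and raise_for_status() aborts on HTTP errors instead of saving an error page):

import requests

url = 'http://192.168.1.245:8080/static/stage/headImg/test.jpg'
with requests.get(url, stream=True) as r:
    r.raise_for_status()
    with open('C:\\test.jpg', 'wb') as f:
        for chunk in r.iter_content(chunk_size=512):
            if chunk:
                f.write(chunk)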