The requests Library for Network Requests (Python HTTP Client Library)

balukai 2025-07-27 18:37:37 Featured Articles

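I. Introduction to the requests Library

requests is one of the most widely used HTTP client libraries for Python. Its API is concise yet capable, and it supports:

  • Sending HTTP requests with GET, POST, PUT, DELETE, and other methods
  • Automatic handling of redirects, with cookies persisted across requests through a Session
  • JSON parsing, file uploads, proxy configuration, and more

Installation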

pip install requests

II. Basic Usage

1. Sending a GET Request

import requests

response = requests.get("https://api.github.com")
print(response.status_code)  # 200 on success
print(response.text)         # Response body as text (HTML or JSON)

2. Sending a POST Request

payload = {"key1": "value1", "key2": "value2"}
response = requests.post("https://httpbin.org/post", data=payload)  # data= sends a form-encoded body
print(response.json())  # Parsed JSON returned by the server

3. Adding Request Headers

Mimic a browser so the server is less likely to reject the request:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.5"
}
response = requests.get("https://example.com", headers=headers)
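
To see what a custom header dictionary actually changes, a request can be prepared without being sent. The following is only a sketch: the shortened User-Agent string is a stand-in chosen for this example, and nothing here touches the network.

```python
import requests

# Shortened stand-in for a browser User-Agent string (an assumption for this sketch).
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Language": "en-US,en;q=0.5",
}

session = requests.Session()
request = requests.Request("GET", "https://example.com", headers=headers)

# prepare_request() merges the per-request headers with the session defaults
# and builds the request without sending it.
prepared = session.prepare_request(request)

default_agent = requests.utils.default_user_agent()
print(default_agent)                    # the library's own User-Agent, "python-requests/<version>"
print(prepared.headers["User-Agent"])   # the custom value replaces the default
print(prepared.headers["user-agent"])   # header lookups are case-insensitive
```

Without the custom header, the server would see the library's default User-Agent, which some sites treat differently from a browser.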

4. Handling Query Parameters

params = {"q": "Python", "page": 2}
response = requests.get("https://api.example.com/search", params=params)
print(response.url)  # https://api.example.com/search?q=Python&page=2
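
A few encoding details of `params` are easy to miss, so here is a small sketch that prepares a request against the same placeholder endpoint without sending it. The extra keys (`tag`, `lang`) are made up for illustration.

```python
import requests

# Placeholder search endpoint; the request is prepared but never sent.
params = {
    "q": "python requests",   # spaces are encoded as "+"
    "tag": ["http", "api"],   # a list becomes a repeated key
    "page": 2,
    "lang": None,             # keys whose value is None are left out
}
prepared = requests.Request(
    "GET", "https://api.example.com/search", params=params
).prepare()

print(prepared.url)
# https://api.example.com/search?q=python+requests&tag=http&tag=api&page=2
```

Letting requests build the query string this way avoids hand-written escaping mistakes in the URL.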

III. Advanced Features

1. Working with JSON Data

  • Parsing a JSON response automatically:
response = requests.get("https://api.github.com/users/octocat")
print(response.json()["login"])  # octocat
  • Sending JSON data:
data = {"name": "John", "age": 30}
response = requests.post("https://api.example.com/user", json=data)
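
The difference between `data=` and `json=` is easiest to see by comparing the requests they produce. This sketch prepares both variants for the placeholder endpoint above without sending anything.

```python
import json

import requests

data = {"name": "John", "age": 30}
url = "https://api.example.com/user"  # placeholder endpoint, never contacted

# Prepare both variants without sending them.
as_form = requests.Request("POST", url, data=data).prepare()
as_json = requests.Request("POST", url, json=data).prepare()

print(as_form.headers["Content-Type"])  # application/x-www-form-urlencoded
print(as_form.body)                     # name=John&age=30

print(as_json.headers["Content-Type"])  # application/json
print(json.loads(as_json.body))         # {'name': 'John', 'age': 30}
```

Which one to use depends on what the API expects: HTML-form style endpoints usually want `data=`, while most REST APIs want `json=`.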

2. File Upload

# Open the file in binary mode; the with block closes it once the upload finishes
with open("report.pdf", "rb") as f:
    response = requests.post("https://api.example.com/upload", files={"file": f})
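
For a closer look at what `files=` puts on the wire, the sketch below prepares a multipart request from an in-memory stand-in for report.pdf. The `description` form field and the fake file contents are assumptions added for illustration, and nothing is sent.

```python
import io

import requests

# In-memory stand-in for report.pdf so the sketch needs no file on disk.
fake_pdf = io.BytesIO(b"%PDF-1.4 example bytes")

prepared = requests.Request(
    "POST",
    "https://api.example.com/upload",          # placeholder endpoint, never contacted
    files={"file": ("report.pdf", fake_pdf, "application/pdf")},
    data={"description": "Quarterly report"},  # ordinary form fields can be sent alongside
).prepare()

content_type = prepared.headers["Content-Type"]
print(content_type)                                # multipart/form-data; boundary=...
print(b'filename="report.pdf"' in prepared.body)   # True
```

The tuple form `(filename, file object, content type)` gives explicit control over the filename and MIME type that the server receives.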

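3. Setting a Timeout

requests applies no timeout by default, so set one to keep a request from waiting indefinitely:

response = requests.get("https://slow-api.com", timeout=5)  # Give up if connecting, or waiting for data, takes more than 5 seconds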
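
The `timeout` argument also accepts a `(connect, read)` tuple. The sketch below demonstrates this against a local stand-in server that deliberately answers slowly; the handler, the delays, and the port are all assumptions made for the demonstration.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Local stand-in for a slow API: waits one second before answering."""

    def do_GET(self):
        time.sleep(1.0)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"finally")
        except OSError:
            pass  # the client has already given up

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0 picks any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

try:
    # (connect timeout, read timeout): up to 3 s to connect, 0.3 s waiting for data.
    requests.get(url, timeout=(3, 0.3))
    outcome = "completed"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"

print(outcome)  # read timeout
server.shutdown()
```

Separate values are useful when a service connects quickly but may legitimately take a while to produce a response.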

4. Session Management

A Session keeps cookies and reuses the connection pool, which suits sites that require a login:

session = requests.Session()
session.post("https://example.com/login", data={"username": "user", "password": "pass"})
response = session.get("https://example.com/dashboard")  # Cookies from the login are sent automatically
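
To make the cookie behaviour observable without a real site, this sketch runs a toy login server on localhost. The handler, the `sessionid` cookie, and its value are invented for the demonstration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class LoginHandler(BaseHTTPRequestHandler):
    """Toy site: POST /login sets a cookie, GET /dashboard echoes it back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the form body
        self.send_response(200)
        self.send_header("Set-Cookie", "sessionid=abc123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = self.headers.get("Cookie", "no cookie").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), LoginHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# The session stores the cookie from /login and sends it to /dashboard.
with requests.Session() as session:
    session.post(f"{base}/login", data={"username": "user", "password": "pass"})
    with_session = session.get(f"{base}/dashboard").text

# A standalone call starts with an empty cookie jar.
without_session = requests.get(f"{base}/dashboard").text

print(with_session)     # sessionid=abc123
print(without_session)  # no cookie
server.shutdown()
```

Using the session as a context manager also closes its pooled connections when the block ends.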

IV. Exception Handling

try:
    response = requests.get("https://invalid-url.com", timeout=2)
    response.raise_for_status()  # Raises HTTPError for 4xx and 5xx status codes
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
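
`RequestException` is the base class, so catching it handles everything at once. When different failures need different reactions, the subclasses can be caught first. A minimal sketch, using a local stand-in server and a hypothetical helper named `describe`:

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class NotFoundHandler(BaseHTTPRequestHandler):
    """Local stand-in server that answers every GET with 404."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example output quiet


def describe(url):
    """Return a short label for how the request to `url` ended."""
    try:
        response = requests.get(url, timeout=2)
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        return f"HTTP error {exc.response.status_code}"
    except requests.exceptions.Timeout:
        return "timed out"
    except requests.exceptions.ConnectionError:
        return "connection failed"
    except requests.exceptions.RequestException as exc:
        return f"other failure: {exc}"
    return "ok"


server = HTTPServer(("127.0.0.1", 0), NotFoundHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Reserve a port and release it again so that nothing is listening on it.
with socket.socket() as probe:
    probe.bind(("127.0.0.1", 0))
    closed_port = probe.getsockname()[1]

missing_page = describe(f"http://127.0.0.1:{server.server_port}/missing")
refused = describe(f"http://127.0.0.1:{closed_port}/")

print(missing_page)  # HTTP error 404
print(refused)       # connection failed
server.shutdown()
```

`Timeout` is listed before `ConnectionError` because a connect timeout inherits from both, and here it should count as a timeout.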

V. Practical Examples

Example 1: Scraping a Page Title

import requests
from bs4 import BeautifulSoup  # Requires: pip install beautifulsoup4

url = "https://www.python.org/"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
title = soup.title.string
print(title)  # Welcome to Python.org

Example 2: Calling a REST API (GitHub)

import requests

# List a user's repositories
response = requests.get("https://api.github.com/users/octocat/repos")
repos = response.json()

# Create a new repository (authentication required)
headers = {"Authorization": "token YOUR_GITHUB_TOKEN"}
new_repo = {"name": "test-repo"}
response = requests.post("https://api.github.com/user/repos", headers=headers, json=new_repo)

Example 3: Handling Paginated Requests

base_url = "https://api.example.com/items?page={}"
all_items = []

for page in range(1, 4):  # Request the first 3 pages
    response = requests.get(base_url.format(page))
    items = response.json()
    all_items.extend(items)

print(f"Fetched {len(all_items)} items in total")
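
When the number of pages is not known in advance, one common variation is to keep requesting until the API returns an empty page. The sketch below assumes an API that answers with a JSON list per page and an empty list past the end. The local stand-in server, its data, and the helper `fetch_all_items` are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

import requests

# Fake data for a local stand-in API: two pages of items, then nothing.
PAGES = {1: ["a", "b"], 2: ["c"]}


class ItemsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        page = int(query.get("page", ["1"])[0])
        body = json.dumps(PAGES.get(page, [])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def fetch_all_items(url, max_pages=100):
    """Collect items page by page until the API returns an empty list."""
    items = []
    with requests.Session() as session:  # one session for all pages
        for page in range(1, max_pages + 1):
            response = session.get(url, params={"page": page}, timeout=5)
            response.raise_for_status()
            batch = response.json()
            if not batch:
                break
            items.extend(batch)
    return items


server = HTTPServer(("127.0.0.1", 0), ItemsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

all_items = fetch_all_items(f"http://127.0.0.1:{server.server_port}/items")
print(all_items)  # ['a', 'b', 'c']
server.shutdown()
```

The `max_pages` cap is a safety net in case an API never returns an empty page.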

VI. Common Problems and Solutions

1. SSL Certificate Verification Fails

# Disable certificate verification (test environments only!)
response = requests.get("https://invalid-ssl-site.com", verify=False)

2. Proxy Settings

proxies = {
    "http": "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8080"
}
response = requests.get("https://example.com", proxies=proxies)

3. Response Encoding Problems

response.encoding = "utf-8"  # Set the encoding explicitly when the text comes out garbled
print(response.text)
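
Garbled text typically appears when a server sends a `text/*` content type without a charset, in which case requests falls back to ISO-8859-1. The sketch below reproduces that with a local stand-in server that returns GBK-encoded bytes; the handler and the sample text are assumptions made for the demonstration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

TEXT = "你好, requests"


class GbkHandler(BaseHTTPRequestHandler):
    """Local stand-in server: GBK-encoded body, no charset in Content-Type."""

    def do_GET(self):
        body = TEXT.encode("gbk")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), GbkHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = requests.get(f"http://127.0.0.1:{server.server_port}/")
guessed = response.encoding  # fallback chosen from the headers
garbled = response.text      # decoded with the wrong encoding

response.encoding = "gbk"    # .text is decoded again with the new value
fixed = response.text

print(guessed)  # ISO-8859-1
print(fixed)    # 你好, requests
server.shutdown()
```

When the right encoding is unknown, `response.apparent_encoding` offers a guess based on the body, though it can be unreliable for short responses.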

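4. File Upload Fails

# Make sure the file path is correct and the file is opened in binary mode
with open("report.pdf", "rb") as f:
    files = {"file": ("report.pdf", f, "application/pdf")}  # (filename, file object, content type)
    response = requests.post("https://api.example.com/upload", files=files)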

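VII. Summary and Next Steps

  • Key takeaways
    • The basics of the requests library (GET, POST, JSON).
    • Exception handling, session management, and file uploads.
    • Calling REST APIs and parsing the response data.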
