To use SOCKS proxies with the requests library, an additional dependency is required: pip install 'requests[socks]'. The choice between the socks5:// and socks5h:// schemes determines where DNS resolution of the target domain occurs. With socks5, DNS is resolved on the client side; with socks5h, the DNS request is routed through the proxy server. For maximum anonymity, and to prevent DNS leakage that can reveal the client’s geographical location regardless of the proxy’s IP address, the socks5h scheme is highly recommended.
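A minimal sketch, assuming the requests[socks] extra is installed; the host proxy.example.com and the credentials are placeholders:

```python
import requests

# socks5h:// delegates DNS resolution to the proxy, preventing DNS leaks
proxies = {
    "http": "socks5h://user:pass@proxy.example.com:1080",
    "https": "socks5h://user:pass@proxy.example.com:1080",
}
response = requests.get("https://api.ipify.org?format=json", proxies=proxies)
print(response.json())  # reports the proxy's exit IP
```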
The requests library makes applying a proxy straightforward via the proxies argument of any request method (get, post, etc.). Proxies are passed as a dictionary where the key corresponds to the protocol (http or https) and the value is the full URL of the proxy server. The proxy URL must include the scheme defining the protocol used to communicate with the proxy itself (e.g., http:// or socks5://).
```python
import requests

proxies = {
    'http': 'http://10.10.1.10:3128',   # for http traffic
    'https': 'http://10.10.1.10:1080',  # for https traffic
}

response = requests.get('http://example.org', proxies=proxies)
```
Proxies that require authentication embed the credentials directly in the proxy URL: scheme://username:password@host:port.
If the password contains special characters, such as @, :, or %, they can disrupt the URL structure. In such cases, URL encoding is mandatory. This is done using the urllib.parse module, which ensures the correct transfer of complex credentials without violating the URL syntax.
```python
import urllib.parse

password = "p@ss:word-with-special-chars"
# safe='' ensures every reserved character, including '/', is percent-encoded
encoded_password = urllib.parse.quote(password, safe='')

proxies = {
    "http": f"http://user123:{encoded_password}@192.168.1.100:8080",
    "https": f"http://user123:{encoded_password}@192.168.1.100:8080",
}
```
Requests also honors the standard environment variables HTTP_PROXY, HTTPS_PROXY, and ALL_PROXY (a universal fallback for all protocols, including SOCKS). This provides a convenient way to configure proxies globally without changing the Python code.
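A small sketch assuming the variables are set from within Python before the call; requests reads them at request time when trust_env is enabled (the default):

```python
import os
import requests

# Equivalent to `export HTTP_PROXY=...` / `export HTTPS_PROXY=...` in the shell
os.environ['HTTP_PROXY'] = 'http://10.10.1.10:3128'
os.environ['HTTPS_PROXY'] = 'http://10.10.1.10:1080'

# No proxies argument needed: requests picks the values up from the environment
response = requests.get('http://example.org')
```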
However, in production systems, storing confidential data such as proxy credentials directly in environment variables or versioned files should be avoided due to the heightened security risk; prefer a dedicated secrets manager instead.
The requests.Session() object is critical for improving performance and convenience. It preserves state (headers, cookies), but most importantly, it implements connection pooling: the same underlying TCP connection is reused for multiple requests to the same host, substantially reducing network overhead and accelerating the process.
Proxies can be set for the entire session via session.proxies = {...}.
```python
import requests

session = requests.Session()
session.proxies = {
    'http': 'http://103.167.135.111:80',
    'https': 'http://116.98.229.237:10003',
}

response = session.get('http://example.org')
```
A critical nuance: settings established in session.proxies can conflict with proxies that Requests reads from environment variables. To guarantee the use of a dynamically selected proxy (especially important during rotation), the most reliable approach is explicit overriding, i.e., passing the proxies dictionary with every session method call: session.get(url, proxies=proxies). This method ensures environment variables are ignored, and your current, dynamically selected proxy configuration is used.
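A short sketch of both options; session.trust_env = False is the session-level switch in requests that stops environment variables from being consulted at all:

```python
import requests

session = requests.Session()
proxies = {'http': 'http://10.10.1.10:3128', 'https': 'http://10.10.1.10:3128'}

# Option 1: explicit per-request override; this always wins over the environment
response = session.get('http://example.org', proxies=proxies)

# Option 2: disable environment lookup for the whole session
session.trust_env = False
response = session.get('http://example.org', proxies=proxies)
```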
Each proxy in the pool should be verified before use: send a request through it to an IP-echo service (e.g., https://api.ipify.org?format=json) and confirm that a successful response (200 OK) is received. Handling for requests.exceptions.Timeout and requests.exceptions.ProxyError must be built in; these errors immediately exclude the proxy from the active pool. Anonymity is then assessed by inspecting the headers the proxy adds to the outgoing request, primarily X-Forwarded-For and Via.
For accurate anonymity determination, specialized test sites that return all received headers (e.g., http://azenv.net/) must be used. Using general APIs like httpbin.org can be unreliable, as they may intentionally strip proxy headers, creating a false impression of anonymity.
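A minimal health-check sketch along these lines (check_proxy is a hypothetical helper name; the ipify endpoint is the one mentioned above):

```python
import requests

def check_proxy(proxy_url, timeout=5):
    """Return the proxy's exit IP if it responds with 200 OK, otherwise None."""
    proxies = {"http": proxy_url, "https": proxy_url}
    try:
        response = requests.get(
            "https://api.ipify.org?format=json",
            proxies=proxies,
            timeout=timeout,
        )
        if response.status_code == 200:
            return response.json()["ip"]
    except requests.exceptions.RequestException:
        # Timeout and ProxyError are the typical failures here: the proxy
        # is excluded from the active pool.
        pass
    return None
```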
Analysis of these headers allows for strict proxy classification:
| Anonymity Level | Description | Header Signatures |
|---|---|---|
| Transparent | Transmits the client’s real IP and identifies itself as a proxy. | Via and/or X-Forwarded-For (with real client IP) are present. |
| Anonymous | Conceals the client’s real IP but identifies itself as a proxy. | Via is present, but X-Forwarded-For is absent. |
| Elite (High Anonymity) | Conceals the real IP and does not identify itself as a proxy. | Via, X-Forwarded-For, and Proxy-Connection are all absent. |
The simplest rotation strategy is random selection with random.choice(), which is effective for uniform load distribution.

```python
import requests
import random

# List of proxies in URL format
proxies_list = [
    'http://user:pass@1.1.1.1:8080',
    # ...
]

url_to_scrape = "https://httpbin.org/ip"

for i in range(5):
    proxy_url = random.choice(proxies_list)  # random proxy selection
    proxies = {
        "http": proxy_url,
        "https": proxy_url,
    }
    try:
        response = requests.get(url_to_scrape, proxies=proxies, timeout=10)
        print(f"Request {i+1} successful. IP: {response.json().get('origin')}")
    except requests.exceptions.RequestException as e:
        print(f"Request {i+1} failed. Error: {e}")
```
IP rotation alone is not enough: the default User-Agent header (python-requests/X.Y.Z) immediately reveals an automated script.
To ensure reliable masking, IP rotation must be integrated with realistic User Agent rotation. For each request, a random User Agent (mimicking Chrome, Firefox, etc.) must be selected and passed via the headers dictionary to ensure maximum similarity to real browser traffic.
```python
import requests
import random

headers_pool = [
    {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36'},
    # ... add more realistic User Agents
]

# ... logic for selecting url and proxy_url, and building the proxies dict ...

random_headers = random.choice(headers_pool)  # new User-Agent for every request
response = requests.get(
    url,
    proxies=proxies,
    headers=random_headers,
)
```
When requests.exceptions.Timeout or requests.exceptions.ProxyError is raised due to proxy failure, the most effective action is to switch immediately to the next proxy in the pool.
Automatic retries are configured with the urllib3.util.Retry class via the requests.adapters.HTTPAdapter mechanism. The adapter is mounted on a requests.Session object, applying the specified retry strategy to all requests within that session. Key parameters include total (maximum attempts) and status_forcelist (the list of HTTP status codes that trigger a retry, including 5xx and 429).
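A sketch of this configuration (the specific retry counts and status codes are illustrative, not prescriptive):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

retry_strategy = Retry(
    total=3,                                     # maximum number of attempts
    status_forcelist=[429, 500, 502, 503, 504],  # HTTP codes that trigger a retry
    backoff_factor=1,                            # exponential delay between attempts
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("http://", adapter)   # apply the strategy to all http:// requests
session.mount("https://", adapter)  # ...and to all https:// requests

response = session.get("https://httpbin.org/ip", timeout=10)
```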
429 Too Many Requests is a direct signal from the server that the rate limit has been exceeded. The exponential backoff strategy is used to handle this error: the waiting time between consecutive attempts grows as $\text{backoff\_factor} \times 2^{\,\text{retry\_number} - 1}$, so with backoff_factor = 1 the delays are 1 s, 2 s, 4 s, and so on. This approach prevents aggressive request spamming and is a “polite” way of interacting with APIs.
In addition, the waiting period specified by the server (the Retry-After header) should be respected before the request is re-sent, and the IP address should be switched in response to detection and blocking. In practice this is handled by a rotation manager: it wraps each request, watches for failure signals (Timeout, ProxyError, 4xx blocks) and, upon their occurrence, changes the current proxy before retrying the attempt. Upon success, the manager returns the response; upon failure, it switches the IP and tries again until the pool or attempt limit is exhausted.
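A minimal sketch of such a manager, assuming a simple list-based pool and the hypothetical helper name fetch_with_rotation:

```python
import random
import requests

def fetch_with_rotation(url, proxy_pool, max_attempts=5, timeout=10):
    """Retry through different proxies until one succeeds or limits run out."""
    pool = list(proxy_pool)
    for _ in range(max_attempts):
        if not pool:
            break  # pool exhausted
        proxy_url = random.choice(pool)
        proxies = {"http": proxy_url, "https": proxy_url}
        try:
            response = requests.get(url, proxies=proxies, timeout=timeout)
            if response.status_code == 200:
                return response
            pool.remove(proxy_url)  # 4xx/429: treat as a block signal, rotate IP
        except (requests.exceptions.Timeout, requests.exceptions.ProxyError):
            pool.remove(proxy_url)  # dead proxy: exclude it from the pool
    raise RuntimeError("Proxy pool or attempt limit exhausted")
```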
For high-throughput scraping, consider the asynchronous stack of aiohttp and asyncio. Since requests are I/O-bound operations (much time is spent waiting for a response), the asynchronous approach allows thousands of URLs to be processed simultaneously, maximizing the use of each active proxy server.
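A minimal sketch using a single proxy for a batch of URLs (the proxy URL is a placeholder; aiohttp accepts HTTP proxies via its proxy argument, while SOCKS requires an extra library such as aiohttp-socks):

```python
import asyncio
import aiohttp

async def fetch(session, url, proxy_url):
    # The proxy is passed per request via the `proxy` argument
    async with session.get(url, proxy=proxy_url,
                           timeout=aiohttp.ClientTimeout(total=10)) as resp:
        return await resp.json()

async def main():
    urls = ["https://httpbin.org/ip"] * 10
    proxy_url = "http://user:pass@1.1.1.1:8080"  # placeholder proxy
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(
            *(fetch(session, u, proxy_url) for u in urls),
            return_exceptions=True,  # one failed request does not abort the batch
        )
    for result in results:
        print(result)

asyncio.run(main())
```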
Roman Bulatov brings 15+ years of hands-on experience:
- Web Infrastructure Expert: Built and scaled numerous data-heavy projects since 2005
- Proxy Specialist: Designed and deployed a distributed proxy verification system with a daily throughput capacity of 120,000+ proxies across multiple performance and security metrics.
- Security Focus: Creator of ProxyVerity's verification methodology
- Open Internet Advocate: Helps journalists and researchers bypass censorship
"I created ProxyVerity after years of frustration with unreliable proxies - now we do the hard work so you get working solutions."