How can I use a SOCKS 4/5 proxy with urllib2?
You can use the SocksiPy module. Simply copy the file "socks.py" to your Python's lib/site-packages directory, and you're ready to go.
You must set up socks before using urllib2. (Or try installing it with pip install PySocks.)
For example:
import socks
import socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
socket.socket = socks.socksocket
import urllib2
print urllib2.urlopen('http://www.google.com').read()
You can also try the pycurl library or tsocks.
Using urllib2 with SOCKS proxy
Try with pycurl:
import pycurl
c1 = pycurl.Curl()
c1.setopt(pycurl.URL, 'http://www.google.com')
c1.setopt(pycurl.PROXY, 'localhost')
c1.setopt(pycurl.PROXYPORT, 8080)
c1.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)
c2 = pycurl.Curl()
c2.setopt(pycurl.URL, 'http://www.yahoo.com')
c2.setopt(pycurl.PROXY, 'localhost')
c2.setopt(pycurl.PROXYPORT, 8081)
c2.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)
c1.perform()
c2.perform()
Removing SOCKS 4/5 proxy
Save the original socket.socket before patching it, then restore it when you no longer want the proxy:
import socks, socket, urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
temp = socket.socket  # keep a reference to the unpatched socket class
socket.socket = socks.socksocket
print urllib2.urlopen('http://www.google.com').read()  # Proxy
socket.socket = temp  # restore the original class
print urllib2.urlopen('http://www.google.com').read()  # No proxy
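The save-and-restore step above can be wrapped so the original class is put back even when the request raises. This is a minimal sketch in Python 3 using only the standard library; `patched_socket` is a hypothetical helper name, not part of SocksiPy or PySocks.

```python
import contextlib
import socket


@contextlib.contextmanager
def patched_socket(socket_class):
    """Temporarily replace socket.socket, restoring the original on exit."""
    original = socket.socket
    socket.socket = socket_class
    try:
        yield
    finally:
        # Restore even if the body raised, so later code is not proxied.
        socket.socket = original
```

Assuming the socks module is installed and a default proxy has been set, usage would look like `with patched_socket(socks.socksocket): ...`, with requests outside the `with` block going direct again.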
using tor as a SOCKS5 proxy with python urllib2 or mechanize
See end of question.
import socks
import socket
def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
# patch the socket module
socket.socket = socks.socksocket
socket.create_connection = create_connection
import urllib2
print urllib2.urlopen('http://icanhazip.com').read()
import mechanize
from mechanize import Browser
br = Browser()
print br.open('http://icanhazip.com').read()
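Note that the create_connection replacement above ignores its timeout and source_address arguments. A hedged sketch of a variant that honours them follows, written for Python 3 with the standard library only. `make_create_connection` is a hypothetical helper, and the code assumes CPython's private `socket._GLOBAL_DEFAULT_TIMEOUT` sentinel, which callers such as urllib pass when no timeout was requested.

```python
import socket

# CPython passes this private sentinel when the caller gave no timeout
# (assumption: falls back to None on interpreters that lack it).
_DEFAULT = getattr(socket, "_GLOBAL_DEFAULT_TIMEOUT", None)


def make_create_connection(socket_factory):
    """Build a create_connection replacement around any socket factory."""
    def create_connection(address, timeout=_DEFAULT, source_address=None, **kwargs):
        # Extra keyword arguments from newer Python versions are ignored.
        sock = socket_factory()
        try:
            if timeout is not _DEFAULT:
                sock.settimeout(timeout)
            if source_address:
                sock.bind(source_address)
            sock.connect(address)
        except Exception:
            sock.close()
            raise
        return sock
    return create_connection
```

With the socks module installed, this would be applied as `socket.create_connection = make_create_connection(socks.socksocket)` in place of the hand-written function.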
Timeout not working using urllib2, socks5 proxy and socksipy
It turns out the "hanging/timeout" issue I mentioned above was in fact a "blocking" issue in the SocksiPy socks.py code. If you are hitting an endpoint that still responds with 200 but sends no data (0 bytes), then socks.py will block, because that's how it's written. Here is the before and after for creating your own timeout:
socks.py BEFORE:
def __recvall(self, bytes):
    """__recvall(bytes) -> data
    Receive EXACTLY the number of bytes requested from the socket.
    Blocks until the required number of bytes have been received.
    """
    data = ""
    while len(data) < bytes:
        data = data + self.recv(bytes-len(data))
    return data
socks.py AFTER with timeout:
def __recvall(self, bytes):
    """__recvall(bytes) -> data
    Receive EXACTLY the number of bytes requested from the socket.
    Blocks until the required number of bytes have been received.
    """
    data = self.recv(bytes, socket.MSG_WAITALL)
    if type(data) not in (str, unicode) or len(data) != bytes:
        raise socket.timeout('timeout')
    return data
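For comparison, here is a standalone sketch of the same idea in Python 3 that does not rely on MSG_WAITALL. It is an alternative technique, not the SocksiPy code: `recv_exactly` is a hypothetical function that loops over recv with an overall deadline and stops when the peer closes without sending the expected bytes.

```python
import socket
import time


def recv_exactly(sock, nbytes, timeout):
    """Read exactly nbytes from sock, or raise instead of blocking forever."""
    deadline = time.monotonic() + timeout
    chunks = []
    remaining = nbytes
    while remaining > 0:
        left = deadline - time.monotonic()
        if left <= 0:
            raise socket.timeout("timed out waiting for %d more bytes" % remaining)
        # Note: this changes the socket's own timeout setting.
        sock.settimeout(left)
        chunk = sock.recv(remaining)  # raises socket.timeout if nothing arrives
        if not chunk:
            # An empty read means the peer closed; looping again would spin.
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The empty-chunk check covers the zero-byte response case described above, where the original loop would never make progress.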
unable to access website with urllib and proxy
Should you be using http as the protocol, not socks? Thus:
proxyhand = urllib.request.ProxyHandler({"http" : "http://localhost:5678"})
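As a rough sketch of how that handler is then used (Python 3, standard library only), the port 5678 comes from the line above and everything else is illustrative. No request is sent here.

```python
import urllib.request

# Route plain-HTTP requests through an HTTP proxy on localhost:5678.
proxyhand = urllib.request.ProxyHandler({"http": "http://localhost:5678"})
opener = urllib.request.build_opener(proxyhand)

# opener.open(url) would use the proxy for this opener only;
# urllib.request.install_opener(opener) would make it the default
# for later urllib.request.urlopen() calls.
```

Keeping the opener local rather than installing it globally makes it easier to switch proxies later, since each opener carries its own handler.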
How to make python Requests work via SOCKS proxy
The modern way:
pip install -U requests[socks]
then
import requests
resp = requests.get('http://go.to',
                    proxies=dict(http='socks5://user:pass@host:port',
                                 https='socks5://user:pass@host:port'))
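The proxies mapping can also be built from its parts, which avoids broken URLs when the password contains characters such as "@". This is a small sketch with a hypothetical helper, `socks_proxies`, using only the standard library. The `socks5h` scheme, which requests documents as resolving DNS on the proxy side, is selected by the assumed `remote_dns` flag.

```python
from urllib.parse import quote


def socks_proxies(host, port, user=None, password=None, remote_dns=False):
    """Return a proxies dict suitable for requests' proxies= argument."""
    scheme = "socks5h" if remote_dns else "socks5"
    auth = ""
    if user is not None:
        # Percent-encode credentials so ':' and '@' cannot break the URL.
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = "%s://%s%s:%d" % (scheme, auth, host, port)
    return {"http": url, "https": url}
```

The result would be passed as `requests.get(url, proxies=socks_proxies("127.0.0.1", 9050))`, where the host and port are placeholders for your own proxy.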
I set a proxy server on urllib2, and then I can't change it
That does seem strange. I've always found the httplib2 module to be the easiest Python HTTP client to work with. There is an example of using httplib2 with the socks module.
Sorry, I know this isn't a specific answer to your question, but it might be a workaround to try.