How to Use a SOCKS 4/5 Proxy with urllib2

How can I use a SOCKS 4/5 proxy with urllib2?

You can use the SocksiPy module. Simply copy the file "socks.py" to your Python's lib/site-packages directory, and you're ready to go.

You must set up socks before using urllib2. (Alternatively, install it with pip install PySocks.)

For example:

import socks
import socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
socket.socket = socks.socksocket
import urllib2
print urllib2.urlopen('http://www.google.com').read()

You can also try the pycurl library or tsocks.

Using urllib2 with SOCKS proxy

Try with pycurl:

import pycurl
c1 = pycurl.Curl()
c1.setopt(pycurl.URL, 'http://www.google.com')
c1.setopt(pycurl.PROXY, 'localhost')
c1.setopt(pycurl.PROXYPORT, 8080)
c1.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c2 = pycurl.Curl()
c2.setopt(pycurl.URL, 'http://www.yahoo.com')
c2.setopt(pycurl.PROXY, 'localhost')
c2.setopt(pycurl.PROXYPORT, 8081)
c2.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c1.perform()
c2.perform()

Removing SOCKS 4/5 proxy

Save the original socket.socket before patching it, then restore it when you no longer want the proxy:

import socks, socket, urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
temp = socket.socket  # keep a reference to the original socket class
socket.socket = socks.socksocket
print urllib2.urlopen('http://www.google.com').read()  # Proxy
socket.socket = temp  # restore the original socket class
print urllib2.urlopen('http://www.google.com').read()  # No proxy
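
The save-and-restore pattern above can be wrapped in a context manager, so the original socket class comes back even if the request raises. This is only a sketch in Python 3 using the standard library; patched_socket is a hypothetical helper name, and in real use the replacement class passed in would presumably be socks.socksocket.

```python
import contextlib
import socket


@contextlib.contextmanager
def patched_socket(replacement):
    """Temporarily swap socket.socket for `replacement`, then restore it.

    Hypothetical helper, not part of SocksiPy or PySocks.
    """
    original = socket.socket
    socket.socket = replacement
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, so the patch never sticks.
        socket.socket = original
```

Inside a "with patched_socket(socks.socksocket):" block, new sockets would go through the proxy; after the block, connections are direct again.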

Using Tor as a SOCKS5 proxy with Python urllib2 or mechanize

Patch socket.create_connection in addition to socket.socket:

import socks
import socket
def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock

socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)

# patch the socket module
socket.socket = socks.socksocket
socket.create_connection = create_connection

import urllib2

print urllib2.urlopen('http://icanhazip.com').read()

import mechanize
from mechanize import Browser

br = Browser()
print br.open('http://icanhazip.com').read()

Timeout not working using urllib2, socks5 proxy and socksipy

It turns out the "hanging/timeout" issue I mentioned above was in fact a "blocking" issue in SocksiPy's socks.py code. If you hit an endpoint that still responds with 200 but sends no data (0 bytes), socks.py will block, because that is how it is written. Here is the before and after for adding your own timeout:

socks.py BEFORE:

def __recvall(self, bytes):
    """__recvall(bytes) -> data
    Receive EXACTLY the number of bytes requested from the socket.
    Blocks until the required number of bytes have been received.
    """
    data = ""
    while len(data) < bytes:
        data = data + self.recv(bytes-len(data))
    return data

socks.py AFTER with timeout:

def __recvall(self, bytes):
    """__recvall(bytes) -> data
    Receive EXACTLY the number of bytes requested from the socket.
    Blocks until the required number of bytes have been received.
    """
    data = self.recv(bytes, socket.MSG_WAITALL)
    if type(data) not in (str, unicode) or len(data) != bytes:
        raise socket.timeout('timeout')
    return data
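
A related detail: in the BEFORE version, a recv that returns an empty string (the peer closed the connection) never advances the loop, so it can spin forever. The sketch below is a standalone Python 3 function, not the SocksiPy code; recv_exactly is a hypothetical name. It assumes a per-call timeout set with settimeout is acceptable, and it treats an early close as an error.

```python
import socket


def recv_exactly(sock, nbytes, timeout=5.0):
    """Read exactly nbytes from sock or raise.

    Raises socket.timeout if a recv stalls longer than `timeout` seconds,
    and ConnectionError if the peer closes before nbytes arrive.
    """
    sock.settimeout(timeout)
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # Empty read means the peer closed; retrying would loop forever.
            raise ConnectionError(
                "connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Unlike the MSG_WAITALL variant, this keeps the loop but gives it two explicit exits, which may be easier to reason about on platforms where MSG_WAITALL behaves differently.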

Unable to access website with urllib and proxy

Shouldn't you be using http as the protocol rather than socks? For example:

proxyhand = urllib.request.ProxyHandler({"http" : "http://localhost:5678"})
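
A ProxyHandler only takes effect once it is built into an opener. Here is a minimal Python 3 sketch that reuses the localhost:5678 address from the line above; the target URL in the comments is a placeholder, and nothing is fetched when this runs.

```python
import urllib.request

# HTTP proxy address taken from the answer above.
proxyhand = urllib.request.ProxyHandler({"http": "http://localhost:5678"})

# build_opener returns an OpenerDirector that sends http:// requests
# through the proxy; the mapping above does not cover https://.
opener = urllib.request.build_opener(proxyhand)

# To use the proxy for a single request (placeholder URL):
#     opener.open("http://example.com").read()
# To make it the default for urllib.request.urlopen:
#     urllib.request.install_opener(opener)
```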

How to make python Requests work via SOCKS proxy

The modern way:

pip install -U requests[socks]

then

import requests

resp = requests.get('http://go.to',
                    proxies=dict(http='socks5://user:pass@host:port',
                                 https='socks5://user:pass@host:port'))
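
Two details often matter with the proxies dict: credentials containing characters such as '@' or ':' need percent-encoding, and the socks5h scheme asks the proxy to resolve hostnames instead of resolving them locally. The sketch below builds the dict with the standard library only; socks_proxies is a hypothetical helper, not part of requests, and the host, port and credentials are whatever you supply.

```python
from urllib.parse import quote


def socks_proxies(host, port, user=None, password=None, remote_dns=False):
    """Build a proxies dict suitable for requests.get(..., proxies=...).

    Hypothetical helper. With remote_dns=True the scheme is socks5h,
    which leaves hostname resolution to the proxy.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    auth = ""
    if user is not None:
        # Percent-encode so characters like '@' or ':' do not break the URL.
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = "{}://{}{}:{}".format(scheme, auth, host, port)
    return {"http": url, "https": url}
```

The result would be passed as requests.get(url, proxies=socks_proxies("127.0.0.1", 9050, remote_dns=True)), for example when going through a local Tor client.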

I set a proxy server on urllib2, and then I can't change it

That does seem strange. I've always found the httplib2 module to be the easiest Python HTTP client to work with. There is an example of using httplib2 with the socks module.

Sorry, I know this isn't a specific answer to your question, but it might be a workaround to try.


