How do I get the IP address from an HTTP request using the requests library?
It turns out that it's rather involved.
Here's a monkey-patch for requests version 1.2.3: it wraps the _make_request method on HTTPConnectionPool to store the result of socket.getpeername() on the HTTPResponse instance.
For me on Python 2.7.3, this instance was available as response.raw._original_response.
from requests.packages.urllib3.connectionpool import HTTPConnectionPool

def _make_request(self, conn, method, url, **kwargs):
    response = self._old_make_request(conn, method, url, **kwargs)
    sock = getattr(conn, 'sock', False)
    if sock:
        setattr(response, 'peer', sock.getpeername())
    else:
        setattr(response, 'peer', None)
    return response

HTTPConnectionPool._old_make_request = HTTPConnectionPool._make_request
HTTPConnectionPool._make_request = _make_request

import requests

r = requests.get('http://www.google.com')
print r.raw._original_response.peer
Yields:
('2a00:1450:4009:809::1017', 80, 0, 0)
Ah, if there's a proxy involved or the response is chunked, HTTPConnectionPool._make_request isn't called. So here's a new version that patches httplib.HTTPConnection.getresponse instead:
import httplib

def getresponse(self, *args, **kwargs):
    response = self._old_getresponse(*args, **kwargs)
    if self.sock:
        response.peer = self.sock.getpeername()
    else:
        response.peer = None
    return response

httplib.HTTPConnection._old_getresponse = httplib.HTTPConnection.getresponse
httplib.HTTPConnection.getresponse = getresponse

import requests

def check_peer(resp):
    orig_resp = resp.raw._original_response
    if hasattr(orig_resp, 'peer'):
        return getattr(orig_resp, 'peer')
Running:
>>> r1 = requests.get('http://www.google.com')
>>> check_peer(r1)
('2a00:1450:4009:808::101f', 80, 0, 0)
>>> r2 = requests.get('https://www.google.com')
>>> check_peer(r2)
('2a00:1450:4009:808::101f', 443, 0, 0)
>>> r3 = requests.get('http://wheezyweb.readthedocs.org/en/latest/tutorial.html#what-you-ll-build')
>>> check_peer(r3)
('162.209.99.68', 80)
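On Python 3, httplib is named http.client. The following is a minimal sketch of the same idea under that assumption, not the original author's code. It differs from the patch above in one respect: it reads the peer address before calling the original getresponse(), because http.client drops the connection's socket once the headers are parsed when the server closes the connection. The local test server, QuietHandler, and demo() are illustrative names introduced here.

```python
import http.client
import http.server
import threading

_orig_getresponse = http.client.HTTPConnection.getresponse


def getresponse_with_peer(self, *args, **kwargs):
    # Read the peer first: the original getresponse() may close the
    # connection (setting self.sock to None) for "Connection: close" replies.
    peer = self.sock.getpeername() if self.sock else None
    response = _orig_getresponse(self, *args, **kwargs)
    response.peer = peer
    return response


http.client.HTTPConnection.getresponse = getresponse_with_peer


class QuietHandler(http.server.BaseHTTPRequestHandler):
    """Tiny local server handler, used only to exercise the patch."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo silent


def demo():
    """Serve one request on localhost and return (peer, server_port)."""
    server = http.server.HTTPServer(("127.0.0.1", 0), QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()
        conn.close()
        return resp.peer, server.server_port
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because requests goes through http.client underneath, the peer attribute should then be reachable the same way as before, via resp.raw._original_response, though that is a private attribute and may change between versions.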
Also checked running with proxies set; the proxy address is returned.
Update 2016/01/19
est offers an alternative that doesn't need the monkey-patch:
rsp = requests.get('http://google.com', stream=True)
# grab the IP while you can, before you consume the body!
print rsp.raw._fp.fp._sock.getpeername()
# consuming the body calls read(); after that, fileno is no longer available.
print rsp.content
Update 2016/05/19
From the comments, copying here for visibility, Richard Kenneth Niescior offers the following that is confirmed working with requests 2.10.0 and Python 3.
rsp=requests.get(..., stream=True)
rsp.raw._connection.sock.getpeername()
Update 2019/02/22
Python3 with requests version 2.19.1.
resp=requests.get(..., stream=True)
# getpeername() returns the remote (server) address; getsockname() would return the local one
resp.raw._connection.sock.socket.getpeername()
Update 2020/01/31
Python3.8 with requests 2.22.0
resp = requests.get('https://www.google.com', stream=True)
# getpeername() returns the remote (server) address; getsockname() would return the local one
resp.raw._connection.sock.getpeername()
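A note on the two socket methods that appear in these snippets: getsockname() reports the local end of a connection, while getpeername() reports the remote end, so only the latter gives the server's IP. A small sketch using plain sockets on localhost (no requests involved; the function name is made up for illustration) shows the difference:

```python
import socket


def local_and_remote():
    """Connect a client to a local listener and report both socket ends."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    accepted, _ = server.accept()
    try:
        return {
            "server": server.getsockname(),
            "client_local": client.getsockname(),   # our own ephemeral port
            "client_peer": client.getpeername(),    # the server we reached
        }
    finally:
        accepted.close()
        client.close()
        server.close()


if __name__ == "__main__":
    print(local_and_remote())
```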
Python - get IP from HTTP request using requests module
urllib3 will automatically skip unroutable addresses for a given DNS name. This is not something that needs preventing.
What happens internally when creating a connection is this:
- DNS information is requested; if your system supports IPv6 (binding to ::1 succeeds) then that includes IPv6 addresses.
- In the order that the addresses are listed, they are tried one by one:
  - For each address a suitable socket is configured, and
  - the socket is told to connect to the IP address.
  - If connecting fails, the next IP address is tried; otherwise the connected socket is returned.
See the urllib3.util.connection.create_connection() function. Private networks are usually not routable and are thus skipped automatically.
However, if you are on a private network yourself, then it is possible that an attempt is made to connect to that IP address anyway, which can take some time to resolve.
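The connection steps listed above can be sketched with the standard library alone. This is a simplified illustration of the loop, not urllib3's actual implementation; connect_first_reachable and its timeout parameter are names assumed here.

```python
import socket


def connect_first_reachable(host, port, timeout=5.0):
    """Try each address from getaddrinfo() in order; return the first connected socket."""
    last_err = None
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        sock = None
        try:
            # configure a socket suitable for this address family
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            # remember the failure and move on to the next address
            last_err = err
            if sock is not None:
                sock.close()
    if last_err is not None:
        raise last_err
    raise OSError("getaddrinfo returned no addresses")
```

An unroutable address simply fails (or times out) in connect() and the loop moves on, which is why such addresses are skipped without any special handling.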
The solution is to adapt a previous answer of mine that lets you resolve the hostname at the point where the socket connection is created; this should let you skip private-use addresses. Create your own loop over socket.getaddrinfo() and raise an exception at that point if a private network address would be attempted:
import socket
from ipaddress import ip_address
from urllib3.util import connection

class PrivateNetworkException(Exception):
    pass

_orig_create_connection = connection.create_connection

def patched_create_connection(address, *args, **kwargs):
    """Wrap urllib3's create_connection to resolve the name elsewhere"""
    # resolve hostname to an ip address; use your own
    # resolver here, as otherwise the system resolver will be used.
    family = connection.allowed_gai_family()
    host, port = address
    last_err = None
    for *_, sa in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
        # sa is (ip, port) for IPv4 and (ip, port, flowinfo, scope_id) for IPv6
        ip, port = sa[:2]
        if ip_address(ip).is_private:
            # Private network address, raise an exception to prevent
            # connecting
            raise PrivateNetworkException(ip)
        try:
            # try to create connection for this one address
            return _orig_create_connection((ip, port), *args, **kwargs)
        except socket.error as err:
            last_err = err
            continue
    if last_err is not None:
        raise last_err

connection.create_connection = patched_create_connection
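So this code looks up the IP addresses for a host early, then raises a custom exception if one of them is private. Catch that exception:
def check_url(url):
    # check_url is a hypothetical wrapper so the return statements have a function to return from
    with requests.Session() as s:
        # Session() takes no arguments; max_redirects is set as an attribute
        s.max_redirects = 5
        try:
            r = s.get(url, timeout=5, stream=True)
            return {'url': url, 'status_code': r.status_code}
        except PrivateNetworkException:
            return 'Private IP'
        except requests.exceptions.RequestException:
            return 'ERROR'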
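The is_private test that drives the patch can be tried in isolation. A minimal sketch using only the standard ipaddress module follows; is_private_address is a helper name introduced here. Note that is_private is also true for loopback addresses such as 127.0.0.1, so the patch would refuse connections to localhost as well.

```python
from ipaddress import ip_address


def is_private_address(ip):
    """Return True if the textual IP address falls in a private-use range."""
    return ip_address(ip).is_private


if __name__ == "__main__":
    for candidate in ("10.0.0.1", "192.168.1.10", "127.0.0.1", "8.8.8.8"):
        print(candidate, is_private_address(candidate))
```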
How to Get the IP address of the URL in requests module in Python 3.5.1
You got to the BufferedReader instance; it is a wrapper around the actual file object, adding a buffer. The original file object is reachable via the raw attribute:
print(rsp.raw._fp.fp.raw._sock.getpeername())
Demo:
>>> import requests
>>> rsp = requests.get('http://google.com', stream=True)
>>> print(rsp.raw._fp.fp.raw._sock.getpeername())
('2a00:1450:400b:c02::69', 80, 0, 0)
To make the code work on both Python 2 and 3, see if the raw attribute is there:
fp = rsp.raw._fp.fp
sock = fp.raw._sock if hasattr(fp, 'raw') else fp._sock
print(sock.getpeername())
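Since every approach on this page relies on private attributes that have moved between versions, one option is to probe the known paths in turn. The sketch below is an assumption-laden convenience, not an API of requests: peer_address, _walk, and _SOCKET_PATHS are names invented here, and the paths are just the ones quoted in the answers above. The demo uses a stand-in object rather than a real response.

```python
from types import SimpleNamespace

# Private attribute paths quoted in the answers above; none is a public API.
_SOCKET_PATHS = (
    ("raw", "_connection", "sock"),         # newer requests on Python 3
    ("raw", "_fp", "fp", "raw", "_sock"),   # Python 3, via the BufferedReader
    ("raw", "_fp", "fp", "_sock"),          # Python 2
)


def _walk(obj, path):
    """Follow a chain of attribute names; return None if any link is missing."""
    for name in path:
        obj = getattr(obj, name, None)
        if obj is None:
            return None
    return obj


def peer_address(rsp):
    """Return the remote address for rsp, or None if no known path has a socket."""
    for path in _SOCKET_PATHS:
        sock = _walk(rsp, path)
        if sock is not None and hasattr(sock, "getpeername"):
            return sock.getpeername()
    return None


if __name__ == "__main__":
    # Stand-in for a streamed response, only to show the call shape.
    fake_sock = SimpleNamespace(getpeername=lambda: ("192.0.2.1", 80))
    fake_rsp = SimpleNamespace(
        raw=SimpleNamespace(_connection=SimpleNamespace(sock=fake_sock)))
    print(peer_address(fake_rsp))
```

With a real response, this would be called right after requests.get(..., stream=True) and before the body is consumed, for the reasons given in the earlier answer.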
Python requests, change IP address
As already mentioned in the comments and by yourself, changing the IP could help. To do this quite easily, have a look at vpngate.py:
https://gist.github.com/Lazza/bbc15561b65c16db8ca8
A how-to is provided at the link.
Correct way of getting Client's IP Addresses from http.Request
Looking at http.Request you can find the following member variables:
// HTTP defines that header names are case-insensitive.
// The request parser implements this by canonicalizing the
// name, making the first character and any characters
// following a hyphen uppercase and the rest lowercase.
//
// For client requests certain headers are automatically
// added and may override values in Header.
//
// See the documentation for the Request.Write method.
Header Header
// RemoteAddr allows HTTP servers and other software to record
// the network address that sent the request, usually for
// logging. This field is not filled in by ReadRequest and
// has no defined format. The HTTP server in this package
// sets RemoteAddr to an "IP:port" address before invoking a
// handler.
// This field is ignored by the HTTP client.
RemoteAddr string
You can use RemoteAddr to get the remote client's IP address and port (the format is "IP:port"), which is the address of the original requestor or the last proxy (for example a load balancer which lives in front of your server).
This is all you have for sure.
Then you can investigate the headers, which are case-insensitive (per documentation above), meaning all of your examples will work and yield the same result:
req.Header.Get("X-Forwarded-For") // capitalisation
req.Header.Get("x-forwarded-for") // doesn't
req.Header.Get("X-FORWARDED-FOR") // matter
This is because internally http.Header.Get will normalise the key for you. (If you want to access the header map directly, and not through Get, you would need to use http.CanonicalHeaderKey first.)
Finally, "X-Forwarded-For" is probably the field you want to take a look at in order to grab more information about the client's IP. This greatly depends on the HTTP software used on the remote side though, as the client can put anything in there if it wishes to. Also, note the expected format of this field is a comma+space separated list of IP addresses. You will need to parse it a little bit to get a single IP of your choice (probably the first one in the list), for example:
// Assuming format is as expected
ips := strings.Split("10.0.0.1, 10.0.0.2, 10.0.0.3", ", ")
for _, ip := range ips {
	fmt.Println(ip)
}
will produce:
10.0.0.1
10.0.0.2
10.0.0.3
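The same parsing idea can be sketched in Python, to match the other examples on this page. This version is slightly more defensive than a plain split on ", ": it splits on commas, strips whitespace, and drops entries that are not valid IP addresses. parse_forwarded_for is a name chosen here for illustration, and since the header is client-controlled, the result should still not be treated as trustworthy.

```python
from ipaddress import ip_address


def parse_forwarded_for(header_value):
    """Split an X-Forwarded-For value into its valid IP addresses, in order."""
    addresses = []
    for part in header_value.split(","):
        candidate = part.strip()
        try:
            ip_address(candidate)
        except ValueError:
            continue  # skip empty or malformed entries
        addresses.append(candidate)
    return addresses


if __name__ == "__main__":
    for ip in parse_forwarded_for("10.0.0.1, 10.0.0.2, 10.0.0.3"):
        print(ip)
```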