Handling urllib2's timeout? - Python
There are very few cases where you want to use a bare except: clause. Doing this captures any exception, which can be hard to debug, and that includes SystemExit and KeyboardInterrupt, which can make your program annoying to use.
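To make the difference concrete, here is a minimal Python 3 sketch contrasting a bare except with except Exception. The helper names are made up for illustration. SystemExit and KeyboardInterrupt derive from BaseException rather than Exception, so only the bare clause traps them.

```python
def run_with_bare_except(func):
    """Hypothetical helper: a bare except traps everything."""
    try:
        func()
    except:  # deliberately bare, to show the problem
        return "swallowed"
    return "completed"


def run_with_except_exception(func):
    """Hypothetical helper: traps ordinary errors only."""
    try:
        func()
    except Exception:
        return "swallowed"
    return "completed"


def request_exit():
    # Stands in for sys.exit(); both raise SystemExit.
    raise SystemExit(1)
```

With the bare clause, a call such as run_with_bare_except(request_exit) silently eats the exit request, while run_with_except_exception(request_exit) lets SystemExit propagate so the program can actually stop.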
At the very simplest, you would catch urllib2.URLError:
import urllib2

try:
    urllib2.urlopen("http://example.com", timeout=1)
except urllib2.URLError as e:
    # MyException is a user-defined exception class
    raise MyException("There was an error: %r" % e)
The following should capture the specific error raised when the connection times out:
import urllib2
import socket

class MyException(Exception):
    pass

try:
    urllib2.urlopen("http://example.com", timeout=1)
except urllib2.URLError as e:
    # For Python 2.6
    if isinstance(e.reason, socket.timeout):
        raise MyException("There was an error: %r" % e)
    else:
        # reraise the original error
        raise
except socket.timeout as e:
    # For Python 2.7
    raise MyException("There was an error: %r" % e)
How to handle urllib's timeout in Python 3?
Catch the different exceptions with explicit clauses, and check the reason for the exception with URLError (thank you Régis B.)
import logging
import urllib.request
from socket import timeout
from urllib.error import HTTPError, URLError

# url and name are assumed to be defined by the surrounding program
try:
    response = urllib.request.urlopen(url, timeout=10).read().decode('utf-8')
except HTTPError as error:
    logging.error('HTTP Error: Data of %s not retrieved because %s\nURL: %s', name, error, url)
except URLError as error:
    if isinstance(error.reason, timeout):
        logging.error('Timeout Error: Data of %s not retrieved because %s\nURL: %s', name, error, url)
    else:
        logging.error('URL Error: Data of %s not retrieved because %s\nURL: %s', name, error, url)
else:
    logging.info('Access successful.')
NB: In response to recent comments, the original post referenced Python 3.2, where you needed to catch timeout errors explicitly with socket.timeout. For example:
# Warning - python 3.2 code
import logging
import urllib.request
from socket import timeout

try:
    response = urllib.request.urlopen(url, timeout=10).read().decode('utf-8')
except timeout:
    logging.error('socket timed out - URL %s', url)
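The two Python 3 snippets above cover different failure points, so it can help to see them combined. The following is a rough sketch, not a definitive recipe: describe_failure is a made-up helper, and it rests on my understanding that a timeout while connecting or sending is wrapped in URLError, whereas a timeout while waiting for or reading the response surfaces as socket.timeout directly.

```python
import socket
from urllib.error import HTTPError, URLError


def describe_failure(error):
    """Hypothetical helper: label an exception raised by urlopen() or read()."""
    # HTTPError subclasses URLError, so it has to be tested first.
    if isinstance(error, HTTPError):
        return "http error %d" % error.code
    if isinstance(error, URLError):
        # Timeouts that occur while connecting arrive in URLError.reason.
        if isinstance(error.reason, socket.timeout):
            return "timeout (wrapped in URLError)"
        return "url error"
    # Timeouts raised while reading the response are not wrapped.
    if isinstance(error, socket.timeout):
        return "timeout (raised directly)"
    return "other"
```

You would call it from a handler such as except (URLError, socket.timeout) as error: and log the returned label.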
Setting the timeout on a urllib2.Request() call
Although urlopen does accept a data param for POST, you can call urlopen on a Request object like this:
import urllib2
request = urllib2.Request('http://www.example.com', data)
response = urllib2.urlopen(request, timeout=4)
content = response.read()
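For Python 3 the same pattern applies with urllib.request. Here is a small sketch under a few assumptions: build_post_request and the form field are invented for illustration, and the actual urlopen call is left as a comment so the snippet does not touch the network.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Hypothetical helper: build a POST Request without sending it."""
    # Python 3 needs bytes for the body, so encode the form fields.
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body)


request = build_post_request("http://www.example.com", {"q": "timeout"})
# Supplying data switches the request method from GET to POST.
# The timeout is an argument of urlopen(), not of the Request:
#     response = urllib.request.urlopen(request, timeout=4)
```

The point is the same as in the urllib2 version: the Request object carries the URL and payload, and the timeout is passed separately when the request is opened.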
How to catch timeout error with urllib
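On Python 3.x, TimeoutError is a builtin exception class. You can catch it with
except TimeoutError:
    ...
Note that socket.timeout only became an alias of TimeoutError in Python 3.10; on earlier 3.x versions, catch socket.timeout as well.
On Python 2.x, you have two options:
- Catch the urllib.socket.timeout exception
- Catch the requests.Timeout exception (if you are using the requests library)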
How to handle urllib2 socket timeouts?
Explicitly catch the timeout exception: https://docs.python.org/3/library/socket.html#socket.timeout
# This snippet runs inside a loop over submissions, hence the continue
try:
    image_file = urllib2.urlopen(submission.url, timeout=5)
except urllib2.URLError as e:
    print(e)
    continue
except socket.timeout:
    print("timed out")
    # Your timeout handling code here...
else:
    with open('/home/mona/computer_vision/image_retrieval/images/'+category+'/' + datetime.datetime.now().strftime('%y-%m-%d-%s') + submission.url[-5:], 'wb') as output_image:
        output_image.write(image_file.read())
OP:
Thanks!
I added these clauses following your suggestion, and my problem was solved on Python 2.7:
except socket.timeout as e:
    print(e)
    continue
except socket.error as e:
    print(e)
    continue
Python urllib2 does not respect timeout
If you run
import urllib2
url = 'https://www.5giay.vn/'
urllib2.urlopen(url, timeout=1.0)
wait a few seconds, and then press Ctrl-C to interrupt the program, you'll see
  File "/usr/lib/python2.7/ssl.py", line 260, in read
    return self._sslobj.read(len)
KeyboardInterrupt
This shows that the program is hanging on self._sslobj.read(len).
SSL timeouts raise socket.timeout.
You can control the delay before socket.timeout is raised by calling socket.setdefaulttimeout(1.0).
For example,
import urllib2
import socket

socket.setdefaulttimeout(1.0)
url = 'https://www.5giay.vn/'
try:
    urllib2.urlopen(url, timeout=1.0)
except IOError as err:
    print('timeout')
% time script.py
timeout
real 0m3.629s
user 0m0.020s
sys 0m0.024s
Note that the requests module succeeds here although urllib2 did not:
import requests
r = requests.get('https://www.5giay.vn/')
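One thing to keep in mind about socket.setdefaulttimeout is that it is a process-wide setting: every socket created afterwards inherits it. The sketch below (Python 3, with a made-up helper name) shows that inheritance and restores the previous value afterwards, which is a cautious habit rather than a requirement.

```python
import socket


def default_timeout_of_new_socket(seconds):
    """Hypothetical helper: report the timeout a fresh socket inherits."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        sock = socket.socket()
        try:
            return sock.gettimeout()
        finally:
            sock.close()
    finally:
        # The setting is global to the process, so put the old value back.
        socket.setdefaulttimeout(previous)
```

If other libraries in the same process open sockets, they are affected too, so passing timeout= to the individual call is usually the less surprising option when it works.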
How to enforce a timeout on the entire function call:
socket.setdefaulttimeout only affects how long Python waits before an exception is raised if the server has not issued a response.
Neither it nor urlopen(..., timeout=...) enforces a time limit on the entire function call.
To do that, you could use eventlet.
If you don't want to install eventlet, you could use multiprocessing from the standard library, though this solution will not scale as well as an asynchronous solution such as the one eventlet provides.
import urllib2
import socket
import multiprocessing as mp

def timeout(t, cmd, *args, **kwds):
    pool = mp.Pool(processes=1)
    result = pool.apply_async(cmd, args=args, kwds=kwds)
    try:
        retval = result.get(timeout=t)
    except mp.TimeoutError as err:
        pool.terminate()
        pool.join()
        raise
    else:
        return retval

def open(url):
    response = urllib2.urlopen(url)
    print(response)

url = 'https://www.5giay.vn/'
try:
    timeout(5, open, url)
except mp.TimeoutError as err:
    print('timeout')
Running this will either succeed or time out in about 5 seconds of wall clock time.
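If spawning a process is too heavy, a thread-based variant is possible in Python 3 with concurrent.futures. Note that this is a different technique from the multiprocessing approach above, with a real drawback: a thread cannot be terminated, so a stuck call keeps running in the background after the caller gives up. The names call_with_deadline and slow_call are invented for this sketch.

```python
import concurrent.futures
import time


def call_with_deadline(seconds, func, *args, **kwargs):
    """Hypothetical helper: stop waiting for func after `seconds`."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        # Raises concurrent.futures.TimeoutError if the deadline passes.
        return future.result(timeout=seconds)
    finally:
        # Do not block on the worker. Unlike pool.terminate(), this
        # cannot kill it; an unfinished call simply runs to completion.
        executor.shutdown(wait=False)


def slow_call(duration):
    # Stand-in for a urlopen() call that takes too long.
    time.sleep(duration)
    return "finished"
```

A call such as call_with_deadline(5, urllib.request.urlopen, url) would bound how long the caller waits, but the interpreter will still wait for the worker thread at exit, so combining it with a socket-level timeout is advisable.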
How can I force urllib2 to time out?
I usually use netcat to listen on port 80 of my local machine:
nc -l 80
Then I use http://localhost/ as the request URL in my application. Netcat will accept the connection on the HTTP port but will never send a response, so the request is guaranteed to time out, provided that you have specified a timeout in your urllib2.urlopen() call or by calling socket.setdefaulttimeout().
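If netcat is not available, or you would rather not bind port 80 (which usually needs root), roughly the same setup can be sketched in pure Python 3. This is an illustration under assumptions, not a drop-in test fixture: request_silent_server is a made-up name, the listener uses an OS-assigned port on 127.0.0.1, and proxies are explicitly disabled so the request really reaches the local socket.

```python
import socket
import urllib.error
import urllib.request


def request_silent_server(timeout_seconds):
    """Hypothetical helper: request a local port that never answers."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)                # connections queue up; nothing replies
    port = listener.getsockname()[1]
    # Bypass any proxy settings so the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        opener.open("http://127.0.0.1:%d/" % port, timeout=timeout_seconds)
    except socket.timeout:
        return "timed out"
    except urllib.error.URLError as error:
        if isinstance(error.reason, socket.timeout):
            return "timed out"
        raise
    finally:
        listener.close()
    return "got a response"
```

Because the listener never calls accept() or sends data, the client connects successfully and then waits for a response that never comes, which is the same situation the netcat trick creates.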
Timing out urllib2 urlopen operation in Python 2.4
You can achieve this using signals.
Here's an example of my signal decorator that you can use to set the timeout for individual functions.
P.S. I'm not sure whether this is syntactically correct for 2.4; I'm using 2.6, but 2.4 does support signals.
import signal
import time

class TimeOutException(Exception):
    pass

def timeout(seconds, *args, **kwargs):
    def fn(f):
        def wrapped_fn(*args, **kwargs):
            signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            try:
                return f(*args, **kwargs)
            finally:
                # cancel the pending alarm once f has finished
                signal.alarm(0)
        return wrapped_fn
    return fn

def handler(signum, frame):
    raise TimeOutException("Timeout")

@timeout(5)
def my_function_that_takes_long(time_to_sleep):
    time.sleep(time_to_sleep)

if __name__ == '__main__':
    print 'Calling function that takes 2 seconds'
    try:
        my_function_that_takes_long(2)
    except TimeOutException:
        print 'Timed out'
    print 'Calling function that takes 10 seconds'
    try:
        my_function_that_takes_long(10)
    except TimeOutException:
        print 'Timed out'
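For readers on Python 3, the same signal idea can be written as a context manager. Treat this as a sketch with caveats: time_limit is a hypothetical name, SIGALRM exists only on Unix-like systems, and signal handlers can only be installed from the main thread. It uses signal.setitimer so that fractional seconds are allowed.

```python
import signal
import time
from contextlib import contextmanager


class TimeOutException(Exception):
    pass


@contextmanager
def time_limit(seconds):
    """Hypothetical helper: raise TimeOutException after `seconds`."""
    def handler(signum, frame):
        raise TimeOutException("Timeout")

    previous = signal.signal(signal.SIGALRM, handler)
    # setitimer accepts fractional seconds, unlike signal.alarm().
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # cancel the pending timer
        signal.signal(signal.SIGALRM, previous)   # restore the old handler
```

Usage would look like with time_limit(5): followed by the urlopen call in the body, wrapped in a try/except TimeOutException.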