Handling Urllib2's Timeout? - Python

There are very few cases where you want to use a bare except:. It catches every exception, which makes failures hard to debug, and that includes SystemExit and KeyboardInterrupt, which makes your program annoying to use.
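To make the difference concrete, here is a minimal sketch written for Python 3. The function names are invented for illustration. It compares a bare except: with except Exception: when the wrapped code asks the interpreter to exit.

```python
def swallow_everything(func):
    """Run func; a bare except traps every exception, even exit requests."""
    try:
        func()
    except:  # deliberately bare, to show the problem
        return "swallowed"
    return "ok"


def swallow_errors_only(func):
    """Run func; except Exception lets SystemExit/KeyboardInterrupt through."""
    try:
        func()
    except Exception:
        return "swallowed"
    return "ok"


def request_exit():
    """Stand-in for code that calls sys.exit() or is interrupted by the user."""
    raise SystemExit(1)
```

Calling swallow_everything(request_exit) returns "swallowed", so the exit request is silently lost, while swallow_errors_only(request_exit) lets SystemExit propagate to the caller.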

At the very simplest, you would catch urllib2.URLError:

import urllib2

try:
    urllib2.urlopen("http://example.com", timeout=1)
except urllib2.URLError as e:
    # MyException is a custom exception class, defined in the next example
    raise MyException("There was an error: %r" % e)

The following should capture the specific error raised when the connection times out:

import urllib2
import socket

class MyException(Exception):
    pass

try:
    urllib2.urlopen("http://example.com", timeout=1)
except urllib2.URLError as e:
    # For Python 2.6
    if isinstance(e.reason, socket.timeout):
        raise MyException("There was an error: %r" % e)
    else:
        # reraise the original error
        raise
except socket.timeout as e:
    # For Python 2.7
    raise MyException("There was an error: %r" % e)

How to handle urllib's timeout in Python 3?

Catch the different exceptions with explicit clauses, and check the reason for the exception with URLError (thank you Régis B.)

import logging
import urllib.request
from socket import timeout
from urllib.error import HTTPError, URLError

# url and name are assumed to be defined by the surrounding code
try:
    response = urllib.request.urlopen(url, timeout=10).read().decode('utf-8')
except HTTPError as error:
    logging.error('HTTP Error: Data of %s not retrieved because %s\nURL: %s', name, error, url)
except URLError as error:
    if isinstance(error.reason, timeout):
        logging.error('Timeout Error: Data of %s not retrieved because %s\nURL: %s', name, error, url)
    else:
        logging.error('URL Error: Data of %s not retrieved because %s\nURL: %s', name, error, url)
else:
    logging.info('Access successful.')

NB: For those following the recent comments, the original post referenced Python 3.2, where you needed to catch timeout errors explicitly with socket.timeout. For example:


# Warning - Python 3.2 code
import logging
import urllib.request
from socket import timeout

try:
    response = urllib.request.urlopen(url, timeout=10).read().decode('utf-8')
except timeout:
    logging.error('socket timed out - URL %s', url)
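In practice a timeout can still surface in either form on Python 3: wrapped in a URLError (typically while connecting) or as a bare socket.timeout (typically while waiting for or reading the response). A rough sketch that handles both is below. The helper names classify_error and fetch_text are made up for this example and are not part of urllib.

```python
import logging
import socket
import urllib.error
import urllib.request


def classify_error(error):
    """Return 'timeout', 'http', 'url' or 'other' for an exception from urlopen."""
    if isinstance(error, socket.timeout):
        return "timeout"          # raised directly, e.g. while reading the body
    if isinstance(error, urllib.error.HTTPError):
        return "http"             # the server answered with an error status
    if isinstance(error, urllib.error.URLError):
        # Connection-phase failures are wrapped; the cause is in .reason
        if isinstance(error.reason, socket.timeout):
            return "timeout"
        return "url"
    return "other"


def fetch_text(url, timeout=10):
    """Return the decoded body, or None after logging what went wrong."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except OSError as error:
        # URLError, HTTPError and socket.timeout are all OSError subclasses
        logging.error("%s error for %s: %s", classify_error(error), url, error)
        return None
```

HTTPError is checked before URLError because it is a subclass of URLError; the other way round, the HTTP branch would never be reached.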

Setting the timeout on a urllib2.Request() call

urlopen itself accepts a data parameter for POST, but you can also call urlopen on a Request object and pass the timeout there, like this:

import urllib2
# data is the URL-encoded POST body (for example from urllib.urlencode), or None for a GET
request = urllib2.Request('http://www.example.com', data)
response = urllib2.urlopen(request, timeout=4)
content = response.read()
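For readers on Python 3, the same pattern with urllib.request might look roughly like this. The helper names, the URL and the form field are placeholders chosen for illustration. The point is that the timeout is still an argument to urlopen, not to Request.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Build a POST Request; on Python 3 the body must be bytes."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=body)


def send(request, timeout=4):
    """Open a prepared Request; the timeout is given to urlopen, not to Request."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Placeholder URL and form field, for illustration only.
request = build_post_request("http://www.example.com", {"q": "timeout"})
```

Calling send(request) would then perform the POST with a 4 second timeout.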

How to catch timeout error with urllib

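On Python 3.3 and later, TimeoutError is a built-in exception class, and since Python 3.10 socket.timeout is an alias of it. You can catch it with

except TimeoutError:
    ...

On Python 3 versions before 3.10, catch socket.timeout instead, because there it is a separate class.

On Python 2.x, you have two options:

  1. Catch the socket.timeout exception (also reachable as urllib.socket.timeout)

  2. If you use the requests library instead of urllib, catch the requests.Timeout exception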
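If you need one clause that works across Python 3 releases, catching socket.timeout is a safe choice, since from Python 3.10 that name is simply an alias of TimeoutError. The sketch below shows the exception being raised and caught without any network access, using a connected socket pair. The function name is invented for the example.

```python
import socket


def wait_for_byte(seconds):
    """Wait up to `seconds` for data that never arrives."""
    reader, writer = socket.socketpair()   # connected pair, no network needed
    try:
        reader.settimeout(seconds)
        try:
            reader.recv(1)                 # nothing is ever sent, so this blocks
        except socket.timeout:             # same class as TimeoutError on 3.10+
            return "timed out"
        return "got data"
    finally:
        reader.close()
        writer.close()
```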

How to handle urllib2 socket timeouts?

Explicitly catch the timeout exception: https://docs.python.org/3/library/socket.html#socket.timeout

# This fragment runs inside a loop over submissions (hence the continue)
try:
    image_file = urllib2.urlopen(submission.url, timeout=5)
except urllib2.URLError as e:
    print(e)
    continue
except socket.timeout:
    print("timed out")
    # Your timeout handling code here...
else:
    with open('/home/mona/computer_vision/image_retrieval/images/'+category+'/' + datetime.datetime.now().strftime('%y-%m-%d-%s') + submission.url[-5:], 'wb') as output_image:
        output_image.write(image_file.read())

OP:
Thanks!
Following your suggestion I added these handlers, and that solved my problem on Python 2.7:

except socket.timeout as e:
    print(e)
    continue
except socket.error as e:
    print(e)
    continue

Python urllib2 does not respect timeout

If you run

import urllib2

url = 'https://www.5giay.vn/'
urllib2.urlopen(url, timeout=1.0)

wait for a few seconds, and then use C-c to interrupt the program, you'll see

  File "/usr/lib/python2.7/ssl.py", line 260, in read
    return self._sslobj.read(len)
KeyboardInterrupt

This shows that the program is hanging on self._sslobj.read(len).

SSL timeouts raise socket.timeout.

You can control the delay before socket.timeout is raised by calling
socket.setdefaulttimeout(1.0).

For example,

import urllib2
import socket

socket.setdefaulttimeout(1.0)
url = 'https://www.5giay.vn/'
try:
    urllib2.urlopen(url, timeout=1.0)
except IOError as err:
    print('timeout')

% time script.py
timeout

real 0m3.629s
user 0m0.020s
sys 0m0.024s

Note that the requests module succeeds here although urllib2 did not:

import requests
r = requests.get('https://www.5giay.vn/')
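Keep in mind that socket.setdefaulttimeout changes a process-wide default, so every socket created afterwards inherits it. If that is too broad, one option is to set it only for a limited scope and then restore the previous value. Here is a small Python 3 sketch; the default_timeout helper is hypothetical, not something the socket module provides.

```python
import socket
from contextlib import contextmanager


@contextmanager
def default_timeout(seconds):
    """Temporarily change the default timeout for newly created sockets."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore whatever default was in effect before
        socket.setdefaulttimeout(previous)
```

Any socket created inside a with default_timeout(1.0): block starts out with a 1 second timeout, and sockets created after the block get the earlier default again.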

How to enforce a timeout on the entire function call:

socket.setdefaulttimeout only affects how long Python waits on each blocking socket operation before raising an exception when the server has not responded.

Neither it nor urlopen(..., timeout=...) enforces a time limit on the entire function call.

To do that, you could use eventlet.

If you don't want to install eventlet, you could use multiprocessing from the standard library, though this solution will not scale as well as an asynchronous one such as eventlet provides.

import urllib2
import socket
import multiprocessing as mp

def timeout(t, cmd, *args, **kwds):
    pool = mp.Pool(processes=1)
    result = pool.apply_async(cmd, args=args, kwds=kwds)
    try:
        retval = result.get(timeout=t)
    except mp.TimeoutError as err:
        pool.terminate()
        pool.join()
        raise
    else:
        return retval

def open(url):
    response = urllib2.urlopen(url)
    print(response)

url = 'https://www.5giay.vn/'
try:
    timeout(5, open, url)
except mp.TimeoutError as err:
    print('timeout')

Running this will either succeed or time out after about 5 seconds of wall clock time.
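On Python 3 a lighter-weight variant of the same idea can be sketched with concurrent.futures and a worker thread. This is a different technique from the multiprocessing version above, with one important caveat: a thread cannot be killed, so after the deadline the caller stops waiting but the call itself keeps running in the background until it finishes. The function name call_with_deadline is invented for this example.

```python
import concurrent.futures
import time


def call_with_deadline(seconds, func, *args, **kwargs):
    """Return func(*args, **kwargs) or raise concurrent.futures.TimeoutError."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=seconds)
    finally:
        # Do not wait for the worker: a timed-out call keeps running in the
        # background because Python threads cannot be killed.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    try:
        call_with_deadline(0.1, time.sleep, 0.5)   # stand-in for a slow request
    except concurrent.futures.TimeoutError:
        print("timeout")
```

If the abandoned call must really be stopped, for example to free the connection, the process-based approach is the safer choice.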

How can I force urllib2 to time out?

I usually use netcat to listen on port 80 of my local machine:

nc -l 80

Then I use http://localhost/ as the request URL in my application. Netcat accepts the connection on the HTTP port but never sends a response, so the request is guaranteed to time out, provided that you have specified a timeout in your urllib2.urlopen() call or by calling socket.setdefaulttimeout().
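If netcat is not available, or you want the same trick inside an automated test, a listening socket that never answers can play the same role. Below is a minimal Python 3 sketch under that assumption. The helper names are made up, and it binds to a free high port rather than port 80 so that no special privileges are needed.

```python
import socket
import urllib.error
import urllib.request


def start_silent_server():
    """Listen on a free localhost port but never answer; return (socket, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
    server.listen(1)                   # the OS completes the handshake for us
    return server, server.getsockname()[1]


def request_times_out(port, timeout):
    """Return True if a request to the silent server ends in a timeout."""
    # An empty ProxyHandler makes sure the request really goes to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        opener.open("http://127.0.0.1:%d/" % port, timeout=timeout)
    except socket.timeout:
        return True                    # timed out waiting for the response
    except urllib.error.URLError as error:
        return isinstance(error.reason, socket.timeout)
    return False
```

Start the server, call request_times_out(port, 0.3), and close the server socket when you are done.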

Timing out urllib2 urlopen operation in Python 2.4

You can achieve this using signals.

Here's an example of my signal decorator that you can use to set the timeout for individual functions.

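P.S. I'm not sure this is syntactically correct for 2.4. I'm using 2.6, but 2.4 does support signals. Note that signal.SIGALRM is only available on Unix and the handler must be installed from the main thread.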

import signal
import time

class TimeOutException(Exception):
    pass

def timeout(seconds):
    def fn(f):
        def wrapped_fn(*args, **kwargs):
            signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            try:
                return f(*args, **kwargs)
            finally:
                # cancel the pending alarm so it cannot fire after f returns
                signal.alarm(0)
        return wrapped_fn
    return fn

def handler(signum, frame):
    raise TimeOutException("Timeout")

@timeout(5)
def my_function_that_takes_long(time_to_sleep):
    time.sleep(time_to_sleep)

if __name__ == '__main__':
    print('Calling function that takes 2 seconds')
    try:
        my_function_that_takes_long(2)
    except TimeOutException:
        print('Timed out')

    print('Calling function that takes 10 seconds')
    try:
        my_function_that_takes_long(10)
    except TimeOutException:
        print('Timed out')

