python: urllib2 how to send cookie with urlopen request
Cookie is just another HTTP header.
import urllib2
opener = urllib2.build_opener()
opener.addheaders.append(('Cookie', 'cookiename=cookievalue'))
f = opener.open("http://example.com/")
See the urllib2 examples for other ways to add HTTP headers to your request.
There are other ways to handle cookies. Some modules, like cookielib, try to behave like a web browser: they remember which cookies you received previously and automatically send them again in subsequent requests.
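For readers on Python 3, where urllib2 was folded into urllib.request, the same header-based approach might look like the sketch below. The cookie names and values are placeholders, and the request is only built, not sent, so it runs without network access. A client sends all of its cookies in a single Cookie header, separated by "; ".

```python
import urllib.request

# Hypothetical cookie names and values, for illustration only.
cookies = {"cookiename": "cookievalue", "othercookie": "othervalue"}

# Join the name=value pairs into one Cookie header value.
cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())

req = urllib.request.Request(
    "http://example.com/",
    headers={"Cookie": cookie_header},
)

# urllib.request.urlopen(req) would actually send the request; it is left
# out here so the sketch does not depend on the network.
sent_header = req.get_header("Cookie")
print(sent_header)
```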
How to send cookies with urllib
To do this with urllib, you need to:
- Construct a Cookie object. The constructor isn't documented in the docs, but if you run help(http.cookiejar.Cookie) in the interactive interpreter, you can see that its constructor demands values for all 16 attributes. Notice that the docs say, "It is not expected that users of http.cookiejar construct their own Cookie instances."
- Add it to the cookiejar with cj.set_cookie(cookie).
- Tell the cookiejar to add the correct headers to the request with cj.add_cookie_header(req).
Assuming you've configured the policy correctly, you're set.
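The steps above can be sketched roughly as follows. The helper make_cookie, the cookie name and value, and the example.com host are all assumptions made for illustration. The default cookie policy is used, and the request is only built, never sent.

```python
import http.cookiejar
import urllib.request


def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: fill in every attribute the constructor demands."""
    return http.cookiejar.Cookie(
        version=0,
        name=name,
        value=value,
        port=None,
        port_specified=False,
        domain=domain,
        domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path=path,
        path_specified=True,
        secure=False,
        expires=None,      # no expiry: a session cookie
        discard=True,
        comment=None,
        comment_url=None,
        rest={},
    )


cj = http.cookiejar.CookieJar()
cj.set_cookie(make_cookie("required_cookie", "some-value", "www.example.com"))

req = urllib.request.Request("http://www.example.com/")
# The jar only adds cookies whose domain and path match the request.
cj.add_cookie_header(req)

cookie_header = req.get_header("Cookie")
print(cookie_header)
```

If the domain or path of the cookie does not match the request URL, the jar silently adds nothing, which is one reason this route is easy to get wrong.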
But this is a huge pain. As the docs for urllib.request say:
See also: The Requests package is recommended for a higher-level HTTP client interface.
And unless you have some good reason you can't install requests, you really should go that way. urllib is tolerable for really simple cases, and it can be handy when you need to get deep under the covers, but for everything else, requests is much better.
With requests, your whole program becomes a one-liner:
webpage = requests.get('https://www.thewebsite.com/', cookies={'required_cookie': required_value}, headers={'User-Agent': 'Mozilla/5.0'}).text
… although it's probably more readable as a few lines:
cookies = {'required_cookie': required_value}
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://www.thewebsite.com/', cookies=cookies, headers=headers)
webpage = response.text
How do you send cookies to a website in urllib2 (and urllib) python?
"to set multiple cookie add set-cookie header as many times as you need. About quotation marks, in python should be header = {... "Set-Cookie":"name=vale" ...} if your question is about http specification of Set-Cookie see link – ejrav"
Using urllib/urllib2 get a session cookie and use it to login to a final page
The 'final page' was rejecting the cookies because Python was adding ('User-agent', 'Python-urllib/2.7') to the headers. After removing this element I was able to log in to the website:
opener.addheaders.pop(0)
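Since pop(0) relies on the User-agent entry being first in the list, a slightly more defensive variant is to filter by header name. This is a sketch for Python 3's urllib.request, where the default entry has the same shape; nothing is sent over the network.

```python
import urllib.request

opener = urllib.request.build_opener()

# Drop any User-agent entry by name rather than by list position.
opener.addheaders = [
    (name, value)
    for name, value in opener.addheaders
    if name.lower() != "user-agent"
]

remaining_names = [name for name, _ in opener.addheaders]
print(remaining_names)
```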
Enabling cookies with urllib
By using a CookieJar, of course! And urllib2.
import cookielib
import urllib2
cookiejar = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
opener.open(...)
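To see what the cookie processor does with the jar, here is an offline sketch in Python 3 (cookielib became http.cookiejar, urllib2 became urllib.request). FakeResponse is a hypothetical stand-in for a server reply, and the example.com URLs and the session cookie are made up. In real use, opener.open() makes these two jar calls for you on every request.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for a server response; the jar only calls info()."""

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            # Assigning repeatedly appends, so several Set-Cookie headers can coexist.
            self._headers["Set-Cookie"] = value

    def info(self):
        return self._headers


cookiejar = http.cookiejar.CookieJar()
# This opener is not used below; it shows how the jar would be wired up for real requests.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookiejar))

# Step 1: a response to the first request sets a cookie, which the jar stores.
login_request = urllib.request.Request("http://www.example.com/login")
cookiejar.extract_cookies(FakeResponse(["session=abc123; Path=/"]), login_request)

# Step 2: the jar attaches matching cookies to a later request to the same host.
next_request = urllib.request.Request("http://www.example.com/account")
cookiejar.add_cookie_header(next_request)

sent_cookie = next_request.get_header("Cookie")
print(sent_cookie)
```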
As an aside:
In my experience, a site you want to parse telling you to enable cookies is a good indicator that this is going to be an unpleasant experience, and you'll be asking how to enable JavaScript in urllib2 next (which is not really answerable, by the way).
If you think you'll benefit from a higher-level approach, you should probably evaluate mechanize and selenium.
pass session cookies in http header with python urllib2?
The latest version of requests has support for sessions (as well as being really simple to use and generally great):
with requests.session() as s:
    s.post(url, data=user_data)
    r = s.get(url_2)
Python form POST using urllib2 (also question on saving/using cookies)
There are quite a few problems with the code that you've posted. Typically you'll want to build a custom opener which can handle redirects, HTTPS, etc.; otherwise you'll run into trouble. As far as the cookies themselves go, you need to call the load and save methods on your cookiejar, and use one of its subclasses, such as MozillaCookieJar or LWPCookieJar.
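As a rough illustration of the save/load round trip, here is a Python 3 sketch using http.cookiejar.LWPCookieJar. The file location, the make_persistent_cookie helper, and the cookie itself are assumptions. The cookie is given an expiry time and discard=False because save() skips session cookies by default.

```python
import http.cookiejar
import os
import tempfile
import time

# Hypothetical location for the cookie file.
cookie_path = os.path.join(tempfile.mkdtemp(), "example.cookies")


def make_persistent_cookie(name, value, domain):
    """Hypothetical helper: a cookie that expires in an hour, so save() keeps it."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False,
        expires=int(time.time()) + 3600,
        discard=False,
        comment=None, comment_url=None, rest={},
    )


# Write the jar to disk.
jar = http.cookiejar.LWPCookieJar(cookie_path)
jar.set_cookie(make_persistent_cookie("session", "abc123", "www.example.com"))
jar.save()

# Later (for example in another run), read it back if the file exists.
restored = http.cookiejar.LWPCookieJar(cookie_path)
if os.path.exists(cookie_path):
    restored.load()

restored_values = {cookie.name: cookie.value for cookie in restored}
print(restored_values)
```

To persist session cookies as well, save(ignore_discard=True) and load(ignore_discard=True) can be used.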
Here's a class I wrote to login to Facebook, back when I was playing silly web games. I just modified it to use a file based cookiejar, rather than an in-memory one.
import cookielib
import os
import urllib
import urllib2

# set these to whatever your fb account is
fb_username = "your@facebook.login"
fb_password = "secretpassword"
cookie_filename = "facebook.cookies"

class WebGamePlayer(object):

    def __init__(self, login, password):
        """ Start up... """
        self.login = login
        self.password = password
        self.cj = cookielib.MozillaCookieJar(cookie_filename)
        if os.access(cookie_filename, os.F_OK):
            self.cj.load()
        self.opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
            urllib2.HTTPSHandler(debuglevel=0),
            urllib2.HTTPCookieProcessor(self.cj)
        )
        self.opener.addheaders = [
            ('User-agent', ('Mozilla/4.0 (compatible; MSIE 6.0; '
                            'Windows NT 5.2; .NET CLR 1.1.4322)'))
        ]
        # need this twice - once to set cookies, once to log in...
        self.loginToFacebook()
        self.loginToFacebook()
        self.cj.save()

    def loginToFacebook(self):
        """
        Handle login. This should populate our cookie jar.
        """
        login_data = urllib.urlencode({
            'email' : self.login,
            'pass' : self.password,
        })
        response = self.opener.open("https://login.facebook.com/login.php", login_data)
        return ''.join(response.readlines())

test = WebGamePlayer(fb_username, fb_password)
After you've set your username and password, you should see a file, facebook.cookies, with your cookies in it. In practice you'll probably want to modify it to check whether you have an active cookie and use that, then log in again if access is denied.
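One way to approach the "check whether you have an active cookie" part is sketched below in Python 3. Both has_active_cookie and make_cookie are hypothetical helpers, and the domain and cookie names are made up. The check only looks at local expiry times.

```python
import http.cookiejar
import time


def make_cookie(name, value, domain, expires):
    """Hypothetical helper that fills in the remaining Cookie attributes."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=False,
        comment=None, comment_url=None, rest={},
    )


def has_active_cookie(jar, domain, now=None):
    """Return True if the jar holds at least one unexpired cookie for domain."""
    now = time.time() if now is None else now
    return any(
        cookie.domain == domain and not cookie.is_expired(now)
        for cookie in jar
    )


jar = http.cookiejar.CookieJar()
now = int(time.time())

# Only a stale cookie is present, so a fresh login would be needed.
jar.set_cookie(make_cookie("old_session", "stale", "www.example.com", now - 60))
needs_login_before = not has_active_cookie(jar, "www.example.com")

# After a (simulated) login stores a fresh cookie, the login can be skipped.
jar.set_cookie(make_cookie("session", "fresh", "www.example.com", now + 3600))
needs_login_after = not has_active_cookie(jar, "www.example.com")

print(needs_login_before, needs_login_after)
```

An unexpired cookie can still be invalidated by the server, so the "log in again if access is denied" fallback is still needed.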