Mechanize and NTLM Authentication
Mechanize 2 supports NTLM auth:
m = Mechanize.new
m.agent.username = 'user'
m.agent.password = 'password'
m.agent.domain = 'addomain'
Use python mechanize to log into pages with NTLM authentication
After a lot of research I managed to find the reason behind this.
First of all, the site uses so-called NTLM authentication, which mechanize does not support out of the box.
This can help to find out the authentication mechanism of a site:
wget -O /dev/null -S http://www.the-site.com/
So the code was modified a little bit:
import sys
import urllib2
import mechanize
from ntlm import HTTPNtlmAuthHandler
print("LOGIN...")
user = sys.argv[1]
password = sys.argv[2]
url = sys.argv[3]
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
browser = mechanize.Browser()
handlersToKeep = []
for handler in browser.handlers:
    if not isinstance(handler,
                      (mechanize._http.HTTPRobotRulesProcessor)):
        handlersToKeep.append(handler)
browser.handlers = handlersToKeep
browser.add_handler(auth_NTLM)
response = browser.open(url)
response = browser.open("http://www.the-site.com")
print(response.read())
Finally, mechanize needs to be patched as follows:
--- _response.py.old 2013-02-06 11:14:33.208385467 +0100
+++ _response.py 2013-02-06 11:21:41.884081708 +0100
@@ -350,8 +350,13 @@
             self.fileno = self.fp.fileno
         else:
             self.fileno = lambda: None
-        self.__iter__ = self.fp.__iter__
-        self.next = self.fp.next
+
+        if hasattr(self.fp, "__iter__"):
+            self.__iter__ = self.fp.__iter__
+            self.next = self.fp.next
+        else:
+            self.__iter__ = lambda self: self
+            self.next = lambda self: self.fp.readline()
 
     def __repr__(self):
         return '<%s at %s whose fp = %r>' % (
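The fallback idea in the patch (iterate by calling readline() when the wrapped object has no __iter__) can be tried outside mechanize. The following is only a minimal sketch, not mechanize code: ReadlineIterator and ReadlineOnly are hypothetical names made up for illustration, and the sketch assumes the wrapped object returns an empty value from readline() at end of input. Unlike the lambdas in the patch, it also stops iterating at that point.

```python
class ReadlineIterator(object):
    """Iterate over a file-like object that only offers readline()."""

    def __init__(self, fp):
        self.fp = fp

    def __iter__(self):
        return self

    def __next__(self):
        line = self.fp.readline()
        if not line:  # an empty string/bytes value signals end of input
            raise StopIteration
        return line

    next = __next__  # Python 2 spelling of the same method


class ReadlineOnly(object):
    """Hypothetical stand-in for a response object without __iter__."""

    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        return self._lines.pop(0) if self._lines else ""


lines = list(ReadlineIterator(ReadlineOnly(["a\n", "b\n"])))
print(lines)
```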
Python mechanize with NTLM getting AttributeError: HTTPResponse instance has no attribute '__iter__'
I patched mechanize to work around this, using the same _response.py patch shown in the previous answer.
Accessing a web page with basic auth on python
The site you are using as a sample page requires NTLM authentication. You can see this by looking at the returned header fields. For example, curl -I http://www.dogus.edu.tr/dusor/FrmMain.aspx
returns:
HTTP/1.1 401 Unauthorized
Content-Length: 1293
Content-Type: text/html
Server: Microsoft-IIS/7.0
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
X-Powered-By: ASP.NET
Date: Mon, 07 Apr 2014 21:24:09 GMT
The line WWW-Authenticate: NTLM
tells you which authentication method is used. I think the answer to the question "Use python mechanize to log into pages with NTLM authentication" will help you.
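The same check can be scripted. This is a rough sketch for Python 3 using only the standard library; auth_schemes and probe are hypothetical helper names, and the parsing assumes one challenge per WWW-Authenticate header line, as in the curl output above (a server may also pack several challenges into one line, which this sketch does not handle).

```python
import urllib.error
import urllib.request


def auth_schemes(header_values):
    """Return the scheme name from each WWW-Authenticate header value."""
    schemes = []
    for value in header_values:
        parts = value.split(None, 1)
        if parts:
            schemes.append(parts[0])
    return schemes


def probe(url):
    """Request url and report the schemes offered on a 401 response."""
    try:
        urllib.request.urlopen(url)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return auth_schemes(err.headers.get_all("WWW-Authenticate") or [])
        raise
    return []  # no authentication challenge was sent


# Header values copied from the curl output above.
sample = ["Negotiate", "NTLM"]
print(auth_schemes(sample))
```

Calling probe() with the URL of the page in question performs a real request; the sample list only exercises the parsing step.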
Scraping a webpage with a pop up alert auth in python 3
I had the same problem with the same authentication popup on another site and resolved it with the Basic Authentication support in requests:
import requests
from requests.auth import HTTPBasicAuth
requests.get('https://api.github.com/user', auth=HTTPBasicAuth('user', 'pass'))
https://docs.python-requests.org/en/latest/user/authentication/
HTTP login in mechanize
from base64 import b64encode
import mechanize

url = 'http://192.168.3.5/table.js'
username = 'admin'
password = 'password'

# I have had to add a newline ('%s:%s\n'), but
# you may not have to.
# Encoding to bytes first keeps this working on Python 2 and 3.
b64login = b64encode(('%s:%s' % (username, password)).encode()).decode()

br = mechanize.Browser()
# # I needed to change to Mozilla for mine, but most do not
# br.addheaders= [('User-agent', 'Mozilla/5.0')]
br.addheaders.append(
    ('Authorization', 'Basic %s' % b64login)
)

br.open(url)
r = br.response()
data = r.read()
print(data)
I have not tried it, but this might also work:
import mechanize

br = mechanize.Browser()
# Credentials are embedded directly in the URL.
response = br.open("http://USERNAME:PASSWORD@ab/cabs")
print(response.geturl())
print(response.read())
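Both snippets above come down to sending "username:password", base64-encoded, in an Authorization header. As a small illustration for Python 3 (standard library only), the sketch below builds that header value and pulls credentials out of a user:pass@host URL. The helper names basic_auth_header and credentials_from_url are hypothetical, and encoding the credentials as UTF-8 is an assumption; some servers expect a different encoding.

```python
from base64 import b64encode
from urllib.parse import urlsplit


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + b64encode(raw).decode("ascii")


def credentials_from_url(url):
    """Extract (username, password) embedded in a user:pass@host URL."""
    parts = urlsplit(url)
    return parts.username, parts.password


user, pw = credentials_from_url("http://USERNAME:PASSWORD@ab/cabs")
print(basic_auth_header(user, pw))
```

The resulting string can be passed as the second element of the ('Authorization', ...) tuple appended to br.addheaders in the first snippet.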
Perl WWW::Mechanize -- Authentication Error GETing URL
With older versions of Mechanize you could subclass the WWW::Mechanize package and provide your own credentials routine:
package MyMech;
use vars qw(@ISA);
@ISA = qw(WWW::Mechanize);
sub get_basic_credentials {
my ($self, $realm, $uri) = @_;
return( "user", "password" );
}
Then in your program use this package instead of WWW::Mechanize:
package main;
my $mech = MyMech->new();
$mech->get( $url );
Update
You've updated your question to indicate the requirement of NTLM authentication. Check out LWP::Authen::Ntlm on CPAN.