Unable to use Selenium to automate Chase site login
I took your code and simplified the structure and ran the test with minimal lines of code as follows:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
# Selenium 4 style: the driver path is passed through Service, the options through options=
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(service=Service(r'C:\Utility\BrowserDrivers\chromedriver.exe'), options=options)
driver.get("https://secure07c.chase.com/web/auth/#/logon/logon/chaseOnline?")
# Wait until the username field is clickable before typing into it
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.jpui.input.logon-xs-toggle.clientSideError"))).send_keys("jsmiao")
driver.find_element(By.CSS_SELECTOR, "input.jpui.input.logon-xs-toggle#password-input-field").send_keys("hello")
driver.find_element(By.CSS_SELECTOR, "button#signin-button>span.label").click()
Like you, I hit the same roadblock.
It seems the click() on the element with the text Sign in does happen. The username / password lookup is initiated, but the process is interrupted. While inspecting the DOM tree of the webpage, you will find that some of the <script> tags refer to JavaScript files whose paths contain the keyword dist. As an example:
<script src="https://static.chasecdn.com/web/library/blue-boot/dist/2.20.3/blue-boot/js/main-ver.js"></script>
<script type="text/javascript" charset="utf-8" async="" data-requirecontext="_" data-requiremodule="blue-vendor/main" src="https://static.chasecdn.com/web/library/blue-vendor/dist/2.11.1/blue-vendor/js/main.js"></script>
<script type="text/javascript" charset="utf-8" async="" data-requirecontext="_" data-requiremodule="blue/main" src="https://static.chasecdn.com/web/library/blue-core/dist/2.16.3/blue/js/main.js"></script>
<script type="text/javascript" charset="utf-8" async="" data-requirecontext="_" data-requiremodule="blue-app/main" src="https://static.chasecdn.com/web/library/blue-app/dist/2.15.1/blue-app/js/main.js"></script>
Which is a clear indication that the website is protected by Bot Management service provider Distil Networks and the navigation by ChromeDriver gets detected and subsequently blocked.
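The inspection described above can be sketched in Python with only the standard library. This is a minimal sketch, not part of the original answer: the names ScriptSrcCollector and script_srcs_containing and the sample HTML are hypothetical, and a match is only as strong as the dist heuristic itself, so treat it as a hint to investigate further.

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def script_srcs_containing(html, keyword):
    """Return script URLs that have `keyword` as a whole path segment."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    return [src for src in collector.sources if keyword in src.split("/")]


# Sample markup for illustration; the second URL is made up.
sample_html = """
<script src="https://static.chasecdn.com/web/library/blue-boot/dist/2.20.3/blue-boot/js/main-ver.js"></script>
<script src="https://example.com/js/app.js"></script>
"""
matches = script_srcs_containing(sample_html, "dist")
print(matches)
```

With a live session, the string passed in would typically be driver.page_source.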
Distil
As per the article There Really Is Something About Distil.it...:
Distil protects sites against automatic content scraping bots by observing site behavior and identifying patterns peculiar to scrapers. When Distil identifies a malicious bot on one site, it creates a blacklisted behavioral profile that is deployed to all its customers. Something like a bot firewall, Distil detects patterns and reacts.
Further,
"One pattern with Selenium was automating the theft of Web content," Distil CEO Rami Essaid said in an interview last week. "Even though they can create new bots, we figured out a way to identify Selenium, the tool they're using, so we're blocking Selenium no matter how many times they iterate on that bot. We're doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious."
Reference
You can find a couple of detailed discussions in:
- Is there a way to use Selenium WebDriver without informing the document that it is controlled by WebDriver?
- Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
- Akamai Bot Manager detects WebDriver driven Chrome Browsing Context
- Is there a version of selenium webdriver that is not detectable?
Can a website detect when you are using Selenium with chromedriver?
Replacing cdc_ string
You can use Vim or Perl to replace the cdc_ string in chromedriver. See the answer by @Erti-Chris Eelmaa to learn more about that string and how it's a detection point.
Using Vim or Perl prevents you from having to recompile source code or use a hex editor.
Make sure to make a copy of the original chromedriver before attempting to edit it.
Our goal is to alter the cdc_ string, which looks something like $cdc_lasutopfhvcZLmcfl.
The methods below were tested on chromedriver version 2.41.578706.
Using Vim
vim /path/to/chromedriver
After running the line above, you'll probably see a bunch of gibberish. Do the following:
- Replace all instances of cdc_ with dog_ by typing :%s/cdc_/dog_/g. dog_ is just an example. You can choose anything as long as it has the same number of characters as the search string (e.g., cdc_), otherwise the chromedriver will fail.
- To save the changes and quit, type :wq! and press return.
- If you need to quit without saving changes, type :q! and press return.
Using Perl
The line below replaces all cdc_ occurrences with dog_. Credit to Vic Seedoubleyew:
perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver
Make sure that the replacement string (e.g., dog_) has the same number of characters as the search string (e.g., cdc_), otherwise the chromedriver will fail.
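For readers who would rather stay in Python, the same equal-length substitution can be sketched as below. This is an illustrative sketch, not the method the answer tested: the function name patch_marker and the .bak backup suffix are assumptions, and the demonstration runs on a throwaway file rather than a real chromedriver.

```python
import shutil
import tempfile
from pathlib import Path


def patch_marker(binary_path, old=b"cdc_", new=b"dog_"):
    """Replace `old` with `new` in a binary file, keeping a .bak copy.

    Returns the number of occurrences that were replaced.
    """
    if len(old) != len(new):
        # A different length would shift every byte after the match.
        raise ValueError("replacement must be the same length as the search string")
    path = Path(binary_path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))  # back up first
    data = path.read_bytes()
    count = data.count(old)
    path.write_bytes(data.replace(old, new))
    return count


# Demonstration on a fake file that merely contains the marker bytes.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "chromedriver"
    fake.write_bytes(b"\x00\x01$cdc_lasutopfhvcZLmcfl\x00cdc_tail")
    replaced = patch_marker(fake)
    patched = fake.read_bytes()
    backup = (Path(tmp) / "chromedriver.bak").read_bytes()

print(replaced, patched)
```

The length check mirrors the warning above: the file size stays identical, so offsets inside the binary are not disturbed.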
Wrapping Up
To verify that all occurrences of cdc_ were replaced:
grep "cdc_" /path/to/chromedriver
If no output was returned, the replacement was successful.
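A rough Python counterpart to the grep check is sketched here, in case grep is not available on your platform. The helper name find_marker_offsets and the sample bytes are made up for illustration; on a real file you would pass the result of reading the binary in bytes mode.

```python
def find_marker_offsets(data, marker=b"cdc_"):
    """Return the byte offsets at which `marker` occurs in `data`."""
    offsets = []
    start = data.find(marker)
    while start != -1:
        offsets.append(start)
        start = data.find(marker, start + 1)
    return offsets


# Made-up bytes standing in for the contents of a driver binary.
original = b"\x7fELF$cdc_abc\x00cdc_def"
patched = original.replace(b"cdc_", b"dog_")

print(find_marker_offsets(original))  # offsets still present before patching
print(find_marker_offsets(patched))   # an empty list means nothing was left behind
```

An empty list plays the same role as grep returning no output.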
Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you've successfully altered the driver.
Make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.
My Experience With This Method
I was previously being detected on a website while trying to log in, but after replacing cdc_ with an equal sized string, I was able to log in. Like others have said though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, different network, etc.
Can Selenium automation be tracked?
A google-chrome / firefox Browsing Context initiated by Selenium-driven ChromeDriver / GeckoDriver can be easily detected by websites that deploy any of the following Bot Management services:
- Imperva Advanced Bot Protection, formerly known as Distil.
- Akamai Bot Manager
- DataDome
- Cloudflare
You can find a relevant detailed discussion in Can a website detect when you are using Selenium with chromedriver?
How to login into suntrust bank account using Selenium through Python
I have modified your code a bit and tried to login as follows:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
# Selenium 4 style: the driver path is passed through Service, the options through options=
options = webdriver.ChromeOptions()
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--disable-extensions')
driver = webdriver.Chrome(service=Service(r'C:\WebDrivers\chromedriver.exe'), options=options)
driver.get("https://login.onlinebanking.suntrust.com/olb/login")
# Wait until the user ID field is clickable before typing into it
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.suntrust-input-text.ng-pristine.ng-valid.ng-touched#userId"))).send_keys("username")
driver.find_element(By.CSS_SELECTOR, "input.suntrust-input-text.ng-untouched.ng-pristine.ng-valid#password").send_keys("password")
driver.find_element(By.CSS_SELECTOR, "button.suntrust-sign-on.suntrust-button-text>span").click()
But I was still unable to log in.
Now, on inspecting the DOM tree of the SUNTRUST - Online Banking Sign On login page, you will find the following tags within the <body> tag:
<script type="text/javascript" src="dist/runtime.7d6aba6a1596ee0b757c.js"></script>
<script type="text/javascript" src="dist/polyfills.65913a8531010587b6fe.js"></script>
<script type="text/javascript" src="dist/scripts.46e57c2d57ad1b3d210d.js"></script>
<script type="text/javascript" src="dist/vendor.43f2240dc35276d98b10.js"></script>
<script type="text/javascript" src="dist/main.5d227767baa37ef78819.js"></script>
The presence of the phrase dist is a clear indication that the website is protected by Bot Management service provider Distil Networks and the navigation by ChromeDriver gets detected and subsequently blocked.
Reference
You can find a couple of relevant discussions in:
- Chrome browser initiated through ChromeDriver gets detected
- Unable to use Selenium to automate Chase site login
Webpage Is Detecting Selenium Webdriver with Chromedriver as a bot
You mention pandas.get_html only in your question and options.add_argument('headless') only in your code, so it is not clear whether you are actually using them. However, taking the minimum code from your attempt as follows:
Code Block:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
# Selenium 4 style: the driver path is passed through Service, the options through options=
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(service=Service(r'C:\Utility\BrowserDrivers\chromedriver.exe'), options=options)
driver.get('https://www.controller.com/')
print(driver.title)
I have faced the same issue.
When I inspected the HTML DOM, I observed that the website refers to distil_referrer in window.onbeforeunload as follows:
<script type="text/javascript" id="">
window.onbeforeunload=function(a){"undefined"!==typeof sessionStorage&&sessionStorage.removeItem("distil_referrer")};
</script>
This is a clear indication that the website is protected by Bot Management service provider Distil Networks and the navigation by ChromeDriver gets detected and subsequently blocked.
Reference
You can find a couple of detailed discussions in:
- Distil detects WebDriver driven Chrome Browsing Context
- Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
- Akamai Bot Manager detects WebDriver driven Chrome Browsing Context
Is there a way to use Selenium WebDriver without informing the document that it is controlled by WebDriver?
No, there is no way to conceal that you are running an automated test.
WebDriver Interface
When using the WebDriver interface the webdriver-active flag is set to true as the user agent is under remote control. It is initially false.
WebIDL
Navigator includes NavigatorAutomationInformation;
Note that the NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.
WebIDL
interface mixin NavigatorAutomationInformation {
    readonly attribute boolean webdriver;
};
webdriver
- Returns true if webdriver-active flag is set, false otherwise.
Example
For web authors:
navigator.webdriver
Defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, for example so that alternate code paths can be triggered during automation.
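To see what a page would observe, the flag can be read from the automation side through the driver's execute_script method. The sketch below assumes a Selenium WebDriver instance would be passed in; FakeDriver is a hypothetical stand-in so that the snippet runs without a browser, and is_flagged_as_automated is an illustrative name.

```python
def is_flagged_as_automated(driver):
    """Return True if the page sees navigator.webdriver as true."""
    return bool(driver.execute_script("return navigator.webdriver"))


class FakeDriver:
    """Hypothetical stand-in for a WebDriver so the sketch runs offline."""

    def __init__(self, flag):
        self._flag = flag

    def execute_script(self, script):
        # Only answers the one script this sketch sends.
        return self._flag if script == "return navigator.webdriver" else None


print(is_flagged_as_automated(FakeDriver(True)))
print(is_flagged_as_automated(FakeDriver(None)))
```

In a regular, non-automated browser the property is expected to be false or undefined, which the bool() conversion treats the same way.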
The above mentioned implementation is based on a couple of Security Considerations as follows:
A user agent can rely on a command-line flag or a configuration option to test whether to enable WebDriver, or alternatively make the user agent initiate or confirm the connection through a privileged content document or control widget, in case the user agent does not directly implement the HTTP endpoints.
It is strongly suggested that user agents require users to take explicit action to enable WebDriver, and that WebDriver remains disabled in publicly consumed versions of the user agent.
It is also suggested that user agents make an effort to visually distinguish a user agent session that is under control of WebDriver from those used for normal browsing sessions. This can be done through a browser chrome element such as a door hanger, colorful decoration of the OS window, or some widget element that is prevalent in the window so that it is easy to identify automation windows.
Reference
You can find a couple of detailed discussions in:
- Distil detects WebDriver driven Chrome Browsing Context
- Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
- Akamai Bot Manager detects WebDriver driven Chrome Browsing Context