Catching timeout errors with ruby mechanize
Instead of retrying some timeouts on some Mechanize requests, I think you'd better set the Mechanize::HTTP::Agent#read_timeout attribute to a reasonable number of seconds, like 2 or 5; in any case, one that prevents timeout errors for this request.
Also, it seems that your logout procedure only requires a simple HTTP GET request: there is no form to fill in, so no HTTP POST request.
So if I were you, I would inspect the page source code (Ctrl+U in Firefox or Chrome) to identify the link that is reached by your agent.click(page.link_with(:text => /Log Out/i)).
It should be faster because these types of pages are usually blank, and Mechanize will not have to load a full HTML page into memory.
Here is the code I would use:
def logmeout(agent)
  begin
    agent.read_timeout = 2 # set the agent timeout
    page = agent.get('http://www.example.com/logout_url.php')
    agent.history.pop # delete this request from the history
  rescue Timeout::Error
    puts "Timeout!"
    puts "read_timeout attribute is set to #{agent.read_timeout}s" if !agent.read_timeout.nil?
    # retry # retry is no longer needed
  end
end
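Note that depending on the Ruby and Mechanize versions in use, a read timeout may surface as Net::ReadTimeout rather than Timeout::Error, so rescuing both covers either case. A minimal sketch (the wrapper name is an assumption, not part of Mechanize):

```ruby
require 'timeout'
require 'net/protocol' # defines Net::ReadTimeout

# Hypothetical wrapper: rescue both timeout classes, since which one is
# raised depends on the Ruby/Mechanize versions in use.
def with_timeout_handling
  yield
rescue Timeout::Error, Net::ReadTimeout => e
  puts "Timed out (#{e.class})"
  :timed_out
end

with_timeout_handling { raise Net::ReadTimeout } # => :timed_out
```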
But you can use your retry function too:
def trythreetimes
  tries = 0
  begin
    yield
  rescue StandardError => e # rescuing StandardError is safer than Exception
    tries += 1
    puts "Error: #{e.message}"
    puts "Trying again!" if tries <= 3
    retry if tries <= 3
    puts "No more attempts!"
  end
end
def logmeout(agent)
  trythreetimes do
    agent.read_timeout = 2 # set the agent timeout
    page = agent.get('http://www.example.com/logout_url.php')
    agent.history.pop # delete this request from the history
  end
end
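To see the retry pattern work without a network call, here is a sketch exercised against a stubbed flaky operation (the failure counter is purely illustrative):

```ruby
# Same retry pattern as trythreetimes, run against a stub that fails
# twice and then succeeds.
def try_three_times
  tries = 0
  begin
    yield
  rescue StandardError => e
    tries += 1
    puts "Error: #{e.message}"
    retry if tries <= 3
    puts "No more attempts!"
  end
end

failures_left = 2
result = try_three_times do
  if failures_left > 0
    failures_left -= 1
    raise "flaky request"
  end
  :logged_out
end
# result is :logged_out after two failed attempts and one success
```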
Hope it helps! ;-)
mechanize dealing with errors
You'd want to rescue on a failed request, like this:
task :estimateone => :environment do
  require 'mechanize'
  require 'csv'
  begin
    # ...
    page = mechanize.get('http://www.theurbanlist.com/brisbane/a-list/50-brisbane-cafes-you-should-have-eaten-breakfast-at')
  rescue Mechanize::ResponseCodeError
    # do something with the result: log it, write it, mark it as failed,
    # wait a bit, and then continue the job
    next
  end
end
My guess is that you're hitting API rate limits. This will not solve your problem, as the cause is not on your side but on the server's; it will, however, give you room to work, since you can now flag the links that did not work and continue from there.
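If rate limiting is the cause, wrapping the request in a retry with exponential backoff gives the server time to recover. A sketch under that assumption (fetch_with_backoff and the delay values are illustrative, not part of Mechanize):

```ruby
# Sketch: retry a block with exponentially growing sleeps between attempts.
def fetch_with_backoff(max_retries: 3, base_delay: 1.0)
  attempts = 0
  begin
    yield
  rescue StandardError
    attempts += 1
    raise if attempts > max_retries # give up, re-raise the last error
    sleep(base_delay * (2**(attempts - 1))) # 1s, 2s, 4s, ...
    retry
  end
end
```

With Mechanize you would call mechanize.get(url) inside the block and rescue Mechanize::ResponseCodeError specifically rather than StandardError.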
Handling Timeout error
Well, that's the expected behaviour of Timeout: if the block takes too long, its execution gets terminated and an exception is thrown.
You would probably like to catch the exception and handle it appropriately:
require 'timeout'

begin
  status = Timeout::timeout(5) {
    # Something that should be interrupted if it takes too much time...
  }
rescue Timeout::Error
  puts 'That took too long, exiting...'
end
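Note that Timeout.timeout returns the block's value when it finishes in time, so the rescue clause can supply a fallback instead of just printing. A small sketch (fetch_status is a made-up name for illustration):

```ruby
require 'timeout'

# Returns the block's result, or :timed_out if it exceeds the limit.
def fetch_status(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  :timed_out
end

fetch_status(5) { :done }      # => :done
fetch_status(0.05) { sleep 1 } # => :timed_out
```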
What is proper way to test error handling?
Sort of. You shouldn't test dependent libraries again in your application. It's enough to catch the Net::HTTP::Persistent::Error without verifying the underlying functionality. Well-written gems should provide their own tests, and you should be able to run those tests as needed for that gem (Mechanize, for example).
You could mock those errors, but you should be judicious. Here is some code to mock an SMTP connection:
class Mock
  require 'net/smtp'

  def initialize(options)
    @options = options
    @username = options[:username]
    @password = options[:password]
    @port = options[:port] || 25
    @helo_domain = options[:helo_domain]
    @from_addr = options[:from_address]
    @from_domain = options[:from_domain]

    # Mock object for SMTP connections
    mock_config = {}
    mock_config[:address] = options[:server]
    mock_config[:port] = @port
    @connection = RSpec::instance_double(Net::SMTP, mock_config)
    allow(@connection).to receive(:start).and_yield(@connection)
    allow(@connection).to receive(:send_message).and_return(true)
    allow(@connection).to receive(:started?).and_return(true)
    allow(@connection).to receive(:finish).and_return(true)
  end

  # more stuff here
end
I don't see you testing for any custom errors, which would make more sense here. For example, you might test for URL-unfriendly characters in your parameter and rescue from that. In that case, your test would assert something explicit (note that RSpec's raise_error matcher needs the block form):
expect { get("???.net") }.to raise_error(CustomError)
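For completeness, here is a sketch of such a custom error and a validation that raises it; CustomError, validate_url!, and the character rule are all assumptions for illustration:

```ruby
# Hypothetical custom error raised on URL-unfriendly input.
class CustomError < StandardError; end

def validate_url!(url)
  raise CustomError, "unsafe characters in #{url}" if url.match?(/[?\s]/)
  url
end

# In an RSpec example you would then use the block form:
#   expect { validate_url!("???.net") }.to raise_error(CustomError)
```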
Ruby Mechanize Connection timed out
After some more programming experience, I realized that this was a simple error on my part: my code did not catch the error thrown and appropriately move to the next link when a link was corrupted.
For any novice Ruby programmers that encounter a similar problem:
The Connection timed out error is usually due to an invalid link, etc., on the page being scraped.
You need to wrap the code that accesses the link in a statement such as the one below:
begin
  # [1] your scraping code here
rescue
  # [2] code to move on to the next link/page/etc. that you are scraping,
  #     instead of sticking with the invalid one
end
For instance, if you have a loop that iterates over links and extracts information from each one, the scraping code should go at [1], and the code to move to the next link (consider using Ruby's "next") should go at [2]. You might also print something to the console to let the user know that a link was invalid.
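Concretely, the pattern looks like this; the links array and the failing fetch are stand-ins for real scraping code:

```ruby
# Sketch of skipping bad links instead of aborting the whole scrape.
links = ['good-1', 'bad-link', 'good-2']
scraped = []

links.each do |link|
  begin
    # Stand-in for the real fetch; pretend 'bad-link' times out.
    raise 'connection timed out' if link == 'bad-link'
    scraped << link
  rescue StandardError => e
    puts "Skipping #{link}: #{e.message}"
    next # move on to the next link
  end
end
# scraped == ["good-1", "good-2"]
```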
Retry testing sites after timeout error in Watir
Your loop can be:
MAX_ATTEMPTS = 3 # maximum number of tries per site

# Use Ruby's method for iterating through the array
testsite_array.each do |site|
  attempt = 1
  begin
    ie.goto site
    if ie.html.include? 'teststring'
      puts site + ' yes'
    else
      puts site + ' no'
    end
  rescue
    attempt += 1
    # Retry accessing the site or stop trying
    if attempt > MAX_ATTEMPTS
      puts site + ' site failed, moving on'
    else
      retry
    end
  end
end
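The same bounded-retry shape can be checked without Watir by stubbing the page visit; check_site and its failure counter are illustrative only:

```ruby
MAX_ATTEMPTS = 3

# Stubbed version of the loop body: fail a given number of times,
# then succeed, or give up after MAX_ATTEMPTS tries.
def check_site(failures_before_success)
  attempt = 1
  calls = 0
  begin
    calls += 1
    raise 'timeout' if calls <= failures_before_success
    :reached
  rescue StandardError
    attempt += 1
    return :failed if attempt > MAX_ATTEMPTS
    retry
  end
end

check_site(1) # => :reached (second try succeeds)
check_site(5) # => :failed  (gives up after three tries)
```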