Ruby Mechanize https error
Sometimes you need to tell Mechanize to use SSLv3:
page = Mechanize.new{|a| a.ssl_version, a.verify_mode = 'SSLv3', OpenSSL::SSL::VERIFY_NONE}.get "https://sis-app.sph.harvard.edu:9030/prod/bwckschd.p_disp_dyn_sched"
Notice that I use OpenSSL::SSL::VERIFY_NONE. That means you are theoretically vulnerable to a man-in-the-middle attack, but that's not something I generally worry about when I'm scraping a website.
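Many current OpenSSL builds no longer offer SSLv3 at all, so the same idea today usually means setting a minimum TLS version instead of forcing one legacy protocol. Here is a minimal sketch using only the standard library's Net::HTTP rather than Mechanize itself; the helper name build_https_client and its insecure: flag are made up for illustration.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: configures (but does not open) an HTTPS connection.
def build_https_client(host, port = 443, insecure: false)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  # Refuse anything older than TLS 1.2 instead of pinning a legacy protocol.
  http.min_version = OpenSSL::SSL::TLS1_2_VERSION
  # Skipping verification is what opens the door to man-in-the-middle
  # attacks, so it stays opt-in here.
  http.verify_mode = insecure ? OpenSSL::SSL::VERIFY_NONE : OpenSSL::SSL::VERIFY_PEER
  http
end

client = build_https_client('example.com')
puts client.verify_mode == OpenSSL::SSL::VERIFY_PEER
```

Nothing is sent over the network until a request is made on the returned object, so the settings can be inspected safely first.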
mechanize dealing with errors
You'd want to rescue the failed request, like this:
task :estimateone => :environment do
  require 'mechanize'
  require 'csv'
  begin
    # ...
    page = mechanize.get('http://www.theurbanlist.com/brisbane/a-list/50-brisbane-cafes-you-should-have-eaten-breakfast-at')
  rescue Mechanize::ResponseCodeError
    # do something with the result, log it, write it, mark it as failed, wait a bit and then continue the job
    next
  end
end
My guess is that you're hitting API rate limits. Rescuing the error will not solve that, because the limit is enforced on the server's side rather than yours, but it gives you room to work: you can flag the links that failed and continue from there.
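The "flag it and continue" idea can be sketched without Mechanize at all. This is only an illustration: FetchError stands in for Mechanize::ResponseCodeError, and fetch_all and its pause: parameter are hypothetical names.

```ruby
# Stand-in for Mechanize::ResponseCodeError so the sketch runs without the gem.
class FetchError < StandardError; end

# Calls the block once per URL, collecting successes and flagging failures
# instead of aborting the whole job.
def fetch_all(urls, pause: 0)
  results = {}
  failed  = []
  urls.each do |url|
    begin
      results[url] = yield(url)
    rescue FetchError => e
      failed << [url, e.message] # remember it so it can be retried later
      sleep(pause)               # back off a little before the next request
    end
  end
  [results, failed]
end

ok, failed = fetch_all(%w[a b c]) do |url|
  raise FetchError, '429' if url == 'b'
  url.upcase
end
p ok     # the fetches that succeeded
p failed # the URLs to retry later
```

With Mechanize, the block would be something like { |url| mechanize.get(url) } and the rescue clause would name Mechanize::ResponseCodeError.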
Mechanize won't connect to site
While it's true that mechanize doesn't support javascript, your problem is that you are trying to access a site that doesn't exist. You are trying to access www.imbd.com instead of www.imdb.com. So, the error message is accurate.
And FWIW, IMDB doesn't want you to scrape their site:
Robots and Screen Scraping: You may not use data mining, robots, screen scraping, or similar data gathering and extraction tools on this site, except with our express written consent as noted below.
Why does accessing a SSL site with Mechanize on Windows fail, but on Mac work?
The version of OpenSSL (the library used to establish secure connections with Net::HTTPS) is not able to properly find the certificate chain on your computer.
Unfortunately, OpenSSL has never been able to use the certificate store installed with Windows to validate remote servers, so verification fails.
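To see where a given Ruby's OpenSSL looks for trusted CA certificates, a small diagnostic like the following can help. It is a sketch only: ca_locations is a made-up helper name, and a missing location is a hint at this kind of problem rather than proof of it.

```ruby
require 'openssl'

# Lists where this Ruby's OpenSSL looks for trusted CA certificates and
# whether each location actually exists on disk. The SSL_CERT_FILE and
# SSL_CERT_DIR environment variables take precedence over the built-in paths.
def ca_locations(env = ENV)
  file = env[OpenSSL::X509::DEFAULT_CERT_FILE_ENV] || OpenSSL::X509::DEFAULT_CERT_FILE
  dir  = env[OpenSSL::X509::DEFAULT_CERT_DIR_ENV]  || OpenSSL::X509::DEFAULT_CERT_DIR
  { file: [file, File.file?(file)], dir: [dir, File.directory?(dir)] }
end

ca_locations.each do |kind, (path, present)|
  puts "#{kind}: #{path} (exists: #{present})"
end
```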
From your example, you can do:
a.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
That skips the verification, but it is far from ideal because of the obvious security issues.
I recommend you download some cert bundles (like the ones from curl):
http://curl.haxx.se/ca
And modify your code to something like this:
require "rbconfig"
require "mechanize"
a = Mechanize.new
# conditionally set certificate under Windows
# http://blog.emptyway.com/2009/11/03/proper-way-to-detect-windows-platform-in-ruby/
if RbConfig::CONFIG["host_os"] =~ /mingw|mswin/
  # http://curl.haxx.se/ca
  ca_path = File.expand_path "~/Tools/bin/curl-ca-bundle.crt"
  a.agent.http.ca_file = ca_path
end
page = a.get "https://github.com/"
That seems to work with Ruby 1.9.3-p0 (i386-mingw32), Windows 7 x64 and mechanize 2.1.pre.1.
Hope that helps.
Mechanize with SSL thru proxy error
It was a problem with the 'openssl' program. I had installed postgresql.app on my system, and it changed the PATH environment variable to put itself first. As a result, some programs, openssl among them, were being picked up from postgresql.app. Correcting PATH so that the system's openssl is preferred by default solved the problem.
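A quick way to spot this kind of shadowing is to list every openssl executable on PATH in lookup order. The sketch below is a diagnostic only; executables_on_path is a hypothetical helper name.

```ruby
require 'openssl'

# Returns every executable called `name` found on the given PATH string,
# in lookup order; the first entry is the one a shell would run.
def executables_on_path(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR)
      .reject(&:empty?)
      .map { |dir| File.join(dir, name) }
      .select { |candidate| File.file?(candidate) && File.executable?(candidate) }
end

puts "Ruby is linked against: #{OpenSSL::OPENSSL_LIBRARY_VERSION}"
puts executables_on_path('openssl')
```

If the first path printed belongs to another application's bundle rather than the system, reordering PATH is the fix described above.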
Ruby's Mechanize Error 401 while sending a POST request (Steam trade offer send)
I found the issue by debugging the Python POST request.
What was happening: when I log in, I do get a sessionid, but that sessionid is valid only for 'store.steampowered.com' and 'help.steampowered.com', or more precisely for the cookie domain '.steampowered.com'.
In my code I was blindly picking up the session cookie without paying attention to which website it belongs to. As a result, the sessionid value sent in the POST request params was not equal to the cookie sent in the request header, so I got 401 Unauthorized.
So we need to set/get a session id for steamcommunity.com.
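The mismatch is easier to see when the cookie lookup takes the target host into account. This is a simplified sketch: the Cookie struct stands in for Mechanize::Cookie (which exposes the same name, value and domain readers), session_id_for is a made-up helper, and the matching rule is a rough approximation of real cookie domain matching.

```ruby
# Stand-in for Mechanize::Cookie so the sketch runs without the gem.
Cookie = Struct.new(:name, :value, :domain)

# Returns the value of the sessionid cookie that would be sent to `host`,
# or nil when no cookie's domain covers that host.
def session_id_for(cookies, host)
  match = cookies.find do |c|
    domain = c.domain.sub(/\A\./, '')
    c.name == 'sessionid' && (host == domain || host.end_with?(".#{domain}"))
  end
  match && match.value
end

jar = [Cookie.new('sessionid', 'abc123', 'steampowered.com')]
p session_id_for(jar, 'store.steampowered.com') # => "abc123"
p session_id_for(jar, 'steamcommunity.com')     # => nil
```

With Mechanize, session.cookies could be passed in place of jar; a nil result for steamcommunity.com is the situation that produced the 401.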
Fixes:
1) Set a random CSRF sessionid cookie for steamcommunity.com or, like I did, copy steampowered.com's sessionid cookie to steamcommunity.com (marked in the code).
2) In params, the 'json_tradeoffer' key "new_version" should be "newversion", to avoid error 400 BAD REQUEST.
3) The headers of the POST request should be:
{'Referer' =>'https://steamcommunity.com/tradeoffer/new', 'Origin' =>'https://steamcommunity.com' }
4) Convert the 'json_tradeoffer' and 'trade_offer_create_params' values in params to strings using to_json.
IMPORTANT: this code sends a single offer. If you are going to send more than one, you MUST update your sessionid variable every time, because the cookie value changes each time you communicate with steamcommunity.com.
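Fixes 2 and 4 boil down to making every form field a plain string. The sketch below shows only that shape: trade_params is a hypothetical helper, the field names are the ones used in this answer, and the values are placeholders.

```ruby
require 'json'
require 'uri'

# Builds the form fields for the trade offer POST. The nested hashes are
# serialised with to_json so that every value is a plain string.
def trade_params(sessionid, token, mine: [], theirs: [])
  {
    'sessionid' => sessionid,
    'json_tradeoffer' => {
      'newversion' => true, # not "new_version"
      'version' => 4,
      'me' => { 'assets' => mine, 'currency' => [], 'ready' => false },
      'them' => { 'assets' => theirs, 'currency' => [], 'ready' => false }
    }.to_json,
    'trade_offer_create_params' => { 'trade_offer_access_token' => token }.to_json
  }
end

params = trade_params('abc123', 'H-yK-GFt')
puts URI.encode_www_form(params) # what a form-encoded body would look like
```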
Here is the fixed code:
require 'mechanize'
require 'json'
require 'logger' # Logger is used for agent.log below
require 'open-uri'
require 'openssl'
require 'base64'
require 'time'
def fa(shared_secret)
  timestamp = Time.new.to_i
  math = timestamp / 30
  math = math.to_i
  time_buffer = [math].pack('Q>')
  hmac = OpenSSL::HMAC.digest('sha1', Base64.decode64(shared_secret), time_buffer)
  start = hmac[19].ord & 0xf
  last = start + 4
  pre = hmac[start..last]
  fullcode = pre.unpack('I>')[0] & 0x7fffffff
  chars = '23456789BCDFGHJKMNPQRTVWXY'
  code = ''
  for looper in 0..4 do
    copy = fullcode #divmod
    i = copy % chars.length #divmod
    fullcode = copy / chars.length #divmod
    code = code + chars[i]
  end
  puts code
  return code
end
def pass_stamp(username, password, mech)
  response = mech.post('https://store.steampowered.com/login/getrsakey/', {'username' => username})
  data = JSON::parse(response.body)
  mod = data["publickey_mod"].hex
  exp = data["publickey_exp"].hex
  timestamp = data["timestamp"]
  key = OpenSSL::PKey::RSA.new
  key.e = OpenSSL::BN.new(exp)
  key.n = OpenSSL::BN.new(mod)
  ep = Base64.encode64(key.public_encrypt(password.force_encoding("utf-8"))).gsub("\n", '')
  return {'password' => ep, 'timestamp' => timestamp }
end
user = 'user'
password = 'password'
session = Mechanize.new { |agent|
  agent.user_agent_alias = 'Windows Mozilla'
  agent.follow_meta_refresh = true
  agent.add_auth('https://steamcommunity.com/tradeoffer/new/send/', user, password)
  agent.log = Logger.new("mech.log")
}
data = pass_stamp(user,password, session)
ep = data["password"]
timestamp = data["timestamp"]
session.add_auth('https://steamcommunity.com/tradeoffer/new/send/', user, ep)
send = {
  'password' => ep,
  'username' => user,
  'twofactorcode' => fa('twofactorcode'), #update
  'emailauth' => '',
  'loginfriendlyname' => '',
  'captchagid' => '-1',
  'captcha_text' => '',
  'emailsteamid' => '',
  'rsatimestamp' => timestamp,
  'remember_login' => 'false'
}
login = session.post('https://store.steampowered.com/login/dologin', send )
responsejson = JSON::parse(login.body)
if responsejson["success"] != true
  puts "login did not succeed"
  puts "probably a 2FA code time difference, retry"
  exit
end
responsejson["transfer_urls"].each { |url|
  getcookies = session.post(url, responsejson["transfer_parameters"])
}
## SET COOKIE FOR STEAM COMMUNITY.COM
steampowered_sessionid = ''
session.cookies.each { |c|
  if c.name == "sessionid"
    steampowered_sessionid = c.value
    puts c.domain
  end
}
cookie = Mechanize::Cookie.new :domain => 'steamcommunity.com', :name =>'sessionid', :value =>steampowered_sessionid, :path => '/'
session.cookie_jar << cookie
sessionid = steampowered_sessionid
### END SET COOKIE
offer_link = 'https://steamcommunity.com/tradeoffer/new/?partner=410155236&token=H-yK-GFt'
token = offer_link.split('token=', 2)[1]
theirs = [{"appid" => 753,"contextid"=> "6","assetid" => "6705710171","amount" => 1 }]
mine = []
params = {
  'sessionid' => sessionid,
  'serverid' => 1,
  'partner' => '76561198370420964',
  'tradeoffermessage' => '',
  'json_tradeoffer' => {
    "newversion" => true, ## FIXED newversion to avoid 400 BAD REQUEST
    "version" => 4,
    "me" => {
      "assets" => mine, #create this array
      "currency" => [],
      "ready" => false
    },
    "them" => {
      "assets" => theirs, #create this array
      "currency" => [],
      "ready" => false
    }
  }.to_json, # ADDED TO JSON TO AVOID 400 BAD REQUEST
  'captcha' => '',
  'trade_offer_create_params' => {'trade_offer_access_token' => token}.to_json ## ADDED TO JSON FIX TO AVOID ERROR 400 BAD REQUEST
}
begin
  send_offer = session.post(
    'https://steamcommunity.com/tradeoffer/new/send',
    params,
    {'Referer' => 'https://steamcommunity.com/tradeoffer/new', 'Origin' => 'https://steamcommunity.com' } ##FIXED THIS
  )
  puts send_offer.body
rescue Mechanize::UnauthorizedError => e
  puts e
  puts e.page.content
end