Mechanize: how to get the current URL
next_page.uri.to_s
See http://www.rubydoc.info/gems/mechanize/Mechanize/Page/Link#uri-instance_method and http://ruby-doc.org/stdlib-2.4.1/libdoc/uri/rdoc/URI.html
For testing purposes, I did the following in irb:
require 'mechanize'
@agent = Mechanize.new
page = @agent.get('http://news.ycombinator.com/news')
=> #<Mechanize::Page
{url #<URI::HTTP:0x00000001ad3198 URL:http://news.ycombinator.com/news>}
{meta_refresh}
{title "Hacker News"}
{iframes}
{frames}
{links
#<Mechanize::Page::Link "" "http://ycombinator.com">
#<Mechanize::Page::Link "Hacker News" "news">
#<Mechanize::Page::Link "new" "newest">
#<Mechanize::Page::Link "comments" "newcomments">
#<Mechanize::Page::Link "ask" "ask">
#<Mechanize::Page::Link "jobs" "jobs">
#<Mechanize::Page::Link "submit" "submit">
#<Mechanize::Page::Link "login" "newslogin?whence=%6e%65%77%73">
#<Mechanize::Page::Link "" "vote?for=3803568&dir=up&whence=%6e%65%77%73">
#<Mechanize::Page::Link
"Don’t Be Evil: How Google Screwed a Startup"
"http://blog.hatchlings.com/post/20171171127/dont-be-evil-how-google-screwed-a-startup">
#<Mechanize::Page::Link "mikeknoop" "user?id=mikeknoop">
#<Mechanize::Page::Link "64 comments" "item?id=3803568">
#<Mechanize::Page::Link "" "vote?for=3802515&dir=up&whence=%6e%65%77%73">
# Omitted for brevity...
next_page.uri
=> #<URI::HTTP:0x00000001fa7818 URL:http://news.ycombinator.com/news2>
next_page.uri.to_s
=> "http://news.ycombinator.com/news2"
How to find the current URL in Python mechanize?
The documentation is not very thorough on this point, but it does contain what you need:
import mechanize
br = mechanize.Browser()
br.open("http://www.example.com/")
# follow second link with element text matching regular expression
response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
print(response1.geturl())
As a side note, when I'm looking for a method like that and can't find it in the docs, I usually open an IPython shell and use tab completion to see whether any method looks promising.
How to get current URL from Mechanize in Python?
br.geturl()
should do it. Using httpbin.org's redirect endpoint to test:
import mechanize
br = mechanize.Browser()
url = 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fstackoverflow.com'
br.open(url)
>>> print(br.geturl())
http://stackoverflow.com
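The httpbin test above needs network access and the mechanize package. As a rough offline illustration of the same idea, the sketch below uses only the standard library: it starts a throwaway local server whose /start path redirects to /final, then asks the response for its final URL with geturl(). The handler class, the two paths, and the function name are hypothetical, and urllib.request stands in for mechanize here, so treat it as an analogy rather than as mechanize's own behaviour.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /start redirects to /final, anything else returns 200."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url_after_redirect():
    # Port 0 lets the OS pick a free port for the temporary server.
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        # An empty ProxyHandler avoids sending the local request through a proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(base + "/start") as response:
            # geturl() reports the URL actually retrieved, after redirects.
            return base, response.geturl()
    finally:
        server.shutdown()
        server.server_close()


base, final = final_url_after_redirect()
print(final)
```

The printed URL ends in /final rather than /start, which is the same effect the httpbin example shows with br.geturl().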
How to get the current URL for a HTML page
I'm assuming you're using the open_uri_redirections gem because :allow_redirections is not necessary in Ruby 2.4+.
Save the result of OpenURI's open:
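require 'open-uri'
require 'nokogiri'
# On Ruby 3.0+ a bare open() no longer accepts URLs; use URI.open instead.
r = URI.open('http://www.google.com/gmail')
# base_uri is the URI of the page actually retrieved, after redirects
r.base_uri
# #<URI::HTTPS https://accounts.google.com/ServiceLogin?service=mail&passive=true&rm=false&continue=https://mail.google.com/mail/&ss=1&scc=1&ltmpl=default&ltmplcache=2&emr=1&osid=1#>
page = Nokogiri::HTML(r)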
Python Mechanize, how to get URL parameters
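from urllib.parse import urlparse
# Example URL matching the output below; with mechanize it would come from br.geturl()
url = 'https://example.com/something.php?sid=123456789'
parsed = urlparse(url)
print(parsed)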
The output:
ParseResult(scheme='https', netloc='example.com', path='/something.php', params='', query='sid=123456789', fragment='')
Then, you can access:
print(parsed.query)
The output:
sid=123456789
Then, you can extract:
sid = parsed.query.split('sid=')[-1]
print(sid)
The output:
123456789
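Splitting the query string on 'sid=' works for this URL, but it would also return any parameters that follow sid. A slightly more robust sketch uses urllib.parse.parse_qs from the standard library; the helper name get_query_param and the extra page=2 parameter are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def get_query_param(url, name, default=None):
    """Return the first value of query parameter `name`, or `default` if absent."""
    query = urlparse(url).query
    # parse_qs maps each parameter name to a list of its values.
    values = parse_qs(query).get(name)
    return values[0] if values else default


# Hypothetical URL with a second parameter after sid.
url = "https://example.com/something.php?sid=123456789&page=2"
print(get_query_param(url, "sid"))  # prints 123456789
```

With this URL, the split-based extraction would give '123456789&page=2', whereas the helper returns only the sid value.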