How to get the HTML source of a webpage in Ruby
Use Net::HTTP:
require 'net/http'
source = Net::HTTP.get('stackoverflow.com', '/index.html')
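The one-liner above returns whatever body the server sends, even for a redirect or an error page. A minimal sketch of a more defensive variant follows, assuming you want redirects followed and failures raised; `fetch_html` and its `limit` parameter are hypothetical names for this example, not part of Net::HTTP.

```ruby
require 'net/http'
require 'uri'

# Fetch the body of a page, following up to `limit` redirects.
# `fetch_html` is a hypothetical helper name, not a Net::HTTP method.
def fetch_html(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit.zero?

  # Passing a URI lets Net::HTTP enable TLS automatically for https URLs.
  response = Net::HTTP.get_response(URI(url))
  case response
  when Net::HTTPSuccess
    response.body
  when Net::HTTPRedirection
    # Resolve a relative Location header against the current URL.
    fetch_html(URI.join(url, response['location']).to_s, limit - 1)
  else
    raise "Request failed: #{response.code} #{response.message}"
  end
end

# Usage (needs network access):
#   source = fetch_html('https://stackoverflow.com/')
```

Checking the response class rather than the body keeps a 404 or 500 page from being mistaken for the page source.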
How to get the raw HTML source code for a page by using Ruby or Nokogiri?
Don't use Nokogiri at all if you want the raw source of a web page. Just fetch the web page directly as a string, and then do not feed that to Nokogiri. For example:
require 'open-uri'
html = URI.open('http://phrogz.net').read # Kernel#open no longer accepts URLs as of Ruby 3.0
puts html.length #=> 8461
puts html #=> ...raw source of the page...
If, on the other hand, you want the contents of a page after JavaScript has modified it (for example, when an AJAX library fetches new content and changes the page), then Nokogiri cannot help, because it does not execute JavaScript. You need to use Ruby to control a web browser (e.g. read up on Selenium or Watir).
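For the raw-source case, here is a small sketch of reading a page with open-uri while also capturing the charset the server declares and handling HTTP errors. `read_page` is a hypothetical helper name, and returning `nil` on failure is just one possible design choice.

```ruby
require 'open-uri'

# Read a page as a string and report the metadata open-uri exposes.
# `read_page` is a hypothetical helper name used only for this sketch.
def read_page(url)
  URI.open(url) do |io|
    { html: io.read, charset: io.charset, content_type: io.content_type }
  end
rescue OpenURI::HTTPError => e
  # e.io is the failed response; its status is [code, message].
  warn "HTTP error for #{url}: #{e.io.status.join(' ')}"
  nil
end

# Usage (needs network access):
#   page = read_page('http://phrogz.net')
#   puts page[:html].length if page
```

The block form of `URI.open` closes the underlying stream (or temp file) as soon as the block returns.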
Get the html from a website with ruby on rails
You can use HTTParty to just get the data.
Sample code (from example):
require 'httparty' # the original example loaded it from an undefined local `dir`
require 'pp'
class Google
  include HTTParty
  format :html
end
# google.com redirects to www.google.com so this is live test for redirection
pp Google.get('http://google.com')
puts '', '*'*70, ''
# check that ssl is requesting right
pp Google.get('https://www.google.com')
Nokogiri really excels at parsing that data. Here's some example code from the Railscast:
require 'nokogiri'
require 'open-uri'
url = "http://www.walmart.com/search/search-ng.do?search_constraint=0&ic=48_0&search_query=batman&Find.x=0&Find.y=0&Find=Find"
# URI.open replaces the bare open(url), which stopped accepting URLs in Ruby 3.0
doc = Nokogiri::HTML(URI.open(url))
puts doc.at_css("title").text
doc.css(".item").each do |item|
  title = item.at_css(".prodLink").text
  price = item.at_css(".PriceCompare .BodyS, .PriceXLBold").text[/\$[0-9\.]+/]
  puts "#{title} - #{price}"
  puts item.at_css(".prodLink")[:href]
end
ruby watir to get html of a page
This should do it:
puts browser.html
ruby code to search and get a string from a html content
key = get()[/commit\s+([a-f0-9]{10,})/i, 1]
puts key
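Regex explanation: the pattern matches the word "commit" (case-insensitively), then whitespace, then captures a run of ten or more hexadecimal characters. The trailing 1 tells String#[] to return that first capture group, or nil if nothing matches.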
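The capture-group lookup in the snippet above can be tried on its own, without fetching anything. This is a minimal sketch; `extract_commit` and the sample HTML string are hypothetical and stand in for whatever `get()` returns.

```ruby
# Pull the first commit hash (10+ hex characters) out of a block of text.
# `extract_commit` and the sample HTML are hypothetical, for illustration.
def extract_commit(text)
  # String#[] with a regex and an index returns that capture group, or nil.
  text[/commit\s+([a-f0-9]{10,})/i, 1]
end

sample = '<p>Latest commit 3f2a9c41bd7e on master</p>'
puts extract_commit(sample)            # prints 3f2a9c41bd7e
puts extract_commit('no hash').inspect # prints nil
```

Because a failed match returns `nil` rather than raising, check the result before using it.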
Ruby On Rails: Display html source code instead of rendering it
I fixed it. It was something to do with mongrel. I found the solution here:
https://rails.lighthouseapp.com/projects/8994/tickets/4690
:)
(RUBY) How to read HTML tag contents and print them in the console
Use nokogiri to parse html. Run gem install nokogiri.
require 'nokogiri'
require 'open-uri'
# `website` is assumed to hold a host name such as "example.com"
html = Nokogiri::HTML(URI.open("http://#{website}"))
html.css('h3').each do |title_node|
  puts "Title: #{title_node.content}"
end