Scraping/Parsing Google Search Results in Ruby

What is the correct way to get google search results?

According to http://code.google.com/apis/websearch/, the Search API has been deprecated, but there's a replacement: the Custom Search API. Will that do what you want?

If so, a quick Web search turned up https://github.com/alexreisner/google_custom_search, among other gems.

How to parse a Google search page to get result statistics and AdWords count using Nokogiri

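You need to use Selenium WebDriver to get dynamic content. Nokogiri only parses the HTML it is given and does not execute JavaScript, so on its own it never sees content that the page renders in the browser.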

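require 'nokogiri'
require 'selenium-webdriver'

# Let a real browser load the page so JavaScript-rendered content is present
driver = Selenium::WebDriver.for :firefox
driver.get "https://www.google.com/search?q=cardiovascular+disease"

# Hand the rendered HTML to Nokogiri and read the result statistics
doc = Nokogiri::HTML(driver.page_source)
puts doc.at_css('[id="result-stats"]').text

driver.quit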
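Once you have the text of the result-stats element, you still need to pull the number out of it. Here is a minimal sketch in plain Ruby, assuming the text looks something like "About 1,230,000 results (0.45 seconds)". That format is an assumption and may differ by locale or change over time. The names STATS_PATTERN and parse_result_count are made up for this example.

```ruby
# Matches a run of digits and thousands separators followed by "result(s)".
# The expected wording of the stats text is an assumption, not a guarantee.
STATS_PATTERN = /([\d,.]+)\s+results?/i

# Returns the result count as an Integer, or nil if no count is found.
def parse_result_count(stats_text)
  match = STATS_PATTERN.match(stats_text.to_s)
  return nil unless match

  # Drop thousands separators before converting
  match[1].delete(',.').to_i
end

puts parse_result_count('About 1,230,000 results (0.45 seconds)')
```

Returning nil instead of raising lets the caller decide what to do when the element is missing or the wording has changed.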

How to retrieve top 10 Google search results from a keyword through an API in Ruby?

You're not very explicit in your question about the trade-offs you're willing to make, but you might want to think about this part more:

"I think the Google Custom Search might be an option but the daily limit of 100 queries is restricting. I would prefer to not scrape Google as it's a violation of their terms."

I've used Google Custom Search, and it is very easy to use, but the limit does apply. If you are concerned about violating Google's TOS, it is the only way to go. You need to decide whether you're willing to violate the TOS; if not, just use Google Custom Search.
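For the Custom Search route, a minimal sketch using only the standard library might look like the following. It builds a request URI for the Custom Search JSON API and extracts the title and link of each result from a response body. The helper names search_uri and top_results, the placeholder credentials, and the trimmed sample response are all assumptions made for illustration. You need your own API key and search engine ID (the cx parameter).

```ruby
require 'json'
require 'uri'

ENDPOINT = 'https://www.googleapis.com/customsearch/v1'

# Builds the request URI. api_key and engine_id are placeholders
# that you must replace with your own credentials.
def search_uri(query, api_key:, engine_id:, num: 10)
  uri = URI(ENDPOINT)
  uri.query = URI.encode_www_form(key: api_key, cx: engine_id, q: query, num: num)
  uri
end

# Extracts the title and link of each result from a JSON response body.
# Returns an empty Array when the response contains no "items".
def top_results(json_body)
  JSON.parse(json_body).fetch('items', []).map do |item|
    { title: item['title'], link: item['link'] }
  end
end

# Hypothetical response, trimmed to the fields used here
sample = '{"items":[{"title":"Ruby","link":"https://www.ruby-lang.org/"}]}'
p top_results(sample)
puts search_uri('ruby language', api_key: 'YOUR_KEY', engine_id: 'YOUR_CX')
```

To run a real query, you could pass the URI to Net::HTTP.get (after require 'net/http') and feed the returned body to top_results. Each request counts against the daily quota mentioned above.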

Scraping and parsing Google search results using Python

You may find xgoogle useful; much of what you seem to be asking for is already there.

When web scraping with Watir, how do I parse results in same class and enter them into separate CSV cells?

The #xpath method returns a NodeSet, which is a collection of matching nodes. The NodeSet includes Enumerable, which provides a number of methods for iterating over the collection. Rather than getting the text of the entire node set, you want to iterate over each node and collect its text.

sn_auth_name = row.xpath('span[@class="sn_auth_name"]').map { |node| node.text.strip }
#=> ["Plenge", "Wyk"]

As an Array of names, sn_auth_name will still get written to the CSV in a single cell. If you want each name written into its own cell, you will need to flatten the Array. You can either flatten the individual column using a splat:

csv << [*sn_auth_name, sn_target_lang]

If there are multiple columns to flatten, you can instead flatten the whole array:

csv << [sn_auth_name, sn_target_lang].flatten

Doing the above will mean that each row has a different number of columns. You can pad all of the rows so that they have the same number of columns:

require 'csv'

# Variable to define which column is the first name column
col_auth_name = 0

# Collect the data from the table into an Array
data = []
doc.css('td.res2').each do |row|
  sn_auth_name = row.xpath('span[@class="sn_auth_name"]').map { |node| node.text.strip }
  sn_target_lang = row.xpath('span[@class="sn_target_lang"]/text()').text.strip
  data << [sn_auth_name, sn_target_lang]
end

# Determine max number of names in a row
max_auth_name = data.map { |row| row[col_auth_name].length }.max

CSV.open("file.csv", "a") do |csv|
  data.each do |row|
    # Fill the Array of names to meet the max length
    row[col_auth_name].fill('', row[col_auth_name].length..(max_auth_name - 1))

    # Write to the CSV file
    csv << row.flatten
  end
end

Google custom search API and Ruby

According to the "Search request metadata" section of the documentation, a nextPage value should be returned alongside items when there are additional results. However, it also says "Note: This API returns up to the first 100 results only", so it looks like you are already getting the maximum number of results.
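To check whether another page is available, you can read the nextPage entry from the response metadata. Here is a minimal sketch, assuming the response body carries queries.nextPage[0].startIndex as described in the documentation. The helper name next_start_index and the MAX_RESULTS constant are made up for this example, and the cap of 100 is taken from the note quoted above.

```ruby
require 'json'

MAX_RESULTS = 100 # cap quoted from the API note above

# Returns the start index for the next page, or nil when the response
# has no nextPage entry or the index falls past the 100-result cap.
def next_start_index(json_body)
  next_page = JSON.parse(json_body).dig('queries', 'nextPage', 0)
  return nil unless next_page

  start = next_page['startIndex'].to_i
  start <= MAX_RESULTS ? start : nil
end

# Hypothetical response, trimmed to the metadata used here
p next_start_index('{"queries":{"nextPage":[{"startIndex":11}]}}')
```

The returned value can be passed as the start parameter of the next request. A nil return means there is nothing more to fetch.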

Google search of: rand between -1 and 1 ruby gives 0 results

In a Google search, the minus sign is the "exclude" operator. Your query therefore says "exclude 1 and include 1", which gives zero results.

Similarly, if you want to exclude a word entirely, you can add a dash
before it—like justin bieber -sucks if you want sites that only speak
of Justin Bieber in a positive light.

-LifeHacker

As for the question you were actually searching for, check out "How to get a random number in Ruby".
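For the underlying question, here is a minimal sketch using Kernel#rand, which accepts a Range. The method names random_unit_float and random_unit_integer are made up for this example.

```ruby
# Random Float between -1.0 and 1.0 (both ends included)
def random_unit_float
  rand(-1.0..1.0)
end

# Random Integer chosen from -1, 0 and 1
def random_unit_integer
  rand(-1..1)
end

puts random_unit_float
puts random_unit_integer
```

When searching for this on Google, quoting the phrase ("between -1 and 1") stops the minus sign from being read as the exclude operator.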


