Ruby code to extract host from URL string
You could try something like this:
require 'uri'
uri = URI.parse('http://www.mglenn.com/directory')
puts uri.host
# => www.mglenn.com
How do I get just the sitename from url in ruby?
Using a gem for this might be overkill, but anyway: there's a handy gem called domainatrix that can extract the site name for you while dealing with things like two-element top-level domains (for example "co.uk") and more.
require 'domainatrix'
url = Domainatrix.parse("http://www.pauldix.net")
url.url # => "http://www.pauldix.net" (the original url)
url.public_suffix # => "net"
url.domain # => "pauldix"
url.canonical # => "net.pauldix"
url = Domainatrix.parse("http://foo.bar.pauldix.co.uk/asdf.html?q=arg")
url.public_suffix # => "co.uk"
url.domain # => "pauldix"
url.subdomain # => "foo.bar"
url.path # => "/asdf.html?q=arg"
url.canonical # => "uk.co.pauldix.bar.foo/asdf.html?q=arg"
How would you parse a url in Ruby to get the main domain?
This should work with pretty much any URL:
# URL always gets parsed twice
def get_host_without_www(url)
url = "http://#{url}" if URI.parse(url).scheme.nil?
host = URI.parse(url).host.downcase
host.start_with?('www.') ? host[4..-1] : host
end
Or:
# Only parses twice if url doesn't start with a scheme
def get_host_without_www(url)
uri = URI.parse(url)
uri = URI.parse("http://#{url}") if uri.scheme.nil?
host = uri.host.downcase
host.start_with?('www.') ? host[4..-1] : host
end
You may have to require 'uri'.
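Both versions above raise on input that URI.parse rejects, and call downcase on nil when no host is found. A minimal sketch of a more defensive variant follows; the name safe_host_without_www is hypothetical, and returning nil for unusable input is an assumption rather than something the original answer specifies.

```ruby
require 'uri'

# Hypothetical helper: like get_host_without_www, but returns nil
# instead of raising when the input cannot be parsed or has no host.
def safe_host_without_www(url)
  text = url.to_s.strip
  uri = URI.parse(text)
  # Retry with a default scheme when none was given.
  uri = URI.parse("http://#{text}") if uri.scheme.nil?
  host = uri.host
  return nil if host.nil? || host.empty?

  # Drop a single leading "www." and normalise case.
  host.downcase.sub(/\Awww\./, '')
rescue URI::InvalidURIError
  nil
end

puts safe_host_without_www('http://www.Example.com/path')
puts safe_host_without_www('example.com/foo')
p safe_host_without_www('not a url')
```

Note that a scheme-less input with a port, such as "example.com:8080", is still a problem: URI.parse reads the part before the colon as a scheme, so no host is found.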
How to get the domain from a URL without using a URI parser? I want to do it using a regex
I'd go with @Arup Rakshit's solution. However, if you really want a regexp, why not use this one:
/^http:\/\/(.+)\.[a-z]{2,3}/
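One caveat with the pattern above: it only matches http, and the greedy (.+) can run past the host. For "http://www.example.com/foo.html" it captures "www.example.com/foo". A rough sketch of a stricter pattern is below; HOST_PATTERN and host_via_regex are made-up names, and the pattern deliberately ignores userinfo ("user@host") and IPv6 literals, so treat it as an illustration rather than a full URL parser.

```ruby
# Assumed pattern (not from the original answer): accept http or https,
# then capture everything up to the first "/", "?", "#" or ":".
HOST_PATTERN = %r{\Ahttps?://([^/?#:]+)}i

# Hypothetical helper: returns the lower-cased host, or nil if no match.
def host_via_regex(url)
  match = HOST_PATTERN.match(url)
  match && match[1].downcase
end

puts host_via_regex('http://www.example.com/foo.html')
puts host_via_regex('https://Example.co.uk:8080/x?y=1')
p host_via_regex('ftp://example.com')
```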
Given a URL, how can I get just the domain?
Use Addressable::URI.parse and the #host instance method:
require 'addressable/uri'
Addressable::URI.parse("http://techcrunch.com/foo/bar").host #=> "techcrunch.com"
How to parse a URL and extract the required substring
I'd do it this way:
require 'uri'
uri = URI.parse('http://something.example.com/directory/')
uri.host.split('.').first
# => "something"
URI is built into Ruby. It's not the most full-featured but it's plenty capable of doing this task for most URLs. If you have IRIs then look at Addressable::URI.
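To give a feel for what the built-in URI gives you beyond host, here is a small sketch that pulls a few components out of one URL. The port and query string in the example URL are made up for illustration.

```ruby
require 'uri'

# Example URL with an assumed port and query string added for illustration.
uri = URI.parse('http://something.example.com:8080/directory/?q=ruby')

parts = {
  scheme: uri.scheme,
  host:   uri.host,
  port:   uri.port,
  path:   uri.path,
  query:  uri.query
}

# The host split into its dot-separated labels.
labels = uri.host.split('.')

p parts
p labels
```

Keep in mind that taking labels.first only gives the subdomain when there is one; for "www.example.com" it would give "www".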
What's the best way to parse URLs to extract the domain?
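You can use the domainatrix gem to get what you want: "#{url.domain}.#{url.public_suffix}" (note the "." between the two parts; url.domain + url.public_suffix on its own would give "pauldixnet" for the example above). Alternatively, you can just do some string manipulation, like host[4..-1] on a host that starts with "www.".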
Extract all urls inside a string in Ruby
A different approach, from the perfect-is-the-enemy-of-the-good school of thought:
urls = content.split(/\s+/).find_all { |u| u =~ /^https?:/ }
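In the same good-enough spirit, the one-liner above keeps any punctuation stuck to the end of a URL, such as the comma in "http://example.com/a,". A small sketch that trims it is below; extract_urls is a hypothetical name, and the set of trailing characters to strip is an assumption that will also eat punctuation legitimately ending a URL.

```ruby
# Hypothetical helper: split on whitespace, keep tokens that start with
# http: or https:, then strip common trailing punctuation.
def extract_urls(content)
  content.split(/\s+/)
         .select { |token| token.match?(/\Ahttps?:/) }
         .map { |token| token.sub(/[.,;:!?)]+\z/, '') }
end

text = 'See http://example.com/a, and https://example.org/b. Bye'
p extract_urls(text)
```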