Rails: What's a good way to validate links (URLs)?
Validating a URL is a tricky job. It's also a very broad request.
What do you want to do, exactly? Do you want to validate the format of the URL, the existence, or what? There are several possibilities, depending on what you want to do.
A regular expression can validate the format of the URL. But even a complex regular expression cannot ensure you are dealing with a valid URL.
For instance, if you take a simple regular expression, it will probably reject the following host:
http://invalid##host.com
but it will allow
http://invalid-host.foo
which is a valid hostname, but not a valid domain if you consider the existing TLDs. Indeed, that approach works if you want to validate the hostname rather than the domain, because the following is a valid hostname
http://host.foo
as is the following one
http://localhost
Now, let me give you some solutions.
If you want to validate a domain, then you need to forget about regular expressions. The best solution available at the moment is the Public Suffix List, a list maintained by Mozilla. I created a Ruby library to parse and validate domains against the Public Suffix List, and it's called PublicSuffix.
If you want to validate the format of a URI/URL, then you might want to use regular expressions. Instead of searching for one, use the built-in Ruby URI.parse method.
require 'uri'
def valid_url?(url)
  uri = URI.parse(url)
  # A string that parses counts as valid here only if it also has a host.
  !uri.host.nil?
rescue URI::InvalidURIError
  false
end
You can even decide to make it more restrictive. For instance, if you want the URL to be an HTTP/HTTPS URL, then you can make the validation more accurate.
require 'uri'
def valid_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end
Of course, there are tons of improvements you can apply to this method, including checking for a path or a scheme.
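As one possible refinement along those lines, the sketch below rejects an empty host and can optionally require a path. The method name strict_http_url? and the require_path: keyword are hypothetical, chosen only for this illustration; treat it as a starting point rather than a complete validator.

```ruby
require 'uri'

# Hypothetical helper: a stricter variant of valid_url? above.
def strict_http_url?(value, require_path: false)
  uri = URI.parse(value.to_s)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return false unless uri.is_a?(URI::HTTP)
  return false if uri.host.nil? || uri.host.empty?
  # Optionally insist on an explicit path such as "/" or "/about".
  return false if require_path && uri.path.to_s.empty?
  true
rescue URI::InvalidURIError
  false
end
```

For example, strict_http_url?("ftp://example.com") is rejected because the parsed object is a URI::FTP, not a URI::HTTP.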
Last but not least, you can also package this code into a validator:
class HttpUrlValidator < ActiveModel::EachValidator
  def self.compliant?(value)
    uri = URI.parse(value)
    uri.is_a?(URI::HTTP) && !uri.host.nil?
  rescue URI::InvalidURIError
    false
  end

  def validate_each(record, attribute, value)
    unless value.present? && self.class.compliant?(value)
      record.errors.add(attribute, "is not a valid HTTP URL")
    end
  end
end

# in the model
validates :example_attribute, http_url: true
How to validate url in rails model?
I solved this problem with this gem https://github.com/ralovets/valid_url
Url validation rails
URI.regexp returns a regular expression which will match all valid URIs, but a) it doesn't validate that the string is only that URI, and b) you want to validate URLs, not URIs (URI is the broader term).
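For a), you can modify the regex to match only if the string is just the URI by wrapping it in the \A (start-of-string) and \z (end-of-string) anchors. In Ruby, ^ and $ match at line boundaries rather than string boundaries, and Rails rejects them in format validators unless multiline: true is set:
validates :website, format: { with: /\A#{URI.regexp}\z/ }, if: -> { website.present? }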
For b), you could change your call to URI.regexp(['http', 'https']) to limit the allowed schemes, which gets you closer. There are also gems for this problem, like valid_url. Or, just accept that your validation will never be perfect.
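To make the difference between line anchors and string anchors concrete, here is a small plain-Ruby sketch. The method names are hypothetical, and it builds the pattern with URI::RFC2396_Parser#make_regexp, on the assumption that this yields the same kind of pattern as URI.regexp on your Ruby version.

```ruby
require 'uri'

# Hypothetical helpers comparing the two anchoring styles.
def http_url_pattern
  # Standard-library RFC 2396 pattern limited to http and https.
  URI::RFC2396_Parser.new.make_regexp(%w[http https])
end

# ^ and $ match at every line boundary, so one good line is enough.
def line_anchored_url?(text)
  /^#{http_url_pattern}$/.match?(text)
end

# \A and \z match only at the start and end of the whole string.
def string_anchored_url?(text)
  /\A#{http_url_pattern}\z/.match?(text)
end

# Multi-line input whose second line happens to be a URL.
input = "javascript:alert(1)\nhttp://example.com"
puts line_anchored_url?(input)    # the second line alone satisfies the pattern
puts string_anchored_url?(input)  # the string as a whole is not a URL
```

The line-anchored version accepts the multi-line input, which is why string anchors are the safer choice for validation.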
How to validate a url in Rails
I found a good way to validate a URL in Rails 4 with a method of the URI class.
class User < ActiveRecord::Base
  validates_format_of :url, :with => URI::regexp(%w(http https))
end
ActiveRecord validate url if it is present
Separate your two validators.
validates :url, presence: true
validates :url, format: { with: URI.regexp }, if: Proc.new { |a| a.url.present? }
(almost) 2 year anniversary edit
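As vrybas and Barry state, the Proc is unnecessary. You can write your validators like this (note that string conditions were deprecated in Rails 5.1 and removed in 5.2, so on newer versions keep a lambda or use allow_blank: true instead):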
validates :url, presence: true
validates :url, format: { with: URI.regexp }, if: 'url.present?'
How to validate unique url with Rails and validates_url?
So the two validations, uniqueness and URL, happen separately, and there is nothing in the uniqueness check to handle the fact that those two URLs are essentially the same - instead, the string values are technically different, and thus it doesn't trip the uniqueness validation.
What you could do is look to tidy up your URL data before validation, with a before_validation callback in your model:
before_validation :process_url
def process_url
  # chomp removes a single trailing "/" if there is one
  self.homepage = homepage.chomp("/") if homepage.present?
end
This is called before the validations kick in, and will make sure that if the homepage attribute is present (even if you add a presence validation later if it becomes non-optional, remember this is running before validations), then any trailing / is removed.
Those two URL strings will then be the same after tidying up, and thus the second time around the validation will kick in and stop it from being saved.
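Outside of Rails, the tidying step can be sketched as a plain Ruby method. The name normalize_homepage is hypothetical, and the extra whitespace stripping is an assumption that goes slightly beyond the callback above.

```ruby
# Hypothetical helper: make equivalent URL spellings compare equal
# before a uniqueness check runs.
def normalize_homepage(value)
  # Leave nil and blank values alone so a presence validation can handle them.
  return value if value.nil? || value.strip.empty?
  # Strip surrounding whitespace, then drop a single trailing slash.
  value.strip.chomp("/")
end

puts normalize_homepage("http://example.com/") == normalize_homepage("http://example.com")
```

In the model, process_url could then assign self.homepage = normalize_homepage(homepage).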
Hope that helps!
How to check if a URL is valid
Notice:
As pointed out by @CGuess, there's a bug report on this issue, and it has been documented for over 9 years that validation is not the purpose of this regular expression (see https://bugs.ruby-lang.org/issues/6520).
Use the URI module distributed with Ruby:
require 'uri'
if url =~ URI::regexp
  # Correct URL
end
Like Alexander Günther said in the comments, it checks if a string contains a URL.
To check if the string is a URL, use:
url =~ /\A#{URI::regexp}\z/
If you only want to check for web URLs (http or https), use this:
url =~ /\A#{URI::regexp(['http', 'https'])}\z/
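The "contains a URL" versus "is a URL" distinction can be checked with a short sketch. The two method names are hypothetical, and URI::RFC2396_Parser#make_regexp is used here on the assumption that it produces the same kind of pattern as URI::regexp without relying on the obsolete shortcut.

```ruby
require 'uri'

# Hypothetical helpers illustrating unanchored versus anchored matching.
def web_url_pattern
  URI::RFC2396_Parser.new.make_regexp(%w[http https])
end

# True when the text has a web URL somewhere inside it.
def contains_web_url?(text)
  !(text =~ web_url_pattern).nil?
end

# True only when the entire text is a web URL.
def web_url?(text)
  !(text =~ /\A#{web_url_pattern}\z/).nil?
end

sentence = "see http://example.com now"
puts contains_web_url?(sentence)  # a URL is embedded in the sentence
puts web_url?(sentence)           # but the sentence itself is not a URL
```

Given the notice above, either check is best treated as a rough format filter rather than proof that the URL is valid.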
how to check if a url/link is safe in ruby on rails
Here is what I found so far; I hope this helps someone with the same requirement:
As pointed out by @debugger, there are multiple services that provide this functionality. The best fits in my case are the ones below:
Google Safe Browsing API (non-commercial use only)
Google Web Risk API (for commercial use)
Both are Google APIs that can be used to check whether a URL is safe.
With Web Risk you will be charged after a certain number of requests, so it may be a good idea to first check that the URL/link is valid: valid URL gem
Custom URL validation method in Rails 3.2
If you just want to get it working, then how about:
errors.add(:url, 'not valid') if (url =~ URI::regexp).nil?
But if parsing URLs is important to you, you might want to consider an alternative to Ruby's standard URI implementation such as addressable, which handles UTF-8 characters, normalization and other edge cases that might be important depending on the context.
See also this: check if url is valid ruby
rails validating url with URI.parse throwing unexpected results
Your issue is actually a Ruby gotcha, on this line:
uri = URI.parse(uri)
What you wanted to happen was for uri to be redefined as the parsed version of uri. What's actually happening is that Ruby is seeing "oh, you're defining a new local variable uri"; that new local variable now takes precedence over the method uri which you defined with your attr_accessor. And for some reason, Ruby evaluates self-references during local variable assignment as nil. This is an example of shadowing.
So, the statement above causes URI.parse to always execute on a value of nil, which is why your test is failing no matter what you set that URI to. To fix it, just use a different variable name:
parsed_uri = URI.parse(uri)
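Here is a minimal, self-contained sketch of the same situation inside a class. Link and its method names are hypothetical stand-ins for your model, not your actual code.

```ruby
require 'uri'

# Hypothetical stand-in for a model with a `uri` accessor.
class Link
  attr_accessor :uri

  def initialize(uri)
    @uri = uri
  end

  # Shadowed: `uri = ...` declares a local named uri, so the `uri` on the
  # right-hand side refers to that new local, which is still nil.
  def shadowed_input
    uri = uri
    uri
  end

  # Fixed: a different local name leaves the accessor reachable.
  def parsed_host
    parsed_uri = URI.parse(uri)
    parsed_uri.host
  end
end

link = Link.new("http://www.google.com")
p link.shadowed_input  # the accessor's value never reaches the local
p link.parsed_host     # the accessor is read as intended
```

Calling self.uri on the right-hand side would also avoid the shadowing, though a distinct variable name is easier to read.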
Appendix: short examples that prove I'm not making this up
irb(main):016:0> z
NameError: undefined local variable or method `z' for main:Object
from (irb):16
from /Users/rnubel/.rubies/ruby-2.3.3/bin/irb:11:in `<main>'
irb(main):017:0> z = z
=> nil
The first statement fails, because it just references a local var that doesn't exist. The second statement succeeds, because Ruby evaluates the z variable in the assignment of z as nil.
irb(main):011:0> class Foo; def bar; "http://www.google.com"; end; end
=> :bar
irb(main):012:0> Foo.new.instance_exec { bar }
=> "http://www.google.com"
irb(main):013:0> Foo.new.instance_exec { bar = (puts bar.inspect) }
nil
=> nil
This is the same issue you're having; just with a puts to inspect the value in-line. It prints nil.