my_url =~ /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/ix
After reading a blog post by Michael Bleigh, I realized that there is a Ruby way to do URL validation. The secret is the regexp method of the URI module. It generates a regular expression based on the scheme (protocol) names that you pass in. Matching a string against that expression with =~ returns the index where the match starts (0 when the string begins with a valid URL) and nil when there is no match.
require 'open-uri'
"http://google.com" =~ URI::regexp("ftp") # => nil
"http://google.com" =~ URI::regexp("http") # => 0
"google.com" =~ URI::regexp("ftp") # => nil
"google.com" =~ URI::regexp(%w(ftp http)) # => nil
"http://google.com" =~ URI::regexp(["ftp", "http", "https"]) # => 0
If you use Rails, URI::regexp can be plugged directly into your model validation.
class ExampleModel < ActiveRecord::Base
  validates_format_of :site, :with => URI::regexp(%w(http https))
end
Thank you, Michael Bleigh, for sharing this.
Update:
This approach seems flawed. Evaluating "http://" =~ URI::regexp("http") returns 0, which indicates that the URL is valid. So I recommend using the regular expression provided at the beginning of the post.
"http://" =~ URI::regexp("http") # => 0
"http://" =~ /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/ix # => nil
Thanks to Losk, who pointed out the "http://" case in the comments below.
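Another option that stays within the standard library is to parse the string and inspect the result instead of relying on a pattern alone. The sketch below is not from Michael Bleigh's post or from the comments; valid_http_url? is a hypothetical helper name. It accepts only http and https URLs that have a non-empty host, so "http://" is rejected while port numbers are allowed.

```ruby
require 'uri'

# Hypothetical helper: true only for http/https URLs with a non-empty host.
def valid_http_url?(string)
  uri = URI.parse(string)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  # Strings that URI cannot parse at all are treated as invalid.
  false
end

valid_http_url?("http://google.com")       # accepted
valid_http_url?("http://google.com:8080/") # accepted, port allowed
valid_http_url?("http://")                 # rejected, no host
valid_http_url?("google.com")              # rejected, no scheme
valid_http_url?("ftp://google.com")        # rejected, wrong scheme
```

In a Rails model this kind of check would likely go into a custom validation method rather than validates_format_of, since it is not a regular expression.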
The approach seems flawed. For example if I open console, require 'open-uri' and then pass "http://" =~ URI::regexp("http"), it returns 0 indicating the URL to be valid, and while it may be in terms of what open-uri's doing, it doesn't seem like a URL I'd want to associate to any user who's entering information on my site.
It's worth mentioning that the regular expression provided at the beginning of the post finds the example "http://" to be invalid.
You're right. Evaluating "http://" =~ URI::regexp("http") returns 0, which means "http://" is treated as valid. Thanks for the pointer, Losk.
Your regexp doesn't support port numbers?