问题
I would like detect and filter if html text sent from a form, contains url or urls.
For example, i send from a form this html:
RESOURCES<br></u></b><a target="_blank" rel="nofollow" href="http://stackoverflow.com/users/778094/hyperrjas">http://stackoverflow.com/users/778094/hyperrjas</a> <br><a target="_blank" rel="nofollow" href="https://github.com/hyperrjas">https://github.com/hyperrjas</a> <br><a target="_blank" rel="nofollow" href="http://www.linkedin.com/pub/juan-ardila-serrano/11/2a7/62">http://www.linkedin.com/pub/juan-ardila-serrano/11/2a7/62</a> <br>
I don't want allow a or various url/urls in html text. it could be something like:
validate :no_urls
def no_urls
if text_contains_url
errors.add(:url, "#{I18n.t("mongoid.errors.models.profile.attributes.url.urls_are_not_allowed_in_this_text", url: url)}")
end
end
I would like to know, how can I filter if a html text contain a or various urls?
回答1:
You can take Ruby´s build in URI module, which could already extract URIs from a text.
require "uri"
links = URI.extract("your text goes here http://example.com mailto:test@example.com foo bar and more...")
links => ["http://example.com", "mailto:test@example.com"]
So you could modify your validation like following:
validate :no_html
def no_html(text)
links = URI.extract(text)
unless links.empty?
errors.add(:url, "#{I18n.t("mongoid.errors.models.profile.attributes.url.urls_are_not_allowed_in_this_text", url: url)}")
end
end
回答2:
You could use a regex to parse strings that look like urls e.g. something like this: /^http:\/\/.*/
But if you want to detect html tags like a
you should look into libraries for parsing html.
Nokogiri is one such library.
回答3:
The Mattherick's answer only works if the string not contains the colon symbol ":".
With Ruby 1.9.3 right thing is to add a second parameter to fix this problem.
Also, If you add a email address as plain text, this code is not filtering this email address. The fix to this problem is:
html_text = "html text with email address e.g. info@test.com"
email_address = html_text.match(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i)[0]
So, this is my code that works correctly for me:
def no_urls
whitelist = %w(attr1, attr2, attr3, attr4)
attributes.select{|el| whitelist.include?(el)}.each do |key, value|
links = URI.extract(value, /http(s)?|mailto/)
email_address = "#{value.match(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i)}"
unless links.empty? and email_address.empty?
logger.info links.first.inspect
errors.add(key, "#{I18n.t("mongoid.errors.models.cv.attributes.no_urls")}")
end
end
end
Regards!
来源:https://stackoverflow.com/questions/16234613/validate-presence-of-url-in-html-text