How to extract URLs from text

有刺的猬 2020-12-03 02:56

How do I extract all URLs from a plain text file in Ruby?

I tried some libraries, but they fail in some cases. What's the best way?

6 answers
  •  死守一世寂寞
    2020-12-03 03:51

    What cases are failing?

    According to the library regexpert, you can use

    regexp = /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix
    

    and then perform a scan on the text.

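    EDIT: It seems the regexp also matches the empty string. Just remove the initial (^$) alternative and you're done. Note that the remaining ^ and $ anchors only match a URL that occupies a whole line, so drop them as well if the URLs are embedded in running text.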
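
    For illustration, here is a minimal sketch of the scan step. It is an adaptation of the pattern above, not the library's own code. The constant URL_PATTERN and the method extract_urls are hypothetical names. The changes are my assumptions: the empty-string branch and the line anchors are dropped, a colon is expected before the port, the path stops at whitespace, and the groups are non-capturing so that String#scan returns whole matches.

```ruby
# Adapted pattern (assumptions, not the library's original):
#   - no (^$) branch and no ^...$ anchors, so URLs inside a line can match
#   - ":" expected before an optional port number
#   - path is \S* instead of .*, so a match ends at the first whitespace
#   - groups are non-capturing, so scan yields full matches rather than groups
URL_PATTERN = /https?:\/\/[a-z0-9]+(?:[\-.][a-z0-9]+)*\.[a-z]{2,5}(?::[0-9]{1,5})?(?:\/\S*)?/i

# Returns every substring of text that matches URL_PATTERN, in order.
def extract_urls(text)
  text.scan(URL_PATTERN)
end

sample = "See http://example.com/docs and https://sub.example.org:8080/a?b=1 for details."
p extract_urls(sample)
# => ["http://example.com/docs", "https://sub.example.org:8080/a?b=1"]
```

    Because the path runs up to the next whitespace, punctuation that directly follows a path (for example a closing period) ends up in the match and may need to be stripped afterwards.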
