How to get fully-qualified URL from anchor href?

Submitted by 人走茶凉 on 2019-12-23 00:48:39

Question


I am writing a web crawler in PHP. Given the current URL and an array of links that may be absolute, relative, or root-relative, how would I determine the fully-qualified URL for each link?

For example, let's say I am crawling the URL:

http://www.example.com/path/to/my/file.html

And the array of links that the webpage contains is:

array(
    'http://www.some-other-domain.com/',
    '../../',
    '/search',
);

How would I determine the fully-qualified URL for each of those links? The result I am looking for in this example would be, respectively:

http://www.some-other-domain.com/
http://www.example.com/path/
http://www.example.com/search/

Answer 1:


I think the easiest way is to use a library like this: http://www.electrictoolbox.com/php-resolve-relative-urls-absolute/

Examples from the link:

url_to_absolute('http://www.example.com/sitemap.html', 'aboutus.html');

resolves to http://www.example.com/aboutus.html

or

url_to_absolute('http://www.example.com/content/sitemap.html', '../images/somephoto.jpg');

resolves to http://www.example.com/images/somephoto.jpg
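If pulling in a library is not an option, the same idea can be sketched in plain PHP with the built-in parse_url(). The function name resolve_url below is hypothetical and is not part of the linked library. The sketch assumes the base URL is absolute (with a scheme and host), ignores userinfo, and does not cover every edge case of RFC 3986 reference resolution.

```php
<?php
// Hypothetical helper (not from the linked library): resolve $rel against $base.
// Assumes $base is an absolute URL with a scheme and host; userinfo is ignored.
function resolve_url(string $base, string $rel): string
{
    // A reference that carries its own scheme is already fully qualified.
    if (is_string(parse_url($rel, PHP_URL_SCHEME))) {
        return $rel;
    }

    $b = parse_url($base);

    // Protocol-relative reference ("//host/path"): borrow only the scheme.
    if (strpos($rel, '//') === 0) {
        return $b['scheme'] . ':' . $rel;
    }

    $origin = $b['scheme'] . '://' . $b['host']
        . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';

    // Split the reference into its path and its "?query#fragment" tail.
    $cut = strcspn($rel, '?#');
    $relPath = (string) substr($rel, 0, $cut);
    $tail = (string) substr($rel, $cut);

    if ($relPath === '') {
        // Query-only, fragment-only, or empty reference: keep the base path,
        // and keep the base query unless the reference supplies its own.
        $path = $basePath;
        if (($tail === '' || $tail[0] === '#') && isset($b['query'])) {
            $tail = '?' . $b['query'] . $tail;
        }
    } elseif ($relPath[0] === '/') {
        // Root-relative reference: replaces the whole base path.
        $path = $relPath;
    } else {
        // Relative reference: append to the directory part of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $relPath;
    }

    // Collapse "." and ".." segments without climbing above the root.
    $segments = explode('/', $path);
    $last = end($segments);
    $out = [];
    foreach ($segments as $seg) {
        if ($seg === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    // A path ending in "." or ".." names a directory, so keep the trailing slash.
    if ($last === '.' || $last === '..') {
        $out[] = '';
    }

    return $origin . implode('/', $out) . $tail;
}

// Usage with the links from the question.
$current = 'http://www.example.com/path/to/my/file.html';
foreach (['http://www.some-other-domain.com/', '../../', '/search'] as $link) {
    echo resolve_url($current, $link), PHP_EOL;
}
```

For the links in the question, this gives http://www.some-other-domain.com/, http://www.example.com/path/ and http://www.example.com/search. Note that resolution does not append a trailing slash to '/search'; if the crawler needs one, it has to be added as a separate normalization step.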



Source: https://stackoverflow.com/questions/28314414/how-to-get-fully-qualified-url-from-anchor-href
