Question
There are a lot of posts on converting relative to absolute paths in PHP. I'm looking for a specific implementation beyond these posts (hopefully). Could anyone please help me with this specific implementation?
I have a PHP variable containing diverse HTML, including hrefs and imgs containing relative URLs. Mostly (for example) /en/discover or /img/icons/facebook.png
I want to process this PHP variable in such a way that the values of my hrefs and imgs will be converted to http://mydomain.com/en/discover and http://mydomain.com/img/icons/facebook.png
I believe the question below covers the solution for hrefs. How can we expand this to also consider imgs?
- Change a relative URL to absolute URL
Would a regex be in order? Or since we're dealing with a lot of output should we use DOMDocument?
Answer 1:
After some further research I've stumbled upon this article from Gerd Riesselmann on how to solve the absence of a base href solution for RSS feeds. His snippet actually solves my question!
http://www.gerd-riesselmann.net/archives/2005/11/rss-doesnt-know-a-base-url
<?php
function relToAbs($text, $base)
{
    if (empty($base))
        return $text;
    // base url needs trailing /
    if (substr($base, -1, 1) != "/")
        $base .= "/";
    // Replace links
    $pattern = "/<a([^>]*) " .
               "href=\"[^http|ftp|https|mailto]([^\"]*)\"/";
    $replace = "<a\${1} href=\"" . $base . "\${2}\"";
    $text = preg_replace($pattern, $replace, $text);
    // Replace images
    $pattern = "/<img([^>]*) " .
               "src=\"[^http|ftp|https]([^\"]*)\"/";
    $replace = "<img\${1} src=\"" . $base . "\${2}\"";
    $text = preg_replace($pattern, $replace, $text);
    // Done
    return $text;
}
?>
Thank you Gerd! And thank you shadyyx for pointing me in the direction of base href!
Answer 2:
Excellent solution. However, there is a small typo in the pattern. As written above, it truncates the first character of the href or src. Here are patterns that work as intended:
// Replace links
$pattern = "/<a([^>]*) " .
"href=\"([^http|ftp|https|mailto][^\"]*)\"/";
and
// Replace images
$pattern = "/<img([^>]*) " .
"src=\"([^http|ftp|https][^\"]*)\"/";
The opening parenthesis of the second capture group has been moved to the left. This brings the first character of the href or src (the one checked against http|ftp|https) inside the group, so it is included in the ${2} replacement reference instead of being dropped.
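One caveat with the patterns above: [^http|ftp|https|mailto] is a character class, not an alternation, so it rejects any value whose first character is one of those letters or "|". A link to pages/about.html (starts with p) or an image at static/logo.png (starts with s) is therefore left unconverted, and a root-relative value such as /en/discover ends up with a double slash after the base. Below is a minimal sketch of one possible alternative, assuming a negative lookahead is acceptable. The function name relToAbsLookahead is hypothetical, and like the original it treats every relative value as relative to the site root and only handles double-quoted attributes.

```php
<?php
// Hypothetical helper (not from the original answers): prefixes relative
// href/src values in <a> and <img> tags with $base.
function relToAbsLookahead(string $html, string $base): string
{
    if ($base === '') {
        return $html;
    }
    // Strip trailing slashes so the join below adds exactly one.
    $base = rtrim($base, '/');

    // (?!...) is a negative lookahead: it skips values that already have a
    // scheme (http:, https:, ftp:, mailto:, ...), protocol-relative values
    // (//host/path) and fragment-only values (#top).
    $pattern = '~<(a|img)\b([^>]*?)\s(href|src)="'
             . '(?!(?:[a-z][a-z0-9+.-]*:|//|#))([^"]*)"~i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            // Assumption: every relative value is resolved against the site root.
            $url = $base . '/' . ltrim($m[4], '/');
            return '<' . $m[1] . $m[2] . ' ' . $m[3] . '="' . $url . '"';
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

Called as relToAbsLookahead($html, 'http://mydomain.com/'), this should rewrite both /en/discover and img/icons/facebook.png while leaving https:// and mailto: values alone. Values relative to a sub-directory (../x or x when the page is not at the root) are not resolved correctly by this sketch.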
Answer 3:
I found that when the href/src value and the base URL got more complex, the accepted answer's solution didn't work for me.
For example:
base URL:
http://www.journalofadvertisingresearch.com/ArticleCenter/default.asp?ID=86411&Type=Article
href/src value:
/ArticleCenter/LeftMenu.asp?Type=Article&FN=&ID=86411&Vol=&No=&Year=&Any=
incorrectly returned:
/ArticleCenter/LeftMenu.asp?Type=Article&FN=&ID=86411&Vol=&No=&Year=&Any=
I found the function below, which returns the correct URL. I got it from a comment by Isaac Z. Schlueter here: http://php.net/manual/en/function.realpath.php
For the example above it correctly returned:
http://www.journalofadvertisingresearch.com/ArticleCenter/LeftMenu.asp?Type=Article&FN=&ID=86411&Vol=&No=&Year=&Any=
function resolve_href ($base, $href) {
    // href="" ==> current url.
    if (!$href) {
        return $base;
    }
    // href="http://..." ==> href isn't relative
    $rel_parsed = parse_url($href);
    if (array_key_exists('scheme', $rel_parsed)) {
        return $href;
    }
    // add an extra character so that, if it ends in a /, we don't lose the last piece.
    $base_parsed = parse_url("$base ");
    // if it's just server.com and no path, then put a / there.
    if (!array_key_exists('path', $base_parsed)) {
        $base_parsed = parse_url("$base/ ");
    }
    // href="/..." ==> throw away current path.
    // ($href[0] replaces $href{0}; the curly-brace offset syntax was removed in PHP 8.)
    if ($href[0] === "/") {
        $path = $href;
    } else {
        $path = dirname($base_parsed['path']) . "/$href";
    }
    // bla/./bloo ==> bla/bloo
    $path = preg_replace('~/\./~', '/', $path);
    // resolve /../
    // loop through all the parts, popping whenever there's a .., pushing otherwise.
    $parts = array();
    foreach (explode('/', preg_replace('~/+~', '/', $path)) as $part) {
        if ($part === "..") {
            array_pop($parts);
        } elseif ($part != "") {
            $parts[] = $part;
        }
    }
    // scheme://host (when the base has a scheme) followed by the normalized path
    return (
        (array_key_exists('scheme', $base_parsed)) ?
        $base_parsed['scheme'] . '://' . $base_parsed['host'] : ""
    ) . "/" . implode("/", $parts);
}
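The question also asked whether DOMDocument would be a better fit than a regex. A function like the one above resolves a single URL, so it could be combined with DOMDocument to rewrite every href and src in an HTML fragment. The following is only a sketch under some assumptions: absolutizeHtml and joinToOrigin are hypothetical names, joinToOrigin is a simplified stand-in that treats every relative value as root-relative, and the input is assumed to be ASCII or entity-encoded (loadHTML does not assume UTF-8 by default).

```php
<?php
// Hypothetical stand-in resolver: scheme://host[:port] of $base plus $url.
// Assumption: relative values are resolved against the site root.
function joinToOrigin(string $base, string $url): string
{
    // Leave empty, absolute (scheme:), protocol-relative and fragment-only values alone.
    if ($url === '' || preg_match('~^(?:[a-z][a-z0-9+.-]*:|//|#)~i', $url)) {
        return $url;
    }
    $parts = parse_url($base);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // base is not usable; keep the value unchanged
    }
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');
    return $origin . '/' . ltrim($url, '/');
}

// Hypothetical helper: rewrites <a href> and <img src> in an HTML fragment.
// $resolve is any callable taking ($base, $url) and returning the new URL.
function absolutizeHtml(string $html, string $base, callable $resolve): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> gives the fragment a single root element.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach (['a' => 'href', 'img' => 'src'] as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attr)) {
                $node->setAttribute($attr, $resolve($base, $node->getAttribute($attr)));
            }
        }
    }

    // Serialize the children of the wrapper <div>, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

A call such as absolutizeHtml($html, 'http://mydomain.com/some/page', 'joinToOrigin') uses the simplified resolver; passing 'resolve_href' as the third argument instead should work as well, since it takes its arguments in the same ($base, $href) order. Note that DOMDocument re-serializes the markup, so the output may differ from the input in whitespace or quoting even where no URL changed.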
Source: https://stackoverflow.com/questions/13457693/php-find-images-and-links-with-relative-path-in-output-and-convert-them-to-abso