Question
The question is simple: I need to get the value of every attribute whose value starts with http://example.com/api/v3?
For example, if a page contains
<iframe src="http://example.com/api/v3?download=example%2Forg">
<meta twitter="http://example.com/api/v3?return_to=%2F">
Then I should get an array/list with 2 members: http://example.com/api/v3?return_to=%2F and http://example.com/api/v3?download=example%2Forg (the order doesn’t matter).
I don’t want the elements, just the attribute’s value.
Basically I need the regex that returns strings starting with http://example.com/api/v3? and ending with a space.
Answer 1:
A regular expression would likely look like this:
/http:\/\/example\.com\/api\/v3\?\S+/g
Make sure to escape each / and ? with a backslash. \S+ matches all subsequent non-space characters. You can also try [^\s"]+ instead of \S+ if you also want to exclude quote marks.
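As a minimal sketch of the regex approach, the snippet below runs the pattern against the two tags from the question. The html string is an assumption made for illustration; in a browser you might use document.documentElement.outerHTML instead. It uses the [^\s"]+ variant because with \S+ each match would also swallow the closing "> of the tag.

```javascript
// Sample markup copied from the question (stand-in for the real page source).
const html = [
  '<iframe src="http://example.com/api/v3?download=example%2Forg">',
  '<meta twitter="http://example.com/api/v3?return_to=%2F">',
].join("\n");

// [^\s"]+ stops at whitespace or a double quote, so the closing quote
// of the attribute is not captured.
const pattern = /http:\/\/example\.com\/api\/v3\?[^\s"]+/g;

// With the g flag, match() returns all matches, or null if there are none.
const urls = html.match(pattern) || [];

console.log(urls);
```

Note that this matches the URL anywhere in the markup, including text nodes and comments, not only inside attribute values.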
In my experience, though, regexes are usually slower than working on already parsed objects directly, so I’d recommend you try these Array and DOM functions instead:
Get all elements, map each one to its attributes, keep only the attributes whose value starts with http://example.com/api/v3?, reduce the per-element attribute lists to one Array, and map those attributes to their values.
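Array.from(document.querySelectorAll("*"))
  // For each element, keep only the attributes whose value has the prefix.
  .map(elem => Array.from(elem.attributes)
    .filter(attr => attr.value.startsWith("http://example.com/api/v3?")))
  // Flatten the per-element lists into a single array.
  .reduce((list, attrList) => list.concat(attrList), [])
  // Keep only the attribute values.
  .map(attr => attr.value);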
You can find polyfills for ES6 and ES5 functions and can use Babel or related tools to convert the code to ES5 (or replace the arrow functions by hand).
Answer 2:
There is the CSS selector * meaning "any element".
There is no CSS selector meaning "any attribute with this value". Attribute names are arbitrary. While there are several attributes defined in the HTML specs, it's possible to use custom ones like the twitter attribute in your example. This means you'll have to iterate over all the attributes on a given element.
Without a global attribute value selector, you will need to manually iterate over all elements and their attribute values. It may be possible for you to determine some heuristics to help narrow down your search before resorting to brute force.
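The manual iteration described here could look roughly like the sketch below. The helper name collectAttributeValues and its parameters are hypothetical, introduced only for illustration, and the code assumes a browser where the result of querySelectorAll is iterable.

```javascript
// Hypothetical helper (not from the answers): collect every attribute value
// under `root` that starts with `prefix`.
function collectAttributeValues(root, prefix) {
  const values = [];
  // "*" matches every element below root.
  for (const elem of root.querySelectorAll("*")) {
    // elem.attributes is array-like; Array.from turns it into a real array.
    for (const attr of Array.from(elem.attributes)) {
      if (attr.value.startsWith(prefix)) {
        values.push(attr.value);
      }
    }
  }
  return values;
}

// Only runs where a DOM is available (for example in a browser console).
if (typeof document !== "undefined") {
  console.log(collectAttributeValues(document, "http://example.com/api/v3?"));
}
```

Passing a narrower root than document, such as a container element you already know holds the links, is one way to apply the heuristics mentioned above before falling back to scanning the whole page.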
Source: https://stackoverflow.com/questions/39822557/regex-to-return-all-attributes-of-a-web-page-that-starts-by-a-specific-value