Matching Product Prices from an HTML text

Submitted by 喜夏-厌秋 on 2019-12-02 08:33:34

Question


I'm trying a simple regex on a string for pricing information, but my preg_match_all is simply not finding what it should.

I'm looking for instances of e.g. $**.** or £**.**; sometimes the currency symbol might be encoded as an HTML entity, e.g. &pound; or &#163; for GBP.

Is there an issue with using preg_match_all to find HTML entities?

Here's what I'm trying:

$price = preg_match_all(
    '#(?:\$|\£|\€|\&pound;|\&#163;)(\d+(?:\.\d+)?)#', 
    $string, 
    $matches
);

But I get: Unknown modifier '1'


Answer 1:


Here are some obvious errors:

1) preg_match_all() needs a third parameter to hand back the matches (it was mandatory before PHP 5.4), so it has to be

preg_match_all(
    '#(?:\$|\£|\€|\&pound;|\&#163;)(\d+(?:\.\d+)?)#', 
    $string, 
    $matches
);

The $matches variable will contain the matched strings. Your $price will contain the number of times the pattern matched. Please see http://php.net/preg_match_all for further information.
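To make the difference between the return value and $matches concrete, here is a minimal sketch. The sample string and the simplified dollar-only pattern are assumptions chosen for illustration, not the asker's data.

```php
<?php
// Illustrative input; the text and the simplified pattern are assumptions.
$string = 'Item A costs $10.50 and item B costs $7';

// The return value is the number of full matches, not the prices themselves.
$count = preg_match_all('/\$(\d+(?:\.\d+)?)/', $string, $matches);

echo $count, "\n";       // how many prices were found
print_r($matches[0]);    // full matches, including the "$"
print_r($matches[1]);    // first capture group: the numeric part only
```

So a variable like $price in the question would hold a count, and the amounts themselves would be read from $matches[1].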

2) You have an unescaped delimiter:

'#(?:\$|\£|\€|\&pound;|\&#163;)(\d+(?:\.\d+)?)#'
 ^                       ^                    ^
 Start                   Unescaped            End

Fixing these two issues will make the code run without any parsing errors. It should also answer your literal question about matching entities.
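One way to apply the second fix is to pick a delimiter that does not occur in the pattern, so the "#" inside the numeric entity &#163; no longer ends the regex early. The sketch below assumes UTF-8 input; the helper name find_prefixed_prices and the sample string are hypothetical.

```php
<?php
// Hypothetical helper name; assumes the input is UTF-8 (hence the "u" flag).
function find_prefixed_prices(string $html): array
{
    // "~" is the delimiter, so the "#" inside "&#163;" needs no escaping.
    // Keeping "#" as the delimiter also works if the entity is written "&\#163;".
    $pattern = '~(?:\$|£|€|&pound;|&#163;)(\d+(?:\.\d+)?)~u';

    preg_match_all($pattern, $html, $matches);

    // $matches[1] holds the first capture group: the amount without the symbol.
    return $matches[1] ?? [];
}

print_r(find_prefixed_prices('Was &pound;12.50, now &#163;9.99 ($11 or €7)'));
```

This only matches entities that are still encoded in the input. If the HTML has already been decoded, the literal £ alternative is the one that matches.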

However, I somewhat doubt the regex achieves what you are trying to do. Prices are not always written as [CurrencySymbol][Amount]. For instance, Euros are usually written as 100€ or 100 €. So you'd also have to check for digits before the symbol, optionally separated from it by whitespace.
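As a rough sketch of that last point, the amount-first form could be matched as below. It assumes UTF-8 input, a "." decimal separator, and ordinary ASCII whitespace between amount and symbol; the helper name find_suffixed_prices and the &euro; alternative are additions for illustration.

```php
<?php
// Hypothetical helper: finds amounts written before the currency symbol,
// e.g. "100€" or "100 €". Assumes UTF-8 input and a "." decimal separator.
function find_suffixed_prices(string $text): array
{
    // Amount first, then optional whitespace, then the symbol or its entity.
    $pattern = '~(\d+(?:\.\d+)?)\s*(?:€|&euro;)~u';

    preg_match_all($pattern, $text, $matches);

    // $matches[1] holds the numeric part of each match.
    return $matches[1] ?? [];
}

print_r(find_suffixed_prices('Now 100€, was 120.50 € (or 99 &euro;)'));
```

Real price text also uses comma decimals, thousands separators and non-breaking spaces, none of which this sketch handles.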



Source: https://stackoverflow.com/questions/12686350/matching-product-prices-from-an-html-text
