Ignore date in a string with numbers using regular expression

Submitted by 十年热恋 on 2019-12-24 03:51:28

Question


I have a small problem.

I use [0-9\,.]* to find a decimal number in a string, and ([^\s]+) to find the text after the first number.

The string normally looks like this: a number, some text, and then a date:

1.023,45 stück

24.05.10

But sometimes I have only the date, and then I get 240510 as the decimal. And sometimes I have only the decimal.

How should I modify the regex to find the date, if it exists, and remove it, and then look for a decimal and select it if it exists?

Thanks in advance.


Answer 1:


Divide and conquer

  1. Check for the date first and remove the match from the string

    ([0-9]{1,2}\.){2}[0-9]{1,2}

  2. Find the number using your original regex

    [0-9\,.]*

  3. If you need it, find the unit of quantity (assuming it only ever appears in lower case, possibly with the umlaut ü)

    ([a-zü]+)

See http://regexe.de/ (German) and http://www.regexr.com/ (English) for some useful information and tools for dealing with regex.
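
The three steps above could be sketched in Python roughly as follows. This is only an illustration, not the answerer's own code: the function name parse_quantity and the constant names are made up for the example, and the number pattern uses + instead of * (and drops the unneeded backslash before the comma) so that an empty match is never returned.

```python
import re

# Step 1: date such as 24.05.10 (pattern taken from the answer).
DATE_RE = re.compile(r"([0-9]{1,2}\.){2}[0-9]{1,2}")
# Step 2: the original character class, with + instead of * (assumption:
# an empty match is not useful here).
NUMBER_RE = re.compile(r"[0-9,.]+")
# Step 3: lower-case unit of quantity, possibly containing ü.
UNIT_RE = re.compile(r"[a-zü]+")


def parse_quantity(text):
    """Return (number, unit, date); each item is None when it is absent."""
    date_match = DATE_RE.search(text)
    date = date_match.group(0) if date_match else None

    # Remove the date first so its digits cannot be taken for the number.
    remainder = DATE_RE.sub("", text, count=1)

    number_match = NUMBER_RE.search(remainder)
    unit_match = UNIT_RE.search(remainder)
    return (
        number_match.group(0) if number_match else None,
        unit_match.group(0) if unit_match else None,
        date,
    )


print(parse_quantity("1.023,45 stück\n24.05.10"))
print(parse_quantity("24.05.10"))
print(parse_quantity("1.023,45 stück"))
```

With a date-only input the number comes back as None instead of 240510, because the date has already been stripped before the number pattern runs.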




Answer 2:


I suggest matching the number in a more restricted way: 1-3 digits, then any number of groups made of . plus 3 digits, and then an optional decimal separator followed by digits.

    (?s)(?<number>\d{1,3}(?:\.\d{3})*(?:,\d+)?)\s+(.*?)(?:$|\n|(?<date>\d{2}\.?\d{2}\.?(?:\d{4}|\d{2})))

See demo

The number will be held in ${number}, and the date in ${date}. If the string starts with something that looks very much like a date (6 or 8 digits with optional periods), it won't be captured as the number. If the date format is known (say, the periods are always present), remove the ? quantifiers from the \.? parts.

(?s) at the beginning makes the period . match newline characters as well (it may not be necessary).
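
As a rough illustration, the pattern can be tried out in Python. Two things here are assumptions and not part of the answer: Python's re module spells named groups (?P<name>...) instead of (?<name>...), so the pattern is translated accordingly, and the helper name split_line is invented for the example.

```python
import re

# The answer's pattern, translated to Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"(?s)(?P<number>\d{1,3}(?:\.\d{3})*(?:,\d+)?)\s+(.*?)"
    r"(?:$|\n|(?P<date>\d{2}\.?\d{2}\.?(?:\d{4}|\d{2})))"
)


def split_line(text):
    """Return (number, text, date), or None when the pattern does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Group 2 is the unnamed lazy group; it may carry trailing whitespace.
    return match.group("number"), match.group(2).strip(), match.group("date")


print(split_line("1.023,45 stück 24.05.10"))
print(split_line("1.023,45 stück"))
print(split_line("24.05.10"))
```

In this Python translation, a date-only string such as 24.05.10 produces no match at all, so the date is never mistaken for the number. Note that when a newline separates the text from the date, the \n alternative is tried before the date alternative and matches first, so the date group stays empty for such input.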



Source: https://stackoverflow.com/questions/29996500/ignore-date-in-a-string-with-numbers-using-regular-expression
