I need to remove ordinals via regex, but my regex skills are quite lacking. The following locates the ordinals, but includes the digit just prior in the return value. I need
I came across this question because I needed to strip ordinal numbers written with a dot, i.e. 1., 2., 4., etc., from the start of a string.
Here is the solution for this problem (in PHP):
$entry = preg_replace('/^\d+\. /', '', $entry);
Test: https://regex101.com/r/xLB6Ov/1
Try a positive lookbehind:
(?<=[0-9])(?:st|nd|rd|th)
assuming the dialect of regex supports it.
You need to use a look-behind assertion so that st|nd|rd|th is matched only when preceded by a [0-9], but the [0-9] isn't included in the match, i.e.:
(?<=[0-9])(?:st|nd|rd|th)
I've linked to the Perl-compatible syntax, but if you're using POSIX, POSIX extended, vi or one of the many other regex syntaxes, you'll need to look up the equivalent syntax and check whether look-behind is supported at all.
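As a rough illustration of the look-behind approach, here is a minimal Python sketch. Python is not mentioned in the question; it is used here only because its `re` module supports fixed-width look-behind, and the helper name `strip_ordinal_suffixes` is made up for the example.

```python
import re

# The look-behind (?<=[0-9]) requires a digit just before the suffix,
# but the digit is not part of the match, so it survives the substitution.
ORDINAL_SUFFIX = re.compile(r"(?<=[0-9])(?:st|nd|rd|th)")


def strip_ordinal_suffixes(text: str) -> str:
    """Remove st/nd/rd/th when it directly follows a digit (hypothetical helper)."""
    return ORDINAL_SUFFIX.sub("", text)


print(strip_ordinal_suffixes("the 1st, 22nd and 3rd of the month"))
# -> the 1, 22 and 3 of the month
```

Note that "the" and "month" are left alone because their "th" is not preceded by a digit. The pattern has no trailing word boundary, so something like "5thing" would become "5ing"; appending `\b` is one way to avoid that.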
If you also want to remove the numbers that carry the ordinal suffixes, you could use this one:
[0-9]+(?:st| st|nd| nd|rd| rd|th| th)
So for the text "The 3rd person is missing but the 2 nd and the 1st is here" you'll get this output (each removed number leaves a doubled space behind): "The  person is missing but the  and the  is here"
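The same pattern can be tried out in Python; this is only a sketch, and the names `remove_ordinal_numbers` and `tidy` are invented for the example. The whitespace clean-up step is an addition, not part of the answer above.

```python
import re

# Digits followed by an ordinal suffix, with an optional single space between.
NUMBER_WITH_ORDINAL = re.compile(r"[0-9]+(?:st| st|nd| nd|rd| rd|th| th)")


def remove_ordinal_numbers(text: str) -> str:
    """Delete the number together with its ordinal suffix (hypothetical helper)."""
    return NUMBER_WITH_ORDINAL.sub("", text)


sample = "The 3rd person is missing but the 2 nd and the 1st is here"
removed = remove_ordinal_numbers(sample)

# The spaces on both sides of each removed number remain, so collapse them.
tidy = " ".join(removed.split())
print(tidy)
# -> The person is missing but the and the is here
```

Because the pattern allows a space before the suffix and has no word boundary after it, a phrase such as "2 things" would lose its "2 th". Adding `\b` after the group is one way to guard against that.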
In Perl (the g flag replaces every occurrence, not just the first):
$var =~ s{\b(\d+)(?:st|nd|rd|th)\b}{$1}g;
In PHP:
$var = preg_replace('/\b(\d+)(?:st|nd|rd|th)\b/', '$1', $var);
In .NET:
var = Regex.Replace(var, @"\b(\d+)(?:st|nd|rd|th)\b", "$1");
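For comparison, the capture-group variant used in the snippets above looks like this in Python. This is a sketch under the assumption that Python's `re` module is acceptable, and `ordinals_to_numbers` is a hypothetical name. Instead of a look-behind, the digits are captured and written back via `\1`.

```python
import re

# \b on both sides keeps the match confined to a whole token such as "22nd".
ORDINAL = re.compile(r"\b(\d+)(?:st|nd|rd|th)\b")


def ordinals_to_numbers(text: str) -> str:
    """Replace each ordinal like '22nd' with its bare number (hypothetical helper)."""
    return ORDINAL.sub(r"\1", text)


print(ordinals_to_numbers("1st place, 2nd try, 5thing"))
# -> 1 place, 2 try, 5thing
```

The trailing `\b` is what leaves "5thing" untouched, which the plain look-behind pattern without a boundary would not do.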