I checked the Stack Exchange description, and algorithm questions are one of the allowed topics. So here goes.
Given an input of a range, where begin and ending number
You cannot cover your requirement with character groups only. Imagine the range 129-131. The pattern 1[2-3][1-9] would also match 139, which is out of range.
So in this example the last digit has to depend on the digit before it, for instance 1(29|3[01]). The same effect shows up for the tens and hundreds, which leads to the problem that a pattern which basically represents each valid number as a fixed sequence of digits is the only solution that always works (unless you want an algorithm that tracks overflows in order to decide whether it should use [2-8] or (8|9|0|1|2)).
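To see the problem concretely, a candidate pattern can be checked against the intended range by brute force. This is only a sketch: the helper name check_pattern and the upper bound of 1000 are assumptions made for the illustration, not part of the original answer.

```python
import re

def check_pattern(pattern, start, end, limit=1000):
    """Compare a candidate regex against the intended range by brute force."""
    regex = re.compile(pattern)
    false_positives = []  # matched, but outside start..end
    false_negatives = []  # inside start..end, but not matched
    for n in range(limit):
        matched = regex.fullmatch(str(n)) is not None
        wanted = start <= n <= end
        if matched and not wanted:
            false_positives.append(n)
        elif wanted and not matched:
            false_negatives.append(n)
    return false_positives, false_negatives

extra, missing = check_pattern("1[2-3][1-9]", 129, 131)
print(139 in extra)  # True: 139 matches although it is out of range
print(missing)       # [130]: the class [1-9] cannot match the trailing 0
```

The same helper is handy for testing any generated pattern before you rely on it.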
If you autogenerate the pattern anyway, keep it simple: 128-132 can be written as follows (I left out the non-capturing group prefix ?: for better readability):
(128|129|130|131|132)
The algorithm should be obvious: a for loop, an array, string concatenation and a join.
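A minimal sketch of that simple generator could look like this. The function name range_to_alternation is made up for the example, and unlike the pattern shown above it does include the ?: prefix.

```python
import re

def range_to_alternation(start, end):
    """Spell out every number in start..end as one alternation."""
    numbers = [str(n) for n in range(start, end + 1)]
    return "(?:" + "|".join(numbers) + ")"

pattern = range_to_alternation(128, 132)
print(pattern)  # (?:128|129|130|131|132)
print(re.fullmatch(pattern, "130") is not None)  # True
print(re.fullmatch(pattern, "133") is not None)  # False
```

Use fullmatch (or add anchors) so that a longer number such as 1285 does not count as a hit.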
That would already work as expected, but you can also perform some "optimization" on this if you like it more compact:
(128|129|130|131|132) <=>
1(28|29|30|31|32) <=>
1(2(8|9)|3(0|1|2))
Optimized further:
1(2([8-9])|3([0-2]))
Algorithms for the last steps are out there, look for factorization. An easy way would be to push all the numbers to a tree, depending on the character position:
1
  2
    8
    9
  3
    0
    1
    2
Finally, iterate over the tree and form the pattern 1(2(8|9)|3(0|1|2)). As a last step, replace any alternation of consecutive single digits, i.e. anything of the form (a|(b|)*?c), with [a-c].
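The tree approach can be sketched roughly as follows. It is one possible implementation under a few assumptions: the names build_trie, to_pattern and char_class are invented here, nested dicts stand in for the tree, and the character-class replacement is done while emitting the pattern instead of as a separate pass.

```python
import re

END = ""  # key marking that a number ends at this node

def build_trie(start, end):
    """Push every number in start..end into a tree, one digit per level."""
    root = {}
    for n in range(start, end + 1):
        node = root
        for digit in str(n):
            node = node.setdefault(digit, {})
        node[END] = {}
    return root

def char_class(digits):
    """Turn sorted single digits into d, [a-c] or [abc]."""
    if len(digits) == 1:
        return digits[0]
    if ord(digits[-1]) - ord(digits[0]) == len(digits) - 1:
        return "[" + digits[0] + "-" + digits[-1] + "]"
    return "[" + "".join(digits) + "]"

def to_pattern(node):
    """Walk the tree and emit the factored pattern for this subtree."""
    children = sorted(d for d in node if d != END)
    if not children:
        return ""
    parts = [d + to_pattern(node[d]) for d in children]
    if all(len(p) == 1 for p in parts):
        body = char_class(parts)       # only leaf digits below this node
    elif len(parts) == 1:
        body = parts[0]                # single branch, no group needed
    else:
        body = "(?:" + "|".join(parts) + ")"
    if END in node:
        body = "(?:" + body + ")?"     # a shorter number may stop here
    return body

print(to_pattern(build_trie(128, 132)))  # 1(?:2[8-9]|3[0-2])
```

The sketch only factors common prefixes. Merging branches that share the same suffix is a separate step.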
The same goes for 11-29:
11-29 <=>
(11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29) <=>
(1(1|2|3|4|5|6|7|8|9)|2(0|1|2|3|4|5|6|7|8|9)) <=>
(1[1-9]|2[0-9])
Here the two branches differ (20 is in range, 10 is not), so this is as far as it goes. When the branches share the same suffix, as for 10-29, you can proceed with the factorization:
(1[0-9]|2[0-9]) <=>
(1|2)[0-9] <=>
[1-2][0-9]
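Factoring out a shared suffix can be sketched like this. It is a rough illustration rather than a complete optimizer: the function name merge_equal_suffixes and the input format (pairs of leading digit and suffix pattern) are assumptions made for the example, and only branches whose suffix patterns are textually identical get merged.

```python
from collections import defaultdict

def merge_equal_suffixes(branches):
    """Merge (leading_digit, suffix_pattern) branches that share a suffix."""
    by_suffix = defaultdict(list)
    for digit, suffix in branches:
        by_suffix[suffix].append(digit)

    merged = []
    for suffix, digits in by_suffix.items():
        digits.sort()
        if len(digits) == 1:
            head = digits[0]
        elif ord(digits[-1]) - ord(digits[0]) == len(digits) - 1:
            head = "[" + digits[0] + "-" + digits[-1] + "]"  # consecutive run
        else:
            head = "[" + "".join(digits) + "]"
        merged.append(head + suffix)

    if len(merged) == 1:
        return merged[0]
    return "(?:" + "|".join(merged) + ")"

# 10-29: both branches end in [0-9], so they collapse into one.
print(merge_equal_suffixes([("1", "[0-9]"), ("2", "[0-9]")]))  # [1-2][0-9]
```

Branches with different suffixes are left as separate alternatives, so the result never matches more than the input did.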