I have a markup language which is similar to Markdown and the one used by SO.
The legacy parser was based on regexes and was a complete nightmare to maintain, so I've come up w
If one thing includes another, then normally you treat them as separate tokens and then nest them in the grammar. Lepl (http://www.acooke.org/lepl, which I wrote) and PyParsing (which is probably the most popular pure-Python parser) both allow you to nest things recursively.
So in Lepl you could write code something like:
from lepl import *

# these are tokens (defined as regexps)
stg_marker = Token(r'\*\*')
emp_marker = Token(r'\*') # tokens are longest match, so strong is preferred if possible
spo_marker = Token(r'%%')
....
# grammar rules combine tokens
contents = Delayed() # this will be defined later and lets us recurse
strong = stg_marker + contents + stg_marker
emphasis = emp_marker + contents + emp_marker
spoiler = spo_marker + contents + spo_marker
other_stuff = .....
contents += strong | emphasis | spoiler | other_stuff # this defines contents recursively
Then you can see, I hope, how contents will match nested use of strong, emphasis, etc.
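Once the elided parts are filled in (the remaining tokens and other_stuff, which would normally include a plain-text token), running the grammar is just a call to parse() on the top-level matcher. Roughly like this, as an untested sketch with an example input:

# assuming other_stuff has been completed with a plain-text token
ast = contents.parse('**strong with %%a spoiler%% inside**')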
There's much more than this to do for your final solution, and efficiency could be an issue in any pure-Python parser. (There are some parsers that are implemented in C but callable from Python; these will be faster, but may be trickier to use. I can't recommend any because I haven't used them.)
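For comparison, the same recursive trick in PyParsing would look roughly like this (a sketch I haven't tested; Forward plays the role of Delayed, and CharsNotIn is just a stand-in for the other content):

import pyparsing as pp

contents = pp.Forward()                          # defined later, so we can recurse
strong   = pp.Literal('**') + contents + pp.Literal('**')
emphasis = pp.Literal('*') + contents + pp.Literal('*')
spoiler  = pp.Literal('%%') + contents + pp.Literal('%%')
text     = pp.CharsNotIn('*%')                   # stand-in for the other stuff
contents <<= strong | emphasis | spoiler | text  # strong is tried before emphasis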