Matching TV and Movie File names with Regex

♀尐吖头ヾ submitted on 2019-12-02 11:09:46

I made some modifications to your regex, and it seems to work, if I understood you correctly.

^(
  (?P<ShowNameA>.*[^ (_.]) # Show name
    [ (_.]+
    ( # Year with possible Season and Episode
      (?P<ShowYearA>\d{4})
      ([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?
    | # Season and Episode only
      (?<!\d{4}[ (_.])
      S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
    | # Alternate format for episode
      (?P<EpisodeC>\d{3})
    )
|
  # Show name with no other information
  (?P<ShowNameB>.+)
)

See demo on regex101
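
To try the pattern outside regex101, here is a minimal Python sketch. It assumes Python's `re` module with `re.VERBOSE`, so the layout whitespace and inline comments are ignored. The `parse` helper and the sample names are made up for illustration and are not taken from the question.

```python
import re

# The pattern from this answer, compiled in verbose mode so that the
# indentation and "# ..." comments are ignored by the engine.
PATTERN = re.compile(r"""
^(
  (?P<ShowNameA>.*[^ (_.]) # Show name
    [ (_.]+
    ( # Year with possible Season and Episode
      (?P<ShowYearA>\d{4})
      ([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?
    | # Season and Episode only
      (?<!\d{4}[ (_.])
      S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
    | # Alternate format for episode
      (?P<EpisodeC>\d{3})
    )
|
  # Show name with no other information
  (?P<ShowNameB>.+)
)
""", re.VERBOSE)


def parse(filename):
    """Hypothetical helper: return only the named groups that matched."""
    match = PATTERN.match(filename)
    if match is None:
        return {}
    return {k: v for k, v in match.groupdict().items() if v is not None}


# Illustrative names only (assumed, not from the question).
for name in ["Show.Name.2011.S01E02", "Show.Name.S01E02",
             "Show Name 102", "Show Name 2011", "Show Name"]:
    print(name, "->", parse(name))
```

Each layout fills a different set of groups (the A, B, or C suffix), so whatever consumes the result has to check which variant matched.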

EDIT: I've updated the regex to handle those last 3 situations you mentioned in comments.

One main problem was that you had no parentheses around the main alternation, so the | split the entire regex (anchor included) into two independent branches. I also had to add an alternative for names that are followed by none of the year/episode formats.
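
The precedence issue is easier to see on a toy pattern. This is only a sketch in Python; the words `cat` and `dog` are placeholders, not part of your regex.

```python
import re

# "|" has the lowest precedence, so this reads as (^cat) OR (dog):
# the anchor only applies to the first branch.
ungrouped = re.compile(r"^cat|dog")

# With parentheses, the anchor applies to both branches.
grouped = re.compile(r"^(cat|dog)")

print(ungrouped.search("hotdog"))  # matches "dog" in the middle of the string
print(grouped.search("hotdog"))    # None: "dog" is not at the start
```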

Because you have so many possible layouts that can conflict with each other, the regex ended up as a series of alternatives, one per scenario. For example, to match a title that has no year or episode information at all, I had to wrap the whole regex in an alternation whose last branch simply matches the entire line when no known pattern is found.

Note: now that you seem to have expanded show years to match any four digits, there's no need for the lookahead. In other words, (?=\d{4,4})(?P<ShowYear>\d{4}) is the same as (?P<ShowYear>\d{4}). This also means that your alternate format for episode must match 3 digits only, not 4. Otherwise, there's no way to tell whether a stand-alone 4-digit sequence is a year or an episode.
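
A quick way to convince yourself that the lookahead is redundant is to run both forms over the same inputs. A small Python sketch, where the `year` helper and the sample strings are assumptions for illustration:

```python
import re

with_lookahead = re.compile(r"(?=\d{4,4})(?P<ShowYear>\d{4})")
plain = re.compile(r"(?P<ShowYear>\d{4})")


def year(pattern, text):
    """Hypothetical helper: first ShowYear found in text, or None."""
    m = pattern.search(text)
    return m.group("ShowYear") if m else None


samples = ["Show Name 2011", "Show Name 102", "Show.Name.20113", "Show Name"]
for text in samples:
    # Both patterns agree on every sample.
    assert year(with_lookahead, text) == year(plain, text)
    print(text, "->", year(plain, text))
```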

General pattern:

[ (_.]+                   the delimiter used throughout
(?P<ShowNameA>.*[^ (_.])  the show name, greedy but not including a delimiter
(?P<ShowNameB>.+)         the show name when it's the whole line
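
To see how the greedy name and the delimiter run interact, here is a simplified fragment in Python. It is a sketch only: it keeps just the name, the delimiter, and a year, and the sample strings are assumed.

```python
import re

# Simplified fragment (assumption): show name, delimiter run, then a year.
fragment = re.compile(r"^(?P<ShowNameA>.*[^ (_.])[ (_.]+(?P<ShowYearA>\d{4})")

# The name stops at its last non-delimiter character; "_(" is consumed
# by the delimiter run.
m = fragment.match("Show_Name_(2011)")
print(m.group("ShowNameA"), m.group("ShowYearA"))  # Show_Name 2011

# Greedy: with two candidate years, the name grabs as much as it can.
m = fragment.match("Show.2010.2011")
print(m.group("ShowNameA"), m.group("ShowYearA"))  # Show.2010 2011
```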

Format A (Year with possible Season and Episode):

(?P<ShowYearA>\d{4})
([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?

Format B (Season and Episode only):

(?<!\d{4}[ (_.])
S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
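
The negative lookbehind is there so that Format B doesn't swallow a year into the show name. A Python sketch with two simplified fragments (name, delimiter, and SxxEyy only, which is an assumption made to isolate the effect):

```python
import re

with_lookbehind = re.compile(
    r"^(?P<ShowNameA>.*[^ (_.])[ (_.]+"
    r"(?<!\d{4}[ (_.])S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})")
without_lookbehind = re.compile(
    r"^(?P<ShowNameA>.*[^ (_.])[ (_.]+"
    r"S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})")

text = "Show.Name.2011.S01E02"
# Without the lookbehind, the year ends up inside the name.
print(without_lookbehind.match(text).group("ShowNameA"))  # Show.Name.2011
# With it, Format B refuses to match, leaving this layout to Format A.
print(with_lookbehind.match(text))                        # None
# A name with no year is still handled by Format B.
print(with_lookbehind.match("Show.Name.S01E02").group("ShowNameA"))  # Show.Name
```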

Format C (Alternate format for episode):

(?P<EpisodeC>\d{3})
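
One thing to watch with Format C: nothing has to follow the three digits, so a trailing tag that starts with three digits can be read as an episode. A Python sketch using a simplified fragment (name, delimiter, and Format C only, with made-up inputs):

```python
import re

# Simplified fragment (assumption): name + delimiter run + 3-digit episode.
format_c = re.compile(r"^(?P<ShowNameA>.*[^ (_.])[ (_.]+(?P<EpisodeC>\d{3})")

for text in ["Show Name 102", "Show.Name.S01E02.720p"]:
    m = format_c.match(text)
    print(text, "->", m.group("ShowNameA"), "|", m.group("EpisodeC"))
```

The second input comes back with "720" as the episode, because the greedy name tries the rightmost delimiter first. If your files carry resolution tags like that, you would probably want to drop or tighten this alternative.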

If I may, I adapted Brian's regex to match something like:

SHOW.NAME.201X.SXXEXX.XSUB.VOSTFR.720p.HDTV.x264-ADDiCTiON.mkv

Here it is (PHP PCRE):

/^(
    (?P<ShowNameA>.*[^ (_.]) # Show name
        [ (_.]+
        ( # Year with possible Season and Episode
            (?P<ShowYearA>\d{4})
            ([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?
        | # Season and Episode only
            (?<!\d{4}[ (_.])
            S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
        )
|
        # Show name with no other information
        (?P<ShowNameB>.+)
)/mx
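
For a quick check without PHP, the same pattern can be ported to Python. This is a sketch under a few assumptions: `re.MULTILINE | re.VERBOSE` stand in for the /mx flags, and the concrete filename is a made-up instance of the SHOW.NAME.201X.SXXEXX layout.

```python
import re

# Python port of the PHP pattern above (flags: m -> MULTILINE, x -> VERBOSE).
ADAPTED = re.compile(r"""
^(
    (?P<ShowNameA>.*[^ (_.]) # Show name
        [ (_.]+
        ( # Year with possible Season and Episode
            (?P<ShowYearA>\d{4})
            ([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?
        | # Season and Episode only
            (?<!\d{4}[ (_.])
            S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
        )
|
        # Show name with no other information
        (?P<ShowNameB>.+)
)
""", re.MULTILINE | re.VERBOSE)

# Hypothetical filename following the layout described above.
filename = "SHOW.NAME.2013.S01E02.XSUB.VOSTFR.720p.HDTV.x264-ADDiCTiON.mkv"
m = ADAPTED.match(filename)
print(m.group("ShowNameA"), m.group("ShowYearA"),
      m.group("SeasonA"), m.group("EpisodeA"))  # SHOW.NAME 2013 01 02
```

With the 3-digit episode alternative left out, "720p" is no longer picked up as an episode. A tag that starts with four digits, such as "1080p", would still be read as a year, since nothing constrains what follows \d{4}.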