Lua string.gsub with Multiple Patterns

Submitted by 独自空忆成欢 on 2019-12-12 10:49:18

Question


I am working on renaming movie titles that contain unwanted words. string.gsub can replace a match with "" (the empty string), but I have around 200 patterns that need to be replaced with "".

Right now I have to call string.gsub once for every pattern. Is there a way to put all the patterns into a single string.gsub call? I have searched the web for a solution but have not found anything.

A movie title looks like this: B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT. I want to remove the extra words such as 2013, Hindi, 720p, DvDRip, CROPPED, AAC, x264, RickyKT.


Answer 1:


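You can pass a table to string.gsub as the third argument. Each match is looked up as a key in the table; when the value is a string it is used as the replacement, and when the key is absent the match is left unchanged: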

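local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
-- "%S+" matches each whitespace-delimited word; words that are keys in the
-- table are replaced by "", all other words are kept as they are.
movie = movie:gsub("%S+", {["2013"] = "", ["Hindi"] = "", ["720p"] = "",
                       ["DvDRip"] = "", ["CROPPED"] = "", ["AAC"] = "",
                       ["x264"] = "", ["RickyKT"] = ""})

-- The spaces between the removed words are still in the result.
print(movie)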
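
The removed words leave their surrounding spaces behind, so the result above ends in a run of spaces. A minimal sketch of a cleanup step, assuming the goal is a trimmed, single-spaced title (the helper name tidy is hypothetical and not part of the original answer):

```lua
-- Hypothetical helper: collapse whitespace runs and trim both ends.
local function tidy(s)
  s = s:gsub("%s+", " ")   -- collapse each run of whitespace to one space
  s = s:gsub("^%s+", "")   -- drop leading whitespace
  s = s:gsub("%s+$", "")   -- drop trailing whitespace
  return s
end

local unwanted = {["2013"] = "", ["Hindi"] = "", ["720p"] = "", ["DvDRip"] = "",
                  ["CROPPED"] = "", ["AAC"] = "", ["x264"] = "", ["RickyKT"] = ""}

local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
-- The extra parentheses discard gsub's second return value (the match count).
local cleaned = tidy((movie:gsub("%S+", unwanted)))
print(cleaned)  --> B.A.Pass
```

Each assignment of the form s = s:gsub(...) keeps only the first return value, so the match count never leaks into the result.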



Answer 2:


Put all of the patterns in a table and then iterate over the table, calling string.gsub() for each pattern:

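local str = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"

-- Placeholder entries; each one is interpreted as a Lua pattern.
local patterns = {"pattern1", "pattern2", "pattern3"}
for _, pattern in ipairs(patterns) do
    -- Assigning to a single variable keeps only the replaced string.
    str = string.gsub(str, pattern, "")
end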

This still requires one invocation of string.gsub() per pattern, but the code is much more maintainable than writing out a separate string.gsub() call for each one.
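
One caveat with this loop: every entry is a Lua pattern, so characters such as ., [, ( or - carry special meaning. If the entries are meant as literal text, they can be escaped first. A sketch under that assumption; escape_pattern and remove_all are hypothetical helper names, not functions from the Lua standard library:

```lua
-- Hypothetical helper: prefix every punctuation character with "%" so the
-- string is matched literally by Lua's pattern functions.
local function escape_pattern(s)
  -- Parentheses keep only the first return value of gsub.
  return (s:gsub("%p", "%%%0"))
end

-- Hypothetical helper: remove every literal string in the list from str.
local function remove_all(str, literals)
  for _, literal in ipairs(literals) do
    str = str:gsub(escape_pattern(literal), "")
  end
  return str
end

local literals = {"(2018)", "[WEBRip]", "[1080p]", "[YTS.AM]"}
print(escape_pattern("[YTS.AM]"))  --> %[YTS%.AM%]
-- Prints "Anon" followed by the four spaces that separated the removed parts.
print(remove_all("Anon (2018) [WEBRip] [1080p] [YTS.AM]", literals))
```

Without the escaping step, an entry like "[1080p]" would be read as a character set and would delete single characters all over the title.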




Answer 3:


You could wrap the call in a simple function so that you do not have to repeat the code for each string, or call string.gsub directly with the replacement value you need.

Function:

local large_name = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"

local function clean_name(str)
  -- Keep only the first capture: everything before the year token.
  local v = string.gsub(str, "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)", "%1")
  return v
end

print(clean_name(large_name))

Calling string.gsub directly:

local large_name = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
local clean_name = string.gsub(large_name, "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)", "%1")

print(clean_name)

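The pattern captures the movie name as the first value (everything before the first space that is followed by a two- to four-digit number), and the replacement %1 keeps only that capture. The number, optionally wrapped in parentheses or brackets or preceded by an apostrophe, is identified as the second capture and treated as the year; everything after it is the third. Because the title is cut at the year, it is not necessary to list every word that can appear in a movie name, which also avoids many false positives.

I added a test loop to try different movie names: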

local testing = {"Whiplash 2014 [1080p]",
"Anon (2018) [WEBRip] [1080p] [YTS.AM]",
"Maze Runner The Death Cure 2018 [WEBRip] [1080p] [YTS.AM]",
"12 Strong [2018] [WEBRip] [1080p] [YTS.AM]",
"Kingsman The Secret Service (2014) [1080p]",
"The Equalizer [2014] [1080p]",
"Annihilation 2018 [WEBRip] [1080p] [YTS.AM]",
"The Shawshank Redemption '94",
"Assassin's Creed 2016 HC 720p HDRip 850 MB - iExTV",
"Captain Marvel (2019) [WEBRip] [1080p] [YTS.AM]",}

-- ipairs visits the list in order, so the output lines match the input order.
for _, v in ipairs(testing) do
  local result = string.gsub(v, "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)", "%1")
  print(result)
end

Output:

Whiplash
Anon
Maze Runner The Death Cure
12 Strong
Kingsman The Secret Service
The Equalizer
Annihilation
The Shawshank Redemption
Assassin's Creed
Captain Marvel
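
To inspect what each capture holds, the same pattern can be passed to string.match, which returns the captures instead of a replaced string. A small sketch (split_title is a hypothetical helper name, and the last input is a made-up title for illustration). Note that the second capture keeps any surrounding bracket, parenthesis, or apostrophe, and that a title containing its own two- to four-digit number after a space is cut at that number:

```lua
local PATTERN = "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)"

-- Hypothetical helper: returns the title, the year token, and the remaining
-- text, or nil when the pattern does not match.
local function split_title(name)
  return string.match(name, PATTERN)
end

local title, year, rest = split_title("Kingsman The Secret Service (2014) [1080p]")
print(title)  --> Kingsman The Secret Service
print(year)   --> (2014)
print(rest)   --> [1080p]

-- Limitation: a number inside the title is mistaken for the year.
print((split_title("Blade Runner 2049 2017 1080p")))  --> Blade Runner
```

A name without any such number does not match at all, in which case string.match returns nil and string.gsub returns the input unchanged.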



Answer 4:


To avoid writing keys and values in a table for every new entry, I'd write a function that handles a numerically indexed table (the patterns being the values).

This way I don't need to write {["pattern_n"] = ""} for every new pattern.

Example:

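local PATTERNS = {"2013", "Hindi", "720p", "DvDRip", "CROPPED", "AAC", "x264", "RickyKT"}

local function replace(match)
    for _, v in ipairs(PATTERNS) do
        -- Compare literally: v:find(match) would treat the word as a pattern
        -- and would also remove words that are only substrings of an entry.
        if v == match then
            return ""
        end
    end
    -- Returning nil tells gsub to keep the original match.
    return nil
end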


local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
movie = movie:gsub("%S+", replace)

print(movie)
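
The list form can also be converted once into the keyed table that gsub accepts directly, which replaces the scan over the list for every word with a single table lookup. A sketch, assuming exact whole-word matches are wanted; to_removal_table is a hypothetical helper name:

```lua
-- Hypothetical helper: turn {"a", "b"} into {a = "", b = ""} so the result
-- can be passed to gsub as the replacement table.
local function to_removal_table(words)
  local t = {}
  for _, word in ipairs(words) do
    t[word] = ""
  end
  return t
end

local unwanted = to_removal_table({"2013", "Hindi", "720p", "DvDRip",
                                   "CROPPED", "AAC", "x264", "RickyKT"})

local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
local cleaned = movie:gsub("%S+", unwanted)  -- words not in the table are kept
cleaned = cleaned:gsub("%s+$", "")           -- trim the leftover trailing spaces
print(cleaned)  --> B.A.Pass
```

New entries then only need to be appended to the list, and the table is rebuilt from it.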


Source: https://stackoverflow.com/questions/25280374/lua-string-gsub-with-multiple-patterns
