How to isolate non-English words separated by spaces in Lua?

Submitted by 自闭症网瘾萝莉.ら on 2019-12-11 01:49:18

Question


I have this string:

"Hello there, this is some line-aa."

How can I split it into an array like this?

Hello
there,
this
is
some
line-aa.

This is what I have tried so far:

function sliceSpaces(arg)
  local list = {}
  for k in arg:gmatch("%w+") do
    print(k)
    table.insert(list, k)
  end
  return list
end

local sentence = "مرحبا يا اخوتي"
print("sliceSpaces")
print(sliceSpaces(sentence))

This code works for English text, but not for Arabic. How can I make it work for Arabic too?


Answer 1:


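Lua strings are sequences of bytes, not Unicode characters. The pattern %w matches alphanumeric characters, but it classifies one byte at a time and, in the default C locale, recognizes only ASCII letters and digits, so the bytes that make up UTF-8 encoded Arabic letters never match.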
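To see the byte-level behavior directly, here is a small sketch. It assumes the source file is saved as UTF-8 and that the program runs under the default C locale (standalone Lua does not change the locale on its own); the variable name `word` is only illustrative.

```lua
-- Assumption: this file is saved as UTF-8 and the C locale is in effect.
local word = "مرحبا"          -- five Arabic letters, two bytes each in UTF-8

print(#word)                  -- length in bytes, not in letters
print(word:byte(1, -1))       -- the raw byte values, all above 127
print(word:match("%w+"))      -- no byte counts as alphanumeric here
```

Under those assumptions the length should come out as 10 rather than 5, and the %w match should find nothing.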

Instead, use %S to match a non-whitespace character:

for k in arg:gmatch("%S+") do
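Dropping that loop into the function from the question gives something like the following minimal sketch. It assumes the input is UTF-8 and that words are separated only by ASCII whitespace (spaces, tabs, newlines); Unicode-only separators such as a no-break space would not be treated as delimiters.

```lua
-- Split a string into whitespace-separated words.
-- %S+ matches a run of non-whitespace bytes, so multi-byte UTF-8
-- letters stay together and punctuation stays attached to its word.
local function sliceSpaces(text)
  local list = {}
  for word in text:gmatch("%S+") do
    list[#list + 1] = word
  end
  return list
end

local arabic = sliceSpaces("مرحبا يا اخوتي")
local english = sliceSpaces("Hello there, this is some line-aa.")

-- Print one word per line; printing the table itself would only
-- show its address.
print(table.concat(arabic, "\n"))
print(table.concat(english, "\n"))
```

Because the pattern only looks for whitespace, the same function handles the English sentence as requested, keeping "there," and "line-aa." intact.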


Source: https://stackoverflow.com/questions/38652593/how-to-isolate-non-english-words-separated-by-spaces-in-lua
