MySQL regex: get the position of the first matched alphabetic character

Submitted by て烟熏妆下的殇ゞ on 2019-12-11 14:54:53

Question


I have a MySQL query that uses REGEXP to check whether a field starts with 'A', 'An', or 'The' followed by a space; if it does, I trim the field up to the first space. Otherwise, I check whether the field starts with a special character (", ', or [:space:]); if it does, I want to trim all the leading special characters. The query uses a CASE expression like this:

CASE
  WHEN field_data_field_display_title_field_display_title_value REGEXP '(^(A|An|The)[[:space:]])' = 1 THEN
  TRIM(SUBSTR(field_data_field_display_title_field_display_title_value , INSTR(field_data_field_display_title_field_display_title_value ,' ')))
  WHEN field_data_field_display_title_field_display_title_value REGEXP '(^[\"\'[:space:]])' = 1 THEN
    TRIM(SUBSTR(field_data_field_display_title_field_display_title_value ,2))
  ELSE field_data_field_display_title_field_display_title_value
END

I can trim the first leading special character by passing 2 to the SUBSTR function, but I am not able to trim all of them. MySQL doesn't support capturing groups, so I can't retrieve the matched value from a captured group.

So my question is: how can I get the position of the first alphabetic character in the field, so that I can pass that position to SUBSTR and trim all the leading special characters? I tried the [:alpha:] class like this:

TRIM(SUBSTR(field_data_field_display_title_field_display_title_value ,
 INSTR(field_data_field_display_title_field_display_title_value ,[:alpha:])))

but it gives a MySQL syntax error. Alternatively, can anybody suggest another approach to trim all the leading special characters?

Thanks in Advance!


Answer 1:


In MySQL versions before 8.0, there's no regexp match function that reports the position in the string, nor is there any regexp replace function.
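On MySQL 8.0.4 and later the situation is different: those versions add REGEXP_INSTR() and REGEXP_REPLACE(). A minimal sketch, assuming such a server and using a hypothetical temporary table `title_demo` with a `title` column in place of the real field, shows how the position of the first alphabetic character could feed SUBSTR:

```sql
-- Assumes MySQL 8.0.4+; title_demo and title are hypothetical names.
CREATE TEMPORARY TABLE title_demo (title VARCHAR(255));
INSERT INTO title_demo (title) VALUES
  ('  "''Hello world'),
  ('Plain title'),
  ('12345');

SELECT
  title,
  -- 1-based position of the first alphabetic character; 0 when there is none
  REGEXP_INSTR(title, '[[:alpha:]]') AS first_alpha_pos,
  -- GREATEST(..., 1) keeps the whole value when no alphabetic character exists,
  -- because SUBSTR with position 0 would return an empty string
  SUBSTR(title, GREATEST(REGEXP_INSTR(title, '[[:alpha:]]'), 1)) AS trimmed_by_position,
  -- alternative: remove the leading run of quotes and whitespace directly
  REGEXP_REPLACE(title, '^["''[:space:]]+', '') AS trimmed_by_replace
FROM title_demo;
```

The two approaches differ on purpose: the position-based column skips everything before the first letter (including digits), while the replace-based column removes only quotes and whitespace. Which one fits depends on what counts as a "special character" in the data.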

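If you know you're searching for a short list of specific words, you could pick the least position among several matches. INSTR returns 0 when the substring is absent, so map 0 to a position past the end of the string before taking the minimum (if none of the words occurs, the result is an empty string):

SUBSTRING(field_data_field_display_title_field_display_title_value,
  LEAST(
    -- NULLIF turns "not found" (0) into NULL; COALESCE then substitutes a position past the end
    COALESCE(NULLIF(INSTR(field_data_field_display_title_field_display_title_value, 'A '), 0), CHAR_LENGTH(field_data_field_display_title_field_display_title_value) + 1),
    COALESCE(NULLIF(INSTR(field_data_field_display_title_field_display_title_value, 'An '), 0), CHAR_LENGTH(field_data_field_display_title_field_display_title_value) + 1),
    COALESCE(NULLIF(INSTR(field_data_field_display_title_field_display_title_value, 'The '), 0), CHAR_LENGTH(field_data_field_display_title_field_display_title_value) + 1)
  )
)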

It's usually awkward to do substring matches or replaces in SQL, because SQL is fundamentally designed to treat a column as an irreducible piece of data. Any functions to work with substrings are extensions to the language, not something built-in.

If you want better handling by string functions, it would be easier to fetch the whole string into an application and write code using a richer set of functions. I understand this is not practical, though, if the substring manipulation you describe is needed in expressions that affect query results, such as a WHERE clause that restricts rows or an ORDER BY clause that sorts them.

If so, then the better solution is to change the way you store strings. Split each string into a prefix portion with the special characters, a separate column for the portion starting with A, An, or The, and perhaps even a third column with trailing text that you don't want to be part of the main text.

The advantage of splitting it up is that SQL expressions to work on the main string are much simpler, and you can even index it normally to gain a lot of performance for certain queries.
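As a rough sketch of that layout, the following uses hypothetical table and column names (`display_title_parts`, `title_prefix`, `title_main`) that are not part of the original schema, and assumes the application splits each title when it writes the row. Here the prefix also holds the leading article, which is an assumption that goes one step beyond the split described above:

```sql
-- Hypothetical schema: the prefix (special characters and, by assumption,
-- a leading article) is stored apart from the text that is used for sorting.
CREATE TABLE display_title_parts (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title_prefix VARCHAR(32) NOT NULL DEFAULT '',
  title_main VARCHAR(255) NOT NULL,
  KEY idx_title_main (title_main)  -- ordinary index on the sortable part
);

INSERT INTO display_title_parts (title_prefix, title_main) VALUES
  ('The ', 'Raven'),
  ('"', 'Hello" and other greetings'),
  ('', 'Zebra crossing');

-- Rebuild the display value for output, but sort on the plain column,
-- so no CASE or REGEXP expression is needed in ORDER BY.
SELECT CONCAT(title_prefix, title_main) AS display_title
FROM display_title_parts
ORDER BY title_main ASC;
```

Because ORDER BY refers to a plain column, the optimizer can use `idx_title_main` for the sort, which it cannot do with an expression computed per row.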




Answer 2:


I was using the MySQL snippet from my question in an ORDER BY clause to sort the data. Since I only had a small list of matches to remove, I followed @BillKarwin's suggestion. The ORDER BY clause in the query became something like:

ORDER BY 
  CASE
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^(A|An|The)[[:space:]]' = 1 THEN
      TRIM(SUBSTR(field_data_field_display_title_field_display_title_value , INSTR(field_data_field_display_title_field_display_title_value ,' ')))
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^[\']' = 1 THEN
      TRIM(LEADING '\'' FROM field_data_field_display_title_field_display_title_value)
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^[[:space:]]' = 1 THEN
      TRIM(LEADING ' ' FROM field_data_field_display_title_field_display_title_value)
    WHEN field_data_field_display_title_field_display_title_value REGEXP '^[\"]' = 1 THEN
      TRIM(LEADING '"' FROM field_data_field_display_title_field_display_title_value)
    ELSE field_data_field_display_title_field_display_title_value
  END ASC
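
Each row takes only one branch of this CASE, so a value that starts with a mix of characters, such as spaces followed by a quote, has only the first kind removed. One possible variation, sketched here against a hypothetical `sort_demo` table rather than the real field, nests TRIM(LEADING ...) calls so that leading spaces, then single quotes, then double quotes are all stripped before sorting. It handles only that fixed order and leaves the articles alone:

```sql
-- sort_demo and title are hypothetical names used only for illustration.
CREATE TEMPORARY TABLE sort_demo (title VARCHAR(255));
INSERT INTO sort_demo (title) VALUES
  ('  ''"Zebra"'),
  ('"Apple"'),
  ('Mango');

SELECT title
FROM sort_demo
ORDER BY
  -- innermost call runs first: spaces, then single quotes, then double quotes
  TRIM(LEADING '"' FROM
    TRIM(LEADING '''' FROM
      TRIM(LEADING ' ' FROM title))) ASC;
```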


Source: https://stackoverflow.com/questions/24848117/mysql-regex-get-position-of-matched-first-alphabetic-character
