Question
This function successfully extracts single lines from a string, but it fails when the text contains Polish special characters:
DELIMITER $$
DROP FUNCTION SPLIT_STR $$
CREATE FUNCTION SPLIT_STR(x VARCHAR(1500) CHARSET utf8 COLLATE utf8_unicode_ci, delim VARCHAR(12) CHARSET utf8 COLLATE utf8_unicode_ci, pos INTEGER)
RETURNS VARCHAR(500) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
DECLARE output VARCHAR(1500) CHARSET utf8 COLLATE utf8_unicode_ci;
SET output = REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos)
, LENGTH(SUBSTRING_INDEX(x, delim, pos - 1)) + 1)
, delim
, '');
RETURN output;
END $$
As you can see, I am setting the charset and collation manually (the same ones the whole database uses). I have also tried without the charset and collation settings, and it doesn't work either.
Input to reproduce (this is how it is stored in the DB, as a single field):
śńąśąńśąńśąńóńśńąśąńśąńśąńóń
śńąśąńśąńśąńóń
sas
When I run
SELECT
SPLIT_STR(slides.content1, '\n', 1),
SPLIT_STR(slides.content1, '\n', 2),
SPLIT_STR(slides.content1, '\n', 3),
I only get the first line (the other two fields are empty):
śńąśąńśąńśąńóńśńąśąńśąńśąńóń
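Answer 1:
CHAR_LENGTH() returns the length in characters, while LENGTH() returns the length in bytes. Use CHAR_LENGTH() whenever you mean the length in characters, especially with multi-byte character sets, where the two functions can return different results. SUBSTRING() takes its start position in characters, and in utf8 each Polish letter such as ś or ą occupies two bytes, so an offset computed with LENGTH() points past the end of the string and SUBSTRING() returns an empty string.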
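A quick way to see the difference is to compare both functions on a short literal. This is only a sketch, and it assumes the client connection uses a UTF-8 character set (utf8 or utf8mb4); the column aliases are made up for illustration.

```sql
-- Assumption: the connection character set is utf8 or utf8mb4.
-- Each of the three letters takes two bytes in UTF-8.
SELECT LENGTH('śńą')      AS byte_len,   -- length in bytes
       CHAR_LENGTH('śńą') AS char_len;   -- length in characters
```

Under that assumption byte_len should be 6 and char_len should be 3. For pure ASCII text such as 'sas' both values are the same, which is why the function appears to work until accented letters show up.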
Replacing LENGTH() with CHAR_LENGTH() in your function will likely fix the issue.
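For reference, here is a sketch of the function from the question with that single substitution applied. The IF EXISTS clause, the DETERMINISTIC attribute, and the closing DELIMITER ; are additions that were not in the original and may not be needed in every setup; the parameter and return types are copied from the question unchanged.

```sql
DELIMITER $$
-- IF EXISTS is an addition: it avoids an error when the function is not defined yet.
DROP FUNCTION IF EXISTS SPLIT_STR $$
CREATE FUNCTION SPLIT_STR(
  x VARCHAR(1500) CHARSET utf8 COLLATE utf8_unicode_ci,
  delim VARCHAR(12) CHARSET utf8 COLLATE utf8_unicode_ci,
  pos INTEGER
)
RETURNS VARCHAR(500) CHARSET utf8 COLLATE utf8_unicode_ci
DETERMINISTIC  -- addition: the result depends only on the arguments
BEGIN
  -- Take everything up to the pos-th delimiter, then skip the part that
  -- belongs to the first pos - 1 fields. The offset is counted in
  -- characters (CHAR_LENGTH), matching how SUBSTRING() counts positions.
  RETURN REPLACE(
    SUBSTRING(SUBSTRING_INDEX(x, delim, pos),
              CHAR_LENGTH(SUBSTRING_INDEX(x, delim, pos - 1)) + 1),
    delim,
    '');
END $$
DELIMITER ;
```

With the sample data from the question, calling SPLIT_STR(slides.content1, '\n', 2) should then return the second line and SPLIT_STR(slides.content1, '\n', 3) should return sas, since the offsets 29 and 44 are now counted in characters rather than bytes.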
Source: https://stackoverflow.com/questions/28816726/mysql-function-to-split-strings-by-delimiter-doenst-work-with-polish-special-ch