I am using Hotaru CMS with the Image Upload plugin, I get this error if I try to attach an image to a post, otherwise there is no error:
unserialize()
The corruption in this question is isolated to a single substring at the end of the serialized string with was probably manually replaced by someone who lazily wanted to update the image
filename. This fact will be apparent in my demonstration link below using the OP's posted data -- in short, C:fakepath100.jpg
does not have a length of 19
, it should be 17
.
Since the serialized string corruption is limited to an incorrect byte/character count number, the following will do a fine job of updating the corrupted string with the correct byte count value.
It looks like many of the earlier posts are just copy-pasting a regex pattern from someone else. There is no reason to capture the potentially corrupted byte count number if it isn't going to be used in the replacement. Also, adding the s
pattern modifier is a reasonable inclusion in case a string value contains newlines/line returns.
*For those that are not aware of the treatment of multibyte characters with serializing, you must not use mb_strlen()
in the custom callback because it is the byte count that is stored not the character count, see my output...
Code: (Demo with OP's data) (Demo with arbitrary sample data) (Demo with condition replacing)
$corrupted = <<
Output:
a:4:{i:0;s:3:"three";i:1;s:5:"five";i:2;s:2:"newline1
Newline2";i:3;s:6:"garçon";}
a:4:{i:0;s:5:"three";i:1;s:4:"five";i:2;s:17:"newline1
Newline2";i:3;s:7:"garçon";}
---
array (
0 => 'three',
1 => 'five',
2 => 'newline1
Newline2',
3 => 'garçon',
)
One leg down the rabbit hole... The above works fine even if double quotes occur in a string value, but if a string value contains ";
or some other monkeywrenching sbustring, you'll need to go a little further and implement "lookarounds". My new pattern
checks that the leading s
is:
;
and checks that the ";
is:
}
ors:
or i:
I haven't test each and every possibility; in fact, I am relatively unfamiliar with all of the possibilities in a serialized string because I never elect to work with serialized data -- always json in modern applications. If there are additional possible leading or trailing characters, leave a comment and I'll extend the lookarounds.
Extended snippet: (Demo)
$corrupted_byte_counts = <<
Output:
corrupted serialized array:
a:12:{i:0;s:3:"three";i:1;s:5:"five";i:2;s:2:"newline1
newline2";i:3;s:6:"garçon";i:4;s:111:"double " quote \"escaped";i:5;s:1:"a,comma";i:6;s:9:"a:colon";i:7;s:0:"single 'quote";i:8;s:999:"semi;colon";s:5:"assoc";s:3:"yes";i:9;s:1:"monkey";wrenching doublequote-semicolon";s:3:"s:";s:9:"val s: val";}
---
repaired serialized array:
a:12:{i:0;s:5:"three";i:1;s:4:"five";i:2;s:17:"newline1
newline2";i:3;s:7:"garçon";i:4;s:24:"double " quote \"escaped";i:5;s:7:"a,comma";i:6;s:7:"a:colon";i:7;s:13:"single 'quote";i:8;s:10:"semi;colon";s:5:"assoc";s:3:"yes";i:9;s:39:"monkey";wrenching doublequote-semicolon";s:2:"s:";s:10:"val s: val";}
---
Array
(
[0] => three
[1] => five
[2] => newline1
newline2
[3] => garçon
[4] => double " quote \"escaped
[5] => a,comma
[6] => a:colon
[7] => single 'quote
[8] => semi;colon
[assoc] => yes
[9] => monkey";wrenching doublequote-semicolon
[s:] => val s: val
)