Regex/Code to Fix Corrupt Serialized PHP Data

How to repair a serialized string that has been corrupted due to a removed slash before a single quote?

After further research I found a workaround. According to this blog post:

"It turns out that if there's a ", ', :, or ; in any of the array
values the serialization gets corrupted."

If I were working on a site that had not yet gone live, one way to prevent the problem would have been to base64_encode my serialized data before storing it in the database, like so:

base64_encode( serialize( $my_data ) );

And then:

unserialize( base64_decode( $encoded_serialized_string ) );

when retrieving the data.
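
As a rough illustration of that round trip, here is a minimal sketch. The sample array is made up for the example, and the database is left out entirely; the point is only that the encoded string contains none of the characters that tend to get mangled.

```php
<?php
// Hypothetical sample data containing quotes, a colon, a semicolon and backslashes.
$my_data = [
    'title' => 'It\'s a "quoted"; value: yes',
    'image' => 'C:\fakepath\100.jpg',
];

// Encode before storing: the result only uses A-Z, a-z, 0-9, +, / and =.
$encoded_serialized_string = base64_encode(serialize($my_data));

// Decode after retrieving, then unserialize as usual.
$restored = unserialize(base64_decode($encoded_serialized_string));

var_dump($restored === $my_data);
```

The trade-off is that base64 makes the stored value roughly a third larger and no longer readable or searchable in the database.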

However, since I cannot change what is already stored in the database, this very helpful post (the original is no longer available) provides a solution that works around the problem:

$fixed_serialized_data = preg_replace_callback( '!s:(\d+):"(.*?)";!', function( $match ) {
    return ( $match[1] == strlen( $match[2] ) ) ? $match[0] : 's:' . strlen( $match[2] ) . ':"' . $match[2] . '";';
}, $my_data );

$result = unserialize( $fixed_serialized_data );
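
To see the repair in action, the sketch below runs the same callback on a small corrupted string. The input is invented for illustration: it assumes a path that lost its backslashes after serialization, so the stored count (19) no longer matches the 17 bytes that remain.

```php
<?php
// Hypothetical corrupted input: the count still says 19, but only 17 bytes are left.
$my_data = 'a:1:{s:5:"image";s:19:"C:fakepath100.jpg";}';

// The stale length makes unserialize() fail (the notice is suppressed here).
var_dump(@unserialize($my_data));

$fixed_serialized_data = preg_replace_callback('!s:(\d+):"(.*?)";!', function ($match) {
    // Keep tokens whose count is already right, rebuild the others.
    return ($match[1] == strlen($match[2]))
        ? $match[0]
        : 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
}, $my_data);

echo $fixed_serialized_data, PHP_EOL;
$result = unserialize($fixed_serialized_data);
var_dump($result);
```

One caveat worth keeping in mind: the lazy pattern stops at the first `";` it meets, so a value that itself contains `";` will be cut short and given the wrong length.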

How to repair a serialized string which has been corrupted by an incorrect byte count length?

unserialize() [function.unserialize]: Error at offset is caused by invalid serialized data, in this case an incorrect length.

Quick Fix

What you can do is recalculate the length of each string element in the serialized array.

Your current serialized data

$data = 'a:10:{s:16:"submit_editorial";b:0;s:15:"submit_orig_url";s:13:"www.bbc.co.uk";s:12:"submit_title";s:14:"No title found";s:14:"submit_content";s:12:"dnfsdkfjdfdf";s:15:"submit_category";i:2;s:11:"submit_tags";s:3:"bbc";s:9:"submit_id";b:0;s:16:"submit_subscribe";i:0;s:15:"submit_comments";s:4:"open";s:5:"image";s:19:"C:fakepath100.jpg";}';

Example without recalculation

var_dump(unserialize($data));

Output

Notice: unserialize() [function.unserialize]: Error at offset 337 of 338 bytes

Recalculating

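// The /e modifier was removed in PHP 7, so recalculate the lengths in a callback instead.
$data = preg_replace_callback('!s:(\d+):"(.*?)";!', function ($m) { return 's:' . strlen($m[2]) . ':"' . $m[2] . '";'; }, $data);
var_dump(unserialize($data));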

Output

array
'submit_editorial' => boolean false
'submit_orig_url' => string 'www.bbc.co.uk' (length=13)
'submit_title' => string 'No title found' (length=14)
'submit_content' => string 'dnfsdkfjdfdf' (length=12)
'submit_category' => int 2
'submit_tags' => string 'bbc' (length=3)
'submit_id' => boolean false
'submit_subscribe' => int 0
'submit_comments' => string 'open' (length=4)
'image' => string 'C:fakepath100.jpg' (length=17)

Recommendation

Instead of relying on this kind of quick fix, I'd advise you to update the question with:

  • How you are serializing your data

  • How you are saving it

================================ EDIT 1 ===============================

The Error

The error was caused by using double quotes " instead of single quotes ': inside double quotes PHP treats a backslash as the start of an escape sequence, which is why C:\fakepath\100.png was converted to C:fakepath100.jpg.

To fix the error

You need to change how $h->vars['submitted_data'] is assigned (note the single quote '):

Replace

 $h->vars['submitted_data']['image'] = "C:\fakepath\100.png" ;

With

 $h->vars['submitted_data']['image'] = 'C:\fakepath\100.png' ;
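
A quick way to check how the two quoting styles differ is to compare byte lengths. This is only a sketch using the same path literal; it does not reproduce the exact corruption from the question, it just shows that the double-quoted version is not the string you typed.

```php
<?php
// Single quotes: these backslashes are kept literally.
$single = 'C:\fakepath\100.png';

// Double quotes: \f is read as a form feed and \100 as an octal escape ("@").
$double = "C:\fakepath\100.png";

var_dump(strlen($single));      // 19 bytes
var_dump(strlen($double));      // 15 bytes
var_dump($single === $double);  // false
```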

Additional Filter

You can also add this simple filter before you call serialize

function sanitize(&$value, $key)
{
    $value = addslashes($value);
}

array_walk($h->vars['submitted_data'], "sanitize");

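If your values contain ISO-8859-1 (Latin-1) characters, you can also convert them to UTF-8 (note that utf8_encode() is deprecated as of PHP 8.2):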

 $h->vars['submitted_data'] = array_map("utf8_encode",$h->vars['submitted_data']);

How to detect the problem in future serialized data

  findSerializeError ( $data1 ) ;

Output

Difference 9 != 7
-> ORD number 57 != 55
-> Line Number = 315
-> Section Data1 = pen";s:5:"image";s:19:"C:fakepath100.jpg
-> Section Data2 = pen";s:5:"image";s:17:"C:fakepath100.jpg
^------- The Error (Element Length)

findSerializeError Function

function findSerializeError($data1) {
    echo "<pre>";
    // Rebuild every s:N:"..."; token with its real byte length.
    // (The /e modifier used originally was removed in PHP 7.)
    $data2 = preg_replace_callback('!s:(\d+):"(.*?)";!', function ($m) {
        return 's:' . strlen($m[2]) . ':"' . $m[2] . '";';
    }, $data1);
    $max = max(strlen($data1), strlen($data2));

    echo $data1 . PHP_EOL;
    echo $data2 . PHP_EOL;

    for ($i = 0; $i < $max; $i++) {
        // Square brackets replace the {$i} offset syntax removed in PHP 8;
        // ?? '' covers reads past the end of the shorter string.
        $c1 = $data1[$i] ?? '';
        $c2 = $data2[$i] ?? '';

        if ($c1 !== $c2) {
            echo "Difference ", $c1, " != ", $c2, PHP_EOL;
            echo "\t-> ORD number ", ord($c1), " != ", ord($c2), PHP_EOL;
            echo "\t-> Line Number = $i" . PHP_EOL;

            // Show roughly 20 characters either side of the mismatch.
            $start = max($i - 20, 0);
            $length = 40;

            $point = $max - $i;
            if ($point < 20) {
                $rlength = 1;
                $rpoint = -$point;
            } else {
                $rpoint = $length - 20;
                $rlength = 1;
            }

            echo "\t-> Section Data1 = ", substr_replace(substr($data1, $start, $length), "<b style=\"color:green\">{$c1}</b>", $rpoint, $rlength), PHP_EOL;
            echo "\t-> Section Data2 = ", substr_replace(substr($data2, $start, $length), "<b style=\"color:red\">{$c2}</b>", $rpoint, $rlength), PHP_EOL;
        }
    }
}

A better way to save to Database

$toDatabase = base64_encode(serialize($data));  // Save this value to the database
$fromDatabase = unserialize(base64_decode($toDatabase)); // Restore the value read back from the database

Making a script that can recover a corrupt serialized string in PHP

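function fix_corrupted_serialized_string($string) {
    // Split on :" so each piece after the first starts with a string value.
    $tmp = explode(':"', $string);
    $length = count($tmp);
    for ($i = 1; $i < $length; $i++) {
        // The value runs up to the next double quote.
        list($string) = explode('"', $tmp[$i]);
        $str_length = strlen($string);
        // The previous piece ends with the old length; overwrite it.
        $tmp2 = explode(':', $tmp[$i-1]);
        $last = count($tmp2) - 1;
        $tmp2[$last] = $str_length;
        $tmp[$i-1] = join(':', $tmp2);
    }
    return join(':"', $tmp);
}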

working demo:
http://codepad.viper-7.com/GNbM25

PHP Unserialize is not working

Your serialized string has been damaged. As I rub my crystal ball, I imagine someone manually (and inappropriately) performed string replacements to update the URL that immediately follows "Have you listened" in the array at index 4.

This is revealed after analyzing the data at this location:

s:1876:"Havе yоu listenеd
http://boletines.consumer.es/?p=50&u=https://gdfgl/96D4u9";

You see this stored value has 81 bytes/characters in it.

The serialized data strictly claims that the value must have 1876 bytes/characters in it.

Ultimately, your serialized data has been compromised -- either the length or the value.

If you are not bothered by the current value, you can manually repair the serialized data with this: https://3v4l.org/GqsHu

This is from a post of mine here: https://stackoverflow.com/a/55074706/2943403

With the provided snippet, you can either repair the corrupted serialized data on the fly each time, or you can take the time to repair all corrupted data and update your database so that this headache doesn't present itself again.

Let this occurrence be a lesson to developers: never take a shortcut when updating serialized data. You must unserialize it, modify it, then re-serialize it so that a valid string is generated.
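
The unserialize, modify, re-serialize sequence could look something like the sketch below. The helper name replace_in_serialized() and the example URLs are hypothetical, not taken from the linked post, and the sketch assumes the stored data is a string or a (possibly nested) array of scalars.

```php
<?php
// Hypothetical helper: replace text inside serialized data without
// editing the serialized string directly.
function replace_in_serialized(string $serialized, string $search, string $replace): string
{
    // allowed_classes => false avoids instantiating objects from stored data.
    $data = unserialize($serialized, ['allowed_classes' => false]);

    if (is_array($data)) {
        // Only string values are changed; keys and other types are left alone.
        array_walk_recursive($data, function (&$value) use ($search, $replace) {
            if (is_string($value)) {
                $value = str_replace($search, $replace, $value);
            }
        });
    } elseif (is_string($data)) {
        $data = str_replace($search, $replace, $data);
    }

    // serialize() recomputes every length prefix.
    return serialize($data);
}

$stored  = serialize(['url' => 'http://old.example/page', 'views' => 3]);
$updated = replace_in_serialized($stored, 'http://old.example', 'https://new.example.org');

echo $updated, PHP_EOL;
```

Because the new URL is longer than the old one, a plain str_replace() on $stored would have left a stale length behind; going through serialize() avoids that. The sketch does not handle input that is already corrupt, since unserialize() returns false in that case.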

Regex to match PHP serialized data inside a string

I found out about ini_set('session.serialize_handler', 'php_serialize'). It makes sessions use PHP's regular serialize() format instead of the session-specific one, which solves the problem. – Miryafa

Fix serialized data broken due to editing MySQL database in a text editor?

Visit this page: http://unserialize.onlinephpfunctions.com/

On that page you should see this sample serialized string: a:1:{s:4:"Test";s:17:"unserialize here!";}. Take a piece of it: s:4:"Test";. That means "string", 4 characters (strictly speaking, 4 bytes), then the actual string. I am pretty sure that what you did caused the numeric character count to be out of sync with the string. Play with the tool on the site mentioned above and you will see that you get an error if you change "Test" to "Tes", for example.

What you need to do is get those character counts to match your new string. If you haven't corrupted any of the other encoding-- removed a colon or something-- that should fix the problem.
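
The same experiment can be run locally. The sketch below uses the sample string from that page, breaks it by shortening "Test" to "Tes" without touching the count, and then puts the count back in sync. The last line is an added illustration that the count is measured in bytes.

```php
<?php
$good = 'a:1:{s:4:"Test";s:17:"unserialize here!";}';

// Edit the value by hand without touching its count: 4 no longer matches "Tes".
$bad = str_replace('"Test"', '"Tes"', $good);
var_dump(@unserialize($bad)); // bool(false)

// Bring the count back in sync with the new value.
$fixed = str_replace('s:4:"Tes"', 's:' . strlen('Tes') . ':"Tes"', $bad);
var_dump(unserialize($fixed));

// The count is in bytes, not characters: "é" encoded as UTF-8 takes two bytes.
echo serialize("caf\xC3\xA9"), PHP_EOL; // s:5:"café";
```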

Check to see if a string is serialized?

From WordPress core functions:

<?php
function is_serialized( $data, $strict = true ) {
    // If it isn't a string, it isn't serialized.
    if ( ! is_string( $data ) ) {
        return false;
    }
    $data = trim( $data );
    if ( 'N;' === $data ) {
        return true;
    }
    if ( strlen( $data ) < 4 ) {
        return false;
    }
    if ( ':' !== $data[1] ) {
        return false;
    }
    if ( $strict ) {
        $lastc = substr( $data, -1 );
        if ( ';' !== $lastc && '}' !== $lastc ) {
            return false;
        }
    } else {
        $semicolon = strpos( $data, ';' );
        $brace = strpos( $data, '}' );
        // Either ; or } must exist.
        if ( false === $semicolon && false === $brace ) {
            return false;
        }
        // But neither must be in the first X characters.
        if ( false !== $semicolon && $semicolon < 3 ) {
            return false;
        }
        if ( false !== $brace && $brace < 4 ) {
            return false;
        }
    }
    $token = $data[0];
    switch ( $token ) {
        case 's':
            if ( $strict ) {
                if ( '"' !== substr( $data, -2, 1 ) ) {
                    return false;
                }
            } elseif ( false === strpos( $data, '"' ) ) {
                return false;
            }
            // Or else fall through.
        case 'a':
        case 'O':
            return (bool) preg_match( "/^{$token}:[0-9]+:/s", $data );
        case 'b':
        case 'i':
        case 'd':
            $end = $strict ? '$' : '';
            return (bool) preg_match( "/^{$token}:[0-9.E+-]+;$end/", $data );
    }
    return false;
}

