Unserializing stored $_SESSION data in PHP

While improving the way we handle and store sessions for the *.facebox.com websites, we needed a way to access, read, change and store saved PHP sessions (we keep them in a memory cache). As it turns out, stored sessions are not simple serialized strings of the $_SESSION array *. Well, almost, but not exactly. So we wrote our own unserializer ... or at least: had to write our own.

The following function unserializes encoded session data and returns an array representing $_SESSION. A serialized $_SESSION differs from normal serialize() output in that each top-level key of the $_SESSION array is written as plain text and separated by a | from its (usual PHP-)serialized value.
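
To make the format concrete, here is a minimal sketch of how such a string is put together, assuming PHP's default "php" session serialize handler. The helper name encode_session_like_php() is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper (not part of PHP): builds a session string the way the
// default "php" handler lays it out: key, then "|", then serialize(value).
function encode_session_like_php (array $session)
{
    $out = '';
    foreach ($session as $key => $value)
    {
        $out .= $key . '|' . serialize ($value);
    }
    return $out;
}

$session = array ('user' => 'bob', 'count' => 3);

echo serialize ($session), "\n";
// a:2:{s:4:"user";s:3:"bob";s:5:"count";i:3;}

echo encode_session_like_php ($session), "\n";
// user|s:3:"bob";count|i:3;
```

Note that the top-level keys lose their s:4:"..." wrapper and there is no enclosing a:2:{...}, which is why a plain unserialize() call cannot read the stored string.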

PHP:
/**
 * Unserializes a stored session. The stored format differs slightly from
 * PHP's serialize() output: each top-level key of $_SESSION is separated
 * from its serialized value by a |.
 *
 * Named unserialize_session() so it does not clash with PHP's built-in
 * unserialize(), which it calls for every value.
 *
 * @param  string $serialized_string  a serialized representation of a session
 * @return array                      the unserialized session data
 */
function unserialize_session ($serialized_string)
{
    $object = array ();
    if ($serialized_string === '')
    {
        return $object;
    }

    $buffer  = $serialized_string[0];
    $open    = 0;  // incremented on an opening quote, decremented on a closing one
    $openAcc = 0;  // nesting depth of { } blocks
    $key     = '';

    for ($i = 1, $count = strlen ($serialized_string); $i < $count; $i++)
    {
        $curChar  = $serialized_string[$i];
        $prevChar = $serialized_string[$i - 1];
        $nextChar = ($i + 1 < $count) ? $serialized_string[$i + 1] : '';

        if ($curChar == '|' && $open == 0)
        {
            // end of a top-level key
            $key = $buffer;
            $buffer = '';
            continue;
        }
        elseif ($curChar == '{' && $prevChar == ':' && $open == 0)
        {
            $openAcc++;
        }
        elseif ($curChar == '}' && $open == 0 && $openAcc > 0)
        {
            $openAcc--;
            if ($openAcc == 0 && $prevChar == '{')
            {
                // empty array
                $object[$key] = array ();
                $buffer = '';
                continue;
            }
            elseif ($openAcc == 0 && ($prevChar == ';' || $prevChar == '}'))
            {
                // end of a (possibly nested) array value
                $object[$key] = unserialize ($buffer . $curChar);
                $buffer = '';
                continue;
            }
        }
        elseif ($curChar == '"' && $prevChar == ':' && $nextChar != ';')
        {
            $open++;
        }
        elseif ($curChar == '"' && $nextChar == ';')
        {
            $open--;
        }
        elseif ($curChar == ';' && $open == 0 && $openAcc == 0)
        {
            // end of a scalar value
            $object[$key] = unserialize ($buffer . $curChar);
            $buffer = '';
            continue;
        }
        $buffer .= $curChar;
    }

    return $object;
}

The function we wrote should do the same as other attempts at writing such a function, as you can find here. But which one is the most efficient?
It differs from these or these other attempts in that it can handle ';' or '|' inside values (PHP serialization doesn't escape those characters).
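
To see why this matters, consider a naive approach that simply splits the stored string on '|'. This is a sketch for illustration only, not one of the linked attempts, and the sample data is made up.

```php
<?php
// Session string for array ('note' => 'a|b;c', 'n' => 1) in the key|value format.
$stored = 'note|' . serialize ('a|b;c') . 'n|' . serialize (1);

// Naive: treat every | as a key/value separator.
$parts = explode ('|', $stored);

// The | inside the string value yields an extra, bogus piece, so keys and
// values can no longer be paired up.
print_r ($parts);
```

The function in this post keeps track of whether it is inside a quoted string value, which is what lets it skip separators like these.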

We opted not to use session_decode () and session_encode (), which would mean temporarily overwriting the $_SESSION object (like this solution), for security reasons.
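
For comparison, the approach we avoided looks roughly like the sketch below. It assumes the default "php" serialize handler and that a session can be started; the function name decode_via_session_decode() is hypothetical, while session_status (), session_start () and session_decode () are PHP built-ins.

```php
<?php
// Hypothetical helper: decodes a stored session string by letting PHP fill
// $_SESSION with it, then restores whatever was in $_SESSION before.
function decode_via_session_decode ($data)
{
    // session_decode() needs an active session to write into.
    if (session_status () !== PHP_SESSION_ACTIVE)
    {
        session_start ();
    }

    $backup   = isset ($_SESSION) ? $_SESSION : array ();
    $_SESSION = array ();

    session_decode ($data);   // populates $_SESSION from the string
    $decoded  = $_SESSION;

    $_SESSION = $backup;      // put the current user's data back
    return $decoded;
}
```

Between the overwrite and the restore, the live $_SESSION holds another session's data. If anything goes wrong in that window, that data could end up stored as the current user's session, which is the kind of risk we wanted to stay away from.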

Still, this feels more like a workaround because we're mimicking existing functionality. Who guarantees that our function behaves exactly the way PHP serializes sessions internally? All seems to be going fine though: we haven't seen any session problems related to this function on our sites yet, and given the number of members we serve on the Facebox sites, it feels like a pretty stable and robust solution.

I should add that credit for this algorithm goes to Toon Coppens.

* Does anyone know why? I will have to remember to ask the MySQL consultant Kristian Köhntopp, who'll be helping us out with MySQL soon. I'm told he's one of the people who introduced the sessions concept in PHP3. Looking forward to learning from him ...

Update (2007-01-07): Fixed a bug in handling empty arrays.
Update 2 (2007-01-09): Fixed a bug in handling nested arrays.