Content-Type and Status Code Leakage
This blog post explores the issue of content-type and status code leakage. It examines the meaning of HTTP status codes and their effect when used with HTML attributes, with special attention to the typemustmatch HTML attribute. It also explains how to prevent data leaks, and emphasizes the importance of correct implementation.
The author of a bug bounty write-up published on Medium on March 20, username 'terjanq', demonstrated that the response to a request for a resource varies with the authorization state of the user requesting it. As we explained in a previous blog post, referenced below, if the user is authorized to view the resource, the Content-Type header has the value 'text/html'. If the user is not authorized, the response is returned without a Content-Type header, which is treated as equivalent to the value 'text/plain'.
In this blog post, we draw on terjanq's research to analyze the role of the Content-Type header and HTTP status codes in leaking user data. We also suggest methods that can be used to prevent the scenario described in his writeup.
The Meaning of HTTP Status Codes
The responses returned by web servers depend on several factors, such as:
- The user’s authorization and browser
- The availability of the requested resource on the server
- The relocation status of the resource
In HTTP, for example, if the user requests a resource that isn’t present at the destination, the 404 Not Found status code is returned.
Similarly, if the user is not authorized to view the resource, 403 Forbidden is returned. In some cases, the 401 Unauthorized code can also be used to prompt the user to enter authorization credentials.
Likewise, servers use the Content-Type header to tell browsers how to render the contents of the page appropriately.
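As a rough illustration of how these status codes map to the state of a request, here is a minimal Python sketch. The function name, its parameters, and the order of the checks are our own assumptions for illustration; real servers and frameworks make these decisions in their own ways.

```python
from http import HTTPStatus
from typing import Tuple


def describe_response(exists: bool, authenticated: bool, authorized: bool) -> Tuple[int, str]:
    """Return (status code, reason phrase) for a toy resource lookup.

    Hypothetical helper: the order of the checks is an assumption.
    """
    if not authenticated:
        status = HTTPStatus.UNAUTHORIZED  # 401: credentials are required
    elif not authorized:
        status = HTTPStatus.FORBIDDEN     # 403: user may not view the resource
    elif not exists:
        status = HTTPStatus.NOT_FOUND     # 404: nothing at this location
    else:
        status = HTTPStatus.OK            # 200: resource is served
    return status.value, status.phrase


print(describe_response(exists=True, authenticated=True, authorized=False))
```

The point to notice is that the status code alone already tells an observer something about the user's state.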
The typemustmatch HTML Attribute
The researcher initially focused on the typemustmatch HTML attribute to determine the user's authentication state. The typemustmatch attribute is boolean, which means that it takes effect whenever it is present on the HTML element, regardless of its value.
The typemustmatch boolean attribute ensures that a resource requested by an object element is loaded only if the value of the element's type attribute matches the Content-Type header of the response. This leads to an interesting information leak. If you knew that a response for authenticated users always returns an application/json content type, and a response for an unauthenticated user always returns text/html, you could try to load the resource using an object tag with the typemustmatch attribute set. Then, if you set the type to application/json and the resource failed to load, you'd be able to determine that the user was not authenticated.
However, this attribute only works on Firefox.
The next question is, how will you know whether the browser blocked the resource from loading?
Usually, the onload and onerror event handlers on HTML elements fire when a resource loads successfully or fails to load. But object elements do not support these events. The researcher therefore turned to the fallback text of the object element. In the code below, the text 'not_loaded' is displayed when the resource isn’t loaded. EXPECTED_TYPE and RESOURCE_URL are placeholders for the content type you expect and the URL of the target resource.
<object type="EXPECTED_TYPE" data="RESOURCE_URL" typemustmatch>not_loaded</object>
The Effect of HTTP Status Codes Used With HTML Attributes
Let's look next at the status codes. While the typemustmatch attribute prevents resources of a non-matching type from loading, it also prevents a resource from loading if the response doesn’t have the HTTP status code '200'. Detecting this through the fallback text of the object element is not possible, since no attribute gives a script access to that text.
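The loading rule described so far can be summarized in a small Python model. This is only a sketch of the behavior as described in this post, not of any browser's actual implementation, and the function name is our own.

```python
def object_would_load(declared_type: str, status: int, content_type: str) -> bool:
    """Toy model: with typemustmatch set, the resource loads only if the
    status is 200 and the response type equals the declared type."""
    if status != 200:
        return False
    # Compare only the media type, ignoring parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == declared_type.strip().lower()


# Scenario from the text: JSON for authenticated users, HTML otherwise.
print(object_would_load("application/json", 200, "application/json; charset=utf-8"))  # True
print(object_would_load("application/json", 200, "text/html"))                        # False
print(object_would_load("application/json", 403, "application/json"))                 # False
```

Each False outcome corresponds to a case in which the attacker's page would see the object fail to load, which is exactly the signal the attack relies on.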
The researcher uses the clientHeight and clientWidth properties of the object element, whose values change depending on the loaded resource. These properties have a default value of '0', so a change in their value can be used to determine whether the resource has been loaded.
On their own, these properties aren’t very useful: because the object tag doesn’t have onload or onerror events, we don’t know when the resource has finished loading. Despite this problem, the researcher discovered that an object element that hasn't yet loaded prevents the window object from firing the onload event. Once all the elements are loaded, the window object fires this event. By then reading the clientHeight and clientWidth properties from the window's onload handler, an attacker can work out the Content-Type header and HTTP status code of the response, and from them obtain information about the user.
For further information about terjanq's implementation of this concept, check the code on jsfiddle.
How to Prevent Data Leaks
It’s quite straightforward to prevent the type of information leak that arises from combining the object element with the typemustmatch attribute. Such data leaks can be time-based, or based on the Content-Type header and HTTP status code.
- If the Content-Type header isn’t set in the HTTP response, the browser must determine which type of content the response contains. This can lead to vulnerabilities such as Cross-site Scripting, so take care to always set the Content-Type header.
- In addition, you can prepare custom error pages that return HTTP status code '200' in all circumstances, so that codes such as '404' and '403' never appear in the response.
- Another prevention method is to check the Referer header in the request. Such attacks are made through requests from an attacker-controlled website, so checking the value of the Referer header can prove very useful.
- The attacker has to make the requests from your browser. The Same Origin Policy ensures that the attacker cannot arbitrarily change the Referer header when the requests are made from a different origin. However, if you implement the Referer check incorrectly, the attacker may be able to place the expected value in some part of the Referer header and bypass your security check. This code from StackOverflow on Checking PHP referrer is an example of an incorrect implementation of the Referer header check, even though it was accepted as a solution by the original poster.
$ref = $_SERVER['HTTP_REFERER'];
// Flawed: matches 'example.com' anywhere in the Referer, not only in the host.
if (strpos($ref, 'example.com') !== FALSE) {
    // redirect to wherever example.com people should go
}
The attacker can bypass this check by luring the user to an attacker-controlled page whose URL merely contains the expected string somewhere, for example in the query string:
http://www.attacker.com/hello_world?example.com
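A stricter check parses the Referer and compares only its host against an allowlist, instead of searching the whole string. The following is a minimal sketch in Python rather than PHP; the TRUSTED_HOSTS set and the function name are hypothetical, and it assumes that requests without a valid Referer should be rejected, which may not suit every application.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts that may appear in the Referer header.
TRUSTED_HOSTS = {"example.com", "www.example.com"}


def referer_is_trusted(referer: str) -> bool:
    """Accept the Referer only if its host is exactly one of the trusted hosts."""
    try:
        parts = urlsplit(referer)
        host = parts.hostname  # lowercased host, without port or credentials
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https"):
        return False
    return host is not None and host in TRUSTED_HOSTS


print(referer_is_trusted("https://example.com/account"))                      # True
print(referer_is_trusted("http://www.attacker.com/hello_world?example.com"))  # False
print(referer_is_trusted("https://example.com.attacker.com/"))                # False
```

Because the comparison is against the parsed host, the expected string in a path, query string, or subdomain of an attacker's domain no longer passes the check.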
The Importance of Correct Implementation of the Content-Type Header
HTTP status codes and HTTP headers are some of the mechanisms that help make web browsing easy and secure. However, when combined with certain HTML attributes and features, they can become dangerous tools for exposing user data.
The prevention and protection methods we have outlined above will help to develop a more secure implementation. Combined with other security headers, they can help prevent the abuse of these web features.
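As a recap of the first two recommendations, a response builder might always set the same status code and an explicit Content-Type, regardless of whether the user may see the resource. This is a sketch under our own assumptions: the function name, the error page text, and the tuple it returns are illustrative only, and the X-Content-Type-Options header is the one covered in the whitepaper listed under Further Reading.

```python
from typing import Dict, Tuple


def build_response(body: str, allowed: bool) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) with identical status and type either way."""
    if not allowed:
        # Hypothetical error page, served in place of a 403 or 404 response.
        body = "<h1>This page is not available.</h1>"
    headers = {
        "Content-Type": "text/html; charset=utf-8",  # always set explicitly
        "X-Content-Type-Options": "nosniff",         # ask the browser not to sniff
    }
    return 200, headers, body


ok_status, ok_headers, _ = build_response("<h1>Account</h1>", allowed=True)
err_status, err_headers, _ = build_response("<h1>Account</h1>", allowed=False)
print(ok_status == err_status and ok_headers == err_headers)  # True
```

With the status code and Content-Type identical in both cases, neither can be used to tell the two states apart.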
Further Reading
For full details of terjanq's writeup, see Cross-Site Content and Status Types Leakage.
You can read more about MIME Type sniffing and X-Content-Type-Options security header in our whitepaper on HTTP Security Headers and How They Work.
For further information, see The Importance of the Content-Type Header in HTTP Requests.