
How the browser submits a file, differences between x-www-form-urlencoded and form-data

Even though the task of uploading a file to a server might look simple at first, it actually involves several complex steps.
In the text below I’ll try to give a brief overview of the steps involved and some questions that I gathered during the examination process.

Enctypes

The HTML form element has several attributes; one of them is enctype.

Enctypes as defined in the HTML 4.01 Specification

“The enctype attribute of the FORM element specifies the content type used to encode the form data set for submission to the server”

The possible enctypes are:

  • application/x-www-form-urlencoded: The default value if the attribute is not specified.
  • multipart/form-data: Use this value if you are using an <input> element with the type attribute set to “file”.
  • text/plain (HTML5)

1. application/x-www-form-urlencoded

When the form’s method is set to POST, there are two main options available for data encoding.
The default one is application/x-www-form-urlencoded, which is based on percent-encoding.
The encoding mechanism is pretty straightforward.
A set of characters is deemed reserved, and when one of them appears in the data it is encoded as its hexadecimal value prefixed by a percent sign.

The list of reserved chars is:

! * ' ( ) ; : @ & = + $ , / ? # [ ]

The unreserved chars are:

A    B    C    D    E    F    G    H    I    J    K    L    M    N    O    P    Q    R    S    T    U    V    W    X    Y    Z
a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p    q    r    s    t    u    v    w    x    y    z
0    1    2    3    4    5    6    7    8    9    -    _    .    ~
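The rule above can be sketched in a few lines of JavaScript. This is a minimal illustration, not what any browser actually ships: percentEncode and UNRESERVED are hypothetical names, and the sketch assumes the text is sent as UTF-8. It also applies plain percent-encoding only; real form submission additionally sends spaces as +.

```javascript
// The "unreserved" characters listed above never need escaping.
var UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

// Percent-encode every UTF-8 byte of `text` that is not unreserved.
function percentEncode(text) {
  var bytes = new TextEncoder().encode(text);
  var out = "";
  for (var i = 0; i < bytes.length; i++) {
    var ch = String.fromCharCode(bytes[i]);
    if (UNRESERVED.test(ch)) {
      out += ch;
    } else {
      // Two uppercase hex digits, prefixed by a percent sign.
      var hex = bytes[i].toString(16).toUpperCase();
      out += "%" + (hex.length < 2 ? "0" + hex : hex);
    }
  }
  return out;
}

console.log(percentEncode("a+b&c=d")); // a%2Bb%26c%3Dd
```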

So what would the + look like after encoding?

We know from the ASCII table that the decimal value of + is 43.
The binary representation of 43 is: 0010 1011
If we take the decimal value of each nibble, we end up with:
0010 => 2
1011 => 11 (B in hexadecimal)

So we can represent the plus sign in four different ways

  • Char => +
  • Base10 => 43
  • Base16 => 2B
  • Url Encoded => %2B
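The same conversion can be checked step by step in JavaScript. This is just the arithmetic from above written out; the variable names are mine.

```javascript
var code = "+".charCodeAt(0);                     // 43
var binary = code.toString(2).padStart(8, "0");   // "00101011"
var highNibble = parseInt(binary.slice(0, 4), 2); // 2
var lowNibble = parseInt(binary.slice(4), 2);     // 11, written B in hexadecimal
var hex = highNibble.toString(16) + lowNibble.toString(16);
var encoded = "%" + hex.toUpperCase();            // "%2B"

console.log(code, binary, highNibble, lowNibble, encoded);
```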

[Image: dec-hex-bin]

How would the browser encode a simple form like this using URL encoding?

 
 <form id="simpleForm" action="/" method="POST" name="simpleForm" enctype="application/x-www-form-urlencoded">
 <input id="simpleInput" type="text" name="simpleInput" />
 <input id="simpleSubmit" type="submit" name="simpleSubmit" value="Submit Simple Form" />
 </form>
   

As you can see in the image below, the browser appends every element of the form, using the name attribute as the key and sending spaces as +.
In the end, the request body looks something like:

simpleInput=simple+form&simpleSubmit=Submit+Simple+Form
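A body like this can be reproduced with the standard URLSearchParams class, which serializes its entries in the application/x-www-form-urlencoded format. The sketch assumes the user typed "simple form" into the text field; every space, including the ones in the button label, comes out as +.

```javascript
var params = new URLSearchParams();
params.append("simpleInput", "simple form");         // assumed user input
params.append("simpleSubmit", "Submit Simple Form"); // value of the submit button

var body = params.toString();
console.log(body); // simpleInput=simple+form&simpleSubmit=Submit+Simple+Form
```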

[Image: form-url-encode]

If you are curious and want to see how Firefox does that, you can check its source code.

Most likely, this is the part where the query string gets constructed:

   
nsresult
nsFSURLEncoded::AddNameValuePair(const nsAString& aName,
                                 const nsAString& aValue)
{
  // Encode value
  nsCString convValue;
  nsresult rv = URLEncode(aValue, convValue);
  NS_ENSURE_SUCCESS(rv, rv);

  // Encode name
  nsAutoCString convName;
  rv = URLEncode(aName, convName);
  NS_ENSURE_SUCCESS(rv, rv);

  // Append data to string
  if (mQueryString.IsEmpty()) {
    mQueryString += convName + NS_LITERAL_CSTRING("=") + convValue;
  } else {
    mQueryString += NS_LITERAL_CSTRING("&") + convName
                  + NS_LITERAL_CSTRING("=") + convValue;
  }

  return NS_OK;
}
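For readers who do not read C++, the same append logic can be sketched in JavaScript. This is only an illustration: FormUrlEncoder and its methods are hypothetical names, not Firefox code, and encodeURIComponent is used as a rough stand-in for Firefox’s URLEncode (it leaves a few characters such as ! ' ( ) * unescaped).

```javascript
// Hypothetical JavaScript analogue of the C++ method above (not Firefox code).
function FormUrlEncoder() {
  this.queryString = "";
}

// Form-style encoding: percent-encode, then use "+" for spaces.
FormUrlEncoder.prototype.encode = function (text) {
  return encodeURIComponent(text).replace(/%20/g, "+");
};

// Append "name=value", adding "&" first unless this is the first pair.
FormUrlEncoder.prototype.addNameValuePair = function (name, value) {
  var pair = this.encode(name) + "=" + this.encode(value);
  this.queryString += this.queryString === "" ? pair : "&" + pair;
};

var encoder = new FormUrlEncoder();
encoder.addNameValuePair("simpleInput", "simple form");
encoder.addNameValuePair("simpleSubmit", "Submit Simple Form");
console.log(encoder.queryString);
// simpleInput=simple+form&simpleSubmit=Submit+Simple+Form
```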

Now assuming that the browser:

  • Went through all the elements of the form
  • Parsed them
  • Created the HTTP request
  • Sent the request

The server somehow needs to be able to interpret the request and parse the data.
If the form enctype is set to x-www-form-urlencoded, the server knows which format to expect the data in, so it can parse it and do something useful with it.

So when you’re running Node.js, PHP, .NET, Ruby, or any other server-side technology, it implements a parser that goes through the HTTP request body and creates a key/value data structure holding all the data contained in the form.
That’s why the form elements need a name: the names become the keys of the data structure created on the server, each paired with the value of its element.
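A server-side parser of this kind can be sketched as follows. It is a simplified illustration, not the parser of any particular framework: parseUrlEncoded is a hypothetical name, and when a name is repeated the sketch simply keeps the last value.

```javascript
// Turn "a=1&b=two+words" into { a: "1", b: "two words" }.
function parseUrlEncoded(body) {
  var fields = {};
  if (body === "") {
    return fields;
  }
  body.split("&").forEach(function (pair) {
    var eq = pair.indexOf("=");
    var rawName = eq === -1 ? pair : pair.slice(0, eq);
    var rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    // "+" means space in this format; escaped plus signs arrive as %2B.
    var name = decodeURIComponent(rawName.replace(/\+/g, " "));
    var value = decodeURIComponent(rawValue.replace(/\+/g, " "));
    fields[name] = value;
  });
  return fields;
}

var form = parseUrlEncoded("simpleInput=simple+form&simpleSubmit=Submit+Simple+Form");
console.log(form.simpleInput);  // simple form
console.log(form.simpleSubmit); // Submit Simple Form
```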

2. multipart/form-data

Now, with that being said, let’s think about the multipart/form-data encoding type.
If the browser is trying to send a file, whatever type it may be, does it make sense to encode the whole file using percent-encoding and then append it to a string containing all the other form elements, like x-www-form-urlencoded does? I would say no.

So how does multipart/form-data encode the form elements?
The definition in RFC 2388 summarizes it pretty well:

Definition of multipart/form-data

The media-type multipart/form-data follows the rules of all multipart
MIME data streams as outlined in [RFC 2046].  In forms, there are a
series of fields to be supplied by the user who fills out the form.
Each field has a name. Within a given form, the names are unique.

“multipart/form-data” contains a series of parts. Each part is
expected to contain a content-disposition header [RFC 2183] where the
disposition type is “form-data”, and where the disposition contains
an (additional) parameter of “name”, where the value of that
parameter is the original field name in the form. For example, a part
might contain a header:

Content-Disposition: form-data; name="user"

with the value corresponding to the entry of the “user” field.

Field names originally in non-ASCII character sets may be encoded
within the value of the “name” parameter using the standard method
described in RFC 2047.

As with all multipart MIME types, each part has an optional
“Content-Type”, which defaults to text/plain.  If the contents of a
file are returned via filling out a form, then the file input is
identified as the appropriate media type, if known, or
“application/octet-stream”.  If multiple files are to be returned as
the result of a single form entry, they should be represented as a
“multipart/mixed” part embedded within the “multipart/form-data”.
Each part may be encoded and the “content-transfer-encoding” header
supplied if the value of that part does not conform to the default
encoding.

Basically, each part can carry its own Content-Type, which allows an image to be sent in the same request as a bunch of text while still giving the server enough information to parse all the data.
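To make the structure concrete, here is a rough sketch that assembles such a body by hand. It only handles text content, and buildMultipart, the boundary string and the file name notes.txt are all made up for the illustration; in a real browser the boundary is generated for you.

```javascript
// Build a multipart/form-data body as a string (text-only parts for simplicity).
function buildMultipart(parts, boundary) {
  var CRLF = "\r\n";
  var body = "";
  parts.forEach(function (part) {
    body += "--" + boundary + CRLF;
    body += 'Content-Disposition: form-data; name="' + part.name + '"';
    if (part.filename !== undefined) {
      body += '; filename="' + part.filename + '"';
    }
    body += CRLF;
    // Content-Type is optional per part; the RFC default is text/plain.
    if (part.contentType) {
      body += "Content-Type: " + part.contentType + CRLF;
    }
    // A blank line separates the part headers from the part content.
    body += CRLF + part.content + CRLF;
  });
  // The closing delimiter carries two extra dashes.
  return body + "--" + boundary + "--" + CRLF;
}

var multipartBody = buildMultipart([
  { name: "simpleInput", content: "hello" },
  { name: "fileUpload", filename: "notes.txt", contentType: "text/plain", content: "file contents" }
], "----ExampleBoundary123");

console.log(multipartBody);
```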

So if we have a form like this:

 
 <form id="simpleForm" action="/" method="POST" name="uploadForm" enctype="multipart/form-data">  
  <input id="simpleInput" type="text" name="simpleInput" />  
  <input id="fileUpload" type="file" name="fileUpload" />  
  <input id="simpleSubmit" type="submit" name="simpleSubmit" value="Submit Simple Form" />
  </form>

The request body would look something like this:

------WebKitFormBoundary0K1fvU2Vy3qpT4ua
Content-Disposition: form-data; name="simpleInput"


------WebKitFormBoundary0K1fvU2Vy3qpT4ua
Content-Disposition: form-data; name="fileUpload"; filename=""
Content-Type: application/octet-stream


------WebKitFormBoundary0K1fvU2Vy3qpT4ua
Content-Disposition: form-data; name="simpleSubmit"

Submit Simple Form
------WebKitFormBoundary0K1fvU2Vy3qpT4ua--
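On the receiving side, the server splits the body on the boundary and reads the headers of each part. The sketch below shows the idea for text content only; parseMultipart is a hypothetical name, the sample boundary is made up, and real parsers work on bytes and handle many more edge cases.

```javascript
// Split a multipart/form-data body into { name, filename, contentType, content } parts.
function parseMultipart(body, boundary) {
  var delimiter = "--" + boundary;
  var parts = [];
  // Drop whatever precedes the first delimiter.
  var pieces = body.split(delimiter).slice(1);
  for (var i = 0; i < pieces.length; i++) {
    var piece = pieces[i];
    if (piece.indexOf("--") === 0) {
      break; // closing delimiter reached
    }
    piece = piece.replace(/^\r\n/, "").replace(/\r\n$/, "");
    var split = piece.indexOf("\r\n\r\n"); // blank line between headers and content
    if (split === -1) {
      continue;
    }
    var headers = {};
    piece.slice(0, split).split("\r\n").forEach(function (line) {
      var colon = line.indexOf(":");
      headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim();
    });
    var disposition = headers["content-disposition"] || "";
    var name = /(?:^|;)\s*name="([^"]*)"/.exec(disposition);
    var filename = /(?:^|;)\s*filename="([^"]*)"/.exec(disposition);
    parts.push({
      name: name ? name[1] : null,
      filename: filename ? filename[1] : null,
      contentType: headers["content-type"] || "text/plain",
      content: piece.slice(split + 4)
    });
  }
  return parts;
}

var sample = [
  "------ExampleBoundary123",
  'Content-Disposition: form-data; name="simpleInput"',
  "",
  "hello",
  "------ExampleBoundary123",
  'Content-Disposition: form-data; name="fileUpload"; filename=""',
  "Content-Type: application/octet-stream",
  "",
  "",
  "------ExampleBoundary123--",
  ""
].join("\r\n");

var parsed = parseMultipart(sample, "----ExampleBoundary123");
console.log(parsed.length, parsed[0].name, parsed[1].contentType);
```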

[Image: form-multi-data]

Conclusion

From time to time I find myself needing to create file upload plugins, and when the time comes I always find myself googling how that can be accomplished.
I never stopped and asked myself: wait a minute, how does this actually work? Why are there two different encoding types? Why do I need to change the encoding type when sending files to the server?

I simply took that as a given and moved on.
However, even though we need to leverage what has already been created and try not to reinvent the wheel, at the same time, having a basic knowledge of how the tools we use on a daily basis function might not be a bad idea.
We don’t have to learn them to the point where we can write one from scratch, but we should learn them to the point where we can make smart decisions about when and how to use them.

Managing Game Assets - XHR2 and URLObjects

I ran into a problem while working on a ticket in the Gladius project.
The ticket that I’m working on is about loading resources into the browser’s memory.
The part that I’m struggling with is how to remove the resources from memory once they’ve been used.
For example:
Let’s say a game has 10 levels, and each level has different resources. It makes sense to load the resources for level 1, level 2, etc., separately. So when the player finishes level 1 and goes to level 2, the resources from level 1 are unloaded from memory.

I spent a few hours on Google trying to find a way to manipulate the browser’s cache with JavaScript, but I hit a dead end. I found very little information, and what I did find said that it was impossible to do such a thing.

if (!google) goto IRC :)

So there I was. Asking questions on different channels hoping to find a solution. The first answer I got wasn’t very encouraging. I was told that it was impossible to do what I wanted. But I knew it had to be possible. The ticket specifications were very clear, the module needed to unload cached content. So I kept asking.

Luckily, the second answer was a little bit more optimistic. It pointed me in the direction of XHR2 requests and URLObjects.

XHR2 (XMLHttpRequest Level 2) has a lot of new features compared to the old XMLHttpRequest. Some of them are:

  • Cross-origin requests
  • Upload progress events
  • Upload/download of binary data

Plus, it can do a lot of other cool things, such as working with the new HTML5 APIs, like the File System, WebAudio and WebGL.

But why is XHR2 handy when it comes to loading an image into the browser’s memory?
Because it provides a way to make a request to the server with responseType set to blob, so the image is handed to the script as binary data (a Blob).
With the image in binary form, it’s possible to create a temporary URLObject.
Having the image loaded as a URLObject gives the option to dynamically unload it from the browser’s memory whenever needed. Here is an example:

 
window.URL = window.URL || window.webkitURL;

var objectURL; // kept outside getBlob so it can be revoked later

function getBlob(url, cb) {
  var r = new XMLHttpRequest();
  r.open("GET", url);
  r.responseType = "blob"; // XHR2
  r.onload = function() { // XHR2
    objectURL = window.URL.createObjectURL(r.response); // Creates a temporary URL
    console.log('objectURL: ' + objectURL); // moz-filedata:b42305a0-29a2-4b1e-ae02-6ef78b4cfe4e

    var img = new Image();
    img.onerror = function() {
      console.log('onerror');
    };
    img.onload = function() {
      console.log('onload');
      if (cb) { cb(img); } // hand the loaded image to the caller, if a callback was given
    };
    img.src = objectURL;
  };
  r.send(null);
}
// .....
// .....
getBlob('img1.jpg', null);
// .....
// ..... (later, once the image is no longer needed)
window.URL.revokeObjectURL(objectURL); // Releases the memory
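Applied to the level example from the beginning of the post, the object URLs could be grouped per level and revoked together. The following is only a sketch of that idea, not the Gladius implementation: createLevelCache and fakeUrlApi are hypothetical, and the URL API is passed in as a parameter so the snippet also runs outside a browser (in a page you would pass window.URL). When the memory is actually freed after revoking is up to the browser.

```javascript
// Track object URLs per level so a whole level can be released at once.
function createLevelCache(urlApi) {
  var urlsByLevel = {};
  return {
    // Register a blob under a level and return its temporary object URL.
    add: function (level, blob) {
      var objectURL = urlApi.createObjectURL(blob);
      (urlsByLevel[level] = urlsByLevel[level] || []).push(objectURL);
      return objectURL;
    },
    // Revoke every object URL of a level; returns how many were revoked.
    unload: function (level) {
      var urls = urlsByLevel[level] || [];
      urls.forEach(function (u) { urlApi.revokeObjectURL(u); });
      delete urlsByLevel[level];
      return urls.length;
    }
  };
}

// Stand-in for window.URL, for illustration only.
var fakeUrlApi = {
  next: 0,
  revoked: [],
  createObjectURL: function () { return "blob:example-" + (this.next++); },
  revokeObjectURL: function (u) { this.revoked.push(u); }
};

var cache = createLevelCache(fakeUrlApi);
cache.add("level1", {});
cache.add("level1", {});
cache.add("level2", {});
console.log(cache.unload("level1")); // 2
```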

Even though the solution worked, I’m not sure if it’s a feasible one.
First, it’s not supported by all major browsers. Only Chrome 8+ and Firefox 4+ provide support.
Second, I don’t know what the performance implications are compared to simple HTTP requests.
However, using this technique it should be possible to use web workers to make the requests and then post a message back to the page containing the binary data for the resource. (*It’s not!)

* At the moment XHR2 requests are not supported in a worker thread.
The bugs are already filed in Bugzilla:
https://bugzilla.mozilla.org/show_bug.cgi?id=675504
https://bugzilla.mozilla.org/show_bug.cgi?id=658178

If anybody is interested in the topic, these are some good references:

http://www.html5rocks.com/en/tutorials/file/xhr2/
http://caniuse.com/xhr2
http://davidflanagan.com/Talks/jsconf11/BytesAndBlobs.html
https://developer.mozilla.org/en/DOM/window.URL.createObjectURL
https://developer.mozilla.org/en/DOM/window.URL.revokeObjectURL#Browser_compatibility
https://developer.mozilla.org/en/Using_files_from_web_applications#Example.3a_Using_object_URLs_to_display_images
http://www.htmlfivewow.com/slide45
http://dev.w3.org/2006/webapi/FileAPI/
http://www.opera.com/docs/specs/presto28/file/