the making of a webdev - 2
The next thing I learnt after the doctype is what forms the basis of much of today’s state exchange between the client and the server over HTTP: “cookies”.
HTTP is a stateless protocol. To understand what this really means, let’s look at what HTTP itself is built on top of: TCP/IP. In the simplest case, for every page that the browser requests, the TCP layer underneath HTTP makes a “connect” with the server (a connect is a 3-way handshake in TCP), passes over the HTTP data and then “disconnects”. Even when the connection is kept open and reused for several requests (HTTP/1.1 keep-alive), the protocol attaches no memory to it, and each request is handled on its own. So the server has no built-in way of connecting a previous request with the current one and drawing a correlation between them.
For example, when a user submits a login request, the client connects to the server and passes the login data, the server logs the user in (the user is now in the logged-in state), and the connection is closed. The next time the user performs an action, the same connect-data-disconnect sequence happens, but how does the server figure out that this user has already logged in? (Remember that once the disconnect happens, the state information is lost, and every request that comes in to the server is a new request with no past history.) There must be some mechanism that enables this conversation between the server and the client about the state of the application.
There are several mechanisms to associate subsequent requests with each other and maintain some kind of state, and one such working, reasonably good mechanism is “cookies”.
There are several rules that a browser sticks to, to make sure cookies are not messed up:
1. the browser has a local storage area where cookies are stored (on the user’s hard disk)
2. browsers support around 20 cookies per domain and 300 cookies in total, with each cookie’s maximum size at around 4 KB (the exact limits vary from browser to browser)
3. browsers make sure that they don’t give away one domain’s cookies to some other domain, the same-domain policy (essentially, this means, for example, that Google’s cookies are not given away to Yahoo’s server and vice versa)
The third point is very important, and it was not so visible to me for some time, until I tried to read and write someone else’s cookies. For example, try to read the expiry time of one of Google’s cookies and set it to something else using JavaScript on your own domain.
I realized why such a script can’t do it by thinking a bit more: if it could, I could give people the URL of my script and, when they visited it, reset Google’s cookies on anybody’s computer to whatever I wanted and just play.
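The domain rule can be sketched roughly as below. This is a simplified illustration, not the browser’s actual code: domainMatches is a hypothetical helper name, and real browsers apply further checks (IP addresses, public suffixes, paths, the secure flag) that are left out here.

```javascript
// Simplified sketch of the check made before a cookie is attached to a request.
// domainMatches is a hypothetical helper, not a real browser API.
function domainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  // A leading dot in the cookie's domain is ignored.
  const domain = cookieDomain.toLowerCase().replace(/^\./, '');
  // Exact match: the cookie belongs to this very host.
  if (host === domain) return true;
  // Otherwise the host must be a subdomain, i.e. end with "." + domain.
  return host.endsWith('.' + domain);
}

console.log(domainMatches('mail.google.com', 'google.com')); // true
console.log(domainMatches('www.yahoo.com', 'google.com'));   // false
console.log(domainMatches('notgoogle.com', 'google.com'));   // false
```

The last call shows why a plain “ends with” comparison would not be enough: the dot boundary keeps notgoogle.com from receiving google.com’s cookies.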
Now there are attacks that get around point #3 and make cookies vulnerable, like XSS (cross-site scripting), which I’ll discuss in some other post. XSS falls under cookie stealing/theft (cookies can also be stolen on the network, which can be prevented by using HTTPS). Apart from this, there is cookie poisoning (changing some values in the cookie without the knowledge of the server or client to fake some state, which can be detected by having the server sign the cookie), and so on. Briefly, there are security and privacy concerns around cookies; that is what I understood.
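To make the signature idea concrete, here is a minimal sketch for Node.js, assuming the server keeps a secret that never leaves it. The names SECRET, sign and verify are made up for this example, and HMAC-SHA256 is just one reasonable choice of signature; the point is only that a changed value no longer matches its signature.

```javascript
const crypto = require('node:crypto');

// Assumption: SECRET is known only to the server. Hard-coded here for the sketch.
const SECRET = 'replace-with-a-long-random-secret';

// Returns "value.signature" so that the server can later detect tampering.
function sign(value) {
  const mac = crypto.createHmac('sha256', SECRET).update(value).digest('hex');
  return value + '.' + mac;
}

// Returns the original value if the signature matches, otherwise null.
function verify(signed) {
  const idx = signed.lastIndexOf('.');
  if (idx < 0) return null;
  const value = signed.slice(0, idx);
  const expected = Buffer.from(sign(value));
  const actual = Buffer.from(signed);
  // timingSafeEqual throws on different lengths, so compare lengths first.
  if (expected.length !== actual.length) return null;
  return crypto.timingSafeEqual(expected, actual) ? value : null;
}

const cookieValue = sign('user=42');
console.log(verify(cookieValue));                     // user=42
console.log(verify(cookieValue.replace('42', '43'))); // null
```

Signing only detects changes; it does not hide the value, so anything secret should not be placed in the cookie in the first place.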
The sequence of cookie exchange goes like this: the browser sends a request, and the server sets the cookie in an HTTP response header (Set-Cookie) and sends it to the browser. Whenever the browser then makes a request to a domain and path for which a cookie is set, the cookie information is sent along with the request (in the Cookie header), so the server figures out the state and acts accordingly.
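The exchange can be simulated without any network at all. The sketch below is only an illustration: handleRequest plays a hypothetical server, browserRequest a hypothetical browser with a one-object cookie jar, and the cookie name visitor and value alice are invented. Only the Set-Cookie and Cookie header names are real.

```javascript
// Parse a request "Cookie" header ("a=1; b=2") into an object.
function parseCookieHeader(header) {
  const cookies = {};
  for (const pair of (header || '').split(';')) {
    const eq = pair.indexOf('=');
    if (eq < 0) continue; // skip empty or malformed pieces
    cookies[pair.slice(0, eq).trim()] = pair.slice(eq + 1).trim();
  }
  return cookies;
}

// Hypothetical server: greets a returning visitor, or sets a cookie for a new one.
function handleRequest(requestHeaders) {
  const cookies = parseCookieHeader(requestHeaders['Cookie']);
  if (cookies.visitor) {
    return { headers: {}, body: 'Welcome back, ' + cookies.visitor };
  }
  return { headers: { 'Set-Cookie': 'visitor=alice; Path=/' }, body: 'Hello, stranger' };
}

// Hypothetical browser: remembers name=value from Set-Cookie and replays it.
const jar = {};
function browserRequest() {
  const cookieHeader = Object.entries(jar).map(([k, v]) => k + '=' + v).join('; ');
  const response = handleRequest(cookieHeader ? { Cookie: cookieHeader } : {});
  const setCookie = response.headers['Set-Cookie'];
  if (setCookie) {
    // Keep only the leading "name=value" part; attributes such as Path are ignored here.
    const [name, value] = setCookie.split(';')[0].split('=');
    jar[name] = value;
  }
  return response.body;
}

console.log(browserRequest()); // Hello, stranger
console.log(browserRequest()); // Welcome back, alice
```

The server function keeps no memory between calls; the only thing that makes the second request different from the first is the cookie the browser sends back.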
Cookies can be set on the server side (say, using a PHP script) like this:
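<?php
// expiry is a Unix timestamp; here, one hour from now
$expiry = time() + 3600;
// to set a cookie (must run before any output is sent)
setcookie('cookieName', 'cookieValue', $expiry, "/");
// to read a cookie sent by the browser (null if it is not set)
$cookie = isset($_COOKIE['cookieName']) ? $_COOKIE['cookieName'] : null;
?>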
Hmm.. when I wrote the setcookie() call, I figured out that it has to be placed in the code so that it runs before any HTML output is sent (including any echo statements that you might’ve used). Otherwise, unless output buffering is on, setcookie() simply returns false (the headers have already been sent, so the cookie info cannot be set). Once the page is available, the cookies in the browser can be manipulated; it is only the next request to the server that will go out with the manipulated cookies.
Here an expiry time can be set as to when the cookie should expire. The expiry is given as a Unix timestamp (seconds since the epoch), typically time() plus a number of seconds. If it is provided, the result is a “persistent cookie” (a persistent cookie survives closing the browser/session, until its expiry time passes); without it, the cookie lasts only for the browser session. The fourth parameter tells which path this cookie is active on. There is also one more thing that the browser can do: send a particular cookie only over a secure (HTTPS) connection (available as another parameter to setcookie()).
On the client side, the browser can typically read and write cookies through JavaScript. I don’t know of any other way to manipulate cookies on the client side.
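<script type="text/javascript">
// expires takes a UTC date string, and the path value is not quoted
document.cookie = "name=value; expires=Fri, 31 Dec 2027 23:59:59 GMT; path=/; secure";
// returns all readable cookies as one "name=value; name2=value2" string
var cookies = document.cookie;
</script>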
The assignment to document.cookie sets a cookie, and reading document.cookie returns the names and values of the cookies.
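Since document.cookie hands back every cookie in one string, a little helper code is usually needed around it. The following is a small sketch under that assumption; formatCookie and readCookie are hypothetical helper names, and the functions work on plain strings so they can be tried outside a browser too.

```javascript
// Build the string that would be assigned to document.cookie. `expires` is a Date.
function formatCookie(name, value, expires, path) {
  return encodeURIComponent(name) + '=' + encodeURIComponent(value) +
    '; expires=' + expires.toUTCString() + '; path=' + path;
}

// Find one cookie in a document.cookie-style string ("a=1; b=2").
function readCookie(cookieString, name) {
  for (const pair of cookieString.split('; ')) {
    const eq = pair.indexOf('=');
    if (eq > 0 && decodeURIComponent(pair.slice(0, eq)) === name) {
      return decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return null; // no cookie with that name
}

// In a browser: document.cookie = formatCookie(...); readCookie(document.cookie, 'theme')
const oneDay = new Date(Date.now() + 24 * 60 * 60 * 1000);
console.log(formatCookie('theme', 'dark blue', oneDay, '/'));
console.log(readCookie('lang=en; theme=dark%20blue', 'theme')); // dark blue
```

Encoding the name and value keeps characters such as spaces and semicolons from breaking the name=value; name2=value2 layout.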
Alright, the other ways to exchange state information between the client and the server can possibly be:
1. Through URLs (have encoded strings that mimic cookie name/value pairs)
2. Hidden form variables, and
3. possibly a few other methods.
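The URL approach from the list above can be sketched like this, assuming a query parameter named sid carries the session identifier. The parameter name, the example URL and both helper names are made up for the illustration.

```javascript
// Append a session identifier to a link so that the state travels in the URL.
// The parameter name "sid" is an assumption made for this sketch.
function withSession(link, sessionId) {
  const url = new URL(link);
  url.searchParams.set('sid', sessionId);
  return url.toString();
}

// Read the identifier back from an incoming request URL.
function sessionFrom(link) {
  return new URL(link).searchParams.get('sid');
}

const link = withSession('http://example.com/cart?item=3', 'abc123');
console.log(link);              // http://example.com/cart?item=3&sid=abc123
console.log(sessionFrom(link)); // abc123
```

Every link on every page has to be rewritten this way, and the identifier shows up in bookmarks and shared links, which hints at why cookies ended up being the more convenient mechanism.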
Hmm.. they must have their own difficulties, or cookies turned out to be so much more reliable than these methods that having cookies enabled while browsing is almost indispensable. I pondered on the reason why there is an option to enable/disable cookies. Well, obviously, if a security vulnerability is found and the browser still has to be used, at least we could disable cookies and still read data from the Internet, if not write!
Written by thanix on November 30th, 2007.