Sat, 25 Apr 2009
Explainer: "Why do some URLs have www in them, and what difference does it make?"
Katy (who I know from the CC internship in 2006) asked me this question recently:
Why do different pages show up depending on whether there's a www or not in the URL?
To understand, I have to explain how a browser gets a web page from the Internet. When a browser is asked to load a URL like <a href="http://www.asheesh.org/scribble/enlightened-but-confused.html">http://www.asheesh.org/scribble/enlightened-but-confused.html</a>, it breaks it apart into components.
- "http" is the scheme
- "www.asheesh.org" is the domain name
- "/scribble/enlightened-but-confused.html" is the path
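If you want to see that breakdown for yourself, here is a minimal Python sketch using the standard library's urllib.parse module. The helper name split_url is made up for this example, and a real browser does a lot more than this (ports, query strings, and so on).

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the (scheme, domain name, path) pieces of a URL."""
    parts = urlsplit(url)
    # .hostname is the domain name without any port number.
    return parts.scheme, parts.hostname, parts.path


scheme, domain, path = split_url(
    "http://www.asheesh.org/scribble/enlightened-but-confused.html"
)
print(scheme)  # http
print(domain)  # www.asheesh.org
print(path)    # /scribble/enlightened-but-confused.html
```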
HTTP, the "scheme", tells the browser what protocol (or network language) to speak when it requests the page from the server.
The domain name is where things get interesting. This alone tells the browser who to ask for the page. The browser looks up www.asheesh.org in the domain name system, an Internet phone book service that converts names to numbers (so-called "IP addresses"). Once it knows the IP address for that name, it connects to it and prepares to speak HTTP.
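The phone book lookup can be sketched in Python with socket.getaddrinfo, which asks the operating system's resolver to do the work. The function name lookup_addresses is my own invention here, and returning an empty list for a name that does not resolve is just a choice I made for the sketch.

```python
import socket


def lookup_addresses(hostname):
    """Return the IP addresses the resolver gives for a name, or [] if none."""
    try:
        # Port 80 is the usual port for plain HTTP.
        results = socket.getaddrinfo(hostname, 80, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name is not in the phone book (or the lookup failed).
        return []
    # Each result ends with a socket address whose first item is the IP.
    return sorted({result[4][0] for result in results})
```

Calling lookup_addresses("www.asheesh.org") and lookup_addresses("asheesh.org") performs two separate lookups. Nothing in the domain name system forces the two answers to match.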
The browser connects to that IP address and asks (in the network language of HTTP):
- Hey, I'm trying to get a page from the website called www.asheesh.org.
- Please give me /scribble/enlightened-but-confused.html
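In actual HTTP, those two statements are the "Host" header and the "GET" line. Here is a rough Python sketch of the text a browser might send. The function name build_request is hypothetical, and I've left out the many other headers a real browser includes.

```python
def build_request(hostname, path):
    """Build the bytes of a bare-bones HTTP/1.1 request."""
    lines = [
        f"GET {path} HTTP/1.1",   # "Please give me <path>"
        f"Host: {hostname}",      # "I want the website called <hostname>"
        "Connection: close",      # Ask the server to hang up when done
        "",                       # A blank line ends the headers
        "",
    ]
    # HTTP separates lines with a carriage return plus a newline.
    return "\r\n".join(lines).encode("ascii")


request = build_request(
    "www.asheesh.org", "/scribble/enlightened-but-confused.html"
)
print(request.decode("ascii"))
```

Notice that the domain name travels inside the request itself. Even when two names lead to the same IP address, the server can tell which name you asked for and answer differently.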
So now, let's think about how http://www.asheesh.org/ and http://google.com/ differ: their scheme is the same, and their path is the same, but their domain names are different.
The same is true for http://asheesh.org/ and http://www.asheesh.org/: only the domain name differs. You get the same content because, as luck has it, the same person (me) administers both asheesh.org and www.asheesh.org, and I decided to make them work the same way.
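To make that comparison concrete, here is a small Python sketch that reports which components of two URLs differ. The function name differing_components is made up for illustration, and it only looks at the three components discussed in this post.

```python
from urllib.parse import urlsplit


def differing_components(url_a, url_b):
    """List which of scheme, hostname, and path differ between two URLs."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    names = ("scheme", "hostname", "path")
    return [name for name in names if getattr(a, name) != getattr(b, name)]


print(differing_components("http://www.asheesh.org/", "http://google.com/"))
# ['hostname']
print(differing_components("http://asheesh.org/", "http://www.asheesh.org/"))
# ['hostname']
```

Both pairs come out the same way: as far as the URL is concerned, adding "www" changes the domain name just as much as swapping in a completely different site does.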
For some websites, adding or removing the www does change what you get back: for example, http://cs.rochester.edu/ does not load, whereas http://www.cs.rochester.edu/ does.
So the final answer to Katy's question: You're lucky you ever get the same page for two URLs that are different, even if just by "www".