Field Trip!

You fire up Google Chrome, type google.com into the URL bar at the top, and hit ENTER. Before long, a "webpage" is displayed in your browser.

But what is a "webpage"?

And how did it get here?

1.1 | Servers, Clients, and Protocols

Abstractly speaking, the transaction you just experienced fits the client-server model, where your web browser (e.g. Chrome) is what is referred to as a client and the machine it ultimately reached out to after you hit ENTER is referred to as a server.

Things that happen on the server are said to take place server-side or on the back-end, while things happening on the client are said to take place client-side or on the front-end.

The term "client" in this context refers to software (the browser, in this case) and not to the "user". In fact, the browser is only one type of "client".

The browser is a client because it is making a request, a request to the server. The server, true to its name, then serves a response; that is, it sends a response back to the client. In the case of our request for the "webpage" above, the response is HTML.

HTML is covered in detail later, but for now it might help to know that HTML is where most of the text you see displayed on the page comes from.

It also serves as something of a manifest for any other files that need to be fetched (e.g. images). Once the browser has all of these files, it can render the page to the user. The rendered page is what we perceive as the "webpage" displayed for us in the browser, the sum of the HTML and all of the other assets listed in it.

We'll also get into the anatomy of the request and response, but for right now we're just observing the overall transaction.

Realistically, though, Google does not own a single server that responds to its entire constant stream of requests; instead, Google is a service composed of many, many servers. As you might imagine, there is some added complexity involved because of this, but it doesn't change the fact that our browser's request eventually makes its way to a server that then returns a response.

The physical server machines in this context are typically computers running nothing but software dedicated to processing requests coming in from the Web and responding appropriately. We'll get into the different kinds of software used for this later on, but something that is helpful to know now is that a web server is one software component of this process. Therefore, the term "server" can be used to describe both hardware and software.

As you'd imagine, we're pretty exclusively concerned with the software side of things within this book. Even so, it is eventually important to understand what kinds of physical machines end up running your application, where, in what configuration, and so on.

The term "client" could also refer to devices, but here we're concerned with client software, specifically the browser.

Opposite the server, the browser is the classic example of a client for this context, but there are plenty more.

Take Gmail for example: we can use our browser as a client to the Gmail service by visiting gmail.com, but the native Android and iOS Gmail apps are also clients of the service. The requests they make are different, but they communicate with the same Gmail service/servers.

And speaking of communication, we haven't yet mentioned another requirement here: clients and servers need a means of "talking" to each other. The way they do this is by using a particular protocol over a network. Here the protocol is HTTP and the network is the Internet.

There are actually several other protocols at work behind the scenes (including TCP and IP), but the only protocol we'll be working with directly during the course of this book is HTTP, which we'll talk about in just a little bit.

Since we accessed Google's server over the Internet, we can say that we made a request to a remote server. Conversely, once we start developing, we'll be running our application on our own machines as a local server.

1.2 | The URL and Path, Domains and Ports

When we kicked off our browser field trip earlier, we mentioned the "URL bar", but what is a URL?

URL stands for Uniform Resource Locator and, as you might guess, it acts something like an address for a resource accessed over a network.

Additionally, URLs are a type of URI (Uniform Resource Identifier), but you'll probably hear the two terms used interchangeably at some point. From the level we are operating at, they are functionally the same, but it is worth looking into the difference when you get a chance.

...but seriously don't go around saying "URL and URI mean the same thing" to every web developer you meet, because you'll probably end up getting shouted into oblivion. ...and it won't be my fault, because I warned you.

So let's think about our google.com URL from earlier.

google.com is actually not a full URL and thus was likely expanded to something like the following:

http://www.google.com:80/index.html

index.html? :80? Where did those come from?

We'll cover that in a minute.

This complete URL has the following format:

[protocol]://[subdomain].[domain]:[port]/[file]

To distill this down to natural language at the level we need to understand it, the URL can be read as:

[What means should we use to contact the server?]://[What service/server?]/[For what specific thing?]

Now, we won't need a deep understanding of all of the individual components, but they break down like this:

Term        Meaning
protocol    the means of communication (discussed in the next section)
subdomain   the subdomain name, exactly as it sounds: a domain within the domain
domain      the domain name, used to uniquely identify the location of a site on the Web
port        a specific address on a physical computer
file        a file
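
To see these components pulled apart, here is a rough sketch using Python's standard urllib.parse module on the expanded URL from above. The variable names are my own, and splitting the host name on dots is a simplifying assumption that only works for tidy three-part names like this one.

```python
from urllib.parse import urlsplit

# The expanded URL from the example above.
url = "http://www.google.com:80/index.html"
parts = urlsplit(url)

print(parts.scheme)    # the protocol
print(parts.hostname)  # the subdomain and domain together
print(parts.port)      # the port, as an integer
print(parts.path)      # everything after the domain/port

# Assumption for illustration: the host name has exactly three dot-separated
# labels. Real host names can have more (or fewer), so don't rely on this.
subdomain, domain, tld = parts.hostname.split(".")
print(subdomain, domain, tld)
```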

Discussing all of this is a bit of a rabbit hole, but worthwhile. Feel no obligation to memorize everything that follows.

Since we'll be discussing the protocol in the next section, let's skip down to domain and subdomain.

These two are used to direct traffic from the Internet to a specific location where the proper server machine can be reached. One big difference between the two is that a domain is something you purchase and a subdomain is something you configure.

There are powers that be on the Internet referred to as "domain name registrars" that allow you to register a domain name for a fee. To do this, you need to select a domain name paired with a top-level domain (TLD). In the case of google.com, google is the domain name and com is the TLD.

There are plenty of domain name registrars, but I would recommend Hover. In fact, I used them to register webappfromscratch.com and these8bits.com.

Also, to be even more specific, the domain google.com is actually resolved to an IP address using a system called DNS (Domain Name System). A common example of an IP address is the loopback address 127.0.0.1.
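
If you want to poke at this yourself, the following is a small sketch using Python's standard socket and ipaddress modules. The resolve helper is a name I made up for illustration; looking up a real domain needs network access and the answer changes depending on where and when you ask, so that line is left commented out.

```python
import ipaddress
import socket

def resolve(hostname):
    """Ask the operating system's resolver (which uses DNS) for an IPv4 address."""
    return socket.gethostbyname(hostname)

# 127.0.0.1 is the loopback address: it always points back at this machine.
loopback = ipaddress.ip_address("127.0.0.1")
print(loopback.is_loopback)

# Something that is already an IPv4 address is returned unchanged.
print(resolve("127.0.0.1"))

# Needs network access, and the result varies, so no output is shown here.
# print(resolve("google.com"))
```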

Once you own a domain, you can configure it to have many subdomains. www is quite common as a subdomain, but you'll also see these used for assorted purposes. Wikipedia, for example, uses subdomains for different languages (e.g. en for English, ja for Japanese).

After the domain in the full URL above, we see that 80 is the value for port. This shows up because the protocol we're using (HTTP) uses port 80 when we don't specify something else.

Why ports?

You'll recall that a web server is a program that runs on a computer to respond to HTTP requests, but there may also be other server programs, like an email server, running on that same machine. To make sure that they only receive incoming traffic that was meant for them, each server program will listen on a different port. Traditionally, web servers will listen on port 80 and email servers will listen on port 25.
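
As a rough sketch of that idea, the Python below opens two listening sockets on the same machine. They stand in for a web server and an email server; the open_listener helper is hypothetical, and instead of ports 80 and 25 (which usually need administrator rights) it lets the operating system pick free ports.

```python
import socket

def open_listener():
    """Open a TCP socket that listens on a free port chosen by the OS."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 means "pick any free port"
    listener.listen()
    return listener

# Two stand-ins for a web server and an email server on one machine.
web = open_listener()
mail = open_listener()

web_port = web.getsockname()[1]
mail_port = mail.getsockname()[1]
print(web_port, mail_port)  # same machine, two different ports

web.close()
mail.close()
```

Same address, different ports: that is how traffic for each program is kept apart.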

Why?

That's just the way it is.

And speaking of traditions, if we don't specify a file name in our URL (ours ends in a /), it's common for web servers to assume that you're requesting the directory index file index.html. Our Rails app will behave somewhat differently, but this is the case historically.

In fact, we didn't even have this / in our URL bar earlier. To our browser, google.com means we are asking for google.com/.

So the domain/subdomain route traffic to a specific machine and the port routes traffic to a specific "part" of that machine; they determine the where.

Once there, everything following the domain/port (the / and everything following it) is referred to as the path. The path is used to specify what is being requested.

So while the domain and port stuff is good background knowledge, the path is what we'll be concerned with during the development of our application.

Here are some example URLs and their respective paths:

URL                              Path
http://google.com/               /
http://google.com/index.html     /index.html
http://example.com/chapters/1    /chapters/1

The URL's purpose is to describe where something is absolutely with regard to the great, big Internet, while the purpose of a path is to describe where something is relatively with regard to a given service or server.
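
Here is a small sketch of that distinction using Python's standard urllib.parse module. The path_of helper is my own invention; it treats a missing path as /, which mirrors what the browser does with a bare google.com.

```python
from urllib.parse import urljoin, urlsplit

def path_of(url):
    """Return the path of a URL, treating a missing path as "/"."""
    return urlsplit(url).path or "/"

for url in ("http://google.com",
            "http://google.com/index.html",
            "http://example.com/chapters/1"):
    print(url, "->", path_of(url))

# A path only means something relative to a server. Joining it with a
# base URL turns it back into an absolute URL.
absolute = urljoin("http://example.com/chapters/1", "/chapters/2")
print(absolute)
```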

We'll run into absolute and relative many more times before this book is done.

And even though the last part of the URL is listed as file, note that this may not be as strictly true in modern web applications as it used to be. We'll discuss this later, but the path was originally used to specify directories and files. For example, the path /www/index.html would point to an index.html file in a www directory.

You probably know "directories" as "folders". They're the same thing, but I tend to call them "directories".

With that out of the way, let's circle back to the beginning of our URL: http.

1.3 | HTTP

HTTP is the means by which your client (Chrome) communicated with a server (at Google).

Chrome sent an HTTP request to Google and received an HTTP response containing HTML, which your browser then rendered as the webpage you saw.

Yeah, yeah, what is HTTP?

HTTP stands for Hypertext Transfer Protocol.

The most important thing to notice about "Hypertext Transfer Protocol" is that it is a protocol (hence its presence at the beginning of the URL above as http) and that it is involved in transfer (indicating that something is being sent between two parties).

We'll get to "hypertext" shortly, but for right now, it's helpful to know that the first two letters of "HTML" also stand for "HyperText".

To be clear, HTTP is the protocol your web browser just used to request something from a server over the Internet as well as the means by which it received the response.

So, we made a request, got a response, and now we're staring at this webpage. That it?

Nope. The webpage rendered in your browser is the response body, which, in this world at least, implies that there is a head.

The fully rendered webpage you see is actually not just the result of the response body, but the response body was HTML and it served as a manifest for all the other files that needed to be fetched to complete the page. The sum of all of these files is the webpage you see before you.

Isn't, generally speaking, a "head" a component of a "body"?

Anatomically, yes. Here, no.

But, luckily for you, you're not stuck with a headless HTTP response body in your trunk; you've also got HTTP headers.

Headers

In a programming context, a body will contain the content itself (here, some HTML) and a head will be where its metadata lives. Metadata in this case refers to "data about data".

So what is this metadata? And where can we find it?

Well, finding it is easy. In Chrome you can see it for yourself using Chrome Developer Tools by:

  1. pressing F12
  2. clicking the Network tab
  3. clicking the Documents pill at the bottom
  4. reloading the page
  5. clicking the web request on the left hand side (this will look like a link that reads something like https://www.google.com/)
  6. and finally clicking the Headers tab

Show screenshot.

The following are of particular interest:

Request URL: http://www.google.com/
Request Method: GET
Status Code: 200 OK

Request Headers

Response Headers
  Content-Type: text/html

A whirlwind tour of the headers:

  • Request URL - Hey! That looks familiar. This is the full URL we ended up at.
  • Request Method - Every HTTP request is made using a particular HTTP verb (e.g. GET, POST).
  • Status Code - Every HTTP response comes with a status code indicating how everything went.
  • Response Headers / Content-Type - This describes what the response body is.
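
To make the split between metadata and body concrete, here is a sketch in Python that takes a response apart by hand. The response text is something I wrote for illustration; a real response from Google carries many more headers and a far larger body.

```python
# A hand-written HTTP response, for illustration only.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello!</body></html>"
)

# A blank line separates the metadata from the body.
head, body = raw_response.split("\r\n\r\n", 1)

# The first line of the head is the status line; the rest are headers.
status_line, *header_lines = head.split("\r\n")

headers = {}
for line in header_lines:
    name, value = line.split(":", 1)
    headers[name.strip()] = value.strip()

print(status_line)
print(headers)
print(body)
```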

Now for the breakdown.

Request Method / HTTP Verb

In an HTTP request, the HTTP request method (or HTTP verb) informs the server what the nature of the request is.

Our request above used the GET method, because we were simply "getting" some HTML from the server.

GET and POST are particularly common, but in Rails we will use four:

  • GET
  • POST
  • PUT
  • DELETE

We'll get deeper into these in the section on REST, but for now all we need to know is that every HTTP request uses a particular HTTP request method.
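
For the curious, a request is just text on the wire, and the method is its very first word. The sketch below formats a bare-bones HTTP/1.1 request in Python; build_request is a hypothetical helper, and a real browser would send many more headers than this.

```python
def build_request(method, path, host):
    """Format a bare-bones HTTP/1.1 request as text (illustration only)."""
    return (
        f"{method} {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    )

request = build_request("GET", "/", "www.google.com")
print(request)

# The method is the first word of the first line.
method = request.split(" ", 1)[0]
print(method)
```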

Status Code

HTTP status codes can give you an idea of how the processing of the request went.

For our request of the Google homepage above, we got a status code of 200 which represents OK.

...which means exactly what it sounds like.

There are many HTTP status codes, but they break down like this:

  • 1XX - Informational
  • 2XX - Success - "Here you go!" / "You did it!"
  • 3XX - Redirection - "You requested something, but we're sending you here."
  • 4XX - Client Error - "Your request is no good. No soup for you."
  • 5XX - Server Error - "We messed something up, and good."

One HTTP status code in particular that you're probably familiar with is 404 which represents Not Found.

Just type nonsense in a URL after the first / for any given site and you'll probably get some kind of 404 page in response.
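
Since the first digit carries the meaning, a status code is easy to classify. Here is a short sketch using Python's standard http.HTTPStatus; the CLASSES table and the describe helper are my own names, not part of any library.

```python
from http import HTTPStatus

# The five classes, keyed by the first digit of the status code.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def describe(code):
    """Return a status code with its standard phrase and its class."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"

print(describe(200))  # 200 OK (Success)
print(describe(404))  # 404 Not Found (Client Error)
```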

Some 404 pages try to help get you where you meant to go, while others are useless but awesome, like GitHub's. ...still others are simply useless.

Content-Type / MIME type

The response body is ultimately just a bunch of text, but the browser needs to know how to handle it. In this case, the body is HTML, so the server sets this header value as "text/html" so that the client knows how to appropriately render it.

Chrome has this listed in Dev Tools as Content-Type, as that is how this information is stored in an HTTP header, but this is also referred to as a MIME type.

MIME apparently stands for Multipurpose Internet Mail Extensions ... for some reason. Rails uses the terminology "MIME type" instead of "Content-Type", so we really just need to understand its purpose, rather than the meaning of its seemingly ridiculous (read: stupid) acronym.
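
As a quick illustration of the idea, Python's standard mimetypes module can guess a MIME type from a file name, much as a simple web server might when deciding what to put in Content-Type. The file names below are made up.

```python
import mimetypes

# Hypothetical file names; only the extension matters for the guess.
for filename in ("index.html", "logo.png", "style.css"):
    mime_type, _encoding = mimetypes.guess_type(filename)
    print(filename, "->", mime_type)
```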

HTTP - Takeaways

In conclusion, the server creates the HTTP response, fills in the body, sets the metadata appropriately, and then sends the entire thing back to the client.

In the case of our request to google.com, the server places HTML in the body and populates the header with information about the response, such as a status code and Content-Type, which in this case are 200 and text/html, respectively.
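
To see both sides of that exchange in one place, here is a minimal sketch using only Python's standard library. It is emphatically not how Google (or Rails) does things: HelloHandler and PAGE are names I made up, and the server runs locally on whatever free port the operating system hands out.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello from a local server</h1></body></html>"

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)                        # status code
        self.send_header("Content-Type", "text/html")  # metadata
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)                         # body

    def log_message(self, format, *args):
        pass  # keep the example quiet

# Server side: port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: make a GET request and look at what comes back.
connection = HTTPConnection("127.0.0.1", port)
connection.request("GET", "/")
response = connection.getresponse()
status = response.status
content_type = response.getheader("Content-Type")
body = response.read()
connection.close()

server.shutdown()
server.server_close()

print(status, content_type)
print(body.decode())
```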

One last thing to note is that the "response body" is often simply referred to as "the response". I've been trying to expressly call it the "response body" above, but "response" works just as well.

Oh, and HTTP is stateless

Totally. ...what?

This is getting harder to demonstrate as it seems every website lets you sign up somehow now (which, in a way, is the reason we're even here talking about this), but imagine that I'm making an HTTP request to an ancient form of Wikipedia that isn't a web application, but rather all of its articles are regular old files.

...and imagine my 56k modem is crankin' away.

Let's say I see a link for the Wikipedia entry for 5by5 Studios as the result of a Google search and follow it. Chrome will have to go get it for me.

Server: Hello; It's a pleasure to meet you! Here's that page you asked for.

Chrome: Thanks, dude.

Chrome then renders the page and displays it to me. I read some, then click on the link to the entry for Dan Benjamin. Chrome now has to go back to the server to get that for me too.

Server: Hello; It's a pleasure to meet you! Here's that page you asked for. You know, that guy started 5by5 Studios and there was just a guy a minute ago...

Chrome, walking away: Yeah, thanks...

Me: "What's up Chrome?"

Chrome: I'm pretty sure that server didn't remember me from earlier.

Awkward.

More awkward than personifying Chrome? Only you can decide.

This "problem" stems from the fact that HTTP is stateless, meaning that each HTTP transaction is separate from every other and nothing is preserved.

Put simply, the server doesn't know you're you. It doesn't know if you were there before or what you've done.

Stateless HTTP? How could they let this happen?

Well, HTTP was originally designed, more or less, just to access text over the Internet.

Well then, if HTTP is stateless, how do we log in to anything? Services like Twitter, Facebook, and Google keep track of who we are and what we've done, right?

They do. And they do it with cookies.

1.4 | Cookies

HTTP cookies help us make HTTP act statefully.

Put simply, servers remember who we are because our browsers send cookies along with their requests.

But what is a cookie?

A cookie is a small amount of data returned from a server that is kept locally in the client and is sent along in subsequent requests to that domain.

A cookie has a name and a value.

It will also likely have a domain, a path, and an expiration date, and might have a few other fields.
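
Here is a rough sketch of those fields using Python's standard http.cookies module. The cookie name visited and the domain example.com are placeholders of mine.

```python
from http.cookies import SimpleCookie

# Server side: build a cookie to send back in a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["visited"] = "yes"                    # name and value
outgoing["visited"]["path"] = "/"              # optional fields
outgoing["visited"]["domain"] = "example.com"
print(outgoing.output())

# Client side: on later requests the browser sends the name and value back
# in a Cookie request header, which the server can then parse.
incoming = SimpleCookie("visited=yes")
print(incoming["visited"].value)
```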

Cookies can be used for many purposes, such as keeping track of simple actions like whether or not a user has visited a site before or clicked a particular button, but perhaps one of the most useful applications of the HTTP cookie is as a session cookie.

Session Cookies

If you consider a web application account of yours as locked, then the session cookie is your key.

The session cookie in Rails is often referred to simply as "the session".

Basically, when you authenticate (or "log in", i.e. prove that you are you) with a server, you are then handed back a new session cookie in its response to your login. The server also keeps track of the value of the cookie on its end, so that when you visit again it can map that to the user account that is, ultimately, you.

At this point we're not really equipped to fully understand what all happens with a session cookie, but it typically contains a very long and jumbled looking series of characters that uniquely identifies you when you revisit the site that the session is valid for.
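
A toy version of that bookkeeping might look like the Python below. Everything here is an assumption made for illustration: the sessions dictionary stands in for wherever a real server keeps this information, log_in and current_user are made-up helpers, and no actual password checking happens.

```python
import secrets

# In-memory stand-in for wherever a real server keeps its sessions.
sessions = {}

def log_in(username):
    """Create a session for an already-authenticated user; return its ID."""
    session_id = secrets.token_urlsafe(32)  # long, random, hard to guess
    sessions[session_id] = username
    return session_id

def current_user(session_id):
    """Map the session ID from a cookie back to a user, if it is known."""
    return sessions.get(session_id)

cookie_value = log_in("alice")
print(cookie_value)                   # different on every run
print(current_user(cookie_value))     # alice
print(current_user("made-up-value"))  # None
```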

Let's take a look anyway!

Poking around in things I don't fully understand comprises a fair amount of the joy I take in programming.

Head back to Chrome and visit a site that you're logged into, like Gmail or a comparable email service.

  1. visit your inbox
  2. hit F12
  3. click the Resources tab (called Application in newer versions of Chrome)
  4. click the right-facing triangle/arrow in front of Cookies
  5. I see a cookie for the domain mail.google.com, so I'm going to click that.

Holy crack! There's a lot of stuff in here.

In my case, I visited Gmail and see that I have a cookie with the name SID which very likely stands for "Session ID". Furthermore, its value looks like ASCII vomit, so that's probably it.

ASCII stands for "American Standard Code for Information Interchange".

When I say "ASCII vomit", what I'm trying to articulate is that the value of the session cookie looks something like this: DKKKAAQEEE__E_E__Eas234kjgl4534jgf

...which I read in my head as what I imagine trying to exhale quickly while shout-reciting the alphabet would sound like.

The reason this value is so long and random is to make it impossible (or, in some cases, just harder) for hackers to break into your account by brute force.
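
Some back-of-the-envelope arithmetic shows why. The numbers below are assumptions, not measurements: I'm guessing at a 64-character alphabet (letters, digits, and a couple of symbols), using the 34-character length of the made-up value above, and giving the attacker a generous billion guesses per second.

```python
import math

ALPHABET_SIZE = 64                  # assumed: letters, digits, two symbols
LENGTH = 34                         # length of the made-up value above
GUESSES_PER_SECOND = 1_000_000_000  # assumed, and generous to the attacker

possible_values = ALPHABET_SIZE ** LENGTH
seconds = possible_values / GUESSES_PER_SECOND
years = seconds / (60 * 60 * 24 * 365)

print(f"about 2^{math.log2(possible_values):.0f} possible values")
print(f"about {years:.1e} years to try them all")
```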

Moving On

So now that we know who's sending things and how, let's look at what we're receiving in these responses.