Basics of telnet and HTTP

Say you want to request a webpage…  Normally, one would use a web browser, right?  But sometimes you just need to see what is really going on…  In this blog post I will show the basics of using the telnet command to work with the HTTP protocol.

For reference: http://www.w3.org/Protocols/rfc2616/rfc2616.html

Most of these commands were run on Linux, but telnet on Windows should work too.

telnet <ip-or-host> <port>

Background…

If you are using the HTTP protocol, which is port 80, then you must follow the HTTP protocol conventions (which are simple).  HTTP has two primary versions at this point: 1.0 and 1.1.

In the HTTP 1.0 days, a single website was bound to a single IP address.  What this means is that an HTTP request sent to a given IP address would return content from only one site.  This is quite limiting and inconvenient.  To have to assign a new IP for every different domain name… What a bother.  Not to mention that the current internet protocol standard, IPv4, is limited to several billion addresses and quickly running out.

More recently, HTTP 1.1 has become the standard.  This enables something called Name Based Virtual Hosting.  By requiring a “Host” header to be sent along with the request, HTTP servers can in turn “look up” the correct website and return it based on the name.  Hundreds or even thousands of different domains can now be hosted on a single IP address.

(keep in mind that SSL certificates each require a seperate IP address.  Due to encryption issues, the IP address is needed to determine which SSL certificate to use…)

So with that introduction, allow me to show you the basics of HTTP…

Using HTTP over Telnet

The telnet utility is a simple (but useful) utility that allows one to establish connections to a remote server.  From my perspective, it is most useful with plain text protocols (like HTTP), but my knowledge of telnet is not very deep…

Here is an example (commands you would type are in red):

[jason@neon ~]$ telnet gahooa.com 80
Trying 74.220.208.72…
Connected to gahooa.com (74.220.208.72).
Escape character is ‘^]’.
GET /       <press enter>
<html>
   <body>
      Hi, you have reached Gahooa!
   </body>
</html>
Connection closed by foreign host.

Because it was an HTTP 1.0 request, the server DID NOT wait for additional headers.  Again, quite limiting – only sending one header line.

And… HTTP 1.1

Here is an example of an Apache Virtual Host configuration directive.

<VirtualHost 74.220.208.72:80>
   # Defines the main name by which this VirtualHost responds to
   ServerName gahooa.com

   # Additional names (space delimited) which this VirtualHost will respond to.
   ServerAlias www.gahooa.com 

   # Apache will append the requested URI to this path in order to find the resource to serve.
   DocumentRoot /home/gahooa/sites/gahooa.com/docroot

</VirtualHost>

When we issue the following HTTP 1.1 request, we are in effect asking for the file at:

/home/gahooa/sites/gahooa.com/docroot/index.html

Keep in mind that because this is HTTP 1.1, the web server will continue to accept header lines until it encounters a blank line:
A blank line…

[jason@neon ~]$ telnet gahooa.com 80
Trying 74.220.208.72…
Connected to gahooa.com (74.220.208.72).
Escape character is ‘^]’.
GET /index.html HTTP/1.1       <press enter>
Host: www.gahooa.com           <press enter>
                               <press enter again>
HTTP/1.1 200 OK
Date: Wed, 03 Sep 2008 21:00:46 GMT
Server: Apache/2.2.9 (Unix)
Transfer-Encoding: chunked
Content-Type: text/html
                               <take note of blank line here>
<html>
   <body>
      Hi, you have reached Gahooa!
   </body>
</html>
Connection closed by foreign host.

A couple notes:

  • HTTP 1.1 continues to accept header lines until it recieves a blank line
  • HTTP 1.1 sends a number of header lines in the response.  Then a blank line.  Then the response content.

Redirects

One of the main points of writing this article was to describe how to debug strange redirect problems.   Redirects are done by sending a “Location” header in the response.  For more information on the Location header, please see http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.30

[jason@neon ~]$ telnet gahooa.com 80
Trying 74.220.208.72…
Connected to gahooa.com (74.220.208.72).
Escape character is ‘^]’.
GET /test-redirect.php HTTP/1.1 <press enter>
Host: www.gahooa.com            <press enter>
                                <press enter again>
HTTP/1.1 200 OK
Date: Wed, 03 Sep 2008 21:00:46 GMT
Server: Apache/2.2.9 (Unix)
Transfer-Encoding: chunked
Content-Type: text/html
Location: http://www.google.com <take note of this line>

The Location header in the response instructs the requestor to re-request the resource, but from the URI specified in the Location header.  In the above example, if you were debugging redirect issues, you would simply initiate another HTTP request to  http://www.google.com

Python instead of telnet

Finally, I’d like to illustrate a really simple python program that would facilitate playing around with the same:

import socket
S = socket.socket(socket.AF_INET)
S.connect(("www.gahooa.com", 80))

S.send("GET / HTTP/1.1\r\n")
S.send("Host: www.gahooa.com\r\n")
S.send("\r\n")

print S.recv(1000)

S.close()

Conclusion

When you are not familiar with protocols such as HTTP, understanding “how things work” can be daunting.  But like many technologies out there, they really are simple (once understood).

The more truth and understanding you can fit into your perspective, the better you will be able to make informed decisions.

Gahooa!

3 thoughts on “Basics of telnet and HTTP

Leave a comment