Friday, October 23, 2009

The Hypertext Transfer Protocol










Most of the time, a browser uses HTTP to communicate with a Web server. In some cases the browser uses File Transfer Protocol (FTP) and communicates with an FTP server, but overall, HTTP carries the bulk of Web traffic. Most people wouldn't recognize HTTP if it bit them, and there's really nothing wrong with that. Even as a Java programmer, you don't need to know the specifics of HTTP, because the URL and URLConnection classes handle communications on the client side, and servlets and JSP handle the server side of things.


However, it's almost always better to know more than you absolutely need to, especially in the computer industry. That little extra bit of understanding can help you diagnose problems faster, understand the implications of architectural decisions better, and use all the capabilities a system has to offer.


An HTTP connection is a simple network socket connection. The Web server usually listens for incoming connections on port 80. After the connection is established, the browser sends a few lines of text: a request line indicating which Web page it wants to see, followed by request headers that tell the Web server about the client, such as the kind of browser making the request (Netscape, Internet Explorer, Opera, and so on), the user's preferred language, and the kinds of data the browser accepts.


Under HTTP 1.0, the only part of the request that is required is the first line, which tells the server what file the browser wants. The rest is optional. Each line in the request is human-readable text terminated by a carriage return and line feed (CRLF), although most servers also accept a bare newline. The request header ends with a blank line. The protocol is so simple that you can even use the telnet command to interact manually with a Web server.


For example, you can view the "Hello World" JSP from Hour 1, "Getting Started with JavaServer Pages," by telnetting to port 80 on your Web server (or whatever port your Web server is running on, such as Tomcat's default of 8080), entering GET path HTTP/1.0, and pressing Enter twice. The path in an HTTP request is the portion of the URL that comes after the hostname and port, and it includes the leading /. Thus, if you access HelloWorld.jsp with http://localhost:8080/HelloWorld.jsp, you enter /HelloWorld.jsp as the path.


Figure 4.1 shows a Telnet session requesting and receiving the HelloWorld JSP.


Figure 4.1. You can interact with a Web server directly by using the telnet command.
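
If you would rather script the Telnet session than type it, the same exchange can be sketched in a few lines of Java. This is only an illustration, not code from the book: the class name TelnetStyleGet and its methods are made up for this example, and it assumes Java 9 or later (for readAllBytes) and a server that closes the connection after an HTTP/1.0 response.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and methods are illustrative only.
final class TelnetStyleGet {

    // Returns the part of the URL that goes on the request line:
    // everything after the host and port, including the leading slash.
    static String requestPath(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // a bare http://host URL asks for the root
        }
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    // Builds the text you would type into telnet: the request line
    // followed by a blank line (the second press of Enter).
    static String buildRequest(String url) {
        return "GET " + requestPath(url) + " HTTP/1.0\r\n\r\n";
    }

    // Sends the request and returns everything the server sends back
    // before it closes the connection.
    static String fetch(String url) throws Exception {
        URI uri = URI.create(url);
        int port = uri.getPort() == -1 ? 80 : uri.getPort();
        try (Socket sock = new Socket(uri.getHost(), port)) {
            OutputStream out = sock.getOutputStream();
            out.write(buildRequest(url).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = sock.getInputStream();
            return new String(in.readAllBytes(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java TelnetStyleGet <url>");
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```

Running it with http://localhost:8080/HelloWorld.jsp as the argument should print the status line, the response headers, and the page, much like the Telnet window does, assuming a server is actually listening there.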


Why Can't I See What I'm Typing?

You might need to turn on local echo in your Telnet window to see what you are typing. Also, if you make any typing errors, don't be surprised if the Web server doesn't understand the Backspace key and complains that you sent it a garbled command.

Turning local echo on or off varies depending on the client you use. If you are using the Microsoft Telnet Client, you can type "set LOCALECHO" at the telnet command prompt. If you have already connected to a host, you'll need to move to the command prompt by pressing Ctrl+]. To leave the command prompt and return to your session, press Enter on an empty line.

Many command line telnet clients allow you to turn local echo on by typing "toggle echo" at the command prompt. If that doesn't work, consult your documentation.



Notice that the request to the server said that it was using version 1.0 of HTTP (the HTTP/1.0 at the end of the GET request), and that the server responded with version 1.1 of HTTP. HTTP version 1.1 adds a number of options that can optimize Web access and enable the browser to retrieve multiple pages over a single connection.


Under HTTP 1.1, an additional line must be present in each request: a Host header that names the host you are accessing. This is important for virtual hosting, in which a single Web server (often at a single IP address) answers for many hostnames and needs the Host header to tell which site the request is for.


Figure 4.2 shows a Telnet session that again fetches the HelloWorld.jsp file, this time using HTTP 1.1.


Figure 4.2. HTTP 1.1 requires you to specify a hostname in the request.
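
To make the extra line concrete, here is a rough sketch of how a program might assemble a minimal HTTP 1.1 request. The class name Http11Request and its build method are hypothetical, chosen only for this example. The one detail not shown in the figure is that the Host value also carries the port when the server is not on port 80.

```java
// Hypothetical helper; not part of any library.
final class Http11Request {

    // Builds a minimal HTTP 1.1 GET request: request line, Host header,
    // then the blank line that ends the request header.
    static String build(String host, int port, String path) {
        // The Host value includes the port unless it is the default of 80
        String hostValue = (port == 80) ? host : host + ":" + port;
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + hostValue + "\r\n"
             + "\r\n";
    }

    public static void main(String[] args) {
        // Prints the request for the HelloWorld JSP on a local Tomcat
        System.out.print(build("localhost", 8080, "/HelloWorld.jsp"));
    }
}
```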


Notice that the Web server did not automatically close down the connection as it did when you used HTTP/1.0. One of the optimizations of HTTP 1.1 is that a browser can use the same connection to make multiple requests. Setting up a connection is a time-consuming process, so leaving a connection open can be a real time-saver. You can force the server to close the connection by specifying Connection: close in the request. Figure 4.3 shows a Telnet session that asks the server to close the connection.


Figure 4.3. If you want the server to automatically close the connection in HTTP 1.1, you must explicitly say so.
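
The rule the three Telnet sessions demonstrate can be written down as a small decision function. The sketch below is an assumption-laden simplification: ConnectionPolicy and staysOpen are invented names, and the keep-alive case for HTTP 1.0 is a widely supported extension that this hour does not cover.

```java
// Hypothetical helper that mirrors the behavior seen in Figures 4.1 to 4.3.
final class ConnectionPolicy {

    // version is "HTTP/1.0" or "HTTP/1.1"; connectionHeader is the value
    // of the request's Connection header, or null if none was sent.
    static boolean staysOpen(String version, String connectionHeader) {
        String token = (connectionHeader == null) ? "" : connectionHeader.trim();
        if ("HTTP/1.1".equals(version)) {
            // HTTP 1.1 keeps the connection open unless told to close it
            return !token.equalsIgnoreCase("close");
        }
        // HTTP 1.0 closes by default; keep-alive is an extension
        // (an assumption here, not something shown in the figures)
        return token.equalsIgnoreCase("keep-alive");
    }

    public static void main(String[] args) {
        System.out.println(staysOpen("HTTP/1.0", null));    // Figure 4.1
        System.out.println(staysOpen("HTTP/1.1", null));    // Figure 4.2
        System.out.println(staysOpen("HTTP/1.1", "close")); // Figure 4.3
    }
}
```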


Viewing the Request Headers Made by a Browser


A browser sends quite a bit more information than the minimum. The request object has methods that enable you to retrieve all the header values the browser sends. Listing 4.1 shows a JSP file that displays all the headers sent to it.


Listing 4.1 Source Code for DumpHeaders.jsp



<html>
<body>
<pre>
<%
    // Walk every header name the browser sent
    java.util.Enumeration e = request.getHeaderNames();

    while (e.hasMoreElements())
    {
        String headerName = (String) e.nextElement();
        out.print(headerName + ": ");

        // A header can appear more than once, so fetch all of its values
        java.util.Enumeration h = request.getHeaders(headerName);

        while (h.hasMoreElements())
        {
            String header = (String) h.nextElement();
            out.print(header);
            // Separate multiple values with commas
            if (h.hasMoreElements()) out.print(", ");
        }
        out.println();
    }
%>
</pre>
</body>
</html>

Figure 4.4 shows the headers sent to DumpHeaders.jsp.


Figure 4.4. A JSP or servlet can examine all the request headers.


How can you be sure that you are really seeing all the header values? Because HTTP works over a simple socket connection, you can create a program that accepts an incoming connection and dumps out anything sent to it.


Listing 4.2 shows the Dumper.java program you can use to verify that you are seeing all the header values.


Listing 4.2 Source Code for Dumper.java



import java.net.*;
import java.io.*;

public class Dumper
{
    public static void main(String[] args)
    {
        try {
            // Default port; override with -Dport=NNNN on the java command line
            int portNumber = 1234;
            try {
                portNumber = Integer.parseInt(System.getProperty("port"));
            } catch (Exception e) {
                // Property missing or not a number: keep the default port
            }
            ServerSocket serv = new ServerSocket(portNumber);

            // Accept connections forever, one at a time
            for (;;) {
                Socket sock = serv.accept();
                InputStream inStream = sock.getInputStream();
                int ch;
                // Echo every byte the client sends until it closes the connection
                while ((ch = inStream.read()) >= 0) {
                    System.out.print((char) ch);
                }
                sock.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Figure 4.5 shows the output from the Dumper program. When you run Dumper on your local machine, point the browser to the URL http://localhost:1234. Because the Dumper program doesn't understand HTTP and doesn't know to shut down the connection, you need to either click the Stop button on your browser or terminate the Dumper program.


Figure 4.5. You can use a simple socket program to view the headers sent by a browser.
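
If you want to compare Dumper's raw output with what DumpHeaders.jsp reports, it can help to split the raw text into name and value pairs. The following is a minimal sketch rather than a full HTTP parser: RawRequestParser is a made-up name, header names are matched case-sensitively (real HTTP header names are not case-sensitive), and repeated headers are joined with commas the same way DumpHeaders.jsp prints them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reading the text that Dumper prints.
final class RawRequestParser {

    // Returns the headers in the order they were sent. Parsing stops at
    // the blank line that ends the request header.
    static Map<String, String> parseHeaders(String raw) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = raw.split("\r?\n");
        // lines[0] is the request line (for example, GET / HTTP/1.1)
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                break; // blank line: end of the request header
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a "Name: value" line, so skip it
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            // Join repeated headers with ", " as DumpHeaders.jsp does
            headers.merge(name, value, (a, b) -> a + ", " + b);
        }
        return headers;
    }

    public static void main(String[] args) {
        String raw = "GET / HTTP/1.1\r\n"
                   + "Host: localhost:1234\r\n"
                   + "Accept-Language: en-us\r\n"
                   + "\r\n";
        for (Map.Entry<String, String> entry : parseHeaders(raw).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```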


In Case of Trouble

If you are having trouble accessing the Dumper program, see the "Q&A" at the end of this hour.





