HTTP(S) URLs & Context Roots

Introduction

What is a URL? What is a context root? It’s helpful to understand these things when writing or working with web applications.

Consider the following:

https://www.thinkmiddleware.com:443/subjectApp/getSubject.do

Details

This entire string is considered a URL, a Uniform Resource Locator. It is described in <a href="http://www.ietf.org/rfc/rfc1738.txt">RFC 1738</a>.

The first part of this string, “https”, is the protocol. In this case, the protocol is HTTP over TLS (the successor to SSL). This protocol is defined in <a href="http://www.ietf.org/rfc/rfc2818.txt">RFC 2818</a>. There are many other protocols, such as http, ftp, and file. The RFCs refer to protocols as “schemes”.

In an HTTP or HTTPS URL, the “://” separating the scheme from the hostname is always present.

Next, we have a fully qualified hostname, www.thinkmiddleware.com. This is a valid DNS hostname as described in these <a href="http://www.dns.net/dnsrd/rfc/">RFCs</a>. This could also be an IP address or a hostname that can be resolved with a local hosts file. In this example, thinkmiddleware.com is the domain name and “www” is the host (machine) name or alias.

The “:” separating the host and port is required whenever a port number is given. If the port is omitted, the “:” is normally omitted as well.

After the hostname comes a port number. This is optional. The protocol or scheme defines a default port: 80 for HTTP and 443 for HTTPS. For example, if “:443” were omitted from the URL above, a browser would still open a connection to port 443. If a non-standard port is used, then the port number must be specified explicitly, in the same position it occupies in this example.
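The default-port rule can be sketched in a few lines of Python. This is only an illustration: the `DEFAULT_PORTS` table and the `effective_port` helper are names made up for this example, and the table covers just the two schemes this article is concerned with.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two schemes discussed here
# (illustrative table, not an exhaustive registry).
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the explicit port if the URL has one, else the scheme's default.

    Returns None for schemes that are not in DEFAULT_PORTS.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


# No port in the URL: fall back to the HTTPS default.
print(effective_port("https://www.thinkmiddleware.com/subjectApp/getSubject.do"))

# A non-standard port must be written out, and is used as given.
print(effective_port("https://www.thinkmiddleware.com:8443/subjectApp/getSubject.do"))
```

The first call falls back to 443 because the scheme is https, while the second returns 8443 because the port is spelled out in the URL.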

Everything that remains after the “/” following the port is called the path or url-path. The definition of the URL path is specific to the scheme or protocol being used. This article is primarily concerned with HTTP or HTTPS.
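To see these pieces side by side, here is a minimal sketch that splits the example URL with Python's standard `urllib.parse` module. Python is used purely for illustration; the article itself is not tied to any one language.

```python
from urllib.parse import urlsplit

# The example URL from this article.
url = "https://www.thinkmiddleware.com:443/subjectApp/getSubject.do"

parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.hostname)  # www.thinkmiddleware.com
print(parts.port)      # 443
print(parts.path)      # /subjectApp/getSubject.do
```

Note that `urlsplit` reports the path with its leading “/”, and returns the port as an integer (or `None` when the URL does not specify one).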

In J2EE terms, a url-path is of the form:

  • context-root/application-specific-path

The context-root maps to the web application’s document root directory. This is the top-level directory of a WAR file (the directory containing WEB-INF).

The context-root is defined in the context-root element of an EAR file’s application.xml, or in an application-server-specific way.

The application-specific-path is relative to the WAR file document root or is a servlet path defined in a web application’s web.xml file.
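As a rough illustration of this split, the sketch below separates the example url-path into its two parts. It assumes the context root is exactly the first path segment, which is a simplification: a real application server matches the path against the context roots of its deployed applications, and a context root may be empty or span more than one segment. The `split_url_path` helper is a hypothetical name used only here.

```python
def split_url_path(path):
    """Split '/context-root/rest' into ('/context-root', '/rest').

    Assumption for this sketch: the context root is the first path segment.
    """
    first, _, rest = path.lstrip("/").partition("/")
    return "/" + first, "/" + rest


context_root, app_path = split_url_path("/subjectApp/getSubject.do")

print(context_root)  # /subjectApp
print(app_path)      # /getSubject.do
```

In servlet code, the container already performs this split; the context root is available from `HttpServletRequest.getContextPath()`, so application code rarely needs to parse the path by hand.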

GET Request

If a GET request is being made, the URL can also carry a query string: a ‘?’ appended to the URL, followed by key=value pairs separated by ‘&’.

Everything before the ‘?’ is a URL as we’ve described it so far. Everything after the question mark is a series of key-value pairs that will be available to application code on the server side.
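A short sketch of how the two halves come apart, again using Python's standard `urllib.parse`. The query parameters `id` and `format` are invented for this example and do not come from any real application.

```python
from urllib.parse import urlsplit, parse_qs

# The article's example URL with a hypothetical query string appended.
url = "https://www.thinkmiddleware.com:443/subjectApp/getSubject.do?id=42&format=xml"

parts = urlsplit(url)

# Everything before the '?' is parsed as before.
print(parts.path)   # /subjectApp/getSubject.do

# Everything after the '?' is the raw query string.
print(parts.query)  # id=42&format=xml

# parse_qs maps each key to a list, since a key may appear more than once.
params = parse_qs(parts.query)
print(params)       # {'id': ['42'], 'format': ['xml']}
```

On the server side of a J2EE application, the same values would typically be read with `request.getParameter("id")` rather than parsed by hand.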