Networking
Q . What's the difference between a URL instance and a URLConnection instance?

Ans : 

A URL instance represents the location of a resource, and a URLConnection instance represents a link for accessing or communicating with the resource at that location.

     The URL class provides an abstraction of a Uniform Resource Locator (URL), the World Wide Web's basic type of pointer. A URL specifies where and how (by which protocol) to reach a resource; it does not specify the contents at that location.
     The URLConnection represents a connection to the resource specified by a URL. It provides general connection support both for the well-known protocols such as http and for custom protocols that you might create. You can use a URLConnection instance to inspect and set properties of the connection (e.g., whether the connection can be used for output in addition to input), to get information from the URL (e.g., content length and header fields), and to get input and output streams for moving data through the connection.
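     As a rough illustration of the "location only" role of a URL, the sketch below parses a URL string and reports its parts without touching the network. The class name UrlParts and the sample address are assumptions made for this example only.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlParts {
    /**
     * Returns "protocol|host|port|file" for the given URL string.
     * Parsing a URL does not contact the host; a port of -1 means
     * that the URL string did not name a port explicitly.
     */
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return url.getProtocol() + "|" + url.getHost() + "|"
                    + url.getPort() + "|" + url.getFile();
        } catch (MalformedURLException e) {
            return "malformed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("http://java.sun.com/index.html"));
    }
}
```

     Nothing here says anything about the contents of the resource; for that you need a URLConnection (or a stream obtained from one).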

Q . How do I make a connection to a URL?

Ans : 

You obtain a URL instance and then invoke openConnection on it.

     URLConnection is an abstract class, which means you cannot directly create instances of it using a constructor. Nor would you want to, because the type of connection you need depends on the protocol specified in the URL. The URL class's openConnection method manages these details for you. When you invoke openConnection on a URL instance, you automatically get the right kind of connection (subclass of URLConnection) for your URL.
     When you create a URL instance or invoke openConnection, remember to handle the exceptions that might be thrown (if not, the compiler will remind you):

     URL url;
     URLConnection connection;
     try {
         url = new URL (...);
         connection = url.openConnection();
     } catch(MalformedURLException e) {
         // ... handle exception from URL constructor
     } catch(IOException e) {
         // ... handle exception from URL.openConnection
     }
     Obtaining a URLConnection instance is merely the first step. To inspect or communicate with the resource at the other end, you need to set up input or output streams for the connection.
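     To see the protocol-specific dispatch at work, the following sketch asks two URLs for connections and checks which kind comes back. The class name ConnectionKinds and both sample URLs are assumptions for illustration; no data is transferred, because openConnection only creates the connection object.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

class ConnectionKinds {
    /**
     * Creates a connection object for the URL and reports whether it is
     * an HTTP connection. No network traffic happens here: the actual
     * connection is made later, by connect() or getInputStream().
     */
    static boolean isHttpConnection(String spec) {
        try {
            URLConnection connection = new URL(spec).openConnection();
            return connection instanceof HttpURLConnection;
        } catch (IOException e) {
            // MalformedURLException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isHttpConnection("http://java.sun.com/"));
        System.out.println(isHttpConnection("file:///tmp/example.txt"));
    }
}
```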
Q . How do I read from a remote file if I have its URL?

Ans :  

Get a URL connection from the URL, get an input stream from the connection, and then read from that stream according to the type of data you expect.

     A URLConnection instance manages the connection between your program and a URL, but it delegates much of the actual work to other objects. For example, you do not directly send data to or receive data from a URLConnection instance. Instead, you ask the connection for an input stream or output stream and then transfer data through that stream. To obtain finer control over the data flow, you can wrap a stream filter (an instance of a subclass of FilterInputStream or FilterOutputStream) around the basic input or output stream.
     Below are the typical steps for reading from a file:

  1. Create a URL instance that points to the file you want to read.
  2. Invoke openConnection on that URL instance.
  3. Invoke getInputStream() to get an InputStream object from the connection.
  4. Wrap an instance of an appropriate FilterInputStream subclass around the basic input stream and read from it.
  5. Close the InputStream.
For convenience, URL's openStream method combines steps 2 and 3. The code fragment below exemplifies the process:
     /* using JDK 1.0.2: */
     String urlString = "http://java.sun.com/";
     String currentLine;
     DataInputStream inStream = null;
     try {
         URL url = new URL(urlString);
         URLConnection connection = url.openConnection();
         inStream = new DataInputStream(connection.getInputStream());
         while (null != (currentLine = inStream.readLine())) {
             System.out.println(currentLine);
         }
     } catch (MalformedURLException e) {
         /* ... handle a badly formed URL string */
     } catch (IOException e) {
         /* ... handle a failed connection or read */
     } finally {
         /* Close the stream even if reading failed part way through. */
         if (inStream != null) {
             try {
                 inStream.close();
             } catch (IOException e) { /* ... */ }
         }
     }
     To write equivalent code using the JDK 1.1, perform the standard conversion from byte-oriented input streams to character-oriented readers.
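     One possible shape of that conversion is sketched below: the DataInputStream is replaced by an InputStreamReader wrapped in a BufferedReader. The class name UrlLineReader, the choice of UTF-8 as the character encoding, and the use of later language features (generics, try-with-resources) are assumptions of this sketch, not part of the original JDK 1.1 recipe.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

class UrlLineReader {
    /** Reads all lines from a byte stream, decoding the bytes as UTF-8 (an assumption). */
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // The reader (and the stream beneath it) is closed automatically.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Convenience form: opens the URL's stream and reads it line by line. */
    static List<String> readLines(URL url) throws IOException {
        return readLines(url.openStream());
    }

    public static void main(String[] args) throws IOException {
        for (String line : readLines(new URL("http://java.sun.com/"))) {
            System.out.println(line);
        }
    }
}
```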

      Note: In an applet, similar code could read data only from the host that originally delivered the applet code. In general, applets loaded over the net can make network connections only back to the host they were loaded from.

Q . Why do I get a null result when I use the getHeader... methods in the URLConnection class?

Ans :

There are two main sources of null results from getHeader... methods: you requested a specific header field that doesn't exist for the present connection, or you used a getHeader... method that is not fully implemented in the JDK 1.0.2 (in other words, a bug).

     The URLConnection class provides two different ways to request header information: by specific header type and by position in the overall header list. The general way to request the value of a specific header field is to ask for it by name, using the getHeaderField(String) method:

     public String getHeaderField(String name)
If the named header field exists, this method returns a String instance representing the field's value; otherwise the method returns null. For convenience, the URLConnection class also provides methods to access standard header fields and return digested numerical values where appropriate:
  • public String getContentEncoding()
  • public int getContentLength()
  • public String getContentType()
  • public long getDate()
  • public long getExpiration()
     The second approach is to retrieve the header fields by position rather than by name. For this you use the pair of methods that index the header fields starting with one (not zero):
  • public String getHeaderFieldKey(int index)
  • public String getHeaderField(int index)
For example, the following method iterates through all the header fields for the given URLConnection instance:
     public void printHeaders(URLConnection connection) {
         for (int i = 1; true; ++i) {
             String headerKey = connection.getHeaderFieldKey(i);
             if (headerKey == null) {
                 break;
             }
             System.out.println("    Header " + i + ":  "
                                + headerKey + ": "
                                + connection.getHeaderField(i));
         }
     }
In the JDK 1.0.2, the two methods for retrieving headers by index always return null—this is a bug, not a feature. The JDK 1.1 has fixed this bug, as you can verify by running the GetHeaderExample sample code within the JDK 1.1 virtual machine.
Q . What is URLConnection's getOutputStream method intended to work with on the server side?

Ans :  

The output stream you get from a URL connection usually hooks up with an http-related process on the server side, such as a CGI script.

     Intermachine (network) communication requires cooperating processes at the two ends. With a URL connection, one end is a Java Virtual Machine running your application or applet, and the other end is some server process—using http, ftp, or some other protocol specified in your URL. Reading from a URL connection's input stream is the common and simple case. A general-purpose server process, such as a web or ftp server, can locate and send back a copy of the resource you request.
     Writing to a URL connection's output stream, however, is more restricted, because of the actions required on the server side. Http servers, for example, can't simply create or write to arbitrary files specified in a URL. They must use special processes that have been configured explicitly to accept input across the net. When you send data through a URL connection, you will most likely be sending it to a CGI (Common Gateway Interface) process. (A telltale "cgi-bin" in a URL usually gives away that you are communicating with a CGI process.)
     Note: The default behavior for URL connections is to disallow output streams. The current JDK implementations (1.0.2 and 1.1) define getOutputStream in the URLConnection class to throw an UnknownServiceException:

     /* In URLConnection.java (JDK 1.0.2 and 1.1) */
     public OutputStream getOutputStream() throws IOException {
         throw new UnknownServiceException(
               "protocol doesn't support output");
     }
To enable output, a URLConnection subclass must override getOutputStream, as Sun's JDK implementation does in its HttpURLConnection class in order to allow posting to http servers.
Q . How do I send data from my Java program to a CGI program?


Ans :

You can use an http GET request by packing your data into the query string of the URL, or you can use an http POST request by sending your data through an output stream obtained from the URL connection.

     The way you send data to a particular CGI (Common Gateway Interface) program depends on what that program has been written to handle. CGI programs generally expect to receive their data either as an http GET request or as an http POST request. In Java, URLConnection instances support sending data through both mechanisms.
     GET requests are the basic http mechanism for fetching material from the World Wide Web. On the server side, a GET request typically causes the server to locate the resource indicated by the URL and send it back to the client. CGI programs can piggyback on the GET mechanism by looking for a query string at the tail of the URL:

     http://<machine-name>/<path-to-cgi-program>?<query-string>
The information in the query string is made accessible to the program as the value of an environment variable called QUERY_STRING. The CGI program then generates its output and sends that to the client.
     To use a GET request from the client side, a Java program must create an appropriate URLConnection instance, obtain an input stream from the connection, and then read from that stream. The following code fragment, for example, simply reads and prints the CGI program's output:
     /* http GET + query string, using JDK 1.0.2: */
     DataInputStream inStream = null;
     String urlString = "http://hoohoo.ncsa.uiuc.edu/cgi-bin/test-cgi";
     String queryString = "arg1=val1"
                          + "&arg2=val2"
                          + "&arg3=val3"
                          + "&arg4=val4"
                          + "&arg5=val5";
     urlString += ("?" + queryString);
     try {
         String currentLine;
         URL url = new URL(urlString);
         URLConnection connection = url.openConnection();
         inStream = new DataInputStream(connection.getInputStream());

         /* Read the server's response and close up. */
         while (null != (currentLine = inStream.readLine())) {
             System.out.println(currentLine);
         }
         inStream.close();
         inStream = null;
     } catch (Exception e) {
         // ... handle exception
     } finally {
         if (inStream != null) {
             try {
                 inStream.close();   // close() can itself throw IOException
             } catch (IOException e) {
                 // ... nothing more to do
             }
         }
     }
GET requests are appropriate when the CGI program needs relatively little information per interaction.
     Alternatively, a client program can make an http POST request. A POST lets you send an arbitrary amount of information to the CGI program, separate from the URL used to initiate the connection. The URL thus has no query string; it specifies only the path to the CGI program:
     http://<machine-name>/<path-to-cgi-program>
After creating a URLConnection instance, your program needs to prepare the connection for a POST (setDoOutput(true)), obtain an output stream from the connection, write data to it, and close it. After that, reading the output from the CGI program works the same as for the GET request just described. The following code fragment illustrates these steps:
     /* http POST, using JDK 1.0.2: */
     DataInputStream inStream = null;
     PrintStream outStream = null;
     String urlString = "http://hoohoo.ncsa.uiuc.edu/cgi-bin/test-cgi";
     String dataString = "arg1=val1"
                        + "&arg2=val2"
                        + "&arg3=val3"
                        + "&arg4=val4"
                        + "&arg5=val5";

     try {
         String currentLine;
         URL url = new URL(urlString);
         URLConnection connection = url.openConnection();
         connection.setDoOutput(true);
         outStream = new PrintStream(connection.getOutputStream());
         outStream.println(dataString);
         outStream.println();
         outStream.close();
         outStream = null;

         /* Read the server's response and close up. */
         inStream = new DataInputStream(connection.getInputStream());
         while (null != (currentLine = inStream.readLine())) {
             System.out.println(currentLine);
         }
         inStream.close();
         inStream = null;
     } catch (Exception e) {
         // ... handle exception
     } finally {
         if (outStream != null) {
             outStream.close();      // PrintStream.close() does not throw
         }
         if (inStream != null) {
             try {
                 inStream.close();   // close() can itself throw IOException
             } catch (IOException e) {
                 // ... nothing more to do
             }
         }
     }
Q . Can I write (from my applet) to an external file on a URL?

Ans :  

No; the standard URLConnection classes allow you only to send data to a CGI program on the server (from which your applet was loaded), but not to write directly to a file URL.

     The JDK URLConnection classes (the abstract URLConnection class and its various implementation subclasses) support writing output to a URL only in the form of http POST requests. Thus, there is no direct way to write to a file specified by a URL; you need to have (or write) a CGI program at the receiving end that can take the data in your POST request and save it to a file.

Q . How can my Java stand-alone application fetch documents in the same fashion as (partially simulating) a browser?

Ans :  
          

Fetching is easy enough, with the assistance of java.net classes, but you would also need to parse, format, and display what you fetch, which requires substantial extra work on your part.

Q . Why do I get a security exception when I try to connect to an external URL from an applet? (If I run equivalent code as an application, it works fine.)

Ans : 

As part of the Java security model, applets are quite restricted in the network connections they can make: applets can typically make network connections only back to the URL host they were fetched from, whereas stand-alone applications do not have this restriction.

     For more details, including the latest status of all known security-related bugs, see JavaSoft's security FAQ web page.

Q . How do I get a URLConnection to work through proxy firewalls? I.e. How do you get your Java application to do its web accesses through a proxy?

Ans : 

This is typically needed for any net access to another domain. Tell the runtime system what you are trying to do by passing these command-line arguments when you start the program:

java -DproxySet=true -DproxyHost=SOMEHOST -DproxyPort=SOMENUM YourMainClass

Here YourMainClass stands for the class whose main method you want to run. Note that proxyPort is optional and defaults to 80. Without these settings, you will see an exception such as java.net.UnknownHostException or java.net.NoRouteToHostException.

The proxy settings work both for java.net.URLConnection and for java.net.Socket.

Netscape's and IE's JVMs (at least in versions 4.x+) take the proxy settings for applets from the browser's proxy configuration. You can also set up URL proxies in applications (not applets) with the following code:

// set up to use proxy
System.getProperties().put("proxySet", "true");
System.getProperties().put("proxyHost", "myproxy.server.name");
System.getProperties().put("proxyPort", "80");

But how do I know the name of the proxy server?
This code just tells you how you can get a URL connection to the outside. Since it is your proxy server, you are expected to know its name. There isn't any code you can write that will allow arbitrary URL connections to be initiated from outside the firewall. Think about it! If there were, the firewall would not be doing its job.
Also note that there are corresponding socksProxyPort and socksProxyHost properties for when a SOCKS proxy is used instead of an http proxy. The default SOCKS port is 1080.
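
Later JDK releases (1.5 and up) also let you choose a proxy for one connection at a time, instead of setting it process-wide through system properties. The sketch below uses java.net.Proxy for that; it is an alternative technique, not the property-based one described above. The class name ProxyConnection, the proxy address, and the port are placeholders assumed for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

class ProxyConnection {
    /**
     * Creates a connection object that will go through the given http proxy.
     * Nothing is sent until the caller connects or asks for a stream.
     */
    static URLConnection openVia(String spec, String proxyHost, int proxyPort) {
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                                new InetSocketAddress(proxyHost, proxyPort));
        try {
            return new URL(spec).openConnection(proxy);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "127.0.0.1" and 8080 are placeholders; use your own proxy's address.
        URLConnection connection = openVia("http://java.sun.com/", "127.0.0.1", 8080);
        System.out.println(connection.getClass().getName());
    }
}
```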

Q . How do I do a HTTP GET in Java?

Ans : 

 

This one is easy. Just build a URL with the query string on the end in the usual way, e.g. http://www.site.com/perl-script.pl?val1=Put+stuff+here. The static method java.net.URLEncoder.encode() will properly encode a string for you. Optionally, you could call java.applet.AppletContext.showDocument(), which also submits the information and uses the browser to display the output.
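
A minimal sketch of building such a URL with java.net.URLEncoder follows. The helper name GetUrlBuilder.withQuery and the convention of passing names and values as alternating arguments are assumptions of this example; UTF-8 is likewise an assumed encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class GetUrlBuilder {
    /**
     * Appends name=value pairs to baseUrl as an encoded query string.
     * Arguments after baseUrl alternate: name1, value1, name2, value2, ...
     */
    static String withQuery(String baseUrl, String... namesAndValues) {
        StringBuilder sb = new StringBuilder(baseUrl);
        try {
            for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
                sb.append(i == 0 ? '?' : '&')
                  .append(URLEncoder.encode(namesAndValues[i], "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(namesAndValues[i + 1], "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            withQuery("http://www.site.com/perl-script.pl", "val1", "Put stuff here"));
    }
}
```

Note that URLEncoder writes a space as "+" rather than "%20"; both forms are understood in form-encoded query strings.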

Q . What are content handlers and why should I care? 

Ans :   

Apparently, there are no content handlers defined in the Java spec. The JDK from Sun provides content handlers as Java's way to add functionality to the URL class. By adding content handlers, the URL class can be made to return various MIME types as objects from the getContent method. With the proper content handlers in place, downloading GIFs, JPEGs, MPEGs, and PostScript documents, to name a few, is just a single method call.

Q . What are protocol handlers?

Ans : 

Protocol handlers are another way to extend the URL class. Again, there are no standard protocol handlers defined in the Java spec. [Note: Again, I find that I have to get Gosling's book to check this.] In a URL, the protocol is the first part of the URL string, the part that precedes the colon. With the proper protocol handlers in place, URLs can handle ftp, mail, gopher, and even finger. Check out http://java.sun.com/people/brown/ for an implementation of a finger protocol handler. 

Q . Can I use non-http: URLs?

Ans : 

Netscape apparently supports the ftp:, mailto:, gopher:, and even the telnet: protocol handlers (I couldn't find anywhere this was documented). As near as I can tell, no one else does at this time. As a result, you may wish to use this sparingly. It is very easy to test at runtime for protocol handler support, however. Simply create a URL that uses the protocol in question.

URL testurl;
boolean goodmailto;
try {
    testurl = new URL("mailto:maus@io.com");
    goodmailto = true;
}
catch (MalformedURLException e) {
    goodmailto = false;
}
In this example, goodmailto will be true if the protocol handler exists, and false if it does not. [Note: Include link to sample applet at my site which does this.]

Q . Should I write my CGI programs in Java?

Ans  : 

The best answer to that is "It depends". If you intend to write client-side CGI handling in Java, the mechanisms for it are there and easy to use (see the CGI-POST and CGI-GET questions above). If you would like to write server-side CGI programs, be aware that Java has no defined way to grab the environment variables that you will need from the HTTP server. JavaSoft has introduced the Servlet API as a standard way to do this, but adoption of this standard by the major server players is still somewhat distant.

If you do not have access to a server with a Java API for CGI, you will have to write a shell wrapper for your Java programs, since Java has no way of obtaining environment variables. Examples exist on the web, if you need guidance.
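
As one possible shape for the Java side of such a wrapper arrangement, the sketch below assumes the shell script passes the server's QUERY_STRING variable as a system property, for example java -Dcgi.query_string="$QUERY_STRING" CgiQuery. The property name cgi.query_string and the class name CgiQuery are hypothetical, chosen only for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class CgiQuery {
    /** Splits a query string into decoded name/value pairs, in order of appearance. */
    static Map<String, String> parse(String queryString) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (queryString == null || queryString.isEmpty()) {
            return fields;
        }
        try {
            for (String pair : queryString.split("&")) {
                int eq = pair.indexOf('=');
                String name = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                fields.put(URLDecoder.decode(name, "UTF-8"),
                           URLDecoder.decode(value, "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical property, set by the shell wrapper from $QUERY_STRING.
        Map<String, String> fields = parse(System.getProperty("cgi.query_string"));

        // A CGI response starts with a header block, then an empty line.
        System.out.print("Content-type: text/plain\r\n\r\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            System.out.println(field.getKey() + " = " + field.getValue());
        }
    }
}
```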

Q . How can a Java program talk to a CGI program?

Ans : 

Web browsers display forms, read user input, encode that input into a standard format called a "query string", and send that data to CGI programs that live on the web server. When you write an applet that talks to a CGI program, you have to do all this yourself.
The first thing to know is that there are two ways a CGI program can accept data from a web browser, GET and POST. CGIs that use GET take their arguments from the URL. Programs that use POST read their arguments from standard input. 

The second thing to know is that when you submit data to a form through a web browser, the web browser encodes the data for you. In an applet, however, you need to encode the data yourself. The data is encoded like this: Each form entry is a name-value pair. Names and values are separated from each other by equals signs (=). Pairs are separated from each other by ampersands (&). For example, consider this form: 

<Form method=GET action="http://metalab.unc.edu/javafaq/cgi-bin/getform.pl"> 
Email: <Input NAME="email" size=40>
Name: <Input NAME="realname" size=40> 
<Input TYPE="submit" VALUE="Subscribe">
</Form>

You see that this uses the GET method to communicate with a cgi-bin program at http://metalab.unc.edu/javafaq/cgi-bin/getform.pl. It sends two fields to the CGI program, email and realname. Let's say you want to send the string "elharo@metalab.unc.edu" for the email address, and the string "Elliotte Harold" for the real name. Then the query string would look like this:
String qs = "email=elharo%40metalab.unc.edu&realname=Elliotte%20Harold"; 

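The spaces in "Elliotte Harold" and the @ in "elharo@metalab.unc.edu" have been converted into percent escapes. Characters in the values other than letters, digits, and a few safe punctuation marks (such as the period) must be replaced with a % followed by their ASCII value written as two hexadecimal digits. Thus a space (ASCII 32, hex 20) becomes %20 and the @ (ASCII 64, hex 40) becomes %40.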
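
To make the escaping rule concrete, here is a hand-written sketch that applies it to ASCII text. The class name PercentEscaper and the exact set of characters left unescaped (letters, digits, period, hyphen, underscore) are assumptions of this example; in real code you would normally let java.net.URLEncoder do the work.

```java
class PercentEscaper {
    /** Percent-escapes every ASCII character that is not a letter, digit, '.', '-' or '_'. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || c == '.' || c == '-' || c == '_';
            if (safe) {
                out.append(c);
            } else if (c < 128) {
                // '%' followed by the character's ASCII value as two hex digits.
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(c >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(c & 0xF, 16)));
            } else {
                throw new IllegalArgumentException(
                        "this sketch handles ASCII only: " + c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("email=" + escape("elharo@metalab.unc.edu")
                + "&realname=" + escape("Elliotte Harold"));
    }
}
```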

To send this data to the server, append a question mark (?) and the query string to the URL of the CGI program, and request that URL from the server. Thus the URL you want is:

http://metalab.unc.edu/javafaq/cgi-bin/getform.pl?email=elharo%40metalab.unc.edu&realname=Elliotte%20Harold

In Java terms this requires constructing a URL object from this string, and opening that URL's InputStream to read the response. The following code fragment demonstrates:

try {
    String thisLine;
    String qs = "email=elharo%40metalab.unc.edu&realname=Elliotte%20Harold";
    URL u = new URL("http://metalab.unc.edu/javafaq/cgi-bin/getform.pl?" + qs);
    DataInputStream theHTML = new DataInputStream(u.openStream());
    while ((thisLine = theHTML.readLine()) != null) {
        System.out.println(thisLine);
    }
    theHTML.close();
}
catch (Exception e) {
    System.err.println(e);
}

Communicating with CGI programs that use POST is somewhat more complex, and it doesn't work very well in Java 1.0.2. It may be improved in Java 1.1. When POSTing to a CGI, you encode the query string exactly as you do for GET requests. However, instead of merely requesting a URL's InputStream, you open a URLConnection to the CGI program.
Do not append the query string to the URL as you did with GET. Instead, set the URLConnection's doOutput and doInput fields to true (with setDoOutput and setDoInput) and set allowUserInteraction to false. Chain the URLConnection's OutputStream to a DataOutputStream and use the DataOutputStream's writeBytes() method to send the query string to the server.

If you want to read the response, then chain the URLConnection's InputStream to a DataInputStream, and use the DataInputStream's readLine() method to read the response in a while loop. The following code fragment demonstrates:


String query = "email=elharo%40metalab.unc.edu&realname=Elliotte%20Harold";

try {
    // open the connection and prepare it to POST
    URL u = new URL("http://metalab.unc.edu/javafaq/cgi-bin/postform.pl");
    URLConnection uc = u.openConnection();
    uc.setDoOutput(true);
    uc.setDoInput(true);
    uc.setAllowUserInteraction(false);
    DataOutputStream dos = new DataOutputStream(uc.getOutputStream());

    // Send the data
    dos.writeBytes(query);
    dos.close();

    // Read the response
    DataInputStream dis = new DataInputStream(uc.getInputStream());
    String nextline;
    while ((nextline = dis.readLine()) != null) {
        System.out.println(nextline);
    }
    dis.close();
}
catch (Exception e) {
    System.err.println(e);
}

As you see, posting forms is considerably more complex than using the GET method. However, on some platforms GET has an annoying habit of failing once the query string grows past 200 characters. The exact point where GET fails varies depending on the operating system and the web server.

Copyright © 2000 javafaq.com. All rights reserved