Network programming in Java


Package:
import java.net.*;
You can use these classes to (a) communicate with any server, (b) construct your own server.


Java network programming reference



Example - Find IP address

Find the numeric (IP) address of a host, given its text address.

InetAddress.getByName(hostname)

From Graba:


import java.net.*;
import java.io.*;

public class ip 
{
  public static void main ( String[] args ) throws IOException 
  {
    String hostname;
    BufferedReader input = new BufferedReader ( new InputStreamReader(System.in) );

    System.out.print("\n");
    System.out.print("Host name: ");
    hostname = input.readLine();

    try 
    {
      InetAddress ipaddress = InetAddress.getByName(hostname);
      System.out.println("IP address: " + ipaddress.getHostAddress());
    }
    catch ( UnknownHostException e )
    {
      System.out.println("Could not find IP address for: " + hostname);
    }
  }
}

Run it:


$ javac ip.java

$ java ip

Host name: www.computing.dcu.ie
IP address: 136.206.11.240

Q. Write a program to find the text address given the numeric address.

See DNS lookup.
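
One possible starting point for the reverse direction, as a sketch only. InetAddress.getByName also accepts a numeric address, and getCanonicalHostName then asks DNS for the name. The class name ReverseLookup and the method nameFor are made up for this illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ReverseLookup
{
  // Returns the host name for a numeric address. If no reverse DNS
  // entry can be found, the numeric text itself comes back.
  static String nameFor(String numeric) throws UnknownHostException
  {
    InetAddress address = InetAddress.getByName(numeric);   // a literal IP is parsed, not looked up
    return address.getCanonicalHostName();                  // this is the reverse lookup
  }

  public static void main(String[] args) throws UnknownHostException
  {
    String numeric = (args.length > 0) ? args[0] : "127.0.0.1";
    System.out.println(numeric + " -> " + nameFor(numeric));
  }
}
```

Run it as "java ReverseLookup 136.206.11.240". What comes back depends on whether the owner of the address has set up a reverse DNS entry.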


Query local machine

InetAddress.getLocalHost() - get your own machine's IP address.

Q. Write a program to do this.

See also the Windows command "ipconfig".

Not to be confused with 127.0.0.1, the loopback address, which every machine uses to refer to itself.
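
A minimal sketch of the difference, assuming nothing beyond the standard InetAddress class. The class name LocalAddress and the helper describe are made up for this illustration. On some machines the local host name is itself mapped to a loopback address, in which case both lines will say "loopback".

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class LocalAddress
{
  // Describes one address: its numeric form and whether it is a loopback address.
  static String describe(InetAddress a)
  {
    return a.getHostAddress() + (a.isLoopbackAddress() ? " (loopback)" : " (not loopback)");
  }

  public static void main(String[] args)
  {
    try
    {
      InetAddress local = InetAddress.getLocalHost();
      System.out.println("Local host: " + local.getHostName() + " = " + describe(local));
    }
    catch (UnknownHostException e)
    {
      System.out.println("Could not resolve the local host name.");
    }

    System.out.println("Loopback:   " + describe(InetAddress.getLoopbackAddress()));
  }
}
```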



TCP Sockets

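TCP is connection-oriented.
You must explicitly close each socket with socket.close() (or, from Java 7 on, open it in a try-with-resources block, which closes it for you).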
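
To see a connection opened, used and closed without needing any outside server, here is a sketch that talks to a throwaway server on the same machine. It uses try-with-resources instead of an explicit close(). The class name CloseDemo, the method roundTrip and the "echo: " reply format are all made up for this illustration.

```java
import java.io.*;
import java.net.*;

class CloseDemo
{
  // Sends one line to a throwaway server on this machine and returns the reply.
  static String roundTrip(String message) throws IOException
  {
    // port 0 = let the system pick any free port; loopback = this machine only
    try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress()))
    {
      // the server side runs in its own thread and echoes one line back
      Thread t = new Thread(() -> {
        try (Socket c = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true))
        {
          out.println("echo: " + in.readLine());
        }
        catch (IOException e)
        {
          System.out.println("Server error: " + e);
        }
      });
      t.start();

      // the client side: every resource named here is closed automatically
      try (Socket s = new Socket(server.getInetAddress(), server.getLocalPort());
           PrintWriter out = new PrintWriter(s.getOutputStream(), true);
           BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream())))
      {
        out.println(message);
        return in.readLine();
      }
    }
  }

  public static void main(String[] args) throws IOException
  {
    System.out.println(roundTrip("hello"));
  }
}
```

The sockets are closed when control leaves the try blocks, whether normally or through an exception, so there is no close() call to forget.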

Example - Query open ports

Port scanner - look at some machines in DCU to find ports that are "open", i.e. ports where a server is providing a service.

It does this by trying to open a socket to that port.


import java.net.*;
import java.io.*;

public class ports 
{
  public static void main ( String[] args ) throws IOException 
  {
    String hostname;
    BufferedReader input = new BufferedReader ( new InputStreamReader(System.in) );
    Socket s = null;

    System.out.print("\n");
    System.out.print("Host name: ");
    hostname = input.readLine();

    try 
    {
      // this is to see if host exists:
      InetAddress ipaddress = InetAddress.getByName(hostname);

      // int p =  21;    // ftp
      // int p =  23;    // telnet
      // int p =  25;    // smtp
      int p =  80;       // http
      // int p = 110;    // pop3

      try
      {
        s = new Socket(hostname, p);
        System.out.println("A server is running on port " + p + ".");
        s.close();
      }
      catch (IOException e)
      {
        System.out.println("No server on port " + p + ".");
      }
    }
    catch ( UnknownHostException e )
    {
      System.out.println("Could not find host: " + hostname);
    }

    if (s != null)
    {
      try
      {
        s.close();
      }
      catch ( IOException ioEx )
      {
      }
    }
  }
}

We can now look for HTTP servers:


Host name: www.dcu.ie
A server is running on port 80.

Host name: dgrayweb.computing.dcu.ie
A server is running on port 80.

Host name: mailhost.computing.dcu.ie
A server is running on port 80.

Telnet servers (these are results from outside DCU):


Host name: camac.dcu.ie
A server is running on port 23.

Host name: makalu.computing.dcu.ie
A server is running on port 23.

Host name: eiger.computing.dcu.ie
No server on port 23.

Host name: www.dcu.ie
No server on port 23.

POP3 servers:


Host name: mailhost.computing.dcu.ie
A server is running on port 110.

Caution when scanning ports: Some sites don't like this.
Scanning lots of ports looks like hostile intent.
If a firewall blocks a port, the program will wait until the connection attempt times out, which could take a while.
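
One way to avoid the long wait is to set your own connect timeout. A sketch, assuming a limit of 2 seconds is acceptable: create an unconnected Socket and call connect() with a timeout in milliseconds. The class name PortCheck and the method isOpen are made up for this illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck
{
  // True if a TCP connection to host:port succeeds within timeoutMs milliseconds.
  static boolean isOpen(String host, int port, int timeoutMs)
  {
    try (Socket s = new Socket())
    {
      s.connect(new InetSocketAddress(host, port), timeoutMs);
      return true;
    }
    catch (IOException e)     // refused, timed out, or unknown host
    {
      return false;
    }
  }

  public static void main(String[] args)
  {
    String host = (args.length > 0) ? args[0] : "localhost";
    int port = 80;
    System.out.println((isOpen(host, port, 2000) ? "A server is running" : "No server")
                       + " on port " + port + ".");
  }
}
```

Note that this version cannot tell "connection refused" apart from "timed out" or "no such host". Catch the more specific exceptions separately if you need to know which.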



Download HTTP page

From The Java Developers Almanac:

The example here downloads the page that holds my latest password for emailing me:


// download text content of URL

import java.net.*;
import java.io.*;

public class jget 
{
  public static void main ( String[] args ) throws IOException 
  {
    try 
    {
        URL url = new URL("http://computing.dcu.ie/~humphrys/howtomailme.html");
    
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;

        while ((str = in.readLine()) != null) 
        {
          System.out.println(str);
        }

        in.close();
    } 
    catch (MalformedURLException e) {} 
    catch (IOException e) {}
  }
}

Q. Make the URL a command-line argument.

Q. Download to a file.

Q. Parse the HTML to extract the password.

Q. Insert error statements into the 2 catch sections.
Note where the following are caught:

  1. Bad URL syntax.
  2. Host does not exist.
  3. Host exists but URL not found.


Get HTTP headers

From The Java Developers Almanac:


// get the HTTP headers 

import java.net.*;
import java.io.*;

public class jhttp
{
  public static void main ( String[] args ) throws IOException 
  {
    try 
    {    
      URL url = new URL("http://computing.dcu.ie/~humphrys/howtomailme.html");

      URLConnection c = url.openConnection();
    
      for (int i=0; ; i++) 
      {
        String name = c.getHeaderFieldKey(i);
        String value = c.getHeaderField(i);

        if (name == null && value == null)     // end of headers
        {
          break;
        }

        if (name == null)     // first line of headers
        {
          System.out.println("Server HTTP version, Response code:");
          System.out.println(value);
          System.out.print("\n");
        }
        else
        {
          System.out.println(name + "=" + value);
        }
      }
    } 
    catch (Exception e) 
    {
      System.out.println("Error: " + e);     // report the problem instead of hiding it
    }
  }
}

Output:


Server HTTP version, Response code:
HTTP/1.1 200 OK

Date=Mon, 22 Nov 2004 11:43:09 GMT
Server=Apache/2.0.47 (Unix) PHP/5.0.2
Last-Modified=Thu, 18 Nov 2004 10:32:20 GMT
ETag="19495e-3cd-e7abf500"
Accept-Ranges=bytes
Content-Length=973
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Content-Type=text/html; charset=ISO-8859-1



HTTP headers.
Request - sent by client.
Response - returned by server.


Page not found

If the file is not found you will normally get 404, though there are some other possibilities:

http://computing.dcu.ie/BADPAGE will give something like:


Server HTTP version, Response code:
HTTP/1.1 404 Not Found

Date=Mon, 22 Nov 2004 12:15:27 GMT
Server=Apache/2.0.47 (Unix) PHP/5.0.2
Content-Length=318
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Content-Type=text/html; charset=iso-8859-1

Q. Write a program to check if a URL exists and return yes/no.



HTTP response codes.


My 404 re-direct

Note that I catch all errors on my site with a re-direct to a script, so you get 200 for all requests, good or bad, for reasons explained at that link.

http://computing.dcu.ie/~humphrys/BADPAGE will give something like:


Server HTTP version, Response code:
HTTP/1.1 200

Date=Mon, 22 Nov 2004 12:12:20 GMT
Server=Apache/2.0.47 (Unix) PHP/5.0.2
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Transfer-Encoding=chunked
Content-Type=text/html; charset=ISO-8859-1
content-length=1270



The "URL" class hides the sockets

The URL class hides the socket that is being created underneath.

Here we open a socket directly, send an HTTP GET command and read the results:


// HTTP GET through socket, not through "URL" class

import java.net.*;
import java.io.*;

public class sget 
{
  public static void main ( String[] args ) throws IOException 
  {
    Socket s = null;

    try 
    {
      String host = "computing.dcu.ie";
      String file = "/~humphrys/howtomailme.html";
      int port = 80;

      s = new Socket(host, port);

      OutputStream out = s.getOutputStream();
      PrintWriter outw = new PrintWriter(out, false);
      outw.print("GET " + file + " HTTP/1.0\r\n");
      outw.print("Accept: text/plain, text/html, text/*\r\n");
      outw.print("\r\n");
      outw.flush();

      InputStream in = s.getInputStream();
      InputStreamReader inr = new InputStreamReader(in);
      BufferedReader br = new BufferedReader(inr);
      String line;
      while ((line = br.readLine()) != null) 
      {
        System.out.println(line);
      }
      // br.close();    // Q. Do I need this?
    } 
    catch (UnknownHostException e) {} 
    catch (IOException e) {}

    if (s != null)
    {
      try
      {
        s.close();
      }
      catch ( IOException ioEx ) {}
    }
  }
}

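flush() - send this now.
TCP sends a variable number of bytes, and the PrintWriter in front of the socket (created here with autoflush off) may buffer what you print, to collect a larger amount, before passing it on.
flush() tells it to pass on what it has now, so the request actually goes out before we start waiting for the reply.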
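
The buffering can be observed without any network at all. In this sketch a ByteArrayOutputStream stands in for the socket's output stream, and we count how many bytes have reached it before and after flush(). The class name FlushDemo and the method name are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

class FlushDemo
{
  // Returns { bytes delivered before flush(), bytes delivered after flush() }.
  static int[] bytesBeforeAndAfterFlush(String text)
  {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();   // stands in for the socket
    PrintWriter outw = new PrintWriter(sink, false);            // false = no automatic flush

    outw.print(text);
    int before = sink.size();     // what has reached the sink so far

    outw.flush();
    int after = sink.size();      // what has reached the sink now

    return new int[] { before, after };
  }

  public static void main(String[] args)
  {
    int[] n = bytesBeforeAndAfterFlush("GET / HTTP/1.0\r\n\r\n");
    System.out.println("Before flush: " + n[0] + " bytes, after flush: " + n[1] + " bytes");
  }
}
```

For a short request like this one, nothing reaches the sink until flush() is called. With a real socket, forgetting the flush() means the server never sees the request and the client waits for a reply that never comes.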

Output:

$ java sget
HTTP/1.1 200 OK
Date: Mon, 22 Nov 2004 15:14:10 GMT
Server: Apache/2.0.47 (Unix) PHP/5.0.2
Last-Modified: Thu, 18 Nov 2004 10:32:20 GMT
ETag: "19495e-3cd-e7abf500"
Accept-Ranges: bytes
Content-Length: 973
Connection: close
Content-Type: text/html; charset=ISO-8859-1


(the URL content)



HTTP methods.
HEAD - can be used to test whether a URL exists without downloading the content.
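
A sketch of a HEAD request, assuming the standard HttpURLConnection class is good enough for the job. The class name HeadCheck and the method responseCode are made up for this illustration, and the 5 second timeouts are an arbitrary choice.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class HeadCheck
{
  // Sends a HEAD request and returns the numeric response code (200, 404, ...).
  static int responseCode(String address) throws IOException
  {
    HttpURLConnection c = (HttpURLConnection) new URL(address).openConnection();
    c.setRequestMethod("HEAD");      // ask for the headers only, no content
    c.setConnectTimeout(5000);
    c.setReadTimeout(5000);
    try
    {
      return c.getResponseCode();
    }
    finally
    {
      c.disconnect();
    }
  }

  public static void main(String[] args) throws IOException
  {
    if (args.length < 1)
    {
      System.out.println("Usage: java HeadCheck URL");
      return;
    }
    int code = responseCode(args[0]);
    System.out.println(code == 200 ? "yes" : "no (" + code + ")");
  }
}
```

A site that returns 200 for every request (see the 404 re-direct section above) will always look as if the page exists, so the response code alone is not proof.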


Sending an HTTP POST request

e.g. Sending multiple lines of data through an HTML form.
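
A sketch of what a form submission could look like from Java, assuming the form handler accepts the usual application/x-www-form-urlencoded body. The class name PostForm, its two methods and the field name "comment" are made up for this illustration. You supply the URL of a real form handler on the command line.

```java
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class PostForm
{
  // Encodes one form field as name=value, escaping spaces, newlines, etc.
  static String encodeField(String name, String value) throws UnsupportedEncodingException
  {
    return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
  }

  // POSTs the form body to the address and returns the response content.
  static String post(String address, String body) throws IOException
  {
    HttpURLConnection c = (HttpURLConnection) new URL(address).openConnection();
    c.setRequestMethod("POST");
    c.setDoOutput(true);      // we are going to send a request body
    c.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

    try (OutputStream out = c.getOutputStream())
    {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }

    StringBuilder reply = new StringBuilder();
    try (BufferedReader in = new BufferedReader(
             new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8)))
    {
      String line;
      while ((line = in.readLine()) != null)
      {
        reply.append(line).append("\n");
      }
    }
    return reply.toString();
  }

  public static void main(String[] args) throws IOException
  {
    String body = encodeField("comment", "line one\nline two");
    System.out.println("Form body: " + body);

    if (args.length > 0)      // URL of a form handler, supplied by you
    {
      System.out.println(post(args[0], body));
    }
  }
}
```

With a GET the data travels in the URL. With a POST it travels in the request body after the headers, which is why multiple lines of text are no problem.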


telnet to HTTP

HTTP commands are all plain text. You can just telnet to port 80 and type HTTP commands:

$ telnet www.computing.dcu.ie 80
GET /index.html HTTP/1.1
Host: www.computing.dcu.ie

(blank line to end header)
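
The same text can be built in a program and written to a socket. A sketch follows, with the class name Http11Request and the method build made up for this illustration. The "Connection: close" line is an addition not in the telnet session above: without it an HTTP/1.1 server may keep the connection open, and a read loop that waits for end of stream would hang until the server's keep-alive timeout.

```java
class Http11Request
{
  // Builds the text a client sends for a minimal HTTP/1.1 GET.
  static String build(String host, String path)
  {
    return "GET " + path + " HTTP/1.1\r\n"
         + "Host: " + host + "\r\n"       // required in HTTP/1.1
         + "Connection: close\r\n"        // ask the server to close when done
         + "\r\n";                        // blank line ends the header
  }

  public static void main(String[] args)
  {
    System.out.print(build("www.computing.dcu.ie", "/index.html"));
  }
}
```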



Write your own client to control ftp, telnet, POP3 ...

We have seen how to write your own HTTP client, using the URL class, or using sockets directly.
Now your program can control HTTP.

You can study the commands of any other service and write a client for that too.
Use a socket to connect to the port and then send the appropriate commands.
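
As a first step towards such a client, here is a sketch that connects to a port, reads the one-line greeting that services like POP3 and SMTP send on connect, and then sends QUIT. The class name Greeting and the method read are made up for this illustration, and the 5 second read timeout is an arbitrary choice.

```java
import java.io.*;
import java.net.Socket;

class Greeting
{
  // Connects, reads the server's one-line greeting, sends QUIT and returns the greeting.
  static String read(String host, int port) throws IOException
  {
    try (Socket s = new Socket(host, port);
         BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
         PrintWriter out = new PrintWriter(s.getOutputStream(), false))
    {
      s.setSoTimeout(5000);               // do not wait forever for a silent server
      String greeting = in.readLine();    // e.g. a POP3 server starts with "+OK ..."
      out.print("QUIT\r\n");              // POP3 and SMTP both accept QUIT
      out.flush();
      return greeting;
    }
  }

  public static void main(String[] args) throws IOException
  {
    if (args.length < 2)
    {
      System.out.println("Usage: java Greeting host port");
      return;
    }
    System.out.println(read(args[0], Integer.parseInt(args[1])));
  }
}
```

A real client would go on to send the service's own commands (USER and PASS for POP3, for example) and read each reply before sending the next.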



Sites that restrict scripts

Some sites don't provide content to scripts, only to browsers.
e.g. Write a Java program as above to download the result of a Google search. It will be blocked.

One way round this is to set the User-Agent header to pretend to be a browser:

$ java  "-Dhttp.agent=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"   prog

That is, Google is asking you not to hit them with a script many times.
They won't mind the occasional scripted hit (as in the above).
But respect their wishes by making sure you don't hit them many times.
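
The User-Agent can also be set inside the program, for one connection only, with setRequestProperty. A sketch, with the class name AgentDemo and the method open made up for this illustration:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class AgentDemo
{
  // Opens (but does not yet connect) a connection that will send the given User-Agent.
  static HttpURLConnection open(String address, String agent) throws IOException
  {
    HttpURLConnection c = (HttpURLConnection) new URL(address).openConnection();
    c.setRequestProperty("User-Agent", agent);
    return c;
  }

  public static void main(String[] args) throws IOException
  {
    if (args.length < 1)
    {
      System.out.println("Usage: java AgentDemo URL");
      return;
    }
    HttpURLConnection c = open(args[0], "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
    System.out.println("Response code: " + c.getResponseCode());
    c.disconnect();
  }
}
```

The same caution applies whichever way you set it: an occasional request is one thing, hitting the site many times is another.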