Getting Started:
Setting up Java for ArchStudio 3

Overview

This first document explains the basics of installing Java 2 SDK version 1.4. It also provides a detailed background on setting up system PATH and CLASSPATH environment variables in general (not ArchStudio-specific).

Java experts will probably find this document mostly a review. ArchStudio 3-specific information on configuring and setting up ArchStudio 3 for both use and development is contained on a separate page called Running and Developing For ArchStudio 3.


Setting up Java

ArchStudio 3 is a Java application written specifically for Java 2 version 1.4 (aka JDK1.4) or better. It will not run properly in any previous version of Java, including 1.3.1. As of this writing, the latest version of the Java 2 Software Development Kit is J2SDK1.4.2. ArchStudio 3 has been tested with this version.

In order to run ArchStudio 3 or do any development on it, you will need to download and install Java 2 SDK version 1.4. This is a free download from Sun Microsystems, and is available for Solaris, Windows, and Linux. Other platforms (such as Macintosh) have separately developed SDKs.

Note that there is a separate download available called the Java 2 Runtime Environment (JRE) that does not include developer tools. The J2SDK includes everything that this runtime environment does, and therefore we recommend just downloading the J2SDK.

Note that it is possible to have any number of JDKs installed on your machine at the same time. Make sure to install them in separate directories and, if possible, in order of release. One of our developer machines has JDK1.0.2, 1.1, 1.2.2, 1.3.1, and 1.4.2 all installed side by side. See the special considerations on paths and classpaths, below.

Once you've got J2SDK1.4 installed, you can go to the download page. This will direct you to the ISR Software Update Site where you can download our latest tools, including ArchStudio 3. Make sure to download and install both ArchStudio 3 and the xArch/xADL 2.0 Data Binding Library.


Of PATHs and CLASSPATHs

Two system environment variables will affect how your Java programs run. The directions given here apply equally to UNIX/Linux and Windows, although there may be some differences for OSes that lack direct support for environment variables (e.g., Macintosh). Consult your Java SDK documentation for more details on such platforms.

Important Note: Many novice users confuse the PATH and the CLASSPATH environment variables. Please note that these are not the same thing, although their structures are similar. They are two separate environment variables with separate functions. Please see detailed descriptions in the following two sections.


Of PATHs

The PATH environment variable determines where your system will look for executable programs when they are named on the command line, unless a path to the program (relative or absolute) is given explicitly on the command line.

Windows

In Windows, the current directory is always implicitly first on the PATH. This means that if you type:

java program

in Windows, the OS will search the current directory for an executable program called java.exe (or java.bat or java.cmd or any number of executable file types on Windows) and run it. If Windows does not find one, it will search the directories specified in the PATH environment variable in order, looking for an executable file called java.
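The search order just described can be sketched in a few lines of Java. This is only an illustration, not how Windows itself is implemented: the class name WindowsLookupOrder, the directory C:\work, and the three-extension list are made up for the example, and the real list of executable file types is longer.

```java
class WindowsLookupOrder {
    /**
     * Lists, in search order, the file names a Windows-style lookup would try:
     * the current directory first, then each PATH directory in turn.
     */
    static String[] candidates(String command, String currentDir,
                               String[] pathDirs, String[] extensions) {
        String[] result = new String[(pathDirs.length + 1) * extensions.length];
        int n = 0;
        for (int d = -1; d < pathDirs.length; d++) {
            // Index -1 stands for the implicit current directory.
            String dir = (d < 0) ? currentDir : pathDirs[d];
            for (int e = 0; e < extensions.length; e++) {
                result[n++] = dir + "\\" + command + extensions[e];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical current directory and a shortened PATH.
        String[] pathDirs = { "D:\\Java\\jdk1.4.0b3\\bin", "C:\\WINNT\\system32" };
        String[] extensions = { ".exe", ".bat", ".cmd" }; // illustrative subset
        String[] names = candidates("java", "C:\\work", pathDirs, extensions);
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i]);
        }
    }
}
```

The first name in the printed list that actually exists on disk is the program that would run.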

To specify a specific Java executable that is not in the current directory, overriding the current PATH, you can write its full pathname on the command line, like so:

C:\Java\jdk1.4.0beta3\bin\java.exe program

In general, it is convenient to include jdk_home_directory\bin on your path when developing and running Java programs, where jdk_home_directory is the directory where your working JDK is installed, for instance C:\Java\jdk1.4.0beta3.

The PATH environment variable consists of a list of directories, separated by semicolons (;). Invalid directory names are ignored. An example of a Windows path is:

D:\Java\jdk1.4.0b3\bin;C:\WINNT\system32;
C:\WINNT;C:\WINNT\System32\Wbem;C:\OnPath
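To make the structure concrete, the following sketch breaks the example path above into its entries and prints them in the order they would be searched. The class name PathEntries is hypothetical; java.util.StringTokenizer is a standard class that skips empty entries, such as the one left by a doubled semicolon.

```java
import java.util.StringTokenizer;

class PathEntries {
    /** Splits a PATH-style list on the given separator, skipping empty entries. */
    static String[] split(String pathList, String separator) {
        StringTokenizer tokens = new StringTokenizer(pathList, separator);
        String[] entries = new String[tokens.countTokens()];
        for (int i = 0; i < entries.length; i++) {
            entries[i] = tokens.nextToken();
        }
        return entries;
    }

    public static void main(String[] args) {
        // The example Windows path from the text, written as one line.
        String windowsPath = "D:\\Java\\jdk1.4.0b3\\bin;C:\\WINNT\\system32;"
                + "C:\\WINNT;C:\\WINNT\\System32\\Wbem;C:\\OnPath";
        String[] entries = split(windowsPath, ";");
        for (int i = 0; i < entries.length; i++) {
            System.out.println((i + 1) + ": " + entries[i]);
        }
    }
}
```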

To determine your current PATH in Windows, open up a command prompt and type:

echo %PATH%

There are several ways to set or modify the PATH:

By Hand: You can set the PATH by hand at the command line by typing:

SET PATH=dir1;dir2;...

Alternately, you can prepend a directory to the front of the existing path by typing:

SET PATH=newdir;%PATH%

You can append a directory to the end of the path by typing:

SET PATH=%PATH%;newdir

By Batch File: A batch file is the Windows equivalent of a UNIX shellscript. It contains a sequence of commands that are executed as if they were typed on the command line, with some additional support for changeable parameters. For example, a Windows batch file to set up the command environment for use with J2SDK1.4 might look something like this:

SET JDK_HOME=C:\Java\jdk1.4.0beta3
SET PATH=%JDK_HOME%\bin;%PATH%

Win95/98/ME only: You can also add these commands to the batch file that runs on system startup, C:\autoexec.bat. Changes take effect on reboot.

By NT Environment Variables: Windows NT4/2000/XP have a specific place in the GUI, rather than an autoexec.bat file, where environment variables can be set up. Right-click on the My Computer icon and select Properties. The exact place where environment variables can be set varies among Windows versions. For instance, in Windows 2000, it's in the Advanced tab, under "Environment Variables."

UNIX

Note: All information about UNIX here applies pretty much equally to Linux.

In UNIX, the current directory is not, by default, on the PATH. This means that if you type:

java program

in UNIX, the OS will search the directories on the PATH for an executable file called java and run it. If there is an executable file called java in the current directory, it will be ignored unless the current directory is explicitly (by name) or implicitly (by '.') specified on the PATH.

To specify a specific Java executable that is not in the current directory, overriding the current PATH, you can write its full pathname on the command line, like so:

/usr/bin/java/1.4/bin/java program

or use a relative path, of course:

../bin/java program

In general, it is convenient to include jdk_home_directory/bin on your path when developing and running Java programs, where jdk_home_directory is the directory where your working JDK is installed, for instance /usr/bin/java/1.4.

On UNIX, the PATH environment variable consists of a list of directories, separated by colons (:). Invalid directory names are ignored. An example of a UNIX path is:

/usr/bin:/opt/local/bin:/opt/mh-6.8.4/bin:
/opt/X11-6.0/bin:/usr/dt/bin:/opt/public/bin:
/opt/emacs-20.4/bin:/opt/www-1.0/bin:/opt/SUNWspro-5.0/bin:
/usr/ccs/bin:/opt/gcc-2.8.1/bin:/opt/gnutools-1.1/bin

To determine your current PATH in UNIX, get a command prompt and type:

echo $PATH

Most UNIX installations also include a convenient command to determine what program will run when you type a given command, by analyzing the path. This is the which command. So, if you are not sure which java will run when you type java, you could ask UNIX:

which java

and you'll get a response like:

/usr/bin/java/1.4/bin/java
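The idea behind which is simple enough to sketch in Java: walk the PATH directories in order and report the first one that contains a file with the requested name. The class name Which and the shortened directory list are assumptions made for this example, and unlike the real command the sketch does not check that the file is executable.

```java
import java.io.File;

class Which {
    /** Returns the first "dir/command" that exists as a file, or null if none does. */
    static String which(String command, String[] pathDirs) {
        for (int i = 0; i < pathDirs.length; i++) {
            File candidate = new File(pathDirs[i], command);
            if (candidate.isFile()) {
                return candidate.getPath();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A few directories from the example PATH above; adjust for your system.
        String[] pathDirs = { "/usr/bin", "/opt/local/bin", "/usr/ccs/bin" };
        String found = which("java", pathDirs);
        System.out.println(found == null ? "java: not found" : found);
    }
}
```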

There are several ways to set or modify the PATH:

By Hand: You can set the PATH by hand at the command line by using your current shell's set-environment command. Sadly, this differs from shell to shell. It may be:

sh: PATH=dir1:dir2:...; export PATH
csh: setenv PATH dir1:dir2:...
csh (alternative): set path = (dir1 dir2 ...)

...etc. Check your shell's manual page (for example, man sh or man csh) for more information.

No matter which form you end up using, adding a directory to the front or end of the PATH usually involves inserting $PATH wherever the "old path" would go on the command line, for instance:

setenv PATH newdir:$PATH

By shellscript: You can also put environment-variable setting commands in a shellscript. Be warned that most shellscripts begin with the pound-bang hack:

#!/bin/sh (or similar)

This line ends up forking a new shell for use just in the shellscript. Setting environment variables in the shellscript therefore changes them for the forked shell rather than the current one: the path will be good for commands executed in the shellscript, but the change will disappear when the shellscript exits.

Many shells include the ability to "source," or execute in immediate mode, commands in a file. This is often preferable to the shellscript option when setting up an environment for direct use. The command, again, is shell-dependent, but examples include:

sh: . cmdfile
csh: source cmdfile

where cmdfile is a file containing a sequence of commands to execute.

Additionally, many shells, such as csh, have a startup shellscript to which commands can be added. For instance, in csh, this script is called ~/.cshrc. Be especially careful when editing your startup script.


Of CLASSPATHs

One of the most interesting (and, for new users, frustrating) aspects of Java is its use of an environment variable called CLASSPATH to determine the location(s) from which classes are loaded. A large amount of debugging effort has been spent in programs only to find that a CLASSPATH is set incorrectly. The CLASSPATH is simply an environment variable. Setting this variable is almost exactly the same as setting the PATH (see above), except the name of the variable is CLASSPATH instead of PATH.

Side Note: Sun has two useful pages describing the use of the CLASSPATH on different platforms, one for Windows and one for UNIX.

Using the CLASSPATH is easy once you understand it. The CLASSPATH is a standard environment variable that consists of a list of entries, each of which is one of three things:

  1. Directories,
  2. JAR files, and
  3. ZIP files

Each element in the list is separated by a path separator character, the same character used in the PATH variable, described above. On Windows, this character is a semicolon (;). On UNIX, the path separator character is a colon (:).
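A quick way to see what is on the classpath of a running program is to print it. The sketch below reads the standard java.class.path system property, splits it on the platform's path separator (File.pathSeparator), and labels each entry. The class name ClasspathEntries and the kind method are hypothetical, and the labeling is only a guess based on the file extension.

```java
import java.io.File;
import java.util.Locale;
import java.util.StringTokenizer;

class ClasspathEntries {
    /** Classifies one CLASSPATH entry by its file name alone. */
    static String kind(String entry) {
        String lower = entry.toLowerCase(Locale.ENGLISH);
        if (lower.endsWith(".jar")) {
            return "JAR file";
        }
        if (lower.endsWith(".zip")) {
            return "ZIP file";
        }
        return "directory";
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the running JVM is using.
        String classpath = System.getProperty("java.class.path");
        StringTokenizer entries = new StringTokenizer(classpath, File.pathSeparator);
        while (entries.hasMoreTokens()) {
            String entry = entries.nextToken();
            System.out.println(kind(entry) + ": " + entry);
        }
    }
}
```

Running this under different CLASSPATH settings is a handy first step when a class unexpectedly cannot be found.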

To understand the CLASSPATH requires an understanding of the relationship between filesystem directories and Java packages. Java packages provide a global, top-level namespace hierarchy for Java classes. Levels in the package hierarchy are specified by the period (dot) character ('.'). Sun recommends avoiding namespace conflicts by using the development organization's Internet domain name segments in reverse order. So, many classes developed here at ICS in UCI are in packages like:

edu.uci.ics.packagename

However, typing these names can be cumbersome, so often developers break this recommendation for the sake of brevity (as we have done with packages like c2.fw and archstudio.comp.xarchadt). In theory, each level in the package hierarchy maps onto a filesystem directory. So, a class with fully-qualified name edu.uci.isr.xarchutils.XArchFlatInterface implies that there is a class file called XArchFlatInterface.class in a directory called:

edu/uci/isr/xarchutils (UNIX) or
edu\uci\isr\xarchutils (Windows)
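The mapping is mechanical, as this small sketch shows: replace each dot in the fully-qualified name with the directory separator and add .class. The class name PackagePaths and the method classFilePath are made up for the example, and nested classes are not covered.

```java
class PackagePaths {
    /** Maps a fully-qualified class name to the relative path of its class file. */
    static String classFilePath(String className, char separator) {
        // Every package level becomes one directory level.
        return className.replace('.', separator) + ".class";
    }

    public static void main(String[] args) {
        String name = "edu.uci.isr.xarchutils.XArchFlatInterface";
        System.out.println(classFilePath(name, '/'));   // UNIX form
        System.out.println(classFilePath(name, '\\'));  // Windows form
    }
}
```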

Once you understand how Java packages map to filesystem directories, it is possible to explore how the CLASSPATH works.

As noted above, there are three types of things that can be on the CLASSPATH: directories, JAR files, and ZIP files. A JAR (Java ARchive) file is simply a ZIP file with a distinguished file called the manifest in it, but for the purposes of this discussion, the difference between a JAR file and a ZIP file is irrelevant.

Each entry in the CLASSPATH represents a location from which the Java runtime will search for classes when it needs to load them. The CLASSPATH is searched in order, so in the case of multiple classes with the same fully qualified class name (that is, package + class name) on the CLASSPATH, the first one found is loaded.

So, let's say there is a directory called ~edashofy/projects as the first entry on the CLASSPATH. When Java needs to load a class called edu.uci.ics.xarchutils.XArchFlatInterface, it will look for a file called:

~edashofy/projects/edu/uci/ics/xarchutils/XArchFlatInterface.class

JAR or ZIP archives on the CLASSPATH are searched as if they were a filesystem root. That is, ZIP and JAR files can contain directories and files just as if they were a directory themselves. So, let's say that a file called archstudio.jar is the first entry on the CLASSPATH. When Java needs to load a class called edu.uci.ics.xarchutils.XArchFlatInterface, it will search the archive file for an entry called:

edu/uci/ics/xarchutils/XArchFlatInterface.class

and unZIP it from the archive.

Important: If your CLASSPATH contains a directory which, in turn, contains a JAR or ZIP archive, the JAR or ZIP archive will not be automatically searched. It must be added separately to the CLASSPATH.
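The search described in the last few paragraphs can be approximated in Java. The sketch below is a simplification under stated assumptions, not the runtime's actual loader: the class ClasspathLocator and its methods are hypothetical, entries that do not exist are simply skipped, and the standard java.util.zip.ZipFile class is used to look inside archives. Note that a directory entry is checked only for the class file itself; archives sitting inside that directory are never opened.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

class ClasspathLocator {
    /** Returns the first classpath entry that holds the class, or null if none does. */
    static String locate(String className, String[] classpathEntries) {
        // Package levels map to directories; archive entries always use '/'.
        String entryName = className.replace('.', '/') + ".class";
        for (int i = 0; i < classpathEntries.length; i++) {
            File entry = new File(classpathEntries[i]);
            if (entry.isDirectory()) {
                String relative = entryName.replace('/', File.separatorChar);
                if (new File(entry, relative).isFile()) {
                    return classpathEntries[i];
                }
            } else if (entry.isFile() && archiveContains(entry, entryName)) {
                return classpathEntries[i];
            }
        }
        return null;
    }

    /** Returns true if the ZIP or JAR file has an entry with the given name. */
    static boolean archiveContains(File archive, String entryName) {
        ZipFile zip = null;
        try {
            zip = new ZipFile(archive);
            return zip.getEntry(entryName) != null;
        } catch (IOException e) {
            return false; // not a readable ZIP or JAR file
        } finally {
            if (zip != null) {
                try { zip.close(); } catch (IOException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        String[] entries = System.getProperty("java.class.path").split(File.pathSeparator);
        String name = (args.length > 0) ? args[0] : "edu.uci.ics.xarchutils.XArchFlatInterface";
        String where = locate(name, entries);
        System.out.println(where == null ? name + ": not found on the classpath"
                                         : name + ": found in " + where);
    }
}
```

Because the loop returns at the first match, an entry earlier on the classpath always wins over a later one that contains a class with the same name.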

The runtime will iterate along the CLASSPATH looking for each class when it is needed. If the class is not found, a ClassNotFoundException is thrown by the runtime.
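You can check from inside a program whether a class is reachable by asking the runtime to load it by name and catching the exception. This is a minimal sketch; the class name LoadCheck and the method isLoadable are hypothetical, while Class.forName and ClassNotFoundException are part of the standard library.

```java
class LoadCheck {
    /** Returns true if the runtime can find and load the named class. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Thrown when no class loader consulted can find the class.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = { "java.lang.String", "edu.uci.ics.xarchutils.XArchFlatInterface" };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + (isLoadable(names[i]) ? "found" : "NOT found"));
        }
    }
}
```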

One important insight to glean from this is that the files in a given package or software system do not all have to be in the same place, as long as the classpath is specified correctly. So, if a JAR file containing edu/uci/isr/Foo.class is on the classpath, along with a directory containing a file structure edu/uci/isr/Bar.class, both classes will be found by the runtime when needed, even though they are in different places in the filesystem.


Important! Exceptions to the CLASSPATH Rules Above

In general, class loading works as specified above. However, there are a few notable exceptions:

  1. Classes in the java.* packages are treated specially by different runtimes, and may not be allowed to be loaded externally.
  2. Applications may define their own classloaders which ignore the CLASSPATH or load classes from a non-filesystem source such as a URL or a database.
  3. The Java Extension Mechanism changes how classloading works in version 1.2 and later. It allows certain archive files to be treated as part of the Java core; the runtime therefore adds them to the Java classpath automatically.
  4. Most, if not all, of the Java tools have a command-line parameter, -classpath, which allows the user to set the classpath for a single tool invocation, overriding whatever is stored in an environment variable.
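
The first exception can be observed from inside a program. In the sketch below (a hypothetical LoaderCheck class), a core class such as java.lang.String typically reports no class loader object at all, because it is loaded by the runtime's built-in bootstrap loader rather than from the CLASSPATH. Treat the exact output as runtime-dependent; the API allows, but does not require, a null result for bootstrap classes.

```java
class LoaderCheck {
    /** Describes which class loader loaded the named class. */
    static String loaderOf(String className) {
        try {
            ClassLoader loader = Class.forName(className).getClassLoader();
            // A null loader conventionally means the built-in bootstrap loader.
            return (loader == null) ? "bootstrap" : loader.toString();
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        System.out.println("java.lang.String: " + loaderOf("java.lang.String"));
        System.out.println("LoaderCheck: " + loaderOf(LoaderCheck.class.getName()));
    }
}
```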

Comments? Questions?

Comments or questions on this tutorial should go to Eric Dashofy.