Working with ZIP Files

Last modified by ObiEzechukwu on 2011/03/03 15:45

Background

The zip compression format is useful not only for reducing the size of data sets but also for grouping multiple named data sets into a single manageable archive. The commons library provides utility wrappers for the Java zip API, which simplify both the creation of zip archives and the reading of compressed files.

In this section, we provide two sample applications which demonstrate how to create and read zip archives respectively. Both examples can be downloaded from our Subversion repository, which is accessible via this link. Please note that you will need an obix labs account in order to access the repository, as we discourage anonymous access.

Example 1: Building a ZIP file

In this example we demonstrate how to create a zip archive using the ZIPBuilder class from the commons distribution. This example can be checked out using this direct repository link. When set up in the Eclipse IDE, the application should be laid out as follows:

[Figure: sample application layout]

The sample consists of three resource files: obix-license.pdf, obix-license.txt and world-greetings.xml, which, for the purposes of this demonstration, will be compressed into a ZIP archive. The sample also includes a single class, ZIPBuilderDemo, which is its entry point and which performs the actual task of invoking the ZIPBuilder. To see how the resulting zip file is built, let us examine the main method, which is listed below:

public static void main(String[] args) throws ObixException
{
        ZIPBuilder zipBuilder = new ZIPBuilder(true, true);

        // Read each sample resource from the classpath and add it to the archive.
        for (String resourceName : SAMPLE_RESOURCES)
        {
                byte[] resourceData =
                        ClasspathResourceUtils.getClasspathResource(
                                resourceName, ZIPBuilderDemo.class);
                zipBuilder.add(resourceName, resourceData);
        }

        // Signal that no further entries will be added.
        zipBuilder.close();

        write(zipBuilder);
}

private static void write(ZIPBuilder zipBuilder) throws ObixException
{
        File outputFile = new File(OUTPUT_FILE_NAME);
        zipBuilder.write(outputFile);
}

Observe that, at the top of the main method, we create a ZIPBuilder instance. Within the subsequent for loop, we iterate through the sample resource names, read the corresponding resource data using the obix classpath utilities, and add the data to the zip file being constructed. When we have finished, we close the ZIPBuilder to indicate that we are done. At this point, the resulting zip file can be written to a file or an output stream, or simply referenced as a byte array. Note that the add method is overloaded, so it can also accept directory and file handles, input streams, and bulk data maps.
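For readers curious about what such a builder has to do underneath, the following is a minimal sketch of the same add, close and retrieve sequence using only the standard java.util.zip API. It is not the ZIPBuilder implementation: the class name InMemoryZipSketch, the method buildZip and the placeholder entry contents are assumptions made purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class; not part of the obix commons library.
final class InMemoryZipSketch
{
    // Compresses each named byte array into a single in-memory zip archive.
    static byte[] buildZip(Map<String, byte[]> resources) throws IOException
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer))
        {
            for (Map.Entry<String, byte[]> resource : resources.entrySet())
            {
                zip.putNextEntry(new ZipEntry(resource.getKey()));
                zip.write(resource.getValue());
                zip.closeEntry();
            }
        } // closing the stream finishes the archive

        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException
    {
        // Placeholder contents stand in for the real sample resources.
        Map<String, byte[]> resources = new LinkedHashMap<>();
        resources.put("obix-license.txt",
                "placeholder text".getBytes(StandardCharsets.UTF_8));
        resources.put("world-greetings.xml",
                "<greetings/>".getBytes(StandardCharsets.UTF_8));

        byte[] archive = buildZip(resources);
        System.out.println("Archive size in bytes: " + archive.length);
    }
}
```

The archive must be closed before its bytes are used, because the zip directory is only written when the stream is finished. This mirrors the reason the demo calls close() before write(...).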

If you execute the sample application, it should result in a file called "sample-zipfile.zip" being written to the application's current working directory. The filename and location of the output can of course be overridden by changing the value of the constant OUTPUT_FILE_NAME, as declared at the top of the ZIPBuilderDemo class.

Example 2: Reading a ZIP file

In this example we demonstrate how to read a zip archive using the ZIPReader class from the commons distribution. This example can be checked out using this direct repository link. When set up in the Eclipse IDE, the sample application should be laid out as follows:

[Figure: sample application layout]

As can be seen from the above image, the sample consists of a single source file, ZIPReaderDemo.java, and a sample zip file, sample-zip-file.zip. If possible, we recommend that you conduct a visual inspection of the zip file in order to familiarize yourself with its structure. You should find that the file contents are consistent with the items shown in the following two figures. At the top level, it consists of a single file called world-greetings.xml and a folder named licenses, which in turn contains two further files, obix-license.pdf and obix-license.txt.

[Figures: directory top-level contents; subdirectory contents]

The intent of the demo class is to list the contents of this zip file, using tabs to indent the lines so as to show folder and file depth. If you execute the class, it should produce the following output:

File:world-greetings.xml(1856)
Directory: license/
        File:license/obix-license.pdf(109356)
        File:license/obix-license.txt(1576)

To understand this output, let us examine the internals of the class, which are listed below:

public static void main(String[] args) throws ObixException
{
        // Load the sample archive from the classpath as raw bytes.
        byte[] zipResourceData =
                ClasspathResourceUtils.getClasspathResource(
                        "sample-zip-file.zip", ZIPReaderDemo.class);

        ZIPReader reader = new ZIPReader(zipResourceData);

        List<CompressedItem> contents = reader.getContents();

        // Top-level entries are printed at depth 0.
        for (CompressedItem compressedItem : contents)
        {
                if (CompressedItem.isFolder(compressedItem))
                        printDirectoryContents(0, (CompressedDataFolder) compressedItem);
                else
                        printFileName(0, (CompressedData) compressedItem);
        }
}

Observe that we create a ZIPReader instance, initialised from the zip data, which is in turn read from the application's classpath using the obix classpath utilities. For this reason, it is important to ensure that the zip file "sample-zip-file.zip" is not filtered from the build output.

Once the reader is created, we obtain the contents of the zip file by calling the getContents() method on the reader instance. Notice that the getContents() method returns a collection of CompressedItem instances. Given that CompressedItem is the parent class of all zip entry types, it is necessary to have a means of determining whether an entry is a file or a directory. If you examine the loop in the main method, you will notice that we use the isFolder(...) method to test whether an item is a folder. Depending on the result of this call, we can cast the item to either a CompressedDataFolder or a CompressedData instance, which represent folders and files respectively.
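For comparison, the standard java.util.zip API exposes a zip archive as a flat sequence of entries rather than a tree, and distinguishes directories through ZipEntry.isDirectory(). The sketch below illustrates that file-or-folder test using the JDK only; it makes no claim about how ZIPReader works internally. The class FlatZipListingSketch, its methods, and the tiny sample archive it builds for itself are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class; not part of the obix commons library.
final class FlatZipListingSketch
{
    // Returns one line per entry, labelled as a directory or a file.
    static List<String> describeEntries(byte[] zipData) throws IOException
    {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zip =
                new ZipInputStream(new ByteArrayInputStream(zipData)))
        {
            byte[] chunk = new byte[4096];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null)
            {
                if (entry.isDirectory())
                {
                    lines.add("Directory: " + entry.getName());
                    continue;
                }

                // Count the uncompressed bytes of the current entry.
                long size = 0;
                int read;
                while ((read = zip.read(chunk)) != -1)
                {
                    size += read;
                }
                lines.add("File:" + entry.getName() + "(" + size + ")");
            }
        }
        return lines;
    }

    // Builds a small archive with one folder entry and one file entry.
    static byte[] buildSampleArchive() throws IOException
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer))
        {
            zip.putNextEntry(new ZipEntry("license/")); // trailing '/' marks a directory
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("license/notes.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException
    {
        for (String line : describeEntries(buildSampleArchive()))
        {
            System.out.println(line);
        }
    }
}
```

With a flat listing like this, the caller has to reconstruct any folder hierarchy from the entry names. A tree of typed items, as returned by getContents(), removes that step.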

Once we have determined whether an item is a folder or a file, we invoke either printDirectoryContents(...), which prints the directory name followed by its contents, or printFileName(...), which prints the zip entry name of a file. Note that the first argument to these methods is the depth of the item in the entry tree, which determines the indentation of the output line for the entry being printed. For the sake of completeness, the following excerpt shows the definition of the print... methods.

private static void printFileName(int depth, CompressedData compressedItem)
{
        String indent = buildIndent(depth);
        System.out.println(indent + "File:" + compressedItem.getName() +
                "(" + compressedItem.getData().length + ")");
}

private static void printDirectoryContents(int depth, CompressedDataFolder compressedItem)
{
        String indent = buildIndent(depth);
        System.out.println(indent + "Directory: " + compressedItem.getName());
        Collection<CompressedItem> contents = compressedItem.getChildren();

        // Recurse into sub-folders, printing children one level deeper.
        for (CompressedItem child : contents)
        {
                if (CompressedItem.isFolder(child))
                        printDirectoryContents(depth + 1, (CompressedDataFolder) child);
                else
                        printFileName(depth + 1, (CompressedData) child);
        }
}

// Builds an indent of one tab character per level of depth.
private static String buildIndent(int depth)
{
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < depth; i++)
                result.append("\t");
        return result.toString();
}
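The demo obtains each item's depth from the recursion over the entry tree. When only flat, slash-separated entry names are available, as with the standard java.util.zip API, the same depth can be derived from the name itself. The following is a minimal sketch under that assumption; EntryDepthSketch, depthOf and indent are hypothetical names, and the sample entry names are taken from the output shown earlier.

```java
// Hypothetical illustration class; not part of the obix commons library.
final class EntryDepthSketch
{
    // Counts the '/' separators in an entry name, ignoring the trailing
    // '/' that marks a directory entry, to give its depth in the tree.
    static int depthOf(String entryName)
    {
        int end = entryName.endsWith("/")
                ? entryName.length() - 1
                : entryName.length();

        int depth = 0;
        for (int i = 0; i < end; i++)
        {
            if (entryName.charAt(i) == '/')
            {
                depth++;
            }
        }
        return depth;
    }

    // Builds an indent of one tab character per level of depth.
    static String indent(int depth)
    {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < depth; i++)
        {
            result.append('\t');
        }
        return result.toString();
    }

    public static void main(String[] args)
    {
        String[] names = {
            "world-greetings.xml",
            "license/",
            "license/obix-license.txt"
        };
        for (String name : names)
        {
            System.out.println(indent(depthOf(name)) + name);
        }
    }
}
```

Here a top-level file and a top-level folder both have depth 0, while a file inside that folder has depth 1, which matches the indentation of the demo output.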

Created by ObiEzechukwu on 2011/03/03 15:16

COPYRIGHT: 2010 OBIX LABS LTD