How to list all file names in a zip/tar.gz file from an InputStream?

Question

I currently use the S3 getObject method to obtain a ResponseInputStream for a compressed file, and then process the archive through stream-based APIs, similar to the following code:

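import java.io.ByteArrayOutputStream;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;

// s3.getZipIn() is my own helper that wraps the S3 response stream
ZipArchiveInputStream zipIn = s3.getZipIn();
ZipArchiveEntry entry;
while ((entry = zipIn.getNextZipEntry()) != null) {
    if (entry.isDirectory()) {
        continue;
    }
    // May be -1 when the size is not known while streaming
    long curFileSize = entry.getSize();
    // Reads the whole content of the current entry into memory
    ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
    zipIn.transferTo(byteOut);
    // do something
    String fileName = entry.getName();
}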

I think my current approach will download the entire compressed file before executing my logic.

But now I have a special requirement: I only need the relative path (file name) of each entry, not its content. I know that archive formats such as zip store this kind of metadata in headers or in a trailing section of the file. I have seen many simple ways to read the file names from a local file, but I am not sure whether a stream obtained over the network can skip the content in a similar way, so that I can complete this task without consuming a large amount of network bandwidth.

I know S3 supports partial downloads. Is there any reasonable solution or library to do this?
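
For the zip case, the idea could be sketched roughly as follows: fetch only the tail of the object, locate the end-of-central-directory record, then fetch just the central directory and read the names from it. ZipNameLister and RangeReader are hypothetical names, not part of any library; with the AWS SDK v2 the reader would presumably wrap a GetObjectRequest whose range is set to something like "bytes=start-end", with the object size taken from a HeadObject call. The sketch assumes UTF-8 entry names and does not handle ZIP64 or multi-disk archives.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ZipNameLister {

    /** Hypothetical abstraction over a ranged read, e.g. an S3 GetObject with a Range header. */
    interface RangeReader {
        byte[] read(long offset, int length) throws IOException;
    }

    private static final int EOCD_SIGNATURE = 0x06054b50; // end of central directory record
    private static final int CEN_SIGNATURE = 0x02014b50;  // central directory file header
    private static final int EOCD_MIN_SIZE = 22;
    private static final int MAX_TAIL = EOCD_MIN_SIZE + 0xFFFF; // record plus longest possible comment

    static List<String> listNames(RangeReader reader, long totalSize) throws IOException {
        // 1. Read only the tail of the archive, where the EOCD record must be.
        int tailLen = (int) Math.min(totalSize, MAX_TAIL);
        ByteBuffer tail = ByteBuffer.wrap(reader.read(totalSize - tailLen, tailLen))
                .order(ByteOrder.LITTLE_ENDIAN);

        // 2. Scan backwards for the EOCD signature.
        int eocd = -1;
        for (int i = tailLen - EOCD_MIN_SIZE; i >= 0; i--) {
            if (tail.getInt(i) == EOCD_SIGNATURE) {
                eocd = i;
                break;
            }
        }
        if (eocd < 0) {
            throw new IOException("End of central directory record not found");
        }

        int entryCount = tail.getShort(eocd + 10) & 0xFFFF;
        long cdSize = tail.getInt(eocd + 12) & 0xFFFFFFFFL;
        long cdOffset = tail.getInt(eocd + 16) & 0xFFFFFFFFL;
        if (entryCount == 0xFFFF || cdSize == 0xFFFFFFFFL || cdOffset == 0xFFFFFFFFL) {
            throw new IOException("ZIP64 archives are not handled in this sketch");
        }

        // 3. Read just the central directory and walk its fixed-layout headers.
        ByteBuffer cd = ByteBuffer.wrap(reader.read(cdOffset, (int) cdSize))
                .order(ByteOrder.LITTLE_ENDIAN);
        List<String> names = new ArrayList<>();
        int pos = 0;
        for (int i = 0; i < entryCount; i++) {
            if (cd.getInt(pos) != CEN_SIGNATURE) {
                throw new IOException("Unexpected central directory header at " + pos);
            }
            int nameLen = cd.getShort(pos + 28) & 0xFFFF;
            int extraLen = cd.getShort(pos + 30) & 0xFFFF;
            int commentLen = cd.getShort(pos + 32) & 0xFFFF;
            // Assumption: names are UTF-8 (older archives may use CP437 instead).
            String name = new String(cd.array(), pos + 46, nameLen, StandardCharsets.UTF_8);
            if (!name.endsWith("/")) { // skip directory entries
                names.add(name);
            }
            pos += 46 + nameLen + extraLen + commentLen;
        }
        return names;
    }
}
```

This would take two small ranged requests instead of the full download. As far as I understand, tar.gz has no such index: the entry headers are interleaved with the data inside a single gzip stream, so listing names there would still mean reading and decompressing the archive sequentially.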
