How to Read an AWS S3 File with Java

How can I read an AWS S3 File with Java?

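Java's 'File' class only works with paths on the local file system; it knows nothing about S3. Use the AWS SDK to fetch the object instead. Here's an example of reading a file, taken from the AWS documentation (SDK for Java V1):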

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());        
S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();
// Process the objectData stream.
objectData.close();
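
As one way to fill in the "Process the objectData stream" step, here is a minimal sketch that reads the stream line by line. It assumes the object is UTF-8 text and uses only the JDK. The class name StreamLines and the method readLines are made up for this illustration, and the ByteArrayInputStream in main only stands in for the stream returned by getObjectContent().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
final class StreamLines {

    // Reads every line from the stream as UTF-8 text and closes the stream when done.
    static List<String> readLines(InputStream objectData) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(objectData, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for object.getObjectContent(); assumed to behave like any InputStream.
        InputStream objectData = new ByteArrayInputStream(
                "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        for (String line : readLines(objectData)) {
            System.out.println(line);
        }
    }
}
```

Because the try-with-resources block closes the stream, a separate objectData.close() call is not needed when the stream is handed to a helper like this.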

How to read a file from an Amazon S3 bucket using the AWS SDK for Java V2

Your code uses the wrong call to read an object from an Amazon S3 bucket with the AWS SDK for Java V2. It calls listBuckets, which returns the buckets, not the objects inside them. To get metadata about each object in a bucket, call listObjectsV2. For example, you can invoke the key() method of each returned S3Object to get its key name.

To read an object from an Amazon S3 bucket, you need the bucket name and the key name; then invoke getObjectAsBytes. The following Java example reads a PDF document and writes it to a local path:

package com.example.s3;

import software.amazon.awssdk.core.ResponseBytes;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.S3Exception;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
    
/**
 * To run this AWS code example, ensure that you have setup your development environment, including your AWS credentials.
 *
 * For information, see this documentation topic:
 *
 * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
 */

public class GetObjectData {

    public static void main(String[] args) {

        final String USAGE = "\n" +
                "Usage:\n" +
                "    GetObjectData <bucketName> <keyName> <path>\n\n" +
                "Where:\n" +
                "    bucketName - the Amazon S3 bucket name. \n\n"+
                "    keyName - the key name. \n\n"+
                "    path - the path where the file is written to. \n\n";

        if (args.length != 3) {
            System.out.println(USAGE);
            System.exit(1);
        }

        // Read the values from the command line instead of hardcoding them.
        String bucketName = args[0];
        String keyName = args[1];
        String path = args[2];

        Region region = Region.US_EAST_1;
        S3Client s3 = S3Client.builder()
                .region(region)
                .build();

        getObjectBytes(s3,bucketName,keyName, path);
        s3.close();
    }

    
    public static void getObjectBytes (S3Client s3, String bucketName, String keyName, String path ) {

        try {
            GetObjectRequest objectRequest = GetObjectRequest
                    .builder()
                    .key(keyName)
                    .bucket(bucketName)
                    .build();

            ResponseBytes<GetObjectResponse> objectBytes = s3.getObjectAsBytes(objectRequest);
            byte[] data = objectBytes.asByteArray();

            // Write the data to a local file. try-with-resources closes
            // the stream even if write() throws.
            File myFile = new File(path);
            try (OutputStream os = new FileOutputStream(myFile)) {
                os.write(data);
            }
            System.out.println("Successfully obtained bytes from an S3 object");

        } catch (IOException ex) {
            ex.printStackTrace();
        } catch (S3Exception e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
    }
}

You can find this example and many other Amazon S3 code examples for Java V2 on GitHub:

https://github.com/awsdocs/aws-doc-sdk-examples/tree/master/javav2/example_code/s3

How do I read the content of a file in Amazon S3

First, get the object's content as an InputStream:

S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();

Pass the InputStream, the file name, and the target directory path to the method below to save the stream to a local file.

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public void saveFile(String fileName, String path, InputStream objectData) throws IOException {
    // Create the target directory if it does not exist yet.
    File newDirectory = new File(path);
    if (!newDirectory.exists()) {
        newDirectory.mkdirs();
    }

    File downloadedFile = new File(path, fileName);
    // try-with-resources closes both streams, even if the copy fails.
    try (InputStream in = objectData;
         OutputStream out = new FileOutputStream(downloadedFile)) {
        // Copy in chunks. available() is only an estimate, and a single read()
        // may return fewer bytes than requested, so neither can size the copy.
        byte[] buffer = new byte[8192];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
    }
}

After you download the object, read the file, convert its contents to JSON, and write the result to a .txt file. You can then upload that .txt file to the desired S3 bucket.
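
The answer does not say what the JSON should look like, so the following is only a rough sketch under an assumed shape: a JSON array with one string per line of the downloaded file. The class LinesToJson and its methods are made up for this illustration, and the escaping is written by hand to keep the example JDK-only; real code would more likely use a JSON library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; the JSON shape (array of strings) is an assumption.
final class LinesToJson {

    // Escapes a string so it can sit inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a four-digit hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a JSON array of strings from the given lines.
    static String toJsonArray(List<String> lines) {
        return lines.stream()
                .map(line -> "\"" + escape(line) + "\"")
                .collect(Collectors.joining(",", "[", "]"));
    }

    // Reads the downloaded file and writes its JSON form to a .txt file.
    static void convert(Path downloaded, Path jsonTxt) throws IOException {
        List<String> lines = Files.readAllLines(downloaded, StandardCharsets.UTF_8);
        Files.write(jsonTxt, toJsonArray(lines).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the downloaded object and the .txt output.
        Path downloaded = Files.createTempFile("downloaded-", ".dat");
        Path jsonTxt = Files.createTempFile("converted-", ".txt");
        Files.write(downloaded, Arrays.asList("alpha", "beta"), StandardCharsets.UTF_8);
        convert(downloaded, jsonTxt);
        System.out.println(new String(Files.readAllBytes(jsonTxt), StandardCharsets.UTF_8));
        Files.delete(downloaded);
        Files.delete(jsonTxt);
    }
}
```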

Reading a file on aws s3 with a Java lambda

I wrapped all the code in the handleRequest method in the following try/catch. I had not been catching Errors before, which is why I did not see the problem:

try {
    //My code above
} catch (Error | Exception e) {
    LOGGER.error(e);
}

With that in place, I got a java.lang.OutOfMemoryError. I had set the Lambda memory to 128 MB because my code ran locally with less than that. Reading from S3 seems to need a bit more, and it now works fine with 512 MB.
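
The answer does not show the code that ran out of memory, so whether this applies is a guess. If the handler was holding the whole object in a byte array, one way to lower memory use is to stream the content to a file and keep only a small buffer in memory. Here is a minimal JDK-only sketch; the class StreamToFile and the method copyToFile are made up for this illustration, and the ByteArrayInputStream stands in for the S3 object stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the AWS SDK.
final class StreamToFile {

    // Copies the stream to the target file without loading the whole content
    // into memory. Returns the number of bytes written and closes the stream.
    static long copyToFile(InputStream objectData, Path target) throws IOException {
        try (InputStream in = objectData) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the S3 object stream.
        InputStream objectData = new ByteArrayInputStream(new byte[1024 * 1024]);
        Path target = Files.createTempFile("s3-object-", ".bin");
        long written = copyToFile(objectData, target);
        System.out.println("Wrote " + written + " bytes to " + target);
        Files.delete(target);
    }
}
```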

Get an Amazon S3 object deserialized to a Java object

You are looking at the correct guide but the wrong topic for the latest Java API. The topic you are looking at uses the old V1 code.

Look at the following topic instead. It uses the AWS SDK for Java V2, which is considered best practice. V2 packages start with software.amazon.awssdk.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/example_s3_GetObject_section.html

Once you get an object from an S3 bucket, you can convert it into a byte array and, from there, do whatever you need to. This V2 example shows how to get a byte array from an object.

package com.example.s3;

import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
import software.amazon.awssdk.core.ResponseBytes;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.S3Exception;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/**
 * Before running this Java V2 code example, set up your development environment, including your credentials.
 *
 * For more information, see the following documentation topic:
 *
 * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
 */

public class GetObjectData {

    public static void main(String[] args) {

        final String usage = "\n" +
                "Usage:\n" +
                "    <bucketName> <keyName> <path>\n\n" +
                "Where:\n" +
                "    bucketName - The Amazon S3 bucket name. \n\n"+
                "    keyName - The key name. \n\n"+
                "    path - The path where the file is written to. \n\n";

        if (args.length != 3) {
            System.out.println(usage);
            System.exit(1);
        }

        String bucketName = args[0];
        String keyName = args[1];
        String path = args[2];

        ProfileCredentialsProvider credentialsProvider = ProfileCredentialsProvider.create();
        Region region = Region.US_EAST_1;
        S3Client s3 = S3Client.builder()
                .region(region)
                .credentialsProvider(credentialsProvider)
                .build();

        getObjectBytes(s3,bucketName,keyName, path);
        s3.close();
    }

    public static void getObjectBytes (S3Client s3, String bucketName, String keyName, String path ) {

        try {
            GetObjectRequest objectRequest = GetObjectRequest
                    .builder()
                    .key(keyName)
                    .bucket(bucketName)
                    .build();

            ResponseBytes<GetObjectResponse> objectBytes = s3.getObjectAsBytes(objectRequest);
            byte[] data = objectBytes.asByteArray();

            // Write the data to a local file. try-with-resources closes
            // the stream even if write() throws.
            File myFile = new File(path);
            try (OutputStream os = new FileOutputStream(myFile)) {
                os.write(data);
            }
            System.out.println("Successfully obtained bytes from an S3 object");

        } catch (IOException ex) {
            ex.printStackTrace();
        } catch (S3Exception e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
    }
}
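
To go from the byte array to a Java object, one option is standard Java serialization. This is only a sketch and it assumes the S3 object was written with ObjectOutputStream in the first place; the answer does not say how the object was produced. The class ObjectBytes and the nested Book type are made up for this illustration, and the bytes are created locally here instead of coming from objectBytes.asByteArray().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper, not part of the AWS SDK.
final class ObjectBytes {

    // Hypothetical payload type used only for this example.
    static final class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;

        Book(String title) {
            this.title = title;
        }
    }

    // Serializes a value to bytes, as the producer of the S3 object is assumed to have done.
    static byte[] serialize(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Deserializes the bytes and checks that the result has the expected type.
    static <T> T deserialize(byte[] data, Class<T> type)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return type.cast(in.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the byte array obtained from the S3 object.
        byte[] data = serialize(new Book("S3 Guide"));
        Book book = deserialize(data, Book.class);
        System.out.println(book.title);
    }
}
```

Java deserialization should only be used on data from a source you trust. If the object is stored as JSON or another text format instead, parse the bytes with a matching parser rather than ObjectInputStream.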