How to read and process large text/CSV files from an S3 bucket using C#?
Increase your Lambda timeout, which currently has a hard limit of 15 minutes.
If your CSV processing takes longer than 15 minutes, Lambda is not the right tool for the job; it is meant for short-running processing.
The right alternative is out of scope here, but you could look at spot EC2 instances, Step Functions, or containers on Fargate.
Related: to speed up your current process, issue the S3 requests in parallel at the start and then process the results in one go, i.e. create all the tasks first and then await them together.
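The "create the tasks, then await them all" idea can be sketched in Python with a thread pool. This is only an illustration under assumptions: `fetch_all` and its `fetch` callback are hypothetical names, and `fetch` stands in for whatever actually downloads one object.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_all(
    keys: Iterable[str],
    fetch: Callable[[str], bytes],  # hypothetical: downloads one object by key
    max_workers: int = 8,
) -> Dict[str, bytes]:
    """Start every download up front, then collect all results together."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all requests first so they run concurrently.
        futures = {key: pool.submit(fetch, key) for key in keys}
        # Then wait for each one; result() re-raises any download error.
        return {key: future.result() for key, future in futures.items()}
```

With boto3, `fetch` could be something like `lambda key: s3.get_object(Bucket=bucket, Key=key)["Body"].read()`, where `s3` and `bucket` are assumed to be defined by the caller. Parallel downloads shorten the wait on S3 but do not lift the 15-minute limit.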
Read and parse CSV file in S3 without downloading the entire file
You should be able to use the createReadStream method and pipe it into fast-csv:
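const csv = require('fast-csv')
// Stream the object from S3 instead of buffering the whole file in memory
const s3Stream = s3.getObject(params).createReadStream()
// fromStream is the fast-csv 2.x name; from 3.x on it is called parseStream
csv.fromStream(s3Stream)
  .on('data', (data) => {
    // do something with each parsed row here
  })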
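For comparison, the same streaming idea can be sketched in Python. The helper name `iter_csv_rows` is hypothetical, and the sketch assumes the object is UTF-8 text; it accepts any binary file-like object, which is what the `Body` returned by boto3's `get_object` is.

```python
import codecs
import csv
from typing import BinaryIO, Iterator, List


def iter_csv_rows(byte_stream: BinaryIO, encoding: str = "utf-8") -> Iterator[List[str]]:
    """Yield parsed CSV rows one at a time from a binary stream."""
    # Decode incrementally so only a small buffer is held in memory.
    text_stream = codecs.getreader(encoding)(byte_stream)
    # csv.reader pulls lines from the decoder lazily, row by row.
    yield from csv.reader(text_stream)
```

A call might look like `for row in iter_csv_rows(s3.get_object(Bucket=bucket, Key=key)["Body"]): ...`, with `s3`, `bucket`, and `key` supplied by the caller.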
Extract specific column from csv stored in S3
You can use the result returned from Amazon Athena via get_query_results().
If the data variable contains the JSON shown in your question, you can extract a list of the instances with:
rows = [row['Data'][1]['VarCharValue'].replace('"', '') for row in data]
print(rows)
The output is:
['instanceId', 'i-053090803', 'i-0724f62a', 'i-552', 'i-07f4e5', 'i-0eb453', 'i-062120', 'i-0121a04', 'i-0f213', 'i-0ee19d8', 'i-04ad3c29', 'i-7c6166', 'i-07bc579d', 'i-0b8bc7df5']
You can skip the column header by referencing: rows[1:]
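If the column position is not known in advance, it can be looked up from the header row instead of hard-coding index 1. The sketch below assumes `rows` is a list of row dicts shaped like the ones above (each with a `Data` list of `VarCharValue` cells) and that the first row is the header; `extract_column` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def _text(cell: Dict[str, Any]) -> str:
    # A cell without VarCharValue is treated as empty (assumption for nulls).
    return cell.get("VarCharValue", "").replace('"', "")


def extract_column(rows: List[Dict[str, Any]], header: str) -> List[str]:
    """Return the values of one named column, skipping the header row."""
    if not rows:
        return []
    names = [_text(cell) for cell in rows[0]["Data"]]
    index = names.index(header)  # raises ValueError if the column is missing
    return [_text(row["Data"][index]) for row in rows[1:]]
```

For the data in the question this would be called as `extract_column(data, 'instanceId')`.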
How do I read a csv file from aws s3 in aws lambda
According to the csv module documentation (https://docs.python.org/3/library/csv.html), it looks like you are using the csv module incorrectly. As a result the reader is empty, which is why your code does not return anything.
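The asker's code is not shown here, so the following is only a sketch of one usage that matches the documentation: csv.reader expects text lines, so the bytes read from S3 are decoded and wrapped in a text stream first. The helper name `parse_csv_bytes` and the UTF-8 default are assumptions.

```python
import csv
import io
from typing import List


def parse_csv_bytes(body: bytes, encoding: str = "utf-8") -> List[List[str]]:
    """Parse the raw bytes of a CSV object into a list of rows."""
    # newline="" leaves line endings untouched, as the csv docs recommend.
    text_stream = io.StringIO(body.decode(encoding), newline="")
    return list(csv.reader(text_stream))
```

In a Lambda handler the bytes would typically come from `s3.get_object(Bucket=bucket, Key=key)["Body"].read()`, with `s3`, `bucket`, and `key` defined by the handler.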