Read a file line by line from S3 using boto?
It appears that boto has a read() function that can do this. Here's some code that works for me:
>>> import boto
>>> from boto.s3.key import Key
>>> conn = boto.s3.connect_to_region('ap-southeast-2')  # connect_s3() takes access keys, not a region name
>>> bucket = conn.get_bucket('bucket-name')
>>> k = Key(bucket)
>>> k.key = 'filename.txt'
>>> k.open()
>>> k.read(10)
'This text '
The call to read(n) returns the next n bytes from the object.
Of course, this won't automatically return "the header line", but you could call it with a large enough number to return the header line at a minimum.
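Rather than guessing a "large enough" number, one option is to keep calling read(n) until a newline shows up. The sketch below is only an illustration: read_first_line is a hypothetical helper, not part of boto, and an io.BytesIO object stands in for the opened key, since all it needs is a read(n) method. Any bytes read past the first newline are discarded.

```python
import io


def read_first_line(stream, chunk_size=1024):
    """Read `stream` in chunks until the first newline and return that line.

    `stream` is anything with a read(n) method (assumed here to behave like
    an opened boto Key). The newline itself is not included in the result.
    """
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of the object reached without seeing a newline.
            return buffer
        buffer += chunk
        newline_at = buffer.find(b"\n")
        if newline_at != -1:
            return buffer[:newline_at]


# Stand-in for an opened key; a real key would be read the same way.
fake_key = io.BytesIO(b"id,name,value\n1,foo,10\n2,bar,20\n")
header = read_first_line(fake_key, chunk_size=4)
print(header)  # b'id,name,value'
```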
Read file content from S3 bucket with boto3
boto3 offers a resource model that makes tasks like iterating through objects easier. Unfortunately, StreamingBody doesn't provide readline or readlines.
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('test-bucket')
# Iterates through all the objects, doing the pagination for you. Each obj
# is an ObjectSummary, so it doesn't contain the body. You'll need to call
# get to get the whole body.
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
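If you want lines rather than the whole body at once, one workaround is to split the stream yourself. The sketch below is a rough illustration: iter_body_lines is a hypothetical helper that works on any object with a read(amt) method, and an io.BytesIO object stands in for the body. Recent botocore releases also expose an iter_lines() method on StreamingBody, which may make this unnecessary depending on your installed version.

```python
import io


def iter_body_lines(body, chunk_size=1024):
    """Yield lines (as bytes, without the trailing newline) from `body`.

    `body` is assumed to be a file-like object with read(amt), such as the
    value under 'Body' in a get() response.
    """
    pending = b""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        lines = (pending + chunk).split(b"\n")
        # The last piece may be an incomplete line; keep it for the next chunk.
        pending = lines.pop()
        for line in lines:
            yield line
    if pending:
        # Final line that had no trailing newline.
        yield pending


# Stand-in for obj.get()['Body']
fake_body = io.BytesIO(b"first\nsecond\nthird")
for line in iter_body_lines(fake_body, chunk_size=4):
    print(line.decode("utf-8"))
```

Only one chunk plus one partial line is held in memory at a time, which matters for large objects.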
read .txt file from s3 bucket not returning all file content
CloudWatch Logs for this Lambda function should be the definitive view of the printed logs.
Your code looks correct. The read function on StreamingBody returns all of the data when you don't pass an amount (the amt parameter), so your code is receiving the entire file contents.
It looks like the truncated view you are seeing in the Lambda console may simply be a limitation of the console, in order to avoid showing an overwhelming number of lines of output.
Read a csv file from aws s3 using boto and pandas
Here is what I have done to successfully read the df from a csv on S3.
import pandas as pd
import boto3
bucket = "yourbucket"
file_name = "your_file.csv"
# Create an S3 client using the default credentials and configuration
s3 = boto3.client('s3')
# Fetch the object stored under the given key in the bucket
obj = s3.get_object(Bucket=bucket, Key=file_name)
# The response is a dict; its 'Body' entry is the file-like stream passed to read_csv
initial_df = pd.read_csv(obj['Body'])
How to read Txt file from S3 Bucket using Python And Boto3
Your ids is the literal string "['i-041fb789f1554b7d5', 'i-0d0c876682eef71ae']", not a list. To parse it and convert it to a list, use the ast module:
import ast
# ...
InstancetobeStart = obj.get()['Body'].read().decode('utf-8')
ids = ast.literal_eval(InstancetobeStart)
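To see the difference between the string and the parsed list without touching S3, here is a small self-contained sketch. parse_id_list is a hypothetical helper, and the raw bytes simply mimic what read() would return for the file contents quoted above. The type check is an extra precaution, not something the original code requires.

```python
import ast


def parse_id_list(raw_bytes):
    """Decode file contents and parse them as a Python list of strings."""
    text = raw_bytes.decode("utf-8").strip()
    # literal_eval only evaluates Python literals, so it is much safer
    # than eval() for text that comes from a file.
    value = ast.literal_eval(text)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("expected a list of strings, got %r" % (value,))
    return value


# Mimics the bytes returned by obj.get()['Body'].read()
raw = b"['i-041fb789f1554b7d5', 'i-0d0c876682eef71ae']\n"
ids = parse_id_list(raw)
print(type(ids).__name__, len(ids))  # list 2
print(ids[0])  # i-041fb789f1554b7d5
```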
Reading part of a file in S3 using Boto
S3 supports GET requests using the 'Range' HTTP header which is what you're after.
To specify a Range request in boto, just add a header dictionary specifying the 'Range' key for the bytes you are interested in. Adapted from Mitchell Garnaat's response:
import boto
s3 = boto.connect_s3()
bucket = s3.lookup('mybucket')
key = bucket.lookup('mykey')
your_bytes = key.get_contents_as_string(headers={'Range' : 'bytes=73-1024'})
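For boto3, the equivalent is the Range parameter of get_object. The sketch below is a minimal illustration under stated assumptions: byte_range and read_range are hypothetical helper names, and the client is passed in so the helpers can be exercised without AWS credentials. Both offsets in an HTTP byte range are inclusive, so bytes=73-1024 covers 952 bytes.

```python
def byte_range(first, last):
    """Build an HTTP Range header value; both offsets are inclusive."""
    if first < 0 or last < first:
        raise ValueError("invalid byte range: %r-%r" % (first, last))
    return "bytes=%d-%d" % (first, last)


def read_range(client, bucket, key, first, last):
    """Fetch only bytes first..last of an object.

    `client` is assumed to be a boto3 S3 client, i.e. boto3.client('s3').
    """
    response = client.get_object(
        Bucket=bucket, Key=key, Range=byte_range(first, last)
    )
    return response["Body"].read()


print(byte_range(73, 1024))  # bytes=73-1024

# Usage against a real bucket (names taken from the example above):
# import boto3
# your_bytes = read_range(boto3.client("s3"), "mybucket", "mykey", 73, 1024)
```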
How to read .dat file from AWS S3 using mdfreader
The easiest method would be to use download_file() to download the file from Amazon S3 to /tmp/ on the local disk.
Then, you can use your existing code to process the file. This is definitely not a 'hack' -- it is a commonly used technique. It's certainly more reliable than streaming the file.
There is a limit on the amount of storage available and AWS Lambda containers can be reused, so either delete the temporary file after use, or use the same filename (e.g. /tmp/temp.dat) each time so that it overwrites the previous version.
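The download-then-delete pattern can be sketched roughly as follows. process_via_tmp is a hypothetical helper, `client` is assumed to be a boto3 S3 client (whose download_file takes the bucket, key and local filename), and `process` stands for whatever existing code reads the file from a local path, such as the mdfreader call.

```python
import os
import tempfile


def process_via_tmp(client, bucket, key, process, tmp_dir="/tmp"):
    """Download an S3 object to a temporary file, process it, then delete it.

    `process` is any callable that takes a local file path.
    """
    # Keep the key's extension in case the reader library depends on it.
    suffix = os.path.splitext(key)[1]
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=suffix)
    os.close(fd)
    try:
        client.download_file(bucket, key, path)
        return process(path)
    finally:
        # Lambda containers can be reused, so free the space every time,
        # even if the download or the processing raised an exception.
        if os.path.exists(path):
            os.remove(path)


# Usage inside a Lambda handler (bucket, key and reader are placeholders):
# import boto3
# result = process_via_tmp(boto3.client("s3"), "my-bucket", "data/file.dat",
#                          lambda path: os.path.getsize(path))
```

Using a unique temporary name avoids clashes between invocations, while the finally block covers the cleanup that a fixed filename would otherwise handle by overwriting.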