How to Import a Text File on AWS S3 into Pandas Without Writing to Disk

Read an Excel file from S3 into a pandas DataFrame

It is perfectly normal! obj is a dictionary; have you tried this?

df = pd.read_excel(obj['Body'], header=2)
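
For context, here is a minimal sketch of where obj comes from, assuming the file was fetched with boto3's get_object (the bucket and key names are placeholders):

import io

import boto3
import pandas as pd

s3 = boto3.client('s3')
obj = s3.get_object(Bucket='my-bucket', Key='reports/data.xlsx')

# obj is a dict; the file contents sit under the 'Body' key as a
# streaming, file-like object. Reading it into a BytesIO keeps
# everything in memory, so nothing is written to disk.
df = pd.read_excel(io.BytesIO(obj['Body'].read()), header=2)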

How to save a pandas profiling report output as an HTML/JSON file to an AWS S3 location

After generating the profile report as

import pandas_profiling

profile = pandas_profiling.ProfileReport(
    df, title="Data Profile Report", minimal=True)

To write the .html file to S3, we first write it to the local filesystem, then upload it from there to S3, and finally delete the local copy, as below:

import os

import awswrangler

# write the .html report to S3 via a temporary local file
profile.to_file('./file_name-profile.html')
awswrangler.s3.upload(local_file='./file_name-profile.html', path='s3://analytics-storage-bucket/processedData/file_name-profile.html')
os.remove('./file_name-profile.html')

This code works on EC2 and in AWS Glue jobs.
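
If you prefer to skip the local file entirely (in the spirit of the page title), here is a sketch that renders the report to an HTML string in memory and uploads it with put_object; it assumes your pandas-profiling version provides ProfileReport.to_html(), and reuses the same bucket and key as above:

import boto3

# Render the report to an HTML string in memory (no local file needed).
html = profile.to_html()

boto3.client('s3').put_object(
    Body=html.encode('utf-8'),
    Bucket='analytics-storage-bucket',
    Key='processedData/file_name-profile.html',
    ContentType='text/html',
)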

Upload data to S3 bucket without saving it to a disk

Save text file:

import boto3

obj = 'some string'
bucket = 'my-bucket'
key = 'prefix/filename.txt'

boto3.client('s3').put_object(Body=obj, Bucket=bucket, Key=key)

Save a CSV file from a pandas DataFrame:

import io

import boto3

df = my_dataframe  # your pandas DataFrame
bucket = 'my-bucket'
key = 'prefix/filename.csv'

csv_buffer = io.StringIO()
df.to_csv(csv_buffer)
boto3.client('s3').put_object(Body=csv_buffer.getvalue(), Bucket=bucket, Key=key)
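
Going the other direction (the question in the page title), here is a sketch for reading such a CSV from S3 straight into a DataFrame without touching disk; the bucket and key are placeholders:

import io

import boto3
import pandas as pd

obj = boto3.client('s3').get_object(Bucket='my-bucket', Key='prefix/filename.csv')

# The response's 'Body' is a file-like stream; read it into an in-memory
# buffer and hand that to pandas -- no temporary file involved.
df = pd.read_csv(io.BytesIO(obj['Body'].read()))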

Files uploaded to S3 are missing content

You need to close the file first so that the data is written to the file system.

with open(f"textfile.txt", "w") as text_file:
text_file.write(description)

#now the with block ends and calls close() on the file and it's written to disk
upload_to_aws("textfile.txt",'bucket-name',"test.txt")

This can also be done with flush() if you want to keep the file open to write more, but you don't need that here.
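
For completeness, a sketch of the flush() variant, reusing the same upload_to_aws helper and description string from the question:

text_file = open("textfile.txt", "w")
text_file.write(description)

# flush() pushes Python's write buffer out to the OS, so other readers
# (including the upload below) see the content while the file stays open.
text_file.flush()

upload_to_aws("textfile.txt", 'bucket-name', "test.txt")

# continue writing if needed, then close
text_file.write("more text\n")
text_file.close()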


