Difference in Boto3 Between Resource, Client, and Session

Difference in boto3 between resource, client, and session?

Client and Resource are two different abstractions within the boto3 SDK for making AWS service requests. If you want to make API calls to an AWS service with boto3, then you do so via a Client or a Resource.

You would typically choose to use either the Client abstraction or the Resource abstraction, but you can use both, as needed. I've outlined the differences below to help readers decide which to use.

Session is largely orthogonal to the concepts of Client and Resource (but is used by both).

Here's some more detailed information on what Client, Resource, and Session are all about.

Client:

  • this is the original boto3 API abstraction
  • it provides low-level AWS service access
  • all AWS service operations are supported by clients
  • it exposes botocore client to the developer
  • it typically maps 1:1 with the AWS service API
  • it exposes snake-cased method names (e.g. ListBuckets API => list_buckets method)
  • typically yields primitive, non-marshalled data (e.g. DynamoDB attributes are dicts representing primitive DynamoDB values)
  • requires you to code result pagination
  • it is generated from an AWS service description

Here's an example of client-level access to an S3 bucket's objects:

import boto3

client = boto3.client('s3')

response = client.list_objects_v2(Bucket='mybucket')

for content in response.get('Contents', []):
    obj_dict = client.get_object(Bucket='mybucket', Key=content['Key'])
    print(content['Key'], obj_dict['LastModified'])

Note: this client-level code lists at most 1000 objects, because list_objects_v2() returns at most 1000 keys per call. If there were more than 1000 objects, you would have to use a paginator, or implement your own loop that calls list_objects_v2() repeatedly with a continuation token.
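The hand-written loop mentioned in the note could look roughly like the sketch below. It assumes the documented list_objects_v2 response fields (Contents, IsTruncated, NextContinuationToken); the helper name iter_object_keys is made up for illustration and is not part of boto3.

```python
def iter_object_keys(client, bucket):
    """Yield every key in `bucket`, following continuation tokens.

    `client` is expected to behave like boto3.client('s3').
    (iter_object_keys is a hypothetical helper, not a boto3 function.)
    """
    kwargs = {"Bucket": bucket}
    while True:
        response = client.list_objects_v2(**kwargs)
        # 'Contents' is absent when a page (or the whole bucket) is empty.
        for item in response.get("Contents", []):
            yield item["Key"]
        if not response.get("IsTruncated"):
            break
        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]


# Typical usage (needs boto3 and AWS credentials):
#   import boto3
#   for key in iter_object_keys(boto3.client("s3"), "mybucket"):
#       print(key)
```

In practice client.get_paginator('list_objects_v2') does the same bookkeeping for you; the loop is shown only to make the mechanism visible.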

OK, so that's the low-level Client interface. Now onto the higher-level (more abstract) Resource interface.

Resource:

  • this is the newer boto3 API abstraction
  • it provides a high-level, object-oriented API
  • it does not provide 100% API coverage of AWS services
  • it uses identifiers and attributes
  • it has actions (operations on resources)
  • it exposes sub-resources and collections of AWS resources
  • typically yields marshalled data, not primitive AWS data (e.g. DynamoDB attributes are native Python values representing primitive DynamoDB values)
  • does result pagination for you
  • it is generated from an AWS resource description

Here's the equivalent example using resource-level access to an S3 bucket's objects:

import boto3

s3 = boto3.resource('s3')

bucket = s3.Bucket('mybucket')

for obj in bucket.objects.all():
    print(obj.key, obj.last_modified)

Note: in this case you do not have to make a second API call to get the objects; they're available to you as a collection on the bucket. These collections of sub-resources are lazily-loaded.

You can see that the Resource version of the code is much simpler, more compact, and has more capability (for example it does pagination for you and it exposes properties instead of a raw dictionary). The Client version of the code would actually be more complicated than shown above if you wanted to include pagination.
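The "primitive versus marshalled data" point from the two bullet lists is easiest to see with DynamoDB. The sketch below is only an illustration: the item contents are invented, and unmarshal is a hypothetical, simplified stand-in for the conversion that boto3 performs itself (boto3 ships a TypeDeserializer in boto3.dynamodb.types for this purpose).

```python
from decimal import Decimal


def unmarshal(value):
    """Convert one DynamoDB-typed value, e.g. {'N': '30'}, to a Python value.

    Illustrative only: handles a subset of type tags (S, N, BOOL, NULL, L, M).
    """
    tag, payload = next(iter(value.items()))
    if tag == "S":
        return payload
    if tag == "N":
        # Numbers travel as strings; the resource layer returns Decimal.
        return Decimal(payload)
    if tag == "BOOL":
        return payload
    if tag == "NULL":
        return None
    if tag == "L":
        return [unmarshal(v) for v in payload]
    if tag == "M":
        return {k: unmarshal(v) for k, v in payload.items()}
    raise ValueError(f"unsupported DynamoDB type tag: {tag}")


# Shape of an item as a client call such as get_item() returns it:
client_item = {
    "pk": {"S": "user#1"},
    "age": {"N": "30"},
    "tags": {"L": [{"S": "a"}]},
}

# Shape of the same item as the resource layer hands it back:
resource_item = {name: unmarshal(value) for name, value in client_item.items()}
print(resource_item)
```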

Finally, onto Session, which is fundamental to both Client and Resource. It is, for example, how both get access to AWS credentials.

Session:

  • stores configuration information (primarily credentials and selected region)
  • allows you to create service clients and resources
  • boto3 creates a default session for you when needed
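As a minimal sketch of the points above, one session can hand out both a client and a resource that share its credentials and region. The helper name build_s3_handles and the profile and region values in the usage comment are placeholders, not anything boto3 defines.

```python
def build_s3_handles(session):
    """Return an S3 client and an S3 resource created from one session.

    `session` is expected to behave like a boto3.Session; both handles
    inherit its credentials and region.
    (build_s3_handles is a hypothetical helper, not a boto3 function.)
    """
    return session.client("s3"), session.resource("s3")


# Typical usage (needs boto3 and configured credentials); the profile and
# region names here are placeholders:
#   import boto3
#   session = boto3.Session(profile_name="dev", region_name="eu-west-1")
#   client, resource = build_s3_handles(session)
```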

A useful resource to learn more about these boto3 concepts is the introductory re:Invent video.

When to use a boto3 client and when to use a boto3 resource?

boto3.resource is a high-level service class that wraps boto3.client.

It is meant to attach to a specific AWS resource, so that you can later work with it and its related resources without specifying the original resource ID each time.

import boto3
s3 = boto3.resource("s3")
bucket = s3.Bucket('mybucket')

# bucket is now "attached" to the S3 bucket named "mybucket"
print(bucket)
# s3.Bucket(name='mybucket')

print(dir(bucket))
# shows all the attributes and actions available on the bucket

On the other hand, boto3.client is low level. You don't have an "entry-class object", so you must explicitly specify the exact resource it acts on for every action you perform.

Which to use depends on individual needs. However, boto3.resource doesn't wrap all of the boto3.client functionality, so sometimes you need to call boto3.client, or use the meta.client attribute of a resource object, to get the job done.

What is the difference between boto3.Session().client and boto3.client?

boto3.client("s3") creates a client using a default session, which (once the default session has been created) is the same as

boto3.DEFAULT_SESSION.client('s3')

boto3.Session() creates a new Session. Since no arguments are given, the object created will be equivalent to the default session. Normally you would create a new session if you want to use a different credentials profile, e.g.

boto3.Session(profile_name='non-default-profile')

In the question's example, a == b evaluates to False because a and b are different instances of Client.
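The lazily created default session can be modelled in a few lines. This is a toy model of the pattern, not boto3's source: the Session and Client classes below are simplified stand-ins written for this illustration, and only the DEFAULT_SESSION name mirrors boto3.

```python
class Session:
    """Toy stand-in for boto3.Session: holds configuration, builds clients."""

    def __init__(self, profile_name=None, region_name=None):
        self.profile_name = profile_name
        self.region_name = region_name

    def client(self, service_name):
        # Every call returns a new client object bound to this session.
        return Client(service_name, self)


class Client:
    """Toy stand-in for a service client."""

    def __init__(self, service_name, session):
        self.service_name = service_name
        self.session = session


DEFAULT_SESSION = None  # created lazily, on first use


def client(service_name):
    """Module-level shortcut: use (and create if needed) the default session."""
    global DEFAULT_SESSION
    if DEFAULT_SESSION is None:
        DEFAULT_SESSION = Session()
    return DEFAULT_SESSION.client(service_name)


a = client("s3")            # via the shared default session
b = Session().client("s3")  # via a brand-new session

print(a == b)                        # False: two distinct client objects
print(a.session is DEFAULT_SESSION)  # True
```

Under this model the two clients behave the same, but they are separate objects, which is why an equality check between them fails.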

What is the convention when using Boto3 clients vs resources?

You can certainly use both.

The resource method actually uses the client method behind-the-scenes, so AWS only sees client-like calls.

In fact, the resource even contains a client. You can access it like this:

import boto3
s3 = boto3.resource('s3')
copy_source = {
    'Bucket': 'mybucket',
    'Key': 'mykey'
}
s3.meta.client.copy(copy_source, 'otherbucket', 'otherkey')

This example is from the boto3 documentation. It shows a client being extracted from a resource and used to make a client call, which is effectively identical to s3_client.copy().

Creating a client or a resource just creates a local object. There is no back-end activity involved at that point.

Difference between AWS boto3.session.Session() and boto3.Session()

It is just for convenience; they both refer to the same class. What is happening here is that the __init__.py for the Python boto3 package includes the following:

from boto3.session import Session

This just allows you to refer to the Session class in your Python code as boto3.Session rather than boto3.session.Session.

This article provides more information about this python idiom:

One common thing to do in your __init__.py is to import selected Classes, functions, etc into the package level so they can be conveniently imported from the package.
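The same idiom appears in the standard library, which makes it easy to check without installing anything. The sketch below uses the json package as an analogue; the boto3 line in the comment is the corresponding check and is not executed here.

```python
import json
import json.decoder

# json/__init__.py imports JSONDecoder from the json.decoder submodule,
# so both dotted paths name the very same class object.
same_class = json.JSONDecoder is json.decoder.JSONDecoder
print(same_class)  # True

# The analogous check for boto3 (not run here, since it needs boto3 installed):
#   boto3.Session is boto3.session.Session
```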

Boto: How to use s3 as a resource or a client, but not both?

The boto3 API provides both a 'client' and 'resource' object model for most of the AWS APIs. The documentation has this to say on the difference:

Resources represent an object-oriented interface to Amazon Web Services (AWS). They provide a higher-level abstraction than the raw, low-level calls made by service clients

In other words, the 'client' APIs are a fairly one-to-one wrapper over the underlying AWS REST calls. The 'resource' API calls are meant to be easier to use, and they provide some "quality of life" improvements that make writing code quicker. Which one to use largely comes down to coding style preference. For the most part, what you can accomplish with 'client' calls can also be accomplished with 'resource' calls, though not always. For your example, it's possible in either case:

import boto3

bucket_name = 'mybucket'  # placeholder; substitute your own bucket name
s3 = boto3.client('s3')

# List all of the objects in a bucket. Since we're fairly close to the
# underlying REST API with the client interface, we need to worry about
# paginating the list of objects
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket_name):
    for cur in page.get('Contents', []):
        # And delete each object in turn
        s3.delete_object(Bucket=bucket_name, Key=cur['Key'])

# Create a zero-byte object to represent the folder
s3.put_object(Bucket=bucket_name, Key='testdir/')

The same work can be accomplished with the resource interface:

s3r = boto3.resource('s3')

# Same idea with resource
bucket = s3r.Bucket(bucket_name)
# Paginating and deleting the objects (in batches) are handled
# behind the scenes by all() and delete() respectively
bucket.objects.all().delete()
# Creating the object, again make a zero-byte object to mimic creating
# a folder as the S3 Web UI does
bucket.put_object(Key='testdir/')

Again, it comes down to personal preferences. I personally prefer using the client interface, since it makes it easier to understand and track which underlying API calls are being made, but it's really up to you.

Use a boto3 session when opening an S3 URL

As mentioned, you just need to replace "wb" with "rb". I was mistaken in thinking that this didn't work.

from smart_open import open
import boto3

url = 's3://bucket/your/keyz'
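# aws_access_key_id, aws_secret_access_key and region_name are variables
# you supply. Pass them by keyword: the third positional parameter of
# Session is aws_session_token, not region_name.
session = boto3.Session(aws_access_key_id=aws_access_key_id,
                        aws_secret_access_key=aws_secret_access_key,
                        region_name=region_name)

with open(url, 'rb', transport_params={'client': session.client('s3')}) as fin:
    file = fin.read()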

print(file)

