s3cmd sync move two buckets into a single bucket
Thanks to @fviard over at GitHub for answering my question. Copied here is the answer I received:
By default, sync doesn't delete files at the destination that are not in the source. It can tell you that in the summary, but it will not do it. Check that you have the following configuration: delete_after = False, delete_after_fetch = False, delete_removed = False, and that you don't use an option on the command line like "--delete-removed". By the way, you don't need to do things in separate commands. Without "--skip-existing", you can do something like this:
s3cmd sync s3://source1/ s3://source2/ s3://source3/ s3://mydestination/
TL;DR: it only deletes files if you configured it to do so in the config; otherwise the warning message can be ignored.
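Those three options can be checked directly in the config file. A minimal sketch, assuming the default config path ~/.s3cfg (adjust if you pass --config to s3cmd):

```shell
# Sketch: list the delete-related options in an s3cmd config file.
# Assumes the default path ~/.s3cfg; pass a different path as $1 if needed.
CFG="${1:-$HOME/.s3cfg}"
if [ -f "$CFG" ]; then
    grep -E '^(delete_after|delete_after_fetch|delete_removed)' "$CFG" || true
fi
```

If all three print `= False`, a sync will never remove anything at the destination.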
S3 moving files between buckets on different accounts?
You don't have to open permissions to everyone. Use the bucket policies below on the source and destination buckets to copy from a bucket in one account to another using an IAM user.
Bucket to copy from: SourceBucket
Bucket to copy to: DestinationBucket
Source AWS account ID: XXXX-XXXX-XXXX
Source IAM user: src-iam-user
The policies below grant the IAM user XXXX-XXXX-XXXX:src-iam-user the s3:ListBucket privilege on SourceBucket and s3:GetObject on SourceBucket/*, plus s3:ListBucket on DestinationBucket and s3:PutObject on DestinationBucket/*.
On the SourceBucket the policy should be like:
{
  "Id": "Policy1357935677554",
  "Statement": [{
    "Sid": "Stmt1357935647218",
    "Action": ["s3:ListBucket"],
    "Effect": "Allow",
    "Resource": "arn:aws:s3:::SourceBucket",
    "Principal": {"AWS": "arn:aws:iam::XXXXXXXXXXXX:user/src-iam-user"}
  }, {
    "Sid": "Stmt1357935676138",
    "Action": ["s3:GetObject"],
    "Effect": "Allow",
    "Resource": "arn:aws:s3:::SourceBucket/*",
    "Principal": {"AWS": "arn:aws:iam::XXXXXXXXXXXX:user/src-iam-user"}
  }]
}
On the DestinationBucket the policy should be:
{
  "Id": "Policy1357935677555",
  "Statement": [{
    "Sid": "Stmt1357935647218",
    "Action": ["s3:ListBucket"],
    "Effect": "Allow",
    "Resource": "arn:aws:s3:::DestinationBucket",
    "Principal": {"AWS": "arn:aws:iam::XXXXXXXXXXXX:user/src-iam-user"}
  }, {
    "Sid": "Stmt1357935676138",
    "Action": ["s3:PutObject"],
    "Effect": "Allow",
    "Resource": "arn:aws:s3:::DestinationBucket/*",
    "Principal": {"AWS": "arn:aws:iam::XXXXXXXXXXXX:user/src-iam-user"}
  }]
}
The command to run is: s3cmd cp s3://SourceBucket/File1 s3://DestinationBucket/File1
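With the same policies in place, the copy can cover a whole bucket rather than a single object. A minimal sketch (the copy_all helper name is mine; the bucket names are the placeholders from the policies above, and the command must run with the IAM user's credentials configured):

```shell
# Sketch: copy every object from one bucket to another with s3cmd.
# copy_all is a hypothetical helper; arguments are source and destination
# bucket names, which are substituted into s3:// URIs.
copy_all() {
    s3cmd cp --recursive "s3://$1/" "s3://$2/"
}
# Usage: copy_all SourceBucket DestinationBucket
```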
Synchronizing S3 Folders/Buckets
CloudBerry Explorer comes with a PowerShell command-line interface, and you can learn here how to use it to do a sync.
How can I backup or sync an Amazon S3 bucket?
I prefer to back up locally using sync, where only changes are updated. That is not the perfect backup solution, but you can implement periodic updates later as you need:
s3cmd sync --delete-removed s3://your-bucket-name/ /path/to/myfolder/
If you have never used s3cmd, install and configure it using:
pip install s3cmd
s3cmd --configure
There are also S3 backup services for around $5/month, but I would also check Amazon Glacier, which lets you store a single archive of nearly 40,000 GB if you use multipart upload.
http://docs.aws.amazon.com/amazonglacier/latest/dev/uploading-archive-mpu.html#qfacts
Remember, if your S3 account is compromised, you risk losing all of your data, since you would sync an empty folder or malformed files. So you had better write a script that archives your backup several times, e.g. by detecting the start of the week.
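One way to sketch that week-detection idea: keep a separate local copy per ISO week, so older weekly snapshots survive even if the bucket is wiped. The weekly_backup helper and the BACKUP_ROOT variable are illustrative, not part of s3cmd:

```shell
# Sketch: sync into one local directory per ISO week (e.g. 2016-W02),
# so a compromised bucket cannot silently overwrite every backup copy.
weekly_backup() {
    week=$(date +%G-W%V)
    dest="${BACKUP_ROOT:-$HOME/backups}/$1/$week"
    mkdir -p "$dest"
    s3cmd sync "s3://$1/" "$dest/"
}
# Usage: weekly_backup your-bucket-name
```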
Update 01/17/2016:
The Python-based AWS CLI is very mature now.
Please use: https://github.com/aws/aws-cli
Example: aws s3 sync s3://mybucket .
Best way to move files between S3 buckets?
Update
As pointed out by alberge (+1), nowadays the excellent AWS Command Line Interface provides the most versatile approach for interacting with (almost) all things AWS. It now covers most services' APIs and also features higher-level S3 commands for dealing with your use case specifically; see the AWS CLI reference for S3:
- sync - Syncs directories and S3 prefixes. Your use case is covered by Example 2 (more fine-grained usage with --exclude, --include, and prefix handling etc. is also available):

The following sync command syncs objects under a specified prefix and bucket to objects under another specified prefix and bucket by copying s3 objects. [...]
aws s3 sync s3://from_my_bucket s3://to_my_other_bucket
For completeness, I'll mention that the lower-level S3 commands are also still available via the s3api sub-command, which would allow you to directly translate any SDK-based solution to the AWS CLI before eventually adopting its higher-level functionality.
Initial Answer
Moving files between S3 buckets can be achieved by means of the PUT Object - Copy API (followed by DELETE Object):
This implementation of the PUT operation creates a copy of an object
that is already stored in Amazon S3. A PUT copy operation is the same
as performing a GET and then a PUT. Adding the request header,
x-amz-copy-source, makes the PUT operation copy the source object into
the destination bucket. (Source)
There are respective samples available for all existing AWS SDKs; see Copying Objects in a Single Operation. Naturally, a scripting-based solution would be the obvious first choice here, so Copy an Object Using the AWS SDK for Ruby might be a good starting point; if you prefer Python instead, the same can be achieved via boto as well, of course; see the copy_key() method within boto's S3 API documentation.
PUT Object only copies files, so you'll still need to explicitly delete a file via DELETE Object after a successful copy operation, but that is just another few lines once the overall script handling the bucket and file names is in place (there are respective examples as well; see e.g. Deleting One Object Per Request).
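The copy-then-delete sequence can also be sketched with the AWS CLI's s3api sub-command mentioned above. The move_object helper is my own illustration; bucket and key names are placeholders, and the commands require configured credentials:

```shell
# Sketch: move one object between buckets by copy-then-delete, using
# the AWS CLI s3api sub-command. move_object is a hypothetical helper;
# the delete only runs if the copy succeeded.
move_object() {
    src_bucket=$1; dst_bucket=$2; key=$3
    aws s3api copy-object --copy-source "$src_bucket/$key" \
        --bucket "$dst_bucket" --key "$key" &&
    aws s3api delete-object --bucket "$src_bucket" --key "$key"
}
# Usage: move_object from_my_bucket to_my_other_bucket File1
```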
Exclude folders for s3cmd sync
You should indeed use the --exclude option. If you want to sync every file at the root but not the folders, try:

s3cmd --exclude="/*/*" sync local/ s3://s3bucket

Keep in mind that a folder doesn't really exist on S3. What looks like a file file inside a folder folder is just an object named folder/file! So you just have to exclude files matching the pattern /*/*.
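As a rough shell analogy of why that glob works (matches_exclude is an illustrative helper, not s3cmd code): the pattern needs two slashes to match, so it catches anything inside a subdirectory while leaving top-level files alone:

```shell
# Sketch: how a glob like "/*/*" distinguishes top-level files from
# files inside "folders". matches_exclude is illustrative only.
matches_exclude() {
    case "$1" in
        /*/*) echo excluded ;;  # path contains at least two slashes
        *)    echo synced ;;    # top-level path, kept by the sync
    esac
}
matches_exclude /file          # prints "synced"
matches_exclude /folder/file   # prints "excluded"
```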