S3 Resource Replication with Arpio

Arpio replicates the following Amazon Simple Storage Service resource types.

S3 Bucket

Arpio automates the replication of S3 bucket contents and configuration to an alternate bucket in the recovery region and recovery account. This process leverages S3 bucket replication and scales to support massive buckets containing hundreds of millions of objects.

Bucket Onboarding

Arpio automates the creation of a new bucket in the recovery environment as the destination of the object replication process. Because S3 bucket names must be globally unique, the recovery bucket will have a different name from the primary bucket. Arpio names this bucket by appending a random suffix to the name of the primary bucket. Your application may need to be made aware of this new bucket name and updated to reference objects at this location when running in the recovery environment.
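To illustrate the naming scheme, here is a minimal sketch of how a recovery bucket name could be derived by appending a random suffix. The suffix length, alphabet, and hyphen separator are assumptions for illustration only; the format Arpio actually uses is not documented here. The environment variable name APP_BUCKET_NAME is likewise hypothetical.

```python
import os
import secrets
import string

MAX_BUCKET_NAME_LENGTH = 63  # S3 bucket names are limited to 63 characters


def recovery_bucket_name(primary_name: str, suffix_length: int = 8) -> str:
    """Append a random lowercase/digit suffix to the primary bucket name.

    The suffix format is an assumption, not Arpio's actual scheme.
    """
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_length))
    # Truncate the primary name if needed so the result fits the S3 limit.
    max_prefix = MAX_BUCKET_NAME_LENGTH - suffix_length - 1
    prefix = primary_name[:max_prefix].rstrip("-.")
    return f"{prefix}-{suffix}"


def bucket_for_this_environment(default_name: str) -> str:
    """Resolve the bucket name from configuration instead of hard-coding it.

    APP_BUCKET_NAME is a hypothetical variable that would be set to the
    recovery bucket name when the application runs in the recovery environment.
    """
    return os.environ.get("APP_BUCKET_NAME", default_name)
```

Reading the bucket name from configuration, rather than embedding it in code, is one way to let the same application run against either bucket.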

Arpio automates the security configuration of the recovery environment bucket to enable the replication process to copy objects into the bucket.

Option 1. Let Arpio create the replication configuration

Arpio creates an IAM role in the source environment that will be used for S3 bucket replication. This role grants permission to read objects from the primary bucket and write them to the recovery bucket.
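As a rough illustration of what such a role involves, the sketch below builds a trust policy and a permissions policy of the kind AWS documents for S3 replication. The exact statements Arpio generates may differ; the function names and bucket names are hypothetical.

```python
import json


def replication_trust_policy() -> dict:
    """Trust policy that lets the S3 service assume the replication role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def replication_permissions_policy(primary_bucket: str, recovery_bucket: str) -> dict:
    """Read from the primary bucket, write replicas to the recovery bucket."""
    src = f"arn:aws:s3:::{primary_bucket}"
    dst = f"arn:aws:s3:::{recovery_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Bucket-level read access on the primary bucket
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": src,
            },
            {   # Object-level read access on the primary bucket
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                ],
                "Resource": f"{src}/*",
            },
            {   # Write access for replicas in the recovery bucket
                "Effect": "Allow",
                "Action": ["s3:ReplicateObject", "s3:ReplicateDelete", "s3:ReplicateTags"],
                "Resource": f"{dst}/*",
            },
        ],
    }


if __name__ == "__main__":
    policy = replication_permissions_policy("my-app-data", "my-app-data-x1y2z3a4")
    print(json.dumps(policy, indent=2))
```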

Once the security configuration is in place, Arpio will configure the bucket replication settings on the primary bucket to begin the replication process. 
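For orientation, the following sketch builds a replication configuration in the shape accepted by the S3 PutBucketReplication API (for example, boto3's put_bucket_replication). The rule ID, the choice to replicate delete markers, and the owner override are assumptions for illustration, not a description of the settings Arpio applies. Note that S3 replication requires versioning to be enabled on both buckets.

```python
def replication_configuration(role_arn: str, recovery_bucket: str, recovery_account_id: str) -> dict:
    """Build a single-rule replication configuration covering the whole bucket."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "example-replicate-all",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # empty filter: apply the rule to every object
                # Required when a Filter is present; "Enabled" is an assumption.
                "DeleteMarkerReplication": {"Status": "Enabled"},
                "Destination": {
                    "Bucket": f"arn:aws:s3:::{recovery_bucket}",
                    "Account": recovery_account_id,
                    # Make the recovery account the owner of the replicas.
                    "AccessControlTranslation": {"Owner": "Destination"},
                },
            }
        ],
    }


# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_replication(
#       Bucket="my-app-data",
#       ReplicationConfiguration=replication_configuration(...),
#   )
```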

Option 2. You create the replication configuration

There are two primary reasons why you would choose this option:

  1. If you manage your primary bucket with an infrastructure-as-code solution (such as Terraform or CloudFormation), you will want to configure the replication settings in that code so the changes are not detected as drift. Arpio generates the Terraform or CloudFormation configuration for you to add to your existing solution.
  2. If replication is already configured on your primary S3 bucket, Arpio does not currently set up replication automatically. For instructions on setting up replication in this case, see: How to protect an S3 bucket which is already configured for replication

Initial Backfill 

After bucket replication has been configured, Arpio backfills the recovery bucket with objects from the primary bucket. The backfill process uses S3 Batch Operations to copy objects from the primary bucket to the recovery bucket. If you do not need all objects copied to the recovery bucket, you can limit the backfill to objects newer than a specified age, which excludes older objects.
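The age-based selection can be pictured with the sketch below, which filters an object listing by last-modified time and writes a CSV manifest of bucket,key rows, one of the manifest formats S3 Batch Operations accepts (object keys are URL-encoded). How Arpio actually builds its batch job is not described here; the function name and the max_age_days parameter are hypothetical.

```python
import csv
import io
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def build_backfill_manifest(bucket, objects, max_age_days=None, now=None):
    """Return CSV manifest text for objects no older than max_age_days.

    `objects` is an iterable of dicts with "Key" and timezone-aware
    "LastModified" entries, as returned by an S3 object listing.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = None if max_age_days is None else now - timedelta(days=max_age_days)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for obj in objects:
        if cutoff is not None and obj["LastModified"] < cutoff:
            continue  # older than the requested timeframe: skip it
        writer.writerow([bucket, quote(obj["Key"])])  # URL-encode the key
    return buf.getvalue()
```

Passing max_age_days=None includes every object, which corresponds to a full backfill.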

Replication

The following attributes are translated during replication:

  • Lifecycle Rules: Lifecycle rules are copied from the primary bucket. Arpio also adds a lifecycle rule for deleted objects that are older than the configured recovery point retention policy, so that objects with delete markers that should no longer be retained are fully deleted.
  • Notification Configurations: When the recovery environment is being tested or is actively failed over, Arpio translates notification configurations that reference SNS topics, SQS queues, and Lambda functions, function versions, and aliases to target the corresponding resources in the recovery environment. These configurations are not enabled when the recovery environment is not in use.
  • Encryption Configuration: Arpio replicates the default encryption configuration from the primary bucket. If KMS encryption is used with the primary bucket, Arpio creates a new customer managed KMS key in the recovery environment and configures S3 default encryption to use it.
  • Bucket ACL: Arpio replicates the bucket ACL from the primary bucket to the recovery bucket. The bucket ACL and bucket policy are updated to match the ACL and policy of the primary bucket, at which point other principals may be granted access.
  • Bucket Policy: Arpio replicates the bucket policy from the primary bucket to the recovery bucket and adds policy statements to support the bucket replication process. The primary bucket statements are included and translated according to the policy document translation process.
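As an illustration of the additional lifecycle rule mentioned above for deleted objects, the sketch below shows one plausible shape for such a rule in the format used by the S3 lifecycle configuration API. The rule ID and the specific combination of actions are assumptions; the rule Arpio actually creates may differ.

```python
def deleted_object_cleanup_rule(retention_days: int) -> dict:
    """Lifecycle rule that fully removes deleted objects after retention_days."""
    return {
        "ID": "example-expire-deleted-objects",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
        # Permanently delete object versions that have been noncurrent
        # (for example, hidden behind a delete marker) past the retention window.
        "NoncurrentVersionExpiration": {"NoncurrentDays": retention_days},
        # Then remove delete markers that no longer have any versions behind them.
        "Expiration": {"ExpiredObjectDeleteMarker": True},
    }


def recovery_lifecycle_rules(primary_rules: list, retention_days: int) -> list:
    """Copy the primary bucket's rules and append the cleanup rule."""
    return list(primary_rules) + [deleted_object_cleanup_rule(retention_days)]
```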

The following resources are automatically selected into recovery points when an S3 bucket is selected:

  • SNS Topics referenced by notification configurations on the primary environment bucket
  • SQS Queues referenced by notification configurations on the primary environment bucket
  • Lambda Functions, Versions, and Aliases referenced by notification configurations on the primary environment bucket
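
To make the dependency selection concrete, the sketch below collects the target ARNs from a bucket notification configuration (in the shape returned by the S3 GetBucketNotificationConfiguration API) and rewrites the region and account to point at a recovery environment. This is a simplification under stated assumptions: it presumes recovery resources keep the same names, which may not match how Arpio translates references, and both function names are hypothetical.

```python
from typing import List

# (section in the notification configuration, field holding the target ARN)
_TARGET_FIELDS = (
    ("TopicConfigurations", "TopicArn"),
    ("QueueConfigurations", "QueueArn"),
    ("LambdaFunctionConfigurations", "LambdaFunctionArn"),
)


def notification_target_arns(config: dict) -> List[str]:
    """List every SNS, SQS, and Lambda ARN the bucket notifies."""
    arns = []
    for section, field in _TARGET_FIELDS:
        for entry in config.get(section, []):
            arns.append(entry[field])
    return arns


def translate_arn(arn: str, recovery_region: str, recovery_account: str) -> str:
    """Swap the region and account fields of an ARN.

    ARN layout: arn:partition:service:region:account:resource
    """
    parts = arn.split(":", 5)  # keep any colons inside the resource part
    parts[3] = recovery_region
    parts[4] = recovery_account
    return ":".join(parts)
```

For example, a Lambda alias ARN in us-east-1 would keep its function:name:alias resource part while its region and account fields change.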