Ensuring Business Continuity: Creating a Recovery Server from an Existing EC2 Instance Using AMI

In this blog post, we'll walk through creating a recovery server from an existing EC2 instance with boto3. The process creates an Amazon Machine Image (AMI) from the existing instance, launches a new instance from that AMI, and moves the root volume from the old instance to the new one. The script also exports the recovery instance to an S3 bucket and sets a lifecycle rule that expires the export after 7 days. Together, these steps support disaster recovery, business continuity, and shorter downtime.

Why This Solution is Needed

Availability and reliability are central to running applications in the cloud. Unexpected failures, data corruption, and similar issues cause downtime, and downtime is costly. A recovery server lets you restore services and data quickly, which limits the impact of a disruption.

Prerequisites

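  • An AWS account and an IAM role you can assume, with permissions for the EC2 and S3 operations used below.

  • The boto3 library installed (pip install boto3).

  • An existing EBS-backed EC2 instance and an S3 bucket to receive the export.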
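
To make "necessary permissions" concrete, here is a minimal sketch of an IAM policy document for the role the script assumes. Each action mirrors one boto3 call in this walkthrough. The names RECOVERY_ACTIONS and build_recovery_policy are hypothetical, Resource "*" is used only to keep the sketch short, and the export step may need further S3 permissions and bucket setup that are not shown here.

```python
import json

# Hypothetical action list: one entry per API call used in this walkthrough.
RECOVERY_ACTIONS = [
    "ec2:CreateImage",
    "ec2:DescribeImages",
    "ec2:RunInstances",
    "ec2:DescribeInstances",
    "ec2:StopInstances",  # only needed if the instance is stopped before detaching
    "ec2:DetachVolume",
    "ec2:DescribeVolumes",
    "ec2:AttachVolume",
    "ec2:CreateInstanceExportTask",
    "ec2:DescribeExportTasks",
    "s3:PutLifecycleConfiguration",
]


def build_recovery_policy(actions=RECOVERY_ACTIONS):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RecoveryServerWorkflow",
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: narrow this for real use
            }
        ],
    }


# Print the policy as JSON so it can be pasted into the IAM console.
print(json.dumps(build_recovery_policy(), indent=2))
```

In practice you would scope Resource down to the specific instance, volumes, and bucket rather than using a wildcard.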

Step-by-Step Guide

1. Create an AMI from the Existing EC2 Instance

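First, we need to create an AMI from the existing EC2 instance. The AMI captures the instance's configuration and the contents of its attached EBS volumes at that point in time. The script below starts with this step and then runs the rest of the workflow end to end: launching the recovery instance, moving the original root volume over, exporting to S3, and setting a lifecycle policy.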

import boto3
import time
import logging
from botocore.exceptions import ClientError

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

# Function to assume a role and return temporary credentials
def assume_role(role_arn, session_name):
    sts_client = boto3.client('sts')
    try:
        response = sts_client.assume_role(
            RoleArn=role_arn,
            RoleSessionName=session_name
        )
        credentials = response['Credentials']
        return credentials
    except ClientError as e:
        logger.error(f"Error assuming role: {e}")
        return None

# Initialize boto3 client with assumed role credentials
role_arn = 'arn:aws:iam::your-account-id:role/your-role-name'
session_name = 'AssumeRoleSession'
credentials = assume_role(role_arn, session_name)

if credentials:
    ec2_client = boto3.client(
        'ec2',
        region_name='your-region',
        aws_access_key_id=credentials['AccessKeyId'],
        aws_secret_access_key=credentials['SecretAccessKey'],
        aws_session_token=credentials['SessionToken']
    )

    s3_client = boto3.client(
        's3',
        region_name='your-region',
        aws_access_key_id=credentials['AccessKeyId'],
        aws_secret_access_key=credentials['SecretAccessKey'],
        aws_session_token=credentials['SessionToken']
    )

    # Variables
    instance_id = 'your-existing-instance-id'
    new_instance_type = 't2.micro'  # Change as needed
    s3_bucket_name = 'your-s3-bucket-name'
    ami_name = 'RecoveryServerAMI'

    try:
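        # Step 1: Create an AMI from the existing EC2 instance.
        # NoReboot=True avoids downtime, but the file system integrity of the
        # image is not guaranteed; use NoReboot=False if consistency matters more.
        response = ec2_client.create_image(
            InstanceId=instance_id,
            Name=ami_name,
            NoReboot=True
        )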
        ami_id = response['ImageId']
        logger.info(f'Created AMI: {ami_id}')

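        # Wait for the AMI to be available; stop if image creation fails
        while True:
            response = ec2_client.describe_images(ImageIds=[ami_id])
            state = response['Images'][0]['State']
            if state == 'available':
                logger.info(f'AMI {ami_id} is now available.')
                break
            if state in ('failed', 'error', 'invalid'):
                # Not a ClientError, so this aborts the script instead of looping forever
                raise RuntimeError(f'AMI {ami_id} entered state {state}.')
            logger.info('Waiting for AMI to be available...')
            time.sleep(10)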

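        # Step 2: Launch a new EC2 instance from the created AMI.
        # Launch it in the same Availability Zone as the original instance,
        # because an EBS volume can only be attached within its own zone.
        source = ec2_client.describe_instances(InstanceIds=[instance_id])
        source_az = source['Reservations'][0]['Instances'][0]['Placement']['AvailabilityZone']
        response = ec2_client.run_instances(
            ImageId=ami_id,
            InstanceType=new_instance_type,
            MinCount=1,
            MaxCount=1,
            Placement={'AvailabilityZone': source_az}
        )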
        new_instance_id = response['Instances'][0]['InstanceId']
        logger.info(f'Launched new instance: {new_instance_id}')

        # Wait for the new instance to be running
        while True:
            response = ec2_client.describe_instances(InstanceIds=[new_instance_id])
            state = response['Reservations'][0]['Instances'][0]['State']['Name']
            if state == 'running':
                logger.info(f'New instance {new_instance_id} is now running.')
                break
            logger.info('Waiting for new instance to be running...')
            time.sleep(10)

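        # Step 3: Detach the root volume from the existing EC2 instance.
        # A root volume can only be detached while the instance is stopped,
        # so stop the instance first (this takes the original server offline).
        ec2_client.stop_instances(InstanceIds=[instance_id])
        ec2_client.get_waiter('instance_stopped').wait(InstanceIds=[instance_id])
        logger.info(f'Stopped instance: {instance_id}')

        # Look up the root volume by the instance's root device name rather
        # than assuming it is the first block device mapping.
        response = ec2_client.describe_instances(InstanceIds=[instance_id])
        instance = response['Reservations'][0]['Instances'][0]
        root_volume_id = next(
            mapping['Ebs']['VolumeId']
            for mapping in instance['BlockDeviceMappings']
            if mapping['DeviceName'] == instance['RootDeviceName']
        )
        ec2_client.detach_volume(VolumeId=root_volume_id, InstanceId=instance_id)
        logger.info(f'Detaching root volume: {root_volume_id}')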

        # Wait for the volume to be detached
        while True:
            response = ec2_client.describe_volumes(VolumeIds=[root_volume_id])
            state = response['Volumes'][0]['State']
            if state == 'available':
                logger.info(f'Volume {root_volume_id} is now detached.')
                break
            logger.info('Waiting for volume to be detached...')
            time.sleep(10)

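        # Step 4: Attach the detached root volume to the newly created recovery server.
        # The recovery server already has its own root volume (created from the AMI)
        # on the root device name, so attach the old volume as a secondary device.
        # The volume and the instance must be in the same Availability Zone.
        ec2_client.attach_volume(VolumeId=root_volume_id, InstanceId=new_instance_id, Device='/dev/sdf')
        logger.info(f'Attached volume {root_volume_id} to new instance {new_instance_id} as /dev/sdf')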

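        # Step 5: Export the recovery instance to the S3 bucket as a VM image.
        # Note: create_instance_export_task exports the instance, not the AMI itself.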
        ami_export_task = ec2_client.create_instance_export_task(
            InstanceId=new_instance_id,
            TargetEnvironment='vmware',
            ExportToS3Task={
                'S3Bucket': s3_bucket_name,
                'S3Prefix': 'exported-amis/'
            }
        )
        export_task_id = ami_export_task['ExportTask']['ExportTaskId']
        logger.info(f'Export task created with ID: {export_task_id}')

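        # Wait for the export task to complete; stop if it is cancelled
        while True:
            response = ec2_client.describe_export_tasks(ExportTaskIds=[export_task_id])
            state = response['ExportTasks'][0]['State']
            if state == 'completed':
                logger.info(f'Export task {export_task_id} completed.')
                break
            if state in ('cancelling', 'cancelled'):
                # Not a ClientError, so this aborts the script instead of looping forever
                raise RuntimeError(f'Export task {export_task_id} entered state {state}.')
            logger.info('Waiting for export task to complete...')
            time.sleep(10)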

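        # Step 6: Set lifecycle policy for the S3 bucket.
        # Note: this call replaces any lifecycle configuration already on the bucket.
        lifecycle_policy = {
            'Rules': [
                {
                    'ID': 'DeleteAMI',
                    # The top-level 'Prefix' key is deprecated; use 'Filter' instead
                    'Filter': {'Prefix': 'exported-amis/'},
                    'Status': 'Enabled',
                    'Expiration': {
                        'Days': 7
                    }
                }
            ]
        }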
        s3_client.put_bucket_lifecycle_configuration(
            Bucket=s3_bucket_name,
            LifecycleConfiguration=lifecycle_policy
        )
        logger.info(f'Set lifecycle policy for bucket {s3_bucket_name} to expire objects after 7 days.')

    except ClientError as e:
        logger.error(f'An error occurred: {e}')
else:
    logger.error("Failed to assume role and obtain credentials.")
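
Picking the root volume out of a describe_instances response is the part of Step 3 most likely to go wrong on instances with several volumes. The following is a minimal sketch of a helper that matches on the instance's RootDeviceName instead of relying on list position. The function name find_root_volume_id is hypothetical, and the sample data is made up for illustration.

```python
def find_root_volume_id(instance):
    """Return the EBS volume ID on the instance's root device, or None.

    `instance` is one entry from
    describe_instances()['Reservations'][i]['Instances'].
    """
    root_device = instance.get("RootDeviceName")
    for mapping in instance.get("BlockDeviceMappings", []):
        # Skip mappings on other devices and any mapping without EBS details
        if mapping.get("DeviceName") == root_device and "Ebs" in mapping:
            return mapping["Ebs"]["VolumeId"]
    return None


# Trimmed, made-up instance description with the data volume listed first
sample_instance = {
    "RootDeviceName": "/dev/xvda",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/sdf", "Ebs": {"VolumeId": "vol-0data"}},
        {"DeviceName": "/dev/xvda", "Ebs": {"VolumeId": "vol-0root"}},
    ],
}
print(find_root_volume_id(sample_instance))  # prints vol-0root
```

Returning None when no match is found lets the caller stop before any detach call is made, for example on an instance without an EBS root volume.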

2. Launch a New EC2 Instance from the Created AMI

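Next, we'll launch a new EC2 instance from the AMI. The full script above already does this as its Step 2. The standalone snippet below isolates that step, which is useful when you already have an AMI ID and only need to bring up a recovery server.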

import boto3
import time
import logging
from botocore.exceptions import ClientError

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

# Initialize boto3 client
ec2_client = boto3.client('ec2', region_name='your-region')

# Variables
ami_id = 'your-ami-id'
new_instance_type = 't2.micro'  # Change as needed

try:
    # Step 2: Launch a new EC2 instance from the created AMI
    response = ec2_client.run_instances(
        ImageId=ami_id,
        InstanceType=new_instance_type,
        MinCount=1,
        MaxCount=1
    )
    new_instance_id = response['Instances'][0]['InstanceId']
    logger.info(f'Launched new instance: {new_instance_id}')

    # Wait for the new instance to be running
    while True:
        response = ec2_client.describe_instances(InstanceIds=[new_instance_id])
        state = response['Reservations'][0]['Instances'][0]['State']['Name']
        if state == 'running':
            logger.info(f'New instance {new_instance_id} is now running.')
            break
        logger.info('Waiting for new instance to be running...')
        time.sleep(10)

except ClientError as e:
    logger.error(f'An error occurred: {e}')
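
The "wait until running" loop above can also be expressed with boto3's built-in EC2 waiters, which poll for you and give up after a bounded number of attempts. This is a sketch rather than a drop-in replacement. The wrapper name wait_until_running and its default values are assumptions, while get_waiter('instance_running') and the WaiterConfig keys are part of boto3.

```python
def wait_until_running(ec2_client, instance_id, delay=10, max_attempts=60):
    """Block until the instance is running.

    Polls every `delay` seconds, at most `max_attempts` times, and lets the
    waiter raise if the instance does not reach the running state in time.
    """
    waiter = ec2_client.get_waiter("instance_running")
    waiter.wait(
        InstanceIds=[instance_id],
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
    return instance_id


# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   ec2_client = boto3.client("ec2", region_name="your-region")
#   wait_until_running(ec2_client, "your-new-instance-id")
```

On timeout or a terminal failure the waiter raises botocore.exceptions.WaiterError, which an `except ClientError` block does not catch, so handle it separately if you adopt this approach. EC2 also provides waiters named image_available, instance_stopped, volume_available, and export_task_completed for the other waits in this workflow.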

Conclusion

By following these steps, you can create a recovery server from an existing EC2 instance, ensuring that your data and services can be quickly restored in case of any issues. This solution is vital for maintaining business continuity and minimizing downtime, providing peace of mind that your applications are resilient and reliable.