Building a PoC with AWS Instance Scheduler

Have you ever forgotten to turn off your AWS EC2 instances, only to face unexpected costs?
Inspired by the Instance Scheduler solution published on the AWS Solutions website, I decided to build my own proof of concept, using Terraform (IaC) to automate EC2 instance management.

With this EC2 instance scheduler, managing ‘dev’ workloads no longer requires stopping or starting instances by hand.

AWS services used:
– Lambda
– DynamoDB
– EventBridge
– CloudWatch
– IAM

IaC tools:
– Terraform

GitHub Repo link: https://github.com/damianwojciechowski4/AWS-Projects/tree/main/proj-ec2-scheduler

Solution workflow

The workflow consists of several key steps:

Step 1. EventBridge Schedule Rule Setup

An Amazon EventBridge Scheduler schedule triggers the Lambda function every hour.

Terraform Code Snippet:

resource "aws_scheduler_schedule" "scheduler" {
  name       = var.EventBridge_ScheduleName
  group_name = "default"
  flexible_time_window {
    mode = "OFF"
  }
  //triggering lambda every hour using cron
  schedule_expression = "cron(0 * ? * * *)"
  target {
    //lambda function target
    arn      = aws_lambda_function.example.arn
    //using role to allow to trigger lambda
    role_arn = aws_iam_role.scheduler.arn
  }
}
EventBridge Schedule Rule with Lambda Function target
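
To make the schedule expression easier to read, here is a minimal Python sketch that splits an EventBridge cron expression into its six fields (minutes, hours, day of month, month, day of week, year). The helper name parse_eventbridge_cron is hypothetical and not part of the project; it only labels the fields and does not validate their values.

```python
# Hypothetical helper (not from the repository): label the six fields
# of an EventBridge cron expression.
FIELD_NAMES = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_eventbridge_cron(expression: str) -> dict:
    """Split 'cron(a b c d e f)' into a dict keyed by field name."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError(f"not a cron expression: {expression!r}")
    fields = expression[len("cron("):-1].split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


fields = parse_eventbridge_cron("cron(0 * ? * * *)")
print(fields)
```

In the expression used above, minutes is 0 and hours is *, so the schedule fires at the top of every hour.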

Step 2. Lambda Function Invocation

The triggered Python Lambda function uses the boto3 library to gather details about all EC2 instances in the configured AWS Region, including attributes such as instance_id, instance_state, and tags.

import boto3
from botocore.exceptions import ClientError

# Initialize EC2 Client
ec2_client = boto3.client('ec2')

# --- omitted ----
#request ec2 instances
    try: 
        # use describe_instances() boto3 function on ec2_client.
        response_ec2 = ec2_client.describe_instances()
        instances = response_ec2['Reservations']
        #trigger processing of ec2 instances data
        process_ec2_instances(instances, weekday_num, current_time)
    except ClientError as e:
        print(e.response['Error']['Message'])
        return
# --- omitted ----
def process_ec2_instances(instances, weekday_num, current_time):
    instanceCount = 0
    # each reservation can contain more than one instance
    for reservations in instances:
        for instance in reservations['Instances']:
            # count instances, not reservations
            instanceCount += 1
            print(f"Instance #{instanceCount}")
            instance_id = instance['InstanceId']
            instance_state = instance['State']['Name']
            # the 'Tags' key is missing when an instance has no tags
            tags = instance.get('Tags', [])
            print(f"# Instance ID: {instance_id}")
            print(f"# Instance State: {instance_state}")
            print(40*"*")
# --- omitted ---
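
The response of describe_instances() groups instances into reservations, and one reservation can hold several instances. As a minimal sketch of flattening that structure, the hypothetical helper summarize_instances below (not taken from the repository) returns one dictionary per instance. The sample data is made up, and tags are read with .get() because untagged instances have no 'Tags' key.

```python
def summarize_instances(reservations):
    """Flatten describe_instances() reservations into one list of dicts."""
    summary = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            # Convert the list of {'Key': ..., 'Value': ...} pairs to a dict.
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            summary.append({
                "instance_id": instance["InstanceId"],
                "state": instance["State"]["Name"],
                "tags": tags,
            })
    return summary


# Made-up sample shaped like response['Reservations'].
sample = [
    {"Instances": [
        {"InstanceId": "i-0001", "State": {"Name": "running"},
         "Tags": [{"Key": "Scheduler", "Value": "dev"}]},
        {"InstanceId": "i-0002", "State": {"Name": "stopped"}},
    ]},
]
for row in summarize_instances(sample):
    print(row)
```

For accounts with many instances, boto3 also offers a paginator for describe_instances, which would be a natural source for the reservations passed in here.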

Step 3. DynamoDB Parameter Check

The Lambda function scans a DynamoDB table that holds specific parameters essential for the scheduler’s logic.
These parameters include:

  • schedulerName: Name identifier for the schedule.
  • start_time and stop_time: Define when to start and stop instances.
  • weekdays: Specifies the days of the week when scheduling should be active.
DynamoDB table 'ec2-scheduler-table'
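
A single DynamoDB Scan call returns at most 1 MB of data and reports remaining pages through LastEvaluatedKey. The sketch below shows one way to collect every item; scan_all_items and FakeTable are hypothetical names, and the sample items are made up using the attribute names listed above. For a small table like this one, a single scan is usually enough.

```python
def scan_all_items(table):
    """Collect every item from a table, following LastEvaluatedKey pages."""
    items = []
    kwargs = {}
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Continue the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 DynamoDB Table so the sketch runs offline."""

    def __init__(self, pages):
        self._pages = pages

    def scan(self, ExclusiveStartKey=None):
        index = 0 if ExclusiveStartKey is None else ExclusiveStartKey["page"]
        response = {"Items": self._pages[index]}
        if index + 1 < len(self._pages):
            response["LastEvaluatedKey"] = {"page": index + 1}
        return response


# Made-up items using the attribute names listed above.
pages = [
    [{"schedulerName": "dev", "start_time": "08:00:00",
      "stop_time": "18:00:00", "weekdays": "mon-fri"}],
    [{"schedulerName": "dev-schedule", "start_time": "09:00:00",
      "stop_time": "17:00:00", "weekdays": "mon-thu"}],
]
dynamo_items = scan_all_items(FakeTable(pages))
print([item["schedulerName"] for item in dynamo_items])
```

Against AWS, the same function should work with boto3.resource("dynamodb").Table("ec2-scheduler-table") in place of FakeTable.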

Step 4. Lambda Comparison Logic

The function then checks whether the instance has a tag named Scheduler and compares its value to the schedulerName value read from ec2-scheduler-table.
In addition, the script compares the weekday range and times defined in the DynamoDB table with the current weekday and time.
Based on these conditions, the function determines whether an EC2 instance should be started, stopped, or left in its current state.

Python code snippet (lambda_snippet.py):
# --- omitted ----
# default when the instance has no Scheduler tag
ec2_schedule_tag = None
for tag in tags:
    if tag['Key'] == 'Scheduler':
        # assign Scheduler tag value to variable
        schedule_tag = tag['Value']
        ec2_schedule_tag = schedule_tag
        break

# --- omitted ----
for item in dynamo_items:
    #compare weekday range and validate with current day
    weekday_db_range = item['weekdays']
    first_day, last_day = weekday_db_range.split('-')
    first_day_num = weekday_map[first_day.lower()]
    last_day_num = weekday_map[last_day.lower()]
    # one of the conditions that must be fulfilled
    # weekday must be within first and last day range
    if first_day_num <= weekday_num <= last_day_num:
        if item['schedulerName'] == ec2_schedule_tag:
            start_time = item['start_time']
            start_time = datetime.strptime(start_time, "%H:%M:%S").time()

            stop_time = item['stop_time']
            stop_time = datetime.strptime(stop_time, "%H:%M:%S").time()

            print("------ Scheduler information -------")
            print(f"- weekday is within specified range {weekday_db_range}")
            print(f"--> start time {start_time}")
            print(f"--> stop time {stop_time}")
            print(f"--> current time {current_time}")
# --- omitted ----
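
The final comparison is omitted from the snippet, so the following is only one possible rule, not the project's actual code. It assumes that an instance should run while start_time <= current_time < stop_time, that start_time is earlier than stop_time on the same day, and that weekdays are keyed mon to sun with Monday as 0 (matching datetime.weekday()). The names WEEKDAY_MAP, weekday_in_range, and decide_action are hypothetical.

```python
from datetime import time

# Assumed mapping; it matches datetime.weekday(), where Monday is 0.
WEEKDAY_MAP = {"mon": 0, "tue": 1, "wed": 2, "thu": 3, "fri": 4, "sat": 5, "sun": 6}


def weekday_in_range(weekday_range, weekday_num):
    """Return True if weekday_num falls inside a range such as 'mon-fri'."""
    first_day, last_day = weekday_range.lower().split("-")
    first, last = WEEKDAY_MAP[first_day], WEEKDAY_MAP[last_day]
    if first <= last:
        return first <= weekday_num <= last
    # The range wraps past Sunday, for example 'fri-mon'.
    return weekday_num >= first or weekday_num <= last


def decide_action(instance_state, start_time, stop_time, current_time):
    """Return 'start', 'stop', or 'none' for one instance."""
    in_window = start_time <= current_time < stop_time
    if in_window and instance_state == "stopped":
        return "start"
    if not in_window and instance_state == "running":
        return "stop"
    return "none"


print(weekday_in_range("mon-fri", 2))
print(decide_action("stopped", time(8, 0), time(18, 0), time(9, 0)))
print(decide_action("running", time(8, 0), time(18, 0), time(19, 0)))
print(decide_action("running", time(8, 0), time(18, 0), time(9, 0)))
```

Keeping the decision in a pure function like this makes it easy to test without calling AWS.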

Step 5. Lambda Instance State Management

Depending on the results of the comparison, Lambda invokes either the start_instances() or stop_instances() methods from boto3 to manage the EC2 instances’ states.

The functions below start or stop a specific instance, depending on which condition is currently met.

Python code snippet (lambda_snippet.py):
def ec2_start_instance(instance_id):
    try:
        response_ec2_start = ec2_client.start_instances(InstanceIds=[instance_id])
        print(f"- starting EC2 instance {instance_id}")
        #print(response_ec2_start)
    except ClientError as e:
        print(e.response['Error']['Message'])

def ec2_stop_instance(instance_id):
    try:
        response_ec2_stop = ec2_client.stop_instances(InstanceIds=[instance_id])
        #print(response_ec2_stop)
    except ClientError as e:
        print(e.response['Error']['Message'])
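
Both start_instances() and stop_instances() accept a list of instance IDs, so the decisions for one run could also be applied in two batched calls. The sketch below illustrates that alternative; apply_actions and RecordingClient are hypothetical names, and the recording client only stands in for the real boto3 EC2 client so the example runs offline.

```python
def apply_actions(ec2_client, actions):
    """Start or stop instances in batches.

    actions maps an instance ID to 'start', 'stop', or 'none'.
    """
    to_start = [iid for iid, action in actions.items() if action == "start"]
    to_stop = [iid for iid, action in actions.items() if action == "stop"]
    if to_start:
        ec2_client.start_instances(InstanceIds=to_start)
    if to_stop:
        ec2_client.stop_instances(InstanceIds=to_stop)
    return to_start, to_stop


class RecordingClient:
    """Stand-in for the boto3 EC2 client that only records the calls."""

    def __init__(self):
        self.calls = []

    def start_instances(self, InstanceIds):
        self.calls.append(("start", list(InstanceIds)))

    def stop_instances(self, InstanceIds):
        self.calls.append(("stop", list(InstanceIds)))


client = RecordingClient()
started, stopped = apply_actions(
    client, {"i-0001": "start", "i-0002": "stop", "i-0003": "none"}
)
print(started, stopped)
```

The trade-off is that an error in a batched call can affect every instance in that batch, which is one reason the per-instance functions above may be preferable for a PoC.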

Step 6. Logging and Monitoring

Finally, the Lambda function’s output, including details about actions taken, is sent to a dedicated CloudWatch Log Group, allowing for easy monitoring and troubleshooting.

The Log Group dedicated to the function is shown below:

CloudWatch logs Log Group
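
In Lambda, output written with print() or with Python's standard logging module ends up in the function's CloudWatch log group. As a small sketch of the logging-module variant, which adds log levels and a consistent line format, the example below uses a hypothetical logger name and a hypothetical log_action helper; the in-memory handler exists only so the output can be shown locally.

```python
import io
import logging

logger = logging.getLogger("ec2_scheduler")  # hypothetical logger name
logger.setLevel(logging.INFO)


def log_action(instance_id, action):
    """Write one line describing the action taken for an instance."""
    logger.info("instance=%s action=%s", instance_id, action)


# Attach an in-memory handler so the sketch shows its output locally.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

log_action("i-0001", "stop")
print(buffer.getvalue().strip())
```

Key=value pairs like these are straightforward to filter on when searching the log group.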

Security

All policies assigned to the EventBridge and Lambda roles are crafted to strictly adhere to the Least-Privilege Principle. This means each component has only the permissions essential for performing its designated tasks, enhancing the security of the solution.
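
To illustrate what least privilege could look like for the Lambda role, here is a sketch that builds a policy document limited to the calls described in this post. It is an assumption, not the policy from the repository: the Region and account ID are placeholders, build_lambda_policy is a hypothetical helper, and the CloudWatch Logs permissions are left out for brevity.

```python
import json

# Assumed placeholders; replace with the real Region, account ID, and names.
REGION = "eu-west-1"
ACCOUNT_ID = "123456789012"
TABLE_NAME = "ec2-scheduler-table"


def build_lambda_policy():
    """Return an IAM policy document as a plain dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # DescribeInstances does not support resource-level
                # permissions, so it needs Resource "*".
                "Effect": "Allow",
                "Action": ["ec2:DescribeInstances"],
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                "Resource": f"arn:aws:ec2:{REGION}:{ACCOUNT_ID}:instance/*",
            },
            {
                "Effect": "Allow",
                "Action": ["dynamodb:Scan"],
                "Resource": (
                    f"arn:aws:dynamodb:{REGION}:{ACCOUNT_ID}:table/{TABLE_NAME}"
                ),
            },
        ],
    }


policy = build_lambda_policy()
print(json.dumps(policy, indent=2))
```

The start and stop statement could be narrowed further, for example with a condition on the Scheduler tag.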

Examples

Scheduler name configured for 2 separate EC2 instances
  • Stopping Instances
Lambda output: stopping instances
AWS Console: stopping EC2 instances
  • Starting Instances
Lambda output: starting instances that were stopped
  • Keeping Instances Working
Lambda output: no action required
  • Instance without a Scheduler tag such as dev or dev-schedule
Lack of Scheduler tag configured
Lambda reaction to instances without a configured Scheduler tag

Conclusion

Automating EC2 instance management can significantly reduce costs.
By combining AWS services such as Lambda, EventBridge, DynamoDB, CloudWatch, and IAM, we’ve created a flexible scheduling solution that keeps EC2 instances running only when needed.
This proof of concept highlights the power of Infrastructure as Code (IaC) and the security advantages of the Least-Privilege Principle, which help keep the automation both secure and scalable.