go-chaos is yet another app developed to simplify chaos engineering. It is not finished yet, but this post demos some of the functionality that is already complete for AWS.

There are several chaos engineering tools on the market today. What sets go-chaos apart is its simplicity in everyday use: there is no need for complicated tools or platforms, although many of those tools will be better suited to a more complex architecture.

In this case the infrastructure is created using Terraform: a simple WordPress page running on instances in an autoscaling group. To keep the example simple, the AMI used is the Bitnami WordPress image.

All the code for this exercise is located in my personal GitHub repo: https://github.com/mental12345/test-go-chaos/

Once the repo is cloned, run:

cd infra
terraform init
terraform plan

Terraform will create only 6 resources:

Terraform will perform the following actions:

  # module.wordpress.aws_autoscaling_attachment.asg_attachment_bar will be created
  + resource "aws_autoscaling_attachment" "asg_attachment_bar" {}

  # module.wordpress.aws_autoscaling_group.this will be created
  + resource "aws_autoscaling_group" "wordpress_autoscaling" {}

  # module.wordpress.aws_elb.wordpress_lb will be created
  + resource "aws_elb" "wordpress_lb" {}

  # module.wordpress.aws_launch_configuration.wordpress_alc will be created
  + resource "aws_launch_configuration" "wordpress_alc" {}

  # module.wordpress.aws_security_group.wordpress-sg-lb will be created
  + resource "aws_security_group" "wordpress-sg-lb" {}

  # module.wordpress.aws_security_group.wordpress_sg will be created
  + resource "aws_security_group" "wordpress_sg" {}

Plan: 6 to add, 0 to change, 0 to destroy.

If the changes look correct, the next step is an apply to build the test infrastructure:

terraform apply 

This is a pretty simple infrastructure: instances inside an autoscaling group, nothing complicated. The instances run WordPress, although it is possible to use a custom AMI with a custom application or web page.

Once the infrastructure is created, go to the AWS console and search for the load balancer (in my case it is the only one in existence). Copy the DNS name and paste it into the browser; you should see an example page.

[Image: AWS console, load balancer]

[Image: WordPress sample page]

In the EC2 section of the AWS console we can see the existing instances. Our infrastructure should be capable of repairing itself if anything goes wrong.

[Image: running instances]

Now it is time to use go-chaos. You can download it here; for now it only works on Linux. Once it is downloaded, put it in /usr/bin/go-chaos.

As mentioned at the beginning of the article, go-chaos executes operations in AWS to simulate possible issues in the infrastructure. This is still a really early release and just a proof of concept, so it is not suitable for everyday use in any environment.

A chaos operation is the set of actions go-chaos executes to meet the requirements described in a template.

In the root folder of the test-go-chaos repo there is a JSON file called ec2.json, which contains a simple go-chaos template.

{
  "App": "Go-chaos test, this is a test for EC2 instances",
  "Cloud": {
    "Kind": "EC2",
    "Region" : "us-east-1"
  },
  "Config": {
    "Tag": "env:prod",
    "Chaos": "STOP",
    "number": 4
  }
}

The first part, App, is just a short description of what the template is going to do; in this case, a simple EC2 test. Cloud holds information about the AWS side: Kind is what we are going to mess with (EC2 here, since the WordPress app runs on EC2 instances) and Region is the region where the application is located.

Config is the operation that will be applied to the infrastructure. Tag is used to identify the instances, and Chaos is the action to apply to them.

Chaos actions:

number is the amount of resources to which the chaos will be applied.
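To make the template fields concrete, here is a minimal Python sketch that reads a template with the same shape as ec2.json and splits the tag into key and value. It is not how go-chaos itself parses templates: the `ChaosTemplate` class, the `parse_template` function and the validation rules are hypothetical, written only from the fields visible in the example.

```python
import json
from dataclasses import dataclass


# Field names mirror the ec2.json template from this post. The class and
# function names are hypothetical and are not part of go-chaos.
@dataclass
class ChaosTemplate:
    app: str
    kind: str
    region: str
    tag_key: str
    tag_value: str
    chaos: str
    number: int


def parse_template(text: str) -> ChaosTemplate:
    """Parse a go-chaos style JSON template and run basic sanity checks."""
    raw = json.loads(text)
    cloud, config = raw["Cloud"], raw["Config"]

    # "Tag" is written as key:value, for example env:prod.
    tag_key, sep, tag_value = config["Tag"].partition(":")
    if not sep or not tag_key or not tag_value:
        raise ValueError("Tag must look like key:value")

    # Assumption: a template that targets zero resources is a mistake.
    number = int(config["number"])
    if number < 1:
        raise ValueError("number must be at least 1")

    return ChaosTemplate(
        app=raw["App"],
        kind=cloud["Kind"],
        region=cloud["Region"],
        tag_key=tag_key,
        tag_value=tag_value,
        chaos=config["Chaos"].upper(),
        number=number,
    )


EXAMPLE = """
{
  "App": "Go-chaos test, this is a test for EC2 instances",
  "Cloud": {"Kind": "EC2", "Region": "us-east-1"},
  "Config": {"Tag": "env:prod", "Chaos": "STOP", "number": 4}
}
"""

template = parse_template(EXAMPLE)
print(template.kind, template.region, template.tag_key,
      template.tag_value, template.chaos, template.number)
# prints: EC2 us-east-1 env prod STOP 4
```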

In this case, the chaos operation consists of getting a list of AWS instances with the env:prod tag and then sending a stop request to 4 of them. To run it:

go-chaos template ec2.json

This command generates the following output:

Successfully Opened: ec2.json
chaos permitted
Stopping instances: 
i-0c8eedea171b3cb7b
i-010ef738b4e6cc0e1
i-000dcf5a2239b78a1
i-0ad1a79a70f01c9ef

In the AWS console, these instances should now be stopped or in the process of stopping.
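The operation itself (list the instances carrying a tag, then stop a given number of them) can be sketched in a few lines of Python. This is an illustration, not the go-chaos implementation: `pick_targets` and `stop_tagged_instances` are hypothetical names, random selection is an assumption about how targets are chosen, and the `ec2` argument is assumed to behave like a boto3 EC2 client, whose `describe_instances` and `stop_instances` calls are used here.

```python
import random


def pick_targets(instance_ids, number, rng=random):
    """Choose up to `number` distinct instances to receive the chaos action."""
    ids = list(instance_ids)
    return rng.sample(ids, min(number, len(ids)))


def stop_tagged_instances(ec2, tag_key, tag_value, number):
    """List running instances carrying the tag, then stop `number` of them.

    `ec2` is expected to behave like a boto3 EC2 client. Pagination is
    ignored to keep the sketch short, so only the first page is considered.
    """
    response = ec2.describe_instances(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    running = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
    targets = pick_targets(running, number)
    if targets:
        # Stopping (not terminating) lets the autoscaling group react on its own.
        ec2.stop_instances(InstanceIds=targets)
    return targets
```

With boto3 installed and credentials configured, the client could be created with `boto3.client("ec2", region_name="us-east-1")` and passed in as `ec2`, for example `stop_tagged_instances(client, "env", "prod", 4)`. Passing the client as a parameter keeps the selection logic testable without touching a real AWS account.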

[Image: infrastructure]

In the image above, the instances are stopped, but the autoscaling group immediately started to terminate those stopped instances and launch new ones. In this case the infrastructure is capable of repairing itself without any downtime.
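go-chaos does not check the application for you, so one way to back up the "no downtime" observation is to poll the load balancer while the experiment runs. The following Python sketch shows the idea using only the standard library. The function names `is_healthy` and `watch` are hypothetical, and treating any 2xx response as healthy is an assumption.

```python
import time
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True when the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def watch(url, checks, interval=2.0, probe=is_healthy, sleep=time.sleep):
    """Probe `url` `checks` times and return how many probes failed."""
    failures = 0
    for attempt in range(checks):
        if not probe(url):
            failures += 1
        # No need to wait after the last probe.
        if attempt < checks - 1:
            sleep(interval)
    return failures
```

Calling `watch` with the load balancer URL (`http://` followed by its DNS name) in a second terminal while go-chaos stops the instances gives a rough failure count; zero failures is consistent with the autoscaling group absorbing the chaos.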

go-chaos is still in development, and this post was just a simple demo of what it can do. Features already in development include Kubernetes compute resources such as pods, replica sets and deployments. For AWS, the next features in sight are ECS clusters, tasks and deployments. Other clouds in the plans are GCP and DigitalOcean.

The other important features still missing are some sort of report or document, and a request to the application to check whether it is healthy.

Thanks for reading, that’s it for now.