Create and run chaos experiments

Harness Chaos Engineering (CE) lets you build elaborate chaos experiments that reproduce complex, real-life failure scenarios against which you can validate your applications. The experiments are declarative, and you can construct them in the Chaos Studio user interface without writing any code.

A chaos experiment is composed of chaos faults that are arranged in a specific order to create a failure scenario. The chaos faults target various aspects of an application, including the constituent microservices and underlying infrastructure. The parameters associated with these faults can be tuned to impart the desired chaos behavior.

For more information, go to Flow of control in a chaos experiment.

Construct a chaos experiment

To add a chaos experiment:

  1. In Harness, navigate to Chaos > Chaos Experiments.

    Chaos Experiments page

  2. Select + New Experiment.

    Chaos Studio is displayed.

    Experiment Overview

  3. In the Experiment Overview, enter the experiment Name and optional Description and Tags.

  4. In Select a Chaos Infrastructure, select the infrastructure where the target resources reside, and then click Next.

    This takes you to the Experiment Builder tab, where you can choose how to start building your experiment.

    Experiment Builder

    For more information on infrastructure, go to Connect chaos infrastructures.

  5. Select how you want to build the experiment. The options are:

    • Blank Canvas - Lets you build the experiment from scratch, adding the specific faults you want.
    • Templates from Chaos Hubs - Lets you preview and select an experiment from the pre-curated experiment templates available in Chaos Hubs.
    • Upload YAML - Lets you upload an experiment manifest YAML file.

    Each option is explained below.

    If you select Blank Canvas:

    The Experiment Builder tab is displayed.

    Experiment Builder tab with Add button

    1. Select Add, then select each fault you want to add to the experiment individually.

      Select Faults

      For each fault you select, you'll tune the fault's properties next.

      Tune Fault

    2. To tune each fault:

      • Specify the target application (only for pod-level Kubernetes faults): This lets the fault target the application's pods.

      • Tune fault parameters: Every fault has a set of common parameters, such as the chaos duration, ramp time, etc., and a set of unique parameters that may be customised as needed.

      • Add chaos probes: (Optional) On the Probes tab, you can add chaos probes to automate the chaos hypothesis checks for a fault during the experiment execution. Probes are declarative checks that aid in the validation of certain criteria that are deemed necessary to declare an experiment as passed.

      • Tune fault weightage: Set the weight for the fault, which determines its importance relative to the other faults in the experiment. The weight is used to calculate the resilience score of the experiment.

    3. To add a fault that runs in parallel to another fault, point your mouse below an existing fault, and then select Add.

      Complex Faults Experiment

      In Experiment Builder, faults that are stacked vertically run in parallel, and faults or groups of parallel faults run in sequence from left to right.

    If you select Templates from Chaos Hubs:

    1. Select an experiment template from a chaos hub.

      • Select Experiment Type to see available chaos hubs to select templates from.
      • Select a template to see a preview of the faults included.

      Fault Templates

      You can edit the template to add more faults or update the existing faults.

    If you select Upload YAML:

    1. Upload an experiment manifest YAML file to create the experiment.

      You can edit the experiment to update the existing faults or add more of them.

  6. Save the experiment.

    Save experiment options

    • Select Save to save the experiment to the Chaos Experiments page. You can add it to a chaos hub later.
    • Select Add Experiment to ChaosHub to save this experiment as a template in a selected chaos hub.

You can now either run the experiment right away by selecting the Run button at the top, or create a recurring schedule for it on the Schedule tab.
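The layout rule from the Experiment Builder (vertically stacked faults run in parallel, and groups run in sequence from left to right) can be pictured as a list of stages. The following is a minimal sketch of that ordering only, not Harness code: `run_fault` and `run_experiment` are hypothetical helpers, and the fault names are used purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fault(name):
    # Hypothetical placeholder: a real fault would inject chaos here.
    return f"{name}: completed"


def run_experiment(stages):
    """Run stages one after another; faults inside a stage run in parallel."""
    results = []
    for stage in stages:
        with ThreadPoolExecutor(max_workers=len(stage)) as pool:
            # pool.map returns results in the same order as the input faults.
            results.append(list(pool.map(run_fault, stage)))
    return results


# Each inner list is one column in the builder (faults stacked vertically).
stages = [
    ["pod-delete"],
    ["pod-cpu-hog", "pod-memory-hog"],
    ["pod-network-latency"],
]
results = run_experiment(stages)
```

Here the second stage starts only after `pod-delete` finishes, and the two hog faults in that stage run at the same time.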

Advanced experiment setup options

When you create an experiment for a Kubernetes chaos infrastructure, you can select Advanced Options on the Experiment Builder tab to configure the options described below:

Advanced Options

General options

Node Selector

Specifies the nodes on which the experiment pods will be scheduled. Provide the node label as a key-value pair.

  • Can be used with node-level faults to avoid the scheduling of the experiment pod on the target node(s).

  • Can be used to keep the experiment pods from being scheduled on nodes that have an unsupported OS.

    Node Selector
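As a rough illustration of how a node selector narrows scheduling, the sketch below applies the general Kubernetes rule that a node is eligible only if it carries every key-value pair in the selector. The node names and the `matches_node_selector` helper are hypothetical; `kubernetes.io/os` is a standard Kubernetes node label.

```python
def matches_node_selector(node_labels, node_selector):
    """Return True if the node has every label the selector asks for."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Hypothetical cluster nodes and their labels.
nodes = {
    "node-a": {"kubernetes.io/os": "linux", "role": "chaos"},
    "node-b": {"kubernetes.io/os": "windows"},
    "node-c": {"kubernetes.io/os": "linux"},
}

# Key-value pair as it might be entered in the Node Selector option.
node_selector = {"kubernetes.io/os": "linux"}

eligible = [name for name, labels in nodes.items()
            if matches_node_selector(labels, node_selector)]
```

With this selector, only the Linux nodes remain candidates for the experiment pods.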

Toleration

Specifies the tolerations that allow the experiment pods to be scheduled on nodes with matching taints. For more information on taints and tolerations, go to the Kubernetes documentation.

  • Can be used with node-level faults to avoid the scheduling of the experiment pod on the target node(s).

  • Can be used to keep the experiment pods from being scheduled on nodes that have an unsupported OS.

    Toleration
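The sketch below approximates how Kubernetes decides whether a pod's tolerations cover a node's taints. It is a simplified model of the general Kubernetes matching rules (`Equal` and `Exists` operators, with an empty effect matching any effect), not Harness code; the taint key `dedicated` and both helper functions are hypothetical.

```python
def toleration_matches(toleration, taint):
    """Simplified Kubernetes rule for one toleration against one taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key", "")
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key with Exists tolerates every taint key.
        return key == "" or key == taint["key"]
    # Equal: both key and value must match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(taints, tolerations):
    """A pod can land on the node if every blocking taint is tolerated."""
    blocking = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(toleration_matches(tol, taint) for tol in tolerations)
               for taint in blocking)


taints = [{"key": "dedicated", "value": "chaos", "effect": "NoSchedule"}]
tolerations = [{"key": "dedicated", "operator": "Equal",
                "value": "chaos", "effect": "NoSchedule"}]

allowed = can_schedule(taints, tolerations)
blocked = can_schedule(taints, [])
```

Without a matching toleration, the tainted node rejects the experiment pod, which is why the option matters on clusters that reserve nodes through taints.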

Annotations

Specifies the annotations to be added to the experiment pods. Provide the annotations as key-value pairs. For more information on annotations, go to the Kubernetes documentation.

  • Can be used for bypassing network proxies enforced by service mesh tools like Istio.

    Annotations
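For the service mesh case, a minimal sketch of adding annotations to pod metadata is shown below. The `with_annotations` helper and the pod name are hypothetical. `sidecar.istio.io/inject` is an Istio annotation that disables sidecar injection, but whether it is the right key depends on how your mesh is configured, so treat it as an assumption.

```python
def with_annotations(pod_metadata, annotations):
    """Return a copy of the pod metadata with the extra annotations merged in."""
    merged = dict(pod_metadata)
    merged["annotations"] = {**pod_metadata.get("annotations", {}), **annotations}
    return merged


# Key-value pairs as they might be entered in the Annotations option.
experiment_annotations = {"sidecar.istio.io/inject": "false"}

metadata = with_annotations({"name": "experiment-pod"}, experiment_annotations)
```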

Security options

Enable runAsUser

Specifies the user ID to be used for starting all the processes in the experiment pod containers. By default, user ID 1000 is used.

  • Allows privileged access or restricted access for experiment pods.

    runAsUser

Enable runAsGroup

Specifies the group ID to be used for starting all the processes in the experiment pod containers instead of a user ID.

  • Allows privileged access or restricted access for experiment pods.

    runAsGroup
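In Kubernetes terms, these two options correspond to the `runAsUser` and `runAsGroup` fields of a security context. The sketch below builds such a fragment, assuming the default user ID of 1000 mentioned above. `build_security_context` is a hypothetical helper, and how Harness places these values in the experiment manifest is not shown here.

```python
def build_security_context(run_as_user=1000, run_as_group=None):
    """Build a Kubernetes-style securityContext fragment."""
    context = {"runAsUser": run_as_user}
    if run_as_group is not None:
        # Only set the group when the runAsGroup option is enabled.
        context["runAsGroup"] = run_as_group
    return context


default_context = build_security_context()
# User ID 0 is root, which gives the processes privileged access.
root_context = build_security_context(run_as_user=0, run_as_group=0)
```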

Launch an experiment from a chaos hub

You can launch experiments from the default Enterprise Chaos Hub or from custom hubs.

Note

Launching the experiment from a hub is different from running an experiment from the Chaos Experiments page. The experiments in chaos hubs are actually templates, so when you launch them from a hub you must provide some additional details. The experiments in the Chaos Experiments page execute immediately, as configured, when you run them.

To launch an experiment from a chaos hub:

  1. In Harness, navigate to Chaos > ChaosHubs, and then select the hub you want.

  2. Find the experiment you want to launch, and then select Launch Experiment.

  3. Select a chaos infrastructure, and then select Next.

    You can change the infrastructure type if necessary.

    Select a Chaos Infrastructure

    Chaos Studio is displayed when you select Next.

  4. In Chaos Studio's Experiment Builder, select the faults in the experiment to configure them.

    You can add more faults or delete faults from the experiment, or update the sequence of faults.

  5. Select Run to execute the experiment.

    You can also save your customized experiment as a template in a chaos hub using the Save button.

Analyze chaos experiments

You can observe the execution status of the faults in a chaos experiment while it runs. The screen shows the experiment pipeline on the right-hand side, and details such as Environment, Infrastructure Name, and the runs that have passed and failed on the left-hand side.

Experiment Executing

When the experiment completes execution, it displays the Resilience Score. This score describes how resilient your application is to unplanned failures. The probe success percentage helps determine the outcome of every fault in the chaos experiment. Probes (if any) associated with the experiment are used to understand how the application fared.
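One way to see how fault weights and probe success percentages could combine into a single score is a weighted average. The sketch below assumes that formula, which this page does not state explicitly, so treat it as an illustration, not as the exact Harness calculation. The function name, field names, and sample values are hypothetical.

```python
def resilience_score(faults):
    """Weighted average of probe success percentages (assumed formula)."""
    total_weight = sum(fault["weight"] for fault in faults)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(fault["weight"] * fault["probe_success_pct"] for fault in faults)
    return weighted_sum / total_weight


# Hypothetical results: the heavier fault passed all probes, the lighter one half.
faults = [
    {"name": "pod-delete", "weight": 7, "probe_success_pct": 100},
    {"name": "pod-cpu-hog", "weight": 3, "probe_success_pct": 50},
]

score = resilience_score(faults)  # (7*100 + 3*50) / 10 = 85.0
```

Under this assumption, raising the weight of a fault makes its probe results count for more in the final score.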

Experiment Failed

If any of the faults fail, the Fail Step explains why the fault failed.

Result Fail Step