Cluster

The Cluster resource lets you create and manage Amazon EMR clusters for processing large datasets with frameworks such as Apache Hadoop and Apache Spark.

Minimal Example

Create a basic EMR cluster using only the required properties.

ts
import AWS from "alchemy/aws/control";

const emrCluster = await AWS.EMR.Cluster("basic-emr-cluster", {
  Name: "BasicEMRCluster",
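  Instances: {
    // One primary node and two core nodes (three instances in total)
    MasterInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 1 },
    CoreInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 2 }
  },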
  JobFlowRole: "EMR_EC2_DefaultRole",
  ServiceRole: "EMR_DefaultRole"
});

Advanced Configuration

Configure an EMR cluster with additional features such as bootstrap actions and logging.

ts
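import AWS from "alchemy/aws/control";

const advancedEmrCluster = await AWS.EMR.Cluster("advanced-emr-cluster", {
  Name: "AdvancedEMRCluster",
  Instances: {
    // One primary node and two core nodes (three instances in total)
    MasterInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 1 },
    CoreInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 2 }
  },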
  JobFlowRole: "EMR_EC2_DefaultRole",
  ServiceRole: "EMR_DefaultRole",
  BootstrapActions: [
    {
      Name: "Install Apache Spark",
      ScriptBootstrapAction: {
        Path: "s3://my-bucket/bootstrap-actions/install-spark.sh"
      }
    }
  ],
  LogUri: "s3://my-bucket/logs/"
});
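
When a cluster runs several bootstrap scripts, a small builder can keep the `BootstrapActions` entries consistent. The following is a minimal sketch, not part of Alchemy: `bootstrapAction` is a hypothetical helper, and it assumes that scripts live in S3 as in the example above.

```typescript
// Shape of one entry in the BootstrapActions array used above.
interface BootstrapAction {
  Name: string;
  ScriptBootstrapAction: { Path: string; Args?: string[] };
}

// Hypothetical helper (not an Alchemy API): builds one bootstrap action and
// rejects script paths that do not look like s3://bucket/key.
function bootstrapAction(
  name: string,
  path: string,
  args: string[] = []
): BootstrapAction {
  if (!/^s3:\/\/[^/]+\/.+/.test(path)) {
    throw new Error(`Bootstrap script path must look like s3://bucket/key, got "${path}"`);
  }
  const script: BootstrapAction["ScriptBootstrapAction"] = { Path: path };
  // Only include Args when there are arguments to pass to the script.
  if (args.length > 0) {
    script.Args = args;
  }
  return { Name: name, ScriptBootstrapAction: script };
}
```

The returned objects can be placed directly in the `BootstrapActions` array, for example `BootstrapActions: [bootstrapAction("Install Apache Spark", "s3://my-bucket/bootstrap-actions/install-spark.sh")]`.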

Using Auto-Scaling

Create a cluster with a managed scaling policy so that EMR resizes it between a minimum and a maximum capacity as the workload changes.

ts
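import AWS from "alchemy/aws/control";

const autoScalingCluster = await AWS.EMR.Cluster("auto-scaling-emr-cluster", {
  Name: "AutoScalingEMRCluster",
  Instances: {
    // One primary node and two core nodes (three instances in total)
    MasterInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 1 },
    CoreInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 2 }
  },
  JobFlowRole: "EMR_EC2_DefaultRole",
  ServiceRole: "EMR_DefaultRole",
  // IAM role used by automatic scaling policies on instance groups
  AutoScalingRole: "EMR_AutoScaling_DefaultRole",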
  ManagedScalingPolicy: {
    ComputeLimits: {
      MinimumCapacityUnits: 2,
      MaximumCapacityUnits: 10,
      UnitType: "Instances"
    }
  }
});
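
Because a mistyped limit only surfaces at deploy time, it can help to build the policy object through a function that checks it first. This is a minimal sketch: `managedScalingPolicy` is a hypothetical helper, and its checks (whole, non-negative numbers with minimum not above maximum) are sanity checks chosen for this example rather than a complete statement of the limits EMR enforces.

```typescript
// Unit types accepted for ComputeLimits.UnitType.
type UnitType = "Instances" | "InstanceFleetUnits" | "VCPU";

interface ComputeLimits {
  MinimumCapacityUnits: number;
  MaximumCapacityUnits: number;
  UnitType: UnitType;
}

// Hypothetical helper (not an Alchemy API): returns an object shaped like the
// ManagedScalingPolicy property after basic sanity checks.
function managedScalingPolicy(
  min: number,
  max: number,
  unitType: UnitType = "Instances"
): { ComputeLimits: ComputeLimits } {
  if (!Number.isInteger(min) || !Number.isInteger(max)) {
    throw new Error("Capacity limits must be whole numbers");
  }
  if (min < 0) {
    throw new Error("Minimum capacity must not be negative");
  }
  if (min > max) {
    throw new Error(`Minimum capacity (${min}) exceeds maximum capacity (${max})`);
  }
  return {
    ComputeLimits: {
      MinimumCapacityUnits: min,
      MaximumCapacityUnits: max,
      UnitType: unitType
    }
  };
}
```

With this in place, the policy in the example above could be written as `ManagedScalingPolicy: managedScalingPolicy(2, 10)`.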

Configuring Kerberos Authentication

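Set up a cluster with Kerberos authentication. Kerberos itself is enabled through an EMR security configuration; KerberosAttributes supplies the realm and passwords that go with it.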

ts
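import AWS from "alchemy/aws/control";

const kerberosCluster = await AWS.EMR.Cluster("kerberos-emr-cluster", {
  Name: "KerberosEMRCluster",
  Instances: {
    // One primary node and two core nodes (three instances in total)
    MasterInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 1 },
    CoreInstanceGroup: { InstanceType: "m5.xlarge", InstanceCount: 2 }
  },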
  JobFlowRole: "EMR_EC2_DefaultRole",
  ServiceRole: "EMR_DefaultRole",
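  // Hypothetical name of an existing security configuration with Kerberos enabled
  SecurityConfiguration: "my-kerberos-security-config",
  KerberosAttributes: {
    Realm: "EXAMPLE.COM",
    // Placeholder values; load real passwords from a secret store instead of hardcoding them
    KdcAdminPassword: "MyKdcAdminPassword",
    CrossRealmTrustPrincipalPassword: "MyCrossRealmTrustPassword"
  }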
});
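
The passwords in the example above are hardcoded for illustration only. One way to keep them out of source control is to read them from the environment and fail early when one is missing. The sketch below assumes two environment variable names, `KDC_ADMIN_PASSWORD` and `CROSS_REALM_TRUST_PASSWORD`, which are made up for this example; `requireEnv` and `kerberosAttributesFromEnv` are hypothetical helpers, not Alchemy APIs.

```typescript
// A plain key/value view of the environment, so the helpers are easy to test.
type Env = Record<string, string | undefined>;

interface KerberosAttributes {
  Realm: string;
  KdcAdminPassword: string;
  CrossRealmTrustPrincipalPassword: string;
}

// Returns the named variable, or throws if it is unset or empty.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}

// Builds the KerberosAttributes object from environment variables.
// The variable names are assumptions made for this sketch.
function kerberosAttributesFromEnv(env: Env, realm: string): KerberosAttributes {
  return {
    Realm: realm,
    KdcAdminPassword: requireEnv(env, "KDC_ADMIN_PASSWORD"),
    CrossRealmTrustPrincipalPassword: requireEnv(env, "CROSS_REALM_TRUST_PASSWORD")
  };
}
```

In a Node.js program the cluster definition could then use `KerberosAttributes: kerberosAttributesFromEnv(process.env, "EXAMPLE.COM")`, so a missing password stops the deployment before any request reaches AWS.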