
Deploy MOSTLY AI to an AWS EKS cluster

To run MOSTLY AI in AWS, you need an Elastic Kubernetes Service (EKS) cluster. Refer to the infrastructure requirements and consult your company's infrastructure team on how to create the cluster in line with your internal guidelines and best practices.

Infrastructure is tailored to the specific needs of each company. As such, it is important to note that while this guide will get you up and running, none of the modules or examples are direct requirements for the installation. Treat them as a starting point for your own infrastructure setup.

Prerequisites

  • An AWS account where you have root user privileges.
  • A fully-qualified domain name (FQDN) or a valid hosted zone where you can create a new record.
    • You can validate if your hosted zone is exposed by running dig +stats +short <YOURHOSTEDZONE> NS in a terminal with Internet access.
shell
# For a valid hosted zone, a response should be returned:
dig +stats +short mostly.ai NS
  ns-1939.awsdns-50.co.uk.
  ns-1041.awsdns-02.org.
  ns-437.awsdns-54.com.
  ns-1014.awsdns-62.net.

# For an invalid hosted zone, no response should be returned:
dig +stats +short mostlyASD.ai NS
  • Install the following tools, which the commands in this guide use:
    • AWS CLI (aws)
    • eksctl
    • kubectl
    • Helm (helm)

  • Decide in which AWS region you want to deploy MOSTLY AI. These instructions are based on deploying to eu-central-1. If you need to deploy to another region, adjust your deployment region accordingly.

  • Obtain deployment details from your Customer Experience Engineer.

    • MOSTLY AI Helm chart
    • First-time login credentials for the MOSTLY AI Platform
    • (Optional) MOSTLY AI image repository pull secret. This is required only if you do not have the images from MOSTLY AI in your internal image repository and instead intend to use the MOSTLY AI image repository to pull the container images.
  • Recommended:

    • An IDE like Visual Studio Code or Cursor
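
The commands in this guide call aws, eksctl, kubectl, helm, and dig. As a minimal sketch, the check below reports which of them are not on your PATH. The function name missing_tools is illustrative and the tool list is an assumption based on the commands used in this guide, so adjust it to your setup.

```shell
# Print the name of each given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tool list assumed from the commands used in this guide.
missing=$(missing_tools aws eksctl kubectl helm dig)
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "All tools found."
fi
```

The check only confirms that each command exists. It does not verify versions.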

Pre-deployment

Export your root user credentials for AWS

💡

This will vary depending on how your AWS access is set up. Refer to your infrastructure team and to the Official AWS Documentation for exact guidance.

This tutorial assumes that SSO is set up. If you are not using SSO, create an IAM user with the AdministratorAccess policy and use its credentials instead.

  1. Navigate to your AWS Access Portal.
  2. From the IAM account where you have the AdministratorAccess policy, click on Access keys.
  3. From the Option 1: Set AWS environment variables modal, copy your key, secret and token.
  4. Open a terminal in your IDE, in the folder where the MOSTLY AI Helm charts are available, and paste the copied values there.
shell
export AWS_ACCESS_KEY_ID="<YOURAWSKEY>"
export AWS_SECRET_ACCESS_KEY="<YOURAWSSECRET>"
export AWS_SESSION_TOKEN="<YOURAWSTOKEN>"
export AWS_REGION="<YOURAWSREGION>"
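
Before calling AWS, it can help to confirm that the four variables are actually exported in the current terminal session. The following is a minimal sketch. The function name missing_env is illustrative and not part of any AWS tooling.

```shell
# Report which of the given environment variables are unset or empty.
# Returns 0 when all are exported with a value, 1 otherwise.
missing_env() {
  rc=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name"
      rc=1
    fi
  done
  return "$rc"
}

missing_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN AWS_REGION \
  || echo "Export the variables listed above before running aws or eksctl."
```

The check only looks at whether the variables are set. It does not confirm that the credentials are valid or unexpired.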

Validate that your credentials are enabled and that you have access to AWS with the following command:

shell
aws sts get-caller-identity

Result

The expected output is similar to the following:

{
  "UserId": "AROAMOSTLYAIQ:my.user@mycompany.com",
  "Account": "1234567890",
  "Arn": "arn:aws:sts::1234567890:assumed-role/AWSReservedSSO_AdministratorAccess_0987654321/my.user@mycompany.com"
}

Create the EKS cluster with eksctl

💡

The mostlyai-cluster.yaml file creates an Elastic Kubernetes Service (EKS) cluster with the minimum requirements to get you up and running. However, it should be treated as a starting point for your own infrastructure setup.

Always consult your company's infrastructure team on how to create your cluster according to your internal guidelines and best practices, while meeting the minimum requirements for deploying MOSTLY AI.

  1. In your IDE, create a new file named mostlyai-cluster.yaml with the contents below. At a minimum, set metadata.region to the same region that you exported as AWS_REGION in the Export your root user credentials for AWS step:

    mostlyai-cluster.yaml
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig

    metadata:
      name: mostly-ai
      region: eu-central-1
      version: "1.33"

    iam:
      withOIDC: true

    addons:
      - name: eks-pod-identity-agent
        version: latest

      - name: vpc-cni
        version: latest
        configurationValues: |
          enableNetworkPolicy: "true"
        podIdentityAssociations:
          - namespace: kube-system
            serviceAccountName: aws-node
            permissionPolicyARNs:
              - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy

      - name: coredns
        version: latest

      - name: kube-proxy
        version: latest

      - name: aws-ebs-csi-driver
        version: latest
        podIdentityAssociations:
          - namespace: kube-system
            serviceAccountName: ebs-csi-controller-sa
            permissionPolicyARNs:
              - arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy

    managedNodeGroups:
      - name: system-nodes
        amiFamily: Bottlerocket
        instanceTypes: [ m5a.2xlarge ]
        minSize: 1
        maxSize: 1
        desiredCapacity: 1
        labels: { nodepool: system, role: system }
        volumeType: gp2
        volumeSize: 24

      - name: cpu-compute-nodes
        amiFamily: Bottlerocket
        bottlerocket: { enableAdminContainer: true }
        instanceTypes: [ c5.4xlarge ]
        minSize: 1
        maxSize: 4
        desiredCapacity: 1
        labels: { nodepool: cpu-compute, role: workers }
        taints:
          - key: scheduling.mostly.ai/node
            value: engine-jobs
            effect: NoSchedule
        volumeType: gp2
        volumeSize: 24

    vpc:
      cidr: 192.168.0.0/16
      clusterEndpoints: { publicAccess: true, privateAccess: false }
      nat: { gateway: Single }
  2. Save the file and launch a dry-run to validate that the mostlyai-cluster.yaml file can be executed without errors by eksctl:

    shell
    eksctl create cluster -f mostlyai-cluster.yaml --dry-run

This should output the contents of the mostlyai-cluster.yaml file as read by eksctl. If it returns any errors, check your credentials (and their level of permissions), the region, and any changes you made to the mostlyai-cluster.yaml file itself.

  3. If there are no errors in the dry run, create the cluster by running the same command without the --dry-run flag (this takes 15+ minutes to complete):

    shell
    eksctl create cluster -f mostlyai-cluster.yaml

Result

eksctl creates the cluster through CloudFormation in your account, and you can follow the progress of its creation in your terminal. When creation completes, you should receive a message like:

EKS cluster "mostly-ai" in "eu-central-1" region is ready
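
A mismatch between the region in mostlyai-cluster.yaml and the exported AWS_REGION is easy to miss. The sketch below is one way to compare the two before running eksctl. The function names config_region and region_matches are illustrative, and the parsing assumes the simple `region: <value>` layout used in the file above rather than general YAML.

```shell
# Print the first `region:` value found in an eksctl ClusterConfig file.
# Assumes the key and value sit on one line, as in the sample file.
config_region() {
  awk '$1 == "region:" { gsub(/"/, "", $2); print $2; exit }' "$1"
}

# Succeed only when the region in the file equals the expected region.
region_matches() {
  [ "$(config_region "$1")" = "$2" ]
}

# Run the comparison only if the file exists in the current directory.
if [ -f mostlyai-cluster.yaml ]; then
  if region_matches mostlyai-cluster.yaml "$AWS_REGION"; then
    echo "Region matches AWS_REGION."
  else
    echo "Region mismatch: file has $(config_region mostlyai-cluster.yaml), AWS_REGION is $AWS_REGION."
  fi
fi
```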

Connect to the cluster and prepare for deployment of the MOSTLY AI Platform

  1. When cluster creation is completed, retrieve the credentials to connect to the cluster (make sure to set the region to the one where your mostly-ai cluster is running):

    shell
    aws eks update-kubeconfig --region eu-central-1 --name mostly-ai

Update the values.yaml in the root of the MOSTLY AI Helm charts.

Set the FQDN that you want to use, the secret used to pull the images (mostlyRegistryDockerConfigJson), gp2 as the storage class, and haproxy as the ingress class. Save the file after you finish your changes.

Use the sample excerpt below as a reference for the values.yaml file:

values.yaml
...
_customerInstallation:
   domainNames:
     mostly-ai: &fqdn your.mostly.ai.installation.fqdn.com # Replace with your valid FQDN
   deploymentSettings:
   ...
     mostlyRegistryDockerConfigJson: &mostlyRegistryDockerConfigJson eyJhdXRocyI6eyJodHRwczovL3NvbWUuZG9ja2VyLnJlcG9zaXRvcnkvdjEvIjp7InVzZXJuYW1lIjoic29tZS11c2VybmFtZSIsInBhc3N3b3JkIjoic29tZS1wYXNzd29yZCIsImF1dGgiOiJjMjl0WlMxMWMyVnlibUZ0WlRwemIyMWxMWEJoYzNOM2IzSmsifX19
     persistenceStorageClass: &persistenceStorageClass "gp2"
     ingressClassName: &ingressClassName "haproxy"
...
combinedChart:
 ...
  haproxy: {enabled: true}
...
haproxy:
  controller:
    ingressClassResource:
      default: true
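
The sample mostlyRegistryDockerConfigJson value above decodes to a Docker config JSON with a single auths entry. If you receive a registry URL, username, and password rather than a ready-made value, the following is a minimal sketch of how such a value could be assembled. The function name docker_config_b64 is illustrative, the inputs are placeholders, and the expected format is an assumption based on the sample. If your Customer Experience Engineer gives you an encoded value, use it as provided.

```shell
# Build a base64-encoded Docker config JSON for one registry.
# Arguments: registry URL, username, password.
# Note: values containing double quotes or backslashes would need JSON escaping.
docker_config_b64() {
  registry=$1
  user=$2
  pass=$3
  # The "auth" field is base64 of "username:password".
  auth=$(printf '%s' "$user:$pass" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}' \
    "$registry" "$user" "$pass" "$auth" | base64 | tr -d '\n'
}

# Placeholder inputs taken from the decoded sample value.
docker_config_b64 "https://some.docker.repository/v1/" "some-username" "some-password"
echo
```

Piping through tr removes the line wrapping that some base64 implementations add, so the result fits on a single YAML line.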

Deploy the platform in your cluster

shell
helm upgrade --install mostly-ai ./mostly-combined --values values.yaml --namespace mostly-ai --create-namespace

Result

The expected output is similar to the following. If you see errors, see the Troubleshoot AWS EKS deployment issues section.

shell
Release "mostly-ai" does not exist. Installing it now.
NAME: mostly-ai
LAST DEPLOYED: Fri Aug 15 15:54:53 2025
NAMESPACE: mostly-ai
STATUS: deployed
REVISION: 1
TEST SUITE: None
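
A STATUS of deployed means Helm applied the release. Pods can still be starting afterwards. As a minimal sketch, the helper below reads the table printed by kubectl get pods and lists pods whose STATUS column is neither Running nor Completed. The function name pods_not_running is illustrative, and it checks only the STATUS column, not the READY column.

```shell
# Read `kubectl get pods` table output on stdin and print "<name> <status>"
# for each pod whose STATUS (3rd column) is not Running or Completed.
# The header row is skipped.
pods_not_running() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage, assuming kubectl is connected to the cluster:
#   kubectl get pods -n mostly-ai | pods_not_running
```

An empty result suggests that all pods have started. Any remaining lines point to pods worth inspecting.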

Connect your deployment with your FQDN

In Route 53, create an A record in your hosted zone that points to the external address (the EXTERNAL-IP value) created by the haproxy service. Configure the record to route traffic as an Alias to Application and Classic Load Balancer.

Make sure to use the same external address that is exposed by haproxy. You can confirm it with this command:

shell
kubectl get svc -n mostly-ai
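
The command lists every service in the namespace. As a minimal sketch, the helper below filters that table down to the EXTERNAL-IP value of services of type LoadBalancer. The function name lb_addresses is illustrative, and it assumes the default kubectl table columns (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE).

```shell
# Read `kubectl get svc` table output on stdin and print the EXTERNAL-IP
# column (4th) of every service whose TYPE column (2nd) is LoadBalancer.
lb_addresses() {
  awk '$2 == "LoadBalancer" { print $4 }'
}

# Usage, assuming kubectl is connected to the cluster:
#   kubectl get svc -n mostly-ai | lb_addresses
```

A value of <pending> means the load balancer has not been provisioned yet, so wait and run the command again.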

If you are using an external domain provider such as GoDaddy, Namecheap, etc., map the external IP address of the load balancer as an A record there.

Log in to your MOSTLY AI deployment.

In a new tab in your web browser, go to your FQDN and log in with the superadmin credentials.

MOSTLY AI Deployment - Log in page

You can now use your MOSTLY AI deployment.