Deploy CHT Core on Medic hosted EKS
Setting up a cloud hosted deployment of CHT Core on Medic's AWS EKS infrastructure
While Medic's Amazon Elastic Kubernetes Service (AWS EKS) infrastructure is not directly available to the public doing CHT Core development, publicly documenting Medic's process for using it will help Medic employees new to EKS. Hopefully external developers looking to reuse Medic's tools and processes on their own EKS clusters will find it helpful as well.
While these instructions assume you work at Medic and have access to private GitHub repositories, many of the tools are fully open source.
Prerequisites
Command Line
Be sure you have these tools installed and repos cloned:
- `awscli`: version 2 or newer
- `kubectl`: must be within one minor version of the cluster. If the cluster is 1.24.x, use 1.23.x, 1.24.x or 1.25.x.
- `helm`
- `jq`
- `node`: version 22 or later. Be sure that `npm` was installed along with `node` (it normally is).
- `medic-infra` repo cloned - so you can run the `eks-aws-mfa-login` script
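As a quick sanity check that everything is installed and on a supported version, you can run the following (output will vary by machine):

```bash
# Verify required CLI tools are present and new enough
aws --version        # expect aws-cli/2.x or newer
kubectl version --client
helm version --short
jq --version
node --version       # expect v22 or later
npm --version
```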
Optional: Autocomplete
Both helm and kubectl have autocomplete libraries. For power users and beginners alike, it adds a lot of discoverability. This code is for zsh, but bash, fish and powershell are supported as well:
```
source <(kubectl completion "$(basename "$SHELL")")
source <(helm completion "$(basename "$SHELL")")
```

See the helm and kubectl docs for how to automatically load these on every new session.
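One way to load these automatically is to append them to your shell's rc file. A sketch, assuming zsh and a standard `~/.zshrc`:

```bash
# Persist kubectl and helm completions for every new zsh session
echo 'source <(kubectl completion zsh)' >> ~/.zshrc
echo 'source <(helm completion zsh)' >> ~/.zshrc
```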
Request permission
By default, Medic teammates do not have EKS access and must file a ticket to request it:
1. Create a ticket to get your DNS and Namespace created for EKS, which should match each other. As an example, a `mrjones-dev` namespace would match `mrjones.dev.medicmobile.org` DNS. The ticket should include requesting EKS access to be granted.
2. Once the ticket in step one is complete, follow the CLI setup guide.
3. An AWS Admin will create your IAM user, and a role prefixed with `eks-` that contains the same policies as similar usernames. Admins can take a look at IAM user `mrjones` and IAM role `eks-mrjones` for examples.
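For admins doing the granting, a quick way to compare against the example user and role above is to list their attached policies (a sketch, assuming IAM read access):

```bash
# Inspect the example IAM user and role to mirror their policies
aws iam list-attached-user-policies --user-name mrjones
aws iam list-attached-role-policies --role-name eks-mrjones
```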
First time setup
These steps only need to be run once!
After you have created a ticket per “Request permission” above, you should get a link to sign up for AWS. Click the link and:
1. Create a new password: ensure it's 10+ characters including one alpha (`a-z`) and one special (`` ~!@#$%^&*_-+=`|\(){}[]:;"'<>,.?/ ``) character.
2. Set up MFA: in the top-right corner of the browser there is a drop-down menu with your `username @ medic`. Click that and then "My Security Credentials".
3. Assign an MFA device and give it the same name as your username. In the AWS web GUI, click your name in the upper right:
   - Security Credentials
   - scroll down to "Multi-factor authentication (MFA)"
   - click "Assign MFA device"
   - enter a "Device name" (should match username)
   - "Select MFA device" that you're using
4. Create Access Keys for the Command Line Interface: in the AWS web GUI, click your name in the upper right -> Security Credentials -> scroll down to "Access keys" -> click "Create access key" -> for use case choose "Command Line Interface" -> click "Next" -> enter a description and click "Create access key".
5. Run `aws configure` and enter the access keys at the prompts. Use the `eu-west-2` region. It should look like this:

   ```
   $ aws configure
   AWS Access Key ID [None]: <ACCESS-KEY-HERE>
   AWS Secret Access Key [None]: <SECRET-HERE>
   Default region name [None]: eu-west-2
   Default output format [None]:
   ```
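As an optional sanity check that the keys work, you can ask AWS which identity the CLI is using:

```bash
# Should print your IAM user ARN in the 720541322708 account
aws sts get-caller-identity
```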
Starting and stopping (aka deleting)
1. Login with the `eks-aws-mfa-login` script in the infra repo:

   ```
   ./eks-aws-mfa-login USERNAME TOTP_HERE
   ```

2. Run the Update Kubeconfig command, assuming your username is `mrjones` and your namespace is `mrjones-dev` - be sure to replace these with yours and add the `eks-` prefix to your username:

   ```
   aws eks update-kubeconfig --name dev-cht-eks --profile eks-mrjones --region eu-west-2
   ```

3. Ensure you're using the dev EKS cluster:

   ```
   kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks
   ```

   If you get a `no context exists with the name` error, change `use-context` to `set-context` in the command. This will create the entry the first time. Subsequent calls should use `use-context`.

4a. Notify your Kubernetes Admin to create your namespace, and to add your role and rolebindings that tie into the `aws-auth` configmap in the `kube-system` namespace.

4b. Now is a good time to test that all your access works correctly: `kubectl -n <username>-dev get pods`

5. Create a new `values.yaml` file by copying this one. Be sure to update these values after you create it:

   - `<your-project-name>` and `<your-namespace>` - set both to `USERNAME-dev` - for example `mrjones-dev`
   - `<password-value>` - put in a strong password - this instance is exposed to the Internet! *
   - `<subdomain>` - your username. For example: `mrjones` for `mrjones.dev.medicmobile.org`
   - `clusteredCouch_enabled` - set to `true`

   \* Note some characters are unsupported in `password`: `:`, `@`, `"`, `'`, etc. Be sure to enclose it in quotes (`""`) and do not use spaces; otherwise your deployment will succeed but you won't be able to log into the CHT instance.

6. Ensure you have the latest code of the `cht-core` repo: `git checkout master; git pull origin`

7. Ensure you have the `node` dependencies installed for the `cht-deploy` script: `cd scripts/deploy; npm install`

8. Run the deploy, being sure to update `PATH_TO` to be where you saved the file in the prior step: `./cht-deploy -f PATH_TO/values.yaml`

9. Delete it when you're done: `helm delete USERNAME-dev --namespace USERNAME-dev`
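Putting it all together, a typical session might look like this sketch, which assumes username `mrjones`, a TOTP code of `123456`, and a `values.yaml` saved in your home directory:

```bash
# Login, point kubectl at the dev cluster, deploy, check, then tear down
./eks-aws-mfa-login mrjones 123456
aws eks update-kubeconfig --name dev-cht-eks --profile eks-mrjones --region eu-west-2
kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks
cd cht-core/scripts/deploy && npm install
./cht-deploy -f ~/values.yaml
kubectl -n mrjones-dev get pods                    # watch the pods come up
helm delete mrjones-dev --namespace mrjones-dev    # when you're done
```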
Cloning a Medic hosted instance
Sometimes a Medic teammate will need to run tests on data from an instance hosted in a Medic EKS deployment. When cloning a production instance, use extreme caution as it will have real PII/PHI in it. This includes, but is not limited to:
- Using a secure password
- Only ever share credentials over 1Password
- Deleting the instances, volume and snapshot when they’re no longer being used
Otherwise, as Medic has selected AWS as its provider to host production instances, making clones is safe when the above basic security measures are followed.
Overview
The cloning process assumes you have access to EKS and to the snapshots and volumes you wish to clone and create. This is not a permission granted to normal teammates who use EKS, so check with SRE as needed.
After checking your permissions, first find a snapshot of the data you want to clone. Only production data has automated snapshots, so when cloning a development instance, manually create a snapshot first. After finding the snapshot and its ID (e.g. snap-081d1cc18de16d8c7), create a new volume from this snapshot. Now label the volume so it's flagged for EKS use. Finally, put your newly created volume ID (e.g. vol-047f57544f4085fb2) in a values.yml file to use with helm and the deploy script.
Read on below for the exact steps on how to do this.
Steps
Always be sure of which context you’re working on! Start off by setting your context to dev-cht-eks:
```
kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks
```

And then follow these steps:
1. Find the ID of the snapshot by using the production URL to retrieve the ID and date of the latest snapshot. Be sure to replace `moh-foo.app` with the real URL of the instance (note: it doesn't matter which `context` you're on):

   ```
   aws ec2 describe-snapshots --region=eu-west-2 --filters "Name=tag:Address,Values='moh-foo.app.medicmobile.org'" | jq '.Snapshots[0]'
   ```

   This should return JSON like the following, from which you can verify both that the snapshot is current and that it is from the correct instance before taking the `SnapshotId` value. This JSON is truncated for brevity:

   ```json
   {
       "Description": "Created for policy: policy-43210483209 schedule: Default Schedule",
       "SnapshotId": "snap-432490821280432092",
       "StartTime": "2024-08-18T15:52:59.831000+00:00",
       "State": "completed",
       "VolumeId": "vol-4392148120483212",
       "VolumeSize": 900,
       "Tags": [
           { "Key": "Address", "Value": "moh-foo.app.medicmobile.org" },
           { "Key": "Name", "Value": "Production: moh-foo.app.medicmobile.org" },
           { "Key": "Description", "Value": "4x foo production for bar" }
       ]
   }
   ```

2. Now that you have found your snapshot ID, create a volume from it. Being sure to replace `snap-432490821280432092` with your ID, call:

   ```
   aws ec2 create-volume --region eu-west-2 --availability-zone eu-west-2b --snapshot-id snap-432490821280432092
   ```

   Be sure to grab the `VolumeId` from the resulting JSON, `vol-f9dsa0f9sad09f0dsa` in this case:

   ```json
   {
       "AvailabilityZone": "eu-west-2b",
       "CreateTime": "2024-08-23T21:31:27+00:00",
       "Encrypted": false,
       "Size": 900,
       "SnapshotId": "snap-432490821280432092",
       "State": "creating",
       "VolumeId": "vol-f9dsa0f9sad09f0dsa",
       "Iops": 2700,
       "Tags": [],
       "VolumeType": "gp2",
       "MultiAttachEnabled": false
   }
   ```

3. Run `describe-volumes` until that volume has a `State` of `available` (a polling sketch follows this list):

   ```
   aws ec2 describe-volumes --region eu-west-2 --volume-id vol-f9dsa0f9sad09f0dsa | jq '.Volumes[0].State'
   "available"
   ```

4. Once you have that volume created and `available`, tag it with `kubernetes.io/cluster/dev-cht-eks: owned` and `KubernetesCluster: dev-cht-eks`:

   ```
   aws ec2 create-tags --resources vol-f9dsa0f9sad09f0dsa --tags Key=kubernetes.io/cluster/dev-cht-eks,Value=owned Key=KubernetesCluster,Value=dev-cht-eks
   ```

   You can verify your tags took effect by calling `describe-volumes` again:

   ```
   aws ec2 describe-volumes --region eu-west-2 --volume-id vol-f9dsa0f9sad09f0dsa | jq '.Volumes[0].Tags'
   ```

   Which should result in this JSON:

   ```json
   [
       { "Key": "kubernetes.io/cluster/dev-cht-eks", "Value": "owned" },
       { "Key": "KubernetesCluster", "Value": "dev-cht-eks" }
   ]
   ```

5. Switch to the production cluster and then find the `subPath` of the deployment you made the snapshot from. The `COUCH-DB-NAME` is usually `cht-couchdb`, but it can sometimes be `cht-couchdb-1` (check `./troubleshooting/list-deployments <your-namespace>` if you still don't know). Including the `use-context`, the two calls are below. Note that the `troubleshooting` directory is in the CHT Core repo:

   ```
   kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/prod-cht-eks
   ./troubleshooting/get-volume-binding <DEPLOYMENT> <COUCH-DB-NAME> | jq '.subPath'
   ```

   Which shows the path like this:

   ```
   "storage/medic-core/couchdb/data"
   ```

6. Create a `values.yml` file from this template and edit the following fields:
   - `project_name` - likely your username followed by `-dev`. For example `mrjones-dev`
   - `namespace` - likely the same as project name, your username followed by `-dev`. For example `mrjones-dev`
   - `chtversion` - this should match the version of the instance you cloned from
   - `password` - this should match the instance you cloned from
   - `secret` - this should match the instance you cloned from
   - `user` - use `medic`
   - `uuid` - this should match the instance you cloned from
   - `couchdb_node_storage_size` - use the same size as the volume you just cloned
   - `account-id` - this should always be `720541322708`
   - `host` - this should be your username followed by `.dev.medicmobile.org`. For example `mrjones.dev.medicmobile.org`
   - `hosted_zone_id` - this should always be `Z3304WUAJTCM7P`
   - `preExistingDataAvailable` - set this to `true`
   - `dataPathOnDiskForCouchDB` - use the `subPath` you got in the step above. For example `storage/medic-core/couchdb/data`
   - `preExistingEBSVolumeID-1` - set this to the ID from step 2. For example `vol-f9dsa0f9sad09f0dsa`
   - `preExistingEBSVolumeSize` - use the same size as the volume you just cloned

7. Deploy this to development per the steps above. NB - Be sure to call `kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks` before you call `./cht-deploy`! Always create test instances on the dev cluster.

8. Login using the `user` and `password` set above, which should match the production instance.

9. When you're done with this deployment, you can delete it with helm: `helm delete USERNAME-dev --namespace USERNAME-dev`

10. Now that no resources are using the volume, you should delete it. If you created a snapshot, delete that as well. Be sure to replace `vol-f9dsa0f9sad09f0dsa` and `snap-432490821280432092` with your actual IDs. Only delete the snapshot if you created it above; do not delete snapshots you did not create:

    ```
    aws ec2 delete-volume --region eu-west-2 --volume-id vol-f9dsa0f9sad09f0dsa
    aws ec2 delete-snapshot --snapshot-id snap-432490821280432092
    ```
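As referenced in step 3, rather than re-running `describe-volumes` by hand you can poll until the volume is ready. A small sketch, assuming the example volume ID above:

```bash
# Poll every 10 seconds until the cloned volume's State is "available"
until [ "$(aws ec2 describe-volumes --region eu-west-2 \
      --volume-id vol-f9dsa0f9sad09f0dsa \
      | jq -r '.Volumes[0].State')" = "available" ]; do
  echo "waiting for volume..."
  sleep 10
done
```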
References and Debugging
More information on the `cht-deploy` script is available in the CHT Core GitHub repository, which includes specifics of the `values.yaml` file and more details about the debugging utilities listed below.
Debugging
A summary of the utilities in the `cht-core/scripts/deploy` directory, assuming a `mrjones-dev` namespace:

- list all resources: `./troubleshooting/list-all-resources mrjones-dev`
- view logs, assuming `cht-couchdb-1` was returned from the prior command: `./troubleshooting/view-logs mrjones-dev cht-couchdb-1`
- describe deployment, assuming `cht-couchdb-1` was returned from the 1st command: `./troubleshooting/describe-deployment mrjones-dev cht-couchdb-1`
- list all deployments: `./troubleshooting/list-deployments mrjones-dev`
Getting shell
Sometimes you need to look at files and other key pieces of data that are not available with the current troubleshooting/view-logs script. In this case, getting an interactive shell on the pod can be helpful.
- First, get a list of pods for your namespace: `kubectl -n NAMESPACE get pods`
- After finding the pod you're interested in, connect to it to get a shell (add `-c CONTAINERNAME` to target a specific container in the pod): `kubectl -n NAMESPACE exec -it PODNAME -- /bin/bash`
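For example, to get a shell on a CouchDB pod in the `mrjones-dev` namespace (the pod name below is hypothetical; use one returned by `get pods`):

```bash
# List the pods, then exec into the one you care about
kubectl -n mrjones-dev get pods
kubectl -n mrjones-dev exec -it cht-couchdb-76c7f9c8d9-abcde -- /bin/bash
```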
invalid apiVersion Error
If you get the error `exec plugin: invalid apiVersion "client.authentication.k8s.io/v1alpha1"` when running `kubectl version`, you might be using a version of the Kubernetes API `client.authentication.k8s.io` that is not supported by your `kubectl` client. This can happen on EKS clusters when the AWS CLI is an older version; in most cases you need at least version 2 of the AWS CLI. Check your version by running `aws --version`, and note that version 2 cannot be installed through `pip` (see the Command Line section above for installation instructions).
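A likely fix, assuming the stale entry came from an old AWS CLI, is to upgrade to v2 and regenerate the kubeconfig entry so the exec plugin is rewritten with a supported API version:

```bash
# Confirm the AWS CLI is v2+, then rebuild the kubeconfig entry
aws --version
aws eks update-kubeconfig --name dev-cht-eks --profile eks-mrjones --region eu-west-2
kubectl version    # should no longer error
```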
SRE Steps for granting users access to a namespace
If you’re on the SRE/Infra team and want to grant a Medic teammate access to EKS:
- Tools required: `aws`, `eksctl`, `kubectl`
- Create the AWS user.
- Attach the `Force_MFA` IAM policy and share the auto-generated password safely.
- Have the user log in and finish MFA and access key setup.
- Add the user to the `mfa-required-users` group.
- Add the namespaces and users to `tf/eks/dev/access/main.tf`.
- Run `tofu apply` in the `tf/eks/dev/access` folder.
- Create an `identitymapping` if needed, as shown in the sketch below:
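A minimal sketch of that identity mapping, assuming the `eks-mrjones` role from earlier in this page; adjust the ARN, username and any group flags to match existing mappings:

```bash
# Map the user's IAM role into the cluster's aws-auth configmap
eksctl create iamidentitymapping \
  --cluster dev-cht-eks \
  --region eu-west-2 \
  --arn arn:aws:iam::720541322708:role/eks-mrjones \
  --username mrjones
```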
Reading the AWS guide for principal access may help here!