Deploy CHT Core on Medic hosted EKS
Medic’s Amazon Elastic Kubernetes Service (AWS EKS) deployment is not directly available to the public doing CHT Core development. Documenting our process publicly still helps Medic employees who are new to EKS, and external developers looking to reuse Medic’s tools and process with EKS will hopefully find it helpful too.
While these instructions assume you work at Medic and have access to private GitHub repositories, many of the tools are fully open source.
Prerequisites
Command Line
Be sure you have these tools installed and repos cloned:
- awscli: version 2 or newer
- kubectl: Must be within one minor version of cluster. If cluster is 1.24.x, use 1.23.x, 1.24.x or 1.25.x.
- helm
- jq
- node version 22 or later. Be sure that npm was installed as well with node (it normally is)
- Medic Infra repo cloned
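The kubectl rule above (client within one minor version of the cluster) can be checked with a few lines of shell. This is a minimal sketch: the function names minor_of and within_one_minor are made up for illustration, the check compares only the minor component, and it assumes both versions share the same major version.

```shell
# Sketch: check that a kubectl client is within one minor version of the cluster.
# Both arguments are plain version strings such as "1.24.7" or "v1.25.0".
# Only the minor component is compared; the major version is assumed to match.

# Print the minor component of a version string ("v1.24.7" -> "24").
minor_of() (
  v=${1#v}               # drop an optional leading "v"
  v=${v#*.}              # drop the major component and its dot
  echo "${v%%[!0-9]*}"   # keep only the leading digits of what remains
)

# Succeed (exit status 0) when the two minor versions differ by at most one.
within_one_minor() (
  client=$(minor_of "$1")
  cluster=$(minor_of "$2")
  diff=$((client - cluster))
  [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
)

# Example with the versions used in the list above.
within_one_minor 1.23.9 1.24.2 && echo "1.23.9 is fine for a 1.24.x cluster"
within_one_minor 1.26.0 1.24.2 || echo "1.26.0 is too new for a 1.24.x cluster"
```

The client and cluster versions themselves can be read from the output of kubectl version once you have cluster access.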
Optional: Autocomplete
Both helm and kubectl have autocomplete libraries. For power users and beginners alike, it adds a lot of discoverability. This code is for zsh, but bash, fish and powershell are supported as well:
source <(kubectl completion "$(basename "$SHELL")")
source <(helm completion "$(basename "$SHELL")")
See the helm and kubectl docs for how to load these automatically in every new session.
Request permission
By default, Medic teammates do not have EKS access and must file a ticket to request it:
- Create a ticket to get your DNS and Namespace created for EKS, which should match each other. As an example, a mrjones-dev namespace would match the mrjones.dev.medicmobile.org DNS. The ticket should include requesting EKS access to be granted.
- Once the ticket in step one is complete, follow the CLI setup guide.
NB - Security key (e.g. Yubikey) users need to add a TOTP MFA (Time-based, One-Time Password Multi-Factor Authentication) too! CLI requires the TOTP values (6-digit number) and security keys are not supported. Security keys can only be used on web logins.
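The pairing between namespace and DNS name described in the ticket step above can be written down as a small helper. This is only a sketch based on the single mrjones-dev example: the function name dns_for_namespace is hypothetical, and it assumes every namespace follows the USERNAME-dev pattern.

```shell
# Sketch: derive the DNS name that pairs with a namespace of the form USERNAME-dev.
# Assumes the USERNAME-dev / USERNAME.dev.medicmobile.org pattern from the example above.
dns_for_namespace() (
  ns=$1
  case $ns in
    ?*-dev) echo "${ns%-dev}.dev.medicmobile.org" ;;
    *) echo "expected a namespace ending in -dev, got: $ns" >&2; exit 1 ;;
  esac
)

dns_for_namespace mrjones-dev   # prints mrjones.dev.medicmobile.org
```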
First time setup
These steps only need to be run once!
After you have created a ticket per “Request permission” above, you should get a link to sign up for AWS. Click the link and:
Create a new password, ensuring it’s 10+ characters including one alpha (a-z) and one special (~!@#$%^&*_-+=`|\(){}[]:;"'<>,.?/) character.
Set up MFA. In the top-right corner of the browser, there is a drop-down menu with your username @ medic. Click that and then on “My Security Credentials”.
Assign an MFA device and give it the same name as your username. In the AWS web GUI, click your name in the upper right:
- Security Credentials
- scroll down to “Multi-factor authentication (MFA)”
- click “Assign MFA device”
- enter a “Device name” (should match username)
- “Select MFA device” that you’re using
Create Access Keys for Command Line Interface: In AWS web GUI, click your name in upper right -> Security Credentials -> scroll down to “Access keys” -> click “Create access key” -> for use case choose “Command Line Interface” -> click “Next” -> enter description and click “Create access key”
Run aws configure and enter the appropriate access keys during the prompts. Use the eu-west-2 region. It should look like this:
$ aws configure
AWS Access Key ID [None]: <ACCESS-KEY-HERE>
AWS Secret Access Key [None]: <SECRET-HERE>
Default region name [None]: eu-west-2
Default output format [None]:
Run the Update Kubeconfig command, assuming username is mrjones and namespace is mrjones-dev - be sure to replace these with yours:
aws eks update-kubeconfig --name dev-cht-eks --profile mrjones --region eu-west-2
Starting and stopping (aka deleting)
Login with the eks-aws-mfa-login script in the infra repo:
./eks-aws-mfa-login USERNAME TOTP_HERE
Ensure you’re using dev EKS cluster:
kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks
If you get the error no context exists with the name, change use-context to set-context in the command. This will create the entry the first time. Subsequent calls should use use-context.
Create a new values.yaml file by copying this one. Be sure to update these values after you create it:
- <your-project-name> and <your-namespace> - set both to USERNAME-dev, for example mrjones-dev
- <password-value> - put in a strong password - this instance is exposed to the Internet! *
- <subdomain> - your username. For example: mrjones.dev.medicmobile.org
- clusteredCouch_enabled - set to true
* Please note some characters are unsupported in password: :, @, ", ', etc. Be sure to enclose it in quotes "" and do not use spaces in your password. If you use an unsupported character, your deployment will succeed but you won’t be able to log into the CHT instance.
Ensure you have the latest code of the cht-core repo:
git checkout master;git pull origin
Ensure you have node dependencies installed for the cht-deploy script:
cd scripts/deploy;npm install
Run deploy, being sure to update PATH_TO to be where you saved your values.yaml file above:
./cht-deploy -f PATH_TO/values.yaml
Delete it when you’re done:
helm delete USERNAME-dev --namespace USERNAME-dev
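Two of the values.yaml pitfalls mentioned in the steps above (placeholders left unedited, and unsupported password characters) can be caught before running cht-deploy. The following is a rough sketch, not part of the Medic tooling: the function names are hypothetical, it assumes the template placeholders appear literally in angle brackets as listed above, and the password check covers only the characters the text names explicitly, so passing it is not a guarantee.

```shell
# Sketch: sanity checks to run on values.yaml before calling cht-deploy.

# Print any template placeholders that were left unedited.
# Succeeds only if the file is readable and no placeholder remains.
no_placeholders_left() (
  file=$1
  [ -r "$file" ] || { echo "cannot read $file" >&2; exit 1; }
  if grep -nE '<(your-project-name|your-namespace|password-value|subdomain)>' "$file"; then
    exit 1
  fi
)

# Succeed only if the password is non-empty and avoids the characters called
# out above (colon, at sign, double quote, single quote) and spaces.
# The text says "etc.", so other characters may be unsupported as well.
password_ok() {
  case $1 in
    '' | *:* | *@* | *\"* | *\'* | *' '*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example with literal values:
password_ok 'bad:pass' || echo "password contains an unsupported character"

# Possible use before deploying (PATH_TO as in the step above):
#   no_placeholders_left PATH_TO/values.yaml && ./cht-deploy -f PATH_TO/values.yaml
```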
Cloning a Medic hosted instance
Sometimes a Medic teammate will need to run tests on data from an instance hosted in a Medic EKS deployment. When cloning a production instance, use extreme caution as it will have real PII/PHI in it. This includes, but is not limited to:
- Using a secure password
- Only ever sharing credentials over 1Password
- Deleting the instances, volume and snapshot when they’re no longer being used
Otherwise, as Medic has selected AWS as its provider to host production instances, making clones is safe when the above basic security measures are followed.
Overview
The cloning process assumes you have access to EKS and to the snapshots and volumes you wish to clone and create. This is not a permission granted to normal teammates who use EKS, so check with SRE as needed.
After checking your permissions, first find a snapshot of the data you’re wanting to clone. Only production data has automated snapshots, so when cloning a development instance, manually create a snapshot first. After finding the snapshot and its ID (e.g. snap-081d1cc18de16d8c7), create a new volume from this snapshot. Now label the volume so it’s flagged for EKS use. Finally, put your newly created volume ID (e.g. vol-047f57544f4085fb2) in a values.yml file to use with helm and the deploy script.
Read on below for the exact steps on how to do this.
Steps
Note that a number of these steps can be done either on the command line or in the AWS web admin GUI. Do it the way you feel most comfortable!
Always be sure of which context you’re working on! Start off by setting your context to dev-cht-eks:
kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks
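Because working in the wrong context is the main risk here, a script can refuse to continue unless the active context is the dev cluster. This is a minimal sketch with a hypothetical function name, is_dev_context; it only inspects the context string passed to it, so the real lookup with kubectl is shown as a comment.

```shell
# Sketch: succeed only when the given context name points at the dev cluster.
is_dev_context() {
  case $1 in
    */dev-cht-eks) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with the two context names used in this guide:
is_dev_context "arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks" && echo "dev context, ok to continue"
is_dev_context "arn:aws:eks:eu-west-2:720541322708:cluster/prod-cht-eks" || echo "not the dev context, stop"

# Possible use in a script (requires kubectl):
#   is_dev_context "$(kubectl config current-context)" || { echo "wrong context" >&2; exit 1; }
```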
And then follow these steps:
Find the ID of the snapshot by using the production URL to retrieve the ID and date of the latest snapshot. Be sure to replace moh-foo.app with the real URL of the instance (note: it doesn’t matter what context you’re on):
aws ec2 describe-snapshots --region=eu-west-2 --filters "Name=tag:Address,Values='moh-foo.app.medicmobile.org'" | jq '.Snapshots[0]'
This should return JSON like the following, which lets you verify both that the snapshot is current and that it comes from the correct instance before you take the SnapshotId value. This JSON is truncated for brevity:
{
  "Description": "Created for policy: policy-43210483209 schedule: Default Schedule",
  "SnapshotId": "snap-432490821280432092",
  "StartTime": "2024-08-18T15:52:59.831000+00:00",
  "State": "completed",
  "VolumeId": "vol-4392148120483212",
  "VolumeSize": 900,
  "Tags": [
    { "Key": "Address", "Value": "moh-foo.app.medicmobile.org" },
    { "Key": "Name", "Value": "Production: moh-foo.app.medicmobile.org" },
    { "Key": "Description", "Value": "4x foo production for bar" }
  ]
}
Now that you found your snapshot ID, create a volume from it. Being sure to replace snap-432490821280432092 with your ID, call:
aws ec2 create-volume --region eu-west-2 --availability-zone eu-west-2b --snapshot-id snap-432490821280432092
Be sure to grab the VolumeId from the resulting JSON, vol-f9dsa0f9sad09f0dsa in this case:
{
  "AvailabilityZone": "eu-west-2b",
  "CreateTime": "2024-08-23T21:31:27+00:00",
  "Encrypted": false,
  "Size": 900,
  "SnapshotId": "snap-432490821280432092",
  "State": "creating",
  "VolumeId": "vol-f9dsa0f9sad09f0dsa",
  "Iops": 2700,
  "Tags": [],
  "VolumeType": "gp2",
  "MultiAttachEnabled": false
}
Run describe-volumes until that volume has a State of available:
aws ec2 describe-volumes --region eu-west-2 --volume-id vol-f9dsa0f9sad09f0dsa | jq '.Volumes[0].State'
"available"
Once you have that volume created and available, tag it with kubernetes.io/cluster/dev-cht-eks: owned and KubernetesCluster: dev-cht-eks:
aws ec2 create-tags --resources vol-f9dsa0f9sad09f0dsa --tags Key=kubernetes.io/cluster/dev-cht-eks,Value=owned Key=KubernetesCluster,Value=dev-cht-eks
You can verify your tags took effect by calling describe-volumes again:
aws ec2 describe-volumes --region eu-west-2 --volume-id vol-f9dsa0f9sad09f0dsa | jq '.Volumes[0].Tags'
Which should result in this JSON:
[
  { "Key": "kubernetes.io/cluster/dev-cht-eks", "Value": "owned" },
  { "Key": "KubernetesCluster", "Value": "dev-cht-eks" }
]
Switch to the production cluster and then find the subPath of the deployment you made the snapshot from. The COUCH-DB-NAME is usually cht-couchdb, but it can sometimes be cht-couchdb-1 (check ./troubleshooting/list-deployments <your-namespace> if you still don’t know). Including the use-context, the two calls are below. Note that the troubleshooting directory is in the CHT Core repo:
kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/prod-cht-eks
./troubleshooting/get-volume-binding <DEPLOYMENT> <COUCH-DB-NAME> | jq '.subPath'
Which shows the path like this:
"storage/medic-core/couchdb/data"
Create a values.yml file from this template and edit the following fields:
- project_name - likely your username followed by -dev. For example mrjones-dev
- namespace - likely the same as project name, your username followed by -dev. For example mrjones-dev
- chtversion - this should match the version you cloned from
- password - this should match the instance you cloned from
- secret - this should match the instance you cloned from
- user - use medic user
- uuid - this should match the instance you cloned from
- couchdb_node_storage_size - use the same size as the volume you just cloned
- account-id - this should always be 720541322708
- host - this should be your username followed by dev.medicmobile.org. For example mrjones.dev.medicmobile.org
- hosted_zone_id - this should always be Z3304WUAJTCM7P
- preExistingDataAvailable - set this to be true
- dataPathOnDiskForCouchDB - use the subPath you got in the step above. For example storage/medic-core/couchdb/data
- preExistingEBSVolumeID-1 - set this to be the ID of the volume you created above. For example vol-f9dsa0f9sad09f0dsa
- preExistingEBSVolumeSize - use the same size as the volume you just cloned
Deploy this to development per the steps above. NB - Be sure to call kubectl config use-context arn:aws:eks:eu-west-2:720541322708:cluster/dev-cht-eks before you call ./cht-deploy! Always create test instances on the dev cluster.
Login using the user and password set above, which should match the production instance.
When you’re done with this deployment, you can delete it with helm:
helm delete USERNAME-dev --namespace USERNAME-dev
Now that no resources are using the volume, you should delete it. Only delete the snapshot if you created it yourself above; do not delete snapshots you did not create. Be sure to replace vol-f9dsa0f9sad09f0dsa and snap-432490821280432092 with your actual IDs:
aws ec2 delete-volume --region eu-west-2 --volume-id vol-f9dsa0f9sad09f0dsa
aws ec2 delete-snapshot --snapshot-id snap-432490821280432092
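The step above that repeats describe-volumes until the State is available can be automated with a small polling loop. The sketch below is an illustration rather than part of the Medic tooling: poll_until and fake_state are hypothetical names, and the lookup command is passed in as an argument so the example runs without AWS access. The AWS CLI also offers aws ec2 wait volume-available, which may be simpler if it fits your workflow.

```shell
# Sketch: re-run a command until it prints the wanted value or attempts run out.
# Usage: poll_until WANTED MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
poll_until() (
  wanted=$1; attempts=$2; delay=$3
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$("$@" 2>/dev/null)        # run the supplied lookup command
    if [ "$state" = "$wanted" ]; then
      exit 0                         # wanted state reached
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1                             # gave up after MAX_ATTEMPTS tries
)

# Stand-in for the real lookup so the example runs without AWS access.
fake_state() { echo available; }
poll_until available 3 0 fake_state && echo "volume is available"

# With the real tools, the lookup could be wrapped like this (jq -r drops the quotes):
#   volume_state() { aws ec2 describe-volumes --region eu-west-2 --volume-ids "$1" | jq -r '.Volumes[0].State'; }
#   poll_until available 60 5 volume_state vol-f9dsa0f9sad09f0dsa
```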
References and Debugging
More information on the cht-deploy script is available in the CHT Core GitHub repository, which includes specifics of the values.yaml file and more details about the debugging utilities listed below.
Debugging
A summary of the utilities in the cht-core/scripts/deploy directory, assuming the mrjones-dev namespace:
- list all resources:
./troubleshooting/list-all-resources mrjones-dev
- view logs, assuming cht-couchdb-1 returned from prior command:
./troubleshooting/view-logs mrjones-dev cht-couchdb-1
- describe deployment, assuming cht-couchdb-1 returned from 1st command:
./troubleshooting/describe-deployment mrjones-dev cht-couchdb-1
- list all deployments:
./troubleshooting/list-deployments mrjones-dev
Getting shell
Sometimes you need to look at files and other key pieces of data that are not available with the current troubleshooting/view-logs script. In this case, getting an interactive shell on the pod can be helpful.
- First, get a list of pods for your namespace:
kubectl -n NAMESPACE get pods
- After finding the pod you’re interested in, connect to the pod to get a shell:
kubectl -n NAMESPACE exec -it PODNAME -c CONTAINERNAME -- /bin/bash
invalid apiVersion Error
If you get this error when running kubectl version:
exec plugin: invalid apiVersion "client.authentication.k8s.io/v1alpha1"
You might be using a version of the Kubernetes API client.authentication.k8s.io which is not supported by your kubectl client. This can sometimes happen in EKS clusters if the aws cli is an older version; in most cases you need at least version 2 of the aws cli. Check the version by running aws --version and note that version 2 cannot be installed through pip (see the Command Line section above for installation instructions).
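The version check described above can be scripted by parsing the first token of the aws --version output. This is a small sketch with a hypothetical function name, aws_cli_major; it assumes the output starts with aws-cli/MAJOR.MINOR.PATCH, and the sample version strings are illustrative only.

```shell
# Sketch: extract the major version from the output of `aws --version`,
# assumed to look like "aws-cli/2.15.30 Python/3.11.8 ...".
aws_cli_major() (
  first=${1%% *}          # first token, e.g. "aws-cli/2.15.30"
  ver=${first#aws-cli/}   # version part, e.g. "2.15.30"
  echo "${ver%%.*}"       # major version, e.g. "2"
)

# Example with an illustrative version 1 string:
[ "$(aws_cli_major 'aws-cli/1.22.34 Python/3.10.12 Linux/5.15 botocore/1.23.34')" -ge 2 ] \
  || echo "aws cli is older than version 2, upgrade it"

# Possible use with the real tool (some older releases print the version to stderr):
#   aws_cli_major "$(aws --version 2>&1)"
```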
SRE Steps for granting users access to a namespace
If you’re on the SRE/Infra team and want to grant a Medic teammate access to EKS:
- Tools required: aws, eksctl, kubectl
- Create AWS User.
- Attach IAM policy: Force_MFA and share auto-generated password safely
- Have user log in and finish MFA, access key setup
- SRE adds the user to mfa-required-users group
- Add the namespaces and users to tf/eks/dev/access/main.tf
- Run tofu apply in the folder tf/eks/dev/access
- Create identitymapping if needed:
Reading the AWS guide for principal access may help here!