This guide will walk you through setting up a deployment of CHT Sync with the CHT using Docker. This path is recommended if you host the CHT with Docker.
Before you begin, you will need:

- docker, or a current version of Docker Desktop, both of which include docker compose. Note that the older docker-compose is no longer supported.
- A clone of the CHT Sync repository (git clone https://github.com/medic/cht-sync).
- In the cht-sync folder, copy the values from the env.template file to a .env file. For more information, see the references on the Environment variables page.
- Configure the COUCHDB_* environment variables to connect to your CouchDB instance. For production CHT Core deployments, the port will most likely need to be set to 443, like this: COUCHDB_PORT=443. This is because CHT Core uses an nginx reverse proxy on port 443, instead of the default 5984 port used in a stand-alone CouchDB instance, which env.template assumes. See the example snippet below.
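As an illustration, the CouchDB settings for a production CHT instance might look like the snippet below. The variable names and values here are placeholders; check env.template for the exact set of COUCHDB_* variables your version expects.

# Illustrative values only; copy the exact variable names from env.template
COUCHDB_HOST=cht.example.com
COUCHDB_PORT=443
COUCHDB_USER=medic
COUCHDB_PASSWORD=<admin-password>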
The first time you start the containers, Docker may need to download and build images, which can take a few minutes. The build output ends with a line like #8 DONE 0.0s and you are returned to the command line. Be patient!

Running CHT Sync uses docker compose profiles to organize different sets of services.
The following profiles are available:
- local - for local model development
- production - recommended setup for production
- test - for automated tests

The local profile starts couch2pg, PostgreSQL, pgAdmin, and dbt. It assumes you have a CouchDB instance running and that you have updated the .env CouchDB variables accordingly.
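Before starting the containers, a quick connectivity check can save time. This is only a sketch: it assumes your CHT instance proxies CouchDB over https at the root path, and the host and credentials below are placeholders to replace with your .env values. A JSON response describing the medic database indicates the host, port, and credentials are usable.

# Placeholder URL and credentials; adjust to match your .env values
curl -s "https://medic:<admin-password>@cht.example.com:443/medic"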
The dbt container needs a project to run. To develop and test models locally, set up a dbt project and set the path to it in the DBT_LOCAL_PATH environment variable in .env. This must be a valid directory containing your CHT Pipeline models; the local profile cannot be used without it. When running, the dbt container will then use the local models in the path specified in DBT_LOCAL_PATH, without needing to query a remote git repository.
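For illustration, assuming your CHT Pipeline models live in a local checkout at ../cht-pipeline (a placeholder path), the .env entry could look like this:

# Hypothetical path to a local dbt project containing your CHT Pipeline models
DBT_LOCAL_PATH=../cht-pipeline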
Run the Docker containers locally and wait for every container to be up and running:
docker compose --profile local up -d
You can verify this command worked by running docker compose --profile local ps --format "{{.Names}}\t{{.Status}}". It should show four containers running: couch2pg, dbt, PostgreSQL, and pgAdmin:
cht-sync-couch2pg-1 Up About a minute
cht-sync-dbt-local-1 Up About a minute
cht-sync-pgadmin-1 Up About a minute
cht-sync-postgres-1 Up About a minute
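To confirm documents are actually being copied from CouchDB, you can follow the couch2pg logs; the service name matches the container names listed above.

docker compose --profile local logs -f couch2pg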
When developing dbt models, it is helpful to test changes locally before committing them to a remote repository.
The dbt container will run the models in the path specified in DBT_LOCAL_PATH.
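To force a fresh run against your latest local changes, one option is to restart the dbt container and follow its logs. This is a sketch that assumes the service is named dbt-local, matching the container name shown above.

docker compose --profile local restart dbt-local
docker compose --profile local logs -f dbt-local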
The production profile starts couch2pg and one dbt container. It assumes you have an external CouchDB instance and Postgres DB. The credentials and settings for these databases are configured in .env.
Run the Docker containers with the production profile and wait for every container to be up and running:
docker compose --profile production up -d
You can verify this command worked by running docker ps. It should show three containers running: dbt, couch2pg, and bastion.
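To check that documents are landing in Postgres, you can count the rows in the raw table from any machine that can reach the database. This is only a sketch: it assumes the psql client is installed and reuses the schema and table names configured through POSTGRES_SCHEMA and POSTGRES_TABLE in .env.

# Load connection settings from .env (works for simple KEY=value files), then count synced documents
set -a; . ./.env; set +a
PGPASSWORD="$POSTGRES_PASSWORD" psql -h "$POSTGRES_HOST" -p "${POSTGRES_PORT:-5432}" -U "$POSTGRES_USER" \
  -d "$POSTGRES_DB" -c "SELECT COUNT(*) FROM ${POSTGRES_SCHEMA}.${POSTGRES_TABLE};"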
In production setups with large tables, it can be helpful to tune how dbt runs. To use threads or batching, set the corresponding environment variables in .env:

DBT_THREAD_COUNT=3
DBT_BATCH_SIZE=100000
To use multiple dbt containers, add an additional docker-compose file with different dbt containers and use profiles to control which services run. Use the DBT_SELECTOR environment variable to change which models each container runs. For example, a docker-compose.dbt-tags.yml file could look like this:
name: ${COMPOSE_PROJECT_NAME:-cht-sync}

x-dbt-base: &dbt-common
  build: ./dbt/
  working_dir: /dbt/
  environment: &dbt-env
    POSTGRES_HOST: ${POSTGRES_HOST}
    POSTGRES_PORT: ${POSTGRES_PORT:-5432}
    POSTGRES_USER: ${POSTGRES_USER}
    POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    POSTGRES_DB: ${POSTGRES_DB}
    POSTGRES_TABLE: ${POSTGRES_TABLE}
    POSTGRES_SCHEMA: ${POSTGRES_SCHEMA}
    ROOT_POSTGRES_SCHEMA: ${POSTGRES_SCHEMA}
    DATAEMON_INTERVAL: ${DATAEMON_INTERVAL}
    DBT_THREAD_COUNT: ${DBT_THREAD_COUNT}
    DBT_BATCH_SIZE: ${DBT_BATCH_SIZE}

services:
  dbt-base-package:
    <<: *dbt-common
    profiles:
      - tags
    environment:
      <<: *dbt-env
      DBT_SELECTOR: package:cht_pipeline_base
      CHT_PIPELINE_BRANCH_URL: ${CHT_PIPELINE_BRANCH_URL}

  dbt-tag-one:
    <<: *dbt-common
    profiles:
      - tags
    environment:
      <<: *dbt-env
      DBT_SELECTOR: tag:tag-one
      CHT_PIPELINE_BRANCH_URL: ${CHT_PIPELINE_BRANCH_URL}

  dbt-tag-two:
    <<: *dbt-common
    profiles:
      - tags
    environment:
      <<: *dbt-env
      DBT_SELECTOR: tag:tag-two
      CHT_PIPELINE_BRANCH_URL: ${CHT_PIPELINE_BRANCH_URL}
Then start the services, enabling both compose files and the relevant profiles:

docker compose --profile tags --profile production -f docker-compose.yml -f docker-compose.dbt-tags.yml up -d
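You can confirm which services those profiles enable, and that they are all up, using the same flags with the standard compose subcommands:

docker compose --profile tags --profile production -f docker-compose.yml -f docker-compose.dbt-tags.yml config --services
docker compose --profile tags --profile production -f docker-compose.yml -f docker-compose.dbt-tags.yml ps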
When you are done using the services, you can clean everything up by running down. For example, using the local profile, the command should look like:

docker compose --profile local down

To remove all the data volumes as well, add -v at the end of this command.
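For instance, to stop the local profile services and delete their data volumes in one step:

docker compose --profile local down -v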
To upgrade to a newer version of CHT Sync with Docker, stop all containers and pull the newer version. If there are any database changes that require a migration script, the major version will be changed and the changes detailed below. After running any migration scripts, restart the containers with the new docker-compose.yml:
docker compose --profile production up -d
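Put together, an upgrade might look like the following sketch; it assumes you deploy the production profile from a clone of the cht-sync repository.

# Stop the running services
docker compose --profile production down
# Fetch the newer version of cht-sync and refresh its images
git pull
docker compose --profile production pull
docker compose --profile production build
# Run any required migration scripts (see the notes below), then restart
docker compose --profile production up -d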
V2 added the source column to the couchdb table, and several other columns to the metadata table. To add these columns, log in to the database and run this SQL:
ALTER TABLE _dataemon ADD COLUMN IF NOT EXISTS manifest jsonb;
ALTER TABLE _dataemon ADD COLUMN IF NOT EXISTS dbt_selector text;
ALTER TABLE couchdb ADD COLUMN IF NOT EXISTS source varchar;
CREATE INDEX IF NOT EXISTS source ON couchdb(source);
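One way to apply these statements is with the psql client, sketched below assuming the statements above are saved to a file named v2-migration.sql (a placeholder name) and that the connection settings come from .env. Depending on your setup, you may also need to point search_path at the schema configured in POSTGRES_SCHEMA, as shown here.

# Placeholder connection values sourced from .env; adjust to your deployment
set -a; . ./.env; set +a
PGOPTIONS="-c search_path=${POSTGRES_SCHEMA}" PGPASSWORD="$POSTGRES_PASSWORD" \
  psql -h "$POSTGRES_HOST" -p "${POSTGRES_PORT:-5432}" -U "$POSTGRES_USER" -d "$POSTGRES_DB" -f v2-migration.sql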
In V2, the commands for running CHT Sync have changed; a profile is required, as described above. When testing locally with the local profile, the DBT_LOCAL_PATH environment variable must be set. V2 also adds new features for tuning dbt; to use batching, threads, or separate dbt processes, set the corresponding environment variables in .env, as described above.