CHT Sync

Data synchronization tools to enable analytics

Overview

CHT Sync is an integrated solution that synchronizes data between CouchDB and PostgreSQL for analytics. It combines several technologies to achieve this synchronization and provides an efficient workflow for data processing and visualization. Synchronization occurs in near real time, so the data displayed on dashboards is up to date.

Read more about setting up CHT Sync.

CHT Sync replicates data from CouchDB to PostgreSQL in near real time. It listens for changes in the CHT database and updates the analytics database accordingly. It has no user interface and is not designed to be accessed directly by users. It is designed to run on the same server as the CHT, but it can run on a separate server if necessary.
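The listen-and-update behaviour described above can be pictured with a small in-memory sketch. This is not CHT Sync's actual code: `apply_changes`, the integer `seq` values, and the dict standing in for the PostgreSQL table are all assumptions made for illustration (real CouchDB sequence identifiers are opaque strings).

```python
import json


def apply_changes(changes, table, since=0):
    """Upsert every change newer than `since` into `table`.

    `changes` is a list of dicts with hypothetical keys `seq`, `id` and `doc`.
    `table` is a dict standing in for the analytics table (id -> JSON text).
    Returns the sequence number of the last change applied (the checkpoint).
    """
    last_seq = since
    for change in changes:
        if change["seq"] <= since:
            continue  # already replicated in an earlier pass
        # Store the whole document as JSON text, keyed by document id,
        # so a later revision of the same document overwrites the earlier one.
        table[change["id"]] = json.dumps(change["doc"])
        last_seq = change["seq"]
    return last_seq


# Example: two passes over a growing feed, resuming from the checkpoint.
feed = [
    {"seq": 1, "id": "a", "doc": {"type": "person"}},
    {"seq": 2, "id": "b", "doc": {"type": "data_record"}},
]
analytics_table = {}
checkpoint = apply_changes(feed, analytics_table)
feed.append({"seq": 3, "id": "a", "doc": {"type": "person", "name": "x"}})
checkpoint = apply_changes(feed, analytics_table, since=checkpoint)
```

Keeping a checkpoint means each pass only touches documents that changed since the previous one, which is what makes near real-time replication cheap enough to run continuously.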

CHT Sync writes all new data to a single PostgreSQL table with a jsonb column, a shape that is not very useful for analytics on its own. cht-pipeline contains a set of SQL queries that transform the data in the jsonb column into a more useful format. It uses dbt to define models that are materialized as PostgreSQL tables or views, which makes the data easier to query from the analytics platform of choice.
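To make the idea of the transformation concrete, here is a minimal sketch of flattening raw JSON documents into relational rows. cht-pipeline does this with SQL and dbt; the sketch uses Python only for illustration, and the function name `person_rows`, the output column names, and the document fields it reads (`type`, `_id`, `name`, `parent._id`) are assumptions rather than the actual model definitions.

```python
import json


def person_rows(raw_rows):
    """Turn raw rows holding JSON text into flat, column-like dicts.

    `raw_rows` is a list of dicts with a hypothetical `doc` key containing
    the document as JSON text, mimicking a single-column raw table.
    Only documents whose `type` is "person" are kept.
    """
    rows = []
    for raw in raw_rows:
        doc = json.loads(raw["doc"])
        if doc.get("type") != "person":
            continue  # other document types would get their own model
        rows.append(
            {
                "uuid": doc.get("_id"),
                "name": doc.get("name"),
                # Nested fields are pulled up into their own column.
                "parent_uuid": (doc.get("parent") or {}).get("_id"),
            }
        )
    return rows


raw_table = [
    {"doc": json.dumps({"_id": "p1", "type": "person", "name": "Ana", "parent": {"_id": "c1"}})},
    {"doc": json.dumps({"_id": "r1", "type": "data_record"})},
]
people = person_rows(raw_table)
```

Once the nested fields sit in ordinary columns, a dashboard tool can filter and join on them without needing to know the JSON structure.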

couch2pg

couch2pg streams data from CouchDB and forwards it to PostgreSQL, ensuring near real-time updates.

PostgreSQL

A free and open-source SQL database used for analytics queries. See more at the PostgreSQL site.

dbt

Once the data is synchronized and stored in PostgreSQL, it is transformed using predefined dbt models from cht-pipeline. dbt ingests the raw JSON data from the PostgreSQL database (jsonb column) and normalizes it into a relational schema to make it easier to query. A daemon runs the dbt models and updates the database whenever the data in the jsonb column changes.
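The daemon behaviour described above can be sketched as a polling loop that re-runs the models only when something has changed. How the real daemon detects changes is not stated here, so the marker-based check below is an assumption, as are the names `run_when_changed`, `get_latest_marker` and `run_models`.

```python
import time


def run_when_changed(get_latest_marker, run_models, last_marker=None,
                     max_polls=3, interval_s=0.0):
    """Poll a change marker and call `run_models()` each time it advances.

    `get_latest_marker` is a hypothetical callable returning something that
    changes when new raw data arrives (for example, a latest-sequence value).
    `run_models` is a hypothetical callable that rebuilds the models.
    Returns the last marker seen and the number of runs triggered.
    """
    runs = 0
    for _ in range(max_polls):
        marker = get_latest_marker()
        if marker != last_marker:
            run_models()  # only rebuild when the raw data has moved on
            last_marker = marker
            runs += 1
        time.sleep(interval_s)
    return last_marker, runs


# Example: the marker changes on the first and third poll, so two runs occur.
markers = iter([1, 1, 2])
calls = []
last, run_count = run_when_changed(lambda: next(markers), lambda: calls.append("run"))
```

In a real deployment `run_models` might shell out to `dbt run` and the loop would be unbounded; the `max_polls` limit exists only so the sketch terminates.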

Data Visualization

We recommend Apache Superset as the data visualization tool. Superset is a free, open-source platform for data exploration and data visualization.

CHT Core Framework & CouchDB

For more information on these technologies, see CHT Core overview.

