Advanced Data Migration Guide

Advanced Data Migration Guide

Data migration is just one step in the multi-step process of transitioning to CommCare. For larger or more complex systems, it can be the most intensive. To support these cases, we created the CommCare Advanced Data Migration Guide.

This guide focuses on complex migration processes that go beyond the basic activities covered in the CommCare Transition Guide, including the use of external databases and data transformation tools.

Please note that it references features available on higher-tier features that go beyond the $100 CommCare Standard Plan.

General Overview & Recommendation

Partners transitioning from another solution to CommCare may feel the urge to transfer all their historical data into CommCare. However, in most cases, they are primarily concerned with the most recent updated values of the entities they track. This facilitates a smooth transition of future data collection from their previous tool to CommCare.

In these instances, Dimagi strongly advises preserving all historical data in secure environments such as cloud-based data warehouses and only importing the latest data into CommCare to ensure uninterrupted future workflows. Historical data can be integrated with BI tools along with data from CommCare for business intelligence purposes.

The architecture of technology systems varies, leading to differences in how data is processed across platforms. On CommCare, for example, any data collected via CommCare apps in the field are considered “form submissions” - in other words, frontline workers (called mobile users/workers in our ecosystem) login to their CommCare apps and open individual forms and submit them after collecting data for clients or entities they are tracking. Once these forms are submitted, depending on what data model has been designed for that specific workflow, CommCare can create new cases, update existing ones or simply save these submissions on the server. The same underlying principles apply for web apps (CommCare apps running on the browser with no offline capabilities) submissions as well. In other words, data submitted on CommCare may have multiple dependencies.

As partners transition to CommCare, they are usually interested in shifting their current workflows to the new system after a certain cutoff date where they use the last updated values for the entities they are tracking, upload them to CommCare and continue tracking them through CommCare for the remaining life cycle of their projects.

See the General Transition Guide | 5. Migrate Existing Data into CommCare for instructions on migrating last updated values to CommCare.

The following sections provide a walkthrough for a historical migration of all data (including form content and case properties) into CommCare.

Understand Your Existing Data Structure

Data can be structured in numerous ways in other tools. Depending on the complexity of this structure, the level of effort involved in translating it to CommCare’s data architecture can vary from high to low. If single entities like clients or households are being tracked, setting up the similar structure in CommCare would be easier compared to a hierarchical structure with multiple related entities and repeat data.

Let’s look at a complex structure to understand how it can be replicated on CommCare. This will also inform approaches for more simpler structures.

Sample Data Structure in Other Tool

Here's a summarised table based on the diagram, detailing the entities, their relationships, and key indicators:

Entity

Relationship

Indicators

Household

Top Level Entity

Household ID, Location

Household Member

Linked to Household

Member ID, Name, Age, Relation to Head, Gender

Visit

Linked to Household Member

Visit ID, Visit Count, Health Score, Weight, Vaccinated      

Medicines

Linked to Visit

Medicine Name, Dosage

  • Household is the top-level entity, identifying the group with unique identifiers and location.

  • Household Member is linked to the Household, with specific details about each member.

  • Visit records are associated with each Household Member, tracking various health-related indicators.

  • Medicines and Dosage are specific to each Visit, detailing the medical treatment received.

Setup in CommCare

Entity → Case Type

Relationship → Parent to Child Case Linking

Indicators → Case Properties

Household

Parent Case

Household ID, Location

Household Member

Child Case of Household

Member ID, Name, Age, Relation to Head, Gender

Visit
→ Medicines (repeat group in Visit)

Child Case of Household Member

Visit ID, Visit Count, Health Score, Weight, Vaccinated      

Medicines

Repeat Group Data in Visit

Medicine Name, Dosage

The output of this setup will look like:

Case Hierarchy in CommCare showing parent and child cases linked together

Using the documentation on the help site, this can be easily built as an application in a project space on CommCare. See more information on Parent and Child cases to learn about how to build hierarchical case relationships in CommCare.

Sample Application Structure for Example Above

Households Module:

Household Registration and Household Member Registration

Household Members Module:

Creation of visits case which includes repeat group data for medicines

Visits Module:

Followup on Previous Visits

Migrating Data Using Azure SQL Database and PowerAutomate

This tutorial explains how data from other tools can be programmatically migrated to CommCare using Azure SQL Database and Power Automate.

Selecting the Right Tool

Here's a simplified table listing tools that can be used for this purpose, in order of technical complexity.

Tool

Description

Technical Complexity

Ideal For

Power Automate

Cloud-based service for automating workflows across applications and services.

Low

Low-code/no-code scenarios, simple to moderate workflow automation.

Azure Data Factory

Cloud-based data integration service for creating data-driven workflows.

Moderate

Moderate to complex workflows, data transformation, cloud integrations.

Azure Data Factory

Combined category for complex workflows and data integration tools.

High

Complex ETL processes, data quality, flexible and customizable workflows, large-scale enterprises.

Step-by-Step Instructions

Setup in SQL Database

The ideal approach to migrating data from other sources to CommCare requires standardisation of storage in a structured, secure and scalable tool like a SQL DB. As different tools may have different options for data extraction, this tutorial assumes that relevant steps would be taken to load all data to be migrated into structured SQL tables.

Sample tables in SQL

Required Fields in SQL Tables

  1. case_id: The case_id column is a UUID that is required for creating a case in CommCare. While this can be generated later as well, it would be easiest to generate UUIDs in SQL for each entity mapped to a unique case type above.

  2. parent_case_id: For any child case data, the parent case’s case_id should be mapped to the relevant SQL table as well.

  3. Indicators: While column headers do not need to match, it should be ensured that all indicators that are mapping to case properties in CommCare are present in the table. Similarly, any fields that need to be part of form submission and not necessarily case properties should also be in the table.

  4. owner_id (optional): It is recommended that users and/or organization/groups are created in CommCare and bulk reassignment done in CommCare once data has been migrated successfully. However, if needed, this column can be populated as well with the correct IDs from CommCare.

Setup in Power Automate

For a full overview of migrating your data into CommCare using PowerAutomate, please see Power Automate for Data Migration

Create individual flows for every case type. In this example there are three flows:

Setup in Azure Data Factory

For a full overview of migrating your data into CommCare using PowerAutomate, please see Azure Data Factory for Data Migration

Create individual data flows and pipelines for every case type. In this example there are three sections:

  • CommCare Migration - Parent Cases: These pipelines contains a data flow for the parent case type data wherein data is transformed and final XML content is saved to the SQL DB and actions through which this data is submitted to CommCare.

  • CommCare Migration - Child Cases: These pipelines contains data flows for the parent case block and household member data block wherein data is transformed and final XML content is saved to the SQL DB and actions through which this data is submitted to CommCare.

  • CommCare Migration - Repeat Groups: These pipelines contains a data flow for the visit case type data (child cases to household members) along with repeat group data for medicines wherein data is transformed and final XML content is saved to the SQL DB and actions through which this data is submitted to CommCare.

The aim of these flows is to show the general approach taken to receive data into CommCare from a SQL DB source. Flows will need tweaking based on individual requirements and data structures.