Advanced Data Migration Guide
Data migration is just one step in the multi-step process of transitioning to CommCare. For larger or more complex systems, it can be the most intensive. To support these cases, we created the CommCare Advanced Data Migration Guide.
This guide focuses on complex migration processes that go beyond the basic activities covered in the CommCare Transition Guide, including the use of external databases and data transformation tools.
Please note that it references features available on higher-tier features that go beyond the $100 CommCare Standard Plan.
General Overview & Recommendation
Partners transitioning from another solution to CommCare may feel the urge to transfer all their historical data into CommCare. However, in most cases, they are primarily concerned with the most recent updated values of the entities they track. This facilitates a smooth transition of future data collection from their previous tool to CommCare.
In these instances, Dimagi strongly advises preserving all historical data in secure environments such as cloud-based data warehouses and only importing the latest data into CommCare to ensure uninterrupted future workflows. Historical data can be integrated with BI tools along with data from CommCare for business intelligence purposes.
The architecture of technology systems varies, leading to differences in how data is processed across platforms. On CommCare, for example, any data collected via CommCare apps in the field are considered “form submissions” - in other words, frontline workers (called mobile users/workers in our ecosystem) login to their CommCare apps and open individual forms and submit them after collecting data for clients or entities they are tracking. Once these forms are submitted, depending on what data model has been designed for that specific workflow, CommCare can create new cases, update existing ones or simply save these submissions on the server. The same underlying principles apply for web apps (CommCare apps running on the browser with no offline capabilities) submissions as well. In other words, data submitted on CommCare may have multiple dependencies.
As partners transition to CommCare, they are usually interested in shifting their current workflows to the new system after a certain cutoff date where they use the last updated values for the entities they are tracking, upload them to CommCare and continue tracking them through CommCare for the remaining life cycle of their projects.
See the General Transition Guide | 5. Migrate Existing Data into CommCare for instructions on migrating last updated values to CommCare.
The following sections provide a walkthrough for a historical migration of all data (including form content and case properties) into CommCare.
Understand Your Existing Data Structure
Data can be structured in numerous ways in other tools. Depending on the complexity of this structure, the level of effort involved in translating it to CommCare’s data architecture can vary from high to low. If single entities like clients or households are being tracked, setting up the similar structure in CommCare would be easier compared to a hierarchical structure with multiple related entities and repeat data.
Let’s look at a complex structure to understand how it can be replicated on CommCare. This will also inform approaches for more simpler structures.
Sample Data Structure in Other Tool
Here's a summarised table based on the diagram, detailing the entities, their relationships, and key indicators:
Entity | Relationship | Indicators |
Household | Top Level Entity | Household ID, Location |
Household Member | Linked to Household | Member ID, Name, Age, Relation to Head, Gender |
Visit | Linked to Household Member | Visit ID, Visit Count, Health Score, Weight, Vaccinated |
Medicines | Linked to Visit | Medicine Name, Dosage |
Household is the top-level entity, identifying the group with unique identifiers and location.
Household Member is linked to the Household, with specific details about each member.
Visit records are associated with each Household Member, tracking various health-related indicators.
Medicines and Dosage are specific to each Visit, detailing the medical treatment received.
Setup in CommCare
Entity → Case Type | Relationship → Parent to Child Case Linking | Indicators → Case Properties |
Household | Parent Case | Household ID, Location |
Household Member | Child Case of Household | Member ID, Name, Age, Relation to Head, Gender |
Visit | Child Case of Household Member | Visit ID, Visit Count, Health Score, Weight, Vaccinated |
Medicines | Repeat Group Data in Visit | Medicine Name, Dosage |
The output of this setup will look like:
Using the documentation on the help site, this can be easily built as an application in a project space on CommCare. See more information on Parent and Child cases to learn about how to build hierarchical case relationships in CommCare.
Sample Application Structure for Example Above
Households Module: Household Registration and Household Member Registration Household Members Module: Creation of visits case which includes repeat group data for medicines Visits Module: Followup on Previous Visits |
Migrating Data Using Azure SQL Database and PowerAutomate
This tutorial explains how data from other tools can be programmatically migrated to CommCare using Azure SQL Database and Power Automate.
Selecting the Right Tool
Here's a simplified table listing tools that can be used for this purpose, in order of technical complexity.
Tool | Description | Technical Complexity | Ideal For |
Power Automate | Cloud-based service for automating workflows across applications and services. | Low | Low-code/no-code scenarios, simple to moderate workflow automation. |
Azure Data Factory | Cloud-based data integration service for creating data-driven workflows. | Moderate | Moderate to complex workflows, data transformation, cloud integrations. |
Azure Data Factory | Combined category for complex workflows and data integration tools. | High | Complex ETL processes, data quality, flexible and customizable workflows, large-scale enterprises. |
Step-by-Step Instructions
Setup in SQL Database
The ideal approach to migrating data from other sources to CommCare requires standardisation of storage in a structured, secure and scalable tool like a SQL DB. As different tools may have different options for data extraction, this tutorial assumes that relevant steps would be taken to load all data to be migrated into structured SQL tables.
Sample tables in SQL
Required Fields in SQL Tables
|
Setup in Power Automate
For a full overview of migrating your data into CommCare using PowerAutomate, please see Power Automate for Data Migration
Create individual flows for every case type. In this example there are three flows:
CommCare Migration - Parent Cases: This flow will receive data from SQL, transform and make API calls to submit forms to CommCare for all household (parent case) data.
CommCare Migration - Child Cases: This flow will receive data from SQL, transform and make API calls to submit forms to CommCare for all household members (child case) data.
CommCare Migration - Repeat Groups: This flow will receive data from SQL, transform and make API calls to submit forms to CommCare for all visit (child cases to household members) data along with repeat group data within the visit data for medicines.
Setup in Azure Data Factory
For a full overview of migrating your data into CommCare using PowerAutomate, please see Azure Data Factory for Data Migration
Create individual data flows and pipelines for every case type. In this example there are three sections:
CommCare Migration - Parent Cases: These pipelines contains a data flow for the parent case type data wherein data is transformed and final XML content is saved to the SQL DB and actions through which this data is submitted to CommCare.
CommCare Migration - Child Cases: These pipelines contains data flows for the parent case block and household member data block wherein data is transformed and final XML content is saved to the SQL DB and actions through which this data is submitted to CommCare.
CommCare Migration - Repeat Groups: These pipelines contains a data flow for the visit case type data (child cases to household members) along with repeat group data for medicines wherein data is transformed and final XML content is saved to the SQL DB and actions through which this data is submitted to CommCare.
The aim of these flows is to show the general approach taken to receive data into CommCare from a SQL DB source. Flows will need tweaking based on individual requirements and data structures.