Case Deduplication in CommCare

The CommCare case deduplication feature lets organizations identify and remove duplicate cases. Cleaner case data makes project operations more accurate and reliable and supports better-informed decision-making.

Permissions Required

A user must have "Full Organizational Access" and "Data" permissions enabled in order to use this feature.

How it works

The deduplication feature lets you configure your own rules for identifying duplicate cases and then explore the duplicates within the Case List Explorer. You can then use data cleaning techniques to edit, archive, or delete cases within CommCare.

You can configure as many rules as you need. A rule runs whenever a case update in your project space matches the rule's criteria. For example, if a rule finds duplicates where the "name" property is the same, then any case update that changes the "name" property causes the rule to re-run and find all potential duplicates. In practice, this keeps the deduplication results dynamic and up to date.
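As a rough illustration of the re-run behavior described above, the following Python sketch checks whether a case update touches any property that a rule matches on. It is not CommCare's implementation; the function name rule_should_rerun and the data shapes are hypothetical and exist only for this example.

```python
def rule_should_rerun(rule_properties, updated_properties):
    """Return True when a case update touches any property the rule matches on.

    Hypothetical helper for illustration only; not part of CommCare.
    rule_properties: list of case property names the rule compares.
    updated_properties: dict of property name -> new value from a case update.
    """
    return bool(set(rule_properties) & set(updated_properties))


# A rule matching on "name" re-runs when "name" changes...
print(rule_should_rerun(["name"], {"name": "Jaya"}))
# ...but not when only an unrelated property changes.
print(rule_should_rerun(["name"], {"village": "Example Village"}))
```

The first call prints True and the second prints False, mirroring the "name" example in the text.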

Creating a Deduplication Rule

To begin creating a deduplication rule, navigate to Data → Deduplicate Cases.


From here, click on the Add Deduplication Rule button.

Basic Information

Give your deduplication rule a name that is clear and easy to understand for future reference.

Rule Criteria

Case Type:

  • Choose a case type to check for duplicates. Each rule checks a single case type; you cannot choose more than one.

  • All case types within the project space are shown in the drop down menu.

Match Type:

  • This setting tells CommCare HQ whether all of the case properties you select must match, or whether a match on any one of them is enough.

  • "All" means that if you set three case properties, all three must match for a case to be considered a duplicate.

  • "Any" means that if you set three case properties, a match on at least one of the three is enough for a case to be considered a duplicate.

  • An example is illustrated below:

  • Imagine you have cases in your system with the following case properties:

    • Name

    • Date of Birth

    • Location of Birth

  • And you have two cases that have different Name values but the same Date of Birth and Location of Birth.

  • If you use the "All" match type for all three case properties, the system will NOT identify these two cases as duplicates because not all of the case properties are the same (Name is not the same).

  • If you use the "Any" match type for all three case properties, the system WILL identify these two cases as duplicates because at least one of the case properties matches (Date of Birth and Location of Birth are the same).
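The difference between the two match types can be sketched in a few lines of Python. This is an illustration only, not how CommCare HQ implements matching: the is_duplicate function, the property names, and the sample values are hypothetical, and the sketch assumes that a blank or missing value never counts as a match.

```python
def is_duplicate(case_a, case_b, properties, match_type):
    """Compare two cases on the given properties (hypothetical helper).

    match_type "ALL": every listed property must match.
    match_type "ANY": at least one listed property must match.
    Assumption: blank or missing values never count as a match.
    """
    def matches(prop):
        value = case_a.get(prop)
        return value not in (None, "") and value == case_b.get(prop)

    results = [matches(prop) for prop in properties]
    if match_type == "ALL":
        return all(results)
    if match_type == "ANY":
        return any(results)
    raise ValueError(f"Unknown match type: {match_type!r}")


# Made-up sample cases: different names, same date and location of birth.
properties = ["name", "date_of_birth", "location_of_birth"]
case_1 = {"name": "Jaya", "date_of_birth": "1990-01-01", "location_of_birth": "Delhi"}
case_2 = {"name": "Priya", "date_of_birth": "1990-01-01", "location_of_birth": "Delhi"}

print(is_duplicate(case_1, case_2, properties, "ALL"))
print(is_duplicate(case_1, case_2, properties, "ANY"))
```

With "ALL" the sketch prints False because the names differ; with "ANY" it prints True because the date and location of birth match.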

Case Property

  • This is where you choose the case property used to identify matching cases.

  • You must identify at least one case property to run a rule against.

  • Only case properties relevant for the chosen case type will be shown.

Add Case Property

  • Use this to add more case properties to match against. This step is optional.

Include Closed Cases

  • Check this box to include closed cases when searching for duplicates; leave it unchecked to exclude them.

Cases Filter

This section allows you to further refine your deduplication rules so that they show only cases that meet specific filter criteria.

Add Filter

  • Here, you can define a filter to further refine the cases you see in the final duplicate report.

  • For example, if you are matching on the case property "name" but you only want to see duplicates where the name is "Jaya", you would add a filter at this step and define the case name to equal "Jaya".
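To make the effect of a filter concrete, here is a minimal Python sketch that groups cases by a matching property and optionally applies a filter first. It is a simplified model, not CommCare's implementation: the find_duplicates function, the case_id field, and the sample cases are all hypothetical.

```python
from collections import defaultdict


def find_duplicates(cases, match_property, case_filter=None):
    """Group case IDs by a shared property value (hypothetical helper).

    case_filter: optional function returning True for cases to keep.
    Returns only the groups that contain more than one case.
    """
    groups = defaultdict(list)
    for case in cases:
        if case_filter is not None and not case_filter(case):
            continue  # the filter excludes this case from the report
        value = case.get(match_property)
        if value is not None:
            groups[value].append(case["case_id"])
    return {value: ids for value, ids in groups.items() if len(ids) > 1}


# Made-up sample data for illustration.
cases = [
    {"case_id": "a1", "name": "Jaya"},
    {"case_id": "b2", "name": "Jaya"},
    {"case_id": "c3", "name": "Amit"},
    {"case_id": "d4", "name": "Amit"},
]

# Without a filter, both groups of duplicates are reported.
print(find_duplicates(cases, "name"))
# With a filter on name == "Jaya", only the "Jaya" duplicates remain.
print(find_duplicates(cases, "name", lambda case: case["name"] == "Jaya"))
```

In this sketch the filter narrows which duplicates are shown; it does not change how matching itself works.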

Viewing Matching Duplicates

To view all matching duplicates, click on the Explore Duplicates button found on the Deduplicate Cases main page.

This will automatically generate the report for the deduplication rule you selected.

Alternatively, you can reach the Duplicate Cases report by going to Reports → Inspect Data → Duplicate Cases.

Choosing the Duplicate Case Rule

Choose the duplicate case rule you are interested in via the "Duplicate Case Rule" dropdown menu. This shows you a list of all pre-configured duplicate case rules.

You can then specify further search criteria, configure columns, and filter by case owners just as you would in a Case List Explorer report. You can also export these duplicates to Excel or email the report. To act on a duplicate, click "View Case" to open the case, then clean, close, or delete it as necessary.