[Azure Data Factory] Child Cases Flow Configuration

This tutorial provides a comprehensive walkthrough on creating and submitting household member cases XML for CommCare using Azure Data Factory (ADF). The process entails generating XML data in ADF, storing it in an Azure SQL Database, and subsequently posting it to CommCare’s API.

Step 1: Create Household Member XML Data in ADF

Data Flow for Building Child Case Block XML for Child Cases
  1. Create a Data Flow:

    • In Azure Data Factory Studio, go to the Author tab.

    • Create a new Data Flow and name it HouseholdMemberXML.

  2. Add Source:

    • Select your Azure SQL Database dataset containing household member data (hm_sqldata).

    • Ensure columns include Household_ID, Member_ID, Name, Age, Gender, Relation_to_Head, case_id, and parent_case_id.

  3. Add Derived Column Transformation for Household Member Details Block:

    • Name the transformation EditHouseholdMemberXMLBlock.

    • Create a derived column hm_xml_snippet using the following expression. This can be easily copied from the Raw XML section of any form submission within CommCare for a household member registration.

    concat( ' <household_member_details>', ' <n0:case xmlns:n0="<http://commcarehq.org/case/transaction/v2>" case_id="', lower(case_id), '" date_modified="', $formatted_datetime, '" user_id="', $mapped_userid, '">', ' <n0:create>', ' <n0:case_name>', Name, '</n0:case_name>', ' <n0:owner_id>', $mapped_ownerid, '</n0:owner_id>', ' <n0:case_type>household_member</n0:case_type>', ' </n0:create>', ' <n0:update>', ' <n0:age>', toString(Age), '</n0:age>', ' <n0:gender>', Gender, '</n0:gender>', ' <n0:member_id>', Member_ID, '</n0:member_id>', ' <n0:relation_to_head>', Relation_to_Head, '</n0:relation_to_head>', ' </n0:update>', ' <n0:index>', ' <n0:parent case_type="household">', lower(parent_case_id), '</n0:parent>', ' </n0:index>', ' </n0:case>', ' <member_id>', Member_ID, '</member_id>', ' <name>', Name, '</name>', ' <age>', toString(Age), '</age>', ' <gender>', Gender, '</gender>', ' <relation_to_head>', Relation_to_Head, '</relation_to_head>', ' </household_member_details>' )
  4. Add Aggregate Transformation:

    • Name the transformation AggregatebyParentCaseID.

    • Aggregate by parent_case_id to group household members under their respective parents.

    • Use collect() to gather all hm_xml_snippet into an array.

  5. Convert Array to String:

    • Add a derived column to convert the aggregated array into a string and clean up the formatting.

    • Use the following expression:

    replace( replace( replace( replace(hm_xml_string, '["', ''), '"]', ''), '\\\\\\\\n', ''), '\\\\', '')
  6. Add Sink:

    • Add a sink transformation to save the generated XML to your Azure SQL Database in the hm_submission table under the child_xml_block column.

  7. Publish the Data Flow:

    • Save and publish your Data Flow to make it available for execution.

1.2 Execute the Data Flow

  • Trigger the Data Flow in your pipeline to generate and store the household member case XML in the Azure SQL Database.

Step 2: Combine with Parent XML and Finalize Submission

Data Flow for Building Parent Case Block XML for Child Cases
  1. Create a Data Flow:

    • In Azure Data Factory Studio, go to the Author tab.

    • Create a new Data Flow and name it ParentCaseBlock.

  2. Add a Source Transformation:

    • Name the source HouseholdMembers.

    • Use your Azure SQL Database dataset (hm_sqldata) as the source.

    • Ensure the following columns are available:

    • Household_ID, Member_ID, Name, Age, Gender, Relation_to_Head, case_id, parent_case_id (or any other relevant fields)

  3. Add a Window Transformation:

    • Name it SplitNodes.

    • This transformation will generate a row number for each household member under the same parent case, which is useful for uniquely identifying each XML block.

    • Configure the window to partition by parent_case_id and order by Member_ID.

window( over(parent_case_id), asc(Member_ID, true), node = rowNumber()-1 )
  1. Add an Aggregate Transformation:

    • Name it AggregateByParent.

    • Aggregate the data by parent_case_id to find the maximum node number for each parent case.

  1. Join Aggregated Data with Source:

  • Name this transformation AggregateCaseNodes.

  • Join the output of SplitNodes with AggregateByParent on the parent_case_id.

  • This will allow you to create a parent case XML block that contains all related household members.

  1. Add a Filter Transformation:

    • Name it OnlyParentCases.

    • Filter the rows to isolate the maximum node value for each parent case.

Step 2: Create Parent Case XML Block

  1. Add a Derived Column Transformation:

    • Name it ParentCaseXMLBlock.

    • Use this transformation to generate the XML for the parent case block.

    • The derived column should be named parent_case_xmlblock and use the following expression:

  2. Finalize Parent Case XML:

    • Add another derived column transformation named FinalParentCaseXMLBlock.

    • This transformation will ensure that the node value in the XML is unique by replacing placeholders with the correct node numbers.

Step 3: Generate Footer XML Block

  1. Derived Column for Footer XML:

    • Add a derived column transformation named FooterXMLBlock.

    • This block creates the footer XML for the submission.

  2. Final Footer XML Block:

    • Add a derived column named FinalFooterXMLBlock.

    • This will finalize the footer by ensuring all placeholders are replaced correctly.

Step 4: Join and Combine XML Blocks

  1. Join Transformation:

    • Add a join transformation named JoinwithFooterXML.

    • Combine the header, parent case, and footer XML blocks to create the final XML document.

  2. Combine Final XML:

    • Add a derived column transformation named FinalParentandFooterXML.

    • Combine all XML parts into the final submission XML.

Step 5: Upsert the Data to SQL Database

  1. Add an Upsert Transformation:

    • Name it UpsertCondition.

    • Use an alterRow() transformation to determine if the row should be upserted based on the parent_case_id.

  2. Configure the Sink:

    • Name the sink transformation SQLDBSink.

    • Configure it to upsert the XML data into the hm_submission table, updating the parent_xml_block column.

Step 6: Submit XML Data to CommCare

  1. Create a Pipeline:

    • Name it Household Member Submission.

  2. Add Lookup Activity:

    • Add a Lookup activity to retrieve the final_xml data from the hm_submission table.

    • Configure the SQL query as follows:

  3. Add Get Credentials Activity:

    • Use a Web Activity to retrieve the API key from Azure Key Vault.

    • Set the method to GET and point to the Key Vault secret.

  4. Add ForEach Activity:

    • Add a ForEach activity to iterate over the results of the Lookup activity.

    • In the ForEach activity, add a Web Activity to submit each final_xml to CommCare’s Form Submission API.

  5. Configure Web Activity for Submission:

    • Set the HTTP method to POST.

    • Add a header for Authorization using the output from the GetCredentials activity.

    • Set the body to @item().final_xml to send the XML data.

  6. Publish and Execute the Pipeline:

    • Publish the pipeline and run it to submit your household member case XMLs to CommCare.