
This page is structured as follows. We recommend reading through each section to gain a comprehensive understanding of the dual subscription model that enables chatbot development on Dimagi's Open Chat Studio platform. 


Dual Subscription Model: Navigating Dimagi's Open Chat Studio and LLM Provider Services

Understanding Data and Terms

  • Overview of types of data 
  • How is my organisation's data protected on Open Chat Studio?
  • What are the privacy policies of LLM providers?
  • Will LLM providers use chatbot data to train their models?
  • Who owns my organisation's data?

Recommended Resources



Dual Subscription Model: Navigating Dimagi's Open Chat Studio and LLM Provider Services

When using Open Chat Studio, a client subscribes to the platform run by Dimagi. 

In order to use the platform, the client also procures access to a Large Language Model (LLM) via a provider. For example, the provider of the GPT-4 LLM is OpenAI. 

Procuring access means the client subscribes to the provider and obtains an API key. An API key is a unique identifier, similar to a password, that enables secure communication with the LLM provider (here, on the client's behalf via Open Chat Studio). Once the API key is procured from the provider, the client can upload it to Open Chat Studio and begin to build LLM-based chatbots. 
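For readers who want to see how an API key works in practice, the sketch below shows the general pattern platforms like Open Chat Studio use when calling a provider on your behalf: the key is sent as a bearer token in the request headers. The function name, key value, and exact request fields here are illustrative, not Open Chat Studio's actual implementation.

```python
# Illustrative sketch: how an API key typically authorizes requests to an
# LLM provider. The key travels in the Authorization header; the message
# content travels in the request body.
def build_chat_request(api_key: str, user_message: str) -> dict:
    """Assemble (but do not send) a provider-style chat request."""
    return {
        "url": "https://api.openai.com/v1/chat/completions",
        "headers": {
            # The API key acts like a password for the provider's API.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": {
            "model": "gpt-4",
            "messages": [{"role": "user", "content": user_message}],
        },
    }

request = build_chat_request("sk-example-key", "Hello!")
print(request["headers"]["Authorization"])  # Bearer sk-example-key
```

Because the key authorizes usage billed to your account, it should be treated like a password and shared only with services you trust, such as the platform that calls the provider for you.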

As such, the client is using services from two different providers simultaneously in order to develop LLM-based chatbots: Dimagi and the LLM provider. Accordingly, the client is agreeing to two sets of policies: both Dimagi’s terms of service and privacy policies, as well as those of the LLM provider they elect to use. 

Currently, Open Chat Studio supports two LLM providers, OpenAI and Anthropic, with more to come. 

Fig 1. Dual Subscription Model for Building Chatbots on Open Chat Studio


Understanding Data and Terms 

Overview of Types of Data 

There are two categories of data on Open Chat Studio. 

General data: This includes account data, usage and device information, cookies and similar technologies, and aggregated data sets. 

Chatbot-specific data:

  • Inputs: These comprise all inputs you provide in creating a chatbot, including the prompt, any test messages sent on Prompt Builder and source material. Inputs also include all messages sent by your chatbot users to the chatbot you create. 
  • Outputs: These comprise all outputs generated by the Large Language Model (LLM) you use (e.g. GPT-4 by the provider OpenAI). This includes all chatbot responses that are sent to you while you test the chatbot on Prompt Builder, chatbot responses sent to chatbot users and transcripts of conversations with your chatbot. 

The figure below outlines how input and output data flows between a chatbot user, Open Chat Studio and the LLM. 

Fig 2. Flow of Chatbot-Specific Data: Inputs and Outputs
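The flow in the figure can also be sketched in code. In this simplified, hypothetical model, the platform combines the builder's prompt with the chatbot user's message (both inputs), forwards them to the LLM provider, and relays the provider's response (the output) back to the user. The function and variable names are illustrative only.

```python
# Simplified sketch of the input/output flow between a chatbot user,
# the platform, and the LLM provider.
def relay_message(user_message: str, prompt: str, llm_call) -> str:
    # Inputs: the builder's prompt plus the user's message are combined
    # and forwarded to the LLM provider.
    llm_input = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": user_message},
    ]
    # Output: the LLM's response is returned to the chatbot user.
    return llm_call(llm_input)

# Stand-in for the provider's API, for illustration only.
echo_llm = lambda msgs: f"(reply to: {msgs[-1]['content']})"
print(relay_message("Hi!", "You are a helpful assistant.", echo_llm))
# (reply to: Hi!)
```

Note that in this flow, both the inputs and the outputs pass through the LLM provider's systems, which is why the provider's privacy policy applies to chatbot-specific data.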

How is my organisation's data protected on Open Chat Studio? 

For both general and chatbot-specific data, Dimagi's Privacy Policy applies, as well as the policies of the LLM provider you use to create your chatbot. 

As an example, let’s say you created a chatbot with OpenAI as the LLM provider and GPT-4 as the LLM. To do so, you would use an OpenAI API key (i.e. an access code to OpenAI’s LLMs) that you procure from OpenAI. In this case, your data would be subject to OpenAI’s Privacy Policy for their API Platform. 

What are the privacy policies of LLM providers? 

The privacy policies of LLM providers differ by provider. It is important to note that data is handled differently when you are an individual user of an LLM vs. when you are using an LLM via an API access key. 

For example, you might have a personal account on ChatGPT, unrelated to your work on Open Chat Studio. In this case, your data is governed by the privacy policy the LLM provider (here, OpenAI) has for individual users. 

However, when using Open Chat Studio, you will be using the LLM via an API key (i.e. an access code). This means that all the inputs and outputs are governed by OpenAI’s policies for their Enterprise customers and their API Platform. Similarly, if you choose Anthropic as an LLM provider, your data will be governed by their Commercial Terms of Service. 

Will LLM providers use chatbot data to train their models? 

It depends on the provider. 

For example, as of this writing, OpenAI does not use chatbot data from its API Platform to train its models (as per its Enterprise Privacy Policy), nor does Anthropic (as per their Commercial Terms of Service). This means that when you use OpenAI or Anthropic as LLM providers on Open Chat Studio, none of the inputs (source material, prompts, user messages to the bot, etc.) or the outputs (responses by the chatbot) will be used by these providers to train their models. 

Dimagi recommends reading the terms of use and privacy policies of both providers before deciding which to select. More resources are provided below. 

Who owns my organisation's data?

As outlined in our Terms of Service and Privacy Policy, Dimagi is not the owner of customer data content hosted in our systems. As stewards of our customers’ data, we believe that customers should have as much control as possible over the use and disclosure of their own data.

Chatbot-specific data is subject to the Privacy Policy and Terms of Service of LLM providers, under their business or enterprise offerings. For example, as of this writing, both OpenAI and Anthropic specify that, for API customers, the customer owns all input and output data. 

Recommended Resources 




