Understanding Data and Terms on Open Chat Studio

This page explains the dual subscription model that enables chatbot development on Dimagi's Open Chat Studio platform. We recommend reading through each section in order to gain a comprehensive understanding.

Dual Subscription Model: Navigating Dimagi's Open Chat Studio and LLM Provider Services

When using Open Chat Studio, a client subscribes to the platform run by Dimagi. 

To use the platform, the client also procures access to a Large Language Model (LLM) via a provider. For example, the provider of the GPT-4 LLM is OpenAI.

Procuring access means the client subscribes to the provider's service and obtains an API key. An API key is a unique identifier, similar to a password, that enables Open Chat Studio to communicate securely with the LLM provider on the client's behalf. Once the API key is procured from the provider, the client can upload it to Open Chat Studio and begin to build LLM-based chatbots.
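In practical terms, the API key is attached as a credential to every request the platform sends to the provider. The sketch below (Python, standard library only) illustrates this; the endpoint and model name follow OpenAI's public API, the key is a placeholder, and the request is built but deliberately not sent:

```python
import json
import urllib.request

def build_chat_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a chat request to OpenAI's API.

    The API key travels as a bearer token in the Authorization header,
    which is why it must be kept secret, like a password.
    """
    body = json.dumps({
        "model": "gpt-4",
        "messages": [{"role": "user", "content": user_message}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # the procured API key
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder key for illustration only; a real key is issued by the provider.
req = build_chat_request("sk-your-key-here", "Hello!")
print(req.get_header("Authorization"))  # Bearer sk-your-key-here
```

This is only a sketch of the underlying mechanism; on Open Chat Studio you simply upload the key, and the platform makes such calls for you.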

As such, the client uses services from two different providers simultaneously in order to develop LLM-based chatbots: Dimagi and the LLM provider. Accordingly, the client agrees to two sets of policies: Dimagi's terms of service and privacy policies, as well as those of the LLM provider they elect to use.

Currently, Open Chat Studio supports two LLM providers, OpenAI and Anthropic, with more to come. 

 

Fig 1. Dual Subscription Model for Building Chatbots on Open Chat Studio

 

Understanding Data and Terms 

Overview of Types of Data 

There are two categories of data on Open Chat Studio. 

General: This includes account data, usage data, device information, cookies and similar technologies, and aggregate data sets.

Chatbot-specific data:

  • Inputs: These comprise all inputs you provide in creating a chatbot, including the prompt, any test messages sent on Prompt Builder and source material. Inputs also include all messages sent by your chatbot users to the chatbot you create. 

  • Outputs: These comprise all outputs generated by the Large Language Model (LLM) you use (e.g. GPT-4 by the provider OpenAI). This includes all chatbot responses that are sent to you while you test the chatbot on Prompt Builder, chatbot responses sent to chatbot users and transcripts of conversations with your chatbot. 

The figure below outlines how input and output data flows between a chatbot user, Open Chat Studio and the LLM. 

Fig 2. Flow of Chatbot-Specific Data: Inputs and Outputs
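That flow can also be expressed as a short sketch. All names below are invented for illustration: a session object relays a user's message (an input) to the LLM and records both the input and the LLM's response (an output) in a transcript, mirroring why both categories of chatbot-specific data pass through the provider:

```python
from dataclasses import dataclass, field

@dataclass
class ChatSession:
    """Illustrative model of the Fig 2 data flow (hypothetical names).

    Both inputs (prompt, user messages) and outputs (LLM responses)
    pass through the platform on their way to and from the provider.
    """
    prompt: str                      # input: the chatbot's system prompt
    transcript: list = field(default_factory=list)

    def handle(self, user_message: str, llm) -> str:
        self.transcript.append(("user", user_message))   # input from the user
        reply = llm(self.prompt, user_message)           # call to the LLM provider
        self.transcript.append(("assistant", reply))     # output from the LLM
        return reply

# A stand-in for the real provider call, used only for this sketch:
echo_llm = lambda prompt, msg: f"echo: {msg}"
session = ChatSession(prompt="You are a helpful assistant.")
print(session.handle("Hi there", echo_llm))  # echo: Hi there
```

The transcript kept by the session is what the documentation above refers to as output data: a record of the conversation, comprising both what users sent and what the LLM returned.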

How is my organisation's data protected on Open Chat Studio? 

Both general and chatbot-specific data are governed by Dimagi's Privacy Policy, as well as by the policies of the LLM provider you use to create your chatbot.

As an example, let’s say you created a chatbot with OpenAI as the LLM provider and GPT-4 as the LLM. To do so, you would be using an OpenAI API key (i.e. an access code to OpenAI’s LLMs) that you would procure from OpenAI. In this case, your data would be subject to OpenAI’s Privacy Policy for their API Platform.

What are the privacy policies of LLM providers?* 

The privacy policies of LLM providers differ by provider. It is important to note that data is handled differently when you are an individual user of an LLM vs. when you are using an LLM via an API key.

For example, you might have a personal account on ChatGPT, unrelated to your work on Open Chat Studio. In this case, your data is governed by the privacy policy the LLM provider (here, OpenAI) has for individual users.

However, when using Open Chat Studio, you will be using the LLM via an API key (i.e. an access code). This means that all inputs and outputs are governed by OpenAI’s policies for their Enterprise customers and their API Platform. Similarly, if you choose Anthropic as an LLM provider, your data will be governed by their Commercial Terms of Service.

*Note: these rules change frequently and the client is responsible for keeping up to date on them.

Will LLM providers use chatbot data to train their models? 

It depends on the provider. 

For example, as of this writing, OpenAI does not use chatbot data (specifically inputs and outputs) from its API Platform to train its models, as per its Enterprise Privacy Policy. Neither does Anthropic, as per its Commercial Terms of Service. This means that when you use OpenAI or Anthropic as LLM providers on Open Chat Studio, none of the inputs (source material, prompts, user messages to the bot, etc.) or the outputs (responses by the chatbot) will be used by these providers to train their models.

Dimagi recommends reading the terms of use and privacy policies of both providers before deciding which to select. More resources are provided below. 

Who owns my organisation's data?

As outlined in our Terms of Service and Privacy Policy, Dimagi is not the owner of customer data content hosted in our systems. As stewards of our customers’ data, we believe that they should have as much control as possible over the use and disclosure of their own data.

Your organisation's data is also subject to the Privacy Policy and Terms of Service of LLM providers, under their business or enterprise offerings. For example, as of this writing, both OpenAI and Anthropic specify that API customers own all input and output data (often referred to as "customer content"). We recommend reading their policies (linked below) closely to fully understand how your data is managed. 

Further context on Open Chat Studio as Experimental Software

Open Chat Studio is an experimental platform and use of Open Chat Studio is governed by Dimagi's terms of service (linked below). As per our terms of service, Dimagi may discontinue Open Chat Studio in response to unforeseen circumstances beyond Dimagi's control or to comply with a legal requirement. Further, given that this is experimental software, features and functions are likely to be made available and discontinued at a much higher rate than other Dimagi products.

Recommended Resources

Note: the policies of LLM providers (linked below) change frequently and the client is responsible for keeping up with them. 

Dimagi's Policies

LLM Provider Policies