Introducing BlindChat, an open-source and privacy-by-design Conversational AI fully in-browser

Community Article Published September 22, 2023

TL;DR

We present here BlindChat, which aims to provide an open-source and privacy-by-design alternative to ChatGPT.

  • BlindChat is a fork of the Hugging Face Chat-UI project, adapted to perform all the logic on the client side instead of the initial server-side design.
  • BlindChat runs fully in your browser, leverages transformers.js to run local inference, and stores conversations in the browser cache.
  • The first local model proposed is LaMini-Flan-T5-783M. Future options will be provided, such as remote enclave inference with BlindLlama, to privately serve Llama 2 70b or Falcon 180b.

A live demo is available at chat.mithrilsecurity.io!

Context

The problem

While OpenAI’s ChatGPT changed the whole AI ecosystem less than a year ago, privacy issues have arisen and could potentially slow down AI adoption.

We have seen the great benefits of LLMs in increasing productivity, from coding assistants to document writing aides. However, as a user of those models, it can be tricky to navigate through the different privacy policies and understand exactly what happens to the data sent to those AI providers.

Indeed, AI providers often use data sent to their chatbots to improve their models. While this is understandable in this competitive age of LLMs, the practice can be extremely detrimental to end-users and erode trust in those AI providers.

Several issues arise:

  • A lot of information is collected by these companies, and it is not always clear what is collected, for what purpose, with whom it is shared, or even how to opt out.
  • Fine-tuning on users’ data can have dire privacy implications, as LLMs can leak this training data to other users of the LLM! As you might know, LLMs can memorize parts of their training set. This means that someone can prompt the fine-tuned LLM to spit out part, or even all, of what it memorized! For instance, the prompt “My credit card number is …” could be completed by the LLM with someone’s real credit card number from the training set.

While this example might seem simplistic and unlikely, Samsung actually had a real leak for this exact reason! In 2023, one of their engineers sent sensitive corporate information, such as proprietary code, to an early release of ChatGPT.

OpenAI clearly warned users not to send confidential data, as it would train on early conversations. The Samsung engineer ignored this warning and sent confidential data. GPT3.5 ended up learning this data by heart, and someone else prompted the model and got the data out!

Current solutions

On-premise deployment is the current alternative for obtaining privacy guarantees and transparency over the use of data. Open-source models like Llama 2 provide a strong option, as privacy-sensitive developers can deploy their own LLM internally without fearing uncontrolled data usage. Commercial AI providers also offer similar deployment options, such as Azure OpenAI Service.

While those solutions are one way to solve the privacy and transparency issues raised by using AI APIs like GPT4, they come at a cost, both financially and in deployment complexity!

Indeed, deploying an LLM on your own can be complicated, especially if you are non-technical. For instance, a lawyer might want a chat solution to help her draft a contract for her clients but might not be expert enough to deploy an in-house LLM to solve her problems.

Today there is a gap, as there seems to be a tradeoff between privacy and ease of use: SaaS solutions like GPT4 are easy to use but provide less privacy than custom deployments, which in turn require more expertise.

But can we find a solution that actually provides both? Yes, we can! That is what we aim to achieve with BlindChat, an open-source and private Conversational AI solution running fully in-browser!

Introducing BlindChat

BlindChat is an open-source project inspired by Hugging Face Chat-UI. We wanted to have a solution that would run purely in the browser so that any end-user could leverage AI models with both privacy guarantees and ease of use.

While Chat-UI is an excellent project with a great user interface, it was designed to work in conjunction with a backend, such as Hugging Face Text Inference Endpoint (TIE), which did not suit our privacy requirements.

Indeed, as we want to have a solution where the end-user needn’t trust the AI provider, we could not use Chat-UI as is and had to revamp it.

The idea behind BlindChat is to offload as much logic as possible to the client-side browser so that we can both:

  • achieve a high level of privacy, as data no longer leaves the user’s device
  • make it easy to use, as consumers have nothing to install beyond their existing browser (it might, however, require decent bandwidth and hardware)

In practice, this means that several components have been adapted, such as:

  • the LLM no longer runs on a server, and inference runs instead on the user’s device
  • conversations are stored in the browser, not on a remote server
  • no telemetry is recorded and no data is shared to improve models

A key change is that we had to swap regular inference with TIE in the backend for a private option.

on-device inference

For instance, we use transformers.js to perform local inference. A LaMini-Flan-T5 model is pulled from the Hub and run on the user’s device. We have also reproduced token streaming locally to recreate a smooth experience.
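As a rough illustration of what in-browser inference can look like, here is a minimal TypeScript sketch. It assumes the `@xenova/transformers` package (transformers.js) and the `Xenova/LaMini-Flan-T5-783M` checkpoint; the exact checkpoint and code used by BlindChat are not shown in this article. The helper names `loadLocalGenerator`, `extractText` and `answerLocally` are hypothetical and only serve this example.

```typescript
// Result shape of a text2text generator. Depending on the transformers.js
// version, results are plain strings or objects with a `generated_text` field.
type GenerationResult = string | { generated_text: string };
type TextGenerator = (
  prompt: string,
  options?: { max_new_tokens?: number },
) => Promise<GenerationResult[]>;

// Assumed package and checkpoint names; adjust them to your setup.
const TRANSFORMERS_PACKAGE = "@xenova/transformers";
const MODEL_ID = "Xenova/LaMini-Flan-T5-783M";

// Lazily load the library and the model. The weights are downloaded from the
// Hub on first use, which is why the first inference takes longer.
async function loadLocalGenerator(): Promise<TextGenerator> {
  const { pipeline } = await import(TRANSFORMERS_PACKAGE);
  return (await pipeline("text2text-generation", MODEL_ID)) as TextGenerator;
}

// Normalize the first result to a plain string ("" when there is none).
function extractText(results: GenerationResult[]): string {
  const first = results[0];
  if (first === undefined) return "";
  return typeof first === "string" ? first : first.generated_text;
}

// Run one prompt entirely on the user's device: no request carrying the
// prompt is sent to a remote inference server.
async function answerLocally(
  generator: TextGenerator,
  prompt: string,
): Promise<string> {
  const results = await generator(prompt, { max_new_tokens: 100 });
  return extractText(results);
}
```

In a browser module, you would call `loadLocalGenerator()` once, keep the returned generator around, and pass each user prompt to `answerLocally`. Taking the generator as a parameter also makes it easy to swap in another model or a stub.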

To preserve privacy, we also had to modify Chat-UI to store conversations locally in the browser instead of saving them in a MongoDB database on the server side.
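To make the idea of browser-side conversation storage concrete, here is a minimal TypeScript sketch. It is not BlindChat’s actual implementation: it assumes the Web Storage API (`window.localStorage`) as the backing store for brevity, and the types, the `STORAGE_KEY` value and the `MemoryStore` helper are all hypothetical names introduced for this example.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface StoredConversation {
  id: string;
  title: string;
  messages: ChatMessage[];
}

// Hypothetical storage key, chosen for this example only.
const STORAGE_KEY = "example.conversations";

// Read every saved conversation. Returns an empty list when nothing has been
// stored yet or when the stored value is not a valid JSON array.
function loadConversations(store: KeyValueStore): StoredConversation[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as StoredConversation[]) : [];
  } catch {
    return [];
  }
}

// Insert or update one conversation (matched by id), then write the list back.
function saveConversation(
  store: KeyValueStore,
  conversation: StoredConversation,
): void {
  const others = loadConversations(store).filter(
    (c) => c.id !== conversation.id,
  );
  store.setItem(STORAGE_KEY, JSON.stringify([...others, conversation]));
}

// In-memory stand-in for localStorage, useful outside a browser.
class MemoryStore implements KeyValueStore {
  private readonly items = new Map<string, string>();

  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }

  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}
```

In the browser, you would pass `window.localStorage` as the store, so conversations never leave the device. For long histories, a larger-capacity browser store such as IndexedDB may be a better fit than Web Storage.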

Those are just a few of the changes we made to the original HF Chat-UI to get a fully private, in-browser AI assistant, and voilà!

You can try BlindChat at https://chat.mithrilsecurity.io/ to start experimenting with our privacy-first AI Assistant.

⚠️ Note: this comes at the cost of requiring a good network connection, and it takes a bit of time the first time inference is performed.

Next steps

This first launch with a simple private Chat-UI in the browser is just the beginning. We intend to develop the open-source and privacy-by-design alternative to ChatGPT, and this means we have a long road ahead of us.

Here are examples of future steps:

  • Providing more models, such as phi-1.5
  • Implementing RAG by revamping LlamaIndex TypeScript to work in the browser
  • Implementing remote enclave inference to enable the consumption of remotely hosted, yet private and secure, large AI models. We have already released an alpha of BlindLlama, an open-source project to serve Llama 2 70b with privacy guarantees.
  • And much more!

If you are interested, you can find more about our roadmap on our GitHub. You can also raise issues or reach out to us on Discord to discuss features or ask questions!
