In-browser LLM app in pure Python: Gemini Nano + Gradio-Lite
Google's local LLM Gemini Nano, which runs completely in the browser, is available in recent versions of Chrome Canary. Gradio is a Python package that lets you build web UIs for your models in a few lines of Python code, and Gradio-Lite is its in-browser version, which also runs completely in the browser. By combining the two, you can create a local LLM-based web app using only Python.
Demo app is here.
Try out Gradio-Lite
Access the Gradio Playground, where you can try out and edit various sample apps using Gradio. If you are not familiar with Gradio, you can also start with its quickstart guide.
The Playground actually uses Gradio-Lite, the in-browser version of Gradio, so the apps there run completely in the browser.
In this article, we are going to write code and preview the app in the Playground, but you can also deploy Gradio-Lite apps in other ways. To learn more, documents like this are a good starting point.
Chat app mockup with Gradio-Lite
Let's see the "Chatbot" example in the Playground. The code is as follows:
import random
import gradio as gr

def random_response(message, history):
    return random.choice(["Yes", "No"])

demo = gr.ChatInterface(random_response)

if __name__ == "__main__":
    demo.launch()
Building a chat interface is very easy with Gradio. You can define a function that takes a message and returns a response, and pass it to gr.ChatInterface.
This example just generates random responses with random_response(). By replacing it with Gemini Nano, you can chat with the LLM!
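Because the function passed to gr.ChatInterface is plain Python, it can be tried out without any UI. The following is a minimal sketch, not part of the Playground example: echo_with_history_size() is a hypothetical function, and it only assumes that history is a list-like object holding the earlier conversation (its exact item format depends on the Gradio version).

```python
import random


def random_response(message, history):
    # Same behavior as the Playground example: ignore the input entirely.
    return random.choice(["Yes", "No"])


def echo_with_history_size(message, history):
    # Hypothetical alternative: use both arguments to build the reply.
    # Assumption: `history` supports len(); its item format is not inspected.
    return f"[{len(history)} earlier entries] You said: {message}"


first_reply = echo_with_history_size("Hello", [])
print(first_reply)
print(random_response("Is this random?", []))
```

Either function could be passed to gr.ChatInterface in place of the other, since both follow the same (message, history) signature.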
Use Gemini Nano from Python in the browser
First of all, you need to install Chrome version 127 or higher and enable Gemini Nano by following the "Installation" section in this article. Then open the Gradio Playground again in Chrome Canary, where Gemini Nano is available via the Prompt API.
Gemini Nano and the Prompt API are available as JavaScript methods such as window.ai.createTextSession(), so how can we use them in Python with Gradio-Lite?
Gradio-Lite uses Pyodide, a Python runtime compiled for the browser environment, and the Python code you pass to Gradio-Lite is actually executed on the Pyodide runtime. In the Pyodide environment, some special modules are available to Python code, and you can use the js module to access JavaScript APIs, as it represents the JavaScript global scope. So, for example, you can access the window object by importing window from js.
from js import window # This doesn't work in Gradio-Lite. See below.
However, there is another thing to note: the Pyodide runtime of Gradio-Lite runs in a WebWorker isolated from the main browser environment, where the window object is not available. Instead, the Prompt API in the WebWorker environment is available via the self object. So, you can use the Prompt API in the Gradio-Lite environment like this:
from js import self
can_create = await self.ai.canCreateTextSession() # "readily" if available
There is one more piece of magic here: unlike in a normal Python environment, top-level await can be used in the Gradio-Lite environment. This is sometimes necessary when calling JavaScript APIs that return promises.
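For comparison, here is a rough sketch of what the same call looks like in a regular Python script, where await is only allowed inside a coroutine that is driven by asyncio.run(). The can_create_text_session() coroutine below is a hypothetical stand-in for self.ai.canCreateTextSession(), which exists only in the browser.

```python
import asyncio


async def can_create_text_session():
    # Hypothetical stand-in for the promise-returning JavaScript method.
    await asyncio.sleep(0)
    return "readily"


async def main():
    # In a normal Python script, `await` must live inside a coroutine...
    return await can_create_text_session()


# ...and an event loop has to be started explicitly to run it.
can_create = asyncio.run(main())
print(can_create)
```

In Gradio-Lite, the wrapper coroutine and the asyncio.run() call are unnecessary because the await can be written at the top level.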
Chat with Gemini Nano
You can find several Prompt API samples on the web; the following one streams its output. We will use this streaming API in this article for a better chat experience.
// This is a JavaScript sample to use the Prompt API in the browser environment.
const canCreate = await window.ai.canCreateTextSession();

if (canCreate !== "no") {
  const session = await window.ai.createTextSession();

  const stream = session.promptStreaming("Write me an extra-long poem");
  for await (const chunk of stream) {
    console.log(chunk);
  }
}
However, it's written in JavaScript. We have to translate it into Python code that can run in the Gradio-Lite environment, keeping the following points in mind:
- Use the js module to access JavaScript APIs.
- Use self instead of window.
- Top-level await can be used in the Gradio-Lite environment.
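The first two points can be sketched as a guarded import that also loads outside the browser. This is only an illustration under stated assumptions: js_scope and prompt_api_present() are hypothetical names, and the check assumes that a missing JavaScript property surfaces as an AttributeError in Python, so that hasattr() returns False.

```python
try:
    # Under Pyodide, `js` exposes the JavaScript global scope, and `self`
    # is the global object of a WebWorker.
    from js import self as js_scope
except ImportError:
    # Outside Pyodide (for example, regular CPython) there is no `js` module.
    js_scope = None


def prompt_api_present():
    # Hypothetical helper: True only when a JS global scope with an `ai`
    # property is reachable from Python.
    return js_scope is not None and hasattr(js_scope, "ai")


print(prompt_api_present())
```

A guard like this keeps the module importable in a normal Python environment, which can be handy when editing the app outside the Playground.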
This is the final code to chat with Gemini Nano with Gradio's chat interface executed on Gradio-Lite. You can try this out by copying and pasting it into the Playground, or just accessing this link.
import gradio as gr
from js import self

# Create a Gemini Nano session once at startup.
# It stays None if the Prompt API is missing or reports "no".
session = None
try:
    can_ai_create = await self.ai.canCreateTextSession()
    if can_ai_create != "no":
        session = await self.ai.createTextSession()
except Exception:
    pass

# Keep the session on the worker's global object so prompt() can read it.
self.ai_text_session = session

async def prompt(message, history):
    session = self.ai_text_session
    if not session:
        raise Exception("Gemini Nano is not available in your browser.")

    # Stream the model output to the chat UI chunk by chunk.
    stream = session.promptStreaming(message)
    async for chunk in stream:
        yield chunk

demo = gr.ChatInterface(fn=prompt)

demo.launch()
At the beginning, the session is initialized. This part is a direct Python translation of the JavaScript sample above.
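One way to reason about this initialization outside the browser is to factor it into a helper that receives the ai object as a parameter. The sketch below is an assumption-laden illustration rather than the article's code: create_session_or_none(), FakeAI, and BrokenAI are hypothetical, and the fakes only mimic the two method names used above.

```python
import asyncio


async def create_session_or_none(ai):
    # Mirrors the initialization logic: return a session when the API
    # reports anything other than "no", otherwise (or on error) None.
    try:
        if await ai.canCreateTextSession() != "no":
            return await ai.createTextSession()
    except Exception:
        pass
    return None


class FakeAI:
    # Hypothetical stand-in for the browser's `ai` object.
    def __init__(self, status):
        self.status = status

    async def canCreateTextSession(self):
        return self.status

    async def createTextSession(self):
        return "fake-session"


class BrokenAI:
    # Hypothetical stand-in for a browser where the call fails.
    async def canCreateTextSession(self):
        raise RuntimeError("Prompt API is not available")


available = asyncio.run(create_session_or_none(FakeAI("readily")))
unavailable = asyncio.run(create_session_or_none(FakeAI("no")))
broken = asyncio.run(create_session_or_none(BrokenAI()))
print(available, unavailable, broken)
```

In all failure cases the result is None, which is what lets prompt() report that Gemini Nano is unavailable instead of crashing at startup.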
The random_response() function in the first example is replaced with prompt(), which uses the Gemini Nano session. This part is also similar to the JavaScript sample, but with Python syntax: the for await in JS is translated to async for in Python to iterate over the streaming output.
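The async for pattern can be tried in plain Python with a fake stream. In the sketch below, fake_prompt_streaming() is a hypothetical stand-in for session.promptStreaming() and is assumed to yield incremental pieces of text; whether the real API yields incremental or cumulative chunks is not covered here and may depend on the Chrome version. Since each value yielded by a streaming Gradio chat function replaces the previously displayed message, this variant accumulates the pieces before yielding.

```python
import asyncio


async def fake_prompt_streaming(text):
    # Hypothetical stand-in for session.promptStreaming().
    # Assumption: it yields incremental pieces, one word at a time.
    for word in text.split():
        await asyncio.sleep(0)
        yield word + " "


async def prompt(message, history):
    partial = ""
    async for chunk in fake_prompt_streaming(f"You asked: {message}"):
        partial += chunk
        yield partial  # each yield carries the full message so far


async def collect(agen):
    # Drain an async generator into a list for inspection.
    return [item async for item in agen]


updates = asyncio.run(collect(prompt("hi there", [])))
print(updates[-1])
```

If the stream already yields the full text so far, the accumulation step is unnecessary and each chunk can be yielded as is.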
Conclusion
In this article, we have seen how to use Gemini Nano with Gradio-Lite/Pyodide. By combining them, you can create a local LLM-based web app with a rich interface, written only in Python, that runs completely in the browser.
Further reading and references
- Gradio-Lite documentation: If you want to develop Gradio-Lite apps outside the Playground, this document explains how.
- How to run Gemini Nano locally in your browser: This article contains more advanced usage of Gemini Nano.
- Gradio documentation
- Pyodide documentation
- Explainer for the Prompt API
- Built-in AI Early Preview Program documentation