Local AI with Docker's Testcontainers

Community Article Published August 3, 2024

🙋🏻‍♂️ Hey there folks, I've really enjoyed using Testcontainers, the open-source project that came to Docker through a recent startup acquisition.

  • see below for Links
  • they want you to use it for testing (see Testing Layers)
  • I use it for simple AI/ML serving (see Try Testcontainers for AI)
  • join me to make more useful and cool testcontainers (see Join Us)

What is it?

Testcontainers is an open source framework for providing throwaway, lightweight instances of databases, message brokers, web browsers, or just about anything that can run in a Docker container.

Testing Layers

Leveraging containers for testing enhances reliability, reduces configuration overhead, and improves the consistency of test environments, ultimately leading to more robust and maintainable code.

Testcontainers offers significant benefits across different testing layers:

  • Integration Tests: Containers provide isolated, reproducible environments for data access tests, ensuring consistency and reducing setup complexity.
  • UI/Acceptance Tests: Containerized browsers enable consistent, reliable UI testing by eliminating variations caused by local browser setups.
  • Application Integration Tests: Using containers for testing applications with all dependencies allows for accurate, end-to-end testing in environments that closely resemble production.

Data Access Layer Integration Tests

Implications:

  • Isolation and Consistency: Containerizing your database ensures that each test runs against a fresh, known state. This isolation eliminates the variability and inconsistencies associated with differing local database setups or residual data from previous tests.
  • Reduced Setup Complexity: Developers avoid the hassle of complex database setups on their local machines, which can lead to a more consistent development experience across the team.
  • Scalability and Reliability: Tests can be scaled easily without worrying about local resources, and container orchestration tools can manage test database instances efficiently.

Example: A developer writes integration tests for an application's data access layer using PostgreSQL. By spinning up a PostgreSQL container for each test, they ensure that each test starts with a clean slate, avoiding conflicts caused by residual data or schema changes from other tests.
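Here is a minimal sketch of that clean-slate pattern. It assumes the `testcontainers[postgres]`, `sqlalchemy`, and `psycopg2` packages are installed and that a Docker daemon is running. The `users` table, the `seed_and_count` helper, and the `postgres:16` image tag are made-up choices for illustration, not anything the library prescribes.

```python
def seed_and_count(conn):
    """Create a hypothetical `users` table, insert two rows, return the row count.

    `conn` is any DB-API 2.0 connection, so this function does not care
    whether the database lives in a container or somewhere else.
    """
    cur = conn.cursor()
    cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
    cur.execute("INSERT INTO users (id, name) VALUES (2, 'bob')")
    conn.commit()
    cur.execute("SELECT COUNT(*) FROM users")
    return cur.fetchone()[0]


def run_in_fresh_postgres():
    """Start a throwaway PostgreSQL container and run the helper against it."""
    # Imported here so the helper above stays usable without Docker installed.
    import sqlalchemy
    from testcontainers.postgres import PostgresContainer

    # The image tag is an assumption; pick the version you run in production.
    with PostgresContainer("postgres:16") as postgres:
        engine = sqlalchemy.create_engine(postgres.get_connection_url())
        conn = engine.raw_connection()
        try:
            return seed_and_count(conn)
        finally:
            conn.close()
            engine.dispose()


if __name__ == "__main__":
    # Each call gets a brand-new database, so CREATE TABLE never collides.
    print(run_in_fresh_postgres())
    print(run_in_fresh_postgres())
```

Because every call starts its own container, the second run does not trip over the table created by the first one, which is the whole point of the clean-slate approach.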

UI/Acceptance Tests

Implications:

  • Consistent Test Environment: Using containerized browsers with tools like Selenium guarantees that UI tests run in a consistent environment, reducing the impact of browser-specific issues or variations caused by different versions or plugins.
  • Ease of Setup and Maintenance: Containers simplify the setup process for UI tests by providing pre-configured, standardized browser instances, eliminating the need to manually manage browser versions or extensions.
  • Reproducibility: Each test execution starts with a fresh browser instance, leading to more reliable and reproducible test outcomes.

Example: An automated UI test suite for a web application is run using containerized Chrome instances. Each test starts with a clean browser, ensuring that issues related to browser cache or extensions do not affect the test results.

Application Integration Tests

Implications:

  • End-to-End Testing: Containerized environments enable comprehensive testing of the application with all its dependencies (e.g., databases, message queues, web servers) in a single, isolated environment.
  • Resource Management: Short-lived test containers help manage resources efficiently, avoiding the overhead of long-running services and allowing for quick test execution and teardown.
  • Environment Parity: Tests can be executed in an environment that closely mimics production, enhancing the accuracy of test results and reducing the risk of discrepancies between test and production environments.

Example: A microservices-based application is tested using containers for each microservice, along with associated dependencies like a Redis cache and RabbitMQ message broker. This setup allows for end-to-end tests that validate interactions between services in a controlled, isolated environment.

"Very useful , Let's goof off with AI now"

That's great, and I'm sure it's useful for saving on DevOps pipelines for production builds and so much more, but I love using it for its Ollama object!

Try Testcontainers for AI

The OllamaContainer class from the testcontainers Python library facilitates the setup and management of containerized environments for serving AI models with Ollama. This allows you to run inference tasks in a consistent and isolated environment, making it easier to test and develop AI applications without worrying about local configuration issues.

Testcontainers Object for Ollama AI Serving Inference

The OllamaContainer class provides a streamlined way to interact with a containerized instance of Ollama, which serves AI models for inference. It simplifies tasks such as starting the Ollama service, listing available models, pulling new models, and querying the service for predictions. Below is a detailed description of how to use the OllamaContainer class, along with code snippets illustrating typical use cases.

Features

  1. Container Management:

    • Automatically handles the lifecycle of the Ollama container.
    • Starts the container and ensures it is properly stopped and cleaned up after use.
  2. Model Management:

    • Allows listing available models.
    • Supports pulling new models if they are not already available.
  3. Inference Requests:

    • Provides methods to interact with the model for inference tasks via HTTP endpoints.
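As a rough illustration of the inference point, here is a sketch of a single non-streaming request to Ollama's `/api/generate` route, using only the standard library. The endpoint value would come from the container's `get_endpoint()` method. The helper names `build_generate_request` and `generate` are invented for this example, and the timeout is an arbitrary guess.

```python
import json
from urllib.request import Request, urlopen


def build_generate_request(endpoint, model, prompt):
    """Build a POST request for Ollama's /api/generate route."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return Request(
        url=endpoint.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(endpoint, model, prompt, timeout=120):
    """Send the request and return the generated text."""
    request = build_generate_request(endpoint, model, prompt)
    with urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the reply is one JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

Inside a running container block, a call could look like `generate(ollama.get_endpoint(), "yi:6b-v1.5", "Why is the sky blue?")`, assuming that model has already been pulled.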

Code Examples

The OllamaContainer class from the testcontainers library is a powerful tool for developers working with AI models, providing a straightforward way to manage containerized environments for inference tasks. By encapsulating the complexity of setup and teardown, it allows developers to focus on building and testing their applications with reliable and reproducible results.

Basic Usage

The following code snippet demonstrates how to use the OllamaContainer class to set up an environment, list available models, and perform inference tasks.


from json import loads
from pathlib import Path
from requests import post, RequestException
from testcontainers.ollama import OllamaContainer

def split_by_line(generator):
    data = b''
    for each_item in generator:
        for line in each_item.splitlines(True):
            data += line
            if data.endswith((b'\r\r', b'\n\n', b'\r\n\r\n', b'\n')):
                yield from data.splitlines()
                data = b''
    if data:
        yield from data.splitlines()

def main():
    with OllamaContainer(ollama_home=Path.home() / ".ollama") as ollama:
        # List available models
        models = ollama.list_models()
        print("Available models:", models)

        # Choose a specific model, for example 'yi:6b-v1.5'
        model_name = "yi:6b-v1.5"
        if model_name not in [model["name"] for model in models]:
            print(f"Model '{model_name}' not found, pulling the model.")
            ollama.pull_model(model_name)
            print(f"Model '{model_name}' has been pulled.")
        
        # Use the model to generate responses in an interactive chat
        try:
            endpoint = ollama.get_endpoint()
            print("You can now start chatting with the model. Type 'exit' to quit.")

            while True:
                user_input = input("You: ")
                if user_input.lower() == "exit":
                    print("Exiting chat.")
                    break
                
                response = post(
                    url=f"{endpoint}/api/chat", 
                    stream=True, 
                    json={
                        "model": model_name,
                        "messages": [{
                            "role": "user",
                            "content": user_input
                        }]
                    }
                )
                response.raise_for_status()

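                # Print the speaker label once, then stream the reply chunks after it
                print("Model: ", end="")
                for chunk in split_by_line(response.iter_content()):
                    model_response = loads(chunk)["message"]["content"]
                    print(model_response, end="", flush=True)
                print()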

        except RequestException as e:
            print(f"An error occurred: {e}")

if __name__ == "__main__":
    main()
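The `/api/chat` route streams its answer as newline-delimited JSON, one object per line, with a `done` flag on the last one. If you would rather collect the whole reply into a string than print it piece by piece, something like the sketch below could work. `collect_reply` is a hypothetical helper, not part of testcontainers, and it assumes each line looks like the chunks the script above already parses.

```python
import json


def collect_reply(lines):
    """Join the `message.content` pieces of a streamed chat reply.

    `lines` is any iterable of JSON lines (bytes or str), for example
    `response.iter_lines()` from a `requests` streaming response.
    """
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk carries "done": true
    return "".join(parts)
```

In the chat loop this could replace the printing loop with something like `reply = collect_reply(response.iter_lines())`, at the cost of waiting for the full answer before showing anything.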

Key Methods and Properties

  1. __init__(ollama_home: Path):

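    • Parameters: ollama_home (Path) - A directory on the host where Ollama configuration and model files are stored. Pointing it at an existing folder such as ~/.ollama, as the example does, lets pulled models persist between runs.
    • Description: Initializes the container with a specified home directory for Ollama.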
  2. list_models():

    • Returns: A list of model entries; the example above reads each entry's "name" field.
    • Description: Retrieves a list of models currently available in the container.
  3. pull_model(model_name: str):

    • Parameters: model_name (str) - The name of the model to pull.
    • Description: Pulls the specified model if it is not already available.
  4. get_endpoint():

    • Returns: A string containing the endpoint URL for accessing the model.
    • Description: Provides the URL where the model's inference API can be accessed.
  5. __enter__() and __exit__():

    • Description: Handle the context management for the container, starting and stopping it as needed.
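To make the lifecycle concrete without needing Docker, here is a toy stand-in. `FakeOllama` is not the library's implementation; it only mimics the method names listed above so that the `ensure_model` helper (also hypothetical) can be tried out anywhere. The same helper should work with a real `OllamaContainer`, assuming `list_models()` returns entries with a `name` key as the script above expects.

```python
class FakeOllama:
    """In-memory stand-in that mimics the OllamaContainer method names."""

    def __init__(self, preloaded=()):
        self._models = list(preloaded)
        self.running = False

    def __enter__(self):
        self.running = True  # the real class starts the container here
        return self

    def __exit__(self, exc_type, exc, tb):
        self.running = False  # ...and stops it here, even after an error
        return False  # do not swallow exceptions

    def list_models(self):
        return [{"name": name} for name in self._models]

    def pull_model(self, model_name):
        self._models.append(model_name)


def ensure_model(container, model_name):
    """Pull `model_name` only if it is missing; return True if a pull happened."""
    names = [model["name"] for model in container.list_models()]
    if model_name in names:
        return False
    container.pull_model(model_name)
    return True


if __name__ == "__main__":
    with FakeOllama(preloaded=["llama3:latest"]) as ollama:
        print(ensure_model(ollama, "yi:6b-v1.5"))  # first call pulls
        print(ensure_model(ollama, "yi:6b-v1.5"))  # second call is a no-op
    print(ollama.running)
```

Writing helpers against the method names rather than the concrete class keeps them testable on machines where Docker is not available.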

The Best Part

Now you can deploy containers directly from your code, remove them when you're done, save on resources, test with more rigor, and create your own objects!

Join Us

Links

Ask Questions Below! 👇🏻

Having trouble running this script? Ask as many questions as you like below, no problem!
