Fork of salesforce/BLIP for an image-captioning task on 🤗 Inference Endpoints.

This repository implements a custom image-captioning task for 🤗 Inference Endpoints. The code for the customized pipeline is included in this repository. To deploy this model as an Inference Endpoint, you have to select Custom as the task so that the pipeline file is used. Double check that it is selected.

Expected request payload

  {
    "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC...."  // base64-encoded image
  }

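As a small sketch of where the base64 string in the payload comes from: the raw image bytes are base64-encoded and decoded to text so they can travel inside JSON. The helper name `encode_image` is hypothetical and not part of this repository. The sample bytes are the first nine bytes of a JPEG/JFIF file, which is why the encoded payload above starts with `/9j/4AAQSkZJ`.

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes and return them as a JSON-safe string."""
    return base64.b64encode(image_bytes).decode("ascii")


# First nine bytes of a JPEG/JFIF file (SOI marker, APP0 marker, "JFI").
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFI"
print(encode_image(jpeg_header))  # /9j/4AAQSkZJ
```

In practice the bytes would come from reading the image file in binary mode rather than from a literal.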
Below is an example of how to run a request using Python and requests.

Run request

  1. Prepare an image, for example by downloading one with wget.
  2. Run the request.

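import base64

import requests as r

ENDPOINT_URL = ""  # URL of your deployed Inference Endpoint
HF_TOKEN = ""  # Hugging Face access token allowed to call the endpoint


def predict(path_to_image: str):
    # Read the image and base64-encode it so it can be sent as JSON.
    with open(path_to_image, "rb") as i:
        image = base64.b64encode(i.read()).decode("utf-8")
    payload = {
        "inputs": [image],
        "parameters": {
            "do_sample": True,
        },
    }
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()


# Placeholder path: point this at the image you prepared.
prediction = predict(path_to_image="path/to/image.jpg")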

Example parameters depending on the decoding strategy:

  1. Beam search
        "parameters": {
  2. Nucleus sampling
        "parameters": {
            "do_sample": True,
  3. Contrastive search
        "parameters": {

See the generate() documentation for additional details.
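As a rough sketch, the three strategies could map onto generate() arguments along the following lines. The keys (`num_beams`, `do_sample`, `top_p`, `penalty_alpha`, `top_k`, `max_length`) are standard generate() arguments in transformers; the values are illustrative assumptions, not the settings used by this repository, and the helper `build_payload` is hypothetical. It also assumes the endpoint forwards `parameters` to generate() unchanged.

```python
# Illustrative values only (assumptions); tune them for your own images.
BEAM_SEARCH = {"num_beams": 5, "max_length": 20}
NUCLEUS_SAMPLING = {"do_sample": True, "top_p": 0.9, "max_length": 20}
CONTRASTIVE_SEARCH = {"penalty_alpha": 0.6, "top_k": 4, "max_length": 20}


def build_payload(encoded_image: str, parameters: dict) -> dict:
    """Combine a base64-encoded image with one set of decoding parameters."""
    return {"inputs": [encoded_image], "parameters": dict(parameters)}


# Example: a nucleus-sampling request body for an already encoded image.
payload = build_payload("/9j/4AAQSkZJ", NUCLEUS_SAMPLING)
print(sorted(payload["parameters"]))
```

Keeping each strategy in its own dictionary makes it easy to switch decoding behaviour without touching the request code.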

Expected output

['buckingham palace with flower beds and red flowers']
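A minimal sketch of how the response could be checked before use. The function name `extract_captions` is hypothetical, and the assumption that a failed request returns a JSON object with an `"error"` key is a guess about the endpoint's behaviour, not something this repository documents.

```python
from typing import Any, List


def extract_captions(prediction: Any) -> List[str]:
    """Return the captions from a decoded response, or raise if it looks wrong."""
    # Assumption: errors come back as a dict containing an "error" key.
    if isinstance(prediction, dict) and "error" in prediction:
        raise RuntimeError(f"Endpoint returned an error: {prediction['error']}")
    # The expected output is a list with one caption string per input image.
    if not isinstance(prediction, list) or not all(
        isinstance(caption, str) for caption in prediction
    ):
        raise ValueError(f"Unexpected response format: {prediction!r}")
    return prediction


captions = extract_captions(["buckingham palace with flower beds and red flowers"])
print(captions[0])
```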