GSAM - Grounded SegmentAnyThing

The Grounded SAM algorithm combines GroundingDINO and SAM: given text prompts, it detects the bounding boxes and segmentation masks of the matching objects.

The algorithm is available in the DDS CloudAPI SDK in two variants:

  • the “tiny” model through TinyGSAMTask

  • the “base” model through BaseGSAMTask

Usage Pattern

This section demonstrates how to use both TinyGSAMTask and BaseGSAMTask.

TinyGSAMTask

First, make sure the SDK is installed with pip:

pip install dds-cloudapi-sdk

Then run the Grounded-SAM algorithm with the tiny model through TinyGSAMTask:

from dds_cloudapi_sdk import Client
from dds_cloudapi_sdk import Config
from dds_cloudapi_sdk import TextPrompt
from dds_cloudapi_sdk import TinyGSAMTask

# Step 1: initialize the config
token = "Your API token here"
config = Config(token)

# Step 2: initialize the client
client = Client(config)

# Step 3: run the task by TinyGSAMTask class

image_url = "https://dds-frontend.oss-cn-shenzhen.aliyuncs.com/static_files/playground/grounded_sam/05.jpg"
# to process a local image file, upload it to the DDS server first to get its image url
# image_url = client.upload_file("/path/to/your/image.png")

task = TinyGSAMTask(
    image_url=image_url,
    prompts=[TextPrompt(text="iron man")]
)

client.run_task(task)
print(task.result.mask_url)  # https://host.com/image.png

for obj in task.result.objects:
    print(obj.category)  # iron man
    print(obj.score)  # 0.49
    print(obj.bbox)  # [653.08, 329.13, 942.05, 842.50]

BaseGSAMTask

BaseGSAMTask is used in exactly the same way as TinyGSAMTask; only the task class differs. As before, install the SDK first:

# install the SDK with pip
pip install dds-cloudapi-sdk

Then trigger the task using the BaseGSAMTask class:

from dds_cloudapi_sdk import Config
from dds_cloudapi_sdk import Client
from dds_cloudapi_sdk import TextPrompt
from dds_cloudapi_sdk import BaseGSAMTask

# Step 1: initialize the config
token = "Your API token here"
config = Config(token)

# Step 2: initialize the client
client = Client(config)

# Step 3: run the task by BaseGSAMTask class

image_url = "https://dds-frontend.oss-cn-shenzhen.aliyuncs.com/static_files/playground/grounded_sam/05.jpg"
# to process a local image file, upload it to the DDS server first to get its image url
# image_url = client.upload_file("/path/to/your/image.png")

task = BaseGSAMTask(
    image_url=image_url,
    prompts=[TextPrompt(text="iron man")]
)

client.run_task(task)
print(task.result.mask_url)  # https://host.com/image.png

for obj in task.result.objects:
    print(obj.category)  # iron man
    print(obj.score)  # 0.49
    print(obj.bbox)  # [653.08, 329.13, 942.05, 842.50]
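The objects returned by either task can be post-processed with plain Python. The sketch below keeps only the detections above a chosen score and sorts them from most to least confident. The helper name confident_objects and the 0.4 cut-off are illustrative and not part of the SDK, and SimpleNamespace instances stand in for the objects in task.result.objects so that the snippet runs without an API token.

```python
from types import SimpleNamespace


def confident_objects(objects, min_score=0.4):
    """Return the objects whose score is at least min_score, best first.

    Works on anything exposing a numeric `.score` attribute, which the
    documented GSAMObject does. The function itself is a hypothetical helper.
    """
    kept = [obj for obj in objects if obj.score >= min_score]
    return sorted(kept, key=lambda obj: obj.score, reverse=True)


# Stand-ins for task.result.objects (values are made up for illustration).
sample = [
    SimpleNamespace(category="iron man", score=0.49, bbox=[653.08, 329.13, 942.05, 842.50]),
    SimpleNamespace(category="iron man", score=0.31, bbox=[10.0, 20.0, 110.0, 220.0]),
]

for obj in confident_objects(sample):
    print(obj.category, obj.score, obj.bbox)
```

With a real task, pass task.result.objects in place of sample after client.run_task(task) has returned.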

API Reference

class TinyGSAMTask(image_url, prompts=None, box_threshold=0.3, text_threshold=0.2, nms_threshold=0.8)[source]

Trigger the Grounded-SegmentAnything algorithm with the tiny ModelType.

Parameters:
  • image_url (str) – the url of the image to segment.

  • prompts (List[TextPrompt]) – a list of TextPrompt objects. For this task, only positive prompts are permitted.

  • box_threshold (float) – the threshold used to filter out objects by bbox score, defaults to 0.3.

  • text_threshold (float) – the threshold used to filter out objects by text score, defaults to 0.2.

  • nms_threshold (float) – the threshold used by NMS (non-maximum suppression) to filter out overlapping boxes, defaults to 0.8.

property result: TaskResult

Get the formatted TaskResult object.

class BaseGSAMTask(image_url, prompts=None, box_threshold=0.3, text_threshold=0.2, nms_threshold=0.8)[source]

Trigger the Grounded-SegmentAnything algorithm with the base ModelType.

Parameters:
  • image_url (str) – the url of the image to segment.

  • prompts (List[TextPrompt]) – a list of TextPrompt objects. For this task, only positive prompts are permitted.

  • box_threshold (float) – the threshold used to filter out objects by bbox score, defaults to 0.3.

  • text_threshold (float) – the threshold used to filter out objects by text score, defaults to 0.2.

  • nms_threshold (float) – the threshold used by NMS (non-maximum suppression) to filter out overlapping boxes, defaults to 0.8.

property result: TaskResult

Get the formatted TaskResult object.
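Both task classes take the same thresholds. To give an intuition for box_threshold and nms_threshold, here is a minimal local sketch of score filtering followed by greedy non-maximum suppression. It assumes that nms_threshold is an IoU (intersection over union) limit, which is the usual convention but is not stated in this reference. The filtering itself runs on the server, so this is not the service's implementation. The function names are illustrative, and text_threshold is not modeled.

```python
from typing import List, Sequence

Box = Sequence[float]  # [xmin, ymin, xmax, ymax]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(
    boxes: Sequence[Box],
    scores: Sequence[float],
    box_threshold: float = 0.3,
    nms_threshold: float = 0.8,
) -> List[int]:
    """Return the indices of the boxes that survive both filters."""
    # Step 1: drop boxes whose score is below box_threshold.
    candidates = [i for i, s in enumerate(scores) if s >= box_threshold]
    # Step 2: greedy NMS, visiting the highest-scoring boxes first.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in candidates:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept
```

Under this reading, lowering nms_threshold removes more overlapping boxes, while raising box_threshold removes more low-confidence ones.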

class TextPrompt(*, text, is_positive=True)[source]

A text prompt.

Parameters:
  • text (str) – the text content of the prompt

  • is_positive (bool) – whether the prompt is positive, defaults to True

text: str

the text content of the prompt

is_positive: bool

whether the prompt is positive, defaults to True

property type

the constant string ‘text’ for TextPrompt.
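Because the GSAM tasks accept only positive prompts, it can help to check a prompt list before building a task. The sketch below is one way to do that. PromptStandIn is a hypothetical stand-in that mirrors the two documented TextPrompt fields so the snippet runs without the SDK, and check_positive is an illustrative helper, not an SDK function. This reference does not say how the SDK or the server reacts to a negative prompt.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PromptStandIn:
    """Hypothetical stand-in with the same fields as the documented TextPrompt."""
    text: str
    is_positive: bool = True


def check_positive(prompts: Iterable) -> List:
    """Return the prompts as a list, or raise if any of them is negative."""
    prompts = list(prompts)
    negatives = [p.text for p in prompts if not p.is_positive]
    if negatives:
        raise ValueError(f"GSAM tasks accept only positive prompts, got negative: {negatives}")
    return prompts


prompts = check_positive([PromptStandIn(text="iron man")])
print(len(prompts))
```

The helper relies only on the text and is_positive attributes, so real TextPrompt objects can be passed to it as well.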

class TaskResult(*, mask_url, objects)[source]

The task result of the GSAM tasks.

Parameters:
  • mask_url (str) – the url of an image with the masks of all objects drawn on it

  • objects (List[GSAMObject]) – a list of the detected objects, each a GSAMObject

mask_url: str

the url of an image with the masks of all objects drawn on it

objects: List[GSAMObject]

a list of the detected objects, each a GSAMObject

class GSAMObject(*, category, score, bbox)[source]

The object detected by GSAM tasks.

Parameters:
  • category (str) – the category name of the object

  • score (float) – the prediction score of the object

  • bbox (List[float]) – the bbox of the object, [xmin, ymin, xmax, ymax]

category: str

the category name of the object

score: float

the prediction score of the object

bbox: List[float]

the bbox of the object, [xmin, ymin, xmax, ymax]
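Since bbox is given as corner coordinates, code that expects a width and height or integer pixel bounds needs a small conversion. The sketch below shows both, assuming the coordinates are pixels in the input image, which the sample values suggest but this reference does not state. The function names are illustrative and not part of the SDK.

```python
import math
from typing import Sequence, Tuple


def bbox_to_xywh(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert [xmin, ymin, xmax, ymax] to (x, y, width, height)."""
    xmin, ymin, xmax, ymax = bbox
    return xmin, ymin, xmax - xmin, ymax - ymin


def bbox_to_pixel_box(bbox: Sequence[float]) -> Tuple[int, int, int, int]:
    """Round outward to integer bounds so the whole box stays covered."""
    xmin, ymin, xmax, ymax = bbox
    return math.floor(xmin), math.floor(ymin), math.ceil(xmax), math.ceil(ymax)


bbox = [653.08, 329.13, 942.05, 842.50]  # sample value from the usage section
print(bbox_to_xywh(bbox))
print(bbox_to_pixel_box(bbox))
```

The integer form suits crop functions that take (left, upper, right, lower) bounds, such as Pillow's Image.crop.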