GSAM - Grounded SegmentAnyThing
The Grounded SAM algorithm combines GroundingDINO and SAM: given text prompts, it detects the bounding boxes and segmentation masks of the matching objects.
The algorithm is available in the DDS CloudAPI SDK in two variants:
the “tiny” model through TinyGSAMTask
the “base” model through BaseGSAMTask
Usage Pattern
This section shows how to use both TinyGSAMTask and BaseGSAMTask.
TinyGSAMTask
First, make sure the SDK is installed with pip:
pip install dds-cloudapi-sdk
Then run the Grounded-SAM algorithm with the tiny model through TinyGSAMTask:
from dds_cloudapi_sdk import Client
from dds_cloudapi_sdk import Config
from dds_cloudapi_sdk import TextPrompt
from dds_cloudapi_sdk import TinyGSAMTask
# Step 1: initialize the config
token = "Your API token here"
config = Config(token)
# Step 2: initialize the client
client = Client(config)
# Step 3: run the task with the TinyGSAMTask class
image_url = "https://dds-frontend.oss-cn-shenzhen.aliyuncs.com/static_files/playground/grounded_sam/05.jpg"
# to process a local image file, upload it to the DDS server to get an image url
# image_url = client.upload_file("/path/to/your/image.png")
task = TinyGSAMTask(
    image_url=image_url,
    prompts=[TextPrompt(text="iron man")]
)
client.run_task(task)
print(task.result.mask_url)  # https://host.com/image.png
for obj in task.result.objects:
    print(obj.category)  # iron man
    print(obj.score)  # 0.49
    print(obj.bbox)  # [653.08, 329.13, 942.05, 842.50]
BaseGSAMTask
BaseGSAMTask is used in exactly the same way as TinyGSAMTask, except that it triggers the algorithm through a different task class:
# install the SDK by pip
pip install dds-cloudapi-sdk
Then trigger the task using the BaseGSAMTask class:
from dds_cloudapi_sdk import Config
from dds_cloudapi_sdk import Client
from dds_cloudapi_sdk import TextPrompt
from dds_cloudapi_sdk import BaseGSAMTask
# Step 1: initialize the config
token = "Your API token here"
config = Config(token)
# Step 2: initialize the client
client = Client(config)
# Step 3: run the task with the BaseGSAMTask class
image_url = "https://dds-frontend.oss-cn-shenzhen.aliyuncs.com/static_files/playground/grounded_sam/05.jpg"
# to process a local image file, upload it to the DDS server to get an image url
# image_url = client.upload_file("/path/to/your/image.png")
task = BaseGSAMTask(
    image_url=image_url,
    prompts=[TextPrompt(text="iron man")]
)
client.run_task(task)
print(task.result.mask_url)  # https://host.com/image.png
for obj in task.result.objects:
    print(obj.category)  # iron man
    print(obj.score)  # 0.49
    print(obj.bbox)  # [653.08, 329.13, 942.05, 842.50]
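Once a task has run, task.result.objects can be post-processed like any Python list. The following is a minimal sketch that keeps detections above a score cut-off and groups them by category. The helper name group_by_category and the min_score value are illustrative and not part of the SDK, and SimpleNamespace objects stand in for the SDK's result objects so that the snippet runs without an API token.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_category(objects, min_score=0.4):
    """Group detections by category, keeping those with score >= min_score."""
    grouped = defaultdict(list)
    for obj in objects:
        if obj.score >= min_score:
            grouped[obj.category].append(obj)
    return dict(grouped)


# Stand-in for task.result.objects (hypothetical values); only the documented
# attributes category, score and bbox are used.
detections = [
    SimpleNamespace(category="iron man", score=0.49, bbox=[653.08, 329.13, 942.05, 842.50]),
    SimpleNamespace(category="iron man", score=0.31, bbox=[10.0, 20.0, 110.0, 220.0]),
]

for category, objs in group_by_category(detections).items():
    print(category, len(objs))
```

With a real task, the same helper would presumably be called as group_by_category(task.result.objects).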
API Reference
- class TinyGSAMTask(image_url, prompts=None, box_threshold=0.3, text_threshold=0.2, nms_threshold=0.8)
Trigger the Grounded-SegmentAnything algorithm with the tiny ModelType.
- Parameters:
image_url (str) – the URL of the image to segment.
prompts (List[TextPrompt]) – a list of TextPrompt objects. For this task, only positive prompts are permitted.
box_threshold (float) – a threshold to filter out objects by bbox score, defaults to 0.3.
text_threshold (float) – a threshold to filter out objects by text score, defaults to 0.2.
nms_threshold (float) – a threshold for NMS to filter out overlapping boxes, defaults to 0.8.
- property result: TaskResult
Get the formatted TaskResult object.
- class BaseGSAMTask(image_url, prompts=None, box_threshold=0.3, text_threshold=0.2, nms_threshold=0.8)
Trigger the Grounded-SegmentAnything algorithm with the base ModelType.
- Parameters:
image_url (str) – the URL of the image to segment.
prompts (List[TextPrompt]) – a list of TextPrompt objects. For this task, only positive prompts are permitted.
box_threshold (float) – a threshold to filter out objects by bbox score, defaults to 0.3.
text_threshold (float) – a threshold to filter out objects by text score, defaults to 0.2.
nms_threshold (float) – a threshold for NMS to filter out overlapping boxes, defaults to 0.8.
- property result: TaskResult
Get the formatted TaskResult object.
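To give a feel for what the nms_threshold parameter of both task classes controls, here is a sketch of conventional greedy non-maximum suppression based on intersection over union (IoU). This reference does not say how the service implements NMS internally, so treat this as an illustration of the general technique under that assumption; the function names iou and greedy_nms and the sample boxes are made up for the example.

```python
def iou(a, b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, nms_threshold=0.8):
    """Return the indices of the kept boxes, highest score first.

    A box is dropped when its IoU with an already kept, higher-scoring
    box exceeds nms_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical boxes: the second overlaps the first with IoU 0.9.
boxes = [[0, 0, 10, 10], [0, 0, 10, 9], [20, 20, 30, 30]]
scores = [0.9, 0.6, 0.5]
print(greedy_nms(boxes, scores))        # [0, 2]
print(greedy_nms(boxes, scores, 0.95))  # [0, 1, 2]
```

Under this scheme, a lower threshold suppresses more overlapping boxes and a higher one keeps more of them.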
- class TextPrompt(*, text, is_positive=True)
A text prompt.
- Parameters:
text (str) – the str content of the prompt
is_positive (bool) – whether the prompt is positive, defaults to True
- text: str
the str content of the prompt
- is_positive: bool
whether the prompt is positive, defaults to True
- property type
The constant string ‘text’ for TextPrompt.
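Because the GSAM tasks permit only positive prompts, it can be useful to check a prompt list before building a task. The sketch below assumes nothing beyond the documented text and is_positive attributes. The helper require_positive is hypothetical and not part of the SDK, and SimpleNamespace stands in for TextPrompt so that the snippet runs without the SDK installed.

```python
from types import SimpleNamespace


def require_positive(prompts):
    """Return the prompts as a list, or raise ValueError if any is negative."""
    negatives = [p.text for p in prompts if not p.is_positive]
    if negatives:
        raise ValueError(
            f"GSAM tasks accept only positive prompts, got negative: {negatives}"
        )
    return list(prompts)


# SimpleNamespace mimics the text and is_positive attributes of TextPrompt.
prompts = [SimpleNamespace(text="iron man", is_positive=True)]
checked = require_positive(prompts)
print(len(checked))  # 1
```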
- class TaskResult(*, mask_url, objects)
The task result of the GSAM tasks.
- Parameters:
mask_url (str) – the URL of an image with the masks of all objects drawn on it
objects (List[GSAMObject]) – a list of detected GSAMObject objects
- mask_url: str
the URL of an image with the masks of all objects drawn on it
- objects: List[GSAMObject]
a list of detected GSAMObject objects
- class GSAMObject(*, category, score, bbox)
The object detected by GSAM tasks.
- Parameters:
category (str) – the category name of the object
score (float) – the prediction score of the object
bbox (List[float]) – the bbox of the object, [xmin, ymin, xmax, ymax]
- category: str
the category name of the object
- score: float
the prediction score of the object
- bbox: List[float]
the bbox of the object, [xmin, ymin, xmax, ymax]
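Some drawing and annotation tools expect boxes as [x, y, width, height] rather than the corner format. The following sketch converts a bbox in the documented [xmin, ymin, xmax, ymax] layout and computes its area. The helper names bbox_to_xywh and bbox_area are illustrative and not part of the SDK.

```python
def bbox_to_xywh(bbox):
    """Convert [xmin, ymin, xmax, ymax] to [x, y, width, height]."""
    xmin, ymin, xmax, ymax = bbox
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def bbox_area(bbox):
    """Area of a [xmin, ymin, xmax, ymax] box, in squared pixels."""
    _, _, width, height = bbox_to_xywh(bbox)
    return width * height


print(bbox_to_xywh([10, 20, 110, 220]))  # [10, 20, 100, 200]
print(bbox_area([10, 20, 110, 220]))     # 20000
```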