Reading Note with ChatGPT: Segment Anything (Pre-training, Zero-shot Transfer, Related tasks, Discussion)
We adapt this method from interactive segmentation [109, 70], although unlike interactive segmentation whose aim is to eventually predict a valid mask after enough user input, our aim is to always predict a valid mask for any prompt even when the prompt is ambiguous.
May 5
--
Zero-shot transfer. Intuitively, our pre-training task endows the model with the ability to respond appropriately to any prompt at inference time, and thus downstream tasks can be solved by engineering appropriate prompts. For example, if one has a bounding box detector for cats, cat instance segmentation can be solved by providing the detector’s box output as a prompt to our model. In general, a wide array of practical segmentation tasks can be cast as prompting.
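As a rough sketch of this box-as-prompt idea using the released segment_anything package (the checkpoint path, image file, and box coordinates below are placeholders, and the "cat detector" is assumed to exist upstream):

```python
# Sketch: feeding a detector's box output to SAM as a prompt.
# Assumes the segment-anything package and a downloaded ViT-H checkpoint;
# paths, the image, and the box coordinates are illustrative only.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("cat.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # compute the image embedding once

# Box from a hypothetical cat detector, in XYXY pixel coordinates.
cat_box = np.array([120, 80, 480, 400])

masks, scores, _ = predictor.predict(
    box=cat_box,
    multimask_output=False,  # one mask per box prompt
)
```

For a more ambiguous prompt such as a single point, setting multimask_output=True would instead return several candidate masks with scores, which is how the "always predict a valid mask" requirement quoted above plays out in practice.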