Select anything in a photo with SAM by Meta | Turtles AI
DukeRem
Meta, the parent company of Facebook, has just released an enticing scientific paper about the Segment Anything Model (SAM), a new kid on the block based on AI that can recognize, select, and isolate any object in an image.
The paper introduces the Segment Anything (SA) project, which offers a new task, model, and dataset for image segmentation. Using an efficient model in a data collection loop, the SA team built the largest segmentation dataset to date, containing over 1 billion masks on 11 million licensed and privacy-respecting images. The model is designed to be promptable, so it can transfer zero-shot to new image distributions and tasks. SA's capabilities have been evaluated on numerous tasks, and its zero-shot performance is impressive, often competing with or surpassing prior fully supervised results.
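To make "promptable" concrete, here is a minimal sketch of the shape of the task: an image and a prompt go in, a mask comes out. It does not use SAM. A plain flood fill stands in for the model, and the names `segment_from_point`, `image`, and `point` are hypothetical, chosen only for illustration.

```python
from collections import deque


def segment_from_point(image, point):
    """Toy stand-in for a promptable segmenter (NOT SAM itself).

    image: list of rows of integer pixel values.
    point: (row, col) prompt marking the object to select.
    Returns a same-sized 0/1 mask covering the 4-connected region of
    pixels that share the prompted pixel's value.
    """
    rows, cols = len(image), len(image[0])
    r0, c0 = point
    target = image[r0][c0]
    mask = [[0] * cols for _ in range(rows)]
    mask[r0][c0] = 1
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and image[nr][nc] == target:
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask


# A tiny 3x4 "image": each integer is a pixel value.
image = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 0],
]

# Prompting with a point inside the block of 2s selects only that block.
mask = segment_from_point(image, (1, 1))
```

The same image with a different prompt yields a different mask, and that is the idea behind a promptable interface. A real model would replace the flood fill with learned predictions.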
The SA team is releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) on their website, https://segment-anything.com, to encourage research into foundation models for computer vision. Large language models pre-trained on web-scale datasets have transformed natural language processing with strong zero-shot and few-shot generalization. Similarly, SA seeks to develop a foundation model for image segmentation by pre-training on a broad dataset using a promptable model to solve downstream segmentation problems on new data distributions.
The success of the SA project depends on three components: task, model, and data. To address them, the SA team defines a promptable segmentation task, develops a model architecture that supports flexible prompting, and builds a diverse, large-scale source of data by using an efficient model to assist in data collection. This "data engine" iterates between two steps: the model assists annotation, and the newly collected data is then used to improve the model. For this to work, the model has to be fast enough for interactive use.
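The data engine loop described above can be sketched as a toy simulation. This is an illustration under heavy assumptions, not the paper's procedure. Object labels stand in for masks, the "model" simply remembers labels it has seen, and `propose_masks`, `run_data_engine`, and the sample batches are all hypothetical.

```python
def propose_masks(known_labels, image_objects):
    """Toy 'model': proposes a mask for each object whose kind it already knows."""
    return [obj for obj in image_objects if obj in known_labels]


def run_data_engine(batches):
    """Alternate model-assisted annotation with retraining, one batch per round.

    batches: list of batches; each batch is a list of images, and each image
             is a list of object labels (stand-ins for real masks).
    Returns (dataset, manual_counts): the collected masks per image, and the
    number of masks the annotator had to add by hand in each round.
    """
    known_labels = set()   # what the toy model has learned so far
    dataset = []
    manual_counts = []
    for batch in batches:
        manual = 0
        for objects in batch:
            proposals = propose_masks(known_labels, objects)
            # Hypothetical annotator: keeps the proposals, adds what is missing.
            missing = [obj for obj in objects if obj not in proposals]
            manual += len(missing)
            dataset.append(proposals + missing)
        # "Retrain": the model now knows every label collected so far.
        known_labels = {obj for masks in dataset for obj in masks}
        manual_counts.append(manual)
    return dataset, manual_counts


batches = [
    [["cat", "sofa"], ["cat", "dog"]],
    [["dog", "sofa", "lamp"], ["cat", "lamp"]],
]
dataset, manual_counts = run_data_engine(batches)
print(manual_counts)  # manual annotations needed per round
```

In this toy run the annotator adds 4 masks by hand in the first round and only 2 in the second, because the retrained model already proposes the objects it has seen.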
While much progress has been made on vision and language encoders, computer vision encompasses a wide range of problems beyond this scope, and for many of these, abundant training data does not exist. The SA project addresses this gap by introducing a new model, dataset, and task for image segmentation to encourage research into foundation models for computer vision.