Multimodal Four-dimensional Panoptic Segmentation

Abstract

A method implements multimodal four-dimensional panoptic segmentation. The method includes receiving a set of images and a set of point clouds, and executing an image encoder model on the set of images to extract a set of image feature maps. The method further includes executing a point voxel encoder model on the set of image feature maps and the set of point clouds to extract a set of voxel features, a set of image features, and a set of point features. A panoptic decoder model is then executed on the set of voxel features, the set of image features, the set of point features, and a set of queries to generate a semantic mask and a track mask. Finally, the method includes performing an action responsive to at least one of the semantic mask and the track mask.
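
The data flow described in the abstract can be sketched as three chained stages. This is only an illustrative wiring under stated assumptions, not the patented implementation: the function `run_pipeline`, the `PanopticOutput` container, and the three callables passed in are hypothetical names, and the abstract does not specify tensor shapes, model architectures, or the form of the queries.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass
class PanopticOutput:
    """Hypothetical container for the two masks named in the abstract."""
    semantic_mask: Any
    track_mask: Any


def run_pipeline(
    images: Sequence[Any],
    point_clouds: Sequence[Any],
    queries: Sequence[Any],
    image_encoder: Callable[[Sequence[Any]], Any],
    point_voxel_encoder: Callable[[Any, Sequence[Any]], Tuple[Any, Any, Any]],
    panoptic_decoder: Callable[[Any, Any, Any, Sequence[Any]], Tuple[Any, Any]],
) -> PanopticOutput:
    """Chain the three models in the order the abstract lists them.

    All three models are supplied by the caller (assumption: each is a
    plain callable); this function only fixes the order of the stages
    and which outputs feed which inputs.
    """
    # Stage 1: the image encoder consumes only the images.
    image_feature_maps = image_encoder(images)

    # Stage 2: the point voxel encoder fuses image feature maps with the
    # point clouds and returns voxel, image, and point features.
    voxel_features, image_features, point_features = point_voxel_encoder(
        image_feature_maps, point_clouds
    )

    # Stage 3: the panoptic decoder combines all three feature sets with
    # the queries and produces the semantic mask and the track mask.
    semantic_mask, track_mask = panoptic_decoder(
        voxel_features, image_features, point_features, queries
    )
    return PanopticOutput(semantic_mask=semantic_mask, track_mask=track_mask)
```

A caller would then act on `semantic_mask`, `track_mask`, or both, which corresponds to the final step of the method; what that action is falls outside this sketch.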

Type
Publication
In US Patent App. 18/736,525
Ali Athar
Applied Scientist