Object Clustering with Dirichlet Process Mixture Model for Data Association in Monocular SLAM


Semantic SLAM with a monocular camera is particularly attractive because it is simple to deploy and inexpensive. The data association problem, which assigns a unique identity to each object observed across multiple frames, plays a fundamental role in semantic SLAM. Prevalent earlier methods, which mainly focus on associating geometric keypoints, are no longer suitable for this task. Naive methods that rely on object distance or 2D/3D Intersection over Union (IoU) are also vulnerable to occlusion. In this paper, we propose a novel data association method for cuboid landmarks based on the Dirichlet Process Mixture Model. By jointly considering object class, position, and size, our method performs data association robustly. We evaluated our method on simulated datasets, on the public KITTI benchmark, and on a real robot in an office environment. Experimental results show that our method not only associates cuboids robustly but also achieves state-of-the-art pose estimation accuracy among monocular SLAM systems.

IEEE Transactions on Industrial Electronics