Muhammet Esat Kalfaoğlu, Multi-View Multimodal BEV Perception for Centerline-Centric Road Topology Understanding with Transformer Decoders
This thesis studies multi-view, multimodal bird's-eye-view (BEV) perception for centerline-centric road topology understanding in autonomous driving. It focuses on improving centerline detection within a transformer-decoder framework and examines how those gains propagate to topology reasoning. The work proceeds in three stages: (1) mask-based centerline prediction with directional supervision and mask-Bezier fusion; (2) Bezier-driven decoder attention, realized through multi-point and Bezier deformable attention; and (3) geographically disjoint evaluation together with long-range multimodal evaluation. Experiments on OpenLane-V2 and OpenLane-V1 show strong camera-only and fused camera-LiDAR performance, supporting state-of-the-art road topology understanding under consistent protocols.
Date: 16.04.2026 / 14:00 Place: A-212









