Long-Term Temporal Hierarchical Fusion Bird's-Eye View Perception Method Based on Multiple Position Encodings / Wuhan University of Technology, School of Automotive Engineering

SAE Technical Papers (1906-current)

Format:
Book
Conference/Event
Author/Creator:
Chen, Pengyu, author.
Contributor:
Chen, Zhenwei
Wei, Xiaoxu
Conference Name:
SAE 2025 Intelligent and Connected Vehicles Symposium (2025-09-19 : Shanghai, China)
Language:
English
Physical Description:
1 online resource
Place of Publication:
Warrendale, PA : SAE International, 2025
Summary:
With the rapid development of autonomous driving technology, environmental perception, as its core module, has attracted wide attention. In particular, purely visual bird's-eye-view (BEV) 3D detection has become a research hotspot due to its high spatial resolution and strong semantic recognition ability in specific scenarios. Existing methods mainly use a Transformer encoder to perform position encoding in the BEV domain to achieve the 3D perspective transformation, but they often fail to fully exploit the potential value of multi-perspective image information. To address this challenge, this paper proposes an improved Transformer-based visual BEV vehicle perception method that enhances perception performance by deeply fusing BEV-domain and image-domain information. An innovative multi-perspective position encoding mechanism is designed that decouples the camera parameters to learn the image-to-3D-space mapping more efficiently, and a cyclic interaction attention mechanism is introduced to strengthen the fine-grained association and fusion of pixel-level features, effectively improving their discriminability. In addition, to handle challenges such as target occlusion in dynamic scenes, the method proposes a long-term temporal perception framework that fuses multi-frame temporal information and designs a cross-time guidance module, significantly improving the robustness of target localization by injecting historical geometric constraints. Experiments on the nuScenes dataset verify the effectiveness of the method: it achieves excellent performance in both spatial perception accuracy and temporal modeling capability, providing an innovative and practical solution for autonomous driving environmental perception.
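The abstract gives no implementation details for the multi-perspective position encoding, so the sketch below is only an illustration of the general idea of decoupling camera parameters from the learned encoder: pixel locations are back-projected into world-space ray directions using the intrinsics and extrinsics, and only the geometry-normalised rays are fed to a (here, fixed sinusoidal) position encoder. All function names and the sinusoidal encoder are assumptions for illustration, not the paper's actual design.

```python
import numpy as np

def pixel_rays(K, R, h, w):
    """Back-project a grid of pixel centres into unit world-space ray
    directions using the camera intrinsics K (3x3) and the world-to-camera
    rotation R (3x3). Because the camera geometry is applied here, a
    downstream learned encoder never needs to absorb per-camera parameters.
    """
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # homogeneous pixels, (h, w, 3)
    cam = pix @ np.linalg.inv(K).T                     # rays in the camera frame
    world = cam @ R                                    # rotate into the world frame
    return world / np.linalg.norm(world, axis=-1, keepdims=True)

def sinusoidal_encoding(x, n_freqs=4):
    """Plain sinusoidal position encoding of the ray directions:
    each of the 3 components is expanded into sin/cos at n_freqs octaves,
    giving a (..., 3 * 2 * n_freqs) feature vector."""
    freqs = 2.0 ** np.arange(n_freqs)                  # 1, 2, 4, ...
    ang = x[..., None] * freqs                         # (..., 3, n_freqs)
    enc = np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)
```

For example, with a 4x6 pixel grid the rays have shape `(4, 6, 3)` and, with `n_freqs=4`, the encoding has shape `(4, 6, 24)`; per-view encodings like these would then be attached to the image features before the Transformer attention stages.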
Notes:
Vendor supplied data
Publisher Number:
2025-01-7307
Access Restriction:
Restricted for use by site license

