ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting

Mar 28, 2025·
Ruifeng Luo
Equal contribution
,
Zhengjie Liu
Equal contribution
,
Tianxiao Cheng
,
Jie Wang
,
Tongjie Wang
,
Xingguang Wei
,
Haomin Wang
,
Yanpeng Li
,
Fu Chai
,
Fei Cheng
,
Shenglong Ye
,
Wenhai Wang
,
Yanting Zhang
,
Yu Qiao
,
Hongjie Zhang
,
Xianzhong Zhao
Overview of DPSS
Abstract
Recognizing symbols in architectural CAD drawings is critical for various advanced engineering applications. In this paper, we propose a novel CAD data annotation engine that leverages intrinsic attributes from systematically archived CAD drawings to automatically generate high-quality annotations, thus significantly reducing manual labeling effort. Using this engine, we construct ArchCAD-400K, a large-scale CAD dataset consisting of 413,062 chunks from 5,538 highly standardized drawings, making it over 26 times larger than the largest existing CAD dataset. ArchCAD-400K offers greater drawing diversity, a broader set of categories, and line-grained annotations. Furthermore, we present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS). It incorporates an adaptive fusion module that enhances primitive features with complementary image features, achieving state-of-the-art performance and improved robustness. Extensive experiments validate the effectiveness of DPSS, demonstrating the value of ArchCAD-400K and its potential to drive innovation in architectural design and construction.
Type
Publication
In Neural Information Processing Systems 2025