Abstract:
OpenCV-based image processing deployed on PCs suffers from complex deployment environments, high computational overhead, and insufficient real-time performance. To address these problems, this study proposes a hardware acceleration solution based on the ZYNQ platform. The traditional template matching algorithm is extended with a multi-scale template pyramid to improve scale adaptability, and the C++ implementation is packaged as a synthesizable IP core using Vivado HLS. The hardware architecture integrates multimedia transceiver chips and couples the processing system (PS) with the programmable logic (PL) over the AXI bus; bare-metal drivers provide real-time 1080p video processing and display. Experimental results show that the system processes a single frame in 6.562 ms with logic resource utilization between 12% and 24%, roughly 30 times faster than the PC-based implementation (196 ms per frame). The processed video is output over HDMI with the matched regions clearly marked. The solution resolves the real-time bottleneck of conventional OpenCV deployment in embedded systems and provides a scalable implementation path for hardware acceleration of complex image processing algorithms.
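The multi-scale template matching idea summarized above can be illustrated with a minimal software sketch. The abstract does not state the similarity measure, the scale factors, or the resizing method, so the code below assumes mean absolute difference, a caller-supplied list of scales, and nearest-neighbour resizing. The names `Image`, `Match`, `resizeNearest`, and `matchMultiScale` are hypothetical and are not taken from the paper's HLS IP core.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Row-major 8-bit grayscale image (hypothetical helper type for this sketch).
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;
    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Best match found so far: top-left corner, matched size, and score.
struct Match {
    int x = -1;
    int y = -1;
    int width = 0;
    int height = 0;
    double score = std::numeric_limits<double>::max();  // lower is better
};

// Nearest-neighbour resize, used here to build each level of the template pyramid.
Image resizeNearest(const Image& src, int newW, int newH) {
    Image dst;
    dst.width = newW;
    dst.height = newH;
    dst.pixels.resize(static_cast<std::size_t>(newW) * newH);
    for (int y = 0; y < newH; ++y) {
        const int sy = y * src.height / newH;
        for (int x = 0; x < newW; ++x) {
            const int sx = x * src.width / newW;
            dst.pixels[static_cast<std::size_t>(y) * newW + x] = src.at(sx, sy);
        }
    }
    return dst;
}

// Slide every scaled copy of the template over the frame and keep the window
// with the lowest mean absolute difference (an assumed similarity measure).
Match matchMultiScale(const Image& frame, const Image& tmpl,
                      const std::vector<double>& scales) {
    Match best;
    for (double s : scales) {
        const int tw = static_cast<int>(tmpl.width * s);
        const int th = static_cast<int>(tmpl.height * s);
        // Skip scales that produce an empty template or one larger than the frame.
        if (tw < 1 || th < 1 || tw > frame.width || th > frame.height) continue;
        const Image scaled = resizeNearest(tmpl, tw, th);
        for (int y = 0; y + th <= frame.height; ++y) {
            for (int x = 0; x + tw <= frame.width; ++x) {
                long long sad = 0;  // sum of absolute differences for this window
                for (int j = 0; j < th; ++j) {
                    for (int i = 0; i < tw; ++i) {
                        sad += std::abs(static_cast<int>(frame.at(x + i, y + j)) -
                                        static_cast<int>(scaled.at(i, j)));
                    }
                }
                const double score = static_cast<double>(sad) / (tw * th);
                if (score < best.score) {
                    best.x = x;
                    best.y = y;
                    best.width = tw;
                    best.height = th;
                    best.score = score;
                }
            }
        }
    }
    return best;
}
```

Normalizing the score by the window area keeps results comparable across pyramid levels, which a raw sum of absolute differences would not. The nested loops are written for readability only; a hardware version would typically stream pixels through line buffers instead of re-reading each window.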