A YOLO Analysis for Vehicle Recognition - Detection Improvements, Cropping Strategies and Cascading Architecture
Abstract
With the development of high-resolution cameras and enhanced storage devices, image datasets are increasingly captured at higher spatial resolutions. Many modern object detection architectures, such as the YOLO family, typically operate at a 640x640 input resolution. When training on high-resolution images, rescaling to this input size can drastically shrink the region of interest, especially for small or medium-sized objects far from the camera. To analyze the impact of rescaling, a six-vehicle dataset was constructed from high-resolution images. An automated preprocessing pipeline was built that crops each car region, enforcing a minimum size of 640x640 by symmetrically expanding the margins. A refactoring process was applied to the new set of images, resulting in a new dataset centered on the region of interest. A YOLOv10 model was trained on each dataset, with the cropped dataset achieving a higher mAP50 (0.995) than the uncropped version (0.987). A cascading multilevel model employing both YOLO models was further proposed: the first model analyzes the initial high-resolution image at a coarser level and crops the vehicle area, which is then sent to the second YOLO model for fine-level analysis. This architecture highlights the importance of preserving fine, detailed features that would otherwise be lost during scaling.
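The cropping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `expand_crop`, the pixel-coordinate box format `(x1, y1, x2, y2)`, and the rule of shifting the window back inside the image when it crosses a border are all assumptions made for illustration. The sketch only shows one plausible way to enforce a minimum 640x640 crop by expanding the margins symmetrically around a detected vehicle.

```python
def expand_crop(box, image_size, min_size=640):
    """Return a crop window (x1, y1, x2, y2) around `box`.

    Hypothetical helper: each side of the window is at least `min_size`
    pixels (or the full image side, if the image is smaller), obtained by
    growing the box equally on both sides.
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size

    def expand(lo, hi, limit):
        # Target length: at least min_size, never larger than the image.
        target = min(max(hi - lo, min_size), limit)
        deficit = target - (hi - lo)
        # Symmetric expansion: half of the missing pixels on each side.
        lo -= deficit // 2
        hi = lo + target
        # Assumed border policy: shift the window back inside the image.
        if lo < 0:
            lo, hi = 0, target
        elif hi > limit:
            lo, hi = limit - target, limit
        return lo, hi

    nx1, nx2 = expand(x1, x2, img_w)
    ny1, ny2 = expand(y1, y2, img_h)
    return nx1, ny1, nx2, ny2


if __name__ == "__main__":
    # A 200x100 px vehicle box in an assumed 3840x2160 frame.
    print(expand_crop((1000, 500, 1200, 600), (3840, 2160)))
```

For scale, under the assumed 3840x2160 frame, resizing the whole image to a width of 640 shrinks everything by a factor of 6, so the 200-pixel-wide vehicle would occupy only about 33 pixels. Cropping a 640x640 window around it instead keeps the vehicle at its original pixel size.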
DOI: https://doi.org/10.52846/ami.v53i1.2386