Semantic Editing of Traffic Near-Miss and Accident Dataset Using Tune-A-Video

Eka Alifia Kusnanti, Chastine Fatichah, Hilmil Muchtar Aditya Pradana

Abstract


Developing effective traffic monitoring systems for accident detection relies heavily on high-quality, diverse datasets. Many existing approaches focus on detecting anomalies in traffic videos, yet they often fail to account for how varying environmental conditions, such as time of day, weather, or lighting, might influence the occurrence of near-misses or accidents. In this study, we explore the potential of Tune-A-Video for applying semantic editing to an existing traffic near-miss and accident dataset. By modifying the visual environment, for example changing the time of day, weather, or lighting, we aim to generate realistic footage variations without altering core events such as near-miss incidents or accidents. This method enhances the dataset with more varied and realistic traffic conditions, improving its representativeness of real-world scenarios. The primary objective is not to create a new dataset but to assess the impact of semantic editing on the dataset’s diversity and, in turn, on model performance. The results show that semantic editing with Tune-A-Video can enrich the dataset, making it more suitable for training machine learning models. This approach helps improve the accuracy and robustness of computer vision models, particularly for traffic monitoring and accident detection applications, and offers a promising tool for traffic safety systems.
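
To make the editing setup concrete, the sketch below illustrates how a single clip from the dataset could, in principle, be re-rendered under a different environmental condition after one-shot tuning, following the publicly released Tune-A-Video reference implementation (github.com/showlab/Tune-A-Video). The module names and call arguments are taken from that repository's documented usage; the checkpoint paths and the example prompt are hypothetical placeholders, not the exact configuration used in this work.

# Minimal inference sketch, assuming a Tune-A-Video model has already been
# fine-tuned on one near-miss clip (one-shot tuning), following the public
# reference implementation at github.com/showlab/Tune-A-Video.
# All paths and the prompt below are hypothetical placeholders.
import torch
from tuneavideo.pipelines.pipeline_tuneavideo import TuneAVideoPipeline
from tuneavideo.models.unet import UNet3DConditionModel
from tuneavideo.util import save_videos_grid

pretrained_model_path = "./checkpoints/stable-diffusion-v1-4"  # base image diffusion model
tuned_model_path = "./outputs/near-miss-clip-01"               # model tuned on one source clip

# Load the inflated 3D U-Net produced by one-shot tuning and build the pipeline.
unet = UNet3DConditionModel.from_pretrained(
    tuned_model_path, subfolder="unet", torch_dtype=torch.float16).to("cuda")
pipe = TuneAVideoPipeline.from_pretrained(
    pretrained_model_path, unet=unet, torch_dtype=torch.float16).to("cuda")

# Only the environment is changed in the prompt; the near-miss event itself is
# preserved by starting from DDIM-inverted latents of the source clip,
# as in the reference repository.
prompt = "a car narrowly avoiding a motorcycle at an intersection, at night, in heavy rain"
ddim_inv_latent = torch.load(
    f"{tuned_model_path}/inv_latents/ddim_latent-500.pt").to(torch.float16)

video = pipe(prompt,
             latents=ddim_inv_latent,
             video_length=24, height=512, width=512,
             num_inference_steps=50, guidance_scale=12.5).videos

save_videos_grid(video, "./edited_near_miss_night_rain.gif")

Under these assumptions, re-running the same tuned model with prompts such as "at sunset" or "in dense fog" would yield the environment variations of the same event discussed above.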

Keywords


Accident; Near-Miss; Semantic Editing; Traffic Dataset; Tune-A-Video


References


Apostolovski N, Trajanovski N, Chavdar M, Kartalov T, Gerazov B, Ivanovski Z. Deep Learning Based Multimodal Information Fusion for Near-Miss Event Detection in Intelligent Traffic Monitoring Systems. In: Complex Systems: Spanning Control and Computational Cybernetics: Applications: Dedicated to Professor Georgi M. Dimirovski on his Anniversary. Springer; 2022. p. 357–388.

Niu Y, Fan Y, Ju X. Critical review on data-driven approaches for learning from accidents: comparative analysis and future research. Safety Science 2024;171:106381.

Yang G, Sarkar A, Ridgeway C, Thapa S, Jain S, Miller A. Using Artificial Intelligence/Machine Learning Tools to Analyze Safety, Road Scene, Near-Misses and Crashes. National Surface Transportation Safety Center for Excellence; 2024.

Sohail A, Cheema MA, Ali ME, Toosi AN, Rakha HA. Data-driven approaches for road safety: A comprehensive systematic literature review. Safety Science 2023;158:105949.

Azfar T, Li J, Yu H, Cheu RL, Lv Y, Ke R. Deep learning-based computer vision methods for complex traffic environments perception: A review. Data Science for Transportation 2024;6(1):1–27.

Alomar K, Aysel HI, Cai X. Data augmentation in classification and segmentation: A survey and new strategies. Journal of Imaging 2023;9(2):46.

Abdel-Aty M, Wang Z, Zheng O, Abdelraouf A. Advances and applications of computer vision techniques in vehicle trajectory generation and surrogate traffic safety indicators. Accident Analysis & Prevention 2023;191:107191.

Patel AS, Vyas R, Vyas O, Ojha M. A study on video semantics; overview, challenges, and applications. Multimedia Tools and Applications 2022;81(5):6849–6897.

Gao Z, Chen X, Xu J, Yu R, Zhang H, Yang J. Semantically-Enhanced Feature Extraction with CLIP and Transformer Networks for Driver Fatigue Detection. Sensors 2024;24(24):7948.

Muhammad K, Hussain T, Ullah H, Del Ser J, Rezaei M, Kumar N, et al. Vision-based semantic segmentation in scene understanding for autonomous driving: Recent achievements, challenges, and outlooks. IEEE Transactions on Intelligent Transportation Systems 2022;23(12):22694–22715.

Gu J, Fang Y, Skorokhodov I, Wonka P, Du X, Tulyakov S, et al. VIA: A Spatiotemporal Video Adaptation Framework for Global and Local Video Editing. arXiv preprint arXiv:2406.12831 2024.

Wu JZ, Ge Y, Wang X, Lei SW, Gu Y, Shi Y, et al. Tune-A-Video: One-shot tuning of image diffusion models for text-to-video generation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023. p. 7623–7633.

Pradana H, Dao MS, Zettsu K. Augmenting ego-vehicle for traffic near-miss and accident classification dataset using manipulating conditional style translation. In: 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA). Sydney, Australia: IEEE; 2022. p. 1–8.

Uhlig S, Alkhasli I, Schubert F, Tschöpe C, Wolff M. A review of synthetic and augmented training data for machine learning in ultrasonic non-destructive evaluation. Ultrasonics 2023 September;134:107041.

Chen D, Zhu M, Yang H, Wang X, Wang Y. Data-driven Traffic Simulation: A Comprehensive Review. IEEE Transactions on Intelligent Vehicles 2024;9(4):4730–4748. https://ieeexplore.ieee.org/document/10440492.

Razi A, Chen X, Li H, Wang H, Russo B, Chen Y, et al. Deep learning serves traffic safety analysis: A forward-looking review. IET Intelligent Transport Systems 2023;17(1):22–71.

Rocky A, Wu QJ, Zhang W. Review of Accident Detection Methods Using Dashcam Videos for Autonomous Driving Vehicles. IEEE Transactions on Intelligent Transportation Systems 2024;25(8):8356–8374.

Garcea F, Serra A, Lamberti F, Morra L. Data augmentation for medical imaging: A systematic literature review. Computers in Biology and Medicine 2023;152:106391.

Eze C, Crick C. Learning by Watching: A Review of Video-based Learning Approaches for Robot Manipulation. arXiv preprint arXiv:2402.07127 2024;p. 1–26.

Rabbi ABK, Jeelani I. AI integration in construction safety: Current state, challenges, and future opportunities in text, vision, and audio based applications. Automation in Construction 2024;164:105443.

Wu R, Yang T, Sun L, Zhang Z, Li S, Zhang L. Seesr: Towards semantics-aware real-world image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024. p. 25456–25467.

Mustapha A, Abdul-Rani AM, Saad N, Mustapha M. Advancements in Traffic Simulation for Enhanced Road Safety: A Review. Simulation Modelling Practice and Theory 2024;p. 103017.

Sun W, Tu RC, Liao J, Tao D. Diffusion model-based video editing: A survey. arXiv preprint arXiv:2407.07111 2024;p. 1–23.

Zhao L, Zhang Z, Nie X, Liu L, Liu S. Cross-Attention and Seamless Replacement of Latent Prompts for High-Definition Image-Driven Video Editing. Electronics 2023;13(1):1–14.

Testolina P, Barbato F, Michieli U, Giordani M, Zanuttigh P, Zorzi M. Selma: Semantic large-scale multimodal acquisitions in variable weather, daytime and viewpoints. IEEE Transactions on Intelligent Transportation Systems 2023;24(7):7012–7024.

Suryanto N, Adiputra AA, Kadiptya AY, Le TTH, Pratama D, Kim Y, et al. Cityscape-Adverse: Benchmarking Robustness of Semantic Segmentation with Realistic Scene Modifications via Diffusion-Based Image Editing. arXiv preprint arXiv:2411.00425 2024;p. 1–19.

Fang J, Yan D, Qiao J, Xue J, Yu H. DADA: Driver attention prediction in driving accident scenarios. IEEE Transactions on Intelligent Transportation Systems 2021;23(6):4959–4971.




DOI: http://dx.doi.org/10.12962/j20882033.v35i3.22186

