Semantic Editing of Traffic Near-Miss and Accident Dataset Using Tune-A-Video

Eka Alifia Kusnanti, Chastine Fatichah, Muhamad Hilmil Pradana

Abstract


Developing effective traffic monitoring systems for accident detection relies heavily on high-quality, diverse datasets. Many existing approaches focus on detecting anomalies in traffic videos, yet they often fail to account for how varying environmental conditions, such as time of day, weather, or lighting, influence the occurrence of near-misses or accidents. In this study, we explore the potential of Tune-A-Video to apply semantic editing to an existing traffic near-miss and accident dataset. By modifying the visual environment, for example changing the time of day, weather, or lighting, we generate realistic footage variations without altering the core events, such as near-miss incidents or accidents. This enriches the dataset with more varied and realistic traffic conditions, improving its representativeness of real-world scenarios. The primary objective is not to create a new dataset but to assess the impact of semantic editing on the dataset's diversity and its effect on model performance. The results show that semantic editing with Tune-A-Video can enrich the dataset and make it more suitable for training machine learning models. This approach helps improve the accuracy and robustness of computer vision models for traffic monitoring and accident detection, offering a promising tool for traffic safety systems.
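As a hedged illustration of the editing step described above, the sketch below follows the inference interface of the public Tune-A-Video reference implementation (showlab/Tune-A-Video): a UNet fine-tuned on a single source clip is paired with a Stable Diffusion backbone, and the DDIM-inverted latents of that clip are denoised under an environment-changing prompt, so the near-miss event is preserved while the scene conditions change. The checkpoint paths, latent file, and prompt are illustrative placeholders and are not taken from this study.

import torch
from tuneavideo.models.unet import UNet3DConditionModel
from tuneavideo.pipelines.pipeline_tuneavideo import TuneAVideoPipeline
from tuneavideo.util import save_videos_grid

# Placeholder paths: a Stable Diffusion backbone and a UNet fine-tuned
# (one-shot) on a single source near-miss clip.
sd_path = "CompVis/stable-diffusion-v1-4"
tuned_path = "./outputs/near-miss-clip"

unet = UNet3DConditionModel.from_pretrained(
    tuned_path, subfolder="unet", torch_dtype=torch.float16
).to("cuda")
pipe = TuneAVideoPipeline.from_pretrained(
    sd_path, unet=unet, torch_dtype=torch.float16
).to("cuda")

# DDIM-inverted latents of the source clip keep its motion and layout,
# so the near-miss event itself is retained while the environment is edited.
src_latents = torch.load(f"{tuned_path}/inv_latents/ddim_latent-500.pt").to(torch.float16)

# Environment-changing prompt: same scene and event, different conditions.
prompt = "a car narrowly avoids a pedestrian at night in heavy rain"
video = pipe(
    prompt,
    latents=src_latents,
    video_length=24,
    height=512,
    width=512,
    num_inference_steps=50,
    guidance_scale=12.5,
).videos

save_videos_grid(video, "./edited_near_miss.gif")

Re-running the same call with prompts such as "at dusk", "in dense fog", or "under harsh midday sunlight" would, under these assumptions, yield multiple environment variants of a single annotated event.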

Keywords


Accident; Near-Miss; Semantic Editing; Traffic Dataset; Tune-A-Video


References


Apostolovski N, Trajanovski N, Chavdar M, Kartalov T, Gerazov B, Ivanovski Z. Deep Learning Based Multimodal Information Fusion for Near-Miss Event Detection in Intelligent Traffic Monitoring Systems. In: Complex Systems: Spanning Control and Computational Cybernetics: Applications: Dedicated to Professor Georgi M. Dimirovski on his Anniversary. Springer; 2022. p. 357–388.
Niu Y, Fan Y, Ju X. Critical review on data-driven approaches for learning from accidents: comparative analysis and future research. Safety Science 2024;171:106381.
Yang G, Sarkar A, Ridgeway C, Thapa S, Jain S, Miller A. Using Artificial Intelligence/Machine Learning Tools to Analyze Safety, Road Scene, Near-Misses and Crashes. National Surface Transportation Safety Center for Excellence; 2024.
Sohail A, Cheema MA, Ali ME, Toosi AN, Rakha HA. Data-driven approaches for road safety: A comprehensive systematic literature review. Safety Science 2023;158:105949.
Azfar T, Li J, Yu H, Cheu RL, Lv Y, Ke R. Deep learning-based computer vision methods for complex traffic environments perception: A review. Data Science for Transportation 2024;6(1):1–27.
Alomar K, Aysel HI, Cai X. Data augmentation in classification and segmentation: A survey and new strategies. Journal of Imaging 2023;9(2):46.
Abdel-Aty M, Wang Z, Zheng O, Abdelraouf A. Advances and applications of computer vision techniques in vehicle trajectory generation and surrogate traffic safety indicators. Accident Analysis & Prevention 2023;191:107191.
Patel AS, Vyas R, Vyas O, Ojha M. A study on video semantics: overview, challenges, and applications. Multimedia Tools and Applications 2022;81(5):6849–6897.
Gao Z, Chen X, Xu J, Yu R, Zhang H, Yang J. Semantically-Enhanced Feature Extraction with CLIP and Transformer Networks for Driver Fatigue Detection. Sensors 2024;24(24):7948.
Muhammad K, Hussain T, Ullah H, Del Ser J, Rezaei M, Kumar N, et al. Vision-based semantic segmentation in scene understanding for autonomous driving: Recent achievements, challenges, and outlooks. IEEE Transactions on Intelligent Transportation Systems 2022;23(12):22694–22715.
Gu J, Fang Y, Skorokhodov I, Wonka P, Du X, Tulyakov S, et al. VIA: A Spatiotemporal Video Adaptation Framework for Global and Local Video Editing. arXiv preprint arXiv:2406.12831; 2024.
Wu JZ, Ge Y, Wang X, Lei SW, Gu Y, Shi Y, et al. Tune-A-Video: One-shot tuning of image diffusion models for text-to-video generation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023. p. 7623–7633.
Pradana H, Dao MS, Zettsu K. Augmenting ego-vehicle for traffic near-miss and accident classification dataset using manipulating conditional style translation. In: 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Sydney, Australia. IEEE; 2022. p. 1–8.
Uhlig S, Alkhasli I, Schubert F, Tschöpe C, Wolff M. A review of synthetic and augmented training data for machine learning in ultrasonic non-destructive evaluation. Ultrasonics 2023;134:107041.
Chen D, Zhu M, Yang H, Wang X, Wang Y. Data-driven Traffic Simulation: A Comprehensive Review. IEEE Transactions on Intelligent Vehicles 2024;9(4):4730–4748. https://ieeexplore.ieee.org/document/10440492.
Razi A, Chen X, Li H, Wang H, Russo B, Chen Y, et al. Deep learning serves traffic safety analysis: A forward-looking review. IET Intelligent Transport Systems 2023;17(1):22–71.
Rocky A, Wu QJ, Zhang W. Review of Accident Detection Methods Using Dashcam Videos for Autonomous Driving Vehicles. IEEE Transactions on Intelligent Transportation Systems 2024;25(8):8356–8374.
Garcea F, Serra A, Lamberti F, Morra L. Data augmentation for medical imaging: A systematic literature review. Computers in Biology and Medicine 2023;152:106391.
Eze C, Crick C. Learning by Watching: A Review of Video-based Learning Approaches for Robot Manipulation. arXiv preprint arXiv:2402.07127; 2024. p. 1–26.
Rabbi ABK, Jeelani I. AI integration in construction safety: Current state, challenges, and future opportunities in text, vision, and audio based applications. Automation in Construction 2024;164:105443.
Wu R, Yang T, Sun L, Zhang Z, Li S, Zhang L. SeeSR: Towards semantics-aware real-world image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024. p. 25456–25467.
Mustapha A, Abdul-Rani AM, Saad N, Mustapha M. Advancements in Traffic Simulation for Enhanced Road Safety: A Review. Simulation Modelling Practice and Theory 2024:103017.
Sun W, Tu RC, Liao J, Tao D. Diffusion model-based video editing: A survey. arXiv preprint arXiv:2407.07111; 2024. p. –23.
Zhao L, Zhang Z, Nie X, Liu L, Liu S. Cross-Attention and Seamless Replacement of Latent Prompts for High-Definition Image-Driven Video Editing. Electronics 2023;13(1):1–14.
Testolina P, Barbato F, Michieli U, Giordani M, Zanuttigh P, Zorzi M. SELMA: Semantic large-scale multimodal acquisitions in variable weather, daytime and viewpoints. IEEE Transactions on Intelligent Transportation Systems 2023;24(7):7012–.
Suryanto N, Adiputra AA, Kadiptya AY, Le TTH, Pratama D, Kim Y, et al. Cityscape-Adverse: Benchmarking Robustness of Semantic Segmentation with Realistic Scene Modifications via Diffusion-Based Image Editing. arXiv preprint arXiv:2411.00425; 2024. p. 1–19.
Fang J, Yan D, Qiao J, Xue J, Yu H. DADA: Driver attention prediction in driving accident scenarios. IEEE Transactions on Intelligent Transportation Systems 2021;23(6):4959–4971.




DOI: http://dx.doi.org/10.12962/j20882033.v35i3.22186




IPTEK Journal of Science and Technology by Lembaga Penelitian dan Pengabdian kepada Masyarakat, ITS is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://iptek.its.ac.id/index.php/jts.