Analysis of Text Mining Clustering on Suara Surabaya Crime Report with DBSCAN Neural Network Autoencoder Algorithm

Grace Lucyana Koesnadi, Muhammad Rizal Anggriawan, Talitha Zuleika, Mochamad Rasyid Aditya Putra, Najwa Khoir Aldawiyah, M Fariz Fadillah Mardianto

Abstract


Criminality, or crime, is behavior that violates the law or runs contrary to applicable values and norms. A high level of criminal behavior in a community significantly affects its social conditions, leading to reduced welfare, unrest, and material losses, and posing a threat to individuals' lives. This study applies text mining to crime report data from Suara Surabaya using the DBSCAN clustering method together with a neural network autoencoder. The autoencoder reduces the dimensionality of the data from an input dimension of 300 to an encoded dimension of 64. Clustering with DBSCAN, with the silhouette coefficient as the selection criterion, produced three clusters, of which cluster 1 accounts for most of the reports. The clustering results reveal important patterns in the complaint reports, and LDA analysis identifies the main topics within each cluster. Cluster 0 contains a diverse set of reports focusing on motorcycle loss, incidents involving homes or property, and people entering homes. Cluster 1 focuses on the loss of vehicles, both cars and motorcycles, with specific details such as vehicle color, plate number, brand, and related transactions or social interactions. Cluster 2 focuses on reports involving interactions with police stations and information on the location of incidents. This text mining approach to community crime report data not only improves the accuracy and efficiency of the analysis but also provides information that can support efforts to handle and prevent crime.
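As an illustration of the clustering step described above, the following is a minimal sketch of DBSCAN with a silhouette check, written in plain Python. It is not the authors' implementation: the autoencoder step is omitted, the two-dimensional toy points stand in for the 64-dimensional encoded vectors, and the values of `eps` and `min_pts` are assumptions chosen only so the toy data separates cleanly. Excluding noise points from the silhouette average is also an assumption of this sketch.

```python
import math

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep growing
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over non-noise points."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != NOISE:
            clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for lab, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                    for other_lab, other in clusters.items() if other_lab != lab)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data (hypothetical): three tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
score = silhouette(points, labels)
print(labels, round(score, 3))
```

In practice, one would run this over a grid of `eps` and `min_pts` values and keep the setting with the highest silhouette coefficient, which is one common way to apply the selection criterion the abstract mentions.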

Keywords


Text Mining Clustering; DBSCAN; Autoencoder Neural Network; Criminality






DOI: http://dx.doi.org/10.12962/j27213862.v7i3.20765





Creative Commons License
Inferensi by Department of Statistics ITS is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://iptek.its.ac.id/index.php/inferensi.

ISSN: 0216-308X

e-ISSN: 2721-3862
