Siamese Long Short-Term Memory for Detecting Conflict of Interest on Scientific Papers

Akhmad Bakhrul Ilmi, Diana Purwitasari, Chastine Fatichah


Citations from other researchers increase an author's credibility. However, citations can be misused to artificially inflate bibliometric indicators such as a researcher's h-index. Researchers may excessively cite their own works, a practice known as self-citation, even when the topics of those references are unrelated to the current article. A further form of misconduct is excessive citation of works by people connected to the researcher, whether coerced or not, which is referred to as conflict of interest (CoI). The proposed method uses a deep learning approach, Siamese Long Short-Term Memory (LSTM), to recognize subject similarity between a scientific article and its references. Standard text similarity measures fail to do so because the contextual relatedness of sentences in the articles has to be learned. Siamese-LSTM learns this contextual relatedness using two identical LSTM networks. The steps of the proposed method are (i) word embedding, to obtain weight values of terms while still considering their semantic relations; (ii) k-means clustering, to generate training data and reduce the time complexity of Siamese-LSTM learning on scientific articles; (iii) learning the Siamese-LSTM weights from the training data to identify the contextual relatedness of sentences; and (iv) calculating the similarity of a scientific article to its references based on Siamese-LSTM. Empirical experiments are used to analyze the similarity values and the possibility of conflict of interest in an article.
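The Siamese comparison in steps (iii) and (iv) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the names `TinyLSTM` and `siamese_similarity` are hypothetical, the weights are random and untrained, the toy vectors stand in for real word embeddings, and the score exp(-L1 distance) is one common choice for Siamese recurrent models, not something the abstract specifies. The "two identical LSTM" networks are modelled here as a single encoder applied to both inputs, which amounts to two networks with shared weights.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM encoder with random, untrained weights (illustration only)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        cols = input_dim + hidden_dim
        # One weight matrix and bias vector per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.W = {
            g: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(hidden_dim)]
            for g in "ifoc"
        }
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def _gate(self, g, z, activation):
        # Affine transform of the concatenated [input, hidden] vector, then activation.
        return [
            activation(sum(w * v for w, v in zip(row, z)) + bias)
            for row, bias in zip(self.W[g], self.b[g])
        ]

    def encode(self, sequence):
        """Run the LSTM over a sequence of embedding vectors; return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            z = list(x) + h
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def siamese_similarity(encoder, seq_a, seq_b):
    """Encode both sequences with the SAME encoder and score them in (0, 1]."""
    h_a = encoder.encode(seq_a)
    h_b = encoder.encode(seq_b)
    l1_distance = sum(abs(a - b) for a, b in zip(h_a, h_b))
    return math.exp(-l1_distance)  # assumed similarity function


# Toy 3-dimensional "embeddings" for an article sentence and a reference sentence.
encoder = TinyLSTM(input_dim=3, hidden_dim=4)
article = [[0.1, 0.3, -0.2], [0.5, -0.1, 0.0]]
reference = [[0.0, 0.2, -0.1], [0.4, 0.0, 0.1], [0.2, 0.2, 0.2]]
score = siamese_similarity(encoder, article, reference)
print(f"similarity = {score:.3f}")
```

In a trained model the shared weights would be fitted on labelled sentence pairs so that related pairs score close to 1. A low score between an article and one of its references could then flag that citation for closer inspection.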


Citation; Conflict of Interest; Scientific Text; Deep Learning; Similarity; Text Processing









IPTEK Journal of Science and Technology by Lembaga Penelitian dan Pengabdian kepada Masyarakat, ITS is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.