A quantitative approach to gemstone identification using Raman spectroscopy combined with machine learning

Authors: Boteju, W.I.W.Y.; Gunarathna, M.K.A.S.; Fonseka, T.L.M.D.; Peiris, H.N.S.; Silva, L.A.R.; Jayathilaka, N.N.S.S.; Jayasekara, L.A.G.D.; Perera, P.N.; Jayaweera, H.H.E.; Gunewardene, M.S.; Jayawardhana, S.
Date: 2022-07-22
Year: 2022
Citation: Sri Lankan Journal of Physics, 23(1): p. 1-12
URL: https://dl.nsf.gov.lk/handle/1/25616

Abstract: Raman spectroscopy is an ideal technique for gemstone identification due to its nondestructive nature, rapid detection, lack of sample preparation, and ability to analyze interior compositions. Notwithstanding these benefits, most routine gemstone analysis requires complementary techniques to verify accuracy, because it is difficult to match Raman spectra against a known database while ensuring high sensitivity, specificity, and accuracy. This work presents a technique in which computational methods are used to accurately identify gemstones in routine operations. The acquired Raman spectroscopic data are preprocessed using baseline subtraction and signal smoothing for optimal signal extraction and then cross-correlated with a verified database of spectra. The resulting correlation coefficients are then clustered using a K-means algorithm to distinguish the gemstone families. A locally sourced unknown gemstone was found to belong to the quartz family with a sensitivity of 100%, specificity of 98%, and accuracy of 98%. A second technique was also introduced that considers both the cross-correlation and the overlap between the areas under the curves of matched spectra. Both methods converged on the same conclusion and were backed up by three common cluster validation indices, thereby assuring the validity of the identification. The technique was further validated for use with other gemstone families such as beryl, diamond, and corundum.

Keywords: Raman spectroscopy; Machine learning; Gemmology; Clustering; K-means; Correlation coefficient; Cross-correlation
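
The pipeline summarized in the abstract (baseline subtraction, smoothing, correlation against a reference database, then K-means clustering of the correlation coefficients) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the spectra are synthetic Gaussian peaks on an arbitrary index axis, the family names `family_a` and `family_b` are hypothetical, the baseline is a straight line between the end points, the smoothing is a moving average, the cross-correlation is reduced to the zero-lag Pearson coefficient, and K-means is a two-cluster, one-dimensional version written out by hand.

```python
import math
import random


def synthetic_spectrum(peaks, n=400, noise=0.0, tilt=0.0, seed=0):
    """Sum of Gaussian peaks (center, height, width) plus a linear tilt and noise."""
    rng = random.Random(seed)
    return [
        sum(h * math.exp(-((i - c) / w) ** 2) for c, h, w in peaks)
        + tilt * i + rng.gauss(0, noise)
        for i in range(n)
    ]


def subtract_baseline(y):
    """Assumed baseline model: a straight line through the first and last points."""
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [v - (y[0] + slope * i) for i, v in enumerate(y)]


def smooth(y, window=5):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(y)):
        seg = y[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def correlation(a, b):
    """Pearson coefficient, i.e. normalized cross-correlation at zero lag."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0


def kmeans_1d(values, iters=50):
    """Two-cluster K-means on scalars, seeded at min and max; label 1 = high cluster."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [int(abs(v - hi) < abs(v - lo)) for v in values]
        highs = [v for v, lab in zip(values, labels) if lab]
        lows = [v for v, lab in zip(values, labels) if not lab]
        new_lo = sum(lows) / len(lows) if lows else lo
        new_hi = sum(highs) / len(highs) if highs else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return labels


def preprocess(y):
    return smooth(subtract_baseline(y))


# Hypothetical peak sets for two made-up families (index units, not cm^-1).
FAMILY_PEAKS = {
    "family_a": [(100, 1.0, 6), (250, 0.5, 8)],
    "family_b": [(60, 0.8, 6), (320, 1.0, 8)],
}

# Reference database: three noisy spectra per family.
database = [
    (name, preprocess(synthetic_spectrum(peaks, noise=0.01, seed=10 * k + j)))
    for k, (name, peaks) in enumerate(FAMILY_PEAKS.items())
    for j in range(3)
]

# "Unknown" sample: family_a peaks with a sloping background and more noise.
unknown = preprocess(
    synthetic_spectrum(FAMILY_PEAKS["family_a"], noise=0.02, tilt=0.001, seed=99)
)

scores = [(name, correlation(unknown, ref)) for name, ref in database]
labels = kmeans_1d([c for _, c in scores])
matched = {name for (name, _), lab in zip(scores, labels) if lab == 1}
print("families in the high-correlation cluster:", matched)
```

In practice the hand-written pieces would likely be replaced by library routines (for example a polynomial or asymmetric least-squares baseline, a Savitzky-Golay filter, and a general K-means implementation), and the cluster assignments would be compared with known labels to obtain sensitivity, specificity, and accuracy figures of the kind reported in the abstract.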