Abstract: An exhaustive survey of all template molecules reported in the molecular imprinting literature up to September 2009 was carried out. This was achieved by combining an artificial neural network, a simple dictionary, and a rule-based search with a dynamically updated database to identify word patterns that lead to the recognition of template molecules in article titles and abstracts. Mining 3020 articles from the molecular imprinting literature led to the extraction of 776 template molecules. The methodology was shown to be effective in recognizing templates in article titles and achieved a final precision of up to 0.75 once trained on sufficient data, with an overall precision of 0.68. Classification of the extracted templates indicated that the majority were therapeutic drugs. The physicochemical properties of the template molecules were obtained from computational chemistry calculations and further subjected to classification and statistical analysis. To the best of our knowledge, this work constitutes the first use of text mining technology in the field of molecular imprinting and the first exhaustive survey of molecular imprinting templates.
Author keywords: Molecular imprinting, Templates, Text mining, Named entity recognition, Neural network
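
Illustrative sketch: the dictionary and rule-based parts of the search described in the abstract, together with the precision measure it reports, might look roughly like the Python below. This is not the authors' implementation. The dictionary entries, the "<name>-imprinted" rule, the stop-word list, and the function names are all assumptions made for illustration, and the neural network and the dynamically updated database are omitted entirely.

```python
import re

# Hypothetical dictionary of known template names (illustrative entries only).
KNOWN_TEMPLATES = {"theophylline", "atrazine", "propranolol"}

# Hypothetical rule: in "<name>-imprinted", <name> is taken as the template.
IMPRINTED_RULE = re.compile(r"\b([A-Za-z][A-Za-z0-9]*)-imprinted\b", re.IGNORECASE)

# Words that precede "-imprinted" without naming a template (assumed list).
NON_TEMPLATE_WORDS = {"molecularly", "molecular", "surface", "non"}


def extract_templates(title):
    """Return the set of candidate template names found in an article title."""
    found = set()
    lowered = title.lower()

    # Dictionary search: whole-word match against the known names.
    for name in KNOWN_TEMPLATES:
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            found.add(name)

    # Rule-based search: capture the word hyphenated to "imprinted".
    for match in IMPRINTED_RULE.finditer(title):
        candidate = match.group(1).lower()
        if candidate not in NON_TEMPLATE_WORDS:
            found.add(candidate)

    return found


def precision(predicted, gold):
    """Precision = correctly predicted templates / all predicted templates."""
    if not predicted:
        return 0.0
    return len(predicted & gold) / len(predicted)


if __name__ == "__main__":
    title = "Caffeine-imprinted and non-imprinted polymers"
    print(extract_templates(title))
```

A single-word capture of this kind is crude: it would mishandle multi-word or punctuated chemical names, which is one reason a trained classifier would be combined with such rules.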