CSIR-NCL Digital Repository

Computational development of the strategies to explore molecular machines and the molecular space for desired properties using machine learning

dc.contributor.advisor Vanka, K.
dc.contributor.author Ghule, S.
dc.date.accessioned 2022-11-17T14:14:18Z
dc.date.available 2022-11-17T14:14:18Z
dc.date.issued 2022-05-11
dc.identifier.uri http://dspace.ncl.res.in:8080/xmlui/handle/20.500.12252/6165
dc.description Ph. D. Thesis of Siddharth Ghule Division: Physical and Material Chemistry NCL-ID No.: 11865 AcSIR Roll No.: 10CC17A26010 Name of research Guide/co-guide: Dr. Kumar Vanka Names of DAC Members: Paresh Laxmikant Dhepe, Kavita Joshi, Nayana Vaval Date of joining NCL: 24-07-2017 Faculty: Chemical Sciences en
dc.description.abstract For thousands of years, scientific discoveries have played a vital role in the progress of human civilization. The discovery of new materials or new scientific phenomena, or an improved understanding of the known phenomena requires exploration through the space available for a given class of molecules (the molecular space). The typical size of molecular space is estimated to be ~10^60, which is larger than the number of stars in the observable universe (~10^24). Conventional experimental, computational, and algorithmic approaches are inefficient in exploring this vast molecular space. Furthermore, conventional exploration strategies do not take advantage of the large databases available today. On the other hand, machine learning (ML) algorithms can extract hidden knowledge from large datasets. They have shown excellent predictive accuracies in many fields, surpassing the traditional methods. Thus, ML algorithms are promising candidates for developing efficient exploration strategies for the vast molecular space. In this thesis work, we have demonstrated the development of exploration strategies using machine learning algorithms for three different molecular spaces. The first molecular space investigated in this thesis includes battery materials based on phenazine molecules. We have developed an accurate hybrid DFT-ML approach to explore this molecular space. We showed that 2D molecular features are most informative in predicting the redox potential of phenazine derivatives in DME. We also showed that it is possible to develop reasonably accurate machine learning models for complex quantities such as redox potential using small and simple datasets. Next, we investigated different unsupervised machine learning algorithms to explore the molecular space of DNA and proteins to uncover the interactions between them.
We have shown that unsupervised machine learning models can discover commonly occurring regulatory modules containing interacting and co-binding transcription factors without prior information on binding activities. Sometimes, in fundamental research, one may encounter the desired property, which cannot be easily computed using existing methodologies. We faced this issue during the investigation of molecular machines. Therefore, we developed an algorithm for quantifying the desired property (i.e., rotational motion) of the ring in the molecular machines. We also investigated linear regression, a machine learning algorithm, during the development. The developed algorithm helped us get an insight into different factors responsible for the rotational directionality of the ring in the rotaxane system. Thus, this thesis work demonstrates the applicability of machine learning and computational tools to the development of efficient exploration strategies for molecular space. This work also shows how to address different issues one may encounter during the development. Furthermore, the specific strategies developed for three molecular spaces are valuable for discovering new molecules and new scientific phenomena. For example, the hybrid DFT-ML approach can help discover promising phenazine derivatives for green energy storage systems such as RFB. The unsupervised machine learning approach developed in this study has the potential to identify genetic determinants of diseases. The algorithm developed for quantifying rotation would help experimentalists develop novel molecular machines having rotational directionality. en
dc.format.extent 219 p. en
dc.language.iso en_US en
dc.publisher CSIR-National Chemical Laboratory, Pune en
dc.subject Machine Learning en
dc.subject Materials Science en
dc.subject Computational Biology en
dc.title Computational development of the strategies to explore molecular machines and the molecular space for desired properties using machine learning en
dc.type Thesis(Ph.D.) en
local.division.division Physical and Materials Chemistry Division en
dc.description.university AcSIR en
dc.identifier.accno TH2552

