
Drug discovery and development have a long history dating back to the early days of human civilization. In ancient times, drugs were used as physical remedies and were also associated with religious and spiritual healing.

The early drugs, or folk medicines, were derived mainly from plants and supplemented by animal products. These drugs were most probably discovered through a combination of trial-and-error experiments and observation of human and animal reactions to these products. Most ancient drugs were based on herbs or on ingredients extracted from botanical sources.

Synthetic drugs produced by chemical methods appeared at the beginning of the 1900s, marking the birth of the pharmaceutical industry. Many drugs were researched and manufactured, but most were used to relieve symptoms rather than to cure diseases; despite the many advances made in the 1800s, very few curative drugs were available at the start of the 1900s. From the early 1930s, drug discovery focused on screening natural products and isolating their active ingredients for the treatment of diseases. These active ingredients are usually synthetic versions of the natural products; called new chemical entities (NCEs), they must go through many tests to ensure their potency and safety. In the late 1970s, development of recombinant DNA products commenced, drawing on cellular and molecular biology, and the biotechnology field became a reality.

Advances in gene therapy, a deeper understanding of disease mechanisms, and the results of the Human Genome Project have opened up a variety of opportunities for developing drugs that target the exact site of a disease. One consequence is a steadily increasing number of new therapeutic targets available for drug discovery. In addition, high-throughput protein purification, crystallography, and nuclear magnetic resonance spectroscopy techniques have been developed and have contributed many structural details of proteins and protein-ligand complexes. These advances have allowed computational strategies to enter all aspects of drug discovery in recent years, such as the virtual screening (VS) technique for hit identification and methods for lead optimization. In structure-based drug design, molecular docking is the most common method in use today. Programs based on different algorithms have been developed to perform molecular docking studies, making molecular docking a vital tool in present-day pharmaceutical research.

In the drug discovery process, the development of novel drugs with potential interactions with therapeutic targets is of central importance. Conventionally, promising-lead identification is achieved by experimental high-throughput screening (HTS), but it is time-consuming and expensive. Completion of a typical drug discovery cycle, from target identification to an FDA-approved drug, takes up to 14 years at an approximate cost of 800 million dollars. Nonetheless, the number of new drugs reaching the market has recently decreased because of failures in different phases of clinical trials.

In contrast to the traditional drug discovery method (classical or forward pharmacology), rational drug design is efficient and economical. The rational approach is also known as reverse pharmacology because the first step is to identify promising target proteins, which are then used to screen small-molecule libraries. Striking progress has been made in structural and molecular biology, along with advances in biomolecular spectroscopic methods of structure determination. These methods have provided three-dimensional (3D) structures of more than 100,000 proteins.

In conjunction with the storage and organization of such data, sophisticated and robust computational techniques have been developed. Completion of the Human Genome Project and advances in bioinformatics increased the pace of drug development by making a huge number of target proteins available. The availability of 3D structures of therapeutically important proteins favors the identification of binding cavities and has laid the foundation for structure-based drug design (SBDD), which is becoming a fundamental part of industrial drug discovery projects and of academic research.

SBDD is a more specific, efficient, and rapid process for lead discovery and optimization because it deals with the 3D structure of a target protein and knowledge about the disease at the molecular level. Among the relevant computational techniques, structure-based virtual screening (SBVS), molecular docking, and molecular dynamics (MD) simulations are the most common methods used in SBDD.

These methods have numerous applications in the analysis of binding energetics and ligand-protein interactions and in the evaluation of conformational changes occurring during the docking process.

In recent years, the demand for efficient drug discovery has driven a massive surge in software packages for the field.

Nonetheless, it is important to choose suitable packages for an efficient SBDD process. Briefly, automation of all the steps of SBDD has shortened its timeline. Moreover, the availability of supercomputers, computer clusters, and cloud computing has sped up lead identification and evaluation.

SBDD in the Drug Discovery Process

SBDD is the most powerful and efficient of these approaches. Computational resources serve as an efficient technology for accelerating the drug discovery process, which includes various screening procedures, combinatorial chemistry, and calculation of properties such as absorption, distribution, metabolism, excretion, and toxicity (ADMET).
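To make the ADMET-oriented calculations concrete, the short sketch below computes rule-of-five style properties with RDKit, the open-source cheminformatics toolkit. The choice of RDKit, the example SMILES, and the exact cutoffs applied are illustrative assumptions rather than a prescribed workflow; real ADMET profiling relies on far richer predictive models.

```python
# A minimal sketch of rule-of-five style property filtering with RDKit;
# the cutoffs below are the standard Lipinski values, shown for illustration.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def passes_rule_of_five(smiles: str) -> bool:
    """Return True if the molecule satisfies Lipinski's rule of five."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return (
        Descriptors.MolWt(mol) <= 500          # molecular weight
        and Descriptors.MolLogP(mol) <= 5      # lipophilicity (cLogP)
        and Lipinski.NumHDonors(mol) <= 5      # hydrogen-bond donors
        and Lipinski.NumHAcceptors(mol) <= 10  # hydrogen-bond acceptors
    )

print(passes_rule_of_five("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin -> True
```

In practice such filters are only a first pass used to discard clearly unsuitable compounds before more expensive calculations.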

SBDD is an iterative process that proceeds through multiple cycles before an optimized drug candidate enters clinical trials. Generally, a drug discovery process consists of four steps: the discovery phase, the development phase, the clinical trial phase, and the registry phase.

In the first phase, a potential therapeutic target and active ligands are identified. The fundamental step involves cloning of the target gene, followed by extraction, purification, and 3D structure determination of the protein. Many computer algorithms can be used to dock huge databases of small molecules or compound fragments into the binding cavity of the target protein. These molecules are ranked by a scoring system based on electrostatic and steric interactions with the binding site.
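As a purely illustrative sketch of such scoring, the toy function below sums a Coulomb (electrostatic) term and a Lennard-Jones (steric) term over ligand-protein atom pairs. The charges, parameters, and coordinates are invented, and real docking programs use far more elaborate scoring functions.

```python
# A toy interaction score combining an electrostatic (Coulomb) term and a
# steric (Lennard-Jones 12-6) term, to illustrate the kind of pairwise sums
# docking scoring functions accumulate; all inputs are invented.
import math

def pair_score(q1, q2, r, epsilon=0.1, sigma=3.5, dielectric=4.0):
    """Score one ligand-atom/protein-atom pair at distance r (angstroms)."""
    coulomb = 332.0 * q1 * q2 / (dielectric * r)               # electrostatics, kcal/mol
    lj = 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)  # sterics
    return coulomb + lj

def score_pose(ligand_atoms, protein_atoms):
    """Sum pairwise terms over all ligand-protein atom pairs."""
    total = 0.0
    for (q_lig, xyz_lig) in ligand_atoms:
        for (q_prot, xyz_prot) in protein_atoms:
            total += pair_score(q_lig, q_prot, math.dist(xyz_lig, xyz_prot))
    return total  # lower (more negative) = better-ranked pose

# Invented example: a positive ligand atom near a negative pocket atom.
ligand = [(+0.4, (0.0, 0.0, 0.0))]
pocket = [(-0.4, (3.5, 0.0, 0.0))]
print(score_pose(ligand, pocket))
```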

A thorough investigation of the electrostatic properties of the binding site, including the presence of cavities, clefts, and allosteric pockets, can be carried out using the 3D structure of the target molecule.

Current SBDD methods consider the key features of the binding cavity of the therapeutic target to design efficient ligands.

In the second phase, the top hits are synthesized and optimized. Furthermore, the top-ranked compounds with high affinity for selective modulation of the target protein are tested in vitro in biochemical assays. These ligands interfere with crucial cellular pathways, thereby leading to the development of drugs with a desired therapeutic and pharmacological effect. Biological properties like efficacy, affinity, and potency of the selected compounds are evaluated by experimental methods.

The next step is to determine the 3D structure of the target protein in complex with the promising ligand obtained in the first phase. The 3D structure provides detailed information about the intermolecular features that aid in molecular recognition and ligand binding. Structural insights into the ligand-protein complex help with the analysis of various binding conformations, identification of unknown binding pockets and ligand-protein interactions, elucidation of conformational changes resulting from ligand binding, and detailed mechanistic studies. Subsequently, multiple iterations increase the efficacy and specificity of the lead.

The third phase includes clinical trials of the lead compounds. Compounds that pass the clinical trials proceed to the fourth phase, in which the drug is distributed on the market for clinical use.

SBDD is a computational technique widely used by pharmaceutical companies and scientists. There are numerous drugs available on the market that have been identified by SBDD.

Human immunodeficiency virus (HIV)-1-inhibiting FDA-approved drugs represent the foremost success story of SBDD. Other drugs identified by the SBDD technique include the thymidylate synthase inhibitor raltitrexed; amprenavir, a potent HIV protease inhibitor discovered through protein modeling and MD simulations; and the antibiotic norfloxacin.

De Novo Drug Discovery Process

The de novo drug discovery process builds novel chemical compounds starting from molecular units. The gist of this approach is to develop chemical structures of small molecules that bind to the target binding cavity with good affinity. Generally, a stochastic approach is used for de novo design, and it is important to take knowledge of the search space into consideration in the design algorithm.

Two design modes, positive and negative, are used. In positive design, the search is restricted to specific regions of chemical space with a higher probability of finding hits that have the required features.

In negative design, by contrast, the search criteria are predefined to prevent the selection of false positives. Designing chemical compounds by computational techniques can be seen as imitating synthetic chemistry, with scoring functions standing in for binding assays. Critical assessment of candidates is crucial to the design process, and the scoring function is one of the main assessment tools.

Multiple scoring functions can be employed in parallel for multi-objective drug design, which considers multiple features at once, as in the sketch below.
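A minimal sketch of such multi-objective scoring: a weighted sum of RDKit's QED drug-likeness score and a docking score supplied by the caller. The weights, the normalization of the docking score, and the candidate SMILES are invented for illustration.

```python
# A minimal multi-objective score: weighted sum of RDKit's QED drug-likeness
# and a (hypothetical) docking score; the weights are illustration values.
from rdkit import Chem
from rdkit.Chem import QED

def multi_objective_score(smiles: str, docking_score: float,
                          w_qed: float = 0.6, w_dock: float = 0.4) -> float:
    """Higher is better; docking_score is assumed negative-is-better (kcal/mol)."""
    mol = Chem.MolFromSmiles(smiles)
    qed = QED.qed(mol)                                       # drug-likeness in [0, 1]
    dock_norm = min(max(-docking_score / 12.0, 0.0), 1.0)    # crude rescale to [0, 1]
    return w_qed * qed + w_dock * dock_norm

# Rank two hypothetical candidates by the combined score.
candidates = [("CCOc1ccccc1", -6.2), ("CC(=O)Nc1ccc(O)cc1", -7.8)]
ranked = sorted(candidates, key=lambda c: multi_objective_score(*c), reverse=True)
print(ranked)
```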

Two methods can be used:

  • ligand-based de novo drug design
  • receptor-based de novo drug design

The latter approach is more prevalent. The quality of the target protein structure and accurate knowledge of its binding site are important for receptor-based design because suitable small molecules are designed by fitting fragments into the binding cavities of the receptor. This can be done either by means of a computational program or by co-crystallization of the ligand with the receptor.

There are two techniques for receptor-based design: fragment linking, in which building blocks (either atoms or fragments such as single rings, amines, and hydrocarbons) are linked together to form a complete chemical compound, and fragment growing, in which a ligand is grown from a single unit.

In the fragment-linking method, the binding site is mapped to identify probable interaction points for the different functional groups present in the fragments.

These functional groups are then joined to build a complete compound. In the fragment-growing technique, fragments are grown within the binding site under the guidance of suitable search algorithms, which use scoring functions to assess the probability of growth. A minimal sketch of the linking idea follows.
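The linking idea can be sketched in a few lines with RDKit's editable molecule class: two fragments are merged and joined by a single bond. The fragments and the attachment-point atom indices below are invented illustration inputs, not a general linking algorithm.

```python
# A minimal sketch of fragment linking: two fragments are merged into one
# molecule and joined by a single bond using RDKit's editable RWMol.
from rdkit import Chem

def link_fragments(smi_a: str, smi_b: str, idx_a: int, idx_b: int) -> str:
    frag_a = Chem.MolFromSmiles(smi_a)
    frag_b = Chem.MolFromSmiles(smi_b)
    combo = Chem.RWMol(Chem.CombineMols(frag_a, frag_b))
    # Atom indices of frag_b are offset by the atom count of frag_a.
    combo.AddBond(idx_a, frag_a.GetNumAtoms() + idx_b, Chem.BondType.SINGLE)
    product = combo.GetMol()
    Chem.SanitizeMol(product)  # validate valences of the new compound
    return Chem.MolToSmiles(product)

# Link a benzene carbon to the nitrogen of methylamine -> N-methylaniline.
print(link_fragments("c1ccccc1", "CN", 0, 1))
```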

Fragment-based de novo design uses the whole chemical space to generate novel compounds. In the case of the linking approach, the selection of linkers is critical. Fragment anchoring in the binding site can be performed by

  • the outside-in approach
  • the inside-out approach

In the outside-in approach, the building blocks are first arranged at the periphery of the binding site and the compound grows inward. In the inside-out approach, building blocks are initially placed within the binding site and built outward.

Big Data in Drug Discovery Process

The “big data” approach influences our daily life, and drug discovery is no exception. With current computational techniques, molecular characteristics can be studied in a logical and systematic manner, and the data collected for each compound can be analyzed from different perspectives.

In the modern era of technology, the volume of generated data has grown enormously.

According to a recent estimate, the total size of stored data is approximately two zettabytes with an expected doubling every two years.

Hence, mining this massively produced digital information offers a multitude of opportunities to increase productivity. Nevertheless, apart from the volume and production rate of big data, its variety and complexity pose challenges for effective analysis.

Furthermore, generated data sometimes contain inconsistencies, such as missing or incomplete information, errors, and duplications, which affect the accuracy of simulations and analyses. Therefore, preliminary analysis and curation are required to ensure fairness, accuracy, and experimental efficacy.

On the other hand, pre-collection and curation measures vary among research communities, depending on preceding observations and experimental records. Yet there is high demand for a simple, unified, and well-established curation protocol.

Such a protocol would ensure the quality of generated simulation and analytical datasets. Hence, the existing standard of research continues to adhere to the “less-is-more” principle. Big data have played a vital role in medicinal and combinatorial chemistry, where HTS generates a huge amount of data over a short span of time.

Dependency on big data will likely increase as personalized medicine gains acceptance. Big data have been regarded as the beginning of computation-oriented medicinal chemistry: processing stacks of generated data shortens the time needed to complete a drug development process.

For instance, HIV, a well-known global pandemic spanning more than 40 years, has infected more than 37 million people, of whom only 57% receive antiviral treatment (World Health Organization (WHO), 2018).

In the past few years, many studies have addressed the inhibition of viral reverse transcriptase and/or integrase.

Although this strategy has proven effective, it comes with several shortcomings, such as viral resistance and poor bioavailability.

Artificial Intelligence and Machine Learning in Drug Discovery Process

Artificial intelligence (AI) uses computer techniques to simulate human intelligence and mimic human behavior. Machine learning (ML), a subfield of AI, uses statistical methods to learn with or without being explicitly programmed.

In the drug development process, AI has shifted the mood from hype to hope. Computational technologies and ML algorithms have revolutionized drug discovery in the pharmaceutical industry.

The application of AI to drug design is the integration of ML algorithms, in an automated manner, to discover new compounds by analyzing, learning from, and explaining pharmaceutical big data.

Big Pharma is increasing its investment in AI, a sign of growing confidence in ML algorithms for identifying and screening potential drug candidates. For instance, SYNSIGHT has introduced an AI-based integrated platform that combines VS and molecular modeling to build large biological models for the drug development process.

Many leading biopharmaceutical companies are collaborating to integrate AI and ML methods with their drug discovery pipelines.

ML success has been repeatedly demonstrated in classification, generative modeling, and reinforcement learning (RL). The main categories of ML are supervised learning, unsupervised learning, and RL. Classification and regression, the subcategories of supervised learning, build predictive models from paired input and output data.

Supervised ML is applied to disease diagnosis, to ADMET prediction as a classification output, and to drug efficacy as a regression output. Support vector machines (SVMs), a supervised ML algorithm, perform binary activity prediction to distinguish between drugs and non-drugs or between specific and nonspecific compounds.

SVM classification is also performed in ligand-based virtual screening (LBVS) to rank database compounds by decreasing probability of activity; specially optimized ranking functions are used to minimize errors in SVM ranking, as sketched below. In the unsupervised learning category, clustering methods can discover disease subtypes as outputs, while feature-finding methods can identify a target in a disease.
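A minimal sketch of SVM-based LBVS ranking, assuming RDKit and scikit-learn are available: molecules are encoded as Morgan fingerprints, a linear SVM is trained on labeled actives and inactives, and library compounds are ranked by their signed distance from the separating hyperplane. The training SMILES and labels are toy data.

```python
# SVM ranking for ligand-based virtual screening: encode molecules as Morgan
# fingerprints, train on toy actives/inactives, rank by decision value.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.svm import SVC

def fingerprint(smiles: str) -> np.ndarray:
    mol = Chem.MolFromSmiles(smiles)
    return np.array(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024))

train_smiles = ["CCO", "CCN", "c1ccccc1O", "c1ccccc1N"]  # toy training set
labels = [0, 0, 1, 1]                                    # 1 = active
X = np.array([fingerprint(s) for s in train_smiles])

clf = SVC(kernel="linear").fit(X, labels)

# Rank screening compounds: a larger decision value = more likely active.
library = ["c1ccccc1CO", "CCCC", "c1ccc(O)cc1C"]
scores = clf.decision_function([fingerprint(s) for s in library])
for smi, score in sorted(zip(library, scores), key=lambda t: -t[1]):
    print(f"{score:+.2f}  {smi}")
```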

Challenges and Emerging Problems in Drug Discovery Process

The drug discovery process still faces many challenges, such as

 (i) upgrading the efficacy of virtual screening methods,

 (ii) improving computational chemogenomic studies,

 (iii) boosting the quality and number of computational web sources,

 (iv) improving the structure of multitarget drugs,

 (v) enhancing the algorithms for toxicity prediction, and

 (vi) collaborating with other related fields of study for better lead identification and optimization.

The computer-aided structure-based drug discovery process is an integral part of multidisciplinary work.

The computer-aided drug discovery process can be used in combination with combinatorial chemistry or HTS, employing various algorithms to prepare combinatorial libraries for HTS, including chemical space characterization, as in the sketch below.
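As a sketch of computational library preparation, the snippet below enumerates a small combinatorial amide library with an RDKit reaction SMARTS. The building blocks and the coupling reaction chosen are illustrative assumptions, not a recommended library design.

```python
# Enumerate a small combinatorial library with an RDKit reaction SMARTS:
# an amide coupling applied to toy lists of acids and amines.
from rdkit import Chem
from rdkit.Chem import AllChem

amide_coupling = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])[OH].[NX3;H2,H1:3]>>[C:1](=[O:2])[N:3]"
)

acids = [Chem.MolFromSmiles(s) for s in ("CC(=O)O", "c1ccccc1C(=O)O")]
amines = [Chem.MolFromSmiles(s) for s in ("CN", "NCCO")]

library = set()
for acid in acids:
    for amine in amines:
        for products in amide_coupling.RunReactants((acid, amine)):
            Chem.SanitizeMol(products[0])
            library.add(Chem.MolToSmiles(products[0]))

print(sorted(library))  # four enumerated amides
```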

VS is known to reduce the time and cost of HTS. Its major drawback is that, while generating screening libraries, it often ignores protonation, tautomerism, and the ionization states of compounds, thereby missing significant hits.

Limited experimental data and the lack of reliably validated computational output lead researchers to ignore tautomerization, even though its effects cannot be dismissed.
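Tautomer handling can nonetheless be folded into library preparation with standard tooling. The sketch below uses RDKit's MolStandardize tautomer enumerator on an example molecule, so that a screening library stores every relevant tautomer rather than a single arbitrary form.

```python
# Enumerate tautomers with RDKit's MolStandardize module, the kind of
# preprocessing that keeps a screening library from missing hits because
# only one tautomer was stored; the input molecule is an example.
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

enumerator = rdMolStandardize.TautomerEnumerator()
mol = Chem.MolFromSmiles("Oc1ccccn1")  # 2-hydroxypyridine / 2-pyridone
for taut in enumerator.Enumerate(mol):
    print(Chem.MolToSmiles(taut))
# The canonical tautomer the enumerator would pick for storage:
print("canonical:", Chem.MolToSmiles(enumerator.Canonicalize(mol)))
```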

In the drug discovery process, ADMET prediction remains a hurdle. Nonetheless, the availability of various computational methods for predicting these properties has reduced both the time required and the number of tests on animals.
