Computational design of biocatalysts
Proteins are versatile nanostructured biomacromolecules which are used by nature in a multitude of functions: as highly active and selective catalysts, as efficient nanomachines, or as nanostructured materials with superior mechanical, electrical, or optical properties. Though in principle any protein can be conveniently produced by chemical DNA synthesis and expression of the respective gene, the application of proteins in white and red biotechnology is still limited to the naturally found proteins and variants thereof. While we understand in most cases how a single mutation changes the biochemical and biophysical properties of a protein, we are only at the very beginning of understanding the general relationship between sequence, structure, and function. A deep understanding of this relationship would enable us to predict the function of a protein from its sequence, and to design ab initio the sequence of a protein with desired properties and functions.
To deepen our understanding of enzymes, we apply two complementary methods. We investigate the molecular basis of catalytic activity, specificity, and selectivity of enzymes by molecular modeling methods, and we analyze the rapidly increasing number of natural protein sequences by bioinformatics methods. The enzymes we are investigating cover a broad range of catalytic activities and can be applied as selective biocatalysts in organic synthesis: lipases and esterases, cytochrome P450 monooxygenases, thiamin diphosphate-dependent decarboxylases, laccases, and squalene-hopene cyclases. However, each of these enzymes catalyzes more than a single type of reaction and shows catalytic promiscuity depending on single mutations, modifications of the substrate, or changes in the solvent. By systematically analyzing the sequence-function relationship of a protein family and modeling enzyme-substrate complexes in realistic solvent conditions, we design improved biocatalysts for industrially relevant substrates and reactions.
Using homology modeling and molecular docking, we investigate how the shape of the enzyme's substrate binding site determines the experimentally observed substrate specificity, regioselectivity, and stereoselectivity. Molecular dynamics simulations of enzyme-substrate complexes take into account the flexibility of the enzyme and the substrate, and provide a reliable basis for designing improved variants with a broadened substrate profile or increased selectivity. By simulating proteins dissolved in organic solvents or bound to hydrophobic substrate interfaces, we investigate the effect of nonpolar environments on substrate recognition and catalytic activity. Simulations are performed on our in-house computer cluster and on the infrastructure provided by HLRS and bwForCluster. We use the software package GROMACS for molecular dynamics simulations.
Using our in-house database system, we systematically analyze enzyme families to study sequence-structure-function relationships, to learn about the sequence space, and to identify new enzymes for applications in white biotechnology. To this end, we develop relational databases for large protein families, generate multiple sequence alignments and phylogenetic trees, annotate functionally and structurally relevant positions, and analyze conservation and correlations. Databases are available for lipases, cytochrome P450 monooxygenases, PHA depolymerases, lactamases, thiamin diphosphate-dependent enzymes, laccases, and triterpene cyclases.
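As an illustration of what a conservation analysis of an alignment can look like, the following minimal Python sketch scores each alignment column by its normalized Shannon entropy. This is one common choice of conservation measure and is an assumption here, not necessarily the scoring used in our database system; the function name column_conservation and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Return a conservation score in [0, 1] for each alignment column.

    Score = 1 - H / log2(20), where H is the Shannon entropy of the
    residue frequencies in the column. Gap characters are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        if not residues:
            scores.append(0.0)  # gap-only column carries no information
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Toy alignment (hypothetical sequences, for illustration only)
toy_alignment = [
    "GHSMG",
    "GHSLG",
    "GYSQG",
    "GHS-G",
]
conservation = column_conservation(toy_alignment)
```

Columns with a score close to 1 are fully conserved and are natural candidates for annotation as functionally or structurally relevant positions; variable columns score lower.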
The molecular dynamics simulation of proteins in realistic solvents results in complex dynamics: local conformational changes such as side chain motions, opening of lid structures, or slow movements of domains, binding of solvent molecules to the protein surface...
We establish an infrastructure to store and analyse large volumes of biocatalytic data from enzyme cascade reactions using our BioCatNet database system. Kinetic modelling identifies reaction bottlenecks such as the stability of the biocatalyst.
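How limited biocatalyst stability can show up as a bottleneck in a kinetic model is sketched below in Python. The sketch assumes a single Michaelis-Menten reaction with first-order enzyme inactivation, integrated with a simple explicit Euler scheme. The model form, the function name simulate_batch, and all parameter values are illustrative assumptions, not the models or data stored in BioCatNet.

```python
def simulate_batch(s0, e0, kcat, km, kd, t_end, dt=0.01):
    """Integrate substrate and active-enzyme concentrations with explicit Euler.

    dS/dt = -kcat * E * S / (Km + S)   (Michaelis-Menten conversion)
    dE/dt = -kd * E                    (first-order enzyme inactivation)

    Returns the substrate and active-enzyme concentrations at t_end.
    """
    s, e = s0, e0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        rate = kcat * e * s / (km + s)
        s = max(s - rate * dt, 0.0)  # concentration cannot become negative
        e = e - kd * e * dt
    return s, e


# Hypothetical parameters in arbitrary but consistent units
stable = simulate_batch(s0=10.0, e0=0.01, kcat=100.0, km=1.0, kd=0.0, t_end=20.0)
unstable = simulate_batch(s0=10.0, e0=0.01, kcat=100.0, km=1.0, kd=0.5, t_end=20.0)
```

With these assumed values, the stable enzyme converts nearly all substrate within the simulated time, whereas the unstable enzyme is inactivated long before conversion is complete, so most of the substrate remains. Comparing such runs is one simple way to see whether stability, rather than catalytic rate, limits a reaction.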
In the framework of the German-South African research network in the thematic area "Bioeconomy: using renewable resources for industry", we participate in two research projects: Enzyme engineering of the "small laccase" for the synthesis of antioxidants and surface functionalisation, and Synergistic degradation of lignocellulose by using expansins and enzymes.
In collaboration with our experimental partners Prof. Truong Nam Hai (Vietnam Academy of Science and Technology, Hanoi) and Prof. Wolfgang Streit (University of Hamburg), we develop an innovative technology platform for identification of novel and useful enzymes from metagenomic samples.
In the framework of the Cluster of Excellence SimTech, we develop and apply an integrated simulation approach to model biochemical and biophysical properties of enzymes in aqueous and non-aqueous solvents: solubility, stability, and enzymatic function.
The enzymatic reduction of imines is a promising approach to synthesize chiral amines. It requires biocatalysts with a broad substrate spectrum, high imine reductase activity, and high stereoselectivity. Based on the Imine Reductase Engineering Database, we explore the sequence space of imine reductases to identify functionally relevant positions and to design variants.
As a result, a highly enriched mutant library will be identified which is sufficiently small to be tested experimentally. The present research project was set up as a feasibility study to identify and validate reliable geometric descriptors.
Solvent selection is a key step in process design. An appropriate solvent is relevant not only for the dynamic behavior of the biocatalyst but also for substrate solubility and availability, which influence the reaction rate, and for planning the most suitable approach for product recovery (downstream processing).
The Lipase Engineering Database (LED) integrates information on sequence, structure, and function of alpha/beta-hydrolases. This protein class comprises enzymes with various functions that share structural rather than sequence similarities. As part of updating the LED, we analysed the structures of the proteins in its current version.
A workflow for molecular simulations based on bash scripting is established.
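The general idea of a staged simulation workflow can be sketched as follows. The sketch is written in Python rather than bash and is not our actual workflow: the stage names, the file names, and the function build_stage_commands are hypothetical. It only assembles the GROMACS command lines (gmx grompp with -f, -c, -p, and -o, followed by gmx mdrun with -deffnm) for a chain of stages, where each stage starts from the final structure of the previous one; it does not run them.

```python
import shlex


def build_stage_commands(stages, start_structure="system.gro", topology="topol.top"):
    """Build grompp/mdrun command lines for consecutive simulation stages.

    Each stage reads <stage>.mdp and starts from the structure written by
    the previous stage (<previous>.gro when mdrun is called with -deffnm).
    File names are hypothetical placeholders.
    """
    commands = []
    structure = start_structure
    for stage in stages:
        # Preprocess: combine parameters, structure, and topology into a run input file
        commands.append([
            "gmx", "grompp",
            "-f", f"{stage}.mdp",
            "-c", structure,
            "-p", topology,
            "-o", f"{stage}.tpr",
        ])
        # Run: -deffnm sets the default name for all input and output files
        commands.append(["gmx", "mdrun", "-deffnm", stage])
        structure = f"{stage}.gro"
    return commands


# Assumed stage order: minimization, two equilibration steps, production
commands = build_stage_commands(["em", "nvt", "npt", "md"])
script_lines = [shlex.join(cmd) for cmd in commands]
```

The resulting script_lines can be written to a shell script for a cluster queue, or each command list can be executed directly, for example with subprocess.run(cmd, check=True), so that the chain stops at the first failing stage.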