PyEED is a collaborative project to develop a sustainable, reusable toolbox for enzyme engineering. The toolbox includes methods to establish databases on the sequence, structure, and function of protein families, bioinformatics methods for analyzing sequence and structure data, modelling tools for studying substrate binding sites and for designing mutants, and data management tools for handling experimental data on enzyme function. It builds on currently available technology and methods: PyEED GitHub as the collaborative software development platform, Galaxy and Jupyter as interactive development environments for workflows, version control tools for the management of parameter studies, Python as the object-oriented programming language, Biopython as the Python bioinformatics library, UniProt, GenBank, and PDB as public databases, and EnzymeML for the standardized monitoring and exchange of data on enzyme-catalyzed reactions.
The toolbox will be applied by each PyEED Fellow to her or his specific research questions: (functional) annotation of novel genes from (meta)genomic sequencing, searching for promising enzyme candidates in a database, or designing a focused, highly enriched mutant library. PyEED is application-centered: it starts from the specific application rather than from the database. Using Jupyter Notebook, each PyEED Fellow can build a specific workflow from existing tools such as Biopython to answer her or his individual research questions. The data model is represented by a Python object model that is gradually built up and extended as required by the applications. Thus, different research questions and different enzyme families can be tackled with the same library of tools and the same object model representing proteins.
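To illustrate what a gradually extended Python object model for proteins might look like, here is a minimal sketch using only the standard library. It is not the actual PyEED data model: the class `ProteinRecord`, its fields, the helper `single_site_library`, and the accession and sequence in the example are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProteinRecord:
    """Hypothetical minimal protein object (not the actual PyEED class)."""

    accession: str
    sequence: str
    organism: Optional[str] = None
    # Free-form annotations: an application can store new information here
    # first, before it is promoted to a dedicated attribute of the model.
    annotations: Dict[str, str] = field(default_factory=dict)

    def mutate(self, position: int, new_residue: str) -> "ProteinRecord":
        """Return a copy with one residue replaced (position is 1-based)."""
        if not 1 <= position <= len(self.sequence):
            raise ValueError(f"position {position} is outside the sequence")
        old_residue = self.sequence[position - 1]
        new_sequence = (
            self.sequence[: position - 1] + new_residue + self.sequence[position:]
        )
        return ProteinRecord(
            accession=f"{self.accession}_{old_residue}{position}{new_residue}",
            sequence=new_sequence,
            organism=self.organism,
            annotations=dict(self.annotations),
        )


def single_site_library(
    parent: ProteinRecord, position: int, residues: str
) -> List[ProteinRecord]:
    """Build a small focused library by varying one position of the parent."""
    wild_type = parent.sequence[position - 1]
    # Skip the wild-type residue so the library contains only true variants.
    return [parent.mutate(position, r) for r in residues if r != wild_type]


# Made-up parent sequence and accession, for demonstration only.
parent = ProteinRecord(accession="DEMO0001", sequence="MKTAYIAK")
library = single_site_library(parent, position=3, residues="AST")
```

In this sketch the same `ProteinRecord` type serves both a database-search workflow (which would fill `annotations`) and a mutant-design workflow (which calls `mutate`), mirroring the idea that one object model is shared across applications and extended only when a new application requires it.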