A protein's function is ultimately determined by its amino-acid sequence, which in turn determines its structure. The general sequence-function relationship is very complicated, so case studies that focus on a well-defined function in a small portion of sequence space can be informative. Here, we show that by coupling a protein display assay with high-throughput sequencing, it is possible to examine quantitatively the effect of thousands of mutations around a reference sequence. This approach relies solely on the sequence-function relationship and can provide valuable insights into the mechanisms of a protein without prior structural knowledge. Extracting this knowledge from noisy biological experiments is made possible by new inference techniques based on information theory. Using a simple model that treats each amino acid as independent, we demonstrate the ability to infer an energy model for the binding of the hYap65 WW domain, which recapitulates to a certain extent the thermodynamic laws of a binding event. Finally, we extend our analysis by looking at the sequence-structure relationship of two other WW domains.
Aleksandra Walczak & Thierry Mora
You can download the thesis here: M2-Thesis
This work was done at the École Normale Supérieure (ENS Ulm), in the Laboratoire de Physique Théorique (LPT), in Paris, France.
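To make the independent-site model mentioned in the abstract more concrete, here is a minimal Python sketch of an additive energy model, in which each position contributes to the binding energy independently of the others. It is only an illustration: the function names, the two-state Boltzmann form of the binding probability, the `chemical_potential` parameter, and the toy energy values are all assumptions, not the model or the parameters inferred in the thesis, and the information-theoretic inference step is not shown.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def additive_energy(sequence, energy_matrix):
    """Sum independent per-position energy contributions (in units of kT).

    energy_matrix[i][aa] is the (hypothetical) energy of residue `aa`
    at position `i`; positions do not interact in this model.
    """
    if len(sequence) != len(energy_matrix):
        raise ValueError("sequence and energy matrix must have the same length")
    return sum(energy_matrix[i][aa] for i, aa in enumerate(sequence))


def binding_probability(energy, chemical_potential=0.0):
    """Assumed two-state form: p_bound = 1 / (1 + exp(E - mu))."""
    return 1.0 / (1.0 + math.exp(energy - chemical_potential))


# Toy energy matrix (made-up values): the reference residue costs 0 kT,
# any substitution costs 1 kT.
reference = "GWE"
energy_matrix = [
    {aa: (0.0 if aa == ref else 1.0) for aa in AMINO_ACIDS}
    for ref in reference
]

for variant in (reference, "GAE", "AAA"):
    energy = additive_energy(variant, energy_matrix)
    print(variant, energy, round(binding_probability(energy), 3))
```

In this toy setting, every substitution away from the reference raises the energy by the same amount and lowers the predicted binding probability; a real energy matrix would have a distinct value for each position and residue.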