On Ribosome Load, Codon Bias and Protein Abundance

Stefan Klumpp, Max Planck Institute of Colloids and Interfaces
Jiajia Dong, Bucknell University
Terence Hwa, Center for Theoretical Biological Physics, University of California San Diego

Description

Different codons encoding the same amino acid are not used equally in protein-coding sequences. In bacteria, there is a bias towards codons with high translation rates. This bias is most pronounced in highly expressed proteins, but a recent study of synthetic GFP-coding sequences did not find a correlation between codon usage and GFP expression, suggesting that such correlation in natural sequences is not a simple property of translational mechanisms. Here, we investigate the effect of evolutionary forces on codon usage. The relation between codon bias and protein abundance is quantitatively analyzed based on the hypothesis that codon bias evolved to ensure the efficient usage of ribosomes, a precious commodity for fast growing cells. An explicit fitness landscape is formulated based on bacterial growth laws to relate protein abundance and ribosomal load. The model leads to a quantitative relation between codon bias and protein abundance, which accounts for a substantial part of the observed bias for E. coli. Moreover, by providing an evolutionary link, the ribosome load model resolves the apparent conflict between the observed relation of protein abundance and codon bias in natural sequences and the lack of such dependence in a synthetic gfp library. Finally, we show that the relation between codon usage and protein abundance can be used to predict protein abundance from genomic sequence data alone without adjustable parameters.