Other properties, such as hydrophobicity or solubility, do not correlate well, although they would be expected to matter given that this residue's side chain is exposed. Instead, the detected dependences reflect helical constraints on the backbone conformation, which are preferably satisfied by small amino acids. In the second case, Gly87, the two descriptors Hydrophobicity and P explain the observed distributions equally well. This residue is located in a tight turn on the protein surface and is 42% exposed, suggesting that the negative correlation with hydrophobicity reflects its role in conferring solubility. Note that glycine is a clear outlier in the plot against Hydrophobicity, probably because of the strong conformational constraint revealed by its correlation with P. The finding of two important dependencies might point to the requirement for a malleable and polar amino acid at this position rather than simply a "flexible" one, because flexibility does not correlate as well. Although glycine residues are usually attributed a role in conferring flexibility, our analysis suggests that this particular glycine fulfills other roles. Arg222 and Ser98 provide examples where the atomic composition of the side chain describes the observed distributions better than any physicochemical property. The guanidinium group of Arg222 forms multiple hydrogen bonds and salt bridges with the carboxylate groups of Asp214 and Asp233 and with three backbone oxygen atoms, which effectively closes a loop located at the protein surface. Thus, arginine is by far the preferred residue, followed by histidine and then the other amino acids. Similarly, at position 98 Ser is roughly as likely as Thr or Asn, all three having one oxygen atom in the side chain, whereas Asp and Glu are more favored than any of those three, and amino acids with no oxygen atoms in their side chains are the least favored.
Ser98 is located in a small loop closed through extensive hydrogen bonds to its alcohol group, where carboxylate groups could accommodate more interactions. As a final note, we posit that although the analyzed data are specific to TEM-1 hydrolyzing ampicillin, and most interpretations are therefore valid under that setting, a significant fraction of the fits could reflect generalizable trends, especially those concerning protein stability and solubility. A second important difficulty in finding descriptors that account for the distributions might arise from multiple constraints that affect several properties simultaneously and/or that follow non-monotonic dependencies on the descriptor, such that intermediate values are optimal. In principle, fitting to nonlinear functions and combinations of descriptors could unveil these patterns, but it would be difficult to assess the statistical significance of the different fits.
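
To illustrate the last point, the following minimal sketch compares a linear and a quadratic fit for a non-monotonic dependence, and gauges the significance of the quadratic fit with a permutation test. It is not the procedure used in this work: the permutation test is one possible choice made here for illustration, the data are synthetic and invented for the example, and the names `poly_r2` and `permutation_p_value` are hypothetical helpers.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination for fitted values y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def poly_r2(x, y, degree):
    """R^2 of a least-squares polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    return r_squared(y, np.polyval(coeffs, x))


def permutation_p_value(x, y, degree, n_perm=2000, seed=0):
    """Fraction of label shufflings whose fit is at least as good as the observed one."""
    rng = np.random.default_rng(seed)
    observed = poly_r2(x, y, degree)
    hits = 0
    for _ in range(n_perm):
        if poly_r2(x, rng.permutation(y), degree) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic data (NOT from the study): 20 hypothetical descriptor values,
# with a response that peaks at intermediate values of the descriptor.
x = np.linspace(-1.0, 1.0, 20)
noise = np.random.default_rng(1).normal(scale=0.05, size=x.size)
y = -(x ** 2) + noise

r2_linear = poly_r2(x, y, 1)
r2_quadratic = poly_r2(x, y, 2)
p_quadratic = permutation_p_value(x, y, 2)

print(f"linear R^2    = {r2_linear:.3f}")
print(f"quadratic R^2 = {r2_quadratic:.3f}")
print(f"permutation p = {p_quadratic:.4f}")
```

In this toy case the linear fit misses the dependence almost entirely while the quadratic fit captures it. The sketch does not address the harder problem raised above: when many descriptors, descriptor combinations, and functional forms are tried at each position, the resulting p-values would need correction for multiple comparisons and for the added model flexibility.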