Abstract and keywords
Abstract (English):
The size of a linear peptide molecule is taken to be the number of amino acid residues (p) it contains. The aim of this work was to analyze the range of existence and the frequency of occurrence of natural peptide structures with different p-values. We used the SwissProt database, which contained more than 560,000 complete primary structures. Structures containing non-standard amino acid residues were removed, as were duplicate amino acid sequences. This left 463,450 distinct sequences, 2 to 35,213 amino acid residues long, for analysis. The analysis showed that the distribution of peptide structures along the p-scale occupies different ranges in different biological domains and kingdoms, and that the shapes of the resulting curves are close to classical distributions. However, the curves can have sharp, high peaks, indicating a large number of specific proteins with the same p-value. Possible reasons for the existence and features of such distributions are considered.
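
The filtering and counting steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name `length_distribution`, the choice to treat any letter outside the 20 standard one-letter codes as non-standard, and the case normalization are all assumptions made for the example.

```python
from collections import Counter

# Assumption: "standard" means the 20 canonical one-letter residue codes.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def length_distribution(sequences):
    """Count distinct, standard-only sequences per length p.

    Hypothetical helper: drops sequences with non-standard residues,
    collapses identical sequences, then tallies lengths.
    """
    unique = set()
    for seq in sequences:
        seq = seq.strip().upper()
        # Keep only non-empty sequences built from standard residues.
        if seq and set(seq) <= STANDARD_RESIDUES:
            unique.add(seq)
    # Map each length p to the number of distinct sequences of that length.
    return Counter(len(seq) for seq in unique)


# Toy input: one duplicate ("ACD") and one non-standard residue ("X").
example = ["ACD", "ACD", "AXD", "GG", "MKV"]
print(sorted(length_distribution(example).items()))  # [(2, 1), (3, 2)]
```

Sharp peaks of the kind mentioned in the abstract would show up in such a tally as p-values whose counts are far larger than those of neighbouring lengths.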

Keywords:
UniProt, peptide, protein, amino acid sequence, UniProt database

