Prediction of signal peptides in protein sequences by neural networks.

  • Dariusz Plewczynski, Interdisciplinary Centre for Mathematical and Computational Modelling, Warsaw University, Warszawa, Poland (D.Plewczynski@icm.edu.pl)
  • Lukasz Slabinski
  • Krzysztof Ginalski
  • Leszek Rychlewski

Abstract

We present a neural network-based method for detecting signal peptides (SPs) in proteins. The method is trained on sequences of known signal peptides extracted from the Swiss-Prot protein database and can be applied separately to prokaryotic and eukaryotic proteins. A query protein is dissected into overlapping short sequence fragments, and each fragment is then scored for the probability that it is a signal peptide and that it contains a cleavage site. While its accuracy is comparable to that of existing prediction tools, the method offers significantly higher speed and portability, so it can easily be applied to genome-wide datasets. Cleavage site prediction reaches 73% accuracy on heterogeneous source data containing both prokaryotic and eukaryotic sequences, and discrimination between signal peptides and non-signal peptides exceeds 93% accuracy for any source dataset. The software can be downloaded freely from http://rpsp.bioinfo.pl/RPSP.tar.gz.
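
The fragmentation step described in the abstract can be illustrated with a minimal sketch. It only shows how a query sequence might be cut into overlapping windows; it is not the authors' implementation. The function name `overlapping_fragments`, the window length, the step size, and the example sequence are all assumptions chosen for illustration, since the abstract does not state them, and the neural network scoring of each fragment is omitted.

```python
from typing import Iterator, Tuple


def overlapping_fragments(
    sequence: str, window: int = 20, step: int = 1
) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, fragment) for each full-length window.

    `window` and `step` are hypothetical defaults, not values from the paper.
    Sequences shorter than `window` yield no fragments.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


# Illustrative toy sequence (not taken from Swiss-Prot).
for start, fragment in overlapping_fragments("MKKTAIAIAV", window=4, step=2):
    print(start, fragment)
```

In a full pipeline, each fragment would be passed to a trained classifier; keeping the windowing separate from the scoring makes it straightforward to stream over genome-wide datasets one protein at a time.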
Published
2008-05-26
Section
Articles