PASS (Prediction of Activity Spectra for Substances) estimates the probable biological activity profile of a compound from its structural formula, supplied in MOLfile or SDfile format. To predict cytotoxicity against tumor and non-tumor cell lines and interactions with target proteins, a cell line version of PASS was prepared using special training sets. Its predictions rely on a knowledge base of "structure-cell line cytotoxicity" relationships covering 76,804 compound structures. In general, a biological activity spectrum reflects the outcome of a compound's chemical interactions with different biological objects, and it is predicted by analyzing the "structure-activity" relationships for the substances in the training set. Here, the biological activity spectrum is the assessment of cytotoxicity against different cell lines.
The results are stored in a prediction file in SDF format. It contains the structure of each chemical compound and its predicted cytotoxicity against a variety of human cell lines, with two estimated probabilities: Pa, the probability "to have cytotoxicity" ("to be active"), and Pi, the probability "to have no cytotoxicity" ("to be inactive"). Results for cancer cell lines include the cell line ID, full cell line name, tissue, and tumor type; results for non-tumor cell lines include the cell line ID, full cell line name, and tissue.
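As an illustration of how such a prediction file might be post-processed, the following minimal Python sketch reads the data items of an SDfile and keeps the cell lines for which Pa exceeds Pi. The SDfile conventions used (data headers of the form "> <NAME>", records ending in "$$$$") are standard, but the layout of the prediction item is an assumption: the sketch supposes one line per cell line in the form "Pa Pi cell_line_id", and the names parse_sd_data_items, likely_cytotoxic, and Prediction are hypothetical. The Pa > Pi cutoff is likewise only one possible selection rule. The actual field names and layout of the PASS output should be checked before adapting it.

```python
from typing import Dict, List, NamedTuple


class Prediction(NamedTuple):
    pa: float        # probability "to be active" (cytotoxic)
    pi: float        # probability "to be inactive"
    cell_line: str   # cell line identifier


def parse_sd_data_items(sdf_text: str) -> List[Dict[str, str]]:
    """Collect the '> <NAME>' data items of every record in an SDfile."""
    records = []
    for block in sdf_text.split("$$$$"):      # "$$$$" terminates each record
        if not block.strip():
            continue
        items: Dict[str, List[str]] = {}
        current = None
        for line in block.splitlines():
            if line.startswith(">") and "<" in line:
                # Data header: take the name between '<' and the last '>'
                current = line[line.find("<") + 1:line.rfind(">")]
                items[current] = []
            elif current is not None:
                if line.strip():
                    items[current].append(line.strip())
                else:
                    current = None            # a blank line ends the data item
        records.append({k: "\n".join(v) for k, v in items.items()})
    return records


def likely_cytotoxic(item_text: str) -> List[Prediction]:
    """Keep predictions with Pa > Pi, highest Pa first.

    ASSUMPTION: each line reads 'Pa Pi cell_line_id'; this is a hypothetical
    layout, not the documented PASS output format.
    """
    preds = []
    for line in item_text.splitlines():
        pa, pi, cell_line = line.split(maxsplit=2)
        preds.append(Prediction(float(pa), float(pi), cell_line))
    kept = (p for p in preds if p.pa > p.pi)
    return sorted(kept, key=lambda p: p.pa, reverse=True)
```

Calling parse_sd_data_items on the text of a results file returns one dictionary per compound; passing the relevant item to likely_cytotoxic then yields the cell lines ranked by Pa.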
The average prediction accuracy, estimated by leave-one-out cross-validation over the whole PASS training set, is about 96%.
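For readers unfamiliar with the procedure, leave-one-out cross-validation removes each compound from the training set in turn, predicts it from the remaining compounds, and averages the outcomes. The Python sketch below shows the loop only. The nearest-neighbour classifier is a toy stand-in chosen for brevity, not the PASS algorithm, and the function names and the plain fraction-correct score are assumptions made for illustration; they do not reproduce how the 96% figure was computed.

```python
from typing import Callable, List, Sequence, Tuple

# (descriptor vector, label) where label 1 = active and 0 = inactive
Sample = Tuple[Sequence[float], int]


def nearest_neighbour_label(train: List[Sample], query: Sequence[float]) -> int:
    """Toy stand-in classifier: return the label of the closest training sample."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda s: sq_dist(s[0], query))[1]


def leave_one_out_accuracy(
    data: List[Sample],
    classify: Callable[[List[Sample], Sequence[float]], int],
) -> float:
    """Fraction of samples predicted correctly when each is held out in turn."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]   # every sample except the i-th
        if classify(train, features) == label:
            correct += 1
    return correct / len(data)
```

Any classifier with the same call signature can be passed as classify, so the loop itself stays independent of the prediction method.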