Although GWA studies have shown some success in the past few years, they suffer from a serious multiple-testing problem when applied to a large number of markers in a large population, and their underlying Common Disease Common Variant (CDCV) hypothesis has been challenged by the fact that both common and rare variants may be involved in the pathogenesis of common diseases. To overcome these limitations and to complement these traditional statistical methods, computational approaches that rely on properties of variants rather than on experimental data from patients have been designed for the detection of deleterious variants, aided by the growing functional annotation of the human genome sequence.
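The scale of the multiple-testing problem mentioned above can be illustrated with a short calculation. The sketch below is only an illustration: the marker counts, the 0.05 significance level, the assumption of independent tests, and the function names are all assumptions made for this example, not values or methods taken from any particular GWA study.

```python
# Illustrative sketch only: marker counts and the 0.05 level are assumed,
# and tests are treated as independent, which real markers are not.

def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise rate at most alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    for n_markers in (1, 100, 500_000):
        uncorrected = family_wise_error_rate(alpha, n_markers)
        per_test = bonferroni_threshold(alpha, n_markers)
        corrected = family_wise_error_rate(per_test, n_markers)
        print(
            f"markers={n_markers:>7}  "
            f"uncorrected FWER={uncorrected:.4f}  "
            f"Bonferroni per-test level={per_test:.2e}  "
            f"corrected FWER={corrected:.4f}"
        )
```

Under these assumptions, a false positive is almost certain with only a hundred uncorrected tests. The corrected per-test level for hundreds of thousands of markers is so strict that variants with modest effects are hard to detect, which is one motivation for the complementary computational approaches discussed here.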
Although such methods may never be accurate enough to replace wet-lab experiments, they may help in identifying and prioritizing a small number of susceptible and tractable candidate nsSNPs from pools of available data [1]. Recent studies [9–21] have shown that computational methods can estimate the functional effects of nsSNPs well. These approaches may use structural information, sequence information, and annotations as classification features, and logistic regression [21], neural networks [1], Bayesian models [5], and other statistical approaches [18] as classifiers. In this paper, we first summarize the databases that collect nsSNP data and provide a framework for nsSNP function prediction methodology. We survey existing methods for predicting deleterious nsSNPs and summarize the features used in the prediction models and the algorithms used to distinguish deleterious nsSNPs.
Then, we discuss computational methods that use comparative genomics to predict the deleteriousness of nsSNPs in both coding and noncoding regions. We also look at prioritization methods for detecting disease-specific nsSNPs and discuss methods for predicting deleterious nsSNPs among rare variants. Finally, we suggest using multiple prediction algorithms to enhance prediction power and discuss the challenges and likely future improvements of such methods.

2. Databases for nsSNPs

Many popular databases provide useful information about nsSNPs. In particular, as shown in Table 1, deleterious nsSNPs are mainly collected in four databases: the Online Mendelian Inheritance in Man (OMIM) database [22], the Human Gene Mutation Database (HGMD) [12], the UniProt/Swiss-Prot database [13], and the Human Genome Variation database (HGVbase) [14].