Bioinformatics and Biostatistics

Network/Pathway Analysis
Significance of Proteomic Results
SNP Analysis


Technique: Network/Pathway Analysis

A notable recurring theme in the post-genomic life sciences is the use of systems theory to elucidate the mechanisms of biological phenomena. This trend marks a growing departure from traditions that impose linearity and simplicity as theoretical constraints when reasoning from cause to effect, e.g. when a particular adverse clinical outcome is attributed to an abnormality in a single protein. Systems theory proposes a contrasting approach in which a biological phenomenon is studied as part of an integrated system, i.e. the aforementioned clinical outcome is attributed to a perturbation of an intricately connected network of proteins and other relevant biological components. Dynamic for static, integration for isolation, and holism for reductionism: these contrasts make systems-level analysis of particular interest to our research. At the Case Center for Proteomics and Bioinformatics, network/pathway analyses are routinely implemented to complement standard analytical pipelines in biomarker discovery. In Ingenuity Pathway Analysis® (IPA) and MetaCore™, clinical biomarkers are interpreted in the context of biological processes and pathways. These high-level synopses of complex data allow clinically relevant interpretation of the observed perturbations. In IPA, biomarkers can be evaluated for novelty, relevance, and robustness by comparing them against a knowledgebase of published findings. MetaCore™, a largely comparable resource, often serves as an independent means of assessing the reproducibility of network components extrapolated from IPA, and vice versa. MetaDrug™, an integral part of the MetaCore suite, equips the later stages of a clinical study with the ability to predict potential pharmacological effects of new and existing compounds.
Notably, the principal commonality that secures the role of IPA and MetaCore in our workflow is the high quality of their knowledgebases, a non-trivial achievement given the arduous nature of literature mining by manual curation. In contrast, Pathway Studio® builds its knowledgebase using artificial intelligence coupled with natural language processing. Pathway Studio introduces a flexibility in generating networks that is unsurpassed by its commercial counterparts. Using Pathway Studio, researchers can generate an up-to-date, comprehensive relational network of a particular disease and its associated biological components, or a specialized disease network focused on a specific tissue, molecular level, clinical phase, or grant. This flexibility is a valuable asset when studying lesser-known diseases, or well-known diseases in a specialized context. These commercial applications are almost always used in conjunction with publicly available tools, algorithms, and databases. By incorporating a multitude of approaches, our network/pathway efforts aim to extract clinically applicable knowledge from high-throughput data, moving seamlessly from hypothesis to study and from study to hypothesis while addressing the complexity intrinsic to the life sciences.
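
As a rough illustration of how a set of biomarkers can be interpreted in the context of a pathway, the sketch below computes a right-tailed hypergeometric (over-representation) p-value. This is a generic textbook calculation of the kind found in many publicly available enrichment tools; it is not taken from IPA, MetaCore, or Pathway Studio, and the function name, parameter names, and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_hits, n_overlap):
    """Right-tailed hypergeometric p-value for pathway over-representation.

    n_background: number of proteins that could have been detected (assumed)
    n_pathway:    number of background proteins annotated to the pathway
    n_hits:       number of candidate biomarkers in the study
    n_overlap:    number of candidate biomarkers that fall in the pathway

    Returns P(X >= n_overlap) when n_hits proteins are drawn at random
    from the background without replacement.
    """
    upper = min(n_pathway, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p = enrichment_p_value(n_background=20000, n_pathway=100, n_hits=200, n_overlap=8)
print(f"over-representation p-value: {p:.3g}")
```

A small p-value suggests that the candidate list overlaps the pathway more than random sampling would explain. In practice, many pathways are tested at once, so such p-values would normally be corrected for multiple testing.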

Contact Person:
Case Center for Proteomics and Bioinformatics
Case Western Reserve University
10900 Euclid Avenue, BRB 919
Cleveland, Ohio 44106
(phone) 216-368-2317
(fax) 216-368-6846

Technique: Significance of Proteomic Results

A common task in proteomics data analysis is to determine the statistical significance of differential expression between experimental conditions. A number of pre-processing steps are required to remove systematic variation due to experimental artifacts in the measured intensities and to ensure that the usual assumptions for statistical inference (e.g. normality, homoscedasticity) are met. We recently developed routines to handle missing data and normalization issues that are applicable to any type of multivariate dataset, including proteomics data (e.g. from label-free assays). An adaptive regularization procedure recently developed in the center achieved the best combination of normalization and variance stabilization required for preprocessing multivariate datasets of this type, where the number of variables dominates the number of samples. For statistical inference, we routinely employ unsupervised methods as well as model-based approaches (e.g. linear mixed models) that incorporate measurement error, in combination with empirical Bayes estimators and error-rate correction methods. These conventional statistical approaches are, however, challenged by modern high-dimensional datasets in which the number of variables greatly exceeds the number of samples. One active area of research is reducing the inflated error rates and addressing model overfitting in these settings.
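
To make the idea of error-rate correction concrete, the sketch below applies the Benjamini-Hochberg false discovery rate adjustment to a list of per-protein p-values. It is shown only as one widely used example of such a correction; the text above does not say which method the center uses, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each raw p-value is scaled by m / rank (rank 1 = smallest p-value),
    then a running minimum taken from the largest rank downward keeps
    the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four proteins.
raw = [0.01, 0.04, 0.03, 0.005]
for p_raw, p_adj in zip(raw, benjamini_hochberg(raw)):
    print(f"raw={p_raw:.3f}  adjusted={p_adj:.3f}")
```

When thousands of proteins are tested, selecting those with adjusted values below a chosen threshold controls the expected proportion of false positives among the selected proteins, rather than the per-test error rate.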

Technique: SNP Analysis

It is estimated that >99% of the human genome sequence is the same across the population; variations in the remainder are responsible for the diversity among human beings. Single Nucleotide Polymorphisms (SNPs) are among the most common sequence variations: a single nucleotide (or a small number of them) in a subpopulation differs from that in the rest of the population. SNPs occur about every 200 bases along the 3-billion-base human genome. They can affect how individuals develop diseases and respond to pathogens and drugs, and they are key enablers of personalized medicine. Methods to detect SNPs (SNP genotyping) are hybridization-based or enzyme-based. More recently, next-generation sequencing technologies, which are suited to identifying multiple SNPs in a small region, have been used to identify SNPs. Common SNPs can be found at http://www.ncbi.nlm.nih.gov/projects/SNP/ and http://www.hapmap.org/index.html.en/.
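
The definition above can be illustrated with a toy example that scans already-aligned sequences for positions where a minority of samples carries a different base. This is a didactic sketch only, not a genotyping or variant-calling pipeline; the function name, the example sequences, and the 1% minor-allele threshold (a commonly quoted convention) are assumptions that do not come from the text.

```python
from collections import Counter


def find_snp_sites(aligned_sequences, min_minor_freq=0.01):
    """List positions where aligned sequences differ by a single base.

    Returns tuples (position, major_base, minor_base, minor_frequency)
    for each position whose second most common base reaches
    min_minor_freq. Positions are 0-based.
    """
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    n = len(aligned_sequences)
    sites = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_sequences)
        if len(counts) < 2:
            continue  # every sample carries the same base here
        (major, _), (minor, minor_n) = counts.most_common(2)
        if minor_n / n >= min_minor_freq:
            sites.append((pos, major, minor, minor_n / n))
    return sites


# Hypothetical aligned reads from four samples.
samples = ["ACGT", "ACGT", "ATGT", "ACGA"]
for pos, major, minor, freq in find_snp_sites(samples):
    print(f"position {pos}: {major}>{minor} (minor frequency {freq:.2f})")
```

As a rough check on scale, one SNP about every 200 bases across a 3-billion-base genome corresponds to roughly 15 million variable sites.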

Contact Person:
Jean-Eudes Dazard, Ph.D. (jean-eudes.dazard@case.edu)
Case Center for Proteomics and Bioinformatics
Case Western Reserve University
10900 Euclid Avenue, BRB 936
Cleveland, Ohio 44106
(phone) 216-368-3157
(fax) 216-368-6846