Nancy Zhang Seminar

Stanford University, "Simultaneous Change-point Models with Applications to Cross-sample and Cross-platform Analysis of DNA Copy Number"

When: 2009-10-22, from 14:00 to 15:15
Where: RRI 101

1   abstract

DNA copy number analysis involves the detection of chromosomal gains and losses using high-density microarray platforms. Change-point methods have been applied successfully to detecting signals in single data sequences derived from one biological sample. However, it is common to have data sets involving hundreds to thousands of biological samples. How should information be combined across samples to detect population level common polymorphisms?

Also, how should the samples be summarized to give a sparse signature of variation across the cohort? It is also now common to have the same biological sample assayed using multiple experimental platforms. For example, in the Cancer Genome Atlas project, each biological sample is processed using Illumina, Affymetrix and Agilent chips. How should data be integrated across platforms to achieve higher accuracy?

I will discuss the statistical issues underlying these problems and formulate a class of simultaneous change-point models for cross-sample and cross-platform data integration. These models lead to interpretable scan statistics whose significance level can be theoretically analyzed. I will also discuss model selection approaches for this class of models. The insights gained from this study can be applied to integrative analysis of data from other types of genome-wide profiling experiments, such as methylation or RNA expression.

2   log

DNA copy number

?? same boundary

2.1   pull samples together to detect

sparse representation

shared change-point free jump model

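f_i(t) = u + sigma_i I(s-t), where s is the change-point shared by all samples and sigma_i is the jump size, free to differ for each sample i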

sum of squared t-statistics across samples, then take the max over candidate change-points
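A minimal sketch of this scan, under stated assumptions: noise variance is taken to be 1 in every sample (so each per-sample statistic is a z-score rather than a studentized t-statistic), only one shared change-point is searched for, and the name scan_shared_changepoint and the toy data are hypothetical.

```python
import math

def scan_shared_changepoint(samples):
    """Return (best_split, best_stat) for a single shared change-point.

    samples: list of equal-length lists, one per biological sample.
    Assumes unit noise variance in every sample (illustrative only).
    """
    T = len(samples[0])
    best_split, best_stat = None, -math.inf
    for t in range(1, T):  # candidate split: x[:t] versus x[t:]
        scale = math.sqrt(1.0 / t + 1.0 / (T - t))
        stat = 0.0
        for x in samples:
            left = sum(x[:t]) / t
            right = sum(x[t:]) / (T - t)
            z = (right - left) / scale
            stat += z * z  # sum of squares across samples
        if stat > best_stat:  # then take the max over t
            best_split, best_stat = t, stat
    return best_split, best_stat

# Toy data: two of three samples jump (in opposite directions) after position 4.
samples = [
    [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0],
    [0.1, 0.0, 0.0, -0.1, -1.0, -0.9, -1.1, -1.0],
    [0.0, 0.1, -0.1, 0.0, 0.1, 0.0, -0.1, 0.0],
]
split, stat = scan_shared_changepoint(samples)
```

Squaring before summing means gains and losses at the same boundary reinforce each other instead of cancelling.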

false positive rate

2.2   multi-platform integration (MPCBS)

r_k: linear response coefficient for platform k

scaling factor for each platform: (signal response rate) * sqrt(number of probes) / error stdev
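The scaling factor can be illustrated with a few lines of arithmetic. This is only a sketch: the function name platform_scale, the platform labels, and all numbers are made up for illustration and do not come from the talk.

```python
import math

def platform_scale(response_rate, n_probes, error_sd):
    """Scaling factor: response rate * sqrt(number of probes) / error stdev."""
    return response_rate * math.sqrt(n_probes) / error_sd

# Hypothetical platforms: (response rate, probes in the region, error stdev).
platforms = {
    "platform_A": (1.0, 100, 0.5),
    "platform_B": (0.5, 400, 0.5),
}
scales = {name: platform_scale(*p) for name, p in platforms.items()}
```

In this toy case a platform with half the response rate but four times as many probes ends up with the same scaling factor, since the probe count enters through its square root.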

2.3   Recursive segmentation approach (MSCBS)

scan by z-score

find the max

go back and do it again

number of change points chosen by BIC
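The scan, find-the-max, and repeat steps can be sketched as plain binary segmentation on a single sequence. This is a simplified stand-in, not the method from the talk: unit noise variance is assumed, the names best_split and segment are hypothetical, and a fixed z-score threshold replaces the BIC-based choice of the number of change points.

```python
import math

def best_split(x):
    """Return (split, |z|) maximizing the mean-difference z-score.

    Assumes unit noise variance. Returns (None, 0.0) if no split
    gives a nonzero z-score.
    """
    n = len(x)
    total = sum(x)
    best_t, best_z = None, 0.0
    left_sum = 0.0
    for t in range(1, n):
        left_sum += x[t - 1]
        left = left_sum / t
        right = (total - left_sum) / (n - t)
        z = abs(right - left) / math.sqrt(1.0 / t + 1.0 / (n - t))
        if z > best_z:
            best_t, best_z = t, z
    return best_t, best_z

def segment(x, threshold, offset=0):
    """Recursively split x; return change-point positions in increasing order."""
    if len(x) < 2:
        return []
    t, z = best_split(x)
    if t is None or z < threshold:
        return []  # stop: no split is strong enough
    return (segment(x[:t], threshold, offset)
            + [offset + t]
            + segment(x[t:], threshold, offset + t))

# Toy sequence with a raised block in the middle.
x = [0.0] * 5 + [3.0] * 5 + [0.0] * 5
change_points = segment(x, threshold=2.0)
```

Each recursive call rescans only its own segment, which is the "go back and do it again" step.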

classic BIC requires the likelihood to be differentiable and the number of candidate models to be limited

model selection involves: N (#samples), T (#probes), m (#change points), M (#"mean" parameters)

BIC - N m H(M/(N m)), where H(p) is the entropy. The extra term offsets the decrease in residual
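To get a feel for the size of the entropy term, here is a small sketch. It assumes H is the binary entropy in nats and reads M/Nm as M/(N*m); both readings are assumptions, as are the function names and the example numbers.

```python
import math

def binary_entropy(p):
    """H(p) = -p log p - (1 - p) log(1 - p), in nats; H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def entropy_term(N, m, M):
    """The N * m * H(M / (N * m)) term from the notes."""
    return N * m * binary_entropy(M / (N * m))

# Hypothetical example: 100 samples, 10 change points, 500 "mean" parameters.
term = entropy_term(N=100, m=10, M=500)
```

The term vanishes when M equals 0 or N*m and is largest when M is half of N*m.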

2.4   Validation:

  1. number of disagreements between technical replicates
  2. compare child and parents: a call in the child counts only if a parent has it
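Both checks are simple to state in code. The sketch below is illustrative only: the data representation (per-probe call lists for replicates, sets of region identifiers for trios) and the function names are assumptions.

```python
def replicate_disagreements(calls_a, calls_b):
    """Count positions where two technical replicates get different calls."""
    return sum(1 for a, b in zip(calls_a, calls_b) if a != b)

def unexplained_child_calls(child, mother, father):
    """Return child calls that appear in neither parent."""
    return set(child) - (set(mother) | set(father))

# Hypothetical per-probe calls (0 = normal, 1 = variant) for two replicates.
n_disagree = replicate_disagreements([0, 1, 1, 0], [0, 1, 0, 0])

# Hypothetical region identifiers called in a trio.
unexplained = unexplained_child_calls({"cnv1", "cnv2"}, {"cnv1"}, set())
```

Fewer disagreements and fewer unexplained child calls both suggest more reliable calls.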

2.5   cons

  1. the statistic (essentially a sum of chi-square statistics) performs poorly for rare variants (<5%)
