microarray
1 2007-09-03 handle Affymetrix data with Bioconductor
project: Bioconductor
use R commands to open the documentation about it, e.g. to find out what CEL and CDF files are.
2 2007-09-04 how to read custom CDF files (not available through Bioconductor)
install the Bioconductor package makecdfenv:
source("http://bioconductor.org/biocLite.R")
biocLite("makecdfenv")
read the CEL file and make a CDF environment with the same name as Data@cdfName:
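library(affy)        # provides ReadAffy and mas5
library(makecdfenv)  # provides make.cdf.env
Data <- ReadAffy('ge 1092 (Con 1) Drosge1 jt.CEL')
Data@cdfName         # prints the CDF name; the environment below must be bound to this name
DrosGenome1 = make.cdf.env('DrosGenome1.CDF')
eset = mas5(Data)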
another way is to build an R package from the CDF file and install it:
R> make.cdf.package("atSNPtil_expr.cdf", cdf.path='/tmp/yanli_8-8-07/Full/atSNPtil_expr/LibFiles/', species = "Arabidopsis_thaliana", package.path='/tmp/yanli_8-8-07')
bash> R CMD INSTALL atsnptilexprcdf
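The installed package name above (atsnptilexprcdf) looks like the CDF file name lowercased with every non-alphanumeric character dropped. A minimal shell sketch of that observed pattern, handy for working out which directory to pass to R CMD INSTALL. It mirrors the single example in these notes and is an assumption, not a statement of makecdfenv's exact naming rule:

```shell
# Derive the package directory name from the CDF file name:
# lowercase everything, then delete every non-alphanumeric character.
cdf_file="atSNPtil_expr.cdf"
pkg=$(printf '%s' "$cdf_file" | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]')
echo "$pkg"   # prints atsnptilexprcdf

# Then, from the directory given as package.path:
#   R CMD INSTALL "$pkg"
```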
3 2007-09-05 some definitions
- CEL
- files contain measured intensities and locations for an array that has been hybridized.
- CDF
- files contain the information relating probe pair sets to locations on the array.
4 2008-03-13 CelQuantileNorm
normalization program used in the WTCCC's 2007 paper: http://www.wtccc.org.uk/info/software.shtml
before compiling, change the array size: set #define CEL_INTENSITY_ARRAY_SIZE 2598544 in both cel_qnorm_pass1.cpp and gtype_cel_to_pq.h
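The same edit has to be made in two files, so it can be scripted. A minimal sketch with sed, assuming the #define sits on a line of its own in each file. set_array_size is a hypothetical helper written for this note, not part of CelQuantileNorm, and the demo runs on a scratch file with a made-up starting value. (2598544 = 1612 x 1612, so the value is presumably the number of cells on a 1612 x 1612 array; that is an inference from the arithmetic.)

```shell
# Hypothetical helper: rewrite the CEL_INTENSITY_ARRAY_SIZE value in each file given.
# Usage: set_array_size SIZE FILE...
set_array_size() {
    size="$1"; shift
    for f in "$@"; do
        # -i.bak edits in place and keeps the original as FILE.bak
        sed -i.bak -E \
            "s/^(#define[[:space:]]+CEL_INTENSITY_ARRAY_SIZE[[:space:]]+)[0-9]+/\1${size}/" "$f"
    done
}

# Demo on a scratch copy; the starting value 1000000 is made up.
demo=$(mktemp -d)
printf '#define CEL_INTENSITY_ARRAY_SIZE 1000000\n' > "$demo/gtype_cel_to_pq.h"
set_array_size 2598544 "$demo/gtype_cel_to_pq.h"
cat "$demo/gtype_cel_to_pq.h"   # prints: #define CEL_INTENSITY_ARRAY_SIZE 2598544
```

On the real source tree the call would be: set_array_size 2598544 cel_qnorm_pass1.cpp gtype_cel_to_pq.h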
4.1 some command lines
Generate a reference intensity array from the mean quantiles:
./CelQuantileNorm/cel_qnorm_pass1 -o 250k_test/yanli8-8-07.intfile /Network/Data/250k/raw_data/yanli8-8-07/*.CEL
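To make "reference intensity array from the mean quantiles" concrete, here is a toy sketch of the generic quantile-normalization idea: sort each array's intensities, then average across arrays rank by rank. It illustrates the technique only; the one-value-per-line text files are made up, and this is not the CelQuantileNorm code or its .intfile format.

```shell
work=$(mktemp -d)

# Two toy "arrays", one intensity per line (made-up values).
printf '5\n2\n3\n' > "$work/a.txt"
printf '4\n1\n6\n' > "$work/b.txt"

# Sort each array numerically so that line k holds its k-th smallest intensity.
sort -n "$work/a.txt" > "$work/a.sorted"
sort -n "$work/b.txt" > "$work/b.sorted"

# Put the sorted arrays side by side and average each row:
# row k of the result is the mean of the k-th quantile over all arrays.
paste "$work/a.sorted" "$work/b.sorted" \
  | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; print s / NF }' \
  > "$work/reference.txt"

cat "$work/reference.txt"   # prints 1.5, 3.5, 5.5 (one per line)
```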
Output quantile-normalized results for each array into a directory, one output file per array, with the debug flag on:
./gtype_cel_to_pq -refintensity /tmp/intfile -cdf ~/script/variation/genotyping/250ksnp/data/atSNPtilx520433_rev2/Full/atSNPtil_geno/LibFiles/atSNPtil_geno.cdf -log-average -outdir /tmp/ -debug /Network/Data/250k/raw_data/yanli8-8-07/*.CEL >/tmp/stdout1 2>/tmp/stderr1
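What normalizing one array against a reference amounts to, as a toy sketch of the generic technique: each intensity is replaced by the reference value of the same rank, and the original probe order is restored afterwards. The file names and values are made up, ties are broken arbitrarily by sort, and this is not what gtype_cel_to_pq literally does internally.

```shell
work=$(mktemp -d)

# Toy reference (already sorted ascending) and one toy array, one value per line.
printf '1.5\n3.5\n5.5\n' > "$work/reference.txt"
printf '5\n2\n3\n'       > "$work/array.txt"

# 1. tag each intensity with its original line number
# 2. sort by intensity, so row k holds the k-th smallest value
# 3. attach the k-th reference value to row k
# 4. sort back into the original order and keep only the reference value
awk '{ print NR, $1 }' "$work/array.txt" \
  | sort -k2,2n \
  | paste -d ' ' - "$work/reference.txt" \
  | sort -k1,1n \
  | awk '{ print $3 }' > "$work/array.qnorm.txt"

cat "$work/array.qnorm.txt"   # prints 5.5, 1.5, 3.5 (one per line)
```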
Output quantile-normalized results of all arrays into one file, restricted to a subset of probes:
./CelQuantileNorm/gtype_cel_to_pq -refintensity 250k_test/yanli8-8-07.intfile -cdf ~/script/variation/genotyping/250ksnp/data/atSNPtilx520433_rev2/Full/atSNPtil_geno/LibFiles/atSNPtil_geno.cdf -single 250k_test/yanli8-8-07 -subset 250k_test/250kprobe_subset.txt -log-average /Network/Data/250k/raw_data/yanli8-8-07/*.CEL
Same as above (all arrays into one file, restricted to a subset of probes), with the debug flag on:
./gtype_cel_to_pq -refintensity /tmp/intfile -cdf ~/script/variation/genotyping/250ksnp/data/atSNPtilx520433_rev2/Full/atSNPtil_geno/LibFiles/atSNPtil_geno.cdf -debug -single /tmp/yanli8-8-07 -subset 250kprobe_subset.txt -log-average /Network/Data/250k/raw_data/yanli8-8-07/*.CEL >/tmp/stdout1 2>/tmp/stderr1
Run Chiamo:
./chiamo -i 250k_test/yanli8-29-07_chiamo.input -o 250k_test/yanli8-29-07_chiamo.out -max1 -max2 -nmax 200 -n 0 -b 0 -approx 1 20 -pgd 0.01 >/tmp/stdout
5 2008-03-13 Affymetrix SDK
from http://www.affymetrix.com/support/developer/fusion/index.affx?terms=yes
direct download link: http://www.affymetrix.com/Auth/support/developer/fusion/affy-fusion-release-110b.zip
in affy/sdk/calvin_files/makefile.g5, change '-I../portability' in CPPFLAGS to '-I../', because some #include directives name the portability directory explicitly (#include <portability/....>), so the include path has to point at its parent.
read_cel command line:
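The makefile change can also be applied with sed. A minimal sketch, run here on a scratch file: the CPPFLAGS line in the demo is made up for illustration and is not copied from the SDK's makefile.g5.

```shell
demo=$(mktemp -d)

# Made-up stand-in for the relevant line of makefile.g5.
printf 'CPPFLAGS = -O2 -I../portability -I../utils\n' > "$demo/makefile.g5"

# Only touch lines mentioning CPPFLAGS; keep the original as makefile.g5.bak.
sed -i.bak -e '/CPPFLAGS/ s|-I\.\./portability|-I../|' "$demo/makefile.g5"

cat "$demo/makefile.g5"   # prints: CPPFLAGS = -O2 -I../ -I../utils
```

On the real tree the same sed command would be pointed at affy/sdk/calvin_files/makefile.g5.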
crocea@banyan:~/script/variation/genotyping/250ksnp/affymetrix_dev/affy/sdk/calvin_files$ ./read_cel /Network/Data/250k/raw_data/yanli8-8-07/Ler-1A.CEL ~/script/variation/genotyping/250ksnp/data/atSNPtilx520433_rev2/Full/atSNPtil_geno/LibFiles/atSNPtil_geno.cdf >/tmp/stdout