R syntax
- 1 rpy
- 2 batch run an R program(2006-03-16)
- 3 2006-03-17 install randomForest on hpc-cmb, how to build an R package
- 4 plot graphs in multiple pages(2006-08-27)
- 5 differences from Splus (S-plus)(2006-10-26)
- 6 2006-11-26 how to inspect/access the components(values) of an object
- 7 2007-02-21 density estimation
- 8 2007-08-20 how to access command line arguments
- 9 2007-09-03 how to get familiar with a package (i.e. package = affy)
- 10 2007-09-03 install bioconductor
- 11 2007-09-10 install R 2.5.1 in ubuntu 7.04 feisty fawn
- 12 2008-02-15 get running time in R
- 13 2008-05-17 string operation
- 14 2008-07-31 type conversion
- 15 2008-10-05 debug
1 rpy
R operators can be called from python through rpy. For example, to get a boolean list (0 or 1) by testing each value of a list:
ls = [1,2,1,2,1,1,1,2]
from rpy import r
r["=="](ls, 2)
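For comparison, a minimal sketch of what that rpy call corresponds to on the R side, assuming the same data. The names ls_values, flags and as_01 are made up for this illustration.

```r
# Same values as the python list in the rpy example.
ls_values <- c(1, 2, 1, 2, 1, 1, 1, 2)

# Backticks let an operator be called like an ordinary function,
# which is what r["=="](ls, 2) asks R to do.
flags <- `==`(ls_values, 2)

# Convert TRUE/FALSE to 1/0.
as_01 <- as.integer(flags)
print(as_01)
```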
- A . in an R function name is written as _ in python, e.g. t.test is t_test in python. It is not known which python character would map back to an R _, so don't use _ in R function names that are called this way.
- 2007-12-05 disable the summary output when rpy is loaded (redirect stdout before import)::
  import sys
  sys.stdout = open('/dev/null', 'w')
  from rpy import r
  sys.stdout = sys.__stdout__
- "Execution halted" encountered in submitted jobs:
If you run it interactively from a console, you probably won't see this error. What happens is that rpy (or R) wants to be interactive (i.e. print something to report an error, progress...), finds that it can't be (no active terminal), and says "Execution halted". Someone found this snippet in main/errors.c of R 1.5.1:
if ( !R_Interactive && !haveHandler && inError ) {
    REprintf("Execution halted\n");
    R_CleanUp(SA_NOSAVE, 1, 0); /* quit, no save, no .Last, status=1 */
}
In the case of randomForest_fit() in rpart_prediction.py, randomForest is installed in my home directory on hpc-cmb, so 'r.library("randomForest")' fails and 'r.library("randomForest", lib_loc=os.path.join(lib_path, "R"))' is needed. Catching the failure with python's 'try ... except ...' solves this problem in an interactive environment, but fails in submitted jobs.
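One way to sidestep the error on the R side is sketched below. It is an alternative under assumptions, not the approach used in rpart_prediction.py. require() returns FALSE instead of raising an error when a package is missing, so a non-interactive job is not halted by the failed lookup. The helper load_pkg and the use of tempdir() as the extra library path are hypothetical.

```r
# Hypothetical helper: try the default library paths first,
# then an extra library directory. Returns TRUE or FALSE, never an error.
load_pkg <- function(pkg, extra_lib) {
  ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
  if (!ok) {
    ok <- suppressWarnings(
      require(pkg, lib.loc = extra_lib, character.only = TRUE, quietly = TRUE)
    )
  }
  ok
}

# tempdir() stands in for a personal library directory.
found_stats <- load_pkg("stats", tempdir())
found_fake <- load_pkg("noSuchPackage12345", tempdir())
print(c(found_stats, found_fake))
```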
2 batch run an R program(2006-03-16)
bash, from standard input:
R --vanilla <job.R
bash, as a here-document in a script file:
R --vanilla << EOF
data = read.table("filename")
...
EOF
bash, CMD BATCH:
R CMD BATCH job.R
standalone R program with capability to pass arguments:
#2008-05-17 $* stands for all the arguments on the shell commandline after $0
R --vanilla --args $0 $* <<EOF
command_args = commandArgs()
#command_args starts with ["/usr/lib/R/bin/exec/R", "--vanilla", "--args"]
print(command_args)
#EOF below is optional
EOF
use Rscript, similar to perl, python:
#!/usr/bin/env Rscript
command_args = commandArgs()
#command_args starts with ["/usr/lib/R/bin/exec/R", "--slave", "--no-restore", "--file=./this_R_script.R", "--args"]
print(command_args)
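Since Rscript passes the script name as a --file= entry, as the comment above shows, the script can recover its own path from commandArgs(). A minimal sketch; the variable names are made up, and script_path is empty when R was not started through Rscript.

```r
# All arguments, including the ones R adds itself.
all_args <- commandArgs()

# Keep only the entry of the form --file=<path>, if there is one.
file_arg <- grep("^--file=", all_args, value = TRUE)

# Strip the prefix to get the path of the running script.
script_path <- sub("^--file=", "", file_arg)
print(script_path)
```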
3 2006-03-17 install randomForest on hpc-cmb, how to build an R package
(this installs into a personal directory rather than the system directory, and shows how to load the package into R from there)
download from http://cran.at.r-project.org/src/contrib/Descriptions/randomForest.html
- uncompress
- 'tar -zxvf randomForest_4.5-16.tar.gz', the new source directory is randomForest.
- build
- 'R CMD build --binary randomForest' (Don't go into directory randomForest.) A file randomForest_4.5-16_R_i386-(platform).tar.gz is created.
- install
- unpack the binary tarball with 'tar -zxvf' in the directory that will serve as your R library (delete or move the source directory first if you unpack in the same directory).
- load package
This is for a package not in system directory(i.e. '/usr/lib/R/library'). In R console, two ways:
- library(randomForest, lib.loc='/home/rcf-14/yuhuang/lib64/R')
- .libPaths('/home/rcf-14/yuhuang/lib64/R') ; library(randomForest)
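The second way can be sketched as follows, assuming a throwaway directory under tempdir() in place of the real library path. The install and load calls are left commented out because they need network access and the package itself.

```r
# Stand-in for a personal library such as the lib64/R directory above.
lib_dir <- file.path(tempdir(), "R-lib")
dir.create(lib_dir, showWarnings = FALSE, recursive = TRUE)

# Put the personal library in front of the existing search path.
old_paths <- .libPaths()
.libPaths(c(lib_dir, old_paths))
print(.libPaths())

# With the path set, a plain install/load would use lib_dir first:
# install.packages("randomForest", lib = lib_dir)
# library(randomForest)
```

Note that .libPaths() silently drops directories that do not exist, which is why the directory is created first.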
4 plot graphs in multiple pages(2006-08-27)
- check trellis.user.pdf
use the postscript device and the command 'par(ask=TRUE)':
par(ask=TRUE)
postscript('/tmp/plot.ps')
plot(z)
plot(rnorm(50))
dev.off()
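trellis.device() can't be run within a function; it gives empty graphic output (2006-10-19). The calls in question:
trellis.device(postscript, color=T, file='~/script/test/math650/figures/math650_hw8_fig2.eps')
draw_data_no_intr(reg, data2)
dev.off()
A likely cause: lattice (trellis) plots are only drawn when they are printed. That happens automatically at top level but not inside a function, where the plot object has to be passed to print() explicitly.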
2008-05-17 using the pdf device to plot graphs in multiple pages. start with a pdf(...) and every plot(...) following it starts a new page; finish with dev.off() so the file is written out:
pdf(output_fname)
for (i in seq(start_array_id, end_array_id)) {
    plot(density(m.norm), type="l", xlim=c(1, 10), ylim=c(0, 1), main=paste("array id = ", i))
    lines(density(mprobe.mean), col="red")
}
dev.off()
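A self-contained version of the same idea, as a sketch with simulated data. The output file name, the three pages and the rnorm() samples are all made up for illustration.

```r
# Write to a temporary file so the example leaves nothing behind.
out_file <- file.path(tempdir(), "multi_page.pdf")
set.seed(1)

pdf(out_file)
for (i in 1:3) {
  x <- rnorm(100, mean = i)
  # Each high-level plot() call starts a new page in the pdf.
  plot(density(x), type = "l", main = paste("sample", i))
  # lines() adds to the current page instead of starting a new one.
  lines(density(x + 0.5), col = "red")
}
# Closing the device flushes the pages to disk.
dev.off()
```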
5 differences from Splus (S-plus)(2006-10-26)
- cor(x,y,na.method="available"):
- fails in R with "unused argument(s) (na.method ...)". R's cor() has no na.method argument; missing values are handled through the use argument instead, e.g. use="complete.obs".
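A small sketch of the R spelling, with made-up vectors that each contain one missing value:

```r
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)

# use = "complete.obs" drops every pair in which x or y is NA,
# leaving (1,2), (2,4) and (5,10).
r_complete <- cor(x, y, use = "complete.obs")
print(r_complete)

# With the default use = "everything" the result is NA.
r_default <- cor(x, y)
print(r_default)
```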
In S-PLUS, a variable whose name contains "_" can't be assigned with '=' (S-PLUS accepts "_" as an assignment operator, which probably explains this). Use '<-' instead, i.e. LOG_AB_RATIO <- log(data1$BRAIN/data1$LIVER); print(LOG_AB_RATIO). Typing just 'LOG_AB_RATIO' outputs nothing. However LOGABRATIO = log(data1$BRAIN/data1$LIVER); works.
6 2006-11-26 how to inspect/access the components(values) of an object
if the object is generated by some function, say glm, then check the Value section of '?glm'. Access the components via object$value_name.
Or the function attributes(object) shows all the attributes; the component names are listed under its names attribute:
names(attributes(object)) #the names of all attributes of the object
attributes(object) #not only the names but also the contents of each attribute
attributes(object)$attr_name1 #to access a particular attribute of object
attr(object, 'attr_name1') #to access a particular attribute of object
object@attr_name1 #to access a slot of an S4 object
class(object) tells you the type of the object
2008-10-03 str() is handy for inspecting a data structure. it prints a compact overview of the object, somewhat like summary().
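The inspection calls above, applied to a small glm fit as a sketch. The model on the built-in cars data set is only an example and the variable names are made up.

```r
# Fit a simple model on a data set that ships with R.
fit <- glm(dist ~ speed, data = cars)

# Names of the components that can be reached with fit$...
component_names <- names(fit)
print(component_names)

# The type of the object.
fit_class <- class(fit)

# One component, accessed by name.
coefs <- fit$coefficients

# The class is stored as an attribute, so attr() returns the same thing.
same_class <- attr(fit, "class")

# Compact overview, limited to the top level of the list.
str(fit, max.level = 1, give.attr = FALSE)
```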
7 2007-02-21 density estimation
ecdf() Empirical Cumulative Distribution Function
density() Kernel Density Estimation
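A minimal sketch of both functions on simulated data; the sample and the variable names are made up.

```r
set.seed(42)
x <- rnorm(200)

# ecdf() returns a function: Fn(q) is the fraction of samples <= q.
Fn <- ecdf(x)
print(Fn(0))

# density() returns the evaluation grid in $x and the estimate in $y.
d <- density(x)
print(d$bw)          # bandwidth chosen by the default rule
print(length(d$x))   # number of grid points
```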
8 2007-08-20 how to access command line arguments
commandArgs() provides access to the command line arguments. example:
for (e in commandArgs())
{
    cat(e, "\n")
}
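The loop above also prints the arguments R adds itself. To see only the ones given after --args, commandArgs() takes a trailingOnly argument. A short sketch with made-up variable names:

```r
# Everything, including the R executable and its own options.
all_args <- commandArgs()

# Only the arguments supplied after --args (empty if there are none).
user_args <- commandArgs(trailingOnly = TRUE)

for (e in user_args) {
  cat(e, "\n")
}
cat("number of user arguments:", length(user_args), "\n")
```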
9 2007-09-03 how to get familiar with a package (i.e. package = affy)
load the library and open vignettes:
library(affy)
openVignette()
you might need to set the pdfviewer option if the default viewer is not available on your system:
options("pdfviewer"="/usr/bin/evince")
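Base R has similar tools that work for any installed package. The sketch below uses grid as a stand-in, only because it ships with R; swap in affy once it is installed.

```r
# Attach the package so its exports are visible under "package:<name>".
library(grid)

# Names of the functions and objects the package exports.
exported <- ls("package:grid")
print(head(exported))

# Vignettes are listed in the $results matrix, which may have zero rows.
vig <- vignette(package = "grid")$results
print(vig[, c("Item", "Title"), drop = FALSE])
```

help(package = "grid") gives the package index page as another starting point.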
10 2007-09-03 install bioconductor
follow http://www.bioconductor.org/download/:
source("http://bioconductor.org/biocLite.R")
biocLite()
pass lib='/usr/local/lib/R/site-library' to biocLite() to specify the library path. '/usr/local/lib/R/site-library' is already the default, though.
10.1 to install a specific package
biocLite("tkWidgets") installs the package tkWidgets. it's handy because some functionality brings up a GUI.
10.2 2009-3-5 libraries required
- libblas-dev (for preprocessCore)
- gfortran (for preprocessCore)
11 2007-09-10 install R 2.5.1 in ubuntu 7.04 feisty fawn
http://cran.r-project.org/bin/linux/debian/README.html for the basic setup and for solving errors such as 'NO_PUBKEY'
help is from https://stat.ethz.ch/pipermail/r-sig-debian/2007-July/000231.html
use source packages: add the deb-src line below to /etc/apt/sources.list, then run the two commands after it:
deb-src http://<favorite-cran-mirror>/bin/linux/ubuntu feisty/
sudo apt-get build-dep r-base
sudo apt-get source -b r-base
install the debs in that directory (you need to run 'sudo apt-get -f install' or dselect first to fix some dependency problems. lots of cran packages will be installed.)
script to build any package which can't be run after R is upgraded to 2.5.1:
#!/bin/bash
# Script to automate building of r-cran-* packages for Debian.
# Author: Johannes Ranke <jranke@uni-bremen.de> and
# Vincent Goulet <vincent.goulet@act.ulaval.ca>
#DEBEMAIL=jranke@uni-bremen.de
#DEBFULLNAME="Johannes Ranke"
text="Recompiled on etch for CRAN"
#for i in codetools; do
for i in boot cluster codetools foreign kernsmooth lattice mgcv nlme rcompgen rpart survival vr; do
    cd $i
    rm -rf $i*
    rm *.deb
    apt-get source -t unstable $i
    cd $i-*
    version=`dpkg-parsechangelog | grep ^Version | cut -f2 -d " "`~etchcran.1
    dch -b -v $version -D etch-cran $text
    fakeroot dpkg-buildpackage -B
    cd ../..
done
specifically for python-rpy (_rpy2501.so is missing). version 1.0~rc1-5feistycran.1 sorts higher than 1.0~rc1-5:
apt-get source python-rpy
cd rpy-1.0~rc1/
text="Recompiled on feisty for CRAN"
dch -b -v 1.0~rc1-5feistycran.1 -D feisty-cran $text
sudo apt-get install python-all-dev
fakeroot dpkg-buildpackage -B
12 2008-02-15 get running time in R
proc.time() gives you the user, sys, elapsed time for this whole R session.
Sys.time() tells you the current system time. It returns a value that you can do arithmetic on:
tm = Sys.time()
...
Sys.time()-tm
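Both approaches side by side, as a sketch. The summed square roots are just a made-up workload to have something to time.

```r
start_cpu <- proc.time()
start_wall <- Sys.time()

# Made-up workload.
total <- sum(sqrt(seq_len(1e6)))

# Difference of two proc.time() values: user, system and elapsed seconds.
elapsed_cpu <- proc.time() - start_cpu
print(elapsed_cpu)

# Difference of two Sys.time() values is a difftime object.
elapsed_wall <- Sys.time() - start_wall
print(elapsed_wall)

# Convert to a plain number of seconds when a numeric value is needed.
secs <- as.numeric(elapsed_wall, units = "secs")
```

The units are given explicitly because a difftime picks its own unit (seconds, minutes, ...) depending on the size of the difference.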
13 2008-05-17 string operation
paste returns a string. cat writes the string to stdout or file and returns NULL:
fname = paste('/tmp/other_output/164_mprobe_mean.rda','fda', sep="")
cat(fname, '\n')
no_of_chars = nchar(fname)
cat(substr(fname, 2, 4))
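The same calls with a shorter, made-up file name, so the results are easy to check by eye:

```r
# paste() builds and returns the string; sep = "" joins without a space.
fname <- paste("/tmp/out", ".rda", sep = "")   # "/tmp/out.rda"

# nchar() counts characters.
n_chars <- nchar(fname)                        # 12

# substr() extracts characters 2 through 4 (positions start at 1).
piece <- substr(fname, 2, 4)                   # "tmp"

# cat() prints and returns NULL, so there is nothing useful to keep.
ret <- cat(fname, "\n")
```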
14 2008-07-31 type conversion
read.table() usually converts data into an appropriate type automatically. Its notorious fallback of converting everything non-numeric to the factor data type is a bug generator: afterwards, applying type-casting functions such as as.numeric() or as.integer() directly to the factor returns the internal level codes rather than the original values.
Two alternative ways to resolve it:
- pass as.is=c(1:4) to read.table to disable the fallback conversion. 1:4 means columns 1 to 4. Then a simple as.integer(data[,4]) works.
- wedge as.character() in before the other type-casting function, like as.integer(as.character(data[,4])).
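Both the pitfall and the two fixes can be sketched with made-up data. stringsAsFactors = TRUE is passed explicitly here because R 4.0.0 changed the default to FALSE, so newer versions no longer create factors on their own.

```r
# A factor whose labels look like numbers.
f <- factor(c("10", "30", "20", "30"))

# Direct conversion returns the level codes, not the labels.
wrong <- as.integer(f)                 # 1 3 2 3

# Going through as.character() recovers the values.
right <- as.integer(as.character(f))   # 10 30 20 30

# The same table read with and without the factor conversion.
txt <- "a 10\nb 30\nc 20"
d_fac <- read.table(text = txt, stringsAsFactors = TRUE)
d_asis <- read.table(text = txt, as.is = 1:2)

print(class(d_fac[, 1]))    # "factor"
print(class(d_asis[, 1]))   # "character"
```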
15 2008-10-05 debug
several ways to debug in R: http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/
a little about the built-in debug() function. It must be used in interactive mode, not Rscript mode. Use debug(fun) to mark a function for debugging. The browser then accepts these commands:
- <RET>: Go to the next statement if the function is being debugged. Continue execution if the browser was invoked.
- c or cont: Continue execution without single stepping.
- n: Execute the next statement in the function. This works from the browser as well.
- where: Show the call stack.
- Q: Halt execution and jump to the top-level immediately.
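Setting and clearing the debug flag can be sketched as below. The function scale_to_unit is made up for the example. The flag is cleared again before the function is called, so the snippet also runs non-interactively; in an interactive session, call the function while it is flagged to enter the browser.

```r
# Made-up function to experiment with.
scale_to_unit <- function(x) {
  rng <- range(x)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Mark the function: the next call would open the browser.
debug(scale_to_unit)
flagged <- isdebugged(scale_to_unit)

# Remove the mark again.
undebug(scale_to_unit)
cleared <- !isdebugged(scale_to_unit)

# Runs normally now that the flag is off.
result <- scale_to_unit(c(2, 4, 6))
print(result)
```

debugonce(fun) is a variant that clears the flag by itself after the first call.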