R syntax

1   rpy

Use an R operator from Python through rpy. For example, to get a boolean list (0 or 1) by testing each value of a list:

ls = [1,2,1,2,1,1,1,2]
from rpy import r
r["=="](ls, 2)
  • A . in an R function name is translated to _ in Python, e.g. t.test is t_test in Python. It is not known what a _ in an R name corresponds to in Python, so avoid _ in R function names.
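The r["=="] lookup works because operators in R are ordinary functions that can be fetched by name. A minimal R-side sketch of the same comparison (the names vals, eq and res are only for illustration):

```r
# Operators are functions in R, so "==" can be looked up and called by name.
vals <- c(1, 2, 1, 2, 1, 1, 1, 2)
eq <- get("==")          # fetch the operator as a function
res <- eq(vals, 2)       # same as vals == 2
print(as.integer(res))   # 0/1 form of the logical vector
```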
2007-12-05 disable the summary output when rpy is loaded (redirect stdout before import)::
import sys
sys.stdout = open('/dev/null', 'w')
from rpy import r
sys.stdout = sys.__stdout__
"Execution halted" encountered in submitted jobs:

If you run it interactively from the console, you probably won't see this error. What happens is this: rpy (or R) wants to be interactive (i.e. print something to report an error, progress, ...). When it finds it can't be interactive (no active terminal), it says "Execution halted". Someone found this snippet in main/errors.c of R 1.5.1:

if ( !R_Interactive && !haveHandler && inError ) {
    REprintf("Execution halted\n");
    R_CleanUp(SA_NOSAVE, 1, 0); /* quit, no save, no .Last, status=1 */
}

In the case of randomForest_fit() in rpart_prediction.py, randomForest is installed in my home directory on hpc-cmb. So 'r.library("randomForest")' fails and 'r.library("randomForest", lib_loc=os.path.join(lib_path, "R"))' is needed. Wrapping the first call in Python's 'try ... except ...' and falling back to the second works in an interactive environment, but fails in submitted jobs, presumably because the failed first call triggers "Execution halted" before the except branch can run.

2   batch run an R program(2006-03-16)

  1. bash, from standard input:

    R --vanilla <job.R
    
  2. put in a shell script file, as a here-document:

    R --vanilla << EOF
    data = read.table("filename")
    ...
    EOF
    
  3. bash, CMD BATCH:

    R CMD BATCH job.R
    
  4. standalone R program with capability to pass arguments:

    #2008-05-17 $* stands for all the arguments on the shell commandline after $0
    R --vanilla --args $0 $* <<EOF
    command_args = commandArgs()        #command_args starts with ["/usr/lib/R/bin/exec/R", "--vanilla", "--args"]
    print(command_args)
    
    #EOF below is optional
    EOF
    
  5. use Rscript, similar to perl, python:

    #!/usr/bin/env Rscript
    
    command_args = commandArgs()        #command_args starts with ["/usr/lib/R/bin/exec/R", "--slave", "--no-restore", "--file=./this_R_script.R", "--args"]
    print(command_args)
    

3   2006-03-17 install randomForest on hpc-cmb, how to build an R package

(not really install it to system directory), how to load it into R

download from http://cran.at.r-project.org/src/contrib/Descriptions/randomForest.html

uncompress
'tar -zxvf randomForest_4.5-16.tar.gz', the new source directory is randomForest.
build
'R CMD build --binary randomForest' (Don't go into directory randomForest.) A file randomForest_4.5-16_R_i386-(platform).tar.gz is created. (Later R releases dropped --binary; 'R CMD INSTALL --build randomForest' is the replacement.)
install
unpack the binary tarball (delete or move the source directory first if unpacking happens in the same directory).
load package

This is for a package not in system directory(i.e. '/usr/lib/R/library'). In R console, two ways:

- library(randomForest, lib.loc='/home/rcf-14/yuhuang/lib64/R')

- .libPaths('/home/rcf-14/yuhuang/lib64/R') ; library(randomForest)
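A minimal sketch of trying the default library paths first and falling back to a personal library, assuming nothing beyond base R. The helper name load_with_fallback and its arguments are hypothetical, not part of any package:

```r
# Try the default library paths first, then a personal library directory.
# Returns TRUE if the package could be attached, FALSE otherwise.
load_with_fallback <- function(pkg, extra_lib) {
  ok <- suppressWarnings(
    require(pkg, character.only = TRUE, quietly = TRUE))
  if (!ok) {
    ok <- suppressWarnings(
      require(pkg, lib.loc = extra_lib, character.only = TRUE, quietly = TRUE))
  }
  ok
}

# "stats" ships with R, so the first attempt already succeeds here.
print(load_with_fallback("stats", tempdir()))
```

For the case above the call would be load_with_fallback("randomForest", '/home/rcf-14/yuhuang/lib64/R'). require() is used instead of library() because it returns FALSE rather than raising an error when the package is not found.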

4   plot graphs in multiple pages(2006-08-27)

  • check trellis.user.pdf

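use the postscript device, where each plot() starts a new page. The command 'par(ask=TRUE)' only matters for interactive screen devices, where it waits for confirmation before each new page: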

par(ask=TRUE)
postscript('/tmp/plot.ps')
plot(z)
plot(rnorm(50))
dev.off()

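trellis.device() appears not to work within a function; it gives null graphic output (2006-10-19). A likely cause: lattice plots are drawn only when print()ed, which happens automatically at top level but not inside a function (R FAQ 7.22). Example calls: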

trellis.device(postscript, color=T, file='~/script/test/math650/figures/math650_hw8_fig2.eps')
draw_data_no_intr(reg, data2)
dev.off()
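A sketch of drawing a lattice plot from inside a function, assuming the empty output comes from the missing print() described in R FAQ 7.22. The function name draw_to_file and the use of the built-in cars data are illustrative only; the lattice package is assumed to be installed, and the function does nothing if it is not:

```r
# Inside a function a lattice plot object must be print()ed explicitly,
# otherwise nothing is drawn on the device.
draw_to_file <- function(fname) {
  if (!requireNamespace("lattice", quietly = TRUE)) return(invisible(FALSE))
  lattice::trellis.device(postscript, color = TRUE, file = fname)
  p <- lattice::xyplot(dist ~ speed, data = cars)  # builds the plot object only
  print(p)                                         # this is what actually draws it
  dev.off()
  invisible(TRUE)
}

out_eps <- tempfile(fileext = ".eps")
drawn <- draw_to_file(out_eps)
```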
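  • 2008-05-17 using the pdf device to plot graphs in multiple pages. Start with a pdf(...); every plot(...) following it starts a new page. Close the file with dev.off():

    pdf(output_fname)
    for (i in seq(start_array_id, end_array_id))
    {
         # type must be passed by name; a bare "l" is matched to a label argument of plot.density()
         plot(density(m.norm), type="l", xlim=c(1, 10), ylim=c(0, 1), main=paste("array id = ", i))
         lines(density(mprobe.mean), col="red")
    }
    dev.off()        #flush and close the pdf file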
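A self-contained version of the multi-page pdf loop, using simulated data in place of m.norm and mprobe.mean (which are not defined in these notes). The output path is a temporary file:

```r
# One page per iteration: each plot() call on a pdf device starts a new page.
out_pdf <- tempfile(fileext = ".pdf")
set.seed(1)
pdf(out_pdf)
for (i in 1:3) {
  x <- rnorm(100, mean = i)                    # stand-in for real array data
  plot(density(x), type = "l", main = paste("array id =", i))
  lines(density(x + 0.5), col = "red")         # overlay goes on the same page
}
dev.off()                                      # without this the file is incomplete
```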
    

5   differences from Splus (S-plus)(2006-10-26)

cor(x,y,na.method="available"):
unused argument(s) (na.method ...). So na.method is not available in R's cor. R handles missing values through the use argument instead, e.g. cor(x, y, use="pairwise.complete.obs").
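A small sketch of the R way to handle missing values in cor(), with made-up vectors in which only three pairs are complete:

```r
# cor() returns NA by default when any value is missing;
# use = "complete.obs" drops the incomplete pairs first.
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)
r_default  <- cor(x, y)                        # NA
r_complete <- cor(x, y, use = "complete.obs")  # uses pairs (1,2), (2,4), (5,10)
print(r_default)
print(r_complete)
```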

In S-plus, a variable name containing "_" cannot be assigned with '='. If the name has '_', use '<-', i.e. LOG_AB_RATIO <- log(data1$BRAIN/data1$LIVER); print(LOG_AB_RATIO). Directly typing 'LOG_AB_RATIO' outputs nothing. However, LOGABRATIO = log(data1$BRAIN/data1$LIVER); works.

6   2006-11-26 how to inspect/access the components(values) of an object

  • if the object is generated by some function, say glm, then run '?glm' and check the Value section. Access the components via object$value_name.

  • Or the function attributes(object) tells you all the attributes; the component names are under the names attribute:

    names(attributes(object)) #tell you all attributes' names of the object
    attributes(object) #tell you not only the names but also the contents of each attribute
    attributes(object)$attr_name1 #to access a particular attribute of object
    attr(object, 'attr_name1') #to access a particular attribute of object
    object@attr_name1 #to access a slot of an S4 object
    
  • class(object) tells you the type of the object

  • 2008-10-03 str() is handy for inspecting a data structure. It's like summary(), but shows the structure compactly.
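The inspection functions above, tried on a fitted model. This sketch uses lm on the built-in cars data rather than glm, purely to keep it short:

```r
# Inspect the components of a fitted model object.
fit <- lm(dist ~ speed, data = cars)
comp_names <- names(fit)           # components reachable via fit$...
print(comp_names)
print(fit$coefficients)            # access one component by name
print(class(fit))                  # "lm"
print(names(attributes(fit)))      # attribute names, here names and class
print(attr(fit, "class"))          # same information as class(fit)
str(fit, max.level = 1)            # compact one-level overview
```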

7   2007-02-21 density estimation

ecdf() Empirical Cumulative Distribution Function

density() Kernel Density Estimation
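A brief sketch of both functions on simulated data (the sample and the seed are arbitrary):

```r
# ecdf() returns a step function; density() returns a grid of x/y values.
set.seed(42)
x <- rnorm(200)
Fn <- ecdf(x)
print(Fn(0))                 # fraction of the sample that is <= 0
d <- density(x)              # Gaussian kernel by default
print(length(d$x))           # number of grid points (default n = 512)
plot(d, main = "kernel density of x")
```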

8   2007-08-20 how to access command line arguments

commandArgs() provides access to the command line arguments. Example:

for (e in  commandArgs())
{
  cat(e, "\n")
}
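To skip the interpreter's own entries (the R binary path, --vanilla, --args and so on), commandArgs() accepts trailingOnly=TRUE, which returns only what follows --args. A minimal sketch:

```r
# Only the user-supplied arguments, i.e. everything after --args.
user_args <- commandArgs(trailingOnly = TRUE)
cat("got", length(user_args), "user argument(s)\n")
for (e in user_args) {
  cat(e, "\n")
}
```

Run as a script with Rscript, anything typed after the script name ends up in user_args.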

9   2007-09-03 how to get familiar with a package (i.e. package = affy)

load the library and open vignettes:

library(affy)
openVignette()

You might need to set the pdfviewer option if the default viewer is not available on your system: options(pdfviewer="/usr/bin/evince")

10   2007-09-03 install bioconductor

follow http://www.bioconductor.org/download/:

source("http://bioconductor.org/biocLite.R")

biocLite()

Pass lib='/usr/local/lib/R/site-library' to biocLite() to specify the library path. The default is already '/usr/local/lib/R/site-library'.

10.1   to install a specific package

biocLite("tkWidgets") installs the package tkWidgets. It is handy because it brings up a GUI for some functionality.

10.2   2009-3-5 libraries required

  1. libblas-dev (for preprocessCore)
  2. gfortran (for preprocessCore)

11   2007-09-10 install R 2.5.1 in ubuntu 7.04 feisty fawn

http://cran.r-project.org/bin/linux/debian/README.html for basic setup and solving errors as 'NO_PUBKEY'

help is from https://stat.ethz.ch/pipermail/r-sig-debian/2007-July/000231.html

use the source packages from below (add the line below to /etc/apt/sources.list):

deb-src http://<favorite-cran-mirror>/bin/linux/ubuntu feisty/

sudo apt-get build-dep r-base

sudo apt-get source -b r-base

install the debs in that directory (you need to run 'sudo apt-get -f install' or dselect to fix some dependency problems first; lots of cran packages will be installed).

script to rebuild any package that no longer runs after R is upgraded to 2.5.1:

#!/bin/bash
# Script to automate building of r-cran-* packages for Debian.
# Author: Johannes Ranke <jranke@uni-bremen.de> and
#         Vincent Goulet <vincent.goulet@act.ulaval.ca>

#DEBEMAIL=jranke@uni-bremen.de
#DEBFULLNAME="Johannes Ranke"
text="Recompiled on etch for CRAN"
#for i in codetools; do
for i in boot cluster codetools foreign kernsmooth lattice mgcv nlme rcompgen rpart survival vr; do
        cd $i
        rm -rf $i*
        rm *.deb
        apt-get source -t unstable $i
        cd $i-*
        version=`dpkg-parsechangelog | grep ^Version | cut -f2 -d " "`~etchcran.1
        dch -b -v $version -D etch-cran $text
        fakeroot dpkg-buildpackage -B
        cd ../..
 done

specifically for python-rpy (_rpy2501.so is missing). The version 1.0~rc1-5feistycran.1 sorts higher than 1.0~rc1-5:

apt-get source python-rpy
cd rpy-1.0~rc1/
text="Recompiled on feisty for CRAN"
dch -b -v 1.0~rc1-5feistycran.1 -D feisty-cran $text
sudo apt-get install python-all-dev
fakeroot dpkg-buildpackage -B

12   2008-02-15 get running time in R

proc.time() gives you the user, sys, elapsed time for this whole R session.

Sys.time() tells you the current system time. It returns a value you can do arithmetic on:

tm = Sys.time()
...
Sys.time()-tm
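A runnable version of the timing pattern. The workload is a placeholder, and system.time() is shown as an alternative that times a single expression:

```r
# Time a block with Sys.time(); the difference is a difftime object.
tm <- Sys.time()
invisible(sum(sqrt(1:1e6)))                  # placeholder workload
elapsed <- Sys.time() - tm
secs <- as.numeric(elapsed, units = "secs")  # plain number of seconds
cat("elapsed:", secs, "seconds\n")

# system.time() reports user, system and elapsed time for one expression.
st <- system.time(sum(sqrt(1:1e6)))
print(st)
```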

13   2008-05-17 string operation

  • paste returns a string. cat writes the string to stdout or file and returns NULL:

    fname = paste('/tmp/other_output/164_mprobe_mean.rda','fda', sep="")
    cat(fname, '\n')
    no_of_chars = nchar(fname)
    cat(substr(fname, 2, 4))
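The same functions on a shorter, made-up path, so the results are easy to check by eye:

```r
# paste() builds and returns a string; cat() only prints and returns NULL.
fname <- paste("/tmp/out", ".rda", sep = "")
n <- nchar(fname)              # 12
mid <- substr(fname, 2, 4)     # "tmp", positions are 1-based and inclusive
ret <- cat(fname, "\n")        # prints the string; ret is NULL
print(n)
print(mid)
```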
    

14   2008-07-31 type conversion

read.table() usually converts data into an appropriate type automatically. The notorious fallback of converting every non-numeric column to the factor data type is a bug generator. Afterwards, applying type-casting functions such as as.numeric() or as.integer() directly to a factor returns the internal level codes, not the original values.

Two alternative ways to resolve it.

  1. pass as.is=c(1:4) to read.table to disable the fallback conversion. 1:4 means columns 1 to 4. Then a simple as.integer(data[,4]) works.
  2. wedge as.character() in before the other type-casting function, like as.integer(as.character(data[,4])).
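A small demonstration of the pitfall and of both workarounds, with made-up values:

```r
# Casting a factor directly yields level codes, not the printed values.
f <- factor(c("10", "20", "10", "30"))
codes  <- as.integer(f)                 # 1 2 1 3 (positions in levels(f))
values <- as.integer(as.character(f))   # 10 20 10 30
print(codes)
print(values)

# as.is = TRUE keeps non-numeric columns as character instead of factor.
tab <- read.table(text = "a 10\nb 20", as.is = TRUE)
print(sapply(tab, class))
```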

15   2008-10-05 debug

several ways to debug in R: http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/

A short explanation of the built-in debug() function. It needs an interactive session, not Rscript mode. Use debug(fun) to mark a function for debugging; the next call to fun enters the browser, which accepts these commands:

<RET>
Go to the next statement if the function is being debugged. Continue execution if the browser was invoked.
c or cont
Continue execution without single stepping.
n
Execute the next statement in the function. This works from the browser as well.
where
Show the call stack.
Q
Halt execution and jump to the top-level immediately.
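A minimal sketch of marking and unmarking a function. The function f is made up, and it is deliberately not called while marked, since stepping through it needs an interactive session:

```r
# Mark a function for debugging, check the flag, then clear it again.
f <- function(x) {
  y <- x^2
  y + 1
}
debug(f)
was_debugged <- isdebugged(f)   # TRUE while the function is marked
undebug(f)                      # later calls run normally again
print(was_debugged)
print(f(2))
```

In an interactive session, calling f(2) between debug(f) and undebug(f) opens the browser at the first statement of f.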