Downloading R
R's official web site is:
http://www.r-project.org/
You can download the latest version directly from a CRAN mirror, for example:
http://cran.cnr.berkeley.edu/
Histograms from the command line
Check out histogram-R-plot.pl. This program takes a plain list of numbers in a flat file and passes them to R to produce a PDF. Run it with --help to see some example usages.
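The script itself is not shown on this page. As a rough sketch of the underlying idea, and not the script's actual implementation, the following R code reads a flat list of numbers and writes a histogram to a PDF. The file names are placeholders, and a temporary file filled with random numbers stands in for real data.

```r
# Hypothetical input: a flat file with one number per line.
# A temporary file stands in for your real data file here.
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".pdf")
writeLines(as.character(rnorm(200)), infile)

# Read the numbers from the flat file
values <- scan(infile, quiet = TRUE)

# Open a PDF device, draw the histogram, then close the device
pdf(outfile)
hist(values, main = "Histogram", xlab = "Value")
invisible(dev.off())
```

Saved as a file, code like this can be run non-interactively with Rscript.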
Multiple histograms in the same figure
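First off, this relies on transparent colors, which not every graphics device can draw. It works with the PDF device (on any system) and the Quartz device (the Mac OS X on-screen display), so be sure you're using a device that supports transparency or the overlapping bars won't show up.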
There are three key parts to this:
- In every hist() command after the first, include the argument add=TRUE. Make sure the first hist() command does NOT include add, or R will have no existing plot to add to.
- Use transparent colors, such as col="#FF000088", which is a roughly half-transparent red. These colors are similar to HTML colors, but the last two digits (in this case, the 88) are the opacity. Use FF for fully opaque colors, 00 for fully transparent.
- Use common breakpoints for all the histograms. This is slightly trickier, as you first have to compute the breakpoints. Then pass them into every hist() command using the breaks= argument.
Here's an example:
# First I'll create some data to plot
x <- rnorm(300)
y <- rnorm(300, mean=5, sd=5)
# Now we'll compute the common breakpoints by doing a histogram of all the data together (without plotting it)
b <- hist(c(x,y), plot=FALSE)
# Plot the data from x with the computed breakpoints and a transparent color
hist(x, breaks=b$breaks, col="#FF000088")
# Plot the data from y, with the computed breakpoints, a transparent color, and add it to the existing plot
hist(y, breaks=b$breaks, col="#00FF0088", add=TRUE)
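The transparent color strings can also be generated instead of typed by hand. The sketch below uses base R's rgb() with an alpha value and draws to a PDF file, since that device supports transparency. The variable names and the temporary output file are only illustrative.

```r
# Build "#RRGGBBAA" strings from red, green, blue and alpha (opacity) values
red.half   <- rgb(255, 0, 0, 136, maxColorValue = 255)  # same as "#FF000088"
green.half <- rgb(0, 255, 0, 136, maxColorValue = 255)  # same as "#00FF0088"

# Same kind of data as in the example above
x <- rnorm(300)
y <- rnorm(300, mean = 5, sd = 5)

# Common breakpoints computed from all the data, without plotting
b <- hist(c(x, y), plot = FALSE)

# Draw both histograms into one PDF page
outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
hist(x, breaks = b$breaks, col = red.half)
hist(y, breaks = b$breaks, col = green.half, add = TRUE)
invisible(dev.off())
```

Changing the fourth argument of rgb() adjusts the opacity: 255 is fully opaque and 0 is fully transparent.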
Filled and overlapping densities
Similar to the histograms, but the trick here is that instead of using plot(density()), we use polygon() in order to set the fill.
# First I'll create some data to plot
x <- rnorm(300)
y <- rnorm(300, mean=5, sd=5)
# calculate the densities
x.d <- density(x)
y.d <- density(y)
# Next we'll set up an empty plot area large enough for both curves. Set axis labels and titles here
plot(NA, xlim=range(c(x.d$x, y.d$x)), ylim=range(c(x.d$y, y.d$y)), xlab="Value", ylab="Density")
# Plot the densities
polygon(x.d, col="#FF000088")
polygon(y.d, col="#0000FF88")
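With two filled curves on one plot, a legend helps tell them apart. The following sketch repeats the steps above, writes the result to a PDF, and adds a legend() call. The labels, title, and temporary output file are placeholders, not part of the original example.

```r
# Data and densities, as in the example above
x <- rnorm(300)
y <- rnorm(300, mean = 5, sd = 5)
x.d <- density(x)
y.d <- density(y)

# Half-transparent red and blue fills
cols <- c("#FF000088", "#0000FF88")

outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
# Empty plot area that fits both curves
plot(NA, xlim = range(c(x.d$x, y.d$x)), ylim = range(c(x.d$y, y.d$y)),
     xlab = "Value", ylab = "Density", main = "Filled densities")
# Filled density curves
polygon(x.d, col = cols[1])
polygon(y.d, col = cols[2])
# Legend with one filled box per curve
legend("topright", legend = c("x", "y"), fill = cols)
invisible(dev.off())
```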
Quick Hints and Commands
Important commands:
summary(ARRAY)
stem(ARRAY)
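As a quick illustration of these two commands, here is a small sketch in which a made-up numeric vector stands in for ARRAY.

```r
# Sample data standing in for ARRAY (made-up values)
values <- c(2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 6.8, 7.3)

# summary() reports the minimum, quartiles, median, mean and maximum
s <- summary(values)
print(s)

# stem() prints a text-based stem-and-leaf plot to the console
stem(values)
```

Both commands print plain text, so they work in any R session without a graphics device.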
How to have multi-color text strings
Documentation and Examples
Here is a page that explains how to do clustering and some fancy histogram stuff with R:
More should go up here later, especially "canned" commands for running R from the command line.