% Plotting Spilled... Ink?
% Michael Stone
% February 25, 2012

Ever since I bought [Hadley Wickham]'s lovely book ["ggplot2: Elegant Graphics for Data Analysis (Use R!)"][ggplot2-book] a few weeks back, I've been meaning to write up a simple end-to-end example of data collection and plotting using [ggplot2].

Thus, without further delay, let's try to make a pretty picture of the rate at which I've been writing here (and thus, of the rate at which my rather naive [site search][site-search] implementation's dataset is growing).

Here's what we'll do:

1.  Install [R] and [ggplot2]:

    ~~~~ { .bash }
    sudo aptitude install r-base r-cran-ggplot2
    ~~~~

2.  Collect the data:

    ~~~~ { .bash }
    echo date bytes post > data.txt
    (for f in $(find posts -name 'index.txt'); do
       DATE=$(cat $f | head -n 3 | tail -n 1 | sed -e 's/^% //');
       echo $(date -d "$DATE" +%s) \
            $(stat -c '%s' $f) \
            $(echo $f | sed -e 's,posts/,,' -e 's,/index.txt,,');
     done) | sort -n -k1 >> data.txt
    ~~~~

3.  Sanity-check the resulting data:

        $ head data.txt
        date bytes post
        1232600400 2038 joy_of_tex
        1234674000 3947 irrefutability
        1275796800 2076 openkey
        1300248000 5958 afd_discussions
        1300593600 1358 safe_phones
        1301371200 1126 convergence
        1302235200 1404 secrets
        1302408000 2916 comment_systems
        1307160000 833 scheduling

4.  Make the plot:

        $ R

    ~~~~ { .R }
    library("ggplot2")                                      # load ggplot2
    df <- read.table("data.txt", header = TRUE)             # load the data
    ndf <- df[order(df$date),]                              # sort the data
    ndf$date2 <- as.POSIXct(ndf$date, origin="1970-01-01")  # convert timestamps to dates
    ndf$total_bytes <- cumsum(ndf$bytes)                    # count total bytes over time

    svg(filename="data.svg", width=6, height=4)             # make the plot
    qplot(x = date2, y = total_bytes, data=ndf, xlab="date", ylab="total bytes")
    dev.off()
    ~~~~

5.  Enjoy:

    ![](data.svg)

(P.S. - Care to guess when I joined [Iron Blogger]?
:-)

[Hadley Wickham]: http://had.co.nz/
[R]: http://www.r-project.org
[ggplot2]: http://had.co.nz/ggplot2/
[ggplot2-book]: http://www.amazon.com/gp/product/0387981403?ie=UTF8&tag=hadlwick-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=0387981403
[site-search]: http://mstone.info/posts/site_search/
[Iron Blogger]: http://iron-blogger.mako.cc/
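
As a sketch, the collection loop in step 2 could also be wrapped in a small function that quotes its paths and reads the date line with a single `sed` call. The name `collect_posts` is hypothetical (it is not part of the original recipe), and the code assumes GNU `date` and `stat`, the same `posts/<name>/index.txt` layout, and a date on the third line of each post's title block.

```shell
# Hypothetical helper: print "date bytes post" rows for every
# ROOT/<post>/index.txt, sorted by timestamp.
# Assumes GNU date (-d) and GNU stat (-c), as in step 2.
collect_posts() (
    root=$1
    echo "date bytes post"
    find "$root" -name 'index.txt' | while IFS= read -r f; do
        # Line 3 of the title block holds the date, e.g. "% February 25, 2012".
        date_line=$(sed -n '3s/^% //p' "$f")
        post=${f#"$root"/}        # strip the leading "posts/"
        post=${post%/index.txt}   # strip the trailing "/index.txt"
        printf '%s %s %s\n' \
            "$(date -d "$date_line" +%s)" \
            "$(stat -c '%s' "$f")" \
            "$post"
    done | sort -n -k1
)
```

Calling `collect_posts posts > data.txt` should then produce the same kind of file that step 3 inspects. Post names containing spaces would still confuse `read.table`, so this only tidies the loop; it does not change the data format.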