summary(data)
= returns min, 1st quartile, median, mean, 3rd quartile, max
boxplot(data$v1, col = "blue")
= produces a box plot with the middle 50% of the data drawn as a box in the specified colour. Graphical counterpart of summary().
hist(data$v1, col = "green")
= produces a histogram with the specified breaks and colour
breaks = 100
= suggested number of bars. The higher the number, the narrower the histogram columns. Too many = noisy/rough; too few = the shape of the distribution is hidden.
rug(data$v1)
= adds a strip under the histogram marking the location of each data point, which shows where the data are dense
barplot(table(data), col = "wheat", main = "Title")
= produces a bar graph, usually for categorical data
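The one-dimensional summaries above can be tried on any numeric vector. The sketch below is only an illustration: it assumes the built-in airquality data set in place of the notes' `data`, and the value `breaks = 20` is an arbitrary choice.

```r
# Assumption: airquality$Wind (built in) stands in for data$v1
x <- airquality$Wind

five <- summary(x)                  # min, quartiles, median, mean, max
print(five)

boxplot(x, col = "blue")            # box covers the middle 50% of the data
h <- hist(x, col = "green", breaks = 20)  # breaks is only a suggestion to R
rug(x)                              # one tick per observation under the bars

# bar plot of a categorical variable: observations per month
counts <- table(airquality$Month)
barplot(counts, col = "wheat", main = "Observations per month")
```

Changing `breaks` and re-running the `hist()` call is a quick way to see the trade-off between a rough and an over-smoothed histogram.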
abline(h = 12) or abline(v = 12)
= overlays a horizontal line (h, e.g. on a boxplot) or a vertical line (v, e.g. on a histogram) at the specified location
col = "red"
= specifies the line color
lwd = 4
= line width
lty = 2
= line type
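boxplot
= to see how much of the data falls below a specified value: boxplot(data); abline(h = 12). If the line lies above the box, the 3rd quartile is below 12, so at least 75% of the observations fall below 12 (in this example, at least 75% of islands have fewer than 12 languages).
histogram
= abline(v = median(data), col = "blue", lwd = 4) displays the median as a vertical line. Note that a boxplot already contains the median line as a feature; a histogram does not.
boxplot(pm25 ~ region, data = pollution, col = "red")
= draws one boxplot of pm25 per region, side by side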
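A minimal sketch of the reference lines described above, assuming the built-in airquality data in place of the notes' data sets. The cutoff 12 is kept from the notes but has no particular meaning for wind speed, and `share_below` and `q3` are illustrative names.

```r
# Assumption: airquality$Wind replaces the notes' data; 12 is an arbitrary cutoff
x <- airquality$Wind

boxplot(x, col = "blue")
abline(h = 12, col = "red", lwd = 4, lty = 2)  # horizontal line on a boxplot

hist(x, col = "green")
abline(v = median(x), col = "blue", lwd = 4)   # vertical line at the median

# numeric check of what the boxplot shows visually
q3 <- unname(quantile(x, 0.75))   # top edge of the box
share_below <- mean(x < 12)       # fraction of observations below the cutoff

# formula interface: one box per level of the grouping variable
bp <- boxplot(Wind ~ Month, data = airquality, col = "red")
```

Comparing `q3` with the cutoff is the numeric version of checking whether the line sits above the box.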
par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
= sets a layout of 2 rows and 1 column, and the margins
hist(subset(pollution, region == "east")$pm25, col = "green")
= first histogram
hist(subset(pollution, region == "west")$pm25, col = "green")
= second histogram
with(pollution, plot(latitude, pm25))
= scatterplot of pollution going south to north (latitude). Plots two variables.
abline(h = 12, lwd = 2, lty = 2)
= plots a horizontal dashed line
with(pollution, plot(latitude, pm25, col = region))
= same as the first scatterplot, but the data points are coloured by region. Red = west, black = east. Plots three variables.
par(mfrow = c(1, 2), mar = c(5, 4, 2, 1))
= sets a layout of 1 row and 2 columns, and the margins
with(subset(pollution, region == "west"), plot(latitude, pm25, main = "West"))
= left scatterplot
with(subset(pollution, region == "east"), plot(latitude, pm25, main = "East"))
= right scatterplot
* both scatterplots (single and multiple) show that pollution is higher in mid-latitudes than in low/high latitudes, for both eastern and western regions.
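The pollution data set is not part of base R, so the following sketch reproduces the same multi-panel pattern with the built-in airquality data. The grouping column `period` and its levels "may" and "later" are hypothetical stand-ins for region, east and west.

```r
# Assumption: airquality replaces pollution; `period` is a made-up grouping factor
aq <- transform(airquality,
                period = factor(ifelse(Month == 5, "may", "later")))

# two stacked histograms; par() returns the old settings so they can be restored
old_par <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
hist(subset(aq, period == "may")$Wind, col = "green", main = "May")
hist(subset(aq, period == "later")$Wind, col = "green", main = "Later months")
par(old_par)

# a factor passed to col is used through its integer codes:
# level 1 ("later") is drawn in black, level 2 ("may") in red
with(aq, plot(Temp, Wind, col = period))
abline(h = 12, lwd = 2, lty = 2)   # dashed reference line (arbitrary level)
```

Restoring the saved settings keeps the 2 x 1 layout from leaking into later plots.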
with(pollution, plot(latitude, pm25, col = region))
abline(h = 12, lwd = 2, lty = 2)
= plots a horizontal dashed line showing the standard air quality level
plot(jitter(child, 4) ~ parent, galton)
= spreads out data points that share the same position, to simulate measurement error and make high-frequency values more visible
plot(x, y) or hist(x)
= launches a graphics device and draws the plot on that device
?par
= opens the documentation for the graphics parameters
airquality <- transform(airquality, Month = factor(Month))
pch
: plotting symbol (default = open circle)
lty
: line type (default is solid): 0 = blank, 1 = solid (default), 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash
lwd
: line width (positive number, default = 1)
col
: plotting color (number, string, or hex code; colors() returns a vector of color names)
xlab, ylab
: x- and y-axis label character strings
cex
: numerical value giving the amount by which plotting text/symbols should be magnified relative to the default
cex = 0.15 * variable
: symbol size scaled by a variable, so that size displays an additional variable
par()
function = specifies global graphics parameters, affects all plots in an R session (can be overridden)
las
: orientation of axis labels
bg
: background color
mar
: margin size (order = bottom, left, top, right)
oma
: outer margin size (default = 0 for all sides)
mfrow
: number of rows and columns of plots, c(rows, columns) (plots are filled row-wise)
mfcol
: number of rows and columns of plots, c(rows, columns) (plots are filled column-wise)
par("parameter")
: returns the current value of the specified parameter
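To make the difference between per-call and global parameters concrete, here is a short sketch using the built-in airquality data. The colours, sizes and the reference level 60 are arbitrary illustrative choices.

```r
# global settings via par(); the previous values are returned and saved
old_par <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1), las = 1)

# per-call parameters: symbol, colour, size and axis labels
with(airquality, plot(Wind, Temp, pch = 19, col = "steelblue", cex = 0.8,
                      xlab = "Wind (mph)", ylab = "Temperature (F)"))

# cex scaled by a variable: symbol size shows a third variable (Wind)
with(airquality, plot(Temp, Ozone, pch = 1, cex = 0.15 * Wind))
abline(h = 60, lty = 2, lwd = 2)   # dashed line at an arbitrary level

current_mfrow <- par("mfrow")      # query a single parameter
par(old_par)                       # restore the previous global settings
```

Saving the return value of `par()` and passing it back afterwards is a common way to keep global changes from affecting later plots.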
plot
= make a scatterplot, or another type of plot depending on the class of the data
lines
= add lines to an existing plot, e.g. connecting the dots in a time-series plot
points
= add points to a plot, e.g. a different group or subset added afterwards
text
= add text labels within the plot at the specified x, y coordinates
title
= add text labels outside the plot (x/y-axis labels, title, subtitle, outer margin)
mtext
= add text to the inner/outer margin of the plot
axis
= add axis ticks/labels
library(datasets)
# type = "n" sets up the plot and does not fill it with data
with(airquality, plot(Wind, Ozone, main = "Ozone and Wind in New York City", type = "n"))
# subsets of data are plotted here using different colors
with(subset(airquality, Month == 5), points(Wind, Ozone, col = "blue"))
with(subset(airquality, Month != 5), points(Wind, Ozone, col = "red"))
legend("topright", pch = 1, col = c("blue", "red"), legend = c("May", "Other Months"))
# regression line is produced here
model <- lm(Ozone ~ Wind, airquality)
abline(model, lwd = 2)
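The example above uses points, legend and abline. The remaining annotation functions from the list (lines, text, title, mtext, axis) can be sketched in the same way. The monthly means computed here are only toy data derived from airquality, and the label text is made up.

```r
# Assumption: mean temperature per month from airquality serves as toy data
temp_by_month <- tapply(airquality$Temp, airquality$Month, mean)
months <- as.integer(names(temp_by_month))

# empty plot without axes or labels; everything is added step by step
plot(months, temp_by_month, type = "n", axes = FALSE, xlab = "", ylab = "")
lines(months, temp_by_month, lwd = 2)                  # connect the values
points(months, temp_by_month, pch = 19, col = "red")   # add the points
text(months[which.max(temp_by_month)], max(temp_by_month),
     labels = "warmest", pos = 1)                      # label inside the plot
axis(1, at = months, labels = month.abb[months])       # custom x-axis ticks
axis(2, las = 1)                                       # default y-axis ticks
box()                                                  # frame around the plot
title(main = "Mean temperature by month", xlab = "Month", ylab = "Temp (F)")
mtext("airquality data set", side = 3, line = 0.3, cex = 0.8)  # margin text
```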
lattice Plotting System
library(lattice)
= loads the lattice system
lattice and grid packages
lattice package = contains code for producing Trellis graphics (independent from the base graphics system)
grid package = implements the graphing system; lattice builds on top of grid
base plot
panel functions can be specified/customized to modify the subplots
trellis.par.set()
= can be used to set global graphics parameters for all trellis objects
lattice Functions and Parameters
xyplot()
= main function for creating scatterplots
bwplot()
= box-and-whisker plots (box plots)
histogram()
= histograms
stripplot()
= like a box plot, but showing the actual points
dotplot()
= plot dots on "violin strings"
splom()
= scatterplot matrix (like pairs() in the base plotting system)
levelplot() / contourplot()
= plotting image data
xyplot(y ~ x | f * g, data, layout, panel)
~
= formula notation: the left-hand side is the y-axis variable, and the right-hand side is the x-axis variable
f / g
= conditioning/categorical variables (optional)
*
= indicates the interaction between the two variables f and g
data
= the data frame/list from which the variables should be looked up
layout
= specifies how the different plots will appear
layout = c(5, 1)
= produces 5 subplots in a horizontal row
panel
= function that can be added to control what is plotted inside each panel of the plot
panel functions receive the x/y coordinates of the data points in their panel (along with any additional arguments)
?panel.xyplot
= brings up documentation for the panel functions
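The formula, layout and panel arguments can be combined as in the sketch below. It assumes the built-in airquality data with Month converted to a factor as the conditioning variable. The panel helpers panel.abline and panel.lmline come from lattice, but the choice of a median line and a regression line is only an example.

```r
library(lattice)

# Assumption: Month as a factor serves as the conditioning variable f
aq <- transform(airquality, Month = factor(Month))

p <- xyplot(Ozone ~ Wind | Month, data = aq, layout = c(5, 1),
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)        # default scatterplot of the panel
              panel.abline(h = median(y, na.rm = TRUE), lty = 2)  # panel median
              panel.lmline(x, y, col = 2)    # regression line for this panel
            })

class(p)   # lattice functions return an object rather than drawing directly
print(p)   # printing the object draws the five panels in one row
```

Because each panel function call only sees the x/y values of its own panel, the median and the regression line are computed separately for every month.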