4 Objects

4.1 What are Objects?

Objects are, roughly, data (or more generally a stored state) that knows what it can do.

We know what happens when we put this troublesome + guy between numbers:

## [1] 2

But it’s less clear what it means to + letters:

## Error in "a" + "b": non-numeric argument to binary operator

Let’s see what typeof() reports for the values 1 and "a":

## [1] "double"
## [1] "character"

(Note: this is a little misleading. typeof() reports the base type that an R object is stored as, not its class. All R objects are built from base types; we’ll get to the types of objects in the next section.)
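The outputs above can be reproduced with a short sketch along these lines. The chapter's original code is not shown, so this is a reconstruction; the variable name failed is only for illustration.

```r
# Adding numbers works as expected
1 + 1

# Adding character strings raises an error; try() lets the script keep running
failed <- inherits(try("a" + "b", silent = TRUE), "try-error")
failed

# typeof() reports how each value is stored internally
typeof(1)
typeof("a")
```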

R has a useful package, pryr, for inspecting objects and other meta-linguistic needs. Let’s get that now.

4.1.1 Object terminology

A class is the description, or ‘blueprint’, of how individual objects (instances) are made. It defines their attributes (which data should be kept and what it should be named) and their methods (the functions they are capable of calling on their stored data or attributes). Classes can have a nested structure, and sub-classes can inherit the attributes and methods of their parent classes.

For example: As a class, trucks have attributes like engine_size, number_of_wheels, or number_of_jumps_gone_off. Trucks have the method go_faster(), but only individual instances of trucks can go_faster() - the concept/class of trucks can’t. As a subclass, monster_trucks also have the attributes engine_size, etc. and the method go_faster(), but they also have additional attributes like mythical_backstory and methods like monster_jam().

4.2 Objects in R

“In R functions are objects and can be manipulated in much the same way as any other object.” - R Language Definition, section 2.1.5

“S3 objects are functions that call the functions of their objects” - Also R

4.2.1 Object Systems

R has base types and three object-oriented systems (also called types). We’ll spend more time on Base types and S3 objects in this lesson, and return to S4 and reference classes when we start building bigger code.

  • Base types: Low-level C types. Build the other object systems.

  • S3 - “Casual objects”: Objects that use generic functions. S3 methods “belong to” functions, not classes. Generic functions contain a call to UseMethod("function_name", object) (see ?UseMethod).

  • S4 - “Formal objects”: Formal classes with inheritance and means by which methods can be shared between classes. S4 methods still “belong to” functions, but classes are more rigorously defined.

  • Reference classes: Objects that use message passing - that is, the method finally ‘belongs to’ the class rather than to a function.
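To make the S3 idea concrete, here is a minimal sketch of a generic function and two methods. The generic describe() and the class "truck" are hypothetical names invented for this illustration; they are not part of base R.

```r
# A generic function: its only job is to dispatch on the class of x
describe <- function(x, ...) UseMethod("describe")

# Fallback method, used when no class-specific method exists
describe.default <- function(x, ...) "just some object"

# Method for objects whose class attribute is "truck"
describe.truck <- function(x, ...) {
  paste("a truck with", x$number_of_wheels, "wheels")
}

# An S3 object is just a base object with a class attribute
my_truck <- structure(list(number_of_wheels = 4), class = "truck")

describe(my_truck)  # dispatches to describe.truck
describe(42)        # dispatches to describe.default
```

Notice that the methods are ordinary functions named generic.class; nothing about the "truck" class is declared anywhere else.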

The easiest way to see everything about an object is to use the str() function, short for structure. For example, we can see everything about the lamest linear model ever:

## List of 12
##  $ coefficients : Named num [1:2] -3 1
##   ..- attr(*, "names")= chr [1:2] "(Intercept)" "c(4, 5, 6)"
##  $ residuals    : Named num [1:3] -9.06e-17 1.81e-16 -9.06e-17
##   ..- attr(*, "names")= chr [1:3] "1" "2" "3"
##  $ effects      : Named num [1:3] -3.46 -1.41 -2.22e-16
##   ..- attr(*, "names")= chr [1:3] "(Intercept)" "c(4, 5, 6)" ""
##  $ rank         : int 2
##  $ fitted.values: Named num [1:3] 1 2 3
##   ..- attr(*, "names")= chr [1:3] "1" "2" "3"
##  $ assign       : int [1:2] 0 1
##  $ qr           :List of 5
##   ..$ qr   : num [1:3, 1:2] -1.732 0.577 0.577 -8.66 -1.414 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:3] "1" "2" "3"
##   .. .. ..$ : chr [1:2] "(Intercept)" "c(4, 5, 6)"
##   .. ..- attr(*, "assign")= int [1:2] 0 1
##   ..$ qraux: num [1:2] 1.58 1.26
##   ..$ pivot: int [1:2] 1 2
##   ..$ tol  : num 1e-07
##   ..$ rank : int 2
##   ..- attr(*, "class")= chr "qr"
##  $ df.residual  : int 1
##  $ xlevels      : Named list()
##  $ call         : language lm(formula = c(1, 2, 3) ~ c(4, 5, 6))
##  $ terms        :Classes 'terms', 'formula'  language c(1, 2, 3) ~ c(4, 5, 6)
##   .. ..- attr(*, "variables")= language list(c(1, 2, 3), c(4, 5, 6))
##   .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. ..$ : chr [1:2] "c(1, 2, 3)" "c(4, 5, 6)"
##   .. .. .. ..$ : chr "c(4, 5, 6)"
##   .. ..- attr(*, "term.labels")= chr "c(4, 5, 6)"
##   .. ..- attr(*, "order")= int 1
##   .. ..- attr(*, "intercept")= int 1
##   .. ..- attr(*, "response")= int 1
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##   .. ..- attr(*, "predvars")= language list(c(1, 2, 3), c(4, 5, 6))
##   .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. ..- attr(*, "names")= chr [1:2] "c(1, 2, 3)" "c(4, 5, 6)"
##  $ model        :'data.frame':   3 obs. of  2 variables:
##   ..$ c(1, 2, 3): num [1:3] 1 2 3
##   ..$ c(4, 5, 6): num [1:3] 4 5 6
##   ..- attr(*, "terms")=Classes 'terms', 'formula'  language c(1, 2, 3) ~ c(4, 5, 6)
##   .. .. ..- attr(*, "variables")= language list(c(1, 2, 3), c(4, 5, 6))
##   .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. .. ..$ : chr [1:2] "c(1, 2, 3)" "c(4, 5, 6)"
##   .. .. .. .. ..$ : chr "c(4, 5, 6)"
##   .. .. ..- attr(*, "term.labels")= chr "c(4, 5, 6)"
##   .. .. ..- attr(*, "order")= int 1
##   .. .. ..- attr(*, "intercept")= int 1
##   .. .. ..- attr(*, "response")= int 1
##   .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##   .. .. ..- attr(*, "predvars")= language list(c(1, 2, 3), c(4, 5, 6))
##   .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. .. ..- attr(*, "names")= chr [1:2] "c(1, 2, 3)" "c(4, 5, 6)"
##  - attr(*, "class")= chr "lm"
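The call recorded in the output above suggests the model was fit roughly as follows. This is a reconstruction, and the name lamest_model is an assumption.

```r
# Fit a (perfectly fitting, and therefore pointless) linear model
lamest_model <- lm(c(1, 2, 3) ~ c(4, 5, 6))

# str() prints the full nested structure of the fitted object
str(lamest_model)
```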

We can query which object system an object belongs to with pryr’s otype

## [1] "base"
## [1] "S3"
## [1] "S3"

and its class with class

## [1] "numeric"
## [1] "lm"

Confusingly, R’s object system means that a given object will have both a class and a type, for example:

## [1] "base"
## [1] "numeric"
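If pryr is not installed, the same question can be approximated in base R. The helper object_system() below is a hypothetical stand-in written for this sketch, not pryr's actual implementation, though it should give the same answers for the common cases.

```r
# Rough base-R approximation of pryr::otype() (hypothetical helper)
object_system <- function(x) {
  if (!is.object(x)) {
    "base"                                  # no class attribute set
  } else if (!isS4(x)) {
    "S3"                                    # class attribute, but not S4
  } else if (!methods::is(x, "refClass")) {
    "S4"
  } else {
    "RC"                                    # reference classes are built on S4
  }
}

object_system(1)                            # a plain number
object_system(data.frame())                 # a data frame
object_system(lm(c(1, 2, 3) ~ c(4, 5, 6)))  # a fitted model

# The class is a separate question from the object system
class(1)
```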

4.2.2 Attributes

Objects can also have arbitrarily many attributes. The most important and common are

  • names - which give the object the ability to refer to its elements by name. For example:
## [1] "apples"   "bananas"  "cherries"
## apples 
##      1
  • class - which is used by the S3 object system, we’ll see that in a moment

  • dim - short for dimensions, which is used by multidimensional base objects. We’ll see that in a moment too.

You can query a specific attribute with attr

## [1] "apples"   "bananas"  "cherries"

or list all attributes with attributes

## $names
## [1] "apples"   "bananas"  "cherries"
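A sketch that produces output like the above, assuming a small named vector called fruit (the name is made up for this example):

```r
# A named numeric vector
fruit <- c(apples = 1, bananas = 2, cherries = 3)

names(fruit)             # the names attribute
fruit["apples"]          # index an element by name

attr(fruit, "names")     # query one specific attribute
attributes(fruit)        # list every attribute
```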

4.3 Base Types

Every R object is built out of basic C structures that define how it is stored and managed in memory.

This table from Advanced R summarizes them:

                Homogeneous data   Heterogeneous data
1-Dimensional   Atomic Vector      List
2-Dimensional   Matrix             Data frame
N-Dimensional   Array

Recall that we can use typeof() to find an object’s base type

## [1] "double"
## [1] "list"

4.3.1 Vectors

Vectors are sequences, the most basic data type in R. They have two varieties: atomic vectors (with homogeneous values) and lists (with … heterogeneous values).

R has no 0-dimensional, scalar types, so individual characters or numbers are atomic vectors of length one. The most common atomic vector types are:

Atomic Vector Type    Example
Logical               booleans <- c(TRUE, FALSE, NA)
Integer               integers <- c(1L, 2L, 3L)
Double (== numeric)   doubles <- c(1, 2.5, 0.005)
Character             characters <- c("apple", "banana")

raw and complex types also exist, but they are rare.

Vectors are constructed with c(). When c() is given elements of different types, they are coerced to the most permissive type present: an integer can be represented as a double (a floating point number with a decimal point) or as the character "1", but not the other way around. The table above is ordered from least to most permissive.

## [1] "integer"
## [1] "double"
## [1] "character"
## [1] 1
## [1] "1"
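The coercion order can be checked directly. The values below are illustrative and not necessarily the ones used to produce the output above.

```r
# All integers: stays integer
typeof(c(1L, 2L))

# Integer plus double: everything becomes double
typeof(c(1L, 2.5))

# Add a character and everything becomes character
typeof(c(1L, 2.5, "a"))

# Logical is the least permissive type: TRUE becomes 1L next to an integer
c(TRUE, 1L)

# The integer 1 becomes the string "1" next to a character
c(1L, "1")
```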

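Each atomic vector type has a different set of methods (we’ll come back to how methods work in a bit), and arithmetic operators like + only accept numeric types, which is why we can 1 + 1 but not "1" + "1". The listings below show the methods registered for integer and for character. Depending on your R version and loaded packages, the integer listing may include an S4 group of generic functions called “Arith” (see ?Arith, something we won’t talk about until section 5); the output here shows the broader Ops group instead.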

## [1] as.data.frame coerce        Ops          
## see '?methods' for accessing help and source code
##  [1] all.equal                as.data.frame           
##  [3] as.Date                  as.POSIXlt              
##  [5] as.raster                coerce                  
##  [7] coerce<-                 formula                 
##  [9] getDLLRegisteredRoutines Ops                     
## see '?methods' for accessing help and source code

To make a vector that preserves the types of its elements, make a list instead

## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] "3"
## [1] "integer"
## [1] "double"
## [1] "character"
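A sketch consistent with the output above, assuming the list holds an integer, a double, and a character (the name mixed is made up):

```r
# list() keeps each element's own type instead of coercing
mixed <- list(1L, 2, "3")
mixed

# Check the type of each element
element_types <- vapply(mixed, typeof, character(1))
element_types
```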

Notice the double bracket notation [[]]. Lists are commonly recursive, i.e. they store other lists. Single bracket indexing [] always returns a list (a sub-list of the original), while [[]] returns the element stored at that position.

## [1] TRUE
## [[1]]
## [1] 1
## [1] "list"
## [[1]]
## [1] 1 2 3
## 
## [[2]]
## [1] "apple"    "banana"   "cucumber"
## [[1]]
## [1] 1 2 3
## [[1]]
## [1] 1 2 3
## [1] 1 2 3
## [1] 1
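The difference between the two bracket styles is easiest to see side by side. The list nested below is an assumed example, modeled on the output above.

```r
nested <- list(c(1, 2, 3), c("apple", "banana", "cucumber"))

# Lists can contain lists, which is what "recursive" means here
is.recursive(nested)

# Single brackets return a (shorter) list
nested[1]
typeof(nested[1])

# Double brackets return the element itself, here a numeric vector
nested[[1]]
typeof(nested[[1]])

# Indexing can be chained: first element of the first element
nested[[1]][1]
```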

Similarly to coercion among atomic vectors, when c() combines lists with atomic vectors, the result is coerced to a list.

## [1] 1 2 3
## [1] 1 2 3
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
## [[1]]
## [1] 1 2 3
## 
## [[2]]
## [1] "a" "b" "c"
## [1] "1" "2" "3" "a" "b" "c"
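A small sketch of this behavior, using made-up values: c() promotes to a list as soon as one argument is a list, and unlist() goes the other way, flattening back to the most permissive atomic type.

```r
# Combining a list with an atomic value yields a list
combined <- c(list(1, 2), 3)
typeof(combined)
combined

# unlist() flattens a list into an atomic vector, coercing as needed
flattened <- unlist(list(1:3, c("a", "b", "c")))
flattened
```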

Because they are the most general form of vector, lists are used as the base type for many derived classes, like data frames

## [1] "list"

4.3.2 Matrices & Arrays

Arrays are atomic vectors with a dim attribute. Matrices are arrays with two dimensions, i.e. a dim attribute of length 2.

## , , 1
## 
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
## 
## , , 2
## 
##      [,1] [,2] [,3]
## [1,]    7    9   11
## [2,]    8   10   12
## 
## , , 3
## 
##      [,1] [,2] [,3]
## [1,]   13   15   17
## [2,]   14   16   18
## 
## , , 4
## 
##      [,1] [,2] [,3]
## [1,]   19   21   23
## [2,]   20   22   24
## [1] "integer"
## $dim
## [1] 2 3 4
##      [,1] [,2] [,3]
## [1,]    1    9   17
## [2,]    2   10   18
## [3,]    3   11   19
## [4,]    4   12   20
## [5,]    5   13   21
## [6,]    6   14   22
## [7,]    7   15   23
## [8,]    8   16   24
## , , 1
## 
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
## 
## , , 2
## 
##      [,1] [,2] [,3]
## [1,]    7    9   11
## [2,]    8   10   12
## 
## , , 3
## 
##      [,1] [,2] [,3]
## [1,]   13   15   17
## [2,]   14   16   18
## 
## , , 4
## 
##      [,1] [,2] [,3]
## [1,]   19   21   23
## [2,]   20   22   24
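The output above is consistent with the following sketch, in which the same 24 integers are viewed as a 2 x 3 x 4 array, then as an 8 x 3 matrix, then as the array again, just by changing dim. The variable names are assumptions.

```r
# An array is a vector plus a dim attribute
arr <- array(1:24, dim = c(2, 3, 4))
typeof(arr)
attributes(arr)

# Reassigning dim reshapes the same data into an 8 x 3 matrix
reshaped <- arr
dim(reshaped) <- c(8, 3)
reshaped

# ...and back into the original 3-dimensional array
dim(reshaped) <- c(2, 3, 4)
reshaped
```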

In higher dimensions, c() becomes cbind(), rbind(), and abind(): column and row bind for matrices, and array bind for arrays. Note that abind() is not part of base R; it comes from the abind package.

##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## [4,]    1    2    3
## [5,]    4    5    6
## [6,]    7    8    9
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    4    7    1    2    3
## [2,]    2    5    8    4    5    6
## [3,]    3    6    9    7    8    9
## , , 1
## 
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## 
## , , 2
## 
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
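A sketch of the binding functions, using two 3 x 3 matrices that match the output above (the names a and b are assumptions). Because abind may not be installed, the last step uses base array() as a stand-in that gives the same result as stacking along a third dimension.

```r
a <- matrix(1:9, nrow = 3)                 # filled column by column
b <- matrix(1:9, nrow = 3, byrow = TRUE)   # filled row by row

rbind(a, b)   # stack rows: 6 x 3
cbind(a, b)   # stack columns: 3 x 6

# Base-R stand-in for abind::abind(a, b, along = 3)
stacked <- array(c(a, b), dim = c(3, 3, 2))
stacked
```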

Arrays and matrices also have new methods that lists and vectors don’t.

## [1] all.equal     as.data.frame coerce        Ops           relist       
## [6] type.convert  within       
## see '?methods' for accessing help and source code
##  [1] anyDuplicated as.data.frame as.raster     boxplot       coerce       
##  [6] determinant   duplicated    edit          head          initialize   
## [11] isSymmetric   Math          Math2         Ops           relist       
## [16] subset        summary       tail          unique       
## see '?methods' for accessing help and source code

4.3.3 Data Frames

Data frames are one of the gems of R. A data frame is a list of equal length vectors.

##   little_ones big_ones
## 1           0        5
## 2           1        6
## 3           2        7
## 4           3        8
## 5           4        9
## $names
## [1] "little_ones" "big_ones"   
## 
## $class
## [1] "data.frame"
## 
## $row.names
## [1] 1 2 3 4 5

Data frames can be used like lists of vectors

##   little_ones
## 1           0
## 2           1
## 3           2
## 4           3
## 5           4
## [1] 0 1 2 3 4
## [1] 0

Or using names with the $ operator (see ?Extract for more information).

## [1] "little_ones" "big_ones"
## [1] "little_ones" "big_ones"
## [1] "1" "2" "3" "4" "5"
## [1] 0 1 2 3 4
## [1] 5 6 7 8 9
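The indexing behavior above can be sketched as follows, assuming the data frame was built from the two integer sequences shown (the name df is made up):

```r
df <- data.frame(little_ones = 0:4, big_ones = 5:9)

# List-style indexing
df[1]        # single brackets: a one-column data frame
df[[1]]      # double brackets: the column vector itself
df[[1]][1]   # first value of the first column

# Name-based access
names(df)
colnames(df)
rownames(df)
df$little_ones
df$big_ones
```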

Data frames also inherit the methods of lists and vectors, so they can be combined with functions like cbind() and rbind()

##   little_ones big_ones medium_ones
## 1           0        5           3
## 2           1        6           4
## 3           2        7           5
## 4           3        8           6
## 5           4        9           7
##    little_ones big_ones
## 1            0        5
## 2            1        6
## 3            2        7
## 4            3        8
## 5            4        9
## 6            3        3
## 7            4        4
## 8            5        5
## 9            6        6
## 10           7        7
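Output like the above can be produced with cbind() and rbind(). This is a reconstruction under the assumption that a new column and five new rows holding the values 3 to 7 were added.

```r
df <- data.frame(little_ones = 0:4, big_ones = 5:9)

# Add a column
wider <- cbind(df, medium_ones = 3:7)
wider

# Add rows; the column names must match
longer <- rbind(df, data.frame(little_ones = 3:7, big_ones = 3:7))
longer
```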

4.3.4 Etc.

Functions, environments, and other stuff that we’ll learn about in our section on Functions are also base objects, but we’ll discuss them then.

4.4 S3 Objects

S3 methods “belong to” generic functions rather than to classes. S3 classes don’t really “exist” as formal definitions; a class is just whatever is assigned as an object’s “class” attribute. S3 classes are one of the worst things about R, but are also responsible for some of its flexibility.

## NULL
## [1] "letters"
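A sketch of what "assigning a class" means in practice. The vector and the class name "letters" here are a guess at what produced the output above.

```r
some_letters <- c("a", "b", "c")

# A plain character vector has no class attribute
attr(some_letters, "class")

# Assigning one is all it takes to make an S3 object
class(some_letters) <- "letters"
attr(some_letters, "class")
is.object(some_letters)
```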

An articulation of the reasoning behind this “function-and-class” programming can be found here: https://developer.r-project.org/howMethodsWork.pdf. We’ll talk more about this in later sections.

S3 generics are functions that themselves contain a call to UseMethod() - this is described briefly above, try ?UseMethod for more detail. Methods extend the generic function, typically using the naming syntax generic.class(), as in the case of mean.Date() for taking the mean of dates. With methods() one can list the classes that have a method for a given generic, and the methods that are defined for a given class.

## [1] mean.Date     mean.default  mean.difftime mean.POSIXct  mean.POSIXlt 
## see '?methods' for accessing help and source code
##  [1] -             [             [[            [<-           +            
##  [6] as.character  as.data.frame as.list       as.POSIXct    as.POSIXlt   
## [11] Axis          c             coerce        cut           diff         
## [16] format        hist          initialize    is.numeric    julian       
## [21] length<-      Math          mean          months        Ops          
## [26] pretty        print         quarters      rep           round        
## [31] seq           show          slotsFromS3   split         str          
## [36] summary       Summary       trunc         weekdays      weighted.mean
## [41] xtfrm        
## see '?methods' for accessing help and source code

By default, the source code of many S3 methods is not visible by name, because the methods are not exported from their package; one can retrieve it with `utils::getS3method()`.
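The calls discussed above look roughly like this. The name date_mean is only for illustration, and the exact listings depend on your R version and loaded packages.

```r
# Which classes have a mean() method?
methods("mean")

# Which generics have a method for the Date class?
methods(class = "Date")

# Retrieve a specific method, whether or not it is exported
date_mean <- utils::getS3method("mean", "Date")
date_mean

# The retrieved method is an ordinary function and can be called directly
date_mean(as.Date(c("2000-01-01", "2000-01-15")))
```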

The base plot function is an S3 generic function.

## [1] "s3"      "generic"

By default, if the first argument is a base type compatible with being points on a scatterplot, the actual function that is called is plot.default, whose source behaves like you’d expect:

## function (x, y = NULL, type = "p", xlim = NULL, ylim = NULL, 
##     log = "", main = NULL, sub = NULL, xlab = NULL, ylab = NULL, 
##     ann = par("ann"), axes = TRUE, frame.plot = axes, panel.first = NULL, 
##     panel.last = NULL, asp = NA, ...) 
## {
##     localAxis <- function(..., col, bg, pch, cex, lty, lwd) Axis(...)
##     localBox <- function(..., col, bg, pch, cex, lty, lwd) box(...)
##     localWindow <- function(..., col, bg, pch, cex, lty, lwd) plot.window(...)
##     localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
##     xlabel <- if (!missing(x)) 
##         deparse(substitute(x))
##     ylabel <- if (!missing(y)) 
##         deparse(substitute(y))
##     xy <- xy.coords(x, y, xlabel, ylabel, log)
##     xlab <- if (is.null(xlab)) 
##         xy$xlab
##     else xlab
##     ylab <- if (is.null(ylab)) 
##         xy$ylab
##     else ylab
##     xlim <- if (is.null(xlim)) 
##         range(xy$x[is.finite(xy$x)])
##     else xlim
##     ylim <- if (is.null(ylim)) 
##         range(xy$y[is.finite(xy$y)])
##     else ylim
##     dev.hold()
##     on.exit(dev.flush())
##     plot.new()
##     localWindow(xlim, ylim, log, asp, ...)
##     panel.first
##     plot.xy(xy, type, ...)
##     panel.last
##     if (axes) {
##         localAxis(if (is.null(y)) 
##             xy$x
##         else x, side = 1, ...)
##         localAxis(if (is.null(y)) 
##             x
##         else y, side = 2, ...)
##     }
##     if (frame.plot) 
##         localBox(...)
##     if (ann) 
##         localTitle(main = main, sub = sub, xlab = xlab, ylab = ylab, 
##             ...)
##     invisible()
## }
## <bytecode: 0x7f9df2e75370>
## <environment: namespace:graphics>

If the first argument to plot has its own plot method (i.e. one provided by the package that defines the object’s class, more about this in section 5), that function is called instead. That’s why plotting a base-type object (figure not shown) is different than plotting this nonsensical model (figure not shown).

4.4.1 Example: Extending S3 Objects

See the “Creating new methods and generics” section of Advanced R: http://adv-r.had.co.nz/OO-essentials.html

Using a class’s method is what allows us to do sensible computations on different types of objects with the same command.

## [1] "s3"      "generic"
## [1] 1
## [1] 1.25
## [1] "that's just a one you maniac"
##  [1] mean,ANY-method          mean,Matrix-method      
##  [3] mean,sparseMatrix-method mean,sparseVector-method
##  [5] mean.Date                mean.default            
##  [7] mean.difftime            mean.just_one           
##  [9] mean.POSIXct             mean.POSIXlt            
## see '?methods' for accessing help and source code
## NULL
## Warning in mean.default(dates): argument is not numeric or logical:
## returning NA
## [1] NA
## [1] "Date"
## [1] "2000-01-08"
## [1] "2000-01-08"
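The output above can be approximated with the sketch below. The class name just_one and its message come from the output; everything else (variable names, the specific dates) is assumed for illustration.

```r
# Define a new mean() method for a made-up class
mean.just_one <- function(x, ...) "that's just a one you maniac"

one <- structure(1, class = "just_one")
mean(one)   # dispatches to mean.just_one

# Dates stored as plain strings have no class, so mean.default is used
dates <- c("2000-01-01", "2000-01-15")
attr(dates, "class")
string_mean <- suppressWarnings(mean(dates))   # warns and returns NA
string_mean

# Once converted to the Date class, mean.Date does something sensible
real_dates <- as.Date(dates)
class(real_dates)
mean(real_dates)
```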

4.5 S4 Objects

S4 objects have a single class definition with specifically defined fields and functions. They are too complicated for us to cover in much detail yet, so we will return to them again later.

With S3 objects, we can pretend for a while to be another class just by reassigning the class attribute, even when the result makes no sense:

## [1] "data.frame"
## 
## Call:
## NULL
## 
## No coefficients
## Error in if (p == 0) {: argument is of length zero
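A sketch of that kind of pretending, assuming a small data frame is relabeled as an lm (the names pretender and summary_failed are made up):

```r
pretender <- data.frame(little_ones = 0:4, big_ones = 5:9)
class(pretender)

# S3 performs no validation: any object can claim any class
class(pretender) <- "lm"
inherits(pretender, "lm")

# print() now dispatches to print.lm, which finds no call or coefficients
print(pretender)

# summary.lm expects a real fitted model, so it fails
summary_failed <- inherits(try(summary(pretender), silent = TRUE), "try-error")
summary_failed
```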

Not so with S4 objects.

We can finally implement our truck classes

S4 objects have slots, accessible with @ (which behaves like $) or slot(). We create new instances of S4 objects with new()
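Here is a minimal sketch of the truck classes from section 4.1.1 in S4. The slot types and the example values are assumptions made for illustration.

```r
library(methods)

# The parent class, with typed slots
setClass("truck", slots = c(
  engine_size = "numeric",
  number_of_wheels = "numeric",
  number_of_jumps_gone_off = "numeric"
))

# The subclass inherits every slot of truck and adds one more
setClass("monster_truck", contains = "truck", slots = c(
  mythical_backstory = "character"
))

# Create an instance with new()
my_monster <- new("monster_truck",
  engine_size = 9,
  number_of_wheels = 4,
  number_of_jumps_gone_off = 0,
  mythical_backstory = "Assembled from spare parts in a hypothetical swamp"
)

# Access slots with @ or slot()
my_monster@engine_size
slot(my_monster, "mythical_backstory")

# Inheritance is part of the class definition
is(my_monster, "truck")

# Unlike S3, slot types are checked when an object is created
bad_truck <- try(new("truck", engine_size = "huge"), silent = TRUE)
inherits(bad_truck, "try-error")
```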

S4 Methods are a headache (and we will skip them in the class). One has to create a generic function if it does not yet exist with setGeneric(), then set the method, classes and function separately with setMethod(). An example for your edification:
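A minimal, self-contained sketch of the setGeneric() and setMethod() steps. The speed slot and the rule that a truck accelerates by its engine size are hypothetical choices made only for this example.

```r
library(methods)

# A small truck class (speed is a hypothetical slot added for this sketch)
setClass("truck", slots = c(engine_size = "numeric", speed = "numeric"))

# Step 1: create the generic, since go_faster() does not exist yet
setGeneric("go_faster", function(truck, ...) standardGeneric("go_faster"))

# Step 2: attach a method to the generic for the truck class
setMethod("go_faster", "truck", function(truck, ...) {
  truck@speed <- truck@speed + truck@engine_size
  truck   # S4 objects are copied on modification, so return the new one
})

my_truck <- new("truck", engine_size = 5, speed = 0)
my_truck <- go_faster(my_truck)
my_truck@speed
```

Because the method returns a modified copy, the result has to be assigned back to my_truck for the change to stick.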

Try extending that to have the monster trucks tell their mythical_backstory as they accelerate.

We will return to S4 objects in more detail in section 5.

4.6 Reference Classes

Reference classes are a “truly” object oriented system in R, but we are going to skip them entirely for now because they are rare enough that you aren’t likely to encounter them yet. See here for more information: http://adv-r.had.co.nz/OO-essentials.html#rc