Beginning with R — The uncharted territory Part 2

For a recap of lists, vectors and matrices in R, check out Beginning with R — The uncharted territory Part 1.

Arrays

An array is an object that can hold multidimensional data. Matrices are a special case of arrays: they are arrays with exactly two dimensions. Along with the dimension attribute, dim, an array can also carry an optional dimnames attribute that labels each dimension.

The syntax is a <- array(data, dim = c(x, y, z, t, ...))

a <- array(1:24, dim = c(3,4,2)); print(a)
## , , 1
## 
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    2    5    8   11
## [3,]    3    6    9   12
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4]
## [1,]   13   16   19   22
## [2,]   14   17   20   23
## [3,]   15   18   21   24
vec1 <- c(10,20,30,40)
vec2 <- c(12,13,14,15)
b <- array(c(vec1,vec2), dim = c(2,2,2)); print(b)
## , , 1
## 
##      [,1] [,2]
## [1,]   10   30
## [2,]   20   40
## 
## , , 2
## 
##      [,1] [,2]
## [1,]   12   14
## [2,]   13   15
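
The relationship between matrices and arrays, and the order in which array() fills its cells, can be checked directly. The following is a minimal sketch; the variable name m is only illustrative.

```r
# A matrix is an array with exactly two dimensions.
m <- matrix(1:6, nrow = 2)
print(is.array(m))        # TRUE
print(dim(m))             # 2 3

# array() fills column by column within a slice, then moves to the next slice.
a <- array(1:24, dim = c(3, 4, 2))
print(a[2, 3, 1])         # 8  (row 2, column 3, slice 1)
print(a[2, 3, 2])         # 20 (same cell in slice 2)
print(length(dim(a)))     # 3
```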

To define labels for the different dimensions, use dimnames:

vec1 <- c(10,20,30,40)
vec2 <- c(12,13,14,15)
b <- array(c(vec1,vec2), dim=c(2,2,2), dimnames = list(c("a", "b"),
                                                      c("d", "e"),
                                                      c("g", "h"))); print(b)
## , , g
## 
##    d  e
## a 10 30
## b 20 40
## 
## , , h
## 
##    d  e
## a 12 14
## b 13 15
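
Once dimnames are set, elements can be selected by label instead of by position. A short sketch using the same array b as above:

```r
vec1 <- c(10, 20, 30, 40)
vec2 <- c(12, 13, 14, 15)
b <- array(c(vec1, vec2), dim = c(2, 2, 2),
           dimnames = list(c("a", "b"), c("d", "e"), c("g", "h")))

# Index by label: row "a", column "e", slice "h".
print(b["a", "e", "h"])   # 14

# An empty index keeps the whole dimension: row "b" of slice "g".
print(b["b", , "g"])      # named vector with d = 20, e = 40

# The labels of the third dimension.
print(dimnames(b)[[3]])   # "g" "h"
```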
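
Arrays are subset with one index per dimension. An empty index selects everything along that dimension, and drop = FALSE keeps dimensions of length one instead of dropping them.

arr <- array(1:27,dim=c(3,3,3)); print(arr)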
## , , 1
## 
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## 
## , , 2
## 
##      [,1] [,2] [,3]
## [1,]   10   13   16
## [2,]   11   14   17
## [3,]   12   15   18
## 
## , , 3
## 
##      [,1] [,2] [,3]
## [1,]   19   22   25
## [2,]   20   23   26
## [3,]   21   24   27
t <- arr[1:2,1:2,,drop=FALSE]; print(attributes(t)); print(t)
## $dim
## [1] 2 2 3
## , , 1
## 
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## 
## , , 2
## 
##      [,1] [,2]
## [1,]   10   13
## [2,]   11   14
## 
## , , 3
## 
##      [,1] [,2]
## [1,]   19   22
## [2,]   20   23
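
In the subset above no dimension shrinks to length one, so drop = FALSE makes no visible difference. The effect shows up when a single index is selected along a dimension, as in this small sketch (s1, s2 and s3 are illustrative names):

```r
arr <- array(1:27, dim = c(3, 3, 3))

# Selecting a single row: by default the length-one dimension is dropped,
# leaving a 3 x 3 matrix.
s1 <- arr[1, , ]
print(dim(s1))            # 3 3

# With drop = FALSE the result stays three-dimensional.
s2 <- arr[1, , , drop = FALSE]
print(dim(s2))            # 1 3 3

# Fixing two dimensions drops the result to a plain vector with no dim.
s3 <- arr[1, 1, ]
print(dim(s3))            # NULL
print(s3)                 # 1 10 19
```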

Factors

To represent categorical data, R has a dedicated object called a factor. A factor is stored as a vector of integer codes, and each code is associated with a label. These labels are called levels. Factors therefore print like character strings but are integers underneath. Factors are also used to group or sort other data by a categorical variable.

The factor() function creates a factor object.

fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'))
print(attributes(fruits))
## $levels
## [1] "apple"  "banana" "orange"
## 
## $class
## [1] "factor"
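
The integer codes behind a factor can be inspected directly. A minimal sketch using the same fruits factor:

```r
fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'))

print(levels(fruits))       # "apple" "banana" "orange"
print(as.integer(fruits))   # 1 3 3 1 3 2 1 (position of each value in levels)
print(typeof(fruits))       # "integer"
print(table(fruits))        # counts per level: apple 3, banana 1, orange 3
```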

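By default the levels are sorted alphabetically. To set a different order, pass the levels argument explicitly. Note that this only fixes the order of the levels; to get an ordered factor whose values can be compared, also pass ordered = TRUE.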

fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'), 
                 levels = c('apple', 'orange', 'banana'))
print(attributes(fruits))
## $levels
## [1] "apple"  "orange" "banana"
## 
## $class
## [1] "factor"
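
For categories with a natural ranking, an ordered factor allows comparisons and sorting by level. The sketch below uses a hypothetical sizes vector that is not part of the examples above.

```r
# Hypothetical data: clothing sizes with a natural order.
sizes <- factor(c("medium", "small", "large", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)

print(is.ordered(sizes))            # TRUE
print(sizes[1] > sizes[2])          # TRUE, because medium > small
print(as.character(sort(sizes)))    # "small" "small" "medium" "large"
print(as.character(max(sizes)))     # "large"
```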

Dataframes

Data frames are used to store tabular data. A data frame is a list of columns (vectors) that all have the same length.

a <- data.frame(city=c('Jaipur','Jammu'), rank = c(2,3)); print(a)
##     city rank
## 1 Jaipur    2
## 2  Jammu    3

Different columns can hold different types of data: one column may be character, another a factor, and so on. Within a single column, however, all values must be of the same type.
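
The per-column types can be checked with sapply(). This is a small sketch that extends the data frame above with a hypothetical visited column; stringsAsFactors = FALSE is passed explicitly so that character columns stay character on R versions before 4.0 as well.

```r
# 'visited' is a made-up column added only to show a factor column.
df <- data.frame(city = c("Jaipur", "Jammu"),
                 rank = c(2, 3),
                 visited = factor(c("yes", "no")),
                 stringsAsFactors = FALSE)

print(sapply(df, class))   # city: "character", rank: "numeric", visited: "factor"
print(is.list(df))         # TRUE, a data frame is a list of columns
print(dim(df))             # 2 3
print(df$rank[df$city == "Jammu"])   # 3
```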

Puneet Sharma
Research Scholar

My research interests include cloud & aerosol modeling and statistics.
