In R, How to Tell Whether a Result Is a Vector or a Matrix

In R, how do you tell whether a result is a vector or a matrix?

Here are a few ways to see what the result of split(calculations..) is:

class(split(mtcars$mpg, mtcars$cyl))
typeof(split(mtcars$mpg, mtcars$cyl))
mode(split(mtcars$mpg, mtcars$cyl))
storage.mode(split(mtcars$mpg, mtcars$cyl))

# str() shows the structure of the object and gives a small summary of it.
str(split(mtcars$mpg, mtcars$cyl))

You can also assign the list to a new object and interrogate it using the previous functions:

cars_ls <- split(mtcars$mpg, mtcars$cyl)

class(cars_ls)
typeof(cars_ls)
mode(cars_ls)

# and

str(cars_ls)
# List of 3
# $ 4: num [1:11] 22.8 24.4 22.8 32.4 30.4 33.9 21.5 27.3 26 30.4 ...
# $ 6: num [1:7] 21 21 21.4 18.1 19.2 17.8 19.7
# $ 8: num [1:14] 18.7 14.3 16.4 17.3 15.2 10.4 10.4 14.7 15.5 15.2 ...

By now it is clear that the object split() returns is a list. In this case, the list cars_ls holds 3 numeric vectors, so there is no matrix here.
You can index the list in a few ways. Here are some examples.

# Using $ with the element name wrapped in backticks
cars_ls$`4`

# Including names using [
cars_ls[1]

# No names using [[
cars_ls[[1]]
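
To see what each indexing form actually hands back, here is a minimal sketch that compares the classes. It rebuilds cars_ls from mtcars so it runs on its own; sub_list and sub_vec are names made up for illustration.

```r
# Rebuild the list from the built-in mtcars data so the block is self-contained
cars_ls <- split(mtcars$mpg, mtcars$cyl)

sub_list <- cars_ls[1]     # single bracket keeps the list wrapper (and the name)
sub_vec  <- cars_ls[[1]]   # double bracket extracts the element itself

class(sub_list)   # "list"
class(sub_vec)    # "numeric"
names(sub_list)   # "4"

# $ with a backticked name and [[ with a character name reach the same element
identical(cars_ls$`4`, cars_ls[["4"]])   # TRUE
```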

EDIT
Technically speaking, lists are vectors also. Here are a few more functions to check what type of object you have.

is.vector(cars_ls)
# [1] TRUE
is.matrix(cars_ls)
# [1] FALSE
is.list(cars_ls)
# [1] TRUE
is.data.frame(cars_ls)
# [1] FALSE
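
Because is.vector() is TRUE for lists too, it can help to combine these checks in a fixed order. The following is only a sketch: describe_shape() is a hypothetical helper written for this answer, not a base R function, and the order of the tests is one possible choice.

```r
# describe_shape() is a hypothetical helper, not part of base R
describe_shape <- function(x) {
  if (is.data.frame(x)) return("data frame")          # data frames are lists, so test first
  if (is.list(x) && is.null(dim(x))) return("list")   # plain list without dimensions
  if (is.matrix(x)) return("matrix")                  # dim attribute of length two
  if (is.array(x)) return("array")                    # any other dim attribute
  if (is.atomic(x)) return("atomic vector")           # no dim attribute at all
  "other"
}

describe_shape(split(mtcars$mpg, mtcars$cyl))  # "list"
describe_shape(mtcars$mpg)                     # "atomic vector"
describe_shape(as.matrix(mtcars))              # "matrix"
describe_shape(mtcars)                         # "data frame"
```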

Regarding what unlist does:

un_ls <- unlist(cars_ls)

mode(un_ls)
storage.mode(un_ls)
typeof(un_ls)
class(un_ls)

is.vector(un_ls)
# [1] TRUE
is.list(un_ls)
# [1] FALSE

un_ls is a numeric vector, clearly not a list. So unlist() takes a list and flattens its elements into a single atomic vector.
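
Two side effects of unlist() are worth knowing: element names are built from the list names, and mixed types are coerced to a common type. A small sketch, where mixed is a hypothetical list invented for the example:

```r
mixed <- list(a = 1:2, b = 3.5, c = "x")   # hypothetical example list
flat  <- unlist(mixed)

typeof(flat)   # "character": everything is coerced to the most general type
names(flat)    # "a1" "a2" "b" "c"

# The same naming rule applies to the split() result used above
un_ls <- unlist(split(mtcars$mpg, mtcars$cyl))
length(un_ls)            # 32
head(names(un_ls), 3)    # "41" "42" "43"
```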

You can find a more detailed description of these functions in the R Language Definition.

What are the differences between vector, matrix and array data types?

There is no difference between a matrix and a 2D array:

> x <- matrix(1:10, 2)
> y <- array(1:10, c(2, 5))
> identical(x, y)
[1] TRUE
...

matrix is just a more convenient constructor, and there are many functions and methods that only accept 2D arrays (a.k.a. matrices).

Internally, arrays are just vectors with a dimension attribute:

...
> attributes(x)
$dim
[1] 2 5

> dim(x) <- NULL
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> z <- 1:10
> dim(z) <- c(2, 5)
> is.matrix(z)
[1] TRUE

To cite the language definition:

Matrices and arrays are simply vectors with the attribute dim and
optionally dimnames attached to the vector.

[...]

The dim attribute is used to implement arrays. The content of the
array is stored in a vector in column-major order and the dim
attribute is a vector of integers specifying the respective extents of
the array. R ensures that the length of the vector is the product of
the lengths of the dimensions. The length of one or more dimensions
may be zero.

A vector is not the same as a one-dimensional array since the latter
has a dim attribute of length one, whereas the former has no dim
attribute.
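
The last two points of the quote can be checked directly. A short sketch (the variable names v, a1 and m are arbitrary):

```r
v  <- 1:3                    # plain vector: no dim attribute
a1 <- array(1:3, dim = 3)    # one-dimensional array: dim attribute of length one

is.null(dim(v))    # TRUE
dim(a1)            # 3
is.array(a1)       # TRUE
is.matrix(a1)      # FALSE, a matrix needs a dim attribute of length two
is.vector(a1)      # FALSE, because of the dim attribute

# Column-major storage: values fill the first column, then the second, ...
m <- matrix(1:6, nrow = 2)
m[2, 1]   # 2
m[1, 2]   # 3
```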

Difference between two vectors in R

You can use setdiff

setdiff(b,a)
#[1] 2 6 8
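
Note that setdiff() is not symmetric: the order of the arguments matters. In the sketch below, a and b are hypothetical vectors chosen only so that the call reproduces the output shown; the vectors from the original question are not given here.

```r
# a and b are hypothetical; the original values are not shown in this answer
a <- c(1, 3, 5, 7, 9)
b <- c(1, 2, 3, 6, 8)

setdiff(b, a)   # elements of b that are not in a: 2 6 8
setdiff(a, b)   # elements of a that are not in b: 5 7 9

# For a symmetric difference, combine both directions
union(setdiff(a, b), setdiff(b, a))   # 5 7 9 2 6 8
```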

Difference between [] and $ operators for subsetting

Below we will use a one-row data frame in order to keep the output brief:

mtcars1 <- mtcars[1, ]

Note the differences among these. We can use class as in class(mtcars["hp"]) to investigate the class of the return value.

The first two correspond to the code in the question and return a data frame and plain vector respectively. The key differences between [ and $ are that [ (1) can specify multiple columns, (2) allows passing of a variable as the index and (3) returns a data frame (although see examples later on) whereas $ (1) can only specify a single column, (2) the index must be hard coded and (3) it returns a vector.

mtcars1["hp"]  # returns data frame
##            hp
## Mazda RX4 110

mtcars1$hp # returns plain vector
## [1] 110

Other examples where index is a single element. Note that the first and second examples below are actually the same as drop = TRUE is the default.

mtcars1[, "hp"] # returns plain vector
## [1] 110

mtcars1[, "hp", drop = TRUE] # returns plain vector
## [1] 110

mtcars1[, "hp", drop = FALSE] # returns data frame
##            hp
## Mazda RX4 110

Also there is the [[ operator which is like the $ operator except it can accept a variable as the index whereas $ requires the index to be hard coded:

mtcars1[["hp"]] # returns plain vector
## [1] 110

Others where index specifies multiple elements. $ and [[ cannot be used with multiple elements so these examples only use [:

mtcars1[c("mpg", "hp")] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp")] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp"), drop = FALSE] # returns data frame
##           mpg  hp
## Mazda RX4  21 110

mtcars1[, c("mpg", "hp"), drop = TRUE] # returns list
## $mpg
## [1] 21
##
## $hp
## [1] 110

[

mtcars[foo] can return more than one column if foo is a vector with more than one element, e.g. mtcars[c("hp", "mpg")], and in all cases the return value is a data.frame even if foo has only one element (as it does in the question).

There is also mtcars[, foo, drop = FALSE], which returns the same value as mtcars[foo], so it always returns a data frame. With drop = TRUE it returns the column itself if foo specifies a single column. If foo specifies multiple columns, drop = TRUE returns a list rather than a data.frame only when the result has a single row (as with mtcars1 above); with more than one row it still returns a data frame.
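
How drop = TRUE behaves with several columns depends on the number of rows in the result, which is easy to miss when experimenting with a one-row data frame. A minimal sketch using the built-in mtcars (cols is just an illustrative variable name):

```r
cols <- c("mpg", "hp")

# One row: with drop = TRUE the data frame class is dropped, leaving a list
class(mtcars[1, cols, drop = TRUE])   # "list"

# Several rows: the result is still a data frame
class(mtcars[, cols, drop = TRUE])    # "data.frame"

# A single column is dropped to a plain vector regardless of the row count
class(mtcars[, "hp", drop = TRUE])    # "numeric"
```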

[[

On the other hand mtcars[[foo]] only works if foo has one element and it returns that column, not a data frame.

$

mtcars$hp also only works for a single column, like [[, and returns the column, not a data frame containing that column.

mtcars$hp is like mtcars[["hp"]]; however, there is no possibility to pass a variable index with $. One can only hard-code the index with $.
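
The "variable index" point can be illustrated as follows. This is a sketch; col is an arbitrary variable name holding the column name as a string.

```r
col <- "hp"          # column name held in a variable

head(mtcars[[col]])  # works: [[ evaluates col and looks up the "hp" column
mtcars$col           # NULL: $ looks for a column literally named "col"

# [ also accepts a variable, but keeps the data frame wrapper
class(mtcars[col])   # "data.frame"
```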

subset

Note that this works:

subset(mtcars, hp > 150)

returning a data frame containing those rows where the hp column exceeds 150:

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8

other objects

The above pertain to data frames but other objects that can use $, [ and [[ will have their own rules. In particular if m is a matrix, e.g. m <- as.matrix(BOD), then m[, 1] is a vector, not a one column matrix, but m[, 1, drop = FALSE] is a one column matrix. m[[1]] and m[1] are both the first element of m, not the first column. m$a does not work at all.
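
The matrix rules above can be verified with the built-in BOD data frame. This is a sketch; res is a helper variable introduced only to capture the error, and tryCatch() is used so the block runs to the end.

```r
m <- as.matrix(BOD)          # BOD is a built-in data frame with 6 rows and 2 columns

dim(m[, 1])                  # NULL: the column was dropped to a plain vector
dim(m[, 1, drop = FALSE])    # 6 1: still a one-column matrix

m[1]                         # first element of the underlying vector
m[[1]]                       # same element

# $ is not defined for atomic vectors, so m$Time signals an error
res <- tryCatch(m$Time, error = function(e) "error")
res                          # "error"
```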

help

See ?Extract for more information. Also ?"$", ?"[" and ?"[[" all get to the same page, as well.

Get a derivative by knowing two numeric vectors

To find the derivative, use the numerical approximation (y2-y1)/(x2-x1), i.e. the change in y divided by the change in x between two neighbouring points. In R, use the diff function to calculate the differences between consecutive elements:

x<-rnorm(100)
y<-x^2+x

#find the average x between 2 points
avex<-x[-1]-diff(x)/2
#find the numerical approximation
#delta-y/delta-x
dydx<-diff(y)/diff(x)

#plot numeric approximation
plot(x=avex, dydx)
#plot analytical answer
lines(x=avex, y=2*avex+1)
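
As a numerical check without a plot, the same idea can be run on a sorted, evenly spaced grid. That grid is an assumption made for this sketch (the code above draws x at random), and midx is an illustrative name for the interval midpoints.

```r
# Evenly spaced, sorted grid (an assumption; the code above uses random x)
x <- seq(-2, 2, length.out = 101)
y <- x^2 + x

midx <- (x[-1] + x[-length(x)]) / 2   # midpoint of each interval
dydx <- diff(y) / diff(x)             # slope of each secant

# For this quadratic, the secant slope matches the derivative 2x + 1 at the midpoint
all.equal(dydx, 2 * midx + 1)         # TRUE
```

For functions that are not quadratic, the secant slope is only an approximation of the derivative at the midpoint, and the error generally shrinks as the spacing between points gets smaller.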

