Prevent Automatic Conversion of Single Column to Vector

Prevent automatic conversion of single column to vector

This one is pretty simple. Add drop = FALSE as the last argument of your subsetting call.

E.g.

df[, c(TRUE, FALSE, FALSE), drop = FALSE]

Also works for matrices.
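A minimal sketch of the difference, using a made-up three-column data frame (the name df and its columns are assumptions for illustration only):

```r
# Hypothetical data frame, invented for this example
df <- data.frame(a = 1:3, b = c("x", "y", "z"), c = c(1.5, 2.5, 3.5))

# Default behaviour: selecting a single column drops to a plain vector
as_vector <- df[, c(TRUE, FALSE, FALSE)]

# With drop = FALSE the result stays a one-column data frame
as_frame <- df[, c(TRUE, FALSE, FALSE), drop = FALSE]

is.data.frame(as_vector)  # FALSE
is.data.frame(as_frame)   # TRUE
```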

Is there something in R for automatic conversion of a column (of a data frame or table) into its original vector type

It seems like there is some bug(?) in fread when setting colClasses (I'll wait for a response from @Arun). In the meantime, you can fix this by running type.convert on the character columns after reading the data, reassigning the columns by reference:

indx <- which(sapply(df, is.character))
df[, (indx) := lapply(.SD, type.convert, as.is = TRUE), .SDcols = indx]
str(df)
# Classes ‘data.table’ and 'data.frame': 6 obs. of 4 variables:
# $ V1 : int 1 2 3 4 5 6
# $ ID : int 109 110 111 112 113 114
# $ SignalIntensity: num 7.58 11.27 8.6 9.54 10.18 ...
# $ SNR : num 1.34 9.75 1.8 3.2 4.65 ...
# - attr(*, ".internal.selfref")=<externalptr>
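The same idea can be sketched in base R without data.table. The data frame below is invented for illustration (its name and values are assumptions), and as.is = TRUE is passed so that any non-numeric strings would stay character instead of becoming factors:

```r
# Hypothetical input: every column was read in as character
raw <- data.frame(
  ID  = c("109", "110", "111"),
  SNR = c("1.34", "9.75", "1.80"),
  stringsAsFactors = FALSE
)

# Let type.convert pick the narrowest suitable type for each column;
# assigning into raw[] keeps the data.frame class
raw[] <- lapply(raw, type.convert, as.is = TRUE)

sapply(raw, class)
```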

Convert data.frame columns from factors to characters

Just following on Matt and Dirk. If you want to recreate your existing data frame without changing the global option, you can recreate it with an apply statement:

bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)

This will convert all variables to class "character". If you want to convert only the factors, see Marek's solution below.

As @hadley points out, the following is more concise.

bob[] <- lapply(bob, as.character)

In both cases, lapply outputs a list; however, assigning into bob[] replaces the columns in place and keeps the data.frame class of the bob object, thereby eliminating the need to convert back to a data.frame using as.data.frame with the argument stringsAsFactors = FALSE.
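To convert only the factor columns and leave the others alone, one possible sketch is to build a logical index first. This is an illustration under assumptions (the sample data and the name is_fac are made up here), not necessarily the approach taken in the other answers:

```r
# Hypothetical data frame with one factor column among other types
bob <- data.frame(
  id    = 1:3,
  name  = factor(c("ann", "ben", "cat")),
  score = c(1.5, 2.0, 3.0)
)

# TRUE for each column that is a factor
is_fac <- vapply(bob, is.factor, logical(1))

# Replace only those columns; bob keeps its data.frame class
bob[is_fac] <- lapply(bob[is_fac], as.character)

sapply(bob, class)
```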

Creating New Values for Column from Vector of Differences

One possibility would be to use the cumsum function:

set.seed(1)
data <- data.frame(seq(2001, 2020, 1))
data$y <- (runif(20, 1, 10))
data$y[11:20] <- NA
colnames(data)[1] <- "Year"

myvector <- runif(10, -1, 1)
data$y[11:20] <- data$y[10] + cumsum(myvector)

Also, it is good practice to set a random seed (with set.seed) when working with random numbers.
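To see why cumsum does the job, here is a small sketch with fixed numbers instead of random ones (the values and variable names are assumptions chosen for readability). Each new value is the last known value plus the running total of the differences, and diff recovers the original steps:

```r
last_known <- 10            # stands in for data$y[10]
steps <- c(1, -2, 0.5)      # stands in for myvector

# Running total of the steps, shifted by the starting value
new_values <- last_known + cumsum(steps)
new_values                  # 11.0  9.0  9.5

# Sanity check: differencing the series gives the steps back
recovered <- diff(c(last_known, new_values))
```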

R avoid coercion from matrix to numeric

Use drop=FALSE

> matrix(1:10, ncol=2)
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
> matrix(1:10, ncol=2)[, 2]
[1]  6  7  8  9 10
> matrix(1:10, ncol=2)[, 2, drop=FALSE]
     [,1]
[1,]    6
[2,]    7
[3,]    8
[4,]    9
[5,]   10

Avoid rbind()/cbind() conversion from numeric to factor

You can use rbind.data.frame and cbind.data.frame instead of rbind and cbind.
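A short sketch of the difference, with made-up vectors (num and lab are assumptions for illustration). Plain cbind on atomic vectors builds a matrix, which can hold only one type, while cbind.data.frame keeps each column's own type. stringsAsFactors = FALSE is passed explicitly so the text column stays character on older R versions too:

```r
num <- c(1.5, 2.5, 3.5)
lab <- c("a", "b", "c")

# Matrix result: everything is coerced to character
m <- cbind(num, lab)
typeof(m)

# Data frame result: the numeric column stays numeric
d <- cbind.data.frame(num = num, lab = lab, stringsAsFactors = FALSE)
sapply(d, class)
```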

Prevent Julia from automatically converting the type of a 1D matrix slice

Use a range of length 1 instead of just an index

Instead of simply specifying the index (Int64) of the desired column, specify a range (UnitRange{Int64}) of length 1: 1:1.

That will trick Julia into preserving the 2D-array type (Array{Int64,2}) instead of returning a vector (Array{Int64,1}).

Edit: the developers discussed this topic here (thanks to Colin for pointing me to it).

julia> alpha = [1 2 3; 4 5 6]
2x3 Array{Int64,2}:
1 2 3
4 5 6

julia> alpha[:,1] # nope
2-element Array{Int64,1}:
1
4

julia> alpha[:,1:1] # yep
2x1 Array{Int64,2}:
1
4

Prevent [.data.frame drop dimensions where there is only one column

Your first line, demos.part <- demos[-i, ], would only drop from a data frame to a vector if demos has exactly one column:

# One column: result is a vector
> data.frame(a=letters)[1,]
[1] a
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
# 2 cols: result is a df with 1 row
> data.frame(a=letters, b=letters)[1,]
data.frame with 1 row and 2 columns
a b
<factor> <factor>
1 a a

To see why this is, you can inspect the arguments of [.data.frame, where the default value of the drop argument depends on the number of columns:

> args(`[.data.frame`)
function (x, i, j, drop = if (missing(i)) TRUE else length(cols) ==
1)
NULL

Regardless, any time you want to prevent dropping of dimensions, simply add drop=FALSE after any indexing arguments (including intentionally blank indexing arguments; note the empty space between the two commas for the blank column index):

> data.frame(a=letters)[1, , drop=FALSE]
data.frame with 1 row and 1 column
a
<factor>
1 a

You should always use drop=FALSE when the number of rows or columns selected depends on external input, since the selection may end up with a single column (or, for a matrix, a single row). Alternatively, use the data_frame function from the dplyr package (since deprecated in favor of tibble()) to create a data frame with fewer weird edge cases in its behavior:

> library(dplyr)
> data_frame(a=letters)[1,]
Source: local data frame [1 x 1]

a
(chr)
1 a
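When the set of columns comes from outside your code, the drop=FALSE advice can be wrapped in a small helper. This is only a sketch: pick_columns and the sample data are hypothetical names introduced here, not part of base R or dplyr.

```r
# Hypothetical helper: keep the requested columns, always as a data frame
pick_columns <- function(df, wanted) {
  keep <- intersect(names(df), wanted)  # ignore names that do not exist
  df[, keep, drop = FALSE]
}

demo <- data.frame(a = 1:3, b = 4:6)

one <- pick_columns(demo, "a")          # still a data frame, 3 x 1
two <- pick_columns(demo, c("a", "b"))  # data frame, 3 x 2

dim(one)
dim(two)
```

Callers can then rely on nrow, ncol and names working the same way regardless of how many columns were requested.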

