cbind Warnings: Row Names Were Found from a Short Variable and Have Been Discarded

I'm guessing your data.frame has row.names:

A <- data.frame(a = c("A", "B", "C"),
                b = c(1, 2, 3),
                c = c(4, 5, 6),
                row.names = c("A", "B", "C"))

cbind(A[1], stack(A[-1]))
#   a values ind
# 1 A      1   b
# 2 B      2   b
# 3 C      3   b
# 4 A      4   c
# 5 B      5   c
# 6 C      6   c
# Warning message:
# In data.frame(..., check.names = FALSE) :
#   row names were found from a short variable and have been discarded

What's happening here is that a data.frame can't, by default, have duplicated row.names. When the first column is recycled to match the number of rows of the stacked columns, you never tell R to duplicate the row.names as well, so R just discards them.
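To see this concretely, here is a minimal sketch of my own (reusing the A from above) that checks whether a data.frame carries explicit row.names and confirms they are gone after the cbind. The helper name has_explicit_rownames is made up for this example; .row_names_info() is a base R function that returns a negative count when the row names are automatic.

```r
A <- data.frame(a = c("A", "B", "C"),
                b = c(1, 2, 3),
                c = c(4, 5, 6),
                row.names = c("A", "B", "C"))

# Hypothetical helper: TRUE when the row names are not the automatic 1..n
has_explicit_rownames <- function(df) .row_names_info(df) > 0L

explicit_before <- has_explicit_rownames(A)   # A was built with row.names

# Capture the warning text instead of letting it print
warning_text <- NULL
res <- withCallingHandlers(
  cbind(A[1], stack(A[-1])),
  warning = function(w) {
    warning_text <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  })

explicit_after <- has_explicit_rownames(res)  # row names were reset to 1..6
```

If explicit_before is TRUE and explicit_after is FALSE, the row.names were dropped during recycling, which is exactly what the warning reports.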

Compare with a similar data.frame, but one without row.names:

B <- data.frame(a = c("A", "B", "C"),
                b = c(1, 2, 3),
                c = c(4, 5, 6))

cbind(B[1], stack(B[-1]))
#   a values ind
# 1 A      1   b
# 2 B      2   b
# 3 C      3   b
# 4 A      4   c
# 5 B      5   c
# 6 C      6   c

Alternatively, you can set row.names = NULL in your cbind statement:

cbind(A[1], stack(A[-1]), row.names = NULL)
#   a values ind
# 1 A      1   b
# 2 B      2   b
# 3 C      3   b
# 4 A      4   c
# 5 B      5   c
# 6 C      6   c

If your original row.names are important, you can also add them back in with:

cbind(rn = rownames(A), A[1], stack(A[-1]), row.names = NULL)
#   rn a values ind
# 1  A A      1   b
# 2  B B      2   b
# 3  C C      3   b
# 4  A A      4   c
# 5  B B      5   c
# 6  C C      6   c

Warning message when using cbind() in R

Hey, I think you should try row.names = NULL in your cbind. Here is an example based on your code:

tab <- as.data.frame(cbind(data_R[i, 1:2], data_D[, 1:2], row.names = NULL))

R - Issues while calling a user-defined function

The following works:

get_P <- function(df, data_sub) {
    data_sub <- data_sub[complete.cases(data_sub), ]
    data.frame(
        Scorecard = data_sub$Key_P,
        Results = df[data_sub$Key_P, ncol(df)])
}
get_P(df, data_sub)
#   Scorecard Results
# 1         2    1837
# 2         3     315
# 3         4     621

get_A <- function(df, data_sub) {
    data_sub <- data_sub[complete.cases(data_sub), ]
    data.frame(
        Scorecard = data_sub$Key_A,
        Results = as.numeric(df[nrow(df), data_sub$Key_A + 1]))
}
get_A(df, data_sub)
#   Scorecard Results
# 1         1      12
# 2         3       8
# 3         5      11

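To avoid the warning, we need to strip the rownames with as.numeric in get_A. df[nrow(df), ...] returns a one-row data.frame that still carries its row name, and data.frame() would have to recycle it to match Scorecard; as.numeric turns it into a plain vector, so there are no row names left to discard.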
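A small sketch of the difference, on made-up data (the scores data.frame and its values are hypothetical and not from the question): the one-row selection keeps its row name, while the as.numeric version is just a vector.

```r
# Hypothetical toy data, only for illustration
scores <- data.frame(x = c(1, 2), y = c(3, 4), z = c(5, 6),
                     row.names = c("first", "last"))

last_row  <- scores[nrow(scores), ]   # one-row data.frame, row name "last"
as_vector <- as.numeric(last_row)     # plain numeric vector, no row names

# Built from the vector: two columns, no warning
clean <- data.frame(Scorecard = 1:3, Results = as_vector)

# Built from the one-row data.frame: the row is recycled and the
# row name is discarded with a warning, which we record here
warned <- FALSE
messy <- withCallingHandlers(
  data.frame(Scorecard = 1:3, Results = last_row),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  })
```

Note that the second call also produces one column per selected variable rather than a single Results column, which is another reason to flatten the row first.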

Another tip: It's better coding practice to make get_P and get_A functions of both df and data_sub, so they don't rely on global variables.


Sample data

df <- read.table(text =
"   V1 V2 V3 V4 V5 V6   V7
1   A  29 27  0 14 21  163
2   W  70 40 93 63 44 1837
3   E  11  1 11 49 17  315
4   S  20 59 36 23 14  621
5   C  12  7 48 24 25  706
6   B  14  8 78 27 17  375
7   G  12  7  8  4  4  257
8   T   0  0  0  0  0    0
9   N  32  6  9 14 17  264
10  R  28 46 49 55 38  608
11  O  12  2  8 12 11  450", header = TRUE, row.names = 1)

data_sub <- read.table(text =
"  Key_P Key_A
1      2     1
2      3     3
3      4     5
4     NA    NA", header = TRUE, row.names = 1)

omu_anova error: row names supplied are of the wrong length

Here is my hypothesis: Your error results from one or more of the 15 sample columns in the count table (not Metabolite or KEGG) being incorrectly rendered as something other than "numeric" variables.

Looking at the source code for omu_anova, the "Metabolite" column is first assigned to the rownames of the data frame and then the following line selects only the columns in the count data that are numeric.

  data_Int <- count_data[sapply(count_data, function(x) is.numeric(x))]

So, here non-numeric columns are dropped, and having too few columns could explain the error you see because the 'Metabolite' rownames are lost when the transposed count data are later joined to the metadata.

To diagnose this, try str(metab9) and eyeball whether or not all the sample columns are numeric. Also check that your 'KEGG' column is not numeric (it should be a factor or character, I suspect). If KEGG is numeric, it would give the count data too many columns.
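If eyeballing str() output is awkward, the same check can be done programmatically. This is only a sketch on a made-up stand-in for the count table (the counts data.frame, its column names S1 and S2, and its values are assumptions, not your data); it applies the same is.numeric test quoted above and lists the columns that would be dropped.

```r
# Hypothetical stand-in for the count table
counts <- data.frame(
  Metabolite = c("m1", "m2"),
  KEGG       = c("C00001", "C00002"),
  S1         = c(1.5, 2.5),
  S2         = c("3.1", "n/a"),   # a sample column read in as character
  stringsAsFactors = FALSE)

# Same test as in the quoted line: which columns are numeric?
is_num <- sapply(counts, is.numeric)

# Columns that are NOT numeric and would therefore be dropped
dropped <- names(counts)[!is_num]
```

Here dropped contains Metabolite and KEGG, as expected, plus S2, which is the kind of sample column you would want to track down and convert.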

R cbind based on row names

We can use merge

 merge(as.data.frame(x), as.data.frame(y), by='row.names', all=TRUE)
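For illustration, here is how that behaves on two small made-up matrices (x and y below are my own toy inputs, since the question's data isn't shown). With all = TRUE, rows present in only one input are kept and filled with NA, and the row names come back as a column called Row.names.

```r
# Toy inputs: they share row "b" only
x <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("x1", "x2")))
y <- matrix(5:8, nrow = 2, dimnames = list(c("b", "c"), c("y1", "y2")))

merged <- merge(as.data.frame(x), as.data.frame(y),
                by = "row.names", all = TRUE)

# Optionally move the Row.names column back into the row names
result <- merged
rownames(result) <- as.character(result$Row.names)
result$Row.names <- NULL
```

Use all = FALSE (the default) instead if you only want the rows that appear in both inputs.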

