Transposing a dataframe maintaining the first column as heading
Here is one way:
tmydf = setNames(data.frame(t(mydf[,-1])), mydf[,1])
What is the best way to transpose a data.frame in R and to set one of the columns to be the header for the new transposed table?
You could do it in two steps:
# Transpose every column except the first
fooData.T <- t(fooData[,2:ncol(fooData)])
# Set the column headings from the first column in the original table
colnames(fooData.T) <- fooData[,1]
Note that the result is a matrix, not a data frame, because t() coerces its input to a matrix; wrap it in as.data.frame() if you need a data frame back. t() has no argument for naming the output, so the headings have to be assigned in a separate step.
Properly transpose Pandas dataframe and access first column
Set fruit as the index; selection should be much easier:
temp = df.set_index('fruit')
temp.loc['weight', 'apple']
50
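For a self-contained illustration, here is a minimal sketch. The original df is not shown, so the data below is made up: it assumes the fruit column holds the row labels (such as weight) and every other column is one fruit.

```python
import pandas as pd

# Hypothetical data: the layout is an assumption, since the original df is not shown.
# The 'fruit' column holds the row labels; every other column is one fruit.
df = pd.DataFrame({
    "fruit": ["weight", "price"],
    "apple": [50, 3],
    "banana": [120, 1],
})

temp = df.set_index("fruit")

# Label-based lookup: row 'weight', column 'apple'
apple_weight = temp.loc["weight", "apple"]
print(apple_weight)

# Transposing afterwards turns each fruit into a row
by_fruit = temp.T
print(by_fruit.loc["apple", "weight"])
```

With the labels in the index, .T swaps rows and columns without dragging a stray label column along.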
Transposing a dataframe and using the first column as an index
You can try this:
library(tidyverse)
#Data
df <- structure(list(filename = c("file1", "file1", "file1", "file2",
"file2", "file2"), wavelength = c("w1", "w2", "w3", "w1", "w2",
"w3"), A = c(NA, NA, NA, 3L, 4L, 6L), B = c(NA, NA, NA, 4L, 8L,
1L), C = c(1L, 3L, 6L, NA, NA, NA), D = c(2L, 2L, 2L, NA, NA,
NA)), class = "data.frame", row.names = c(NA, -6L))
Code:
df %>% pivot_longer(cols = -c(1,2)) %>% filter(!is.na(value)) %>%
pivot_wider(names_from = wavelength,values_from = value)
Output:
# A tibble: 4 x 5
  filename name     w1    w2    w3
  <chr>    <chr> <int> <int> <int>
1 file1    C         1     3     6
2 file1    D         2     2     2
3 file2    A         3     4     6
4 file2    B         4     8     1
Transpose Pandas DataFrame and change the column headers to a list
You need set_index + T:
df = df.set_index('col1').T
print (df)
col1 a b
name1 10.0 72.0
name2 0.2 -0.1
df = df.set_index('col1').T.rename_axis('Variable').rename_axis(None, axis=1)
print (df)
a b
Variable
name1 10.0 72.0
name2 0.2 -0.1
If you need a column from the index:
df = df.set_index('col1').T.rename_axis('Variable').rename_axis(None, axis=1).reset_index()
print (df)
Variable a b
0 name1 10.0 72.0
1 name2 0.2 -0.1
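The chain can be tried end to end with a small frame. This is only a sketch: the input below is hypothetical, reconstructed to match the printed output, and each step starts from the original, untransposed df.

```python
import pandas as pd

# Hypothetical input, chosen to match the printed output above
df = pd.DataFrame({
    "col1": ["a", "b"],
    "name1": [10.0, 72.0],
    "name2": [0.2, -0.1],
})

out = (
    df.set_index("col1")          # 'col1' values become the future column headers
      .T                          # swap rows and columns
      .rename_axis("Variable")    # name the new index
      .rename_axis(None, axis=1)  # drop the leftover 'col1' label on the columns
      .reset_index()              # move the index into a regular column
)
print(out)
```

Passing axis=1 by keyword matters on recent pandas versions, where the positional form rename_axis(None, 1) is no longer accepted.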
How to preserve header as a column and have an index after transposing dataframe in pandas?
Instead of your solution, transpose first and then drop the Unnamed index values by resetting the index to its default:
df_1 = pd.read_csv(file)
print (df_1)
Unnamed 1 Unnamed 4 Unnamed 7
0 X Y Z
1 P P P
2 P P P
3 P P P
df_1 = df_1.T.reset_index(drop=True)
print (df_1)
0 1 2 3
0 X P P P
1 Y P P P
2 Z P P P
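A runnable sketch of the same idea follows. The CSV text is a hypothetical stand-in for file, which is not shown; the second variant, which keeps the old headers as an ordinary column, is an addition for the case where the header should be preserved.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for `file`; the real file is not shown
csv_text = "Unnamed 1,Unnamed 4,Unnamed 7\nX,Y,Z\nP,P,P\nP,P,P\nP,P,P\n"
df_1 = pd.read_csv(io.StringIO(csv_text))

# Transpose, then discard the 'Unnamed ...' labels that became the index
df_t = df_1.T.reset_index(drop=True)
print(df_t)

# Variant: keep the old headers as a regular column (named 'index' by default)
df_keep = df_1.T.reset_index()
print(df_keep)
```

reset_index(drop=True) throws the old labels away, while plain reset_index() moves them into a column and still gives a default 0..n-1 index.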
Maintaining row and column names while transposing a df
We can use row_to_names from janitor:
ga_sessions_combined %>%
column_to_rownames(var = "Metric") %>% # because when transposing the rownames will become the column names
t() %>% # transpose
as.data.frame() %>% # turn back into a df
rownames_to_column(var ="ym") %>% # now move the row names (the original column names) into a column
# do my transformations
mutate_at(vars(Users, `Engaged Users`, Transactions), scales::comma_format()) %>%
mutate_at(vars(ConversionRate, `Bounce Rate`), scales::percent_format()) %>%
mutate(Revenue = scales::dollar(Revenue)) %>%
# try to get back into original layout
t() %>%
as.data.frame(stringsAsFactors = FALSE) %>%
rownames_to_column(var = "Metric") %>%
janitor::row_to_names(row_number = 1) %>%
rename(Metric = ym)
# Metric ym_201904 ym_201905 ym_201906 ym_201907 ym_201908 ym_201909 ym_201910 ym_201911 ym_201912 ym_202001
#2 Users 157,664 199,340 169,971 161,346 132,702 164,160 217,227 970,864 1,180,689 216,816
#3 Engaged Users 79,295 103,879 90,557 88,059 70,701 96,124 118,041 604,606 671,162 109,637
#4 Transactions 5,764 5,744 4,899 4,223 3,106 3,841 4,448 27,713 59,536 5,057
#5 Revenue $609,173 $673,063 $566,247 $580,409 $424,808 $724,959 $798,116 $4,859,789 $9,447,240 $738,079
#6 ConversionRate 3.65588% 2.88151% 2.88226% 2.61736% 2.34058% 2.33979% 2.04763% 2.85447% 5.04248% 2.33239%
#7 Bounce Rate 49.70634% 47.88853% 46.72209% 45.42226% 46.72198% 41.44493% 45.66007% 37.72495% 43.15506% 49.43316%
#8 $/User 3.863741 3.376459 3.331435 3.597293 3.201216 4.416173 3.674112 5.005633 8.001464 3.404172
# ym_202002 ym_202003
#2 204,113 324,266
#3 145,975 229,438
#4 4,847 8,341
#5 $720,506 $1,196,235
#6 2.37467% 2.57227%
#7 28.48324% 29.24389%
#8 3.529939 3.689053
How to transpose the table by keeping the header names
If you want to create "t#" named columns using spread from the tidyr package, note that it does so in alphabetical order and doesn't deal well with duplicated column names.
Your example has two rows named "t1" and two rows named "t2", so that needs to be handled.
The names are in alphabetical order in this example, but assuming that's not always going to be the case, you can prefix the names with a sequence of numbers in running order.
Something like the following could be modified to work:
qt <- q %>%
# make row names unique & sorted correctly in increasing order
# by appending numbers in running order
mutate(name = paste(seq(1, n()),
name,
sep = "_")) %>%
gather(row, value, -name) %>%
spread(name, value)
# strip away the appended numbers from the newly created column names
names(qt) <- sapply(strsplit(names(qt), "_"), function(x){x[2]})
> qt
# A tibble: 3 x 6
  `NA`     t1    t1    t2    t2    t4
* <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 g1        0     2     1     3     4
2 g2        1     2     2     3     4
3 g3        2     2     3     3     4
Alternatively, if you don't need a tidyverse solution:
# transpose the data frame without the name column
qt <- t(q[-1])
# add name column back as a dimname attribute
attr(qt, "dimnames")[[2]] <- unname(unlist(q[1]))
# edit: alternative to above
colnames(qt) <- q[1][[1]]
# convert result to data frame
qt <- as.data.frame(qt)
> qt
t1 t1 t2 t2 t4
g1 0 2 1 3 4
g2 1 2 2 3 4
g3 2 2 3 3 4
Whichever it is, I hope this is for presentation rather than analysis, because it's really hard to work with duplicated column names in tidyverse.