Specifying Column Types when Importing xlsx Data to R with Package readxl
New solution since readxl version 1.x:
The solution in the currently preferred answer no longer works with readxl versions newer than 0.1.0, because the package-internal function readxl:::xlsx_col_types that it relies on no longer exists.
The new solution is to use the guess_max parameter to increase the number of rows used to "guess" the appropriate data type of each column:
read_excel("My_Excel_file.xlsx", sheet = 1, guess_max = 1048576)
The value 1,048,576 is the maximum number of rows per worksheet that Excel currently supports; see the Excel specs: https://support.office.com/en-us/article/Excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
PS: If you are worried about the performance cost of using all rows to guess the data types: read_excel seems to read the file only once and then does the guessing in memory, so the penalty is very small compared to the work saved.
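If the column types are already known, another option is to pass them explicitly through the col_types argument of read_excel instead of relying on guessing. The following is only a sketch under assumptions: it uses the example workbook that ships with readxl rather than your own file, and the variable names iris_guessed and iris_typed are made up for illustration.

```r
library(readxl)

# Example workbook bundled with readxl (stands in for "My_Excel_file.xlsx")
path <- readxl_example("datasets.xlsx")

# Option 1: let readxl guess, but inspect every possible row
iris_guessed <- read_excel(path, sheet = "iris", guess_max = 1048576)

# Option 2: state the type of each column up front, one entry per column
iris_typed <- read_excel(
  path,
  sheet = "iris",
  col_types = c("numeric", "numeric", "numeric", "numeric", "text")
)

# Check the resulting column classes
sapply(iris_typed, class)
```

Explicit types skip the guessing step entirely, at the cost of having to list one type per column (or a single type that is recycled for all columns).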
Readxl: handling a column name with a space
Wrap the column name in backticks. You could run:
summary(subset(Data, `EMPLOYEE ATTITUDE` == "A","HOURLY RATE"))
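To try this without the original spreadsheet, here is a small self-contained sketch. The data frame below is invented purely for illustration (the real Data would come from read_excel), and it shows the backtick form next to the equivalent [[ ]] form that takes the name as a plain string.

```r
# Hypothetical stand-in for the imported sheet; check.names = FALSE keeps
# the spaces in the column names, as readxl does
Data <- data.frame(
  `EMPLOYEE ATTITUDE` = c("A", "B", "A"),
  `HOURLY RATE` = c(10, 12, 14),
  check.names = FALSE
)

# Backticks let a non-syntactic name be used as a symbol inside subset()
rates_subset <- subset(Data, `EMPLOYEE ATTITUDE` == "A", "HOURLY RATE")

# The same selection with [[ ]] and [ , ], passing the names as strings
rates_index <- Data[Data[["EMPLOYEE ATTITUDE"]] == "A", "HOURLY RATE"]

summary(rates_index)
```

Note that subset() returns a one-column data frame here, while the [ , ] form drops to a plain numeric vector.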
readxl, selected worksheets in single .xlsx-workbook
Here's the solution I worked out. Please feel free to improve or criticize it.
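library(readxl)
library(purrr)  # provides map_df(), set_names() and the %>% pipe

sh_to_impt <- c('iris', 'mtcars')
path <- readxl_example("datasets.xlsx")

# List all sheets, name the vector by itself, keep only the wanted sheets,
# then read each one and stack the results, storing the sheet name in "sheet"
path %>%
  excel_sheets() %>%
  set_names() %>%
  .[sh_to_impt] %>%
  map_df(read_excel,
         path = path,
         .id = "sheet")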
# A tibble: 182 x 17
   sheet Sepal.Length Sepal.Width Petal.Length Petal.Width Species   mpg
   <chr>        <dbl>       <dbl>        <dbl>       <dbl> <chr>   <dbl>
 1 iris           5.1         3.5          1.4         0.2 setosa     NA
 2 iris           4.9         3            1.4         0.2 setosa     NA
 3 iris           4.7         3.2          1.3         0.2 setosa     NA
 4 iris           4.6         3.1          1.5         0.2 setosa     NA
 5 iris           5           3.6          1.4         0.2 setosa     NA
 6 iris           5.4         3.9          1.7         0.4 setosa     NA
 7 iris           4.6         3.4          1.4         0.3 setosa     NA
 8 iris           5           3.4          1.5         0.2 setosa     NA
 9 iris           4.4         2.9          1.4         0.2 setosa     NA
10 iris           4.9         3.1          1.5         0.1 setosa     NA
# ... with 172 more rows, and 10 more variables: cyl <dbl>, disp <dbl>,
#   hp <dbl>, drat <dbl>, wt <dbl>, qsec <dbl>, vs <dbl>, am <dbl>,
#   gear <dbl>, carb <dbl>
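Because the two sheets share no columns, stacking them fills half of every row with NA. If that is not what you want, one possible variation is to keep each sheet as its own tibble in a named list. This is only a sketch of that idea, not part of the solution above; the name sheets is made up for illustration.

```r
library(readxl)
library(purrr)

sh_to_impt <- c("iris", "mtcars")
path <- readxl_example("datasets.xlsx")

# Name the vector by itself so the resulting list is named by sheet,
# then read each requested sheet into its own tibble
sheets <- sh_to_impt %>%
  set_names() %>%
  map(~ read_excel(path, sheet = .x))

names(sheets)
dim(sheets$mtcars)
```

Each element can then be used on its own, for example sheets$iris, without the padding columns from the other sheet.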