Automatic rounding in dplyr::summarise() function
This is to do with the way tibbles are printed. The actual numbers in the data frame still have all their decimal places; they are just not displayed when the tibble is printed.
You can use as.data.frame() or print.data.frame(), which will show you more digits (depending on your getOption("digits")). You can also change the tibble settings, but my understanding is that these are always based on significant figures rather than decimal places (so your values >100 will show fewer decimal places than values <100). See https://tibble.tidyverse.org/reference/formatting.html for tibble printing options.
So
df %>% group_by(group) %>% summarise(mL = round(mean(large),3), mS = round(mean(small),3)) %>%
as.data.frame()
will give you values to 3 decimal places, and
df %>% group_by(group) %>% summarise(mL = mean(large), mS = mean(small)) %>%
as.data.frame()
will show up to getOption("digits") significant digits (the default is 7).
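A quick way to confirm that only the display is affected is to compare the printed output with the stored value. This is just a sketch on a made-up one-column tibble (the name tbl and its values are assumptions, not the data from the question):

```r
library(tibble)

# Hypothetical data: one value above 100 and one below
tbl <- tibble(x = c(123.456789, 1.23456789))

print(tbl)                 # tibble printing abbreviates to a few significant digits
print(as.data.frame(tbl))  # base printing follows getOption("digits")

# The stored value is untouched by either print method
tbl$x[1] == 123.456789
```

The last comparison should be TRUE regardless of how the tibble looked on screen.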
Also note that if you want to apply the same operation to multiple columns in summarise(), summarise_at() can be very helpful, e.g.
df %>% group_by(group) %>% summarise_at(c("large","small"), ~round(mean(.),3)) %>%
print.data.frame()
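In dplyr 1.0.0 and later, the scoped verbs such as summarise_at() are superseded by across(), so the same idea can probably be written as below. The data frame here is a hypothetical stand-in for the question's df (column names taken from the answer above, values invented):

```r
library(dplyr)

# Hypothetical stand-in for the question's data
df <- data.frame(
  group = c("a", "a", "b", "b"),
  large = c(100.1234, 200.5678, 300.1111, 400.2222),
  small = c(0.1234, 0.5678, 0.1111, 0.2222)
)

res <- df %>%
  group_by(group) %>%
  # apply the same rounded mean to both columns
  summarise(across(c(large, small), ~ round(mean(.x), 3))) %>%
  as.data.frame()

print(res)
```

Converting with as.data.frame() at the end keeps the printing under the control of getOption("digits") rather than the tibble settings.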
Decimal places not showing when using dplyr summarize function in R
The issue is with a setting in your environment that controls the number of digits displayed when printing. It can be changed by running options(digits = 5), or any higher number (up to 22), in the console.
From ?options
digits:
controls the number of significant (see signif) digits to print when printing numeric values. It is a suggestion only. Valid values are 1...22 with default 7.
After doing that if you run
library(dplyr)
NLeast_starters %>% summarize(mean_hits = round(mean(H),5))
# mean_hits
#1 123.75
you'll get the expected display of decimal places.
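If you would rather not change the global option, format() and print() also accept a digits argument that applies to a single call. A minimal sketch with an arbitrary number (x is just an illustrative value):

```r
x <- 123.456789

format(x, digits = 7)    # same precision as the default option
format(x, digits = 10)   # more significant digits, for this call only

print(x, digits = 10)    # print() takes the same argument
```

As with the option itself, digits here means significant digits, not decimal places.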
r - rounding in summarise()
For the tibble package you need to modify the option pillar.sigfig.
pillar.sigfig: The number of significant digits that will be printed and highlighted, default: 3
library(tibble)
options(pillar.sigfig = 10)
set.seed(1)
tibble(a = rnorm(3), b = rexp(3))
# A tibble: 3 x 2
# a b
# <dbl> <dbl>
#1 -0.6264538107 0.4360686258
#2 0.1836433242 2.894968537
#3 -0.8356286124 1.229562053
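Since options() changes the setting for the rest of the session, it can be worth restoring the previous value afterwards. A small sketch of that pattern (the value 6 is an arbitrary choice for illustration):

```r
library(tibble)

# options() returns the previous values, so they can be restored afterwards
old <- options(pillar.sigfig = 6)

set.seed(1)
print(tibble(a = rnorm(3), b = rexp(3)))

options(old)  # put the earlier setting back
```

This keeps the wider display local to the lines between the two options() calls.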
dplyr summarise character time variable
I can think of using lubridate::hms to convert those strings to numbers, but I haven't found the right way to format(.., format="%H:%M:%S") them back again, so here are two functions I have used for various related purposes:
## simply convert "01:23:45" to 5025 (seconds) and "00:17:14.842" to 1034.842
time2num <- function(x) {
  vapply(strsplit(x, ':'), function(y) sum(as.numeric(y) * c(60*60, 60, 1)),
         numeric(1), USE.NAMES=FALSE)
}
## and back again
num2time <- function(x, digits.secs = getOption("digits.secs", 3)) {
  hr <- as.integer(x %/% 3600)
  min <- as.integer((x - 3600*hr) %/% 60)
  sec <- (x - 3600*hr - 60*min)
  if (anyNA(digits.secs)) {
    # a mostly-arbitrary determination of significant digits,
    # motivated by @Roland https://stackoverflow.com/a/27767973
    for (digits.secs in 1:6) {
      if (any(abs(signif(sec, digits.secs) - sec) > (10^(-3 - digits.secs)))) next
      digits.secs <- digits.secs - 1L
      break
    }
  }
  sec <- sprintf(paste0("%02.", digits.secs[[1]], "f"), sec)
  sec <- paste0(ifelse(grepl("^[0-9]\\.", sec), "0", ""), sec)
  out <- sprintf("%02i:%02i:%s", hr, min, sec)
  out[is.na(x)] <- NA_character_
  out
}
With these,
library(dplyr)
dat %>%
  group_by(ID) %>%
  mutate(Freq = num2time(sum(time2num(Time)), digits.secs = 0)) %>%
  ungroup()
# # A tibble: 6 x 3
# ID Time Freq
# <int> <chr> <chr>
# 1 456 0:00:01 00:02:06
# 2 456 0:02:05 00:02:06
# 3 123 0:00:14 00:00:14
# 4 756 0:03:47 00:05:44
# 5 756 0:01:56 00:05:44
# 6 756 0:00:01 00:05:44
Data
dat <- structure(list(ID = c(456L, 456L, 123L, 756L, 756L, 756L), Time = c("0:00:01", "0:02:05", "0:00:14", "0:03:47", "0:01:56", "0:00:01")), class = "data.frame", row.names = c(NA, -6L))
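When the totals are whole seconds, the arithmetic behind the conversion back to a string reduces to integer division and remainders. A minimal sketch, using a hypothetical helper name secs_to_hms that is not part of the answer above:

```r
# Hypothetical helper: whole seconds -> "HH:MM:SS"
secs_to_hms <- function(s) {
  s <- as.integer(round(s))
  sprintf("%02d:%02d:%02d",
          s %/% 3600L,            # whole hours
          (s %% 3600L) %/% 60L,   # remaining whole minutes
          s %% 60L)               # remaining seconds
}

secs_to_hms(c(126, 14, 344))
# [1] "00:02:06" "00:00:14" "00:05:44"
```

The three inputs are the per-ID totals from the example data, and the results match the Freq column shown above. Unlike num2time, this sketch does not handle fractional seconds or NA.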
Correcting summary in R with appropriate # of digits of precision
The default for summary.data.frame is not digits=3, but rather:
... max(3, getOption("digits") - 3) # set in the argument list
getOption("digits") # the default setting
[1] 7
options(digits=10)
summary(df)
V1 V2 V3
Min. :-3.70323584 Min. : 11.0 Min. :6.790622e-05
1st Qu.:-0.66847105 1st Qu.:122798.5 1st Qu.:2.497735e-01
Median : 0.00977831 Median :247971.0 Median :5.013797e-01
Mean : 0.01044752 Mean :248776.4 Mean :5.001182e-01
3rd Qu.: 0.68878422 3rd Qu.:374031.0 3rd Qu.:7.502424e-01
Max. : 3.56810079 Max. :499931.0 Max. :9.998686e-01
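The digits argument can also be passed to summary() directly, which avoids touching the global option. A rough sketch on a made-up data frame (not the df from the question):

```r
# Hypothetical data frame, values chosen to have many decimal places
df <- data.frame(V1 = c(0.123456789, 0.987654321))

summary(df)                    # default precision
s <- summary(df, digits = 10)  # more significant digits, for this call only
print(s)
```

With digits = 10 the minimum should be printed in full rather than shortened.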
fix r sum() auto remove the small digital .05
I believe this is just a printout issue; if you want to increase the number of significant digits in the printout, you could try:
sprintf("%.2f",sum(22068.00, 144501.00, 71153.00, 26193.05, 10395.00 , 80619.00))
# [1] "354929.05"
To change the number of decimal places, just change the number in the format string (the first argument), e.g.:
sprintf("%.10f",sum(22068.00, 144501.00, 71153.00, 26193.05, 10395.00 , 80619.00))
#[1] "354929.0500000000"
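sprintf() returns a character string; if you only want to see the value, print() and format() can do the same job without changing what is stored. A short sketch using the numbers from above (the variable name total is mine):

```r
total <- sum(22068.00, 144501.00, 71153.00, 26193.05, 10395.00, 80619.00)

print(total)                # default: 7 significant digits, so the .05 is hidden
print(total, digits = 10)   # enough significant digits to show the cents
format(total, nsmall = 2)   # at least two decimal places, returned as a string
```

In each case total itself still holds the full value; only the display differs.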