round_any equivalent for dplyr?
ggplot2::cut_width, as pointed out in one of the comments, does not even return a numeric vector but a factor, so it is no real substitute.
Since round, and not floor, is the default rounding method, a custom replacement (until a dplyr solution may arrive) would be:
round_any <- function(x, accuracy, f = round) { f(x / accuracy) * accuracy }
This is the implementation the plyr package contains, so the function can also be used directly from plyr. However, be careful when loading plyr into a workspace alongside dplyr, as it causes naming conflicts.
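To illustrate, a few calls to the drop-in round_any defined above with different rounding functions:

```r
# Drop-in replacement for plyr::round_any, as defined above
round_any <- function(x, accuracy, f = round) { f(x / accuracy) * accuracy }

round_any(134, 10)           # nearest multiple of 10 -> 130
round_any(134, 10, ceiling)  # round up               -> 140
round_any(134, 10, floor)    # round down             -> 130
```

Passing the rounding function as the f argument is what makes the helper cover round, floor, and ceiling in one definition.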
R: Is there a good replacement for plyr::rbind.fill in dplyr?
Yes. dplyr::bind_rows
Credit goes to the commenter.
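A minimal sketch of the behavior that made rbind.fill useful: like plyr::rbind.fill, dplyr::bind_rows fills columns missing from one of the data frames with NA (the data frames a and b here are made up for illustration):

```r
library(dplyr)

a <- data.frame(x = 1, y = 2)
b <- data.frame(x = 3, z = 4)  # has z instead of y

bind_rows(a, b)
#   x  y  z
# 1 1  2 NA
# 2 3 NA  4
```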
Equivalent of max and over(partition by) in R for flattening
Using data.table:
library(data.table)
setDT(df)[, recorded_dt := NULL][
  , lapply(.SD, \(x) sort(x, na.last = TRUE, decreasing = TRUE)[1])
  , by = .(ID, time0)]
## ID time0 day0 day1 day4 day30
## 1: 1 2009-01-01 A <NA> B D
## 2: 2 2005-02-02 <NA> B <NA> <NA>
The internal variable .SD represents a subset of the data.table including all columns except those in the by=... clause. This is why we have to remove the column recorded_dt first.
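If you would rather not drop recorded_dt destructively, data.table's .SDcols argument can exclude it from .SD instead. A sketch, with a small made-up data frame standing in for the question's df:

```r
library(data.table)

# Made-up data shaped like the question's df
df <- data.frame(ID          = c(1, 1, 2),
                 time0       = c("2009-01-01", "2009-01-01", "2005-02-02"),
                 recorded_dt = c("r1", "r2", "r3"),
                 day0        = c("A", NA, NA),
                 day1        = c(NA, "B", "B"))

# .SDcols = !"recorded_dt" keeps every column in .SD except recorded_dt,
# so the original table is left untouched
res <- setDT(df)[
  , lapply(.SD, \(x) sort(x, na.last = TRUE, decreasing = TRUE)[1])
  , by = .(ID, time0)
  , .SDcols = !"recorded_dt"]
res
```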
r-dplyr equivalent of sql query returning monthly utilisation of contracts
lubridate throws errors when working with certain date-time formats. The query works if you remove as.Date and %m+%:
df %>%
  filter(start < "2016-03-01" &
           start + months(duration) >= "2016-03-01")
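For background on the two operators involved: adding a lubridate period with plain + returns NA when the target day does not exist, while %m+% rolls back to the last valid day of the month. A standalone sketch:

```r
library(lubridate)

as.Date("2016-01-15") + months(2)     # "2016-03-15" -- plain + works mid-month
as.Date("2016-01-31") + months(1)     # NA -- Feb 31 doesn't exist
as.Date("2016-01-31") %m+% months(1)  # "2016-02-29" -- rolls back (2016 is a leap year)
```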
Creating a new column based on multiple conditional statements in r
You can use plyr's round_any function:
df$value1 <- plyr::round_any(df$value, 0.1, ceiling)
df
# value value1
#1 0.59465953 0.6
#2 0.10581043 0.2
#3 0.48806113 0.5
#4 0.04106798 0.1
#5 0.24026985 0.3
#6 0.08468660 0.1
#7 0.11598592 0.2
#8 0.50481103 0.6
#9 0.43194839 0.5
#10 0.16032725 0.2
#11 0.29700099 0.3
#12 0.04986834 0.1
#13 0.21233054 0.3
#14 0.58152528 0.6
#...
How to round up to the nearest 10 (or 100 or X)?
If you just want to round up to the nearest power of 10, then just define:
roundUp <- function(x) 10^ceiling(log10(x))
This actually also works when x is a vector:
> roundUp(c(0.0023, 3.99, 10, 1003))
[1] 1e-02 1e+01 1e+01 1e+04
...but if you want to round to a "nice" number, you first need to define what a "nice" number is. The following lets us define "nice" as a vector of base values from 1 to 10. The default is set to the even numbers plus 5.
roundUpNice <- function(x, nice = c(1, 2, 4, 5, 6, 8, 10)) {
  if (length(x) != 1) stop("'x' must be of length 1")
  10^floor(log10(x)) * nice[[which(x <= 10^floor(log10(x)) * nice)[[1]]]]
}
The above doesn't work when x is a vector - too late in the evening right now :)
> roundUpNice(0.0322)
[1] 0.04
> roundUpNice(3.22)
[1] 4
> roundUpNice(32.2)
[1] 40
> roundUpNice(42.2)
[1] 50
> roundUpNice(422.2)
[1] 500
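For what it's worth, one straightforward way to make it handle vectors is to wrap the scalar version with Vectorize (a sketch; roundUpNiceVec is a name introduced here):

```r
roundUpNice <- function(x, nice = c(1, 2, 4, 5, 6, 8, 10)) {
  if (length(x) != 1) stop("'x' must be of length 1")
  10^floor(log10(x)) * nice[[which(x <= 10^floor(log10(x)) * nice)[[1]]]]
}

# Vectorize() builds a mapply-based wrapper around the scalar function;
# restricting vectorize.args to "x" keeps 'nice' as a single vector argument
roundUpNiceVec <- Vectorize(roundUpNice, vectorize.args = "x")

roundUpNiceVec(c(0.0322, 3.22, 32.2))  # 0.04 4 40
```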
[[EDIT]]
If the question is how to round to a specified nearest value (like 10 or 100), then James's answer seems most appropriate. My version lets you take any value and automatically round it to a reasonably "nice" value. Some other good choices for the "nice" vector above are: 1:10, c(1, 5, 10), seq(1, 10, 0.1)
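For that "round up to a specified nearest value" case, the usual one-liner follows the same pattern as round_any (roundUpTo is a name introduced here for illustration):

```r
# Round x up to the nearest multiple of 'to'
roundUpTo <- function(x, to) ceiling(x / to) * to

roundUpTo(32.2, 10)   # 40
roundUpTo(32.2, 100)  # 100
roundUpTo(1234, 50)   # 1250
```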
If you have a range of values in your plot, for example [3996.225, 40001.893], then the automatic way should take into account both the size of the range and the magnitude of the numbers. And, as noted by Hadley, the pretty() function might be what you want.
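For the range above, pretty() computes a sequence of equally spaced "round" values covering the data, which is exactly the axis-break use case:

```r
# Candidate axis breaks covering the example range [3996.225, 40001.893]
pretty(c(3996.225, 40001.893))
```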
How to do range grouping on a column using dplyr?
We can use cut to do the grouping. We create the 'gr' column within group_by, use summarise to count the elements in each group (n()), and order the output (arrange) based on 'gr'.
library(dplyr)
DT %>%
  group_by(gr = cut(B, breaks = seq(0, 1, by = 0.05))) %>%
  summarise(n = n()) %>%
  arrange(as.numeric(gr))
As the initial object is a data.table, this can also be done using data.table methods (including @Frank's suggestion to use keyby):
library(data.table)
DT[,.N , keyby = .(gr=cut(B, breaks=seq(0, 1, by=0.05)))]
EDIT:
Based on the update in the OP's post, we could subtract a small number from the seq:
lvls <- levels(cut(DT$B, seq(0, 1, by = 0.05)))
DT %>%
  group_by(gr = cut(B, breaks = seq(0, 1, by = 0.05) -
                      .Machine$double.eps, right = FALSE, labels = lvls)) %>%
  summarise(n = n()) %>%
  arrange(as.numeric(gr))
# gr n
#1 (0,0.05] 2
#2 (0.05,0.1] 2
#3 (0.1,0.15] 3
#4 (0.15,0.2] 2
#5 (0.7,0.75] 1
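The same tally works in base R alone, since cut and table need no packages. A sketch with made-up values standing in for DT$B, chosen to reproduce the counts shown above:

```r
# Made-up values standing in for DT$B
B <- c(0.01, 0.03, 0.07, 0.09, 0.12, 0.13, 0.14, 0.16, 0.19, 0.72)

# cut bins each value into an interval; table counts per bin
table(cut(B, breaks = seq(0, 1, by = 0.05)))
```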