How to print a row that contains specific values of a column?
A solution with Base R:
Em_sub = aggregate(. ~ year, data = Em, '[', 1)
or with dplyr
:
library(dplyr)
Em_sub = Em %>%
group_by(year) %>%
slice(1)
Results:
year x y
1 1950 5 3
2 1951 6 7
# A tibble: 2 x 3
# Groups: year [2]
year x y
<int> <int> <int>
1 1950 5 3
2 1951 6 7
Data:
Em = read.table(text = "year x y
1950 5 3
1950 4 3
1950 2 4
1950 1 5
1951 6 7
1951 5 6
1951 1 4
1951 0 3", header = TRUE)
aggregate function in R, multiple conditions
I think the easiest way is to create a new column only containing the word "Vormittag", "Nachmittag" or "Abend" and then use that as a grouping variable.
For instance:
# Set random seed for reproducibility
set.seed(12345)
# Make some data
my.data <- data.frame(ID_Oeffnungszeit = sample(1:10, 100, replace = TRUE),
ID_Einrichtung = sample(3000001:3000020, 100, replace = TRUE),
Anzahl_Std = rnorm(100),
Bez_Oeffnungszeit = sample(c("Montag Vormittag",
"Montag Nachmittag",
"Dienstag Vormittag",
"Montag Abend"), 100,
replace = T))
# Create a new column containing "Vormittag", "Nachmittag" or "Abend"
my.data$cat <- ""
my.data$cat[grep("Vormittag", my.data$Bez_Oeffnungszeit)] <- "Vormittag"
my.data$cat[grep("Nachmittag", my.data$Bez_Oeffnungszeit)] <- "Nachmittag"
my.data$cat[grep("Abend", my.data$Bez_Oeffnungszeit)] <- "Abend"
# Now just call aggregate over the Std variable using ID and category
# as grouping factors.
res <- aggregate(my.data$Anzahl_Std,
by = list(ID = my.data$ID_Einrichtung, cat = my.data$cat),
FUN = sum)
How to replace NA data of specific dates with the average data of different years of same dates of a dataframe in R?
This might work. It groups the months and days pairs and the replace the NA
s from the mean.
library(dplyr)
A <- A %>%
group_by(month, day, hour, minute) %>%
mutate(rain = ifelse(is.na(rain),
mean(rain, na.rm=TRUE), rain))
SQL Server add computed column in VIEW with conditions
Yes, google how to create a view in SQL Server and in the SELECT statement of your view multiply the Price column with the TaxId column. Since you have to consider the NULL value you should use COALESCE
like this
SELECT
Price * (1 + COALESCE(TaxId/100), 0) AS newValue
...
assuming that you have just '8.5' in your column and not '8.5%'. Is it varchar or not?
P.S.: COALESCE()
returns the first non-null value
UPDATE after question was edited:
You just have to slightly modify your syntax
SELECT
CAST(
CASE
WHEN TaxId = 1 THEN Price *1.085
WHEN TaxId = 2 THEN Price *1.2
WHEN TaxId IS NULL THEN Price END AS decimal(15,4)
) AS CalculatedPrice
FROM...
Transforming matrix of presence/absence to Data.frame of vertice connection. (Removing duplicated rows with eeuqal unordered values)
The problem can be solved using the purrr package.
# reproduce input
mat <- matrix(
data = c(1,0,0,0,0,0,0,
0,1,1,0,1,0,0,
0,0,1,0,1,0,0,
1,1,0,0,0,1,0,
0,0,0,0,0,1,0,
0,1,0,0,1,0,0,
1,0,1,1,1,0,0), nrow = 7, ncol = 7)
colnames(mat) <- LETTERS[1:7]
rownames(mat) <- c("Aiz", "Aren", "Atx", "Berr", "Bra", "Bur", "Cab")
# convert to dataframe
df <- mat %>%
dplyr::as_tibble() %>%
dplyr::bind_cols(
tibble::tibble(Names = rownames(mat)))
# calculate the connections
purrr::map_df(df$Names, function(x){
output <-purrr::map_df(df$Names, function(y){
if(x >= y) return(tibble::tibble()) # avoid double counting
tibble::tibble(
siteA = x,
siteB = y,
weight = sum(as.integer(df[df$Names==x,1:7]) & as.integer(df[df$Names==y,1:7])))
})
})
good luck
Extract single value from dataframe based on logical condition, Tidyverse style
There are many ways to skin a cat in R; here are lots (keep in mind you can continually mix and match alot of the functions used below) in general you have correctly observed the inefficiency of tidyverse functions:
# Base R method 1: => returns data.frame:
subset(example_data, species == "Gorilla_gorilla", select = "var1")
# Base R method 2: => returns vector length 1 (R's scalar):
example_data$var1[match("Gorilla_gorilla", example_data$species)]
# Base R method 3: => (result as df):
example_data[match("Gorilla_gorilla", example_data$species), "var1", drop = FALSE]
# Tidyverse method 1: => returns df
library(tidyverse)
example_data %>%
slice(which(species == "Gorilla_gorilla")) %>%
select(var1)
# Tidyverse method 2: => returns df
example_data %>%
filter(species == "Gorilla_gorilla") %>%
select(var1)
# Tidyverse method 3: => returns vector
example_data %>%
filter(species == "Gorilla_gorilla") %>%
pull(var1)
# data.table method 1: => returns data.table / data.frame object
library(data.table)
setDT(example_data)[species == "Gorilla_gorilla", "var1"]
# data.tabel method 2: => returns data.table / data.frame object:
setDT(example_data)[species == "Gorilla_gorilla", 2]
# data.tabel method 3: => returns data.table / data.frame object:
setDT(example_data)[species == "Gorilla_gorilla", .SD, .SDcols = 2]
# data.table method 4: => return vector:
as.matrix(setDT(example_data)[species == "Gorilla_gorilla", 2])[1]
Rownames for data.table in R for model.matrix
There are a couple of issues here.
Firstly, it is a feature of a data.table
that they do not have a rownames
, instead they have key
s which are far more powerful. See this great vignette.
But, it isn't the end of the world. model.matrix
returns sensible rownames when you pass it a data.table
For example
A <- data.table(ID = 1:5, x = c(NA, 1:4), y = c(4:2,NA,3))
mm <- model.matrix( ~ x + y, A)
rownames(mm)
## [1] "2" "3" "5"
So rows 2,3 and 5 are those included in the model.matrix.
Now, you can add this sequence as a column to A
. This will be useful if you then set the key to something else (thereby losing the original order)
A[, rowid := seq_len(nrow(A)]
You might consider making it character (like the rownames of mm
)) but it won't really matter (as you can just as easily convert rownames(mm)
to numeric when you need to reference.
As to the warning that data.table
gives, if you read the next sentence
Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: setkey(), setnames() and setattr()
rownames
are an attribute rownames<-
(internally at somepoint using the equivalent to attr<-
) will (possibly copy) in the same way.
The line from `row.names<-.data.frame`
is
attr(x, "row.names") <- value
That being said, data.tables
don't have rownames, so there is no point setting them.
Related Topics
1-Dimensional Matrix Is Changed to a Vector in R
Create Columns from Column of List in Data.Table
R + Ggplot2: How to Hide Missing Dates from X-Axis
Highlight a Line in Ggplot with Multiple Lines
How to Expand a Large Dataframe in R
R Shiny Error: Object of Type 'Closure' Is Not Subsettable
Count Unique Values of a Column by Pairwise Combinations of Another Column in R
Delete Rows with Less Than 7 Characters
Graph Flow Chart of Transition from States
Ggplot Each Group Consists of Only One Observation
R: Compare All the Columns Pairwise in Matrix
Overlapping the Predicted Time Series on the Original Series in R
In R, Check If String Appears in Row of Dataframe (In Any Column)
Why Are the Colors Wrong on This Ggplot