How to Manage a Table/Matrix to Obtain Information Using Conditions

How to print a row that contains specific values of a column?

A solution with Base R:

Em_sub = aggregate(. ~ year, data = Em, '[', 1)

or with dplyr:

library(dplyr)
Em_sub = Em %>%
group_by(year) %>%
slice(1)

Results:

  year x y
1 1950 5 3
2 1951 6 7

# A tibble: 2 x 3
# Groups: year [2]
year x y
<int> <int> <int>
1 1950 5 3
2 1951 6 7

Data:

Em = read.table(text = "year    x      y
1950 5 3
1950 4 3
1950 2 4
1950 1 5
1951 6 7
1951 5 6
1951 1 4
1951 0 3", header = TRUE)

aggregate function in R, multiple conditions

I think the easiest way is to create a new column only containing the word "Vormittag", "Nachmittag" or "Abend" and then use that as a grouping variable.

For instance:

# Set random seed for reproducibility
set.seed(12345)

# Make some data
my.data <- data.frame(ID_Oeffnungszeit = sample(1:10, 100, replace = TRUE),
ID_Einrichtung = sample(3000001:3000020, 100, replace = TRUE),
Anzahl_Std = rnorm(100),
Bez_Oeffnungszeit = sample(c("Montag Vormittag",
"Montag Nachmittag",
"Dienstag Vormittag",
"Montag Abend"), 100,
replace = T))

# Create a new column containing "Vormittag", "Nachmittag" or "Abend"
my.data$cat <- ""
my.data$cat[grep("Vormittag", my.data$Bez_Oeffnungszeit)] <- "Vormittag"
my.data$cat[grep("Nachmittag", my.data$Bez_Oeffnungszeit)] <- "Nachmittag"
my.data$cat[grep("Abend", my.data$Bez_Oeffnungszeit)] <- "Abend"

# Now just call aggregate over the Std variable using ID and category
# as grouping factors.
res <- aggregate(my.data$Anzahl_Std,
by = list(ID = my.data$ID_Einrichtung, cat = my.data$cat),
FUN = sum)

How to replace NA data of specific dates with the average data of different years of same dates of a dataframe in R?

This might work. It groups the months and days pairs and the replace the NAs from the mean.

library(dplyr)
A <- A %>%
group_by(month, day, hour, minute) %>%
mutate(rain = ifelse(is.na(rain),
mean(rain, na.rm=TRUE), rain))

SQL Server add computed column in VIEW with conditions

Yes, google how to create a view in SQL Server and in the SELECT statement of your view multiply the Price column with the TaxId column. Since you have to consider the NULL value you should use COALESCE like this

SELECT
Price * (1 + COALESCE(TaxId/100), 0) AS newValue
...

assuming that you have just '8.5' in your column and not '8.5%'. Is it varchar or not?

P.S.: COALESCE() returns the first non-null value

UPDATE after question was edited:

You just have to slightly modify your syntax

SELECT 
CAST(
CASE
WHEN TaxId = 1 THEN Price *1.085
WHEN TaxId = 2 THEN Price *1.2
WHEN TaxId IS NULL THEN Price END AS decimal(15,4)
) AS CalculatedPrice
FROM...

Transforming matrix of presence/absence to Data.frame of vertice connection. (Removing duplicated rows with eeuqal unordered values)

The problem can be solved using the purrr package.

# reproduce input
mat <- matrix(
data = c(1,0,0,0,0,0,0,
0,1,1,0,1,0,0,
0,0,1,0,1,0,0,
1,1,0,0,0,1,0,
0,0,0,0,0,1,0,
0,1,0,0,1,0,0,
1,0,1,1,1,0,0), nrow = 7, ncol = 7)
colnames(mat) <- LETTERS[1:7]
rownames(mat) <- c("Aiz", "Aren", "Atx", "Berr", "Bra", "Bur", "Cab")

# convert to dataframe
df <- mat %>%
dplyr::as_tibble() %>%
dplyr::bind_cols(
tibble::tibble(Names = rownames(mat)))

# calculate the connections
purrr::map_df(df$Names, function(x){
output <-purrr::map_df(df$Names, function(y){
if(x >= y) return(tibble::tibble()) # avoid double counting
tibble::tibble(
siteA = x,
siteB = y,
weight = sum(as.integer(df[df$Names==x,1:7]) & as.integer(df[df$Names==y,1:7])))
})
})

good luck

Extract single value from dataframe based on logical condition, Tidyverse style

There are many ways to skin a cat in R; here are lots (keep in mind you can continually mix and match alot of the functions used below) in general you have correctly observed the inefficiency of tidyverse functions:

# Base R method 1: => returns data.frame:
subset(example_data, species == "Gorilla_gorilla", select = "var1")

# Base R method 2: => returns vector length 1 (R's scalar):
example_data$var1[match("Gorilla_gorilla", example_data$species)]

# Base R method 3: => (result as df):
example_data[match("Gorilla_gorilla", example_data$species), "var1", drop = FALSE]

# Tidyverse method 1: => returns df
library(tidyverse)
example_data %>%
slice(which(species == "Gorilla_gorilla")) %>%
select(var1)

# Tidyverse method 2: => returns df
example_data %>%
filter(species == "Gorilla_gorilla") %>%
select(var1)

# Tidyverse method 3: => returns vector
example_data %>%
filter(species == "Gorilla_gorilla") %>%
pull(var1)

# data.table method 1: => returns data.table / data.frame object
library(data.table)
setDT(example_data)[species == "Gorilla_gorilla", "var1"]

# data.tabel method 2: => returns data.table / data.frame object:
setDT(example_data)[species == "Gorilla_gorilla", 2]

# data.tabel method 3: => returns data.table / data.frame object:
setDT(example_data)[species == "Gorilla_gorilla", .SD, .SDcols = 2]

# data.table method 4: => return vector:
as.matrix(setDT(example_data)[species == "Gorilla_gorilla", 2])[1]

Rownames for data.table in R for model.matrix

There are a couple of issues here.

Firstly, it is a feature of a data.table that they do not have a rownames, instead they have keys which are far more powerful. See this great vignette.

But, it isn't the end of the world. model.matrix returns sensible rownames when you pass it a data.table

For example

A <- data.table(ID = 1:5, x = c(NA, 1:4), y = c(4:2,NA,3))

mm <- model.matrix( ~ x + y, A)

rownames(mm)

## [1] "2" "3" "5"

So rows 2,3 and 5 are those included in the model.matrix.

Now, you can add this sequence as a column to A. This will be useful if you then set the key to something else (thereby losing the original order)

A[, rowid := seq_len(nrow(A)]

You might consider making it character (like the rownames of mm)) but it won't really matter (as you can just as easily convert rownames(mm) to numeric when you need to reference.

As to the warning that data.table gives, if you read the next sentence

Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: setkey(), setnames() and setattr()

rownames are an attribute rownames<- (internally at somepoint using the equivalent to attr<-) will (possibly copy) in the same way.

The line from `row.names<-.data.frame` is

attr(x, "row.names") <- value

That being said, data.tables don't have rownames, so there is no point setting them.



Related Topics



Leave a reply



Submit