Selecting multiple odd or even columns/rows for dataframe
You can always generate sequences with seq:
even_indexes<-seq(2,42,2)
odd_indexes<-seq(1,41,2)
Then use them to index the rows:
x.loadings <- data.frame(x=data.pc$loadings[odd_indexes,1])
How to replace values from even rows into odd rows in python?
Try this:
df = df.set_index('Date').shift(-1)
df.loc[df.reset_index().index % 2 == 1] = 0
df = df.reset_index()
Output:
>>> df
Date col1 col2
0 2011 50.0 50.0
1 2012 0.0 0.0
2 2013 60.0 60.0
3 2014 0.0 0.0
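For a self-contained check, the same steps can be reproduced on a small frame. This is only a sketch: the input values are hypothetical (the question's original data is not shown here) and were picked so that the result matches the output above. It shifts only the value columns rather than moving Date into the index.

```python
import numpy as np
import pandas as pd

# Hypothetical input; only the even-position rows (50, 60) survive in the result
df = pd.DataFrame({
    "Date": [2011, 2012, 2013, 2014],
    "col1": [10, 50, 20, 60],
    "col2": [10, 50, 20, 60],
})

value_cols = ["col1", "col2"]
odd = np.arange(len(df)) % 2 == 1  # zero-based positions 1, 3, ...

out = df.copy()
# Pull each row's values up by one position (the last row becomes NaN)
out[value_cols] = out[value_cols].shift(-1)
# Zero out the odd positions, which now hold unwanted values
out.loc[odd, value_cols] = 0
print(out)
```

Building the mask with np.arange(len(df)) keeps the logic positional, so it does not depend on what the index labels happen to be.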
Compare even and odd rows in a Pandas Data Frame
You can use a GroupBy.transform approach:
# for each pair, is there only one kind of Id?
out = df[df.groupby(np.arange(len(df))//2)['Id'].transform('nunique').eq(1)]
Or, more efficiently, using the underlying NumPy array (this assumes an even number of rows, so that every row has a partner):
# convert to numpy
a = df['Id'].to_numpy()
# are the odds equal to evens?
out = df[np.repeat((a[::2]==a[1::2]), 2)]
Output:
Index Time Id
2 2 10:10:02 12
3 3 10:10:04 12
4 4 10:10:06 13
5 5 10:10:07 13
6 6 10:10:08 11
7 7 10:10:10 11
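The pairwise comparison can also be wrapped in a small helper that tolerates an odd number of rows. This is a sketch under assumptions: keep_matching_pairs is a hypothetical name, and rows 0, 1 and 8 of the sample data are invented so that the kept rows match the output above.

```python
import numpy as np
import pandas as pd

def keep_matching_pairs(df, col):
    """Keep rows (2k, 2k+1) whose `col` values are equal.

    A trailing row without a partner is dropped.
    """
    a = df[col].to_numpy()
    n_pairs = len(a) // 2
    # Compare element 0 with 1, 2 with 3, and so on
    same = a[0:2 * n_pairs:2] == a[1:2 * n_pairs:2]
    mask = np.zeros(len(a), dtype=bool)
    mask[:2 * n_pairs] = np.repeat(same, 2)  # one flag per row of each pair
    return df[mask]

# Hypothetical data: rows 0, 1 and 8 are made up for illustration
df = pd.DataFrame({
    "Time": ["10:10:00", "10:10:01", "10:10:02", "10:10:04", "10:10:06",
             "10:10:07", "10:10:08", "10:10:10", "10:10:12"],
    "Id": [10, 11, 12, 12, 13, 13, 11, 11, 14],
})

out = keep_matching_pairs(df, "Id")
print(out)
```

Note that the groupby version treats an unpaired last row as a group of one, whose nunique is 1, so it would keep that row, whereas this helper drops it.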
Select odd rows from a specific column in a dataframe
Row/column indexing ([i, j]) applies only to objects that have a dim attribute. A plain vector, such as a single column extracted with $, does not have one:
is.vector(df$Amount)
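If we extract the vector, a single index is enough. The logical index c(FALSE, TRUE) is recycled along the vector, so it keeps elements 2, 4, 6, ...; use c(TRUE, FALSE) for elements 1, 3, 5, ...: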
df$Amount[c(FALSE, TRUE)]
If we want to subset the rows of the dataset,
df[c(FALSE, TRUE), 'Amount', drop = FALSE]
In the code above we specify the row index (i), the column index or column name (j), and drop. For a data.frame, drop = TRUE is the default when a single column is selected (see ?Extract), so we need drop = FALSE to keep the dimensions instead of coercing the result to a vector.
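For readers working in pandas, a rough analogue of the vector versus data.frame distinction is Series versus single-column DataFrame. The sketch below uses hypothetical data and column names; iloc[1::2] takes zero-based positions 1, 3, ..., which correspond to rows 2, 4, ... in R's one-based numbering.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"Amount": [10, 20, 30, 40, 50], "Item": list("abcde")})

# Like df$Amount[c(FALSE, TRUE)]: a one-dimensional result
as_series = df["Amount"].iloc[1::2]

# Like df[c(FALSE, TRUE), 'Amount', drop = FALSE]: still a table with one column
as_frame = df.iloc[1::2][["Amount"]]

print(type(as_series).__name__, as_series.tolist())
print(type(as_frame).__name__, as_frame.shape)
```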
Applying a function to modify odd and even rows in R dataframe column
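Two approaches, with a loop or with dplyr:
``` r
x <- c(4, 3, 5, 6, 2, 1)
df <- as.data.frame(x)

# Loop version: odd rows use the next value, even rows use the previous one
func <- function(x) {
  res <- rep(NA, length(x))
  for (i in seq_along(x)) {
    if (i %% 2 == 1) {
      # odd row: divide by the following element (NA if this is the last one)
      res[i] <- x[i] / ifelse(i == length(x), NA, x[i + 1]) + x[i]
    } else {
      # even row: divide by the preceding element
      res[i] <- x[i] / x[i - 1] + x[i]
    }
  }
  res
}
func(df$x)
#> [1] 5.333333 3.750000 5.833333 7.200000 4.000000 1.500000

# dplyr version: lead() and lag() give the next and previous values
library(dplyr)
df %>% mutate(x = ifelse(row_number() %% 2 == 1, x / lead(x) + x, x / lag(x) + x))
#>          x
#> 1 5.333333
#> 2 3.750000
#> 3 5.833333
#> 4 7.200000
#> 5 4.000000
#> 6 1.500000
```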
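The same odd/even rule can be sketched in pandas, where shift(-1) and shift(1) play the roles of lead() and lag(). The column name result is an assumption made for this illustration; the input vector is the one used above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [4, 3, 5, 6, 2, 1]})
x = df["x"]

# One-based row numbers, to mirror R's row_number()
is_odd_row = np.arange(1, len(x) + 1) % 2 == 1

# Odd rows: x / next + x; even rows: x / previous + x
df["result"] = np.where(is_odd_row, x / x.shift(-1) + x, x / x.shift(1) + x)
print(df)
```

With an odd number of rows, the last (odd) row has no following value and would come out as NaN, which corresponds to the NA returned by the loop version above.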
Selecting odds/even rows only in R using readxl
read_excel does not seem to offer this functionality directly:
read_excel(path, sheet = 1, col_names = TRUE, col_types = NULL, na = "", skip = 0)
You can subset after reading in the file:
df <- read_excel("C:\\Users\\Patrick\\Desktop\\Age.xlsx", col_names = TRUE)
df[c(TRUE, FALSE),] # for odd rows
df[c(FALSE, TRUE),] # for even rows
Pandas - Combine even/odd columns and aggregate by hour
After reading my_csv_file.csv, you should sum the corresponding in/out columns, create a timestamp column, and group by the timestamp at hour level:
import pandas as pd
# Read file, no header!
df = pd.read_csv('my_csv_file.csv', header=None)
n_cols = len(df.columns)
# Sum all inputs and outputs
df['in'] = df.iloc[:,range(2,n_cols ,2)].sum(axis=1)
df['out'] = df.iloc[:,range(3,n_cols ,2)].sum(axis=1)
df = df.drop(columns=range(2,n_cols))
# Create a timestamp with the date and hour
df['timestamp'] = pd.to_datetime((df[0] + ' ' + df[1]))
df =df.drop(columns=[0,1])
# Group by date and hour, then sum
# (select the numeric columns explicitly so the timestamp column is not summed)
df_grouped = df.groupby([df.timestamp.dt.date, df.timestamp.dt.hour], group_keys=False)[['in', 'out']].sum()
# Prettify the output
df_grouped.index.names = ['date', 'hour']
df_grouped = df_grouped.reset_index()
# date hour in out
#0 2020-12-01 16 0 2
#1 2020-12-01 17 1 1
#2 2020-12-01 18 1 0
Note: to recreate the data used in this example, you can use the following line in place of the read_csv call:
df = pd.DataFrame({0: {0: '12/01/2020', 1: '12/01/2020', 2: '12/01/2020', 3: '12/01/2020', 4: '12/01/2020', 5: '12/01/2020', 6: '12/01/2020', 7: '12/01/2020', 8: '12/01/2020', 9: '12/01/2020'}, 1: {0: '16:02:00', 1: '16:03:00', 2: '16:04:00', 3: '16:05:00', 4: '17:06:00', 5: '17:07:06', 6: '17:08:00', 7: '17:09:01', 8: '18:10:00', 9: '18:11:00'}, 2: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 0, 7: 0, 8: 0, 9: 1}, 3: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 4: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 5: {0: 2, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 6: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 7: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 8: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 9: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 1, 8: 0, 9: 0}})