How to Create Variable Columns and Fill Them Up

How to create variable columns and fill them up?

CSS multi-column layout can do this:

#parent {
  background-color: firebrick;
  column-width: 120px; /* set the column width; the number of columns is then computed automatically */
  column-gap: 20px; /* replaces the horizontal margin between elements */
  padding:0 10px;
}

.child {
  background-color: #fff;
  height: 30px;
  display: inline-block; /* use inline-block because block elements are buggy inside columns */
  width: 100%; /* make them full width so they behave like block elements */
  margin: 10px 0; /* keep only the top and bottom margins */
  padding: 3px;
  box-sizing:border-box;
}

<div id="parent">
  <div class="child">child</div>
  <div class="child">child</div>
  <div class="child">child</div>
  <div class="child">child</div>
  <div class="child">child</div>
  <div class="child">child</div>
  <div class="child">child</div>
  <div class="child">child</div>
</div>

How do I create a new variable in my dataframe filling the values with the dataframe name?

Before you concat/join your dataframes together, add a new column to each one with the country's name as the default value, then concat.

print(df.name)
>>> Iran
print(df2.name)
>>> United States of America

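# .name is a custom attribute that was set on each dataframe beforehand (see the prints above)
df['Name'] = df.name
df2['Name'] = df2.name
# axis=0 stacks the rows, so the Name column tells the countries apart
countryDF = pd.concat([df, df2], axis=0).reset_index()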

I don't know what further manipulation you want to do (e.g. dropping columns).

Create a new column and fill with values from a set of multiple columns conditional on column names

One option to achieve your desired result would be via an if condition:

library(dplyr)
library(stringr)
df %>% 
  rowwise() %>%
  mutate(new_col = if (str_c('A0', X) %in% names(.)) get(str_c('A0', X)) else NA) %>%
  ungroup()
#> # A tibble: 8 × 9
#>     A01   A02   A03   A04   A05   A06   A07     X new_col
#>   <int> <int> <int> <int> <int> <int> <int> <int>   <int>
#> 1     0     0    -5    -1    -1     2     3     2       0
#> 2     0    -1    -4    -3    -3    -3    -3     2      -1
#> 3     2     0     2     3     1     3     3     6       3
#> 4     0     1    -4     1    -1     1     1     7       1
#> 5     4     4     3     3     3     4     4    12      NA
#> 6     1     4     2    -3     0     0     0    15      NA
#> 7    10     9     8     9     7     7     7    22      NA
#> 8    10    12    12    12    10    12     9    24      NA
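
For readers more at home in Python, the same row-wise lookup can be sketched with plain dictionaries. This is only an illustration, not part of the original answer: the helper name `lookup_by_index` and the `rows` list are made up here, and the rows copy rows 1, 3 and 5 of the tibble above.

```python
def lookup_by_index(row, prefix="A0", index_key="X"):
    """Return the value of the column named prefix + row[index_key], or None."""
    column = f"{prefix}{row[index_key]}"
    # dict.get returns None when the constructed column name does not exist
    return row.get(column)

# Hypothetical rows mirroring rows 1, 3 and 5 of the tibble above
rows = [
    {"A01": 0, "A02": 0, "A03": -5, "A04": -1, "A05": -1, "A06": 2, "A07": 3, "X": 2},
    {"A01": 2, "A02": 0, "A03": 2, "A04": 3, "A05": 1, "A06": 3, "A07": 3, "X": 6},
    {"A01": 4, "A02": 4, "A03": 3, "A04": 3, "A05": 3, "A06": 4, "A07": 4, "X": 12},
]

for row in rows:
    row["new_col"] = lookup_by_index(row)

new_values = [row["new_col"] for row in rows]
print(new_values)
```

As in the R version, an `X` that does not map to an existing column (here 12, which would be `A012`) yields a missing value instead of an error.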

Turn field names into column names for specific variables and fill them with certain logic

Not sure I fully understand the question, but the code below produces your example dataframe.

library(tidyverse)
product<-c("ab","ab","ab","ac","ac","ac")
shop<-c("sad","sad","sad","sadas","fghj","xzzv")
category<-c("a","a","a","c","b","b")
tempr<-c(35,35,14,24,14,5)
value<-c(0,0,-6,8,4,0)
store<-data.frame(product,shop,category,tempr,value)

store %>% filter(value != 0) %>%  # remove rows where value is 0
  mutate(combined = paste0(tempr, "(", value, ")")) %>%  # combine tempr and value for spread
  select(-tempr, -value) %>%  # drop the columns that were just combined
  spread(shop, combined)  # one column per shop, filled with the tempr(value) strings

  #       product category  fghj    sad     sadas
  # 1      ab        a      <NA>    14(-6)  <NA>
  # 2      ac        b       14(4) <NA>     <NA>
  # 3      ac        c      <NA>   <NA>     24(8)
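
To make the reshaping step less of a black box, here is a rough Python sketch of what the filter / combine / spread pipeline does, using only lists and dictionaries. The `spread` function below is a hypothetical stand-in written for this illustration, not the tidyr function, and it assumes each (product, category, shop) combination occurs at most once after filtering.

```python
def spread(rows, key, value):
    """Turn the distinct values of `key` into columns filled from `value`.

    All remaining fields together identify one output row.
    """
    columns = sorted({r[key] for r in rows})
    wide = {}
    for r in rows:
        ident = tuple((k, v) for k, v in r.items() if k not in (key, value))
        # Start each output row with every new column set to None (R's NA)
        out = wide.setdefault(ident, {**dict(ident), **{c: None for c in columns}})
        out[r[key]] = r[value]
    return [wide[k] for k in sorted(wide)]

# Same data as the `store` data frame above
products = ["ab", "ab", "ab", "ac", "ac", "ac"]
shops = ["sad", "sad", "sad", "sadas", "fghj", "xzzv"]
categories = ["a", "a", "a", "c", "b", "b"]
temprs = [35, 35, 14, 24, 14, 5]
values = [0, 0, -6, 8, 4, 0]

combined = [
    {"product": p, "shop": s, "category": c, "combined": f"{t}({v})"}
    for p, s, c, t, v in zip(products, shops, categories, temprs, values)
    if v != 0  # remove rows where value is 0
]

wide = spread(combined, key="shop", value="combined")
for row in wide:
    print(row)
```

Note that the shop `xzzv` gets no column because its only row has value 0 and is filtered out before the spread, which matches the R output.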

Create new sequentially named variables and fill with mean of level

Depending on if I understood you right, I'll propose this giant ball of duct tape...

# fake data
dummydata <- data.frame(id=c(1:100),sex=rep(c(1,0),50),WBC=rnorm(100),RBC=rnorm(100))

# a function to calculate decile means
decilemean <- function(x) {
  xrank <- rank(x)
  xdec <- floor((xrank-1)/length(x)*10)+1
  decmeans <- as.numeric(tapply(x,xdec,mean))
  xdecmeans <- decmeans[xdec]
  return(xdecmeans)
}

# loop through your data columns and create the new columns
newcol <- 5          # the first new column to create
for(j in c(3,4)) {   # all of your columns to decilemean-ify
  dummydata[,newcol] <- NA
  dummydata[dummydata$sex==0,newcol] <- decilemean(dummydata[dummydata$sex==0,j])
  names(dummydata)[newcol] <- paste0(names(dummydata)[j],"_decmean_women")
  dummydata[,newcol+1] <- NA
  dummydata[dummydata$sex==1,newcol+1] <- decilemean(dummydata[dummydata$sex==1,j])
  names(dummydata)[newcol+1] <- paste0(names(dummydata)[j],"_decmean_men")
  newcol <- newcol+2
}

I'd recommend testing it though ;)
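
One way to follow that advice is to re-implement the decile logic independently and compare it against a case small enough to check by hand. The Python sketch below is such a cross-check, not the original answer's code: the helper names `average_ranks` and `decile_mean` are made up here, and ties are assumed to share their average rank, as R's `rank()` does by default.

```python
from collections import defaultdict

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def decile_mean(values):
    """Replace each value with the mean of the decile it falls into."""
    n = len(values)
    # Same formula as the R function, multiplying before dividing
    deciles = [int((r - 1) * 10 // n) + 1 for r in average_ranks(values)]
    groups = defaultdict(list)
    for v, d in zip(values, deciles):
        groups[d].append(v)
    means = {d: sum(vs) / len(vs) for d, vs in groups.items()}
    return [means[d] for d in deciles]

values = list(range(1, 21))  # 20 distinct values, so two per decile
result = decile_mean(values)
print(result[:4])
```

With 20 distinct values each decile holds exactly two of them, so the first decile mean should be the average of 1 and 2, the second the average of 3 and 4, and so on.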

Creating columns for each observed value of a variable

A base R approach: split the outcome column by id, then for each observation build a row holding the outcomes that precede it within that id, padded with NA up to n values. Finally, rbind the per-id results together and assign them to the new columns.

n <- 5
df[paste0("outcome_t", seq_len(n))] <- do.call(rbind, 
    lapply(split(df$outcome, df$id), function(x) 
  t(sapply(seq_along(x), function(y) c(x[seq_len(y - 1)], rep(NA, n - (y - 1)))))))

df
#   id t outcome outcome_t1 outcome_t2 outcome_t3 outcome_t4 outcome_t5
#1   1 1      10         NA         NA         NA         NA         NA
#2   1 2      20         10         NA         NA         NA         NA
#3   1 3      30         10         20         NA         NA         NA
#4   1 4      40         10         20         30         NA         NA
#5   1 5      40         10         20         30         40         NA
#6   2 1      20         NA         NA         NA         NA         NA
#7   2 2      30         20         NA         NA         NA         NA
#8   2 3      40         20         30         NA         NA         NA
#9   2 4      40         20         30         40         NA         NA
#10  2 5      20         20         30         40         40         NA

A tidyverse option using separate

library(tidyverse)

df %>%
   group_by(id) %>%
   mutate(new = map_chr(seq_along(outcome), 
         ~paste0(outcome[seq_len(. - 1)], collapse = ","))) %>%
   separate(new, into = paste0("outcome_t", seq_len(n)), 
                 sep = ",", fill = "right") %>%
   mutate(outcome_t1 = replace(outcome_t1, outcome_t1 == "", NA))

data

df <- data.frame(id = rep(c(1, 2), each = 5), t = 1:5, 
     outcome = c(10, 20, 30, 40, 40, 20, 30, 40, 40, 20))
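
The underlying idea (each row receives the earlier outcomes of the same id, padded with missing values) can also be sketched in plain Python. This is an illustration under assumptions rather than a translation of either answer: the function name `add_previous_outcomes` is made up, and the records are assumed to be already ordered by `t` within each id.

```python
from collections import defaultdict

def add_previous_outcomes(records, n):
    """Add outcome_t1..outcome_tn with the earlier outcomes of the same id."""
    seen = defaultdict(list)  # id -> outcomes seen so far, in input order
    result = []
    for rec in records:
        history = seen[rec["id"]]
        padded = (history + [None] * n)[:n]  # pad with None up to n entries
        new_rec = dict(rec)
        for k, value in enumerate(padded, start=1):
            new_rec[f"outcome_t{k}"] = value
        result.append(new_rec)
        history.append(rec["outcome"])  # becomes visible to later rows only
    return result

# Same data as the `df` defined above
ids = [1] * 5 + [2] * 5
outcomes = [10, 20, 30, 40, 40, 20, 30, 40, 40, 20]
records = [
    {"id": i, "t": t, "outcome": o}
    for i, t, o in zip(ids, [1, 2, 3, 4, 5] * 2, outcomes)
]

wide = add_previous_outcomes(records, n=5)
for row in wide:
    print(row)
```

Because the history is appended only after the row has been built, a row never sees its own outcome, which is why `outcome_t5` stays empty for five observations per id.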

How to create columns/variables by extracting characters from given column in R

Try

library(tidyr)
df_sep <- separate(df, key, into=c("State","Zip_Code", "Age_Group", "Race", "Gender"), sep="_")

   State Zip_Code Age_Group Race Gender      date   census
1     01    35004     10-14    +      M 11NOV2001 2.934397
2     01    35004     10-14    +      M 06JAN2002 3.028231
3     01    35004     10-14    +      M 07APR2002 3.180712
4     01    35004     10-14    +      M 02JUN2002 3.274546
5     01    35004     10-14    +      M 28JUL2002 3.368380
6     01    35004     10-14    +      M 22SEP2002 3.462214
7     01    35004     10-14    +      M 22DEC2002 3.614694
8     01    35004     10-14    +      M 16FEB2003 3.708528
9     01    35004     10-14    +      M 13JUL2003 3.954843
10    01    35004     10-14    +      M 07SEP2003 4.048677

Edit: Alright, in your comments you have made it clear that you really want a solution that loops through the observations. This is an inefficient approach and, for good reason, typically considered bad practice. Having expressed my objections, let me show you one approach:

First, we need to populate the dataframe with the columns. To use your approach, this would be:

Var = c("State","Zip_Code", "Age_Group", "Race", "Gender")
for(j in Var){
  df <- within(df, assign(j, NA))
}

However, a more efficient approach would be:

df[, Var]<- NA

Both give:

head(df)
                 key      date   census State Zip_Code Age_Group Race Gender
1 01_35004_10-14_+_M 11NOV2001 2.934397    NA       NA        NA   NA     NA
2 01_35004_10-14_+_M 06JAN2002 3.028231    NA       NA        NA   NA     NA
3 01_35004_10-14_+_M 07APR2002 3.180712    NA       NA        NA   NA     NA
4 01_35004_10-14_+_M 02JUN2002 3.274546    NA       NA        NA   NA     NA
5 01_35004_10-14_+_M 28JUL2002 3.368380    NA       NA        NA   NA     NA
6 01_35004_10-14_+_M 22SEP2002 3.462214    NA       NA        NA   NA     NA

Now, for each observation, we want to split key into components and fill columns 4 to 8 with the corresponding elements. This will be achieved with the following:

df[, Var] <- t(sapply(df$key, function(x) unlist(strsplit(as.character(x[1]), "_"))))

Here, sapply loops through the elements of df$key, passes each element as an argument to the function that I have defined, and collects the results in an array.

See:

sapply(df$key, function(x) unlist(strsplit(as.character(x[1]), "_")))
     [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]    [,9]    [,10]  
[1,] "01"    "01"    "01"    "01"    "01"    "01"    "01"    "01"    "01"    "01"   
[2,] "35004" "35004" "35004" "35004" "35004" "35004" "35004" "35004" "35004" "35004"
[3,] "10-14" "10-14" "10-14" "10-14" "10-14" "10-14" "10-14" "10-14" "10-14" "10-14"
[4,] "+"     "+"     "+"     "+"     "+"     "+"     "+"     "+"     "+"     "+"    
[5,] "M"     "M"     "M"     "M"     "M"     "M"     "M"     "M"     "M"     "M"

Transposing it with t() makes sure that it "fits" into the dataframe df[, Var], and here you see that the results are identical:

identical(df[,Var], df_sep[Var])
[1] TRUE

I assume that some of the entries in df$key differ in their format, which is why you may want to check each value first. To do so, you can just embellish the function in the sapply call.
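
What such a check could look like is sketched below in Python, since the logic is language-independent. Everything here is an assumption made for illustration: the helper name `split_key`, the rule "a valid key has exactly five parts", and the malformed example key are not taken from the question.

```python
FIELDS = ["State", "Zip_Code", "Age_Group", "Race", "Gender"]

def split_key(key, sep="_", fields=FIELDS):
    """Split a key into named parts; return all-None parts if the count is off."""
    parts = str(key).split(sep)
    if len(parts) != len(fields):
        # Assumed rule: a key with the wrong number of parts is left empty
        return dict.fromkeys(fields)
    return dict(zip(fields, parts))

good = split_key("01_35004_10-14_+_M")
bad = split_key("01_35004_10-14")  # hypothetical malformed key
print(good)
print(bad)
```

The same guard (compare the number of parts with the number of target columns before assigning) could be placed inside the function passed to sapply.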

How do I fill a column with one value in Pandas?

Just select the column and assign like normal:

In [194]:
df['A'] = 'foo'
df

Out[194]:
     A
0  foo
1  foo
2  foo
3  foo

Assigning a scalar value sets every row of the column to that value.
