How to Move Cells With a Value Row-Wise to the Left in a Dataframe

How to move cells with a value row-wise to the left in a dataframe

yourdata[]<-t(apply(yourdata,1,function(x){
                           c(x[!is.na(x)],x[is.na(x)])}))

This should work: for each row, it replaces the row with a vector consisting of the non-NA values first, followed by the NA values.
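For readers working in pandas rather than R, the same row-wise idea can be sketched as below. This is only an illustration under assumptions: the helper name move_values_left and the sample data are made up for the example and are not part of the original answer.

```python
import numpy as np
import pandas as pd

def move_values_left(row):
    """Return the row with non-missing values first, then NaN padding."""
    present = [v for v in row if pd.notna(v)]
    padding = [np.nan] * (len(row) - len(present))
    # Reuse the original labels so apply() rebuilds the same columns.
    return pd.Series(present + padding, index=row.index)

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({
    'A': ['yellow', None, 'orange'],
    'B': ['purple', None, None],
    'C': [None, 'yellow', None],
})

result = df.apply(move_values_left, axis=1)
print(result)
```

Like the R version, this goes row by row, so it is easy to read but may be slow on large frames.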

Python Dataframe. Move rows values left according index of rows

Another way is to use a simple loop that shifts the values in every row left by the row's position, and then use
fillna to replace the NA values with 0:

for i in range(len(df)):
    df.iloc[i,:] = df.iloc[i,:].shift(-i)

df.fillna(0, inplace=True)

Output:

>>> df
   one   two  three  four
0   20  15.0   10.0   5.0
1   15  10.0    5.0   0.0
2   10   5.0    0.0   0.0
3    5   0.0    0.0   0.0
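The question's input frame is not shown here, so the following is a minimal sketch with a made-up input (the -1 entries are placeholders that get shifted out). It does the same position-based left shift on a NumPy array instead of assigning row by row into the DataFrame; the function name shift_rows_left_by_position is hypothetical.

```python
import numpy as np
import pandas as pd

def shift_rows_left_by_position(df, fill=0.0):
    """Shift row i left by i cells, filling the vacated cells with `fill`."""
    values = df.to_numpy(dtype=float)
    n_rows, n_cols = values.shape
    out = np.full((n_rows, n_cols), fill, dtype=float)
    for i in range(n_rows):
        k = min(i, n_cols)  # never shift by more than the row width
        out[i, :n_cols - k] = values[i, k:]
    return pd.DataFrame(out, index=df.index, columns=df.columns)

# Assumed input: -1 marks cells that the shift discards.
df = pd.DataFrame(
    [[20, 15, 10, 5],
     [-1, 15, 10, 5],
     [-1, -1, 10, 5],
     [-1, -1, -1, 5]],
    columns=['one', 'two', 'three', 'four'],
)

result = shift_rows_left_by_position(df)
print(result)
```

Note that this version returns a new all-float frame rather than modifying df in place.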

Shift pd.dataframe's rows depending of value in a specific cells

By default, shift in pandas drops the values pushed past the last column. So first add new columns filled with missing values; the number of columns needed depends on the largest difference between the non-2017 years and 2017.

df = df.set_index('Year')

diff = np.setdiff1d(df.index.dropna().unique(), [2017]).astype(int)
print (diff)
[2018 2019]

df = df.assign(**{f'new{x}':np.nan for x in range(max(diff-2017))})

Then you can use shift in a loop and select the rows for each year in the index with DataFrame.loc:

for y in diff:
    df.loc[y, :] = df.astype(float).shift(y - 2017, axis=1).loc[y, :]

Finally, replace the missing values, cast to integers and convert the index back to a column:

df = df.fillna(0).astype(int).reset_index()
print (df)
   Year  B  C  D  E  new0  new1
0  2017  4  0  0  5     0     0
1  2019  0  0  5  0     1     3
2  2018  0  4  0  3     6     0
3  2017  5  0  5  9     0     0
4  2017  5  0  7  2     0     0
5  2017  4  7  1  4     0     0
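As a cross-check of the steps above, here is a self-contained sketch that builds the shifted frame in one pass with NumPy. The helper name shift_right_by_offset and its parameters are assumptions for this example. It also assumes the key column holds integers that are never smaller than the base year.

```python
import numpy as np
import pandas as pd

def shift_right_by_offset(df, key, base):
    """Shift each row's values right by (df[key] - base) cells, padding with 0."""
    offsets = (df[key] - base).to_numpy()
    value_cols = [c for c in df.columns if c != key]
    values = df[value_cols].to_numpy()
    extra = int(offsets.max())  # number of extra columns needed on the right
    out = np.zeros((len(df), len(value_cols) + extra), dtype=values.dtype)
    for i, k in enumerate(offsets):
        out[i, k:k + len(value_cols)] = values[i]
    new_cols = value_cols + [f'new{j}' for j in range(extra)]
    result = pd.DataFrame(out, columns=new_cols, index=df.index)
    result.insert(0, key, df[key])
    return result

df = pd.DataFrame({
    'Year': [2017, 2019, 2018, 2017, 2017, 2017],
    'B': [4, 5, 4, 5, 5, 4],
    'C': [0, 0, 0, 0, 0, 7],
    'D': [0, 1, 3, 5, 7, 1],
    'E': [5, 3, 6, 9, 2, 4],
})

result = shift_right_by_offset(df, key='Year', base=2017)
print(result)
```

Because the output array is created with zeros, no separate fillna or astype(int) step is needed in this variant.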

EDIT:

Solution with another column:

df = pd.DataFrame({
         'new':list('abcdef'),
         'Year':[2017, 2019, 2018, 2017, 2017, 2017],
         'B':[4,5,4,5,5,4],
         'C':[0,0,0,0,0,7],
         'D':[0,1,3,5,7,1],
         'E':[5,3,6,9,2,4]})
print (df)
  new  Year  B  C  D  E
0   a  2017  4  0  0  5
1   b  2019  5  0  1  3
2   c  2018  4  0  3  6
3   d  2017  5  0  5  9
4   e  2017  5  0  7  2
5   f  2017  4  7  1  4

df = df.set_index(['new','Year'])

diff = np.setdiff1d(df.index.get_level_values('Year').dropna().unique(), [2017]).astype(int)
print (diff)
[2018 2019]

df1 = pd.DataFrame(index=df.index, columns=['new{}'.format(x) for x in range(max(diff-2017))])
df = pd.concat([df, df1], axis=1) 
print (df)
          B  C  D  E new0 new1
new Year                      
a   2017  4  0  0  5  NaN  NaN
b   2019  5  0  1  3  NaN  NaN
c   2018  4  0  3  6  NaN  NaN
d   2017  5  0  5  9  NaN  NaN
e   2017  5  0  7  2  NaN  NaN
f   2017  4  7  1  4  NaN  NaN

for y in diff:
    idx = pd.IndexSlice
    df.loc[idx[:, y], :] = df.astype(float).shift(y - 2017, axis=1).loc[idx[:, y], :]

df = df.fillna(0).astype(int).reset_index()
print (df)
  new  Year  B  C  D  E  new0  new1
0   a  2017  4  0  0  5     0     0
1   b  2019  0  0  5  0     1     3
2   c  2018  0  4  0  3     6     0
3   d  2017  5  0  5  9     0     0
4   e  2017  5  0  7  2     0     0
5   f  2017  4  7  1  4     0     0

Pandas. A pretty way to delete cell and shift left others in row?

You need to select the rows to shift. For example, here the first 2 characters of X1 are tested for being numeric with str[:2] and Series.str.isnumeric, and the mask is inverted with ~, so that DataFrame.shift is applied only to rows with a non-numeric value:

m = ~df['X1'].str[:2].str.isnumeric()

Another idea for the mask (thank you, @Manakin) is to test whether the values are times in HH:MM format:

m = pd.to_datetime(df['X1'],format='%H:%M',errors='coerce').isna()

Or, to test for two 2-digit numbers separated by a colon (a raw string avoids invalid-escape warnings for \d):

m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')

df[m] = df[m].shift(-1, axis=1)
print(df)
      X1       X2       X3
0  12:40  anytext  anytext
1  12:44  anytext      NaN
2  14:06  anytext      NaN
3  15:44  anytext  anytext
4  16:01  anytext  anytext
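The three masks can give different answers on edge cases, so it may help to compare them side by side. The sketch below uses a made-up Series; the value '99:99' is an assumed edge case added for illustration and does not come from the question.

```python
import pandas as pd

# Hypothetical X1 values; '99:99' looks like HH:MM but is not a valid time.
s = pd.Series(['12:40', 'boss', 'engen', '15:44', '99:99'])

# True means "this row should be shifted".
by_prefix = ~s.str[:2].str.isnumeric()
by_datetime = pd.to_datetime(s, format='%H:%M', errors='coerce').isna()
by_regex = ~s.str.contains(r'^\d{2}:\d{2}$')

comparison = pd.DataFrame({
    'value': s,
    'by_prefix': by_prefix,
    'by_datetime': by_datetime,
    'by_regex': by_regex,
})
print(comparison)
```

Only the datetime-based mask flags '99:99', because it validates the hour and minute ranges, while the other two only look at the characters.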

If only the columns from X1 onward need to be shifted, one idea:

df=pd.DataFrame({'X0':['anytext','anytext','anytext','anytext','anytext'],
                 'X1':['12:40','boss','engen','15:44','16:01'],
                 'X2':['anytext','12:44','14:06','anytext','anytext'],
                 'X3':['anytext','anytext','anytext','anytext','anytext']}) 

m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')
df.loc[m, 'X1':] = df.loc[m, 'X1':].shift(-1, axis=1)
print(df)
       X0     X1       X2       X3
0  anytext  12:40  anytext  anytext
1  anytext  12:44  anytext      NaN
2  anytext  14:06  anytext      NaN
3  anytext  15:44  anytext  anytext
4  anytext  16:01  anytext  anytext

Another option is to convert X0 to the index first:

df = df.set_index('X0')
m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')
df[m] = df[m].shift(-1, axis=1)
df = df.reset_index()
print(df)
        X0     X1       X2       X3
0  anytext  12:40  anytext  anytext
1  anytext  12:44  anytext      NaN
2  anytext  14:06  anytext      NaN
3  anytext  15:44  anytext  anytext
4  anytext  16:01  anytext  anytext

Is there a way to shift pandas data frame first row only one cell to the right?

Yes, you can shift the first row of a dataframe one column to the right like this. Use iloc to select the first row across all columns, which returns a pd.Series; then use shift to move the values of this series one position, and assign the shifted series back to the first row of the dataframe.

df.iloc[0, :] = df.iloc[0, :].shift()

MCVE:

import pandas as pd
import numpy as np

df = pd.DataFrame([[*'ABCD']+[np.nan],[1,2,3,4,5],[5,6,7,9,10],[11,12,13,14,15]])

df
# Input DataFrame
#    0   1   2   3     4
# 0   A   B   C   D   NaN
# 1   1   2   3   4   5.0
# 2   5   6   7   9  10.0
# 3  11  12  13  14  15.0


df.iloc[0, :] = df.iloc[0, :].shift()

df
# Output DataFrame
#      0   1   2   3   4
# 0  NaN   A   B   C   D
# 1    1   2   3   4   5
# 2    5   6   7   9  10
# 3   11  12  13  14  15

Remove all cells containing 0 and move values to the left

You can also do:

read.table(text=gsub('\\b0\\b','',do.call(paste,df)),fill=TRUE,col.names = names(df))
  Item X35 X45 X55 X65 X75 X85 X95 X100
1    1  35  85  NA  NA  NA  NA  NA   NA
2    2  55  65  NA  NA  NA  NA  NA   NA
3    3  75  85  NA  NA  NA  NA  NA   NA
4    4  45 100  NA  NA  NA  NA  NA   NA
5    5  85  95  NA  NA  NA  NA  NA   NA
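A pandas counterpart can be sketched with a stable argsort over a boolean "is zero" mask, which pulls the non-zero values to the front of each row while preserving their order. The function name drop_zeros_shift_left and the sample data are assumptions for illustration; the sketch also assumes every column is numeric.

```python
import numpy as np
import pandas as pd

def drop_zeros_shift_left(df):
    """Remove zeros from each row and move the remaining values to the left."""
    values = df.to_numpy(dtype=float)
    is_zero = values == 0
    # A stable sort on the mask puts non-zeros (False) first, in original order.
    order = np.argsort(is_zero, axis=1, kind='stable')
    shifted = np.take_along_axis(values, order, axis=1)
    # Blank out the trailing cells, which now hold the zeros.
    n_kept = (~is_zero).sum(axis=1)
    col_pos = np.arange(values.shape[1])
    shifted[col_pos >= n_kept[:, None]] = np.nan
    return pd.DataFrame(shifted, index=df.index, columns=df.columns)

# Hypothetical data, only for illustration.
df = pd.DataFrame(
    [[0, 35, 0, 85],
     [55, 0, 65, 0]],
    columns=['X35', 'X45', 'X55', 'X65'],
)

result = drop_zeros_shift_left(df)
print(result)
```

Unlike the text-based R approach, this never converts the values to strings, at the cost of turning the whole frame into floats.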

Move non-empty cells to the left in pandas DataFrame

Here's what I did:

I stacked your dataframe into a longer format, then grouped by the Name index level. Within each group, I drop the NaNs, but then reindex to the full h1 through h4 set, thus re-creating your NaNs to the right.

from io import StringIO
import pandas

def defragment(x):
    values = x.dropna().values
    return pandas.Series(values, index=df.columns[:len(values)])

datastring = StringIO("""\
Name    h1    h2    h3    h4
A       1     nan   2     3
B       nan   nan   1     3
C       1     3     2     nan""")

df = pandas.read_table(datastring, sep=r'\s+').set_index('Name')
long_index = pandas.MultiIndex.from_product([df.index, df.columns])

print(
    df.stack()
      .groupby(level='Name')
      .apply(defragment)
      .reindex(long_index)  
      .unstack()  
)

And so I get:

   h1  h2  h3  h4
A   1   2   3 NaN
B   1   3 NaN NaN
C   1   3   2 NaN

Using R to shift values to the left of data.frame

We can loop over the rows, concatenate the non-NA elements followed by the NA elements, and assign the result back to the dataset:

df[] <-  t(apply(df, 1, function(x) c(x[!is.na(x)], x[is.na(x)])))
df
#        A      B     C
#1  yellow purple  <NA>
#2  yellow   <NA>  <NA>
#3  orange yellow  <NA>
#4  orange  brown  <NA>
#5   brown purple  <NA>
#6  yellow purple  pink
#7  purple  green  pink
#8  yellow   pink green
#9  purple orange  <NA>
#10 purple  brown  <NA>

data

df <- structure(list(A = c("yellow", NA, "orange", "orange", NA, "yellow", 
"purple", "yellow", "purple", "purple"), B = c("purple", NA, 
"yellow", NA, "brown", "purple", "green", "pink", "orange", NA
 ), C = c(NA, "yellow", NA, "brown", "purple", "pink", "pink", 
 "green", NA, "brown")), .Names = c("A", "B", "C"), row.names = c("1", 
 "2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")

Python Pandas: How to move one row to the first row of a Dataframe?

Reindexing is probably the best way to put the rows in any new order in one apparent step, although it produces a new DataFrame, which could be prohibitively large.

For example

import pandas as pd

t = pd.read_csv('table.txt', sep=r'\s+')
t
Out[81]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
1   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
2   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

t.index
Out[82]: Int64Index([0, 1, 2, 3], dtype='int64')

t2 = t.reindex([2,0,1,3]) # cannot do this in place
t2
Out[93]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
2   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
0   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
1   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

Now the index can be set back to range(4) without reindexing:

t2.index = range(4)
t2
Out[102]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
1   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
2   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four
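The reindex approach can be wrapped in a small helper. This is a minimal sketch, assuming the index labels are unique; the name move_row_first and the cut-down sample table are made up for the example.

```python
import pandas as pd

def move_row_first(df, label):
    """Return a new DataFrame with the row labelled `label` moved to the top."""
    new_order = [label] + [i for i in df.index if i != label]
    return df.reindex(new_order)

# Hypothetical, shortened version of the table above.
t = pd.DataFrame({
    'DG/VD': ['0/0', '1/1', '2/2', '3/3'],
    'Name': ['one', 'two', 'three', 'four'],
})

t2 = move_row_first(t, 2)
# Optionally renumber the rows 0..n-1 afterwards.
t3 = t2.reset_index(drop=True)
print(t3)
```

As with a direct reindex call, this builds a new DataFrame rather than reordering the original in place.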

It can also be done with 'tuple switching' and row selection as a basic mechanism and without creating a new DataFrame. For example:

import pandas as pd

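t = pd.read_csv('table.txt', sep=r'\s+')

# .ix was removed in pandas 1.0; .iloc selects rows by position,
# and .copy() keeps the swapped rows independent of the frame
t.iloc[1], t.iloc[2] = t.iloc[2].copy(), t.iloc[1].copy()
t.iloc[0], t.iloc[1] = t.iloc[1].copy(), t.iloc[0].copy()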
t
Out[96]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
1   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
2   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

Another in place method sets the DataFrame index for the desired ordering so that, for example, the 3rd row gets index 0, etc. and then the DataFrame is sorted in place. It's encapsulated in the following function that assumes the rows are indexed with some range(m) for positive integer m and the DataFrame is simply indexed (no MultiIndex) as in the example provided in the question.

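def putfirst(n, df):
    if not isinstance(n, int):
        print('error: 1st arg must be an int')
        return
    if n < 1:
        print('error: 1st arg must be an int > 0')
        return
    if n == 1:
        print('nothing to do when first arg == 1')
        return
    if n > len(df):
        print('error: n exceeds the number of rows in the DataFrame')
        return
    # In Python 3, range objects must be converted to lists before concatenating
    df.index = list(range(1, n)) + [0] + list(range(n, df.index[-1] + 1))
    # DataFrame.sort was removed from pandas; sort_index orders rows by the new index
    df.sort_index(inplace=True)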

The arguments of putfirst are n, which is the ordinal position of the row to relocate to the first row position, so that if the 3rd row is to be so relocated then n = 3; and df is the DataFrame containing the row to be relocated.

Here is a demo:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 5),columns=['a', 'b', 'c', 'd', 'e'])

df.set_index("a") # ineffective without assignment or inplace=True
Out[182]: 
                  b         c         d         e
a                                                
 1.394072 -1.076742 -0.192466 -0.871188  0.420852
-1.211411 -0.258867 -0.581647 -1.260421  0.464575
-1.070241  0.804223 -0.156736  2.010390 -0.887104
-0.977936 -0.267217  0.483338 -0.400333  0.449880
 0.399594 -0.151575 -2.557934  0.160807  0.076525
-0.297204 -1.294274 -0.885180 -0.187497 -0.493560
-0.115413 -0.350745  0.044697 -0.897756  0.890874
-1.151185 -2.612303  1.141250 -0.867136  0.383583
-0.437030  0.347489 -1.230179  0.571078  0.060061
-0.225524  1.349726  1.350300 -0.386653  0.865990

df
Out[183]: 
          a         b         c         d         e
0  1.394072 -1.076742 -0.192466 -0.871188  0.420852
1 -1.211411 -0.258867 -0.581647 -1.260421  0.464575
2 -1.070241  0.804223 -0.156736  2.010390 -0.887104
3 -0.977936 -0.267217  0.483338 -0.400333  0.449880
4  0.399594 -0.151575 -2.557934  0.160807  0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745  0.044697 -0.897756  0.890874
7 -1.151185 -2.612303  1.141250 -0.867136  0.383583
8 -0.437030  0.347489 -1.230179  0.571078  0.060061
9 -0.225524  1.349726  1.350300 -0.386653  0.865990

df.index
Out[184]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')

putfirst(3,df)
df
Out[186]: 
          a         b         c         d         e
0 -1.070241  0.804223 -0.156736  2.010390 -0.887104
1  1.394072 -1.076742 -0.192466 -0.871188  0.420852
2 -1.211411 -0.258867 -0.581647 -1.260421  0.464575
3 -0.977936 -0.267217  0.483338 -0.400333  0.449880
4  0.399594 -0.151575 -2.557934  0.160807  0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745  0.044697 -0.897756  0.890874
7 -1.151185 -2.612303  1.141250 -0.867136  0.383583
8 -0.437030  0.347489 -1.230179  0.571078  0.060061
9 -0.225524  1.349726  1.350300 -0.386653  0.865990

Moving data from right to left column in a tibble

Using dplyr and tidyr: reshape from wide to long, exclude NA diagnoses and those starting with R, S or Y, then reshape from long back to wide.

library(dplyr)
library(tidyr)

gather(data, key = "k", value = "v", -id) %>% 
  filter(!(grepl("^[RSY]", v) | is.na(v))) %>% 
  group_by(id) %>% 
  mutate(diagN = paste0("diagnosis_", row_number())) %>% 
  select(-k) %>% 
  spread(key = "diagN", value = "v") %>% 
  ungroup()

# # A tibble: 10 x 3
#       id diagnosis_1 diagnosis_2
#    <int> <chr>       <chr>      
#  1     1 F32         F40        
#  2     2 F431        NA         
#  3     3 F65         NA         
#  4     4 F431        NA         
#  5     5 F11         F19        
#  6     6 F60         NA         
#  7     7 G35         NA         
#  8     8 F32         NA         
#  9     9 F32         F11        
# 10    10 Z032        NA
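The same wide-to-long-to-wide pattern can be sketched in pandas with melt, a filter, and pivot. The input below is invented for illustration (the question's tibble is not shown here), and the column names diag_a to diag_c are assumptions.

```python
import pandas as pd

# Hypothetical input: one row per id, diagnoses spread over several columns.
data = pd.DataFrame({
    'id': [1, 2, 3],
    'diag_a': ['R10', 'F431', 'F65'],
    'diag_b': ['F32', None, 'S00'],
    'diag_c': ['F40', 'Y90', None],
})

# Wide to long: one row per (id, diagnosis); the source column name is not needed.
long = data.melt(id_vars='id', value_name='v').drop(columns='variable')

# Drop missing diagnoses and those starting with R, S or Y.
long = long.dropna(subset=['v'])
long = long[~long['v'].str.match(r'[RSY]')]

# Number the remaining diagnoses within each id, then go back to wide.
position = long.groupby('id').cumcount() + 1
long = long.assign(diagN='diagnosis_' + position.astype(str))
wide = long.pivot(index='id', columns='diagN', values='v').reset_index()
print(wide)
```

Since melt stacks the columns in their original order, the surviving diagnoses keep their left-to-right order within each id.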
