How to Move Cells With a Value Row-Wise to the Left in a Dataframe

yourdata[]<-t(apply(yourdata,1,function(x){
c(x[!is.na(x)],x[is.na(x)])}))

This should work: for each row, it replaces the row with a vector that consists of the non-NA values first, followed by the NA values.

Python Dataframe. Move rows values left according index of rows

Another way is to use a simple loop that shifts the values in row i to the left by i positions, and then use
fillna to replace the resulting NA values with 0:

for i in range(len(df)):
    df.iloc[i, :] = df.iloc[i, :].shift(-i)

df.fillna(0, inplace=True)

Output:

>>> df
one two three four
0 20 15.0 10.0 5.0
1 15 10.0 5.0 0.0
2 10 5.0 0.0 0.0
3 5 0.0 0.0 0.0
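
For a self-contained run of the loop above, here is a minimal sketch. The input frame is an assumption (the original data is not shown); every row is set to 20, 15, 10, 5 because that input reproduces the values in the output above. The frame is cast to float first so that writing NaN into a row does not change any column's dtype.

```python
import pandas as pd

# Assumed input: four identical rows (not taken from the original question).
df = pd.DataFrame(
    [[20, 15, 10, 5]] * 4,
    columns=["one", "two", "three", "four"],
).astype(float)

# Shift row i to the left by i positions; vacated cells become NaN.
for i in range(len(df)):
    df.iloc[i, :] = df.iloc[i, :].shift(-i)

# Replace the NaN cells left behind by the shift.
df = df.fillna(0)
print(df)
```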

Shift pd.dataframe's rows depending of value in a specific cells

By default, shift in pandas drops the values that are pushed past the last column. So first add new columns filled with missing values; the number of new columns is the largest difference between a year in the index and 2017.

df = df.set_index('Year')

diff = np.setdiff1d(df.index.dropna().unique(), [2017]).astype(int)
print (diff)
[2018 2019]

df = df.assign(**{f'new{x}':np.nan for x in range(max(diff-2017))})

Then use shift in a loop and select the rows for each year in the index with DataFrame.loc:

for y in diff:
    df.loc[y, :] = df.astype(float).shift(y - 2017, axis=1).loc[y, :]

Finally, replace missing values, cast to integers, and convert the index back to a column:

df = df.fillna(0).astype(int).reset_index()
print (df)
Year B C D E new0 new1
0 2017 4 0 0 5 0 0
1 2019 0 0 5 0 1 3
2 2018 0 4 0 3 6 0
3 2017 5 0 5 9 0 0
4 2017 5 0 7 2 0 0
5 2017 4 7 1 4 0 0

EDIT:

Solution with another column:

df = pd.DataFrame({
'new':list('abcdef'),
'Year':[2017, 2019, 2018, 2017, 2017, 2017],
'B':[4,5,4,5,5,4],
'C':[0,0,0,0,0,7],
'D':[0,1,3,5,7,1],
'E':[5,3,6,9,2,4]})
print (df)
new Year B C D E
0 a 2017 4 0 0 5
1 b 2019 5 0 1 3
2 c 2018 4 0 3 6
3 d 2017 5 0 5 9
4 e 2017 5 0 7 2
5 f 2017 4 7 1 4

df = df.set_index(['new','Year'])

diff = np.setdiff1d(df.index.get_level_values('Year').dropna().unique(), [2017]).astype(int)
print (diff)
[2018 2019]

df1 = pd.DataFrame(index=df.index, columns=['new{}'.format(x) for x in range(max(diff-2017))])
df = pd.concat([df, df1], axis=1)
print (df)
B C D E new0 new1
new Year
a 2017 4 0 0 5 NaN NaN
b 2019 5 0 1 3 NaN NaN
c 2018 4 0 3 6 NaN NaN
d 2017 5 0 5 9 NaN NaN
e 2017 5 0 7 2 NaN NaN
f 2017 4 7 1 4 NaN NaN

for y in diff:
    idx = pd.IndexSlice
    df.loc[idx[:, y], :] = df.astype(float).shift(y - 2017, axis=1).loc[idx[:, y], :]

df = df.fillna(0).astype(int).reset_index()
print (df)
new Year B C D E new0 new1
0 a 2017 4 0 0 5 0 0
1 b 2019 0 0 5 0 1 3
2 c 2018 0 4 0 3 6 0
3 d 2017 5 0 5 9 0 0
4 e 2017 5 0 7 2 0 0
5 f 2017 4 7 1 4 0 0
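
The same result can also be reached without shift. This is only a sketch under stated assumptions: it computes a per-row offset from the year and copies each row into a wider zero-filled NumPy array. The names base_year, value_cols, offsets and shifted are illustrative, and the approach assumes no year is earlier than the base year.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "new": list("abcdef"),
    "Year": [2017, 2019, 2018, 2017, 2017, 2017],
    "B": [4, 5, 4, 5, 5, 4],
    "C": [0, 0, 0, 0, 0, 7],
    "D": [0, 1, 3, 5, 7, 1],
    "E": [5, 3, 6, 9, 2, 4],
})

base_year = 2017  # assumed reference year
value_cols = ["B", "C", "D", "E"]

# Number of positions each row moves right (assumes no year before base_year).
offsets = (df["Year"] - base_year).to_numpy()
n_extra = int(offsets.max())

values = df[value_cols].to_numpy()
n_rows, n_cols = values.shape

# Start from zeros and copy every row into its shifted position.
shifted = np.zeros((n_rows, n_cols + n_extra), dtype=values.dtype)
for i, off in enumerate(offsets):
    shifted[i, off:off + n_cols] = values[i]

out_cols = value_cols + [f"new{k}" for k in range(n_extra)]
result = pd.concat(
    [df[["new", "Year"]], pd.DataFrame(shifted, columns=out_cols, index=df.index)],
    axis=1,
)
print(result)
```

Because the target array starts as zeros, no fillna or integer cast is needed afterwards.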

Pandas. A pretty way to delete cell and shift left others in row?

You need to select the rows to shift. Here, str[:2] and Series.str.isnumeric test whether the first 2 characters in X1 are numeric; the mask is inverted with ~ so that DataFrame.shift is applied only to the rows with non-numeric values:

m = ~df['X1'].str[:2].str.isnumeric()

Another idea for the mask (thank you @Manakin) is to test whether the values parse as times in the format HH:MM:

m = pd.to_datetime(df['X1'],format='%H:%M',errors='coerce').isna()

Or, to test for exactly 2 digits, a colon, and 2 digits:

m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')


df[m] = df[m].shift(-1, axis=1)
print(df)
X1 X2 X3
0 12:40 anytext anytext
1 12:44 anytext NaN
2 14:06 anytext NaN
3 15:44 anytext anytext
4 16:01 anytext anytext
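
To see how the three masks above compare, here is a small sketch. The sample values are borrowed from the later example on this page (without the X0 column); the names m1, m2 and m3 are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "X1": ["12:40", "boss", "engen", "15:44", "16:01"],
    "X2": ["anytext", "12:44", "14:06", "anytext", "anytext"],
    "X3": ["anytext"] * 5,
})

# Mask 1: the first two characters are not digits.
m1 = ~df["X1"].str[:2].str.isnumeric()
# Mask 2: the value does not parse as HH:MM.
m2 = pd.to_datetime(df["X1"], format="%H:%M", errors="coerce").isna()
# Mask 3: the value is not exactly two digits, a colon, two digits.
m3 = ~df["X1"].str.contains(r"^\d{2}:\d{2}$")

print(m1.tolist(), m2.tolist(), m3.tolist())

# Shift only the flagged rows one column to the left.
df.loc[m1, :] = df.loc[m1, :].shift(-1, axis=1)
print(df)
```

On this sample all three masks flag the same rows; they would differ on values such as "12abc", which passes the first test but not the other two.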

To modify only the columns from X1 onward, one idea is:

df=pd.DataFrame({'X0':['anytext','anytext','anytext','anytext','anytext'],
'X1':['12:40','boss','engen','15:44','16:01'],
'X2':['anytext','12:44','14:06','anytext','anytext'],
'X3':['anytext','anytext','anytext','anytext','anytext']})

m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')
df.loc[m, 'X1':] = df.loc[m, 'X1':].shift(-1, axis=1)
print(df)
X0 X1 X2 X3
0 anytext 12:40 anytext anytext
1 anytext 12:44 anytext NaN
2 anytext 14:06 anytext NaN
3 anytext 15:44 anytext anytext
4 anytext 16:01 anytext anytext

Another option is to convert X0 to the index first:

df = df.set_index('X0')
m = ~df['X1'].str.contains(r'^\d{2}:\d{2}$')
df[m] = df[m].shift(-1, axis=1)
df = df.reset_index()
print(df)
X0 X1 X2 X3
0 anytext 12:40 anytext anytext
1 anytext 12:44 anytext NaN
2 anytext 14:06 anytext NaN
3 anytext 15:44 anytext anytext
4 anytext 16:01 anytext anytext

Is there a way to shift pandas data frame first row only one cell to the right?

Yes, you can shift the first row of a dataframe one column to the right. Use iloc to select all columns of that row, which returns a pd.Series; then use shift to move the values of this series one position, and assign the shifted series back to the first row of the dataframe.

df.iloc[0, :] = df.iloc[0, :].shift()

MCVE:

import pandas as pd
import numpy as np

df = pd.DataFrame([[*'ABCD']+[np.nan],[1,2,3,4,5],[5,6,7,9,10],[11,12,13,14,15]])

df
# Input DataFrame
# 0 1 2 3 4
# 0 A B C D NaN
# 1 1 2 3 4 5.0
# 2 5 6 7 9 10.0
# 3 11 12 13 14 15.0


df.iloc[0, :] = df.iloc[0, :].shift()

df
# Output DataFrame
# 0 1 2 3 4
# 0 NaN A B C D
# 1 1 2 3 4 5
# 2 5 6 7 9 10
# 3 11 12 13 14 15
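
In this MCVE the last column is numeric, so the shifted row writes a string into it. Recent pandas versions may warn about, or reject, an assignment that changes a column's dtype. A minimal sketch that sidesteps this, assuming it is acceptable for all columns to become object dtype:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[*"ABCD"] + [np.nan], [1, 2, 3, 4, 5], [5, 6, 7, 9, 10]])

# Cast to object so every column can hold both strings and NaN.
df = df.astype(object)

# Shift the first row one position to the right and write it back.
df.iloc[0, :] = df.iloc[0, :].shift(1)
print(df)
```

If the numeric rows need numeric dtypes afterwards, they would have to be converted back separately.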

Remove all cells containing 0 and move values to the left

You can also paste each row into a string, blank out the standalone 0s, and read the result back in:

read.table(text=gsub('\\b0\\b','',do.call(paste,df)),fill=T,col.names = names(df))
Item X35 X45 X55 X65 X75 X85 X95 X100
1 1 35 85 NA NA NA NA NA NA
2 2 55 65 NA NA NA NA NA NA
3 3 75 85 NA NA NA NA NA NA
4 4 45 100 NA NA NA NA NA NA
5 5 85 95 NA NA NA NA NA NA
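
For pandas users, a rough equivalent of the R one-liner above could look like the sketch below. The input frame is hypothetical (the question's data is not shown here), and drop_zeros_left is an illustrative helper name, not a pandas function.

```python
import numpy as np
import pandas as pd

# Hypothetical input: each X column holds either its own number or 0.
df = pd.DataFrame({
    "Item": [1, 2, 3],
    "X35": [35, 0, 0],
    "X55": [0, 55, 0],
    "X65": [0, 65, 0],
    "X75": [0, 0, 75],
    "X85": [85, 0, 85],
})

value_cols = [c for c in df.columns if c != "Item"]

def drop_zeros_left(row):
    """Keep the non-zero values in order and pad the right side with NaN."""
    kept = [v for v in row if v != 0]
    return pd.Series(kept + [np.nan] * (len(row) - len(kept)), index=row.index)

result = df.copy()
result[value_cols] = df[value_cols].apply(drop_zeros_left, axis=1)
print(result)
```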

Move non-empty cells to the left in pandas DataFrame

Here's what I did:

I stacked your dataframe into a longer format, then grouped by the Name index level. Within each group, I drop the NaNs, but then reindex to the full h1 through h4 set, thus re-creating your NaNs on the right.

from io import StringIO
import pandas

def defragment(x):
    values = x.dropna().values
    return pandas.Series(values, index=df.columns[:len(values)])

datastring = StringIO("""\
Name h1 h2 h3 h4
A 1 nan 2 3
B nan nan 1 3
C 1 3 2 nan""")

df = pandas.read_table(datastring, sep=r'\s+').set_index('Name')
long_index = pandas.MultiIndex.from_product([df.index, df.columns])

print(
    df.stack()
      .groupby(level='Name')
      .apply(defragment)
      .reindex(long_index)
      .unstack()
)

And so I get:

   h1  h2  h3  h4
A 1 2 3 NaN
B 1 3 NaN NaN
C 1 3 2 NaN
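
As an alternative sketch (a swapped-in technique, not the approach used in the answer above), the same packing can be done with a stable argsort on the missing-value mask. The names missing, order and packed are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"h1": [1, np.nan, 1], "h2": [np.nan, np.nan, 3],
     "h3": [2, 1, 2], "h4": [3, 3, np.nan]},
    index=pd.Index(["A", "B", "C"], name="Name"),
)

# True where a cell is missing. A stable argsort puts False (present) first
# and keeps the original left-to-right order within each group.
missing = df.isna().to_numpy()
order = np.argsort(missing, axis=1, kind="stable")

# Reorder every row of the value array with its own column order.
packed = np.take_along_axis(df.to_numpy(), order, axis=1)
result = pd.DataFrame(packed, index=df.index, columns=df.columns)
print(result)
```

This avoids the stack/groupby round trip, at the cost of working on the raw array, so mixed column dtypes end up as a single common dtype.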

Using R to shift values to the left of data.frame

We can loop over the rows, concatenate the non-NA elements followed by the NA elements, and assign the result back to the dataset:

df[] <-  t(apply(df, 1, function(x) c(x[!is.na(x)], x[is.na(x)])))
df
# A B C
#1 yellow purple <NA>
#2 yellow <NA> <NA>
#3 orange yellow <NA>
#4 orange brown <NA>
#5 brown purple <NA>
#6 yellow purple pink
#7 purple green pink
#8 yellow pink green
#9 purple orange <NA>
#10 purple brown <NA>

data

df <- structure(list(A = c("yellow", NA, "orange", "orange", NA, "yellow", 
"purple", "yellow", "purple", "purple"), B = c("purple", NA,
"yellow", NA, "brown", "purple", "green", "pink", "orange", NA
), C = c(NA, "yellow", NA, "brown", "purple", "pink", "pink",
"green", NA, "brown")), .Names = c("A", "B", "C"), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")

Python Pandas: How to move one row to the first row of a Dataframe?

Reindexing is probably the optimal solution for putting the rows in any new order in 1 apparent step, except it may require producing a new DataFrame which could be prohibitively large.

For example

import pandas as pd

t = pd.read_csv('table.txt', sep=r'\s+')
t
Out[81]:
DG/VD TYPE State Access Consist Cache sCC Size Units Name
0 0/0 RAID1 Optl RW No RWTD - 1.818 TB one
1 1/1 RAID1 Optl RW No RWTD - 1.818 TB two
2 2/2 RAID1 Optl RW No RWTD - 1.818 TB three
3 3/3 RAID1 Optl RW No RWTD - 1.818 TB four

t.index
Out[82]: Int64Index([0, 1, 2, 3], dtype='int64')

t2 = t.reindex([2,0,1,3]) # cannot do this in place
t2
Out[93]:
DG/VD TYPE State Access Consist Cache sCC Size Units Name
2 2/2 RAID1 Optl RW No RWTD - 1.818 TB three
0 0/0 RAID1 Optl RW No RWTD - 1.818 TB one
1 1/1 RAID1 Optl RW No RWTD - 1.818 TB two
3 3/3 RAID1 Optl RW No RWTD - 1.818 TB four

Now the index can be set back to range(4) without reindexing:

t2.index=range(4)
Out[102]:
DG/VD TYPE State Access Consist Cache sCC Size Units Name
0 2/2 RAID1 Optl RW No RWTD - 1.818 TB three
1 0/0 RAID1 Optl RW No RWTD - 1.818 TB one
2 1/1 RAID1 Optl RW No RWTD - 1.818 TB two
3 3/3 RAID1 Optl RW No RWTD - 1.818 TB four
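
The reindex steps above can be wrapped in a small helper. This is a sketch only: move_row_first is a hypothetical name, it assumes the index labels are unique, and the stand-in table uses made-up values rather than the RAID table.

```python
import pandas as pd

def move_row_first(df, label):
    """Return a new DataFrame with the row labelled `label` moved to the top."""
    if label not in df.index:
        raise KeyError(f"{label!r} is not in the index")
    new_order = [label] + [lab for lab in df.index if lab != label]
    # reindex builds a new frame; reset_index renumbers the rows from 0.
    return df.reindex(new_order).reset_index(drop=True)

# Small stand-in table (made-up values).
t = pd.DataFrame({"DG/VD": ["0/0", "1/1", "2/2", "3/3"],
                  "Name": ["one", "two", "three", "four"]})
t2 = move_row_first(t, 2)
print(t2)
```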

It can also be done with 'tuple switching' and row selection as a basic mechanism and without creating a new DataFrame. For example:

import pandas as pd

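t = pd.read_csv('table.txt', sep=r'\s+')

# .ix was removed in pandas 1.0; .iloc selects rows by position instead
t.iloc[1], t.iloc[2] = t.iloc[2].copy(), t.iloc[1].copy()
t.iloc[0], t.iloc[1] = t.iloc[1].copy(), t.iloc[0].copy()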
t
Out[96]:
DG/VD TYPE State Access Consist Cache sCC Size Units Name
0 2/2 RAID1 Optl RW No RWTD - 1.818 TB three
1 0/0 RAID1 Optl RW No RWTD - 1.818 TB one
2 1/1 RAID1 Optl RW No RWTD - 1.818 TB two
3 3/3 RAID1 Optl RW No RWTD - 1.818 TB four

Another in place method sets the DataFrame index for the desired ordering so that, for example, the 3rd row gets index 0, etc. and then the DataFrame is sorted in place. It's encapsulated in the following function that assumes the rows are indexed with some range(m) for positive integer m and the DataFrame is simply indexed (no MultiIndex) as in the example provided in the question.

def putfirst(n, df):
    # n is the 1-based position of the row to move to the top of df
    if not isinstance(n, int):
        print('error: 1st arg must be an int')
        return
    if n < 1:
        print('error: 1st arg must be an int > 0')
        return
    if n == 1:
        print('nothing to do when first arg == 1')
        return
    if n > len(df):
        print('error: n exceeds the number of rows in the DataFrame')
        return
    # relabel so that the nth row gets index 0, then sort by index in place
    df.index = list(range(1, n)) + [0] + list(range(n, df.index[-1] + 1))
    df.sort_index(inplace=True)

The arguments of putfirst are n, which is the ordinal position of the row to relocate to the first row position, so that if the 3rd row is to be so relocated then n = 3; and df is the DataFrame containing the row to be relocated.

Here is a demo:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 5),columns=['a', 'b', 'c', 'd', 'e'])

df.set_index("a") # ineffective without assignment or inplace=True
Out[182]:
b c d e
a
1.394072 -1.076742 -0.192466 -0.871188 0.420852
-1.211411 -0.258867 -0.581647 -1.260421 0.464575
-1.070241 0.804223 -0.156736 2.010390 -0.887104
-0.977936 -0.267217 0.483338 -0.400333 0.449880
0.399594 -0.151575 -2.557934 0.160807 0.076525
-0.297204 -1.294274 -0.885180 -0.187497 -0.493560
-0.115413 -0.350745 0.044697 -0.897756 0.890874
-1.151185 -2.612303 1.141250 -0.867136 0.383583
-0.437030 0.347489 -1.230179 0.571078 0.060061
-0.225524 1.349726 1.350300 -0.386653 0.865990

df
Out[183]:
a b c d e
0 1.394072 -1.076742 -0.192466 -0.871188 0.420852
1 -1.211411 -0.258867 -0.581647 -1.260421 0.464575
2 -1.070241 0.804223 -0.156736 2.010390 -0.887104
3 -0.977936 -0.267217 0.483338 -0.400333 0.449880
4 0.399594 -0.151575 -2.557934 0.160807 0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745 0.044697 -0.897756 0.890874
7 -1.151185 -2.612303 1.141250 -0.867136 0.383583
8 -0.437030 0.347489 -1.230179 0.571078 0.060061
9 -0.225524 1.349726 1.350300 -0.386653 0.865990

df.index
Out[184]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')

putfirst(3,df)
df
Out[186]:
a b c d e
0 -1.070241 0.804223 -0.156736 2.010390 -0.887104
1 1.394072 -1.076742 -0.192466 -0.871188 0.420852
2 -1.211411 -0.258867 -0.581647 -1.260421 0.464575
3 -0.977936 -0.267217 0.483338 -0.400333 0.449880
4 0.399594 -0.151575 -2.557934 0.160807 0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745 0.044697 -0.897756 0.890874
7 -1.151185 -2.612303 1.141250 -0.867136 0.383583
8 -0.437030 0.347489 -1.230179 0.571078 0.060061
9 -0.225524 1.349726 1.350300 -0.386653 0.865990

Moving data from right to left column in a tibble

Using dplyr and tidyr: reshape from wide to long, drop diagnoses that are NA or start with R, S, or Y, then reshape from long back to wide.

library(dplyr)
library(tidyr)

gather(data, key = "k", value = "v", -id) %>%
  filter(!(grepl("^[RSY]", v) | is.na(v))) %>%
  group_by(id) %>%
  mutate(diagN = paste0("diagnosis_", row_number())) %>%
  select(-k) %>%
  spread(key = "diagN", value = "v") %>%
  ungroup()

# # A tibble: 10 x 3
# id diagnosis_1 diagnosis_2
# <int> <chr> <chr>
# 1 1 F32 F40
# 2 2 F431 NA
# 3 3 F65 NA
# 4 4 F431 NA
# 5 5 F11 F19
# 6 6 F60 NA
# 7 7 G35 NA
# 8 8 F32 NA
# 9 9 F32 F11
# 10 10 Z032 NA

