How to Add Row and Column to a Dataframe of Different Length

Add columns of different length in pandas

Use pd.concat and pass axis=1 and ignore_index=True. Note that with axis=1, ignore_index=True resets the column labels to 0, 1, ...:

In [38]:

import numpy as np
import pandas as pd
df = pd.DataFrame({'a':np.arange(5)})
df1 = pd.DataFrame({'b':np.arange(4)})
print(df1)
df
   b
0  0
1  1
2  2
3  3
Out[38]:
   a
0  0
1  1
2  2
3  3
4  4
In [39]:

pd.concat([df,df1], ignore_index=True, axis=1)
Out[39]:
   0    1
0  0    0
1  1    1
2  2    2
3  3    3
4  4  NaN
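If you would rather keep the original column names, a minimal sketch (same example data as above) is to leave out ignore_index. The shorter column is still padded with NaN, which turns it into a float column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(5)})
df1 = pd.DataFrame({'b': np.arange(4)})

# Without ignore_index, the column labels 'a' and 'b' are preserved.
wide = pd.concat([df, df1], axis=1)

# Row 4 has no value in 'b', so it is filled with NaN.
print(wide)
```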

Add column vector to a dataframe of different length

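A simple approach is np.repeat(), which repeats each element of the array a given number of times. Here each value of D2 is repeated 60 times, giving a vector of length 120:

import numpy as np

D2 = np.array([1,2])
np.repeat(D2,60)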
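As a rough sketch of how the repeated vector can then be attached to a frame, the example below uses a smaller repeat count and a hypothetical 6-row DataFrame named frame (neither is from the original question):

```python
import numpy as np
import pandas as pd

D2 = np.array([1, 2])

# Each element is repeated 3 times in place: 1, 1, 1, 2, 2, 2
repeated = np.repeat(D2, 3)

# Hypothetical frame whose length matches the repeated vector
frame = pd.DataFrame({'x': np.arange(6)})
frame['d2'] = repeated
print(frame)
```

The repeated array must have exactly as many elements as the frame has rows, otherwise the assignment raises a ValueError.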

Adding list with different length as a new column to a dataframe

If you convert the list to a Series, the assignment works: pandas aligns the Series on the index and fills the rows that have no matching value with NaN:

datasetTest.loc[:,'predict_close'] = pd.Series(test_pred_list)

Example:

In[121]:
df = pd.DataFrame({'a':np.arange(3)})
df

Out[121]:
   a
0  0
1  1
2  2

In[122]:
df.loc[:,'b'] = pd.Series(['a','b'])
df

Out[122]:
   a    b
0  0    a
1  1    b
2  2  NaN

The docs describe this under "setting with enlargement", which covers adding or expanding rows and columns, but the assignment also works when the Series is shorter than the existing index.
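Because the assignment aligns on index labels, a Series that is longer than the frame is handled too: values whose labels do not exist in the frame's index are silently dropped. A small sketch, using made-up column names short and long:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(3)})

# Shorter than the frame: label 2 has no value, so it becomes NaN.
df['short'] = pd.Series(['a', 'b'])

# Longer than the frame: labels 3 and 4 are not in df.index and are dropped.
df['long'] = pd.Series([10, 20, 30, 40, 50])

print(df)
```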

To handle an index that doesn't start at 0, or isn't an integer index at all, give the Series the leading labels of the DataFrame's index before assigning:

In[126]:
df = pd.DataFrame({'a':np.arange(3)}, index=np.arange(3,6))
df

Out[126]:
   a
3  0
4  1
5  2

In[127]:
s = pd.Series(['a','b'])
s.index = df.index[:len(s)]
s

Out[127]:
3    a
4    b
dtype: object

In[128]:
df.loc[:,'b'] = s
df

Out[128]:
   a    b
3  0    a
4  1    b
5  2  NaN

You can optionally replace the NaN values by calling fillna.
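The fillna step could look like the following sketch. The fill value 'missing' is only a placeholder chosen for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(3)}, index=np.arange(3, 6))

# Give the short Series the first labels of df's index so it aligns.
s = pd.Series(['a', 'b'], index=df.index[:2])
df['b'] = s

# Replace the NaN left in the unmatched row with a placeholder value.
df['b'] = df['b'].fillna('missing')
print(df)
```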

How to add row and column to a dataframe of different length?

In R, transpose 'y' and repeat it for the desired number of rows, then set the column names to 'x':

cbind(Dataset, `colnames<-`(t(Headers$y)[rep(1, nrow(Dataset)), ], Headers$x))

   H  W x1 x2 x3 x4
1 20 30  1  2  3  4
2 10 20  1  2  3  4
3 11 30  1  2  3  4
4  8 10  1  2  3  4
5 10  6  1  2  3  4

Concat two Pandas DataFrame column with different length of index

Use:

df1 = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
})

df2 = pd.DataFrame({
    'SMA': list('rty')
})

df3 = df1.join(df2.set_index(df1.index[-len(df2):]))

Or:

df3 = pd.concat([df1, df2.set_index(df1.index[-len(df2):])], axis=1)
print(df3)
   A  B  SMA
0  a  4  NaN
1  b  5  NaN
2  c  4  NaN
3  d  5    r
4  e  5    t
5  f  4    y

How it works:

First, the last len(df2) labels of df1's index are selected:

print(df1.index[-len(df2):])
RangeIndex(start=3, stop=6, step=1)

Then DataFrame.set_index replaces df2's index with those labels, so df2 lines up with the last rows of df1:

print(df2.set_index(df1.index[-len(df2):]))
  SMA
3   r
4   t
5   y

Python: Add a column to a dataframe of different length, repeating the added column until it fills the dataframe length

You can use np.tile to repeat the elements of column C:

m, n = len(df1), len(df2)
df1['C'] = np.tile(df2['C'], int(np.ceil(m / n)))[:m]

Result:

   A   B   C
0  1  AA  11
1  2  AB  12
2  3  AC  11
3  5  AD  12
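For a self-contained version, here is a sketch with input frames reconstructed from the result above (the exact original df1 and df2 are an assumption), plus a second case where the lengths do not divide evenly:

```python
import numpy as np
import pandas as pd

# Assumed inputs, inferred from the result table
df1 = pd.DataFrame({'A': [1, 2, 3, 5], 'B': ['AA', 'AB', 'AC', 'AD']})
df2 = pd.DataFrame({'C': [11, 12]})

m, n = len(df1), len(df2)
# Tile enough whole copies of C to cover m rows, then trim to exactly m.
df1['C'] = np.tile(df2['C'].to_numpy(), int(np.ceil(m / n)))[:m]

# Uneven case: 3 values cycled over 4 rows gives 7, 8, 9, 7.
uneven = np.tile([7, 8, 9], int(np.ceil(4 / 3)))[:4]
print(df1)
print(uneven)
```

The ceil ensures at least m tiled values are produced, and the final slice discards any surplus.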

Adding new column to dataframe of different length from list

I'm not exactly clear on what you're trying to do, but maybe you want something like this?

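import pandas as pd

df = pd.DataFrame()

def myfunc(number):
    # Fill column 'results<number>' one row at a time; each call may
    # produce a column of a different length.
    row_index = 0
    for x in range(0, 10):
        if 'some condition':  # placeholder: replace with the real test
            df.loc[row_index, 'results%d' % number] = x
            row_index += 1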
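If the lists are already available, an alternative sketch (not the approach from the answer above) is to wrap each list in a Series and build the frame in one step. The dictionary results and its contents are hypothetical:

```python
import pandas as pd

# Hypothetical result lists of different lengths
results = {
    'results1': [0, 2, 4, 6, 8],
    'results2': [1, 3],
}

# Each list becomes a Series; pandas aligns them on the index
# and pads the shorter column with NaN.
df = pd.DataFrame({name: pd.Series(values) for name, values in results.items()})
print(df)
```

This avoids growing the frame cell by cell, which tends to be slow for larger inputs.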

Split lists in a dataframe with different length lists in columns and rows

It seems that the exploded columns and the non-exploded columns need to be separated. We can't hide the non-exploded columns in the index as we normally would, because C2 contains lists (which are unhashable), so we must split the DataFrame and then rejoin it.

# Convert to single series to explode
cols = ['C1', 'C4']
new_df = df[cols].stack().explode().to_frame()
# Enumerate groups then unstack
new_df = new_df.set_index(
    new_df.groupby(level=[0, 1]).cumcount(),
    append=True
).unstack(1).groupby(level=0).ffill()

# Join back unaffected columns
new_df = new_df.droplevel(0, axis=1).droplevel(1, axis=0).join(
    df[df.columns.symmetric_difference(cols)]
)
# Reorder columns and reset index
new_df = new_df.reindex(df.columns, axis=1).reset_index(drop=True)

new_df:

  C1   C2  C3   C4
0  A  [1]  s1  123
1  B  [1]  s1  123
2  C  [2]  s2  321
3  D  [3]  s3  777
4  E  [3]  s3  111
5  F  [4]  s4  145

We stack to get all values into a single Series, explode them together, and convert the result back with to_frame:

cols = ['C1', 'C4']
new_df = df[cols].stack().explode().to_frame()

new_df

        0
0 C1    A
  C1    B
  C4  123
1 C1    C
  C4  321
2 C1    D
  C1    E
  C4  777
  C4  111
3 C1    F
  C4  145

We can create a new index level by enumerating the rows within each group with groupby and cumcount, appending it with set_index, and then unstacking:

new_df = new_df.set_index(
    new_df.groupby(level=[0, 1]).cumcount(),
    append=True
).unstack(1)

new_df:

      0
     C1   C4
0 0   A  123
  1   B  NaN
1 0   C  321
2 0   D  777
  1   E  111
3 0   F  145

We can then forward-fill within each level-0 index group using groupby and ffill:

new_df = new_df.groupby(level=0).ffill()

new_df:

      0
     C1   C4
0 0   A  123
  1   B  123
1 0   C  321
2 0   D  777
  1   E  111
3 0   F  145

Finally, we use droplevel to remove the unneeded index levels, join the unaffected columns back onto the DataFrame, reindex to restore the original column order, and call reset_index:

# Join back unaffected columns
new_df = new_df.droplevel(0, axis=1).droplevel(1, axis=0).join(
    df[df.columns.symmetric_difference(cols)]
)
# Reorder columns and reset index
new_df = new_df.reindex(df.columns, axis=1).reset_index(drop=True)

new_df:

  C1   C2  C3   C4
0  A  [1]  s1  123
1  B  [1]  s1  123
2  C  [2]  s2  321
3  D  [3]  s3  777
4  E  [3]  s3  111
5  F  [4]  s4  145

Python Pandas: Assign lists with different lengths as a row to pandas dataframe

You could build a DataFrame from your lists and concat it with the original:

(pd.concat([df.set_index('ColA'),
            pd.DataFrame([list_a, list_c], index=['a', 'c'])],
           axis=1).rename_axis('ColA').reset_index())

[out]

  ColA  ColB    0    1    2
0    a     0  0.0  1.0  NaN
1    b     1  NaN  NaN  NaN
2    c     2  0.0  1.0  2.0

Or as @QuangHoang suggested, use DataFrame.merge:

df.merge(pd.DataFrame([list_a, list_c], index=['a', 'c']),
         left_on='ColA',
         right_index=True,
         how='left')
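A runnable sketch of the merge variant follows. The inputs df, list_a and list_c are assumptions reconstructed from the output shown earlier, and the names rows and merged are introduced only for illustration:

```python
import pandas as pd

# Assumed inputs, inferred from the printed result
df = pd.DataFrame({'ColA': ['a', 'b', 'c'], 'ColB': [0, 1, 2]})
list_a = [0, 1]
list_c = [0, 1, 2]

# One row per list; the shorter list is padded with NaN on the right.
rows = pd.DataFrame([list_a, list_c], index=['a', 'c'])

# Left merge keeps every row of df; 'b' has no list, so it gets NaN.
merged = df.merge(rows, left_on='ColA', right_index=True, how='left')
print(merged)
```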

