Add numpy array as column to Pandas data frame
import numpy as np
import pandas as pd
import scipy.sparse as sparse
df = pd.DataFrame(np.arange(1,10).reshape(3,3))
arr = sparse.coo_matrix(([1,1,1], ([0,1,2], [1,2,0])), shape=(3,3))
df['newcol'] = arr.toarray().tolist()
print(df)
This yields:
0 1 2 newcol
0 1 2 3 [0, 1, 0]
1 4 5 6 [0, 0, 1]
2 7 8 9 [1, 0, 0]
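If the array is an ordinary dense NumPy array rather than a sparse matrix, the same idea should apply. Here is a minimal sketch, assuming the array has one entry (or one row) per DataFrame row; the names vec and mat2d are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3))

# A 1-D array with one value per row can be assigned directly as a column.
vec = np.array([10, 20, 30])
df['vec'] = vec

# A 2-D array is stored one row per cell by converting it to a list of lists,
# which gives the column the "object" dtype.
mat2d = np.eye(3, dtype=int)
df['rows'] = mat2d.tolist()

print(df)
```

Note that the object column holds Python lists, so vectorized arithmetic on it will not work the way it does on a numeric column.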
How to add numpy matrix as new columns for pandas dataframe?
You can turn the matrix into a dataframe and use concat with axis=1:
For example, given a dataframe df and a numpy array mat:
>>> df
a b
0 5 5
1 0 7
2 1 0
3 0 4
4 6 4
>>> mat
array([[0.44926098, 0.29567859, 0.60728561],
[0.32180566, 0.32499134, 0.94950085],
[0.64958125, 0.00566706, 0.56473627],
[0.17357589, 0.71053224, 0.17854188],
[0.38348102, 0.12440952, 0.90359566]])
You can do:
>>> pd.concat([df, pd.DataFrame(mat)], axis=1)
a b 0 1 2
0 5 5 0.449261 0.295679 0.607286
1 0 7 0.321806 0.324991 0.949501
2 1 0 0.649581 0.005667 0.564736
3 0 4 0.173576 0.710532 0.178542
4 6 4 0.383481 0.124410 0.903596
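One thing to watch: concat with axis=1 aligns rows by index label, so if df does not have the default 0..n-1 index, the new frame needs the same index. A small sketch under that assumption (the index values and the column names m0 and m1 are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index.
df = pd.DataFrame({'a': [5, 0, 1], 'b': [5, 7, 0]}, index=[10, 20, 30])
mat = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

# Reuse df's index so the rows line up, and give the new columns
# readable names instead of 0, 1, ...
mat_df = pd.DataFrame(mat, index=df.index, columns=['m0', 'm1'])
result = pd.concat([df, mat_df], axis=1)

print(result)
```

Without index=df.index, the two frames would share no labels and the result would contain extra rows filled with NaN.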
How to make a new column of numpy arrays in a pandas data frame?
Specify the datatype as "object" while creating the new column and then insert the elements as needed:
df["new"] = pd.Series(dtype="object")
df.at[1, 'new'] = [2 , 'l']
>>> df
id a b new
0 1 on on NaN
1 2 on off [2, l]
2 3 off on NaN
3 4 off off NaN
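The same approach can hold an actual NumPy array in a cell instead of a list. This is only a sketch with a made-up frame, and it assumes .at accepts an array value for an object column, which is my understanding of current pandas behaviour:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'a': ['on', 'on', 'off']})

# Start with an empty object column so each cell can hold any Python object.
df['new'] = pd.Series(dtype='object')

# .at sets a single cell; here the value is a whole NumPy array.
df.at[1, 'new'] = np.array([2.0, 3.5])

print(df)
```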
How to add numpy array elements row-wise to a pandas dataframe?
Create an array for the problem and convert it to a list.
a = np.array([[ 0.00021284, -0.04443965, 0.03926146, 0.04830161,
-0.11913304, 0.03370821],
[ 0.01778569, -0.05192029, -0.00792321, -0.01799901,
-0.09819183, 0.06020728],
[-0.00748426, -0.02401578, 0.01762747, 0.09334017,
-0.11837556, 0.00603597],
[-0.03505319, -0.01932572, -0.03248611, 0.00356432,
-0.082398 , 0.03887841],
[-0.05111802, -0.0309066 , 0.03542011, -0.01343899,
-0.10434885, -0.0315006 ]]).tolist()
Results in:
print(a)
[[0.00021284, -0.04443965, 0.03926146, 0.04830161, -0.11913304, 0.03370821], [0.01778569, -0.05192029, -0.00792321, -0.01799901, -0.09819183, 0.06020728], [-0.00748426, -0.02401578, 0.01762747, 0.09334017, -0.11837556, 0.00603597], [-0.03505319, -0.01932572, -0.03248611, 0.00356432, -0.082398, 0.03887841], [-0.05111802, -0.0309066, 0.03542011, -0.01343899, -0.10434885, -0.0315006]]
Then add the list to the dataframe.
df = pd.DataFrame({"Message": [
"How are you?",
"What is your name?",
"What do you do?",
"What is your address?",
"Let's hang out?"]})
df['Array'] = a
print(df)
Output:
Message Array
0 How are you? [0.00021284, -0.04443965, 0.03926146, 0.048301...
1 What is your name? [0.01778569, -0.05192029, -0.00792321, -0.0179...
2 What do you do? [-0.00748426, -0.02401578, 0.01762747, 0.09334...
3 What is your address? [-0.03505319, -0.01932572, -0.03248611, 0.0035...
4 Let's hang out? [-0.05111802, -0.0309066, 0.03542011, -0.01343...
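If you later need the values back as a single 2-D array, the column of lists can be converted again. A minimal sketch with a shortened, made-up array (restored is a name chosen here for illustration):

```python
import numpy as np
import pandas as pd

a = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6]])

df = pd.DataFrame({"Message": ["How are you?", "What is your name?"]})
df['Array'] = a.tolist()  # one Python list per row

# Rebuild the original 2-D array from the column of lists.
restored = np.array(df['Array'].tolist())

print(restored.shape)
```

This only works cleanly when every row holds a list of the same length.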
To create everything at the beginning, you can use a dictionary:
df = pd.DataFrame({"Message": [
"How are you?",
"What is your name?",
"What do you do?",
"What is your address?",
"Let's hang out?"], "Array": a})
adding a new column to existing dataframe and fill with numpy array
Code from https://www.geeksforgeeks.org/adding-new-column-to-existing-dataframe-in-pandas/
# Import pandas package
import pandas as pd
# Define a dictionary containing data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2]}
# Convert the dictionary into a DataFrame
original_df = pd.DataFrame(data)
# Use 'Qualification' as the column name and set it equal to the list
altered_df = original_df.assign(Qualification=['Msc', 'MA', 'Msc', 'Msc'])
# Observe the result
altered_df
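Since the question asks about a NumPy array rather than a list, here is a sketch of the same assign call with an array. The Weight column and its values are hypothetical and not part of the quoted code.

```python
import numpy as np
import pandas as pd

original_df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                            'Height': [5.1, 6.2, 5.1, 5.2]})

# Hypothetical values; the array length must equal the number of rows.
weights = np.array([61.0, 72.5, 58.3, 66.1])

# assign returns a new DataFrame and leaves original_df unchanged.
altered_df = original_df.assign(Weight=weights)

print(altered_df)
```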
How to add numpy array values to dataframe at a certain index?
Another problem is that you also erase all other values in the column. Assign just the array as the new value, not a DataFrame.
To set values in a column at specific index positions, use df.loc[df.index[#], 'NAME']:
import numpy as np
import pandas as pd
df = pd.DataFrame([[1, 2] for _ in range(100)], columns=['column_1', 'column_2'])
my_array = np.array([41892.79355875, 40239.97933262, 39466.32169404, 38416.39545664,
40012.3803004, 41135.45946026, 43084.18917943, 44825.08405799,
44066.70603561, 46636.34415037, 45855.25783352, 45863.87118957,
44697.45547342, 48065.5708295, 47931.83508874])
df.loc[df.index[-15:], 'column_2'] = my_array
print(df)
column_1 column_2
0 1 2.000000
1 1 2.000000
2 1 2.000000
3 1 2.000000
4 1 2.000000
.. ... ...
95 1 45855.257834
96 1 45863.871190
97 1 44697.455473
98 1 48065.570830
99 1 47931.835089
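In the example above, column_2 starts out as an integer column and is upcast when the floats are written. As far as I know, recent pandas releases warn about that kind of implicit upcast, so it may be safer to cast first. A sketch with a shortened array, also showing a purely position-based variant using iloc:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2] for _ in range(10)], columns=['column_1', 'column_2'])
my_array = np.array([41892.79, 40239.98, 39466.32])

# Cast to float first so the float values fit the column's dtype.
df['column_2'] = df['column_2'].astype(float)

# Position-based alternative to df.loc[df.index[-3:], 'column_2'].
col_pos = df.columns.get_loc('column_2')
df.iloc[-len(my_array):, col_pos] = my_array

print(df.tail())
```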
python - how to append numpy array to a pandas dataframe
Assign the predictions to a variable, then extract its columns and assign them to columns of the pandas dataframe. If x is the 2D numpy array with predictions,
x = sentiment_model.predict_proba(test_matrix)
then you can do,
test_data['prediction0'] = x[:,0]
test_data['prediction1'] = x[:,1]
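If the number of classes is not fixed at two, a loop over the array's columns avoids writing one line per class. This is a sketch that uses a hand-written array as a stand-in for the predict_proba output; the review column is made up.

```python
import numpy as np
import pandas as pd

# Stand-in for the output of predict_proba:
# one row per sample, one column per class.
x = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.6, 0.4]])

test_data = pd.DataFrame({'review': ['good', 'bad', 'fine']})

# Add one column per class without hard-coding the number of classes.
for i in range(x.shape[1]):
    test_data[f'prediction{i}'] = x[:, i]

print(test_data)
```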
Dynamically store data in the columns of pandas dataframe from numpy arrays being generated from “for loop”
Use NumPy stack over the dictionary values (this will give you a NumPy array with shape (10, 241, 241)), then use reshape to change the shape to (10, 58081), followed by a transpose to place the days as columns. Next, convert to a Pandas dataframe and fix the column names using the dictionary keys.
import pandas as pd
import numpy as np
#setup
np.random.seed(12345)
df_dictionary = {}
days = {f'day_{d}': np.random.rand(241,241).round(2) for d in range(1,11)}
df_dictionary['arrays_to_iterate'] = days
print(df_dictionary)
#code
all_days = np.stack(list(df_dictionary['arrays_to_iterate'].values())).reshape(10, -1).T
df = pd.DataFrame(all_days)
df.columns = df_dictionary['arrays_to_iterate'].keys()
print(df)
Output from df_dictionary
{'arrays_to_iterate':
{'day_1':
array(
[[0.93, 0.32, 0.18, ..., 0.62, 0.89, 0.78],
[0.72, 0.31, 0.36, ..., 0.5 , 0.89, 0.38],
...,
[0.36, 0.62, 0.77, ..., 0.03, 0.57, 0.04],
[0.02, 0.07, 0.66, ..., 0.62, 0.5 , 0.04]]),
'day_2': array(
[[0.14, 0.13, 0.91, ..., 0.06, 0.72, 0.93],
[0.13, 0.02, 0.09, ..., 0.39, 0.72, 0.13],
...
Output from df
day_1 day_2 day_3 day_4 day_5 day_6 day_7 day_8 day_9 day_10
0 0.93 0.14 0.06 0.10 0.01 0.66 0.67 0.18 0.93 0.40
1 0.32 0.13 0.81 0.57 0.23 0.60 0.48 0.07 0.08 0.32
2 0.18 0.91 0.95 0.27 0.36 0.11 0.25 0.71 0.24 0.44
3 0.20 0.51 0.52 0.62 0.09 0.31 0.19 0.78 0.83 0.58
4 0.57 0.14 0.89 0.51 0.67 0.29 0.48 0.95 0.36 0.97
... ... ... ... ... ... ... ... ... ... ...
58076 0.98 0.20 0.54 0.96 0.89 0.24 0.05 0.81 0.35 0.57
58077 0.53 0.96 0.04 0.60 0.16 0.38 0.83 0.49 0.28 0.02
58078 0.62 0.50 0.74 0.67 0.43 0.30 0.91 0.68 0.15 0.43
58079 0.50 0.11 0.57 0.42 0.85 0.97 0.86 0.60 0.75 0.33
58080 0.04 0.74 0.74 0.94 0.98 0.35 0.52 0.12 0.47 0.53
[58081 rows x 10 columns]
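A possibly simpler variant is to flatten each array and let the dictionary keys become the column names directly. The sketch below uses small 3x3 arrays as stand-ins for the 241x241 ones and checks that both routes give the same frame; the names stacked and flat are chosen here for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Small 3x3 arrays stand in for the 241x241 arrays above.
days = {f'day_{d}': rng.random((3, 3)).round(2) for d in range(1, 4)}

# Stack, flatten each day to one row, then transpose (same steps as above).
stacked = pd.DataFrame(
    np.stack(list(days.values())).reshape(len(days), -1).T,
    columns=list(days.keys()),
)

# Alternative: flatten each array and build the frame from a dict.
flat = pd.DataFrame({name: arr.ravel() for name, arr in days.items()})

print(flat.equals(stacked))
```

Both versions flatten in row-major order, so row i of the frame corresponds to the same cell of every day's array.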