Pandas DataFrame stored list as string: How to convert back to list
As you pointed out, this commonly happens when saving and loading pandas DataFrames as .csv files, which is a text format.
In your case it happened because list objects have a string representation, and that representation is what gets written to the .csv file. Loading the .csv then yields that string rather than the original list.
If you want to store the actual objects, use DataFrame.to_pickle() (note: the objects must be picklable!).
To answer your second question, you can convert it back with ast.literal_eval:
>>> from ast import literal_eval
>>> literal_eval('[1.23, 2.34]')
[1.23, 2.34]
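The same conversion can be done while the file is being read, by handing literal_eval to the converters argument of pd.read_csv. The sketch below is only an illustration: the frame, the column name "values", and the in-memory buffer standing in for a file are all assumptions.

```python
import io
from ast import literal_eval

import pandas as pd

# Hypothetical frame with a list-valued column.
df = pd.DataFrame({"id": [1, 2], "values": [[1.23, 2.34], [3.0]]})

# Writing to CSV stores each list as its string representation.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)

# converters applies literal_eval to every cell of "values" while parsing.
restored = pd.read_csv(buf, converters={"values": literal_eval})
print(type(restored.loc[0, "values"]))  # a list again, not a str
```

This keeps the parsing next to the read call, so the rest of the code never sees the string form.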
pandas - convert string into list of strings
You can split the string manually:
>>> df['Tags'] = df.Tags.apply(lambda x: x[1:-1].split(','))
>>> df.Tags[0]
['Tag1', 'Tag2']
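A plain split keeps any spaces or quotes that surround the items, and it turns "[]" into a list holding one empty string. A slightly more defensive sketch, assuming the cells look like the hypothetical samples below (split_tags is an illustrative helper, not a pandas function):

```python
import pandas as pd

def split_tags(text):
    """Split a bracketed, comma-separated string into clean items."""
    inner = text.strip()[1:-1]          # drop the surrounding brackets
    if not inner.strip():
        return []                       # "[]" becomes an empty list
    # Trim whitespace and any single or double quotes around each item.
    return [item.strip().strip("'\"") for item in inner.split(",")]

# Hypothetical sample data covering unquoted, quoted, and empty cells.
df = pd.DataFrame({"Tags": ["[Tag1, Tag2]", "['Tag3','Tag4']", "[]"]})
df["Tags"] = df["Tags"].apply(split_tags)
print(df["Tags"].tolist())
```

This still breaks on items that contain commas themselves; for strings that are valid Python literals, ast.literal_eval is the more robust choice.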
How to convert string back to list using Pandas
You can use ast.literal_eval:
>>> import ast
>>> a = "['BONGO', 'TOZZO', 'FALLO', 'PINCO']"
>>> print(ast.literal_eval(a))
['BONGO', 'TOZZO', 'FALLO', 'PINCO']
Pandas stored list as string, but cannot convert it back due to decimal
You can do it with eval(), since ast.literal_eval() only parses Python literals and will not construct a Decimal() object. Just note that you need to be very sure of your data with this method: eval() executes the given string just like the Python interpreter would, so it creates whatever objects the string names (in your case Decimal()) and will also run any other code the string contains.
from decimal import Decimal
val = "[{'product':'ABC', 'quantity':1, 'price':Decimal(91.99)}, {'product':'YXZ', 'quantity':2, 'price':Decimal(11.99)}]"
print(eval(val))
Output
[{'product': 'ABC',
'quantity': 1,
'price': Decimal('91.9899999999999948840923025272786617279052734375')},
{'product': 'YXZ',
'quantity': 2,
'price': Decimal('11.9900000000000002131628207280300557613372802734375')}]
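If eval() cannot be avoided, one way to narrow what the string can reach is to pass an explicit globals dict. The sketch below is an illustration under assumptions, not a sandbox: the name allowed_names is hypothetical, and emptying __builtins__ blocks casual name lookups but is not a security boundary. It also writes the price as a string, because Decimal('91.99') keeps the exact value whereas Decimal(91.99) inherits the float's binary rounding, as the output above shows.

```python
from decimal import Decimal

# Assumed input: the price is passed to Decimal as a string.
val = "[{'product': 'ABC', 'quantity': 1, 'price': Decimal('91.99')}]"

# Only the names listed here are visible to the evaluated expression.
allowed_names = {"__builtins__": {}, "Decimal": Decimal}

rows = eval(val, allowed_names)
print(rows[0]["price"])

# Names outside the whitelist are no longer resolvable.
try:
    eval("open('somefile')", allowed_names)
except NameError as exc:
    print("blocked:", exc)
```

For data you do not control, storing the values in a format that does not need eval() at all (pickle, JSON with string prices) remains the safer route.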
How can I save DataFrame as list and not as string
Try this:
import pandas as pd
df = pd.DataFrame({'a': ["[1,2,3,4]", "[6,7,8,9]"]})
df['b'] = df['a'].apply(eval)
print(df)
Each value in column b is now a Python list.
           a             b
0  [1,2,3,4]  [1, 2, 3, 4]
1  [6,7,8,9]  [6, 7, 8, 9]
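When the strings contain only literals, as they do here, ast.literal_eval can stand in for eval and will refuse anything that is not a literal. A minimal sketch on the same sample data:

```python
import ast

import pandas as pd

df = pd.DataFrame({'a': ["[1,2,3,4]", "[6,7,8,9]"]})

# literal_eval parses lists, numbers, strings, dicts, etc., but never runs code.
df['b'] = df['a'].apply(ast.literal_eval)
print(type(df.loc[0, 'b']))

# A string that is not a plain literal is rejected instead of executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    print("rejected")
```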
Transform string that should be list of floats in a column of dataframe?
Use ast.literal_eval:
import ast
df['interval'] = df['interval'].apply(ast.literal_eval)
Output
>>> df
interval
0 [100.0, 3.0]
1 [3.0, 2.0]
2 [2.0, 1.0]
3 [1, 0.25]
4 [0.25, 0.0]
>>> df.loc[0, 'interval']
[100.0, 3.0]
>>> type(df.loc[0, 'interval'])
list
Now you can convert to columns if you want:
>>> df['interval'].apply(pd.Series)
0 1
0 100.00 3.00
1 3.00 2.00
2 2.00 1.00
3 1.00 0.25
4 0.25 0.00
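On larger frames, apply(pd.Series) can be slow because it builds one Series per row. An alternative sketch, assuming every list has the same length, passes the lists to the DataFrame constructor directly; the column names "start" and "end" are made up for illustration.

```python
import pandas as pd

# Hypothetical data shaped like the parsed column above.
df = pd.DataFrame({"interval": [[100.0, 3.0], [3.0, 2.0], [1, 0.25]]})

# Build the new columns in one step and keep the original row index.
expanded = pd.DataFrame(
    df["interval"].tolist(),
    index=df.index,
    columns=["start", "end"],
)
print(expanded)
```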
Column of lists, convert list to string as a new column
List Comprehension
If performance is important, I strongly recommend this solution: a plain list comprehension typically avoids the per-row overhead that Series.apply adds.
df['liststring'] = [','.join(map(str, l)) for l in df['lists']]
df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
You can extend this to more complicated use cases using a function.
import numpy as np

def try_join(l):
    try:
        return ','.join(map(str, l))
    except TypeError:
        # l is not iterable (e.g. NaN), so there is nothing to join
        return np.nan

df['liststring'] = [try_join(l) for l in df['lists']]
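To see what the fallback buys you, here is a short usage sketch on a hypothetical column where one row is missing. The function is repeated so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

def try_join(l):
    try:
        return ','.join(map(str, l))
    except TypeError:
        # map() raises TypeError when l is not iterable, e.g. a float NaN.
        return np.nan

# Hypothetical data: the middle row holds NaN instead of a list.
df = pd.DataFrame({'lists': [[1, 2, 'ABC'], np.nan, ['z', 'a']]})
df['liststring'] = [try_join(l) for l in df['lists']]
print(df)
```

Note that a plain string is iterable, so a row holding 'abc' would be joined character by character rather than falling back to NaN.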
Series.apply / Series.agg with ','.join
You need to convert your list items to strings first; that's where map comes in handy.
df['liststring'] = df['lists'].apply(lambda x: ','.join(map(str, x)))
Or,
df['liststring'] = df['lists'].agg(lambda x: ','.join(map(str, x)))
df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
pd.DataFrame constructor with DataFrame.agg
A non-loopy/non-lambda solution.
df['liststring'] = (pd.DataFrame(df.lists.tolist())
                      .fillna('')
                      .astype(str)
                      .agg(','.join, axis=1)
                      .str.strip(','))
df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
Converting list of strings in pandas column into string
Edit:
As your edit shows, it seems the rows are not actually lists but strings that look like lists. We can use eval to turn each one into a real list so that we can then perform the join. It seems your sample data is the following:
df = pd.DataFrame({'index':[0,1,2,3,4],
'words':["['me']","['they']","['it','we','it']","[]","['we','we','it']"]})
How about this? Use apply with eval to parse each row, then apply ' '.join to each resulting list:
df['words'] = df['words'].apply(eval).apply(' '.join)
print(df)
Output:
index words
0 0 me
1 1 they
2 2 it we it
3 3
4 4 we we it
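If a column might hold a mix of real lists and list-like strings, a small helper can normalise both before joining. This is a sketch under that assumption; to_list is a hypothetical name, and it uses ast.literal_eval rather than eval so that only literals are accepted.

```python
from ast import literal_eval

import pandas as pd

def to_list(value):
    """Return value as a list, parsing it first if it is a string."""
    if isinstance(value, list):
        return value                    # already a list, nothing to parse
    return literal_eval(value)          # parse "['a', 'b']" into ['a', 'b']

# Hypothetical data: three string rows and one row that is already a list.
df = pd.DataFrame({"words": ["['me']", "['it','we','it']", "[]", ["we", "it"]]})
df["words"] = df["words"].apply(to_list).apply(" ".join)
print(df)
```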