ValueError: could not convert string to float: how to convert a list of lists of strings into a numpy float array?
If you replace the spaces with commas, you can use json.loads to read the string as a list and pass the result to np.asarray:
import json
import numpy as np
foo = "[[7.0352220e-01 5.3130367e-06 1.5167372e-05 1.0797821e-06] \
[1.3130367e-06 2.4584832e-01 2.2375602e-05 7.3299240e-06]]"
a = np.asarray(json.loads(foo.replace(" ", ",")), dtype=np.float32)
print(a)
#array([[7.0352220e-01, 5.3130367e-06, 1.5167372e-05, 1.0797821e-06],
# [1.3130367e-06, 2.4584832e-01, 2.2375602e-05, 7.3299240e-06]])
print(a.dtype)
#float32
This assumes there is exactly one space between values. If that is not the case, you can use re.sub to replace each run of whitespace with a single comma:
import re
a = np.asarray(json.loads(re.sub(r"\s+", ",", foo)), dtype=np.float32)
#array([[7.0352221e-01, 5.3130366e-06, 1.5167372e-05, 1.0797821e-06],
# [1.3130367e-06, 2.4584831e-01, 2.2375601e-05, 7.3299238e-06]],
# dtype=float32)
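Both substitutions still fail if whitespace directly follows an opening bracket (as in "[ 1.0 2.0]"), because the result then starts with "[,1.0", which is not valid JSON. A rough alternative sketch, assuming the string always holds a 2-D array with one bracketed group per row, is to extract the rows with a regular expression and split each one on whitespace. The helper name parse_array_string is hypothetical, not part of numpy:

```python
import re
import numpy as np

def parse_array_string(text, dtype=np.float32):
    """Parse the printed form of a 2-D array, e.g. "[[1.0 2.0] [3.0 4.0]]"."""
    # Each innermost "[...]" group (no nested brackets) is treated as one row.
    rows = re.findall(r"\[([^\[\]]*)\]", text)
    # str.split() with no argument splits on any run of whitespace.
    return np.array([[float(v) for v in row.split()] for row in rows], dtype=dtype)

# Sample input with irregular spacing and a newline between rows (made up for illustration).
sample = "[[ 7.0352220e-01   5.3130367e-06]\n [1.3130367e-06 2.4584832e-01 ]]"
parsed = parse_array_string(sample)
print(parsed.shape, parsed.dtype)
```

This avoids building a JSON string at all, at the cost of only handling two-dimensional input.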
Could not convert string to float using dtype in nparray
You seem to be misunderstanding how structured arrays work. You don't specify the data type of a "column"; you specify the data type of a structure, and you build an array of structs. Numpy arrays are homogeneous, so you cannot have mixed data types within one array. So, you could do this:
>>> e1 = ('julien', 6270, 17, 0.2703992365198028)
>>> e2 = ('john_smith', '2983', '10', '0.3341129301703976')
>>> e3 = ('helo', '19', '0', '0.0')
>>> data = [e1, e2, e3]
>>> arr = np.array(data, dtype=[('name', '<U255'), ('amount0', float), ('amount1', float), ('amount2', float)])
>>> arr
array([('julien', 6270.0, 17.0, 0.2703992365198028),
('john_smith', 2983.0, 10.0, 0.3341129301703976),
('helo', 19.0, 0.0, 0.0)],
dtype=[('name', '<U255'), ('amount0', '<f8'), ('amount1', '<f8'), ('amount2', '<f8')])
>>>
But notice,
>>> arr.shape
(3,)
There are no columns. Of course, we can just pretend there are:
>>> arr['name']
array(['julien', 'john_smith', 'helo'],
dtype='<U255')
>>> arr[0]['name']
'julien'
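If a plain 2-D float array of the numeric fields is what is actually needed, one possible sketch is to stack the fields side by side with np.column_stack. The shortened records list and the variable names below are illustrative assumptions, not taken from the question:

```python
import numpy as np

# Illustrative records; the numeric values are already numbers here.
records = [("julien", 6270.0, 17.0, 0.27), ("helo", 19.0, 0.0, 0.0)]
dtype = [("name", "U255"), ("amount0", float), ("amount1", float), ("amount2", float)]
arr = np.array(records, dtype=dtype)

# Each arr[field] is a 1-D float array; column_stack turns them into columns.
numeric_fields = ["amount0", "amount1", "amount2"]
amounts = np.column_stack([arr[name] for name in numeric_fields])
print(amounts.shape)  # (2, 3)
```

The names stay in the structured array, while amounts is an ordinary homogeneous float64 array that supports column slicing.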
But honestly, it sounds like you really want a pandas.DataFrame:
>>> import pandas as pd
>>> pd.DataFrame(data, columns=['name', 'amount0', 'amount1', 'amount2'])
         name  amount0  amount1             amount2
0      julien     6270       17            0.270399
1  john_smith     2983       10  0.3341129301703976
2        helo       19        0                 0.0
>>>
Notice that I had to change your str data type to a unicode type ('<U255'), because numpy interprets str as byte strings. Alternatively, you could make your strings bytes objects by encoding them, which is probably the way to go if you are only working with ASCII characters.
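On the DataFrame route: because some of the amounts arrive as strings, those columns are stored with object dtype rather than as floats. A minimal sketch of converting them, assuming every value in those columns can be parsed as a number, uses DataFrame.astype with a per-column mapping (the name amount_cols is just for illustration):

```python
import pandas as pd

data = [
    ("julien", 6270, 17, 0.2703992365198028),
    ("john_smith", "2983", "10", "0.3341129301703976"),
    ("helo", "19", "0", "0.0"),
]
df = pd.DataFrame(data, columns=["name", "amount0", "amount1", "amount2"])

# Cast only the numeric columns; astype raises ValueError if a value cannot be parsed.
amount_cols = ["amount0", "amount1", "amount2"]
df = df.astype({col: float for col in amount_cols})
print(df.dtypes)
```

After the cast, the three amount columns are float64 while name is left untouched.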
All string list to a numpy float array
You can probably use eval() to turn the entire string into an actual list. eval() is generally not good to use, because it executes whatever code the string contains, but in this case it might be your best bet.
What you listed as your "example" is not correct. You are listing the result of your print statement and list comprehension. What is actually stored as an entry in that column is a string.
You should be able to simply take each item and wrap it in eval:
eval(arr)
That should return a Python list of shape (3, 3). From there you can convert it to a numpy array as necessary and change the types.
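If the stored strings are plain Python literals (lists, strings, numbers), the standard library's ast.literal_eval is a safer substitute for eval, since it only accepts literals and refuses arbitrary expressions. A small sketch, where the contents of cell are an assumed example of what one column entry might look like:

```python
import ast
import numpy as np

# Assumed example of a single column entry: a string holding a nested list of strings.
cell = "[['1.0', '2.0', '3.0'], ['4.0', '5.0', '6.0'], ['7.0', '8.0', '9.0']]"

nested = ast.literal_eval(cell)  # nested Python list of strings
arr = np.array([[float(v) for v in row] for row in nested])
print(arr.shape, arr.dtype)  # (3, 3) float64
```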
Convert a list of list of strings into ndarray of floats
Try this:
import numpy as np
l1 = ['0.20115899', '0.111678', '0.10674', '0.05564842', '-0.09271969', '-0.02292056', '-0.04057575', '0.2019901', '-0.05368654', '-0.1708179']
l2 = ['-2.17182860e-01', '-1.04081273e-01', '7.75325894e-02', '7.51972795e-02', '-7.11168349e-02', '-4.75254208e-02', '-2.94160955e-02']
l1 = np.array([float(i) for i in l1])
l2 = np.array([float(i) for i in l2])
print(l1.dtype)
Output:
float64
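Note that l1 and l2 have different lengths (10 and 7 values), so they cannot be combined into one rectangular 2-D float array; converting each row separately is one option. A sketch of that, using shortened example rows and letting numpy do the parsing via astype instead of a list comprehension:

```python
import numpy as np

# Shortened example rows of unequal length.
rows = [
    ['0.20115899', '0.111678', '0.10674'],
    ['-2.17182860e-01', '-1.04081273e-01'],
]

# astype parses every string element and raises ValueError on unparseable input.
arrays = [np.asarray(row).astype(np.float64) for row in rows]
print([a.dtype for a in arrays])
```

If all rows had the same length, np.asarray(rows).astype(np.float64) would give a single 2-D array directly.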
Converting string to float with error handling inside a list comprehension
Use an if condition inside the inner list comprehension to ignore empty strings:
[[float(j) for j in i if j] for i in list1]
if j tests the "truthiness" of each string. It is False only for empty strings, so they are skipped.
Or, if you want to be more robust about it, use a function to perform the conversion with exception handling:
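def try_convert(val):
    try:
        return float(val)
    except (ValueError, TypeError):
        # Not convertible: signal failure with None.
        return None
# Compare against None so that a valid 0.0 is not dropped.
[[z for z in (try_convert(j) for j in i) if z is not None] for i in list1]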
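A self-contained sketch of the exception-handling approach in Python 3 syntax; the name to_float_or_none and the sample list1 are made up for illustration. Comparing against None rather than relying on truthiness keeps a legitimate 0.0 in the result:

```python
def to_float_or_none(val):
    """Return val as a float, or None if it cannot be converted."""
    try:
        return float(val)
    except (ValueError, TypeError):
        # ValueError: unparseable string such as "" or "abc"; TypeError: e.g. None.
        return None

list1 = [["1.5", "", "0"], ["abc", "2", None]]
cleaned = [
    [x for x in (to_float_or_none(j) for j in row) if x is not None]
    for row in list1
]
print(cleaned)  # [[1.5, 0.0], [2.0]]
```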