Converting numpy dtypes to native python types
Use val.item() to convert most NumPy values to a native Python type:
import numpy as np
# for example, numpy.float32 -> python float
val = np.float32(0)
pyval = val.item()
print(type(pyval)) # <class 'float'>
# and similar...
type(np.float64(0).item()) # <class 'float'>
type(np.uint32(0).item()) # <class 'int'>
type(np.int16(0).item()) # <class 'int'>
type(np.complex128(0).item()) # <class 'complex'>
type(np.datetime64(0, 'D').item()) # <class 'datetime.date'>
type(np.datetime64('2001-01-01 00:00:00').item()) # <class 'datetime.datetime'>
type(np.timedelta64(0, 'D').item()) # <class 'datetime.timedelta'>
...
(Another method is np.asscalar(val); however, it was deprecated in NumPy 1.16 and removed in NumPy 1.23.)
For the curious, to build a table of conversions of NumPy array scalars for your system:
for name in dir(np):
    obj = getattr(np, name)
    if hasattr(obj, 'dtype'):
        try:
            if 'time' in name:
                npn = obj(0, 'D')
            else:
                npn = obj(0)
            nat = npn.item()
            print('{0} ({1!r}) -> {2}'.format(name, npn.dtype.char, type(nat)))
        except Exception:
            # skip names that cannot be called as obj(0)
            pass
There are a few NumPy types that have no native Python equivalent on some systems, including: clongdouble, clongfloat, complex192, complex256, float128, longcomplex, longdouble and longfloat. These need to be converted to their nearest NumPy equivalent before using .item().
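As a minimal sketch of that last step, assuming float64 is an acceptable nearest equivalent (any precision beyond float64 is lost in the cast):

```python
import numpy as np

# np.longdouble may be 80-bit or 128-bit depending on the platform
wide = np.longdouble(1.5)

# Cast to the nearest type that has a native equivalent, then call .item()
narrow = np.float64(wide)
pyval = narrow.item()

print(type(pyval), pyval)
```

The same pattern should apply to the complex variants, with np.complex128 as the target type.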
Converting native python types to numpy dtypes
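numpy.float is just the regular Python float type. It's not a NumPy dtype, and it's almost certainly not what you need. The alias was deprecated in NumPy 1.20 and removed in 1.24, so the check below only works on older versions: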
>>> import numpy
>>> numpy.float is float
True
If you want the dtype NumPy would coerce your scalar to, just make an array and get its dtype:
>>> numpy.array(7.7).dtype
dtype('float64')
If you want the type NumPy uses for scalars of this dtype, access the dtype's type attribute:
>>> numpy.array(7.7).dtype.type
<class 'numpy.float64'>
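The two lookups above can be wrapped in a small helper. This is only a sketch; the name numpy_scalar_type is made up for illustration and is not part of NumPy:

```python
import numpy as np

def numpy_scalar_type(value):
    """Return the NumPy scalar type that np.array() picks for a Python scalar."""
    return np.array(value).dtype.type

for sample in (7.7, True, 1 + 2j):
    print(repr(sample), '->', numpy_scalar_type(sample).__name__)

# Round trip: wrap a Python float in the type NumPy chose for it
wrapped = numpy_scalar_type(7.7)(7.7)
print(type(wrapped))
```

Python ints are left out of the loop on purpose, because the integer type NumPy picks for them can differ between platforms and versions.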
Easier way of converting numpy datatypes to native python datatypes
Call .item() on a NumPy scalar instead of using the scalar itself; it returns the closest native Python type:
numpyNum = np.float64(1.2)
pythonNum = numpyNum.item()
If every element of a list a is a NumPy scalar, a list comprehension converts them all:
pythonNativeTypeValues = [v.item() for v in a]
When the list holds a mix of types, check whether each element is a NumPy value before calling .item(), so the code looks as follows:
import numpy as np
import datetime
a = [np.float64(1.2), np.int64(123), 'blablabla', datetime.datetime.now()]
native = []
for val in a:
    if type(val).__module__ == np.__name__:
        val = val.item()
    native.append(val)
for val in native:
    print(type(val))
#<class 'float'>
#<class 'int'>
#<class 'str'>
#<class 'datetime.datetime'>
If you prefer a list comprehension, the code fits on one line:
native = [val.item() if type(val).__module__ == np.__name__ else val for val in a]
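An alternative to comparing module names is an isinstance check against np.generic, the base class of NumPy scalar types. The sketch below is a variation on the answer's approach, not a replacement for it, and the helper name to_native is hypothetical:

```python
import datetime
import numpy as np

def to_native(val):
    """Hypothetical helper: unwrap NumPy scalars, pass everything else through."""
    return val.item() if isinstance(val, np.generic) else val

mixed = [np.float64(1.2), np.int64(123), 'blablabla', datetime.datetime(2020, 1, 1)]
native = [to_native(v) for v in mixed]
print([type(v).__name__ for v in native])
```

Note that this check does not unwrap NumPy arrays themselves; for those, tolist() is the usual route.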
Cannot convert numpy dtypes to its native python types (int64 to int)
Amazon S3 seems to be sensitive to dtypes, so to make the column compatible you can first cast to int and then to object:
avg_Credit_Bal['No. of transactions'] = sum_Credit_Bal['No. of transactions'].astype(int).astype(object)
The column's dtype is now object, indicating that it holds generic Python objects. If you look at the type of an element:
type(avg_Credit_Bal['No. of transactions'][0])
it will output int rather than numpy.int64.
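To see what the double cast does without involving S3 or pandas, here is a small sketch on a plain NumPy array rather than the DataFrame column from the answer. The array name counts is made up for illustration:

```python
import numpy as np

counts = np.array([3, 5, 8], dtype=np.int64)

# Same two-step cast as above: first to int, then to object
as_objects = counts.astype(int).astype(object)

print(as_objects.dtype)       # dtype of the container
print(type(counts[0]))        # element type before the cast
print(type(as_objects[0]))    # element type after the cast
```

The container reports an object dtype, while the individual elements come back as plain Python ints.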
Convert list of numpy.float64 to float in Python quickly
The tolist() method should do what you want. If you have a numpy array, just call tolist():
In [17]: a
Out[17]:
array([ 0. , 0.14285714, 0.28571429, 0.42857143, 0.57142857,
0.71428571, 0.85714286, 1. , 1.14285714, 1.28571429,
1.42857143, 1.57142857, 1.71428571, 1.85714286, 2. ])
In [18]: a.dtype
Out[18]: dtype('float64')
In [19]: b = a.tolist()
In [20]: b
Out[20]:
[0.0,
0.14285714285714285,
0.2857142857142857,
0.42857142857142855,
0.5714285714285714,
0.7142857142857142,
0.8571428571428571,
1.0,
1.1428571428571428,
1.2857142857142856,
1.4285714285714284,
1.5714285714285714,
1.7142857142857142,
1.857142857142857,
2.0]
In [21]: type(b)
Out[21]: list
In [22]: type(b[0])
Out[22]: float
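The session above uses a 1-D float array. As a short sketch with made-up sample arrays, tolist() also converts nested and 0-d arrays to native types:

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]], dtype=np.int32)
nested = grid.tolist()           # nested lists, one level per dimension
scalar = np.array(5).tolist()    # a 0-d array gives a bare Python int, not a list

print(nested, type(nested[0][0]))
print(scalar, type(scalar))
```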
If, in fact, you really have a Python list of numpy.float64 objects, then @Alexander's answer is great, or you could convert the list to an array and then use the tolist() method. E.g.
In [46]: c
Out[46]:
[0.0,
0.33333333333333331,
0.66666666666666663,
1.0,
1.3333333333333333,
1.6666666666666665,
2.0]
In [47]: type(c)
Out[47]: list
In [48]: type(c[0])
Out[48]: numpy.float64
@Alexander's suggestion, a list comprehension:
In [49]: [float(v) for v in c]
Out[49]:
[0.0,
0.3333333333333333,
0.6666666666666666,
1.0,
1.3333333333333333,
1.6666666666666665,
2.0]
Or, convert to an array and then use the tolist() method.
In [50]: np.array(c).tolist()
Out[50]:
[0.0,
0.3333333333333333,
0.6666666666666666,
1.0,
1.3333333333333333,
1.6666666666666665,
2.0]
If you are concerned with the speed, here's a comparison. The input, x, is a python list of numpy.float64 objects:
In [8]: type(x)
Out[8]: list
In [9]: len(x)
Out[9]: 1000
In [10]: type(x[0])
Out[10]: numpy.float64
Timing for the list comprehension:
In [11]: %timeit list1 = [float(v) for v in x]
10000 loops, best of 3: 109 µs per loop
Timing for conversion to numpy array and then tolist():
In [12]: %timeit list2 = np.array(x).tolist()
10000 loops, best of 3: 70.5 µs per loop
So it is faster to convert the list to an array and then call tolist().
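The %timeit magic only works inside IPython. To repeat the comparison in a plain script, a sketch with the standard timeit module could look like this. The input is generated with np.linspace as an assumption, since the original x is not shown, and the numbers will vary by machine and NumPy version:

```python
import timeit
import numpy as np

x = list(np.linspace(0, 2, 1000))   # Python list of numpy.float64 objects

def with_comprehension():
    return [float(v) for v in x]

def with_tolist():
    return np.array(x).tolist()

runs = 1000
for func in (with_comprehension, with_tolist):
    seconds = timeit.timeit(func, number=runs)
    print('{0}: {1:.1f} us per call'.format(func.__name__, seconds / runs * 1e6))
```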
Convert numpy elements to numpy dtypes
If I add to your code a __repr__ method and some prints:
import numpy as np
from operator import attrgetter
class myobj():
    def __init__(self, value):
        self.myattr = value
    def __repr__(self):
        return self.myattr.__repr__()
obj_array = np.empty((3,3), dtype='object')
for i in range(obj_array.shape[0]):
    for j in range(obj_array.shape[1]):
        obj_array[i,j] = myobj(i+j)
native_type_array = np.frompyfunc(attrgetter('myattr'), 1, 1)(obj_array)
print(native_type_array.shape)
print(native_type_array.dtype)
print(native_type_array)
print(obj_array)
I get
1011:~/mypy$ python3 stack38332556.py
(3, 3)
object
[[0 1 2]
[1 2 3]
[2 3 4]]
[[0 1 2]
[1 2 3]
[2 3 4]]
native_type_array is also an object dtype array - that's what the doc for frompyfunc says it does. But since the elements are numbers, the display looks nice.
And by giving myobj a similar repr, I get the same thing. If I change the repr to
    def __repr__(self):
        return '<%s>'%self.myattr
I get:
I get:
[[<0> <1> <2>]
[<1> <2> <3>]
[<2> <3> <4>]]
This applies to lists as well. print([myobj(10),myobj(11)]) produces [<10>, <11>]
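If a numeric dtype is wanted rather than object, one option is an explicit astype() on the frompyfunc result. This is a sketch under the assumption that every extracted attribute is an integer; Box is a hypothetical stand-in for the myobj class:

```python
import numpy as np
from operator import attrgetter

class Box:
    """Hypothetical stand-in for the myobj class above."""
    def __init__(self, value):
        self.myattr = value

boxes = np.empty((2, 2), dtype=object)
for i in range(boxes.shape[0]):
    for j in range(boxes.shape[1]):
        boxes[i, j] = Box(i + j)

# frompyfunc always returns an object-dtype array
extracted = np.frompyfunc(attrgetter('myattr'), 1, 1)(boxes)

# Explicit cast away from object dtype
numeric = extracted.astype(np.int64)
print(extracted.dtype, numeric.dtype)
```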
Numpy dtype 'h' as dtype
Indeed, the numpy docs can be hard to navigate. Here's the main page on dtype, but it doesn't mention 'h'.
So to probe it experimentally:
import numpy as np
np.dtype('h')
--> dtype('int16')
It's a 16-bit signed integer.
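The same kind of probing can be extended a little. The sketch below inspects the dtype object's attributes and checks a few other one-character codes whose sizes are fixed:

```python
import numpy as np

code = np.dtype('h')
print(code.name, code.itemsize, code.kind)   # name, size in bytes, kind code

# The lookup also works in reverse: from a type to its character code
print(np.dtype(np.int16).char)

for char in 'bBhHfd':
    print(char, '->', np.dtype(char).name)
```

Codes such as 'i' and 'l' are left out because they map to C types whose width depends on the platform.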
Numpy Array get datatype by cell?
import numpy as np
arr = np.array([1, "sd", 3.6])
You'll notice that the values in this array are not numerics and strings, they're just strings.
>>> arr
array(['1', 'sd', '3.6'], dtype='<U32')
You'll also note that they're not python strings. There is a reason for this but it isn't important here.
>>> type(arr[1])
<class 'numpy.str_'>
>>> type(arr[1]) == str
False
You should not try to mix data types like you are doing. Use a list instead. The difference in data types that you have in your input list is lost when you turn it into an array. I note that you're calling an array element a 'cell' - it isn't, arrays don't work like spreadsheets.
That said, if you absolutely must do this:
arr = np.array([1, "sd", 3.6], dtype=object)
>>> arr
array([1, 'sd', 3.6], dtype=object)
This will keep all the array elements as python objects instead of using numpy dtypes.
>>> np.array([type(x) == str for x in arr])
array([False, True, False])
Then you can test the type of each element accordingly.
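A short sketch contrasting the two arrays may help here. One detail not spelled out above is that numpy.str_ is a subclass of str, so an isinstance check behaves differently from an exact type comparison:

```python
import numpy as np

arr = np.array([1, "sd", 3.6])       # everything becomes a NumPy string
elem = arr[1]
print(type(elem) == str)             # exact type comparison
print(isinstance(elem, str))         # subclass-aware check

obj_arr = np.array([1, "sd", 3.6], dtype=object)
print([type(x).__name__ for x in obj_arr])   # original Python types survive
```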
h5py: convert numpy data to native python types
Not sure if it would make sense for your use-case, but you might think about storing single scalars or strings as attributes:
http://www.h5py.org/docs/intro/quick.html#attributes