Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"

Constructing pandas DataFrame from values in variables gives ValueError: If using all scalar values, you must pass an index

The error message says that if you pass only scalar values, you also have to pass an index, because pandas cannot otherwise tell how many rows the frame should have. So you can either avoid scalar values for the columns, e.g. wrap each one in a list:

>>> a, b = 2, 3
>>> df = pd.DataFrame({'A': [a], 'B': [b]})
>>> df
   A  B
0  2  3

or use scalar values and pass an index:

>>> df = pd.DataFrame({'A': a, 'B': b}, index=[0])
>>> df
   A  B
0  2  3
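To see the failure and both fixes side by side, here is a minimal sketch. The values 2 and 3 are taken from the output above; the variable names other than a and b are made up for illustration.

```python
import pandas as pd

a, b = 2, 3  # example scalars, matching the values shown above

# All-scalar dict with no index: pandas cannot infer the number of rows.
try:
    pd.DataFrame({"A": a, "B": b})
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Fix 1: wrap each scalar in a one-element list.
df_lists = pd.DataFrame({"A": [a], "B": [b]})

# Fix 2: keep the scalars and say which row labels to use.
df_index = pd.DataFrame({"A": a, "B": b}, index=[0])

# With a longer index, each scalar is repeated on every row.
df_broadcast = pd.DataFrame({"A": a, "B": b}, index=[0, 1, 2])

print(error_message)
print(df_lists)
print(df_index)
print(df_broadcast)
```

The last frame shows why the index matters: it is the only thing that tells pandas how many rows to create from scalar inputs.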

Getting Error: ValueError: If using all scalar values, you must pass an index when converting ndarray to pandas Dataframe

Your sub_iti, tw, and col_iti are all 2D NumPy arrays. However, when you do:

df = pd.DataFrame({'sub_iti': sub_iti,
                   'tw': tw,
                   'col_iti': col_iti})

pandas expects each value to be a 1D NumPy array or a list, since that is what a DataFrame column is. You can try:

df = pd.DataFrame({'sub_iti': sub_iti.tolist(),
                   'tw': tw.tolist(),
                   'col_iti': col_iti.tolist()})

Output:

  sub_iti    tw col_iti
0    [s1]  [xx]    [TA]
1    [s1]  [xx]   [BAT]
2    [s1]  [xx]     [T]
3    [s1]  [cc]    [TA]
4    [s1]  [cc]   [BAT]
5    [s1]  [cc]     [T]

But I think you should get rid of the lists inside each cell by using ravel() instead of tolist():

df = pd.DataFrame({'sub_iti': sub_iti.ravel(),
                   'tw': tw.ravel(),
                   'col_iti': col_iti.ravel()})

Output:

  sub_iti  tw col_iti
0      s1  xx      TA
1      s1  xx     BAT
2      s1  xx       T
3      s1  cc      TA
4      s1  cc     BAT
5      s1  cc       T
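The ravel() approach can be tried on its own with the sketch below. The arrays are hypothetical stand-ins built to match the output shown above (shape (6, 1) each); the real arrays in the question may be shaped differently, and the exact error raised for 2D input may vary between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the arrays in the question: each has shape (6, 1).
sub_iti = np.array([["s1"]] * 6)
tw = np.array([["xx"]] * 3 + [["cc"]] * 3)
col_iti = np.array([["TA"], ["BAT"], ["T"]] * 2)

# ravel() flattens each (6, 1) array to shape (6,), one value per row.
df = pd.DataFrame({
    "sub_iti": sub_iti.ravel(),
    "tw": tw.ravel(),
    "col_iti": col_iti.ravel(),
})

print(sub_iti.shape, sub_iti.ravel().shape)
print(df)
```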

Dictionary to Dataframe Error: If using all scalar values, you must pass an index

This error occurs because pandas needs an index, that is, a label for each row. At first this seems confusing because the word suggests list indexing. What pandas is really asking for is one row label per entry in the data. You can set this like so:

import pandas as pd
letters = ['a', 'b', 'c', 'd']  # avoid naming a variable `list`; it shadows the built-in
df = pd.DataFrame(letters, index=[0, 1, 2, 3])

The resulting DataFrame looks like this:

   0
0  a
1  b
2  c
3  d

For you specifically, this might look something like this using numpy (not tested):

import numpy as np

list_of_dfs = {}

# read each file into its own DataFrame, keyed by position
for i in range(len(regionLoadArray)):
    list_of_dfs[i] = pd.read_csv(regionLoadArray[i])

# one index label per DataFrame read
ind = np.arange(len(list_of_dfs))

dataframe = pd.DataFrame(list_of_dfs, index=ind)
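If the goal is to combine the per-file frames into a single DataFrame, pd.concat is a commonly used alternative to the constructor. This is a different technique from the one above, shown only as a sketch: the in-memory CSV text stands in for the files in regionLoadArray, and the column names and level names are made up.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the CSV files listed in regionLoadArray.
region_csvs = [
    StringIO("hour,load\n0,10\n1,12\n"),
    StringIO("hour,load\n0,20\n1,22\n"),
]

# One DataFrame per source, keyed by its position in the list.
frames = {i: pd.read_csv(src) for i, src in enumerate(region_csvs)}

# concat stacks the frames; the dict keys become the outer index level.
combined = pd.concat(frames, names=["region", "row"])

print(combined)
```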

pandas read_json: If using all scalar values, you must pass an index

Try

ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')

That file only contains key value pairs where values are scalars. You can convert it to a dataframe with ser.to_frame('count').

You can also do something like this:

import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)

Now data is a dictionary. You can pass it to a dataframe constructor like this:

df = pd.DataFrame({'count': data})
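Both routes can be checked without the original file. The sketch below uses a small made-up JSON string with the same shape (every value a scalar) and wraps it in StringIO so that read_json treats it as a file; the keys and values are assumptions, not the contents of people_wiki_map_index_to_word.json.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical JSON with the same shape as the file above: every value is a scalar.
raw = '{"apple": 0, "banana": 1, "cherry": 2}'

# Route 1: typ='series' reads the key/value pairs as a Series indexed by key,
# and to_frame turns it into a one-column DataFrame named 'count'.
ser = pd.read_json(StringIO(raw), typ="series")
df_from_series = ser.to_frame("count")

# Route 2: load a plain dict and nest it under the column name.
data = json.loads(raw)
df_from_dict = pd.DataFrame({"count": data})

print(df_from_series)
print(df_from_dict)
```

In both cases the JSON keys end up as the row index, which is what supplies the index the constructor was missing.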

how to solve If using all scalar values, you must pass an index problem pandas

Try wrapping the dictionary's values in lists when they are scalars:

from ast import literal_eval

vals = literal_eval(d[1].strip())
df = pd.DataFrame(
    {k: v if isinstance(v, (list, tuple)) else [v] for k, v in vals.items()}
)
print(df)
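Here is a self-contained version of the same idea. The input string is hypothetical (in the question the text comes from d[1]), and the name wrapped is introduced only for this sketch.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical input line; in the question the text comes from d[1].
raw = "{'name': 'Ann', 'age': 30, 'tags': ['x']}  \n"

vals = literal_eval(raw.strip())

# Wrap bare scalars in one-element lists; leave lists and tuples untouched.
wrapped = {k: v if isinstance(v, (list, tuple)) else [v] for k, v in vals.items()}

df = pd.DataFrame(wrapped)
print(df)
```

Note that this only works when any values that are already lists or tuples have the same length as the wrapped scalars (here, one element each).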

ValueError: If using all scalar values, you must pass an index

The problem is that when you use the DataFrame constructor:

pd.DataFrame({m: eurusd.interpolate(method=m) for m in methods})

the value for each m is a DataFrame, which the constructor treats like a scalar value. That is admittedly confusing. For each dictionary value, this constructor expects some sort of sequence or a Series. The following should solve the problem:

pd.DataFrame({m: eurusd['BID-CLOSE'].interpolate(method=m) for m in methods})

This works because selecting a single column returns a Series. So, for example, instead of:

In [34]: pd.DataFrame({'linear':df.interpolate('linear')})
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-34-4b6c095c6da3> in <module>()
----> 1 pd.DataFrame({'linear':df.interpolate('linear')})

/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
222 dtype=dtype, copy=copy)
223 elif isinstance(data, dict):
--> 224 mgr = self._init_dict(data, index, columns, dtype=dtype)
225 elif isinstance(data, ma.MaskedArray):
226 import numpy.ma.mrecords as mrecords

/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
358 arrays = [data[k] for k in keys]
359
--> 360 return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
361
362 def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
5229 # figure out the index, if necessary
5230 if index is None:
-> 5231 index = extract_index(arrays)
5232 else:
5233 index = _ensure_index(index)

/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in extract_index(data)
5268
5269 if not indexes and not raw_lengths:
-> 5270 raise ValueError('If using all scalar values, you must pass'
5271 ' an index')
5272

ValueError: If using all scalar values, you must pass an index

Use this instead:

In [35]: pd.DataFrame({'linear':df['BID-CLOSE'].interpolate('linear')})
Out[35]:
linear
timestamp
2016-10-10 22:00:00 1.309710
2016-10-10 22:00:00 1.319710
2016-10-10 22:00:00 1.317210
2016-10-10 22:00:00 1.317710
2016-10-10 22:00:00 1.314110
2016-10-10 22:00:00 1.313010
2016-10-10 22:00:00 1.311910
2016-10-10 22:00:00 1.310810
2016-10-10 22:00:00 1.309710
2016-10-10 22:00:00 1.311310
2016-10-10 22:00:00 1.314910
2016-10-10 22:00:00 1.320210
2016-10-10 22:00:00 1.322577
2016-10-10 22:00:00 1.324943
2016-10-10 22:00:00 1.327310
2016-10-10 22:00:00 1.327310
2016-10-10 22:00:00 1.317010
2016-10-10 22:00:00 1.308310

Fair warning, though, I am getting a LinAlgError: singular matrix error when I try 'quadratic' and 'cubic' interpolation on your data. Not sure why though.
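For a version that can be run without the original data, here is a minimal sketch of building one column per interpolation method. The small eurusd frame is a made-up stand-in with gaps, and the two methods were chosen because they do not need SciPy.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the eurusd frame: one price column with gaps.
idx = pd.date_range("2016-10-10", periods=5, freq="D")
eurusd = pd.DataFrame({"BID-CLOSE": [1.30, np.nan, 1.32, np.nan, 1.34]}, index=idx)

methods = ["linear", "time"]  # both work without SciPy

# Selecting the column first gives a Series, so each dict value is 1-D.
result = pd.DataFrame(
    {m: eurusd["BID-CLOSE"].interpolate(method=m) for m in methods}
)

print(result)
```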


