Key Error When Selecting Columns in Pandas Dataframe After Read_Csv

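Use sep=r'\s*,\s*' so that whitespace around each comma is treated as part of the separator; this also removes stray spaces from the column names. A regular-expression separator requires engine='python':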

import pandas as pd

transactions = pd.read_csv('transactions.csv', sep=r'\s*,\s*',
                           header=0, encoding='ascii', engine='python')

Alternatively, make sure the CSV file contains no unquoted spaces around the commas, and then use your original command unchanged.

To verify, print the column names:

print(transactions.columns.tolist())

Output:

['product_id', 'customer_id', 'store_id', 'promotion_id', 'month_of_year', 'quarter', 'the_year', 'store_sales', 'store_cost', 'unit_sales', 'fact_count']
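
The effect of the separator can be reproduced without the original file. The following is a minimal sketch that assumes a small in-memory CSV (the sample rows are made up, not taken from transactions.csv) with a space after each comma. It compares the default parse, the regular-expression separator, and skipinitialspace=True, a read_csv option that skips spaces following the delimiter (it does not handle spaces before the comma).

```python
import io

import pandas as pd

# Hypothetical sample data with a space after every comma.
csv_text = "product_id, customer_id, store_sales\n1, 10, 2.5\n2, 20, 3.5\n"

# Default parsing keeps the leading space in the column names,
# so plain["customer_id"] would raise a KeyError.
plain = pd.read_csv(io.StringIO(csv_text))
print(plain.columns.tolist())

# A regex separator swallows whitespace on both sides of the comma.
cleaned = pd.read_csv(io.StringIO(csv_text), sep=r"\s*,\s*", engine="python")
print(cleaned.columns.tolist())

# skipinitialspace only drops spaces that follow the delimiter.
skipped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(skipped.columns.tolist())
```

The regular-expression separator forces the slower Python engine, so skipinitialspace may be preferable for large files when the spaces only appear after the commas.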

KeyError When Selecting a Column

There appears to be whitespace in your column names. You can remove whitespace as follows:

df_ret.columns = df_ret.columns.str.strip()

You can then access the series as expected:

print(df_ret['Cohorts Retention Rate'])
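
As a rough illustration of why the lookup fails before stripping, here is a small sketch. The frame contents are invented for the example; only the column name comes from the answer above.

```python
import pandas as pd

# Hypothetical frame whose column name carries surrounding spaces.
df_ret = pd.DataFrame({" Cohorts Retention Rate ": [0.5, 0.4]})

# The exact label does not exist yet, so the lookup fails.
try:
    df_ret["Cohorts Retention Rate"]
    failed_before = False
except KeyError:
    failed_before = True

# Strip leading and trailing whitespace from every column label.
df_ret.columns = df_ret.columns.str.strip()

rate = df_ret["Cohorts Retention Rate"]
print(rate.tolist())
```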

KeyError while reading a CSV file in Python

Looking at the df.info output, the separator appears to be a semicolon, at least judging by the column name. The column name should also be "Time (s)". Please try:

import numpy as np
import pandas as pd

data = pd.read_csv("Test.csv", sep=";")
z = -0.01 * np.linspace(1, 11, 11)
x = data['Time (s)']
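
A wrong separator usually shows up as a single column whose name contains the real delimiter. The sketch below uses made-up in-memory data (the "Signal" column is hypothetical) to show the symptom and the fix.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data.
csv_text = "Time (s);Signal\n0.0;1.5\n0.1;2.5\n"

# With the default comma separator the whole header becomes one column name.
wrong = pd.read_csv(io.StringIO(csv_text))
print(wrong.columns.tolist())

# With sep=";" the header splits into the expected columns.
right = pd.read_csv(io.StringIO(csv_text), sep=";")
x = right["Time (s)"]
print(right.columns.tolist())
```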

Key error when printing columns in dataframe after read_csv()

This data is in fixed-width format, so use read_fwf with skiprows:

h = pd.read_fwf('history.data', skiprows=5)
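
For illustration, here is a minimal sketch of read_fwf on in-memory text. The layout is invented (two preamble lines instead of the five in history.data) and the column boundaries are left to pandas to infer from the whitespace.

```python
import io

import pandas as pd

# Hypothetical fixed-width file: two preamble lines, then an aligned table.
text = (
    "report generated by some tool\n"
    "units: arbitrary\n"
    "name   value\n"
    "alpha      1\n"
    "beta      22\n"
)

# skiprows drops the preamble so the first remaining line becomes the header.
h = pd.read_fwf(io.StringIO(text), skiprows=2)
print(h.columns.tolist())
print(h)
```

If the inferred boundaries are wrong for a real file, read_fwf also accepts explicit colspecs or widths.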

KeyError when indexing Pandas dataframe

You most likely have an extra character at the beginning of your file that is prepended to your first column name, 'Date'. Simply copying and pasting your output into a non-Unicode console produces:

Index([u'?Date', u'Open', u'High', u'Low', u'Close', u'Volume'], dtype='object')
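
The answer does not identify the stray character. One common candidate is a UTF-8 byte order mark (BOM); the sketch below rests on that assumption and uses invented file contents. It decodes with the 'utf-8-sig' codec, which drops a leading BOM, and prints the labels with repr() so that invisible characters become visible.

```python
import io

import pandas as pd

# Hypothetical file contents: a UTF-8 BOM followed by ordinary CSV text.
raw = b"\xef\xbb\xbfDate,Open,Close\n2020-01-02,1.0,2.0\n"

# 'utf-8-sig' decodes UTF-8 and removes a leading BOM if one is present.
df = pd.read_csv(io.BytesIO(raw), encoding="utf-8-sig")

# repr() exposes hidden characters in the column labels.
print([repr(c) for c in df.columns])

# Fallback for a frame that is already loaded: strip a BOM from the labels.
df.columns = df.columns.str.lstrip("\ufeff")
dates = df["Date"]
```

If the extra character turns out to be something else, inspecting repr(df.columns[0]) should show what needs to be removed.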

How can I strip the whitespace from Pandas DataFrame headers?

You can give functions to the rename method. The str.strip() method should do what you want:

In [5]: df
Out[5]:
   Year  Month  Value
0     1      2      3

[1 rows x 3 columns]

In [6]: df.rename(columns=lambda x: x.strip())
Out[6]:
   Year  Month  Value
0     1      2      3

[1 rows x 3 columns]

Note that this returns a new DataFrame, which is displayed as output, but the change is not applied to the columns of df itself. To keep the change, either use the call in a method chain or reassign the df variable:

df = df.rename(columns=lambda x: x.strip())
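
The difference between calling rename and reassigning its result can be seen in a short sketch. The frame here is hypothetical, with trailing spaces added to two of the labels.

```python
import pandas as pd

# Hypothetical frame with trailing spaces in two column labels.
df = pd.DataFrame([[1, 2, 3]], columns=["Year", "Month ", "Value "])

# rename returns a new frame; df itself is left untouched.
stripped = df.rename(columns=lambda x: x.strip())
before = df.columns.tolist()

# Reassigning makes the cleaned labels stick.
df = df.rename(columns=lambda x: x.strip())
after = df.columns.tolist()

print(before)
print(after)
```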

