Dataframe Set_Index Not Setting

Dataframe set_index not setting

You need to either specify inplace=True, or assign the result to a variable. Try:

df.set_index('Timestamp', inplace=True, drop=True)

Basically, there are two things that you might want to do when you set the index. One is new_df = old_df.set_index('Timestamp', inplace=False). I.e. You want a new DataFrame that has the new index, but still want a copy of the original DataFrame. The other is df.set_index('Timestamp', inplace=True). Which is for when you want to modify the existing object.

df.set_index() Not Working as What I Expected

You need assign back:

city_prop = city_prop.set_index('index')

Or:

city_prop.set_index('index', inplace = True)

EDIT:

df = pd.read_csv('CityProperEskwenilaExtraIndicators.csv', 
skiprows=1,
header=None,
sep=';',
index_col=[0,1]).T

print (df.head())
0 Barangay Longitude Latitude Poverty rate Terrain type \
1 # See annex See annex Per 100 inhabitants See annex
2 1 27,67231183 66,3112793 18 Difficult
3 2 65,15620167 53,32027629 54 Difficult
4 3 34,94438385 89,7970517 63 Difficult
5 4 10,97542641 84,26323733 42 Normal
6 5 26,05436012 61,30689679 70 Difficult

0 Roads needing repair Access to WASH Access to clean water \
1 kilometers of road % of population % of population
2 55,40469584 50,2 71,2
3 14,08228761 51,8 88,9
4 33,20044684 77 97,4
5 1,695918463 74,7 52,1
6 85,08259271 70,1 99,3

0 Violent incidents Homicides
1 rate per 100K rate per 100K
2 7,72 6,833797715
3 8,3 5,513650409
4 3,72 2,931838433
5 6,26 5,883509349
6 6,55 5,348430398

#replace ,
df = df.replace(',','.', regex=True)
#remove second level
df.columns = df.columns.droplevel(1)
#convert columns to numeric
excluded = ['Terrain type','Poverty rate']
cols = df.columns.difference(excluded)
#to floats
df[cols] = df[cols].astype(float)
#to integer
df['Poverty rate'] = df['Poverty rate'].astype(int)
print (df.head())
0 Barangay Longitude Latitude Poverty rate Terrain type \
2 1.0 27.672312 66.311279 18 Difficult
3 2.0 65.156202 53.320276 54 Difficult
4 3.0 34.944384 89.797052 63 Difficult
5 4.0 10.975426 84.263237 42 Normal
6 5.0 26.054360 61.306897 70 Difficult

0 Roads needing repair Access to WASH Access to clean water \
2 55.404696 50.2 71.2
3 14.082288 51.8 88.9
4 33.200447 77.0 97.4
5 1.695918 74.7 52.1
6 85.082593 70.1 99.3

0 Violent incidents Homicides
2 7.72 6.833798
3 8.30 5.513650
4 3.72 2.931838
5 6.26 5.883509
6 6.55 5.348430

print (df.dtypes)
0
Barangay float64
Longitude float64
Latitude float64
Poverty rate int32
Terrain type object
Roads needing repair float64
Access to WASH float64
Access to clean water float64
Violent incidents float64
Homicides float64
dtype: object

Pandas dataframe set_index not accepting array

The problem here is passing a numpy array to set_index. Convert it to a list and it should work.

So replace

df.set_index(indexes, inplace=True)

with

df.set_index(indexes.tolist(), inplace=True)

Pandas set_index does not set the index

set_index is not inplace (unless you pass inplace=True). otherwise all correct

In [7]: df = df.set_index(pd.DatetimeIndex(df['b']))

In [8]: df
Out[8]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 100 entries, 2013-06-14 09:10:23.523845 to 2013-06-14 10:12:51.650043
Data columns (total 2 columns):
b 100 non-null values
c 100 non-null values
dtypes: datetime64[ns](1), int64(1)

also as a FYI, in forthcoming 0.12 release (next week),
you can pass unit=us to specify units of microseconds since epoch

In [13]: pd.to_datetime(a,unit='us')
Out[13]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-14 13:10:23.523845, ..., 2013-06-14 14:12:51.650043]
Length: 100, Freq: None, Timezone: None

set_index does not effectively set the index

Try to set index earlier:

data = pd.read_csv('djia_data.csv', index_col='date', parse_dates=['date'])

When i use set_index,I am not able to create a seperate dataframe with set_index column name

Since you have set 'Account' to be the index you can't select it as a column, but you only need to select the column 'd' and the dates will appear as well. To make 'Account' a column, just duplicate it from the index.

inp['account'] = inp.index
inp = inp[['account', 'd']]

problem in setting index in pandas DataFrame

To set the DataFrame index (row labels) using one or more existing columns.

You can use the DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)

Where the parameters define:

keys : column label or list of column labels / arrays drop : boolean,
default True

Delete columns to be used as the new index

append : boolean, default False

Whether to append columns to existing index

inplace : boolean, default False

Modify the DataFrame in place (do not create a new object)

verify_integrity : boolean, default False

Check the new index for duplicates. Otherwise defer the check until
necessary. Setting to False will improve the performance of this
method

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Country':['Nigeria','Bangladesh','China'],
'population':[89765,98744,654520],
'Birth_Rate':[23.54,34.43,20.3],
'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
columns = ['Country','population','Birth_Rate','Update_Date'])

df2 = pd.DataFrame({'Country':['India','Sri Lanka','Dubai'],
'population':[98343,2453,57432],
'Birth_Rate':[33.54,44.44,23.3],
'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
columns =['Country','population','Birth_Rate','Update_Date'])

df3 = df2.append(df1)
df3.set_index('Country', inplace = True,
append = True, drop = True)
print(df3)

OUTPUT:

Sample Image



Related Topics



Leave a reply



Submit