KeyError: None of [Int64Index...] dtype='int64] are in the columns

You created your scaled_inputs_all DataFrame using loc, so its index
most likely is not a run of consecutive integers.

On the other hand, you created shuffled_indices by shuffling
a plain range of consecutive integers.

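Remember that scaled_inputs_all[shuffled_indices] looks up the
elements of shuffled_indices as column labels, which is why the
error says they are not in the columns. Selecting by label with
loc would not help either, because the index values of
scaled_inputs_all do not match those consecutive integers.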

Maybe you should write:

scaled_inputs_all.iloc[shuffled_indices]

Note that iloc provides integer-location based indexing, regardless of
index values, i.e. just what you need.
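
To make this concrete, here is a minimal sketch that reproduces the error and the fix. The raw DataFrame and its columns are made up for illustration; only the names scaled_inputs_all and shuffled_indices come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data.
raw = pd.DataFrame({"a": range(6), "b": range(10, 16)})

# Selecting with loc keeps the original index labels: 1, 3, 5.
scaled_inputs_all = raw.loc[raw["a"] % 2 == 1]

# Shuffle a range of consecutive positions: some order of 0, 1, 2.
rng = np.random.default_rng(0)
shuffled_indices = np.arange(scaled_inputs_all.shape[0])
rng.shuffle(shuffled_indices)

# Plain [] treats the integers as column labels and raises KeyError.
try:
    scaled_inputs_all[shuffled_indices]
    raised = False
except KeyError:
    raised = True

# iloc selects by position, regardless of the index labels.
shuffled_inputs = scaled_inputs_all.iloc[shuffled_indices]
print(raised)
print(shuffled_inputs)
```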

Key Error: None of [Int64Index…] dtype='int64] are in the [columns]

Use df.loc[indexes] to select rows by their index labels. To select rows by their integer position, use df.iloc[indexes].
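
A small sketch of the difference, using a made-up DataFrame whose index labels are deliberately out of order:

```python
import pandas as pd

# Hypothetical data: index labels 7, 3, 9, 5 are not positions.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[7, 3, 9, 5])

# loc: rows whose index labels are 3 and 5.
by_label = df.loc[[3, 5]]

# iloc: the first and second rows, whatever their labels are.
by_position = df.iloc[[0, 1]]

print(by_label)
print(by_position)
```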

In addition to that, you can read this page on Indexing and Selecting data with pandas.

KeyError: None of [Int64Index([ 12313,\n , 34534],\n dtype='int64', leng

This post answers it in a slightly different way, but one of its comments answers my question.

  • Receiving KeyError: "None of [Int64Index([ ... dtype='int64', length=1323)] are in the [columns]" @bubble

  • When you load your data, X and y have to be NumPy arrays, not DataFrames.

X = mydataframe.drop(['acol','bcol'], axis=1).values 
y = mydataframe['targetvalue'].values
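
As a rough illustration of why this works: once the data are NumPy arrays, square brackets index by position, so an array of fold indices selects rows. The column contents and the train array below are invented, and unlike the snippet above this sketch also drops the target column from X.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names follow the snippet above.
mydataframe = pd.DataFrame(
    {
        "acol": ["p", "q", "r", "s"],
        "bcol": [1, 2, 3, 4],
        "feature": [0.1, 0.2, 0.3, 0.4],
        "targetvalue": [0, 1, 0, 1],
    },
    index=[10, 20, 30, 40],
)

# .values returns plain NumPy arrays, so the index labels are gone.
X = mydataframe.drop(["acol", "bcol", "targetvalue"], axis=1).values
y = mydataframe["targetvalue"].values

# Stand-in for the index array a cross-validation splitter would yield.
train = np.array([0, 2])

# On NumPy arrays, [] with an integer array selects rows by position.
X_train, y_train = X[train], y[train]
print(X_train, y_train)
```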

Receiving KeyError: None of [Int64Index([ ... dtype='int64', length=1323)] are in the [columns]

In this piece of code, train and test are arrays of positional indices, but you are using them as column labels when selecting from the DataFrame:

for train, test in kf.split(X, Y):
    probas_ = model.fit(X[train], Y[train]).predict_proba(X[test])

You should use iloc instead:

    probas_ = model.fit(X.iloc[train], Y.iloc[train]).predict_proba(X.iloc[test])
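
Put together, the loop might look like the sketch below. The data, the choice of KFold with 4 splits, and LogisticRegression as the model are all assumptions for illustration; the question does not show how kf and model were built.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Hypothetical data with a non-default index (100..119).
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(20, 2)), columns=["f1", "f2"], index=range(100, 120)
)
Y = pd.Series([0, 1] * 10, index=X.index)

kf = KFold(n_splits=4)
model = LogisticRegression()

fold_shapes = []
for train, test in kf.split(X, Y):
    # train and test hold positions, so select rows with iloc.
    probas_ = model.fit(X.iloc[train], Y.iloc[train]).predict_proba(X.iloc[test])
    fold_shapes.append(probas_.shape)

print(fold_shapes)
```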

KeyError: None of [Int64Index dtype='int64', length=9313)] are in the [columns]

It seems like you have a DataFrame slicing issue rather than something wrong with StratifiedKFold itself. I crafted a df to reproduce it and solved it by using iloc to select rows with the arrays of indices:

from sklearn import model_selection

# The list of some column names in flag
flag = ["raw_sentence", "score"]
x = df.loc[:, ~df.columns.isin(flag)].copy()
y = df[flag].copy()
skf = model_selection.StratifiedKFold(n_splits=2, random_state=None, shuffle=False)
for train_index, test_index in skf.split(x, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    x_train, x_test = x.iloc[list(train_index)], x.iloc[list(test_index)]

train_index and test_index are NumPy ndarrays, so I convert them to lists before passing them to iloc, although iloc also accepts integer arrays directly.
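
A quick check with made-up data suggests the list conversion is a matter of preference rather than a requirement; the DataFrame and index values here are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with non-consecutive index labels.
x = pd.DataFrame({"feature": [1.0, 2.0, 3.0, 4.0]}, index=[11, 22, 33, 44])

# Stand-in for an index array yielded by skf.split.
train_index = np.array([0, 3])

from_array = x.iloc[train_index]        # ndarray passed directly
from_list = x.iloc[list(train_index)]   # converted to a list first

same = from_array.equals(from_list)
print(same)
```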

You may refer to: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html
