pandas.read_excel parameter "sheet_name" not working

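It looks like you're using an old version of pandas (the parameter name depends on the pandas version, not the Python version). Before pandas 0.21.0 the keyword was sheetname, so try changing your code to:

df = pd.read_excel(file_with_data, sheetname=sheet_with_data)

That should work on the older version. On pandas 0.21.0 and later, use sheet_name instead.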

Pandas cannot read specific Excel sheet

From the docs:

sheetname : string, int, mixed list of strings/ints, or None, default 0

Deprecated since version 0.21.0: Use sheet_name instead

That also means the parameter was called sheetname before version 0.21.0 ;-)
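To make the version cutoff concrete, here is a minimal sketch of a helper that picks the keyword name from a pandas version string. The name sheet_keyword is hypothetical (it is not part of pandas), and the sketch assumes the version string starts with a numeric major.minor pair.

```python
def sheet_keyword(pandas_version):
    """Return the read_excel keyword name matching a pandas version string.

    Hypothetical helper, not part of pandas. Assumes the string begins
    with numeric "major.minor", e.g. "0.20.3" or "2.1.0".
    """
    major, minor = (int(part) for part in pandas_version.split(".")[:2])
    # sheet_name replaced sheetname in pandas 0.21.0
    return "sheet_name" if (major, minor) >= (0, 21) else "sheetname"


print(sheet_keyword("0.20.3"))  # sheetname
print(sheet_keyword("0.21.0"))  # sheet_name
```

It could then be used as pd.read_excel(path, **{sheet_keyword(pd.__version__): 'Sheet1'}), although upgrading pandas and always writing sheet_name is usually the simpler fix.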

Pandas Read Excel Read Second Tab Ignoring the Sheet Name

pd.ExcelFile has a .parse() method that works like pd.read_excel(). Both accept the sheet_name parameter, which supports several ways of selecting one or more sheets to import. In your case you want to refer to the sheet by position, so pass sheet_name an integer. pandas numbers the sheets starting at 0, so the second sheet is selected with sheet_name=1:

pd.ExcelFile('File Path').parse(sheet_name=1)

This is equivalent to:

pd.read_excel('File Path', sheet_name=1)

The sheet_name parameter and the other parameters for reading Excel files are described in the pandas docs.

Is there a way to load sheets with a specific regex with pandas.read_excel()

You can use pandas.ExcelFile to peek at the sheet names, then select the sheets to keep with any method (here, your regex), and finally load them with pandas.read_excel:

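import re

import pandas as pd

# Open the workbook once so the sheet names can be inspected before loading
xl = pd.ExcelFile('filename.xlsx')

# Note: regex.match only matches at the start of the sheet name
regex = re.compile('your_regex')

sheets = [n for n in xl.sheet_names if regex.match(n)]
# ['matching_sheet1', 'matching_sheet2']

# Passing a list of names returns a dict of DataFrames keyed by sheet name
dfs = pd.read_excel(xl, sheet_name=sheets)
# {'matching_sheet1': DataFrame1,
#  'matching_sheet2': DataFrame2}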
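One detail worth checking is how the regex is applied: match only tests the start of the name, search looks anywhere in it, and fullmatch requires the whole name to fit. The sketch below works on a plain list of names so it runs without a workbook. The helper select_sheets and the sample names are made up for illustration.

```python
import re


def select_sheets(sheet_names, pattern, whole_name=False):
    """Return the sheet names that fit pattern (hypothetical helper).

    whole_name=False uses re.search (pattern may occur anywhere);
    whole_name=True uses re.fullmatch (pattern must cover the whole name).
    """
    regex = re.compile(pattern)
    matcher = regex.fullmatch if whole_name else regex.search
    return [name for name in sheet_names if matcher(name)]


names = ["2021_sales", "sales_2022", "notes"]  # stand-in for xl.sheet_names

print(select_sheets(names, r"sales"))
# ['2021_sales', 'sales_2022']
print(select_sheets(names, r"sales_\d{4}", whole_name=True))
# ['sales_2022']
# For comparison, re.match only anchors at the start of the name:
print([n for n in names if re.match(r"sales", n)])
# ['sales_2022']
```

The resulting list can be passed as sheet_name to pandas.read_excel in the same way as sheets above.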

Cannot read all sheets of the excel file using pandas

Solution:

# output is a nested list: one list of dictionaries per sheet
# (conf is the collection of column names to keep, defined elsewhere)
def extract(self, file_name):
    raw_excel = pd.read_excel(file_name, sheet_name=None)
    return [v[v.columns.intersection(conf)].to_dict(orient='records')
            for k, v in raw_excel.items()]

Explanation:

With sheet_name=None the output is a dictionary of DataFrames, here raw_excel.

The list comprehension loops over the dictionary with items(), so k is each key (the sheet name) and v is each value (the DataFrame).

Index.intersection keeps only those columns of each DataFrame that also appear in conf.

Finally, to_dict(orient='records') turns each DataFrame into a list of dictionaries, so the function returns a list of lists of dictionaries.
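To see the column filtering and to_dict step in isolation, here is a small sketch that replaces the Excel file with an in-memory dict of DataFrames, which is the same shape that sheet_name=None returns. The contents of conf and the sample data are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical list of wanted columns (stands in for the asker's conf)
conf = ["id", "email"]

# Stand-in for pd.read_excel(file_name, sheet_name=None)
raw_excel = {
    "Sheet1": pd.DataFrame({"id": [1, 2], "name": ["a", "b"]}),
    "Sheet2": pd.DataFrame({"id": [3], "email": ["c@example.com"]}),
}

# Keep only the columns listed in conf, then convert each sheet to records
nested = [
    df[df.columns.intersection(conf)].to_dict(orient="records")
    for df in raw_excel.values()
]

print(nested)
```

Sheet1 has no email column, so its records contain only id; Index.intersection skips wanted columns that are missing instead of raising a KeyError.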

If you need a flat output, you can use the usual list-flattening idiom, where t is the nested list:

flat_list = [item for sublist in t for item in sublist]

So the code becomes:

# output is a flat list of dictionaries
def extract(self, file_name):
    raw_excel = pd.read_excel(file_name, sheet_name=None)
    return [x for k, v in raw_excel.items()
            for x in v[v.columns.intersection(conf)].to_dict(orient='records')]
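As an alternative to the double comprehension, the flattening can also be written with itertools.chain.from_iterable from the standard library. This is only a sketch: per_sheet below is hand-written sample data standing in for the per-sheet to_dict results.

```python
from itertools import chain

# Stand-in for the nested output: one list of records per sheet
per_sheet = [
    [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    [{"id": 3, "name": "c"}],
]

# chain.from_iterable concatenates the inner lists lazily
flat = list(chain.from_iterable(per_sheet))

print(flat)
# [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 3, 'name': 'c'}]
```

Both forms give the same result; which one reads better is mostly a matter of taste.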

Why won't my script run for every sheet in an Excel file with pandas

If you set sheet_name=None, df is not a DataFrame but a dict of DataFrames whose keys are the sheet names.

From the documentation:

Returns DataFrame or dict of DataFrames

DataFrame from the passed in Excel file. See notes in sheet_name argument for more information on when a dict of DataFrames is returned.

>>> dfs = pd.read_excel('Data1.xlsx', sheet_name=None)
>>> type(dfs)
dict

>>> dfs.keys()
dict_keys(['Sheet1', 'Sheet2', 'Sheet3'])

>>> dfs['Sheet1']
   id first_name   last_name
0   1    Roxanna  Calderbank
1   2       Hali   Kilmartin
2   3       Moss      Hatzar
3   4       Kari    Giordano
4   5      Dylan     Witnall
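Since the result is a dict, running the same logic on every sheet comes down to looping over its items. A minimal sketch, using an in-memory dict in place of the real workbook and a hypothetical process function standing in for your script's per-sheet logic:

```python
import pandas as pd


def process(df):
    """Hypothetical per-sheet step; here it just counts the rows."""
    return len(df)


# Stand-in for pd.read_excel('Data1.xlsx', sheet_name=None)
dfs = {
    "Sheet1": pd.DataFrame({"id": [1, 2, 3]}),
    "Sheet2": pd.DataFrame({"id": [4]}),
}

# Apply the same function to every sheet, keeping the sheet names as keys
row_counts = {name: process(df) for name, df in dfs.items()}

print(row_counts)  # {'Sheet1': 3, 'Sheet2': 1}
```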

