Reading an Excel file in python using pandas
Close: first you call ExcelFile, but then you call the .parse method and pass it the sheet name.
>>> xl = pd.ExcelFile("dummydata.xlsx")
>>> xl.sheet_names
[u'Sheet1', u'Sheet2', u'Sheet3']
>>> df = xl.parse("Sheet1")
>>> df.head()
                  Tid  dummy1    dummy2    dummy3    dummy4    dummy5  \
0 2006-09-01 00:00:00       0  5.894611  0.605211  3.842871  8.265307
1 2006-09-01 01:00:00       0  5.712107  0.605211  3.416617  8.301360
2 2006-09-01 02:00:00       0  5.105300  0.605211  3.090865  8.335395
3 2006-09-01 03:00:00       0  4.098209  0.605211  3.198452  8.170187
4 2006-09-01 04:00:00       0  3.338196  0.605211  2.970015  7.765058

     dummy6  dummy7    dummy8    dummy9
0  0.623354       0  2.579108  2.681728
1  0.554211       0  7.210000  3.028614
2  0.567841       0  6.940000  3.644147
3  0.581470       0  6.630000  4.016155
4  0.595100       0  6.350000  3.974442
What you're doing is calling the method on the class itself rather than on the instance. That is okay (although not very idiomatic), but if you do that you also need to pass the sheet name. In current pandas the class is exposed as pd.ExcelFile:
>>> parsed = pd.ExcelFile.parse(xl, "Sheet1")
>>> parsed.columns
Index([u'Tid', u'dummy1', u'dummy2', u'dummy3', u'dummy4', u'dummy5', u'dummy6', u'dummy7', u'dummy8', u'dummy9'], dtype=object)
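The difference between calling a method on the instance and on the class can be seen without pandas at all. Below is a minimal sketch using a made-up Workbook class (hypothetical, not part of pandas) whose parse method takes a sheet name; it only illustrates the calling convention described above.

```python
class Workbook:
    """Toy stand-in for an object with a parse method (hypothetical, not pandas)."""

    def __init__(self, sheets):
        self.sheets = sheets  # maps sheet name -> rows

    def parse(self, sheet_name):
        return self.sheets[sheet_name]


wb = Workbook({"Sheet1": [1, 2, 3]})

# Idiomatic: call the method on the instance; Python supplies `self`.
via_instance = wb.parse("Sheet1")

# Equivalent: call the function on the class and pass the instance explicitly.
via_class = Workbook.parse(wb, "Sheet1")

# Leaving out the sheet name fails, whichever form is used.
try:
    Workbook.parse(wb)
except TypeError as exc:
    error = str(exc)
```

Both forms return the same rows; the class-level call simply makes the instance argument explicit, which is why the sheet name still has to follow it.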
Read the Excel file in a directory using pandas in Python
The directory is missing from the path you pass to read_excel: you only give it the file name, as your print output shows.
You need to rebuild the full path, for instance with os.path.join:
import os
import pandas as pd

# my_path is the directory being listed
for filename in os.listdir(my_path):
    if filename.startswith('PB orders Dec'):
        dec = pd.read_excel(os.path.join(my_path, filename), sheet_name='Raw data')
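To see why the join is needed, here is a small standard-library sketch. It creates a temporary directory with two empty files (the file names are invented for illustration) and shows that os.listdir returns bare names, which only become usable paths once the directory is joined back on.

```python
import os
import tempfile

# Build a throwaway directory with two empty files (names are illustrative).
with tempfile.TemporaryDirectory() as my_path:
    for name in ("PB orders Dec 2019.xlsx", "other.xlsx"):
        open(os.path.join(my_path, name), "w").close()

    # os.listdir yields names relative to my_path, not full paths.
    bare_names = sorted(os.listdir(my_path))

    # Joining the directory back on gives paths that can actually be opened.
    matches = [
        os.path.join(my_path, name)
        for name in bare_names
        if name.startswith("PB orders Dec")
    ]
    all_exist = all(os.path.isfile(p) for p in matches)
```

Passing a bare name straight to read_excel would only work if the script happened to run from inside that directory.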
Reading an excel file into a pandas DF that has a pipe and spaces as delimiters
You can use a regex in the sep argument:
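from io import StringIO
import pandas as pd

my_file = '''
ID|Name|Job|Nationality|
123 Cian|IT|-|
222 John|Teacher|Spanish|
'''
# A regex separator requires the Python parsing engine
df = pd.read_csv(StringIO(my_file), sep=r'[ |]', engine='python')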
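One detail of this sample data: every line ends with a pipe, so splitting on the separator leaves one extra, entirely empty column at the end. The sketch below, which assumes the same sample text, drops that column after parsing; dropna is one option among several here, not the only way to do it.

```python
from io import StringIO

import pandas as pd

my_file = """ID|Name|Job|Nationality|
123 Cian|IT|-|
222 John|Teacher|Spanish|
"""

# A regex separator is only supported by the Python parsing engine.
df = pd.read_csv(StringIO(my_file), sep=r"[ |]", engine="python")

# The trailing "|" on every line yields one extra, entirely empty column.
df = df.dropna(axis="columns", how="all")
```

Note that dropna with how="all" would also remove any genuine column that happens to be completely empty, so selecting columns by name may be safer on real data.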
Using Pandas to pd.read_excel() for multiple worksheets of the same workbook
Try pd.ExcelFile:
xls = pd.ExcelFile('path_to_file.xls')
df1 = pd.read_excel(xls, 'Sheet1')
df2 = pd.read_excel(xls, 'Sheet2')
As noted by @HaPsantran, the entire Excel file is read in during the ExcelFile() call (there doesn't appear to be a way around this). This merely saves you from having to read the same file in each time you want to access a new sheet.
Note that the sheet_name argument to pd.read_excel() can be the name of the sheet (as above), an integer specifying the sheet number (e.g. 0, 1, etc.), a list of sheet names or indices, or None. If a list is provided, it returns a dictionary where the keys are the sheet names/indices and the values are the data frames. The default is to simply return the first sheet (i.e. sheet_name=0).
If None is specified, all sheets are returned, as a {sheet_name: dataframe} dictionary.
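Once you have such a {sheet_name: dataframe} dictionary, a common next step is to stack the sheets into one frame. The sketch below assumes all sheets share the same columns; combine_sheets is a hypothetical helper name, and the dictionary is built by hand with made-up data as a stand-in for what pd.read_excel(..., sheet_name=None) returns.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a {sheet_name: DataFrame} dict into one frame with a 'sheet' column."""
    # The dict keys become the outer index level, named "sheet".
    combined = pd.concat(sheets, names=["sheet"])
    # Move the sheet name out of the index and renumber the rows.
    return combined.reset_index(level=0).reset_index(drop=True)


# Stand-in for pd.read_excel('path_to_file.xls', sheet_name=None); data is made up.
sheets = {
    "Sheet1": pd.DataFrame({"value": [1, 2]}),
    "Sheet2": pd.DataFrame({"value": [3]}),
}

combined = combine_sheets(sheets)
```

Keeping the sheet name as a column means each row can still be traced back to the worksheet it came from.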
Read excel file in python using pandas
Your fileLocation variable already includes the name of the file, so reading fileLocation + fileName is essentially reading
C:\\Users\\GTS\\Desktop\\Network Interdiction Problem\\Manuscript\\Interdiction_Data.xlsxInterdiction_Data.xlsx
Another issue is that you have quotation marks around your variable names when calling pd.read_excel(), meaning that you are passing the literal string to the function rather than the variable's value.
Try:
data = pd.read_excel(fileLocation)
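Both problems can be reproduced with plain strings. The sketch below uses a shortened, hypothetical folder (not the full path from the question) and PureWindowsPath, so it behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# Hypothetical, shortened directory; the real one is whatever holds the workbook.
folder = PureWindowsPath(r"C:\Users\GTS\Desktop")
file_name = "Interdiction_Data.xlsx"

# Correct: directory and file name are joined exactly once.
file_location = folder / file_name

# The bug described above: the full path already ends in the file name,
# so appending it again produces a path that does not exist.
doubled = str(file_location) + file_name

# A second pitfall: quotes turn the variable name into a literal string.
literal = "file_location"
```

Keeping the directory and the file name in separate variables, and joining them in one place, makes this kind of doubling much harder to introduce.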