Use AWS Glue Python with NumPy and Pandas Python Packages
I think the current answer is you cannot. According to AWS Glue Documentation:
Only pure Python libraries can be used. Libraries that rely on C extensions, such as the pandas Python Data Analysis Library, are not yet supported.
But even when I tried to include a pure-Python library from S3, the Glue job failed because of an HDFS permission problem. If you find a way to solve this, please let me know as well.
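To make the "pure Python" restriction quoted above a bit more concrete, here is a rough sketch of how one might check a wheel before uploading it. It relies on the standard wheel filename convention, where pure-Python wheels carry the ABI tag "none" and the platform tag "any". The function name is hypothetical, and this is only a heuristic based on the filename, not something Glue itself provides.

```python
def is_pure_python_wheel(filename):
    """Heuristic: True if the wheel filename is tagged '-none-any'.

    Wheel names follow name-version[-build]-python-abi-platform.whl.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    abi_tag, platform_tag = parts[-2], parts[-1]
    # Compiled wheels name a concrete ABI and platform instead
    return abi_tag == "none" and platform_tag == "any"


pure = is_pure_python_wheel("six-1.12.0-py2.py3-none-any.whl")
compiled = is_pure_python_wheel("pandas-0.25.1-cp37-cp37m-manylinux1_x86_64.whl")
print(pure, compiled)
```

A wheel that fails this check most likely ships compiled code, which is the kind of library the documentation quote says is not supported.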
Using Pandas in AWS Glue Python Shell Jobs
- Go to https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html#create-python-extra-library and check the section "To create a Python .egg or .whl file" for how to create the setup file for a Python shell job.
- In the setup.py file, add the line install_requires=['pandas==0.25.1']:
from setuptools import setup

setup(
    name="<module name>",
    version="0.1",
    packages=['<package name if any or ignore>'],
    # pandas is installed as a dependency of the package
    install_requires=['pandas==0.25.1'],
)
I also wrote a small shell script that deploys a Python shell job without the manual steps: it creates the egg file, uploads it to S3, and deploys via CloudFormation, all automatically.
You can find the code at https://github.com/fatangare/aws-python-shell-deploy
AWS Glue python shell - Using multiple libraries
gbeaven has already answered this question, but for some reason I am unable to mark it as the answer. The fix was to separate the file paths with commas in the additional Python modules setting.
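As a rough illustration of the comma-separated form, the sketch below builds the job-argument value from a list of S3 paths. The helper name, the bucket, and the file names are hypothetical. It also assumes the setting in question maps to the Glue job parameter --extra-py-files; if your job uses --additional-python-modules instead, pass that as the key.

```python
def build_library_argument(s3_paths, key="--extra-py-files"):
    """Return a job-argument dict with the library paths joined by commas."""
    cleaned = [p.strip() for p in s3_paths if p and p.strip()]
    for path in cleaned:
        if not path.startswith("s3://"):
            raise ValueError(f"expected an S3 URI, got: {path!r}")
    # One string, comma-separated, with no spaces between entries
    return {key: ",".join(cleaned)}


# Hypothetical bucket and file names, for illustration only
args = build_library_argument([
    "s3://my-bucket/libs/first_lib-0.1-py3-none-any.whl",
    "s3://my-bucket/libs/second_lib-0.1-py3-none-any.whl",
])
print(args["--extra-py-files"])
```

The resulting dict could then be supplied as the job's default arguments, for example through the console, CloudFormation, or an SDK call.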