Extract date from file name with import re in python
It looks to me like the regex you're using is at fault: it finds no match, so re.search returns None and the call to group(0) fails.
Assuming all your dates are stored as digits, the following regex seems to work quite well:
(?!.+_)\d+(?=\.xlsx)
The next issue is the format string used to parse the date. To me it looks like 12112019 would be 12/11/2019 (day/month/year), although it could also be month/day/year. Either way, the fix is to change the format string passed to strptime.
So for the day/month/year format we would use
# %d%m%Y
event_date_obj = datetime.strptime(event_date, '%d%m%Y')
And we would simply swap %d and %m for the month/day/year format. So your complete code would look something like this:
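import os
import re
from datetime import datetime

# xls is the path to the spreadsheet, as in the question
date = os.path.basename(xls)
pattern = r"(?!.+_)\d+(?=\.xlsx)"  # raw string, so \d and \. reach the regex engine unchanged
event_date = re.search(pattern, date).group(0)
event_date_obj = datetime.strptime(event_date, '%d%m%Y')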
For further information on the format codes (strptime and strftime share them), see https://strftime.org/.
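As a rough sketch, the steps above can be wrapped in a small helper that returns None instead of raising when the file name carries no date. The function name extract_event_date, its fmt parameter, and the sample paths are assumptions made for illustration; they do not come from the question.

```python
import os
import re
from datetime import datetime

# Same pattern as above: the last run of digits right before ".xlsx".
DATE_PATTERN = re.compile(r"(?!.+_)\d+(?=\.xlsx)")


def extract_event_date(path, fmt="%d%m%Y"):
    """Hypothetical helper: return the date in an .xlsx file name, or None."""
    name = os.path.basename(path)
    match = DATE_PATTERN.search(name)
    if match is None:
        # No digits directly before ".xlsx"
        return None
    try:
        return datetime.strptime(match.group(0), fmt)
    except ValueError:
        # The digits were found but do not fit the given format
        return None


# Sample paths are made up for the example.
print(extract_event_date("reports/event_12112019.xlsx"))            # day/month/year
print(extract_event_date("reports/event_12112019.xlsx", "%m%d%Y"))  # month/day/year
print(extract_event_date("reports/notes.xlsx"))                     # no date in the name
```

Checking for None before calling group(0) avoids the AttributeError described at the top of this answer.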
How to extract string from a filename containing date using regex in python?
You can use re.search in your if statement:
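test = re.search(r'(.*)_\d{4}[_]\d{2}[_]\d{2}', flow_id)
if test:
    your_match = test[1]
test[0] is the whole match (the name plus the date)
test[1] is the first parenthesized group (the name without the date)
The pattern has only one group, so test[2] would raise an IndexError; wrap the date part in its own parentheses if you also want the date (as a string)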
--
Edit
def get_clean_flow_id(filename):
    test = re.search(r'(.*)_\d{4}[_]\d{2}[_]\d{2}', filename)
    if test:
        return test[1]
    else:
        return None
Returns:
>>> get_clean_flow_id("Livongo_Weekly_Enrollment_2019_03_19.csv")
'Livongo_Weekly_Enrollment'
>>> get_clean_flow_id("Omada_weekly_fle_20190319120301.csv")
>>> get_clean_flow_id("tivity_weekly_fle_20190319120301.json")
>>>
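If the date is needed along with the cleaned name, one option is to give the year, month, and day their own capture groups. The sketch below is an assumption layered on the answer above, not part of it; the name split_flow_id_and_date is hypothetical.

```python
import re
from datetime import date

# Group 1 is the flow id; groups 2 to 4 are year, month and day.
FLOW_PATTERN = re.compile(r"(.*)_(\d{4})_(\d{2})_(\d{2})")


def split_flow_id_and_date(filename):
    """Hypothetical helper: return (flow_id, date), or None if there is no match."""
    m = FLOW_PATTERN.search(filename)
    if m is None:
        return None
    year, month, day = (int(part) for part in m.groups()[1:])
    try:
        return m[1], date(year, month, day)
    except ValueError:
        # Digits matched the pattern but are not a real calendar date
        return None


print(split_flow_id_and_date("Livongo_Weekly_Enrollment_2019_03_19.csv"))
print(split_flow_id_and_date("Omada_weekly_fle_20190319120301.csv"))
```

The second call prints None because that file name has no underscore-separated date.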
Regex to extract date and time from string
You can use email.utils to parse the date and then convert it to whatever format you wish:
from email import utils
date = utils.parsedate_to_datetime('Thu Jun 07 01:13:25 +0000 2018')
date.strftime('%d/%b/%Y')  # '07/Jun/2018'
date.strftime('%H:%M:%S')  # '01:13:25'
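An alternative, if you would rather spell the layout out explicitly, is datetime.strptime with a matching format string. This is a sketch of a different technique from the one in the answer, and it assumes an English (or default C) locale, because %a and %b are locale-dependent.

```python
from datetime import datetime

raw = 'Thu Jun 07 01:13:25 +0000 2018'

# %a = weekday name, %b = month name, %z = UTC offset such as +0000
parsed = datetime.strptime(raw, '%a %b %d %H:%M:%S %z %Y')

date_part = parsed.strftime('%d/%b/%Y')
time_part = parsed.strftime('%H:%M:%S')
print(date_part, time_part)
```

Unlike the email.utils approach, this raises ValueError as soon as the input deviates from the expected layout, which can be useful when you want strict validation.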
How to extract the date from the file name using python
You can get datetime data like this:
In [1]: import csv
In [2]: import datetime
In [3]: with open('del.csv') as f:
   ...:     csv_reader = csv.reader(f, delimiter=" ")
   ...:     next(csv_reader)
   ...:     for row in csv_reader:
   ...:         dt = datetime.datetime.strptime('-'.join(row[:2]), '%Y_%m_%d-%H:%M:%S')
   ...:         print(dt)
2012-01-01 00:02:14
2012-01-01 00:08:29
2012-01-01 00:14:45
del.csv: name of your CSV file
dt: a Python datetime object
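The same idea as a plain script rather than an IPython session might look like the sketch below. The sample text stands in for del.csv and is an assumption: a header row followed by space-separated rows whose first two columns are the date and the time. The function name read_timestamps is likewise hypothetical.

```python
import csv
import io
from datetime import datetime

# Assumed stand-in for del.csv; the column names and values are made up.
SAMPLE = """date time value
2012_01_01 00:02:14 1
2012_01_01 00:08:29 2
"""


def read_timestamps(lines):
    """Hypothetical helper: parse the first two columns of each row into a datetime."""
    reader = csv.reader(lines, delimiter=" ")
    next(reader, None)  # skip the header row
    return [
        datetime.strptime("-".join(row[:2]), "%Y_%m_%d-%H:%M:%S")
        for row in reader
        if row  # ignore blank lines
    ]


timestamps = read_timestamps(io.StringIO(SAMPLE))
for ts in timestamps:
    print(ts)
```

To read a real file, pass an open file object in place of the io.StringIO wrapper.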