Delete All Files Older Than 30 Days, Based on File Name as Date

I am by no means a systems administrator, but you could consider a simple shell script along the lines of:

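# Generate the cutoff date in the same format as the file names
discriminant=$(date -d "30 days ago" "+%Y_%m_%d")

# Find files matching the name pattern and compare each name against the cutoff.
# -maxdepth goes before -type so GNU find does not warn about option order.
find . -maxdepth 1 -type f -name "*_*_*.txt" -printf "%P\n" |
while IFS= read -r FILE; do
    # A string comparison works because the format is zero-padded YYYY_MM_DD
    if [ "${discriminant}" ">" "${FILE%.*}" ]; then
        echo "${FILE}"
    fi
done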

Note that a professional will probably consider this a "layman" solution. Maybe this is handled better by awk, which I am unfortunately not accustomed to using.
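The same idea can be sketched in Python. This is only an illustration, not a replacement for the script above: the function name names_older_than and the sample file names are made up, and it assumes every name is a zero-padded YYYY_MM_DD date followed by an extension.

```python
from datetime import date, timedelta
from pathlib import Path


def names_older_than(names, today, days=30):
    """Return the names whose YYYY_MM_DD stem sorts before the cutoff date."""
    cutoff = (today - timedelta(days=days)).strftime("%Y_%m_%d")
    # Zero-padded year_month_day strings sort in chronological order,
    # so a plain string comparison is enough (same trick as the shell test).
    return [name for name in names if Path(name).stem < cutoff]


if __name__ == "__main__":
    sample = ["2021_01_01.txt", "2021_03_01.txt"]  # hypothetical file names
    print(names_older_than(sample, date(2021, 3, 10)))
```

Passing today in as an argument keeps the comparison easy to try out with fixed dates before pointing it at real files.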

Removing files older than X days using a date format in the filename

With bash and a regex:

for i in app-*.log*; do
[[ "$i" =~ -([0-9]{4}-[0-9]{2}-[0-9]{2}) ]] \
&& [[ "${BASH_REMATCH[1]}" < "2020-12-20" ]] \
&& echo rm -v "$i"
done

${BASH_REMATCH[1]} holds the date captured from the file name, e.g. 2020-12-17.

As one line:

for i in app-*.log*; do [[ "$i" =~ -([0-9]{4}-[0-9]{2}-[0-9]{2}) ]] && [[ "${BASH_REMATCH[1]}" < "2020-12-20" ]] && echo rm -v "$i"; done

Output:


rm -v app-2020-12-17.log.2
rm -v app-2020-12-18.log.1
rm -v app-2020-12-18.log.2
rm -v app-2020-12-18.log.31
rm -v app-2020-12-18.log.32
rm -v app-2020-12-18.log.33
rm -v app-2020-12-18.log.3.gz
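For comparison, here is a rough Python sketch of the same regex-plus-string-comparison approach, with the dry run made explicit. The function name prune_logs and its dry_run parameter are assumptions for illustration; the file pattern app-*.log* and the date format are taken from the example above.

```python
import re
from pathlib import Path

# Same capture as the bash regex: a dash followed by YYYY-MM-DD
DATE_RE = re.compile(r"-(\d{4}-\d{2}-\d{2})")


def prune_logs(directory, cutoff, dry_run=True):
    """Return app-*.log* files dated before `cutoff` (a YYYY-MM-DD string).

    Files are only deleted when dry_run is False.
    """
    old = []
    for path in sorted(Path(directory).glob("app-*.log*")):
        match = DATE_RE.search(path.name)
        # ISO dates compare correctly as plain strings
        if match and match.group(1) < cutoff:
            old.append(path)
            if not dry_run:
                path.unlink()
    return old
```

Calling prune_logs(".", "2020-12-20") only lists the candidates, much like the echo in front of rm; passing dry_run=False actually removes them.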

Delete 30 days older files based on the date in the filename

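The following, based on an arithmetic comparison of timestamps, should work:

keep_ts=$(date --date="30 days ago" +%s)
for file in "$yourDir"/*.csv; do
    # Work on the file name only, so underscores in the directory path do not shift the fields
    name=${file##*/}
    # The date is the third underscore-separated field, before the extension;
    # skip names that date cannot parse
    file_ts=$(date --date="$(echo "$name" | cut -d_ -f3 | cut -d. -f1)" +%s) || continue
    if [ "$file_ts" -lt "$keep_ts" ]; then
        rm "$file"
    fi
done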

Delete files older than 180 days based on file name date

This batch file uses a brute-force method to subtract 180 days from the current date, using a little trickery with the XCOPY command. There are a lot of better and quicker ways to get the subtracted date in a batch file, such as converting the dates to Julian dates. But I have always used this one because I understand it, and I normally only have to go back a week or two for my work.

PowerShell or any other Windows scripting language like JScript or VBScript would be a better solution for getting the date. With PowerShell you could, in theory, call it from your batch file with one line of code and get the date within a split second.

PowerShell -Command "&{((Get-Date).AddDays(-180)).ToString('yyyyMMdd')}"

Regardless of all that, here is a pure batch file solution. Set the folder variable to where your files exist. When you are satisfied with the output on the screen, remove the word ECHO from this line of the code: echo del "%%G_%%H_%%I". This is just a safety precaution for testing. Once you remove the echo, it will delete the files.

@echo off
setlocal
set "folder=C:\some folder\sub folder"

REM set the number of days to subtract
SET DAYS=180

REM Call the function that subtracts the days from today's date (result goes into subdate).
CALL :validdate "%days%" subdate
echo Older than: %subdate%
pushd "%folder%"

REM Get a list of the files
REM file pattern is: Backup_YYYYMMDDHHMMSS_FileName.ext
set "search=[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]"
FOR /F "tokens=1,2* delims=_" %%G IN ('dir /a-d /b Backup_*_*.* 2^>nul ^|findstr /I /R /C:"^backup_%search%_*.*"') DO (
setlocal enabledelayedexpansion
set "fdate=%%H"
set "fdate=!fdate:~0,8!"
IF !fdate! lss %subdate% echo del "%%G_%%H_%%I"
endlocal
)
popd
pause

endlocal
GOTO :EOF

:validdate
setlocal
set "day=%~1"
set rand=%random%
md "%temp%\dummy%rand%\empty%rand%"

REM Get today's date
for /f "tokens=2 delims==" %%a in ('wmic OS Get localdatetime /value') do set "dt=%%a"

REM set year month and day into its own variables.
set /a y=%dt:~0,4%
set /a m=1%dt:~4,2%
set /a d=1%dt:~6,2%

:loop
if "%day%"=="0" (
rd /s /q "%temp%\dummy%rand%"
endlocal &set "%~2=%y%%m:~-2%%d:~-2%"
GOTO :EOF
)
set /a d-=1

if %d% lss 101 (
set d=131
set /a m-=1

if %m% lss 101 (
set m=112
set /a y-=1
)
)

xcopy /d:%m:~-2%-%d:~-2%-%y% /t "%temp%\dummy%rand%\empty%rand%" "%temp%\dummy%rand%" >nul 2>&1 && (set /a day-=1 & goto loop) || goto loop

GOTO :EOF

Here is an example of me using it on my system. I first show you a directory listing of all the backup files then I run the batch file to show you what files it will delete and what the subtracted date is.

C:\BatchFiles\DATE_TIME_FUNCTIONS>dir /b backup*
Backup_20170504013002_FileName.txt
Backup_20170505093002_FileName.txt
Backup_20170506113002_FileName.txt
Backup_20170507123002_FileName.txt

C:\BatchFiles\DATE_TIME_FUNCTIONS>DeleteFiles.bat
Older than: 20170506
del "Backup_20170504013002_FileName.txt"
del "Backup_20170505093002_FileName.txt"
Press any key to continue . . .

Now my subtract date function will slow down slightly because it uses a loop to get the subtracted date. The more days you have to subtract the longer it will take to run. Although at this stage it really isn't that much time just to subtract back to the last leap day. Was only about 5 seconds on my computer.

The following code instead converts the date to a Julian day number to do the date math. It is much quicker than my previous code at calculating the date.

@echo off
setlocal

Call :GetDateTime Year Month Day
Call :SubtractDate %Year% %Month% %Day% -613 Ret

echo subdate: %Ret%
pause
GOTO :EOF

:SubtractDate Year Month Day <+/-Days> Ret
::Adapted from DosTips Functions::
setlocal & set a=%4
set "yy=%~1"&set "mm=%~2"&set "dd=%~3"
set /a "yy=10000%yy% %%10000,mm=100%mm% %% 100,dd=100%dd% %% 100"
if %yy% LSS 100 set /a yy+=2000 &rem Adds 2000 to two digit years
set /a JD=dd-32075+1461*(yy+4800+(mm-14)/12)/4+367*(mm-2-(mm-14)/12*12)/12-3*((yy+4900+(mm-14)/12)/100)/4
if %a:~0,1% equ + (set /a JD=%JD%+%a:~1%) else set /a JD=%JD%-%a:~1%
set /a L= %JD%+68569, N= 4*L/146097, L= L-(146097*N+3)/4, I= 4000*(L+1)/1461001
set /a L= L-1461*I/4+31, J= 80*L/2447, K= L-2447*J/80, L= J/11
set /a J= J+2-12*L, I= 100*(N-49)+I+L
set /a YYYY= I, MM=100+J, DD=100+K
set MM=%MM:~-2% & set DD=%DD:~-2%
set ret=%YYYY: =%%MM: =%%DD: =%
endlocal & set %~5=%ret%
exit /b

:GetDateTime Year Month Day Hour Minute Second
@echo off & setlocal
for /f "tokens=2 delims==" %%a in ('wmic OS Get localdatetime /value') do set "dt=%%a"
set "YY=%dt:~2,2%" & set "YYYY=%dt:~0,4%" & set "MM=%dt:~4,2%" & set "DD=%dt:~6,2%"
set "HH=%dt:~8,2%" & set "Min=%dt:~10,2%" & set "Sec=%dt:~12,2%"
( ENDLOCAL
IF "%~1" NEQ "" set "%~1=%YYYY%"
IF "%~2" NEQ "" set "%~2=%MM%"
IF "%~3" NEQ "" set "%~3=%DD%"
IF "%~4" NEQ "" set "%~4=%HH%"
IF "%~5" NEQ "" set "%~5=%Min%"
IF "%~6" NEQ "" set "%~6=%Sec%"
)
exit /b
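The integer formulas in :SubtractDate are easier to follow outside batch syntax. Below is a rough Python transcription of the same arithmetic, offered as a sketch rather than a definitive port. The names tdiv, to_jdn, from_jdn and subtract_days are made up for this example. One detail matters: set /a truncates division toward zero, while Python's // rounds down, so the forward conversion uses a small helper to reproduce the batch behaviour.

```python
def tdiv(a, b):
    """Integer division that truncates toward zero, as `set /a` does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def to_jdn(y, m, d):
    """Gregorian date -> Julian day number (same formula as the JD line)."""
    a = tdiv(m - 14, 12)  # -1 for January and February, otherwise 0
    return (d - 32075
            + tdiv(1461 * (y + 4800 + a), 4)
            + tdiv(367 * (m - 2 - a * 12), 12)
            - tdiv(3 * tdiv(y + 4900 + a, 100), 4))


def from_jdn(jd):
    """Julian day number -> (year, month, day), mirroring the L/N/I/J/K steps."""
    l = jd + 68569
    n = 4 * l // 146097
    l = l - (146097 * n + 3) // 4
    i = 4000 * (l + 1) // 1461001
    l = l - 1461 * i // 4 + 31
    j = 80 * l // 2447
    k = l - 2447 * j // 80
    l = j // 11
    j = j + 2 - 12 * l
    i = 100 * (n - 49) + i + l
    return i, j, k


def subtract_days(y, m, d, days):
    """Date arithmetic becomes plain integer subtraction on day numbers."""
    return from_jdn(to_jdn(y, m, d) - days)


if __name__ == "__main__":
    print(subtract_days(2017, 11, 2, 180))
```

The point of the detour through day numbers is that month lengths and leap years are handled once, inside the two conversions, so no loop over individual days is needed.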

Delete if 30 days old by folder name

Try using the [DateTime]::ParseExact() method, it's much simpler for your purposes:

function Delete-Folder-30days{
gci "\\$args\Apps\AndrewTest" -Directory | ?{[datetime]::ParseExact($_.Name,"MMddyyyy",$null) -lt (get-date).AddDays(-30)} | Remove-Item -Recurse
}

Delete-Folder-30days $Server

Edit: Sorry! I originally had the first two arguments reversed. The string comes first and the format second, as in ("04122014","MMddyyyy",$null).

Edit2: You want it to include .zip files as well? There's a couple of things. If you want to include all files then it is really simple. Just remove the -Directory from the GCI command and it will look at all files and folders in the target directory. If you ONLY want folders and .ZIP files then it gets a little more complicated. Basically we will still remove the -Directory switch, but we'll have to add some filtering into the Where clause as such:

?{($_.PSIsContainer -and [datetime]::ParseExact($_.Name,"MMddyyyy",$null) -lt (get-date).AddDays(-30)) -or ($_.Extension -ieq ".zip" -and [datetime]::ParseExact($_.BaseName,"MMddyyyy",$null) -lt (get-date).AddDays(-30))} 

So now instead of just checking the specially formatted date, you are asking Is this a folder, or does it have the .zip file extension? If at least one of those two is true, does it match the specially formatted date?
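The combined folder-or-zip test can also be written out step by step. Here is a minimal Python sketch of that filter logic, assuming the same MMddyyyy naming; the function name is_expired and its parameters are hypothetical. Unlike the one-liner above, it quietly skips names that are not valid dates instead of raising an error.

```python
from datetime import datetime, timedelta
from pathlib import Path


def is_expired(name, is_dir, now, days=30):
    """True if a folder or .zip file named MMddyyyy is older than `days`."""
    if is_dir:
        stamp = name                  # folders: the whole name is the date
    elif Path(name).suffix.lower() == ".zip":
        stamp = Path(name).stem       # zip files: the name without extension
    else:
        return False                  # any other file is left alone
    try:
        named = datetime.strptime(stamp, "%m%d%Y")
    except ValueError:
        return False                  # not a date-named item, so keep it
    return named < now - timedelta(days=days)
```

For example, is_expired("04122014.zip", False, datetime(2014, 6, 1)) is true, while the same call with a .txt extension is false.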

Delete all files, that are named by date (yyyy-MM-dd), older than x days

This is my attempt. I hope it works for your goal (it works for me).

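' Requires: Imports System.IO, System.Text.RegularExpressions, System.Globalization
Dim curDate As Date = Date.Today
Dim folderPath As String = "Put your path here!"
Dim datePattern As New Regex("\d{4}-\d{2}-\d{2}")

For Each fileFound As String In Directory.GetFiles(folderPath)
    ' Match against the file name only, so dates in the folder path are ignored
    Dim matchFileDate As Match = datePattern.Match(Path.GetFileName(fileFound))
    If matchFileDate.Success Then
        Dim fileDate As DateTime
        ' TryParseExact skips digit runs that are not real dates (e.g. 2020-13-45)
        If DateTime.TryParseExact(matchFileDate.Value, "yyyy-MM-dd", CultureInfo.InvariantCulture, DateTimeStyles.None, fileDate) Then
            Dim days As Integer = fileDate.Subtract(curDate).Days

            ' Delete files whose date is more than 7 days in the past
            If days < -7 Then
                My.Computer.FileSystem.DeleteFile(fileFound)
            End If
        End If
    End If
Next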

This will only work if your file names always use the yyyy-MM-dd date format, because the regex "\d{4}-\d{2}-\d{2}" requires it. You could probably find a better regex for this.

Identifying the files older than x-months by the filename only and deleting them

Working on the premise that you mean 90 days - if you need specifically months, we can check that too, but it's different logic.

Here's some code you could work from.

(You said you don't want to work from a list, so I edited it to use the current directory.)

$: cat chkDates
# while read f # replaced with -
for f in *[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*
do # first get the epoch timestamp of the file based on the date string embedded in the name
filedate=$(
date +%s -d $(
echo $f | sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
) # this returns the date substring
) # this converts it to an epoch integer of seconds since 1/1/70
# now see if it's > 90 days ( you said 3 months. if you need *months* we have to do some more...)
daysOld=$(( ( $(date +%s) - $filedate ) / 86400 )) # this should give you an integer result, btw
if (( 90 < $daysOld ))
then echo $f is old
else echo $f is not
fi
done # < listOfFileNames # not reading list now

You can pass date a date to report, and a format to present it.

sed pattern explanation

Note the sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/' command. This assumes the date format will be consistently YYYY-MM-DD, and does no validations of reasonableness. It will happily accept any 4 digits, then 2, then 2, delimited by dashes.

-E uses extended regexes, so parens () can denote values to be remembered, without needing \'s. . means any character, and * means any number (including zero) of the previous pattern, so .* means zero or more characters, eating up all the line before the date. [0-9] means any digit. {x,y} sets a minimum (x) and maximum (y) number of consecutive matches; with only one value, {4} means exactly 4 of the previous pattern. So '.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*' means: ignore as many characters as you can until seeing 4 digits, a dash, 2 digits, a dash, then 2 digits; remember that pattern (the ()'s); then ignore any characters after it.

In a substitution, \1 means the first remembered match, so

sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'

means find and remember the date pattern in the filenames, and replace the whole name with just that part in the output. This assumes the date will be present - on a filename where there is no date, the pattern will not match, and the whole filename will be returned, so be careful with that.

(hope that helped.)

By isolating the date string from the filenames with sed (your examples were format-consistent, so I used that) we pass it in and ask for the UNIX Epoch timestamp of that date string using date +%s -d $(...), to represent the file with a math-handy number.

Subtract that from the current date in the same format, you get the approximate age of the file in seconds. Divide that by the number of seconds in a day and you get days old. The file date will default to midnight, but the math will drop fractions, so it sorts out.
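The extract-then-subtract steps can be sketched in Python as well. This is an illustration under assumptions, not the script above: days_old is a made-up name, it takes the first YYYY-MM-DD match in the name (the greedy sed pattern takes the last), and it returns None both when no date is present and when the digits do not form a real date, which covers the two caveats mentioned in the sed explanation.

```python
import re
from datetime import date

DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def days_old(name, today):
    """Age in whole days of the date embedded in `name`, or None."""
    match = DATE_RE.search(name)
    if match is None:
        return None  # no date in the name at all
    try:
        stamp = date(*map(int, match.groups()))
    except ValueError:
        return None  # digits that are not a real date, e.g. 2018-13-45
    # Subtracting two dates gives a timedelta; .days is the whole-day age.
    # A post-dated file yields a negative number.
    return (today - stamp).days


if __name__ == "__main__":
    today = date(2018, 10, 5)  # fixed date so the example is repeatable
    for name in ["fileone.log.2018-03-23", "file_two_2018-08-23.log", "no_date.log"]:
        age = days_old(name, today)
        if age is None:
            print(name, "has no usable date")
        elif age > 90:
            print(name, "is old")
        else:
            print(name, "is not")
```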

here's the file list I made, working from your examples

$: cat listOfFileNames
fileone.log.2018-03-23
fileone.log.2018-09-23
file_two_2018-03-23.log
file_two_2018-08-23.log
filethree.log.2018-03-23
filethree.log.2018-10-02
file_four_file_four_2018-03-23.log
file_four_file_four_2019-03-23.log

I added a file for each that would be within the 90 days as of this posting - including one that is "post-dated", which can easily happen with this sort of thing.

Here's the output.

$: ./chkDates
fileone.log.2018-03-23 is old
fileone.log.2018-09-23 is not
file_two_2018-03-23.log is old
file_two_2018-08-23.log is not
filethree.log.2018-03-23 is old
filethree.log.2018-10-02 is not
file_four_file_four_2018-03-23.log is old
file_four_file_four_2019-03-23.log is not

That what you had in mind?

An alternate pure-bash way to get just the date string

(You still need date to convert to the epoch seconds...)

instead of

   filedate=$(
date +%s -d $(
echo $f | sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
) # this returns the date substring
) # this converts it to an epoch integer of seconds since 1/1/70

which doesn't seem to be working for you, try this:

tmp=${f%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*} # unwanted prefix
d=${f#$tmp} # prefix removed
tmp=${f#*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]} # unwanted suffix
filedate=${d%$tmp} # suffix removed
filedate=$( date +%s --date=$filedate ) # epoch time

This is hard to read, but doesn't have to spawn as many subprocesses to get the work done. :)

If that doesn't work, then I'm suspicious of your version of date. Mine:

$: date --version
date (GNU coreutils) 8.26

Deleting files in a directory, keeping files created on a certain day of the month unless it's the weekend, then keeping files from following Monday

Your code seems OK, but datetime has some functions that make it cleaner.

If you convert timestamp to datetime

created_datetime = datetime.datetime.fromtimestamp(created_timestamp)

then you can get day directly as integer

created_day = created_datetime.day

You can also get date without time to compare with other date

created_date = created_datetime.date()

You can get today's date (without time) with

today = datetime.date.today()

or

today = datetime.datetime.now().date()

and then you can get 30 days before using

before_30_days = today - datetime.timedelta(days=30)

or

one_day = datetime.timedelta(days=1)

before_30_days = today - 30*one_day

And you can compare dates (without time)

created_date < before_30_days


import datetime
import os

# calculate only once
today = datetime.date.today()
before_30_days = today - datetime.timedelta(days=30)
#one_day = datetime.timedelta(days=1)
#before_30_days = today - 30*one_day

for root, _, filenames in os.walk('test'):
    for filename in filenames:
        file_path = os.path.join(root, filename)
        # note: getctime is creation time on Windows but metadata-change time on Unix
        created_timestamp = os.path.getctime(file_path)
        created_datetime = datetime.datetime.fromtimestamp(created_timestamp)
        created_date = created_datetime.date()
        created_day = created_datetime.day
        #print(created_day)

        if created_date < before_30_days:  ### deleting older than 30 days
            day_of_week = created_datetime.weekday()  # monday = 0
            #day_of_week = created_datetime.isoweekday() # monday = 1
            #print(created_date, '|', day_of_week)

            # getting rid of anything with a creation day before the 15th or after the 17th,
            # or if 16th/17th and not a Monday, or if 15th but on a weekend
            if (created_day < 15 or created_day > 17
                    or (created_day == 16 and day_of_week != 0)
                    or (created_day == 17 and day_of_week != 0)
                    or (created_day == 15 and day_of_week > 4)):
                #os.remove(file_path)
                print(created_date, '|', day_of_week, '|', file_path)
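The delete condition may be easier to check when it is written as its opposite, a rule for what to keep. Here is a small sketch of that rule on its own, assuming the anchor day is the 15th as in the comments above; keep_monthly and anchor_day are hypothetical names.

```python
from datetime import date


def keep_monthly(created, anchor_day=15):
    """Keep files from the anchor day, or from the Monday after a weekend anchor."""
    weekday = created.weekday()  # Monday = 0 ... Sunday = 6
    if created.day == anchor_day:
        return weekday < 5       # the anchor day fell on a weekday
    if created.day in (anchor_day + 1, anchor_day + 2):
        # A Monday on the 16th or 17th means the 15th was a Sunday or Saturday
        return weekday == 0
    return False


if __name__ == "__main__":
    for d in [date(2018, 9, 15), date(2018, 9, 17), date(2018, 10, 15)]:
        print(d, "keep" if keep_monthly(d) else "delete")
```

A file older than 30 days would then be removed only when keep_monthly returns False for its creation date.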

