Bash Script to Remove Directories Based on Modified File Date

bash script to remove directories based on modified file date

#!/bin/bash

# Usage: "ThisProgram /path/to/root/of/files"
# If "/path/to/root/of/files" is not specified, use the current directory instead.

# First, get a list of all subdirs, in depth-first order.
# IFS= keeps leading and trailing whitespace in directory names intact.
find "${1:-.}" -depth -type d -print0 |
while IFS= read -r -d '' i
do

    # For each subdir, test two conditions. If either
    # condition fails, this subdir is not a candidate for deletion.
    echo "Trying $i"

    # First: is it at the lowest level, i.e. does it have any surviving children?
    # (find always lists "$i" itself, so a count above 1 means subdirs exist.)
    [ "$(find "$i" -type d -print | wc -l)" -gt 1 ] && continue
    echo "$i has no subdirs"

    # Second: does it have any files modified in the last 90 days?
    [ "$(find "$i" -type f -mtime -90 | wc -l)" -gt 0 ] && continue
    echo "$i has no new files"

    # If we got here, then this candidate has no subdirs and no recent files. Nuke it.
    # This only prints the command; drop the "echo" to actually delete.
    echo rm -rf "$i"

done
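The two tests in the loop can also be pulled into a small function, which makes them easier to try against a scratch directory before pointing the script at real data. This is only a sketch: the name is_stale_leaf and its optional day-count argument are made up for illustration, and -mindepth and -quit assume GNU find (or a BSD find that supports them).

```shell
# Hypothetical helper: succeed if "$1" has no subdirectories and no regular
# files modified within the last "$2" days (default 90).
is_stale_leaf() {
    local dir=$1 days=${2:-90}
    # Any subdirectory at all disqualifies the candidate.
    [ -z "$(find "$dir" -mindepth 1 -type d -print -quit)" ] || return 1
    # Any recently modified file disqualifies it too.
    [ -z "$(find "$dir" -type f -mtime "-$days" -print -quit)" ] || return 1
    return 0
}
```

Called as is_stale_leaf "$i" && echo rm -rf "$i", it makes the same decision as the two wc -l tests above, but find stops at the first match instead of counting everything.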

delete duplicate file based on modified time, limited by minutes

Here is a simple script to remove files which are less than one minute apart, assuming the file names sort them by time stamp.

prev=0
for file in /path/to/files/*; do
    timestamp=$(stat -c %Y -- "$file")
    if ((timestamp - prev < 60)); then
        rm -- "$file"
    else
        prev=$timestamp
    fi
done

The shell expands the wildcard * to an alphabetical listing; I am assuming this is enough to ensure that any two files which were uploaded around the same time will end up one after the other.

The arguments to stat are system-dependent; on Linux, -c %Y produces the file's last modification time in seconds since the epoch.

The -- should not be necessary as long as the wildcard has a leading path, but I'm including it as a "belt and suspenders" solution in case you'd like to change the wildcard to something else. (Still, probably a good idea to add ./ before any relative path, which then again ensures that the -- will not be necessary.)
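Since the loop deletes as it goes, it may be worth listing the candidates first. The following is a minimal dry-run sketch of the same comparison; the function name list_close_duplicates and its optional window argument are hypothetical, and stat -c %Y assumes GNU stat as noted above (BSD stat uses -f %m instead).

```shell
# Hypothetical helper: print the files in "$1" whose modification time is
# less than "$2" seconds (default 60) after the previously kept file.
# Nothing is deleted; this only lists what the loop would remove.
list_close_duplicates() {
    local dir=$1 window=${2:-60} prev=0 file timestamp
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        timestamp=$(stat -c %Y -- "$file")   # GNU stat; BSD: stat -f %m
        if [ "$((timestamp - prev))" -lt "$window" ]; then
            printf '%s\n' "$file"
        else
            prev=$timestamp
        fi
    done
}
```

As in the original, prev only advances when a file is kept, so a run of files each within the window of the last kept file is listed in full.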

Is this a good way to create unique folders based on modified dates?

There are a lot of simpler, less error-prone ways to do this. If you have the GNU version of date(1), for example:

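#!/usr/bin/env bash
shopt -s nullglob
declare -A mtimes
# Adjust pattern as needed
for file in *.{png,jpg}; do
    mtimes[$(date -r "$file" +'%Y-%m-%d')]=1
done
# Skip mkdir when nothing matched; -p tolerates directories that already exist
(( ${#mtimes[@]} )) && mkdir -p "${!mtimes[@]}"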

This uses a bash associative array to collect the distinct modification dates as keys, then creates all the directories at once with a single mkdir.


And since I mentioned in a comment that I'd prefer to do this in something other than pure shell, here is a Tcl one-liner:

tclsh8.6 <<'EOF'
file mkdir {*}[lsort -unique [lmap file [glob -nocomplain -type f *.{png,jpg}] { clock format [file mtime $file] -format %Y-%m-%d }]]
EOF

or perl:

perl -MPOSIX=strftime -e '$mtimes{strftime q/%Y-%m-%d/, localtime((stat)[9])} = 1 for (glob q/*.{png,jpg}/); mkdir for keys %mtimes'

Both of these have the advantage of not needing a specific implementation of date (the -r option isn't POSIX, and I'm not sure how widely it is supported outside the GNU coreutils version) or bash 4+. The latter is an issue if you're using, say, a Mac, though I think Macs still come with perl, at least until the next OS X version or two.
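If the bash 4+ requirement is the only obstacle, the associative array can be swapped for sort -u, which needs nothing beyond a basic shell. The sketch below is one way to do that, not the method from the answer above: the function name unique_mtime_days is hypothetical, and it still assumes a date whose -r option accepts a file name, as GNU date does.

```shell
# Hypothetical helper: print each distinct modification date (YYYY-MM-DD)
# among the files given as arguments, one per line, in sorted order.
unique_mtime_days() {
    local file
    for file in "$@"; do
        if [ -f "$file" ]; then
            date -r "$file" +%Y-%m-%d   # GNU date: -r FILE reads the file's mtime
        fi
    done | sort -u
}
```

The output can be reviewed first and then fed to mkdir, for example with unique_mtime_days *.png *.jpg | xargs mkdir -p, assuming at least one file matched.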

Need windows batch command one-liner to remove folders by name, not by date.time using powershell if applicable

The PowerShell code for this would be:

Get-ChildItem -Path 'RootPath\Where\The\Folders\To\Delete\Are\Found' -Filter '*_*_*' -Directory |
    Where-Object { $_.Name -match '^\d{2}_\d{2}_\d{4}$' } |                  # keep only names that are exactly MM_dd_yyyy
    Sort-Object { [datetime]::ParseExact($_.Name, 'MM_dd_yyyy', $null) } |  # sort by date
    Select-Object -SkipLast 7 |                                              # skip the newest 7 folders
    Remove-Item -Recurse -Force                                              # remove the rest

To play it safe, add -WhatIf to the final Remove-Item command. With that switch the code does not actually delete anything, but shows in the console what would be deleted. Once you are satisfied that the output is correct, remove -WhatIf to actually remove those folders.

As Olaf already commented, I don't think a one-liner would be best here, because you end up with code that is no longer readable and in which mistakes are extremely hard to find.
There is no penalty whatsoever for multiline code; in fact it is THE way to go!

Removing files in a sub directory based on modification date

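The title of your question says "based on modification date". So why not simply use find with the -mtime option?

find subdirectory -type f -mtime +5 -exec rm -v {} \;

This deletes all regular files under subdirectory that were last modified more than 5 days ago. The +5d form with a unit suffix is understood by BSD find (including macOS) only; GNU find takes a bare number, and because it discards fractional days, +5 there matches files that are at least 6 days old.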
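Before letting find run rm, it can help to look at what the age test selects. A minimal sketch, with a hypothetical function name list_old_files and a default of 5 days chosen to mirror the command above:

```shell
# Hypothetical helper: print the regular files under "$1" (default: the
# current directory) last modified more than "$2" days ago (default 5).
# Nothing is deleted.
list_old_files() {
    find "${1:-.}" -type f -mtime "+${2:-5}" -print
}
```

Once the list looks right, replacing -print with -exec rm -v {} + performs the deletion.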

Deleting files inside the folder from specific date using powershell

I would do this by first iterating over the directories in the source folder whose names can be converted to a datetime earlier than the one in the variable $specificDate.

Then use Get-ChildItem again inside these folders to find and remove files that do not have .tf.err in their name:

$specificDate = [datetime]::ParseExact('2022-01-01', 'yyyy-MM-dd', $null)
$sourceFolder = 'Y:\Data\Retail\ABC\Development\ak\AK_Data\*'

Get-ChildItem -Path $sourceFolder -Directory |
    Where-Object {
        # the -match guard keeps ParseExact from throwing on folders not named like a date
        $_.Name -match '^\d{4}-\d{2}-\d{2}$' -and
        [datetime]::ParseExact($_.Name, 'yyyy-MM-dd', $null) -lt $specificDate
    } |
    ForEach-Object {
        Write-Host "Removing files from folder $($_.Name).."
        Get-ChildItem -Path $_.FullName -File |
            Where-Object { $_.Name -notlike '*.tf.err*' } |
            Remove-Item -WhatIf
    }

Here again, I added the -WhatIf switch so you can first see what WOULD happen.
If you're OK with that, remove -WhatIf and run the code again to actually delete the files.


