BASH: How to find no. of days (considering only Network / Business Days ) between two dates (i.e. exclude weekends Saturday/Sunday)
This may be reinventing the wheel, but here's a bash solution, in case you're interested.
Note that it requires the -d option to the date command, which is available in GNU date.
lineno=0
while IFS="," read -r endday startday; do
    if (( lineno++ == 0 )); then                  # handle header line
        echo "Resolved,StartOfWork,TotalDates"
        continue
    fi
    # -u: work in UTC so a daylight-saving change cannot make a day shorter than 86400 s
    startsec=$(date -u -d "$startday" +%s)
    startdayofweek=$(date -u -d "$startday" +%w)  # 0 for Sun, ... 6 for Sat
    endsec=$(date -u -d "$endday" +%s)
    days=$(( (endsec - startsec) / 86400 + 1 ))   # calendar days, inclusive
    weeks=$(( days / 7 ))                         # number of full weeks
    frac=$(( days % 7 ))                          # leftover days after the full weeks
    if (( startdayofweek == 0 )); then            # case of starting on Sunday
        if (( frac > 0 )); then
            add=1                                 # the leftover days include that Sunday
        else
            add=0
        fi
    else
        # leftover days plus the start day's offset from Monday
        magic=$(( frac + (startdayofweek + 6) % 7 ))
        # calculate the number of weekend days in the leftover period
        if (( magic < 6 )); then
            add=0
        elif (( magic == 6 )); then
            add=1                                 # reaches Saturday only
        else
            add=2                                 # reaches Saturday and Sunday
        fi
    fi
    holidays=$(( weeks * 2 + add ))               # total number of weekend days
    workdays=$(( days - holidays ))               # subtract the weekend days
    echo "$endday,$startday,$workdays"
done < inputfile
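For readers who want to check the arithmetic, here is a minimal Python sketch of the same closed-form count, set next to a day-by-day reference loop. The function names are illustrative and not part of the script above, and the sketch assumes the end date is not earlier than the start date.

```python
from datetime import date, timedelta


def workdays_closed_form(start: date, end: date) -> int:
    """Count Monday-Friday days in [start, end], mirroring the bash arithmetic."""
    days = (end - start).days + 1      # calendar days, inclusive
    weeks, frac = divmod(days, 7)      # full weeks and leftover days
    dow = start.isoweekday() % 7       # 0 = Sunday ... 6 = Saturday, like `date +%w`
    if dow == 0:                       # leftover period starts on a Sunday
        add = 1 if frac > 0 else 0
    else:
        magic = frac + (dow + 6) % 7   # leftover days plus offset from Monday
        add = 0 if magic < 6 else (1 if magic == 6 else 2)
    return days - (weeks * 2 + add)


def workdays_brute_force(start: date, end: date) -> int:
    """Reference count: walk every day and keep Monday (0) to Friday (4)."""
    total = 0
    day = start
    while day <= end:
        if day.weekday() < 5:
            total += 1
        day += timedelta(days=1)
    return total
```

The two functions should agree for any start day of the week and any range length, which is a quick way to convince yourself that the thresholds 6 and 7 on `magic` correspond to reaching Saturday and Sunday.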
R: finding difference in business days
The Nweekdays() function is adapted from @J. Won.'s solution at "Calculate the number of weekdays between 2 dates in R".
This modified version handles date differences that are either positive or negative, whereas the accepted solution at that link only handles positive differences.
It counts the weekdays in the inclusive range and subtracts one, so a range that lies entirely within a weekend returns -1.
library("dplyr")
e2 <- structure(list(date.pr = structure(c(16524, 16524, 16507, 16510, 16510, 16524, 16510, 5974), class = "Date"),
date.po = structure(c(16524, 16525, 16510, 16517, 16524, 16510, 16531, 15974), class = "Date")),
.Names = c("date.1", "date.2"), class = c("tbl_df", "data.frame"), row.names = c(NA, -8L))
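Nweekdays <- Vectorize(
  function(a, b)
  {
    # order the two dates so that seq() always runs forward
    if (a < b) days <- seq(a, b, "days") else days <- seq(b, a, "days")
    # weekdays in the inclusive range, minus one so that equal dates give 0
    # (weekdays() returns localized names; this assumes an English locale)
    sum(!weekdays(days) %in% c("Saturday", "Sunday")) - 1
  })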
> e2 %>%
mutate(wkd1 = format(date.1, "%A"),
wkd2 = format(date.2, "%A"),
ndays_with_wkends = ifelse((date.2 > date.1), (date.2 - date.1), (date.1 - date.2)),
ndays_no_wkends = Nweekdays(date.1, date.2))
Source: local data frame [8 x 6]
date.1 date.2 wkd1 wkd2 ndays_with_wkends ndays_no_wkends
(date) (date) (chr) (chr) (dbl) (dbl)
1 2015-03-30 2015-03-30 Monday Monday 0 0
2 2015-03-30 2015-03-31 Monday Tuesday 1 1
3 2015-03-13 2015-03-16 Friday Monday 3 1
4 2015-03-16 2015-03-23 Monday Monday 7 5
5 2015-03-16 2015-03-30 Monday Monday 14 10
6 2015-03-30 2015-03-16 Monday Monday 14 10
7 2015-03-16 2015-04-06 Monday Monday 21 15
8 1986-05-11 2013-09-26 Sunday Thursday 10000 7143
> e2 %>% mutate(ndays_no_wkends = Nweekdays(date.1, date.2))
Source: local data frame [8 x 3]
date.1 date.2 ndays_no_wkends
(date) (date) (dbl)
1 2015-03-30 2015-03-30 0
2 2015-03-30 2015-03-31 1
3 2015-03-13 2015-03-16 1
4 2015-03-16 2015-03-23 5
5 2015-03-16 2015-03-30 10
6 2015-03-30 2015-03-16 10
7 2015-03-16 2015-04-06 15
8 1986-05-11 2013-09-26 7143
Subtract two days and calculate working days only (no Saturdays or Sundays)
Take a look at this site; it provides a Java program that helps with that:
The wdnum() method returns the number of weekdays (excluding weekends) that have passed since Monday, 29 December 1969. It works by calculating the number of days since 1 January 1970 (getTime() divided by the number of milliseconds in a day), adding 3 (1 January 1970 was a Thursday, so this moves the origin back to that Monday), and returning the number of weekdays in the full weeks, plus any partial week, that have passed since then. Subtracting the wdnum() values of two dates then gives the number of working days between them.
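The linked Java source is not reproduced here, so the following is only a Python sketch of the idea as described, not the original program. The names `wdnum` and `weekdays_between` and the use of `datetime.date` instead of millisecond timestamps are assumptions made for illustration.

```python
from datetime import date

EPOCH = date(1970, 1, 1)  # a Thursday


def wdnum(d: date) -> int:
    """Weekdays elapsed from Monday 1969-12-29 up to, but not including, d."""
    days = (d - EPOCH).days + 3        # days since Monday 1969-12-29
    weeks, rest = divmod(days, 7)      # full weeks and days into the current week
    return weeks * 5 + min(rest, 5)    # a partial week contributes at most 5 weekdays


def weekdays_between(start: date, end: date) -> int:
    """Weekdays in the half-open range [start, end)."""
    return wdnum(end) - wdnum(start)


print(weekdays_between(date(2015, 3, 13), date(2015, 3, 16)))  # Friday to Monday: 1
print(weekdays_between(date(2015, 3, 16), date(2015, 3, 23)))  # Monday to Monday: 5
```

Because each date is reduced to a single integer, the difference needs no loop and no special case for the day of the week on which the range starts.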
Computing number of business days between start/end columns
Something like this may work:
from pyspark.sql import functions as F
df_facts = spark.createDataFrame(
[('data1', '2022-05-08', '2022-05-14'),
('data1', '2022-05-08', '2022-05-21')],
['data', 'start_date', 'end_date']
)
df_holidays = spark.createDataFrame([('2022-05-10',)], ['holiday_date'])
df = df_facts.withColumn('exploded', F.explode(F.sequence(F.to_date('start_date'), F.to_date('end_date'))))
df = df.filter(~F.dayofweek('exploded').isin([1, 7]))
df = df.join(F.broadcast(df_holidays), df.exploded == df_holidays.holiday_date, 'anti')
df = df.groupBy('data', 'start_date', 'end_date').agg(F.count('exploded').alias('business_days'))
df.show()
# +-----+----------+----------+-------------+
# | data|start_date|  end_date|business_days|
# +-----+----------+----------+-------------+
# |data1|2022-05-08|2022-05-14|            4|
# |data1|2022-05-08|2022-05-21|            9|
# +-----+----------+----------+-------------+
Answers:
Is there a better way than creating a UDF...?
This method does not use a udf, so it should perform better.
Is there a way to write a pandas_udf instead? Would it be fast enough?
A pandas_udf performs better than a regular udf, but approaches that use no udf at all should be better still.
Are there some optimizations I can apply like cache the holidays table somehow on every worker?
The Spark engine performs optimizations itself. However, there are some relatively rare cases where you can help it. In the answer, I used F.broadcast(df_holidays). The broadcast sends the dataframe to all of the workers. That said, a table this small would most likely be broadcast automatically.
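To make the explode, filter, anti-join and count steps of the Spark answer easier to follow, here is a plain-Python sketch of the same logic on the same sample dates. The function name `business_days` is hypothetical, and this is a single-machine illustration rather than a replacement for the DataFrame code.

```python
from datetime import date, timedelta


def business_days(start: date, end: date, holidays: set) -> int:
    # "explode": every calendar day from start to end, inclusive
    all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # "filter": drop Saturday (5) and Sunday (6)
    weekdays = [d for d in all_days if d.weekday() < 5]
    # "anti join": drop days that appear in the holiday set
    working = [d for d in weekdays if d not in holidays]
    # "count"
    return len(working)


holidays = {date(2022, 5, 10)}
print(business_days(date(2022, 5, 8), date(2022, 5, 14), holidays))  # 4
print(business_days(date(2022, 5, 8), date(2022, 5, 21), holidays))  # 9
```

One difference worth knowing: this sketch returns 0 for a range with no business days, whereas in the Spark version such a row has no exploded rows left after the filter and therefore does not appear in the grouped result.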
Day difference without weekends
Very easy with my favourites: DateTime, DateInterval and DatePeriod:
$start = new DateTime('2012-09-06');
$end = new DateTime('2012-09-11');
// DatePeriod excludes the end date, so move it forward one day to include it
$end->modify('+1 day');
$interval = $end->diff($start);
// total days
$days = $interval->days;
// create an iterable period of dates (P1D equates to 1 day)
$period = new DatePeriod($start, new DateInterval('P1D'), $end);
// best stored as an array, so you can add more than one
$holidays = array('2012-09-07');
foreach ($period as $dt) {
    $curr = $dt->format('D');
    // subtract if Saturday or Sunday
    if ($curr == 'Sat' || $curr == 'Sun') {
        $days--;
    }
    // (optional) for the updated question
    elseif (in_array($dt->format('Y-m-d'), $holidays)) {
        $days--;
    }
}
echo $days; // 3 (6 days, minus 2 weekend days, minus 1 holiday)
PHP Updating Array of Dates but exclude Weekends (Saturday and Sunday)
Assuming you're trying to get all the days excluding weekends for the current month: you could use array_filter() with a callback to get the weekend days and then use array_diff() to create a new array containing only weekdays:
$year = date('Y');
$month = date('n');
$weekend_days = array_filter($allDays[$year][$month], function($d) {
    return (date('N', strtotime(date("Y-m-$d"))) >= 6);
});
$allDays[$year][$month] = array_diff($allDays[$year][$month], $weekend_days);
print_r($allDays);
VBA Calculate Number of Days in Weekend
This function should do the trick:
Public Function CountWeekendDays(Date1 As Date, Date2 As Date) As Long
    Dim StartDate As Date, EndDate As Date, _
        WeekendDays As Long, i As Long
    If Date1 > Date2 Then
        StartDate = Date2
        EndDate = Date1
    Else
        StartDate = Date1
        EndDate = Date2
    End If
    WeekendDays = 0
    For i = 0 To DateDiff("d", StartDate, EndDate)
        Select Case Weekday(DateAdd("d", i, StartDate))
            Case 1, 7
                WeekendDays = WeekendDays + 1
        End Select
    Next i
    CountWeekendDays = WeekendDays
End Function
As it is a Public Function, after putting it into any module you can use it directly in Excel like this: =CountWeekendDays(A1,B1)
or in your loop like this:
For i = 2 To 50
    variable = CountWeekendDays(Cells(i, "AD"), Cells(i, "T"))
Next i
And here is your whole sub with the unnecessary parts removed:
Sub DateWeekDiff()
    Dim FRow As Long, Lrow As Long, PRow As Long
    Dim CurrentSheet As Worksheet
    Set CurrentSheet = Excel.Sheets("Duplicate Removed")
    With CurrentSheet
        FRow = .UsedRange.Cells(1).Row
        Lrow = .Range("A" & .Rows.Count).End(xlUp).Row
        For PRow = Lrow To 2 Step -1
            .Cells(PRow, "AL").Value = _
                CountWeekendDays(.Cells(PRow, "AD").Value, .Cells(PRow, "T").Value)
        Next PRow
    End With
End Sub
So you just have to paste the function from the start of my post into a module; you can then use it as I did right above, or directly in Excel (this is for cell AL2): =CountWeekendDays(AD2,T2)
Add business days to date in SQL without loops
This answer has been significantly altered since it was accepted, because the original was wrong. I'm more confident in the new query, though, and it doesn't depend on DATEFIRST.
I think this should cover it:
declare @fromDate datetime
declare @daysToAdd int
select @fromDate = '20130123', @daysToAdd = 4
declare @Saturday int
select @Saturday = DATEPART(weekday,'20130126')
;with Numbers as (
    select 0 as n union all select 1 union all select 2 union all select 3 union all select 4
), Split as (
    select @daysToAdd%5 as PartialDays, @daysToAdd/5 as WeeksToAdd
), WeekendCheck as (
    select WeeksToAdd, PartialDays,
           MAX(CASE WHEN DATEPART(weekday,DATEADD(day,n.n,@fromDate))=@Saturday THEN 1 ELSE 0 END) as HitWeekend
    from
        Split t
            left join
        Numbers n
            on
                t.PartialDays >= n.n
    group by WeeksToAdd, PartialDays
)
select DATEADD(day,WeeksToAdd*7+PartialDays+CASE WHEN HitWeekend=1 THEN 2 ELSE 0 END,@fromDate)
from WeekendCheck
We split the time to be added into a number of weeks and a number of days within a week. We then use a small numbers table to work out if adding those few days will result in us hitting a Saturday. If it does, then we need to add 2 more days onto the total.
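The same split into full weeks and leftover days can be sketched in Python. This is an illustration of the idea rather than a translation of the query: the function name `add_business_days` is hypothetical, and the sketch assumes the start date is a weekday and the number of days is not negative.

```python
from datetime import date, timedelta


def add_business_days(start: date, days_to_add: int) -> date:
    """Add business days without looping over each day."""
    if start.weekday() >= 5 or days_to_add < 0:
        raise ValueError("sketch assumes a weekday start and a non-negative count")
    weeks, partial = divmod(days_to_add, 5)   # 5 business days make one calendar week
    # Monday is 0 and Friday is 4, so reaching 5 means the leftover days hit a Saturday
    hit_weekend = start.weekday() + partial >= 5
    extra = 2 if hit_weekend else 0           # skip over Saturday and Sunday
    return start + timedelta(days=weeks * 7 + partial + extra)


print(add_business_days(date(2013, 1, 23), 4))  # 2013-01-29
```

With the values used in the query (Wednesday 23 January 2013 plus 4 business days), the leftover days cross a weekend, so two extra calendar days are added and the result is the following Tuesday.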