Friday, March 20, 2009

Calculating Business Days

A common problem in making custom scheduler application is figuring out how to schedule future items or to predict the next due date. For an example, let's say you are the programmer in UPS (the shipping company wearing everything in brown) - and let say you want to give an ability for the customers to view that if a shipment is shipped on a certain date, it would take 5 business days after the shipment date to get to their door - let's call this a delivery date (an existing functionality). Simple enough, right? If we are just doing this in our head, it sounds pretty easy. But we need to remember that we must skip holidays and weekends - since UPS is only open and operating on BUSINESS DAYS.

There are actually an easy solution to this (which we will see) and I also made a more optimized solution and compared the performance.

We are going to assume that there is a list (List _holidays) that holds the holiday schedule. So our method will be given a start date and an interval - and it should figure out the next date. Here is the code for the easiest solution.


private DateTime ProjectDateByLoop(DateTime startDate, int interval)
{
DateTime endDate = startDate;
int i = 0;
while (i < interval)
{
endDate = endDate.AddDays(1.0);
if (endDate.DayOfWeek != DayOfWeek.Saturday && endDate.DayOfWeek != DayOfWeek.Sunday && (!_holidays.Contains(endDate)))
i++;
}
return endDate;
}
A very simple code, it basically increment the day one by one and check whether it is a holiday or not or whether it falls on a weekend.

As we see above, it is very very simple and it works accurately. In my example above about UPS, this is probably is the best solution for the problem: simple code, easy to maintain, and works! But, as I look into it, I started to wonder what if the interval is in the range of thousands, or millions? It is very unlikely and may never be used ever - but I felt compelled (with the encouragement of certain people) to find a better way of doing this.

So, with that in mind, I produced this code below.

private static int CountDaysInDates(DayOfWeek queriedDay, DateTime startDate, DateTime endDate)
{
TimeSpan ts = endDate - startDate;
// get number of full weeks within dates
int count = (int)Math.Floor(ts.TotalDays / 7);

// get remainder and adjust
int remainder = (int)(ts.TotalDays % 7);
int numOfDaySinceLastQueriedDay = (int)(endDate.DayOfWeek - queriedDay);
if (numOfDaySinceLastQueriedDay < 0) numOfDaySinceLastQueriedDay += 7;
if (remainder >= numOfDaySinceLastQueriedDay) count++;
return count;
}

private DateTime ProjectDate(DateTime startDate, int interval)
{
DateTime endDate = startDate;

if (interval == 0)
endDate = startDate;
else
{
endDate = startDate.AddDays(interval);
int numOfSaturday = CountDaysInDates(DayOfWeek.Saturday, startDate.DayOfWeek == DayOfWeek.Saturday ? startDate.AddDays(1) : startDate, endDate);
int numOfSunday = CountDaysInDates(DayOfWeek.Sunday, startDate.DayOfWeek == DayOfWeek.Sunday ? startDate.AddDays(1) : startDate, endDate);

int numOfHoliday = _holidays.Count(h => h >= startDate && h <= endDate && (h.DayOfWeek != DayOfWeek.Sunday && h.DayOfWeek != DayOfWeek.Saturday));
int offSet = numOfSaturday + numOfSunday + numOfHoliday;
endDate = ProjectDate(endDate, offSet);
}
return endDate;
}
At first glance, the second code should be much faster, since it does not have any loop at all. But, it is not quite as simple as the first code (even though it is still simple). The method CountDaysInDates is basically for counting how many times a day (like "Monday") occurs within 2 given dates. I use this method to calculate how many Saturdays and Sundays (weekends) to skipped. Why not just count the number of weekends? Because there is a likelihood that the start date is on a Sunday which needs to be skipped and counted only as "half" of a weekend. Secondly, I want the code to be flexible enough that if the algorithm is used to skip only Sundays, it can be modified easily.

The method ProjectDate is a recursive method. In it basically I checked the number of Saturdays and Sundays plus holidays and add them together - and then call ProjectDate again with the new start date (after interval was added) with the new interval (Sundays + Saturdays + holidays).

So ... how are the performance of these two algorithms? Amazingly enough, when they are used with small intervals, the performance difference is negligible in real use. So, I created an iteration for 100,000 times to run both and measure the execution time.


private void TargetDateHarness()
{
int totalPasses = 100000;
int interval = 720;
DateTime targetDate;

System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
sw.Start();
for (int i = 0; i <= totalPasses; i++)
targetDate = ProjectDate(DateTime.Today, interval);
sw.Stop();
TimeSpan ts = sw.Elapsed;
testContextInstance.WriteLine("Non-Loop time: " + ts.TotalMilliseconds + " ms");

sw.Reset();
sw.Start();
for (int i = 0; i <= totalPasses; i++)
targetDate = ProjectDateByLoop(DateTime.Today, interval);
sw.Stop();
ts = sw.Elapsed;
testContextInstance.WriteLine("Loop time: " + ts.TotalMilliseconds + " ms");
}
The result:

Non-Loop time: 2018.7745 ms (2 seconds)
Loop time: 53362.9731 ms (53 seconds)