Using SQL To Accrue Values (Overtime in Time Entries)
Posted in SQL on March 15th, 2010 by Dave Andrews – 11 CommentsI’ve made it an exercise recently to write SQL code that will determine overtime hours and regular hours given a list of time entries. I wanted to do this through a single (well, nested) SQL query and to not use cursors to iterate through the entries.
The problem is that we have a table of time entries, the date the entries were made, and the employee who made them and we need to accrue the hours that employee worked in a given week. This is simple to do with iterators in code or cursors in T-SQL, but what fun would that be?
I’m going to simplify the problem by only looking at a single week of data. In a larger solution which looks over a larger range of dates such as a month you will need a subquery which assigns each week a number and group by that week number. In my full blown implementation of this, I used row_number() over (order by startdate) given a list of weeks with a startdate and enddate. But that is for a later post.
Let’s look at our week’s worth of data, which has 2 employees.
Table TimeEntries
| TimeEntryID | EmployeeID | DateWorked | Hours |
|---|---|---|---|
| 1 | 1 | 2010-01-04 | 8 |
| 2 | 1 | 2010-01-05 | 10 |
| 3 | 1 | 2010-01-06 | 7.5 |
| 4 | 1 | 2010-01-07 | 10 |
| 5 | 1 | 2010-01-08 | 10 |
| 6 | 1 | 2010-01-09 | 4 |
| 7 | 2 | 2010-01-04 | 10 |
| 8 | 2 | 2010-01-05 | 10 |
| 9 | 2 | 2010-01-06 | 10 |
| 10 | 2 | 2010-01-07 | 10 |
As you can infer from the above, employee 1 worked overtime (more than 40 hours) starting on the 8th, and employee 2 worked no overtime, since the employee worked exactly 40 hours in that week.
Problem 1: Multiple Records in a Day The first problem we must overcome is calculating the total number of hours worked in a given day. This is because we can have an employee who puts in 4 hours in the morning and 4 hours in the afternoon. If we don’t have this in a single record, it will be difficult down the line when our complicated joins and subqueries come in.
So we build this query which will sum up the hours in a given day. We will use this in a couple of places.
1 2 3 4 5 6 7 8 9 | SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked |
Very simple, we’re just summing up the hours per day. We’re going to use this query in several places.
Problem 2: Finding out which days are in overtime. The next thing we need to overcome is to come up with a means of determining which days should be in overtime. This is where “accrual” of sorts comes into play. We join each day’s record to every record before it. We then group that record and sum up the hours previous to the given day.
Let’s look at the query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | SELECT te.EmployeeId, te.TotalHoursInDay, te.DateWorked, sum(previous.TotalHoursInDay) HoursUpToThisPoint, case when te.TotalHoursInDay + SUM(previous.TotalHoursInDay) > 40 then 'Y' else 'N' end InOvertime FROM ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) te LEFT JOIN ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) previous ON te.EmployeeId = previous.EmployeeId AND te.DateWorked > previous.DateWorked GROUP BY te.EmployeeId, te.TotalHoursInDay, te.DateWorked |
The first subquery, te, creates a list of each day for each employee and the total hours for that day. This is to get the total current hours for the employee for the given day. We join that to the second subquery, previous. The Previous query generates a total number for all days PRIOR to the given day. (The logic comes from our join condition, te.DateWorked > previous.DateWorked)
The InOvertime flag will be used later on to determine overtime hours. It lets us know when the employee has crossed the threshold. That could probably be taken out, but it simplifies the logic later on.
So we get these results:
| EmployeeID | TotalHoursInDay | DateWorked | HoursUpToThisPoint | InOvertime |
|---|---|---|---|---|
| 1 | 8 | 2010-01-04 | NULL | N |
| 1 | 10 | 2010-01-05 | 8 | N |
| 1 | 7.5 | 2010-01-06 | 18 | N |
| 1 | 10 | 2010-01-07 | 25.5 | N |
| 1 | 10 | 2010-01-08 | 35.5 | Y |
| 1 | 4 | 2010-01-08 | 45.5 | Y |
| 2 | 10 | 2010-01-04 | NULL | N |
| 2 | 10 | 2010-01-05 | 10 | N |
| 2 | 10 | 2010-01-06 | 20 | N |
| 2 | 10 | 2010-01-07 | 30 | N |
As you can see, this subquery determines for us the total number of hours up until the given day, and when combined with the hours worked in the given day we know if the employee is in overtime or not.
Let’s put one more wrapper around this which will total up the hours and tell us how many hours were regular hours and how many were overtime.
Here is our final query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | SELECT EmployeeId, sum(case when InOvertime = 'N' then TotalHoursInDay when InOvertime = 'Y' AND HoursUpToThisPoint < 40 then 40 - HoursUpToThisPoint else 0 end) RegularHours, sum(case when InOvertime = 'Y' then case when HoursUpToThisPoint > 40 then TotalHoursInDay else HoursUpToThisPoint + TotalHoursInDay - 40 end else 0 end) OTHours FROM ( SELECT te.EmployeeId, te.TotalHoursInDay, te.DateWorked, sum(previous.TotalHoursInDay) HoursUpToThisPoint, case when te.TotalHoursInDay + SUM(previous.TotalHoursInDay) > 40 then 'Y' else 'N' end InOvertime FROM ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) te LEFT JOIN ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) previous ON te.EmployeeId = previous.EmployeeId AND te.DateWorked > previous.DateWorked GROUP BY te.EmployeeId, te.TotalHoursInDay, te.DateWorked ) HoursQuery GROUP BY HoursQuery.EmployeeId |
| EmployeeId | RegularHours | OTHours |
|---|---|---|
| 1 | 40 | 9.5 |
| 2 | 40 | 0 |
What we have done here is add some logic that acts on whether or not the a given time entry is in overtime. We group by the employee so we can aggregate hours.
If the record is in overtime, it might mean the employee is already over 40 hours or it might mean that the current record puts the employee over 40 hours. So we have to add a check that will take the difference as regular hours, since the hours that take the employee up to 40 are still regular hours.
The second check is to calculate overtime. If the time entry puts the employee over 40, then take the amount over 40 as overtime. If the employee is already over 40, then the entire entry is overtime.
I hope this post helps you have the mindset of doing as much as possible with your queries, rather than writing code to calculate values which can easily be queried.