Using SQL To Accrue Values (Overtime in Time Entries)
I’ve made it an exercise recently to write SQL code that will determine overtime hours and regular hours given a list of time entries. I wanted to do this through a single (well, nested) SQL query and to not use cursors to iterate through the entries.
The problem is that we have a table of time entries, the date the entries were made, and the employee who made them and we need to accrue the hours that employee worked in a given week. This is simple to do with iterators in code or cursors in T-SQL, but what fun would that be?
I’m going to simplify the problem by only looking at a single week of data. In a larger solution which looks over a larger range of dates such as a month you will need a subquery which assigns each week a number and group by that week number. In my full blown implementation of this, I used row_number() over (order by startdate) given a list of weeks with a startdate and enddate. But that is for a later post.
Let’s look at our week’s worth of data, which has 2 employees.
Table TimeEntries
| TimeEntryID | EmployeeID | DateWorked | Hours |
|---|---|---|---|
| 1 | 1 | 2010-01-04 | 8 |
| 2 | 1 | 2010-01-05 | 10 |
| 3 | 1 | 2010-01-06 | 7.5 |
| 4 | 1 | 2010-01-07 | 10 |
| 5 | 1 | 2010-01-08 | 10 |
| 6 | 1 | 2010-01-09 | 4 |
| 7 | 2 | 2010-01-04 | 10 |
| 8 | 2 | 2010-01-05 | 10 |
| 9 | 2 | 2010-01-06 | 10 |
| 10 | 2 | 2010-01-07 | 10 |
As you can infer from the above, employee 1 worked overtime (more than 40 hours) starting on the 8th, and employee 2 worked no overtime, since the employee worked exactly 40 hours in that week.
Problem 1: Multiple Records in a Day The first problem we must overcome is calculating the total number of hours worked in a given day. This is because we can have an employee who puts in 4 hours in the morning and 4 hours in the afternoon. If we don’t have this in a single record, it will be difficult down the line when our complicated joins and subqueries come in.
So we build this query which will sum up the hours in a given day. We will use this in a couple of places.
1 2 3 4 5 6 7 8 9 | SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked |
Very simple, we’re just summing up the hours per day. We’re going to use this query in several places.
Problem 2: Finding out which days are in overtime. The next thing we need to overcome is to come up with a means of determining which days should be in overtime. This is where “accrual” of sorts comes into play. We join each day’s record to every record before it. We then group that record and sum up the hours previous to the given day.
Let’s look at the query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | SELECT te.EmployeeId, te.TotalHoursInDay, te.DateWorked, sum(previous.TotalHoursInDay) HoursUpToThisPoint, case when te.TotalHoursInDay + SUM(previous.TotalHoursInDay) > 40 then 'Y' else 'N' end InOvertime FROM ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) te LEFT JOIN ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) previous ON te.EmployeeId = previous.EmployeeId AND te.DateWorked > previous.DateWorked GROUP BY te.EmployeeId, te.TotalHoursInDay, te.DateWorked |
The first subquery, te, creates a list of each day for each employee and the total hours for that day. This is to get the total current hours for the employee for the given day. We join that to the second subquery, previous. The Previous query generates a total number for all days PRIOR to the given day. (The logic comes from our join condition, te.DateWorked > previous.DateWorked)
The InOvertime flag will be used later on to determine overtime hours. It lets us know when the employee has crossed the threshold. That could probably be taken out, but it simplifies the logic later on.
So we get these results:
| EmployeeID | TotalHoursInDay | DateWorked | HoursUpToThisPoint | InOvertime |
|---|---|---|---|---|
| 1 | 8 | 2010-01-04 | NULL | N |
| 1 | 10 | 2010-01-05 | 8 | N |
| 1 | 7.5 | 2010-01-06 | 18 | N |
| 1 | 10 | 2010-01-07 | 25.5 | N |
| 1 | 10 | 2010-01-08 | 35.5 | Y |
| 1 | 4 | 2010-01-08 | 45.5 | Y |
| 2 | 10 | 2010-01-04 | NULL | N |
| 2 | 10 | 2010-01-05 | 10 | N |
| 2 | 10 | 2010-01-06 | 20 | N |
| 2 | 10 | 2010-01-07 | 30 | N |
As you can see, this subquery determines for us the total number of hours up until the given day, and when combined with the hours worked in the given day we know if the employee is in overtime or not.
Let’s put one more wrapper around this which will total up the hours and tell us how many hours were regular hours and how many were overtime.
Here is our final query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | SELECT EmployeeId, sum(case when InOvertime = 'N' then TotalHoursInDay when InOvertime = 'Y' AND HoursUpToThisPoint < 40 then 40 - HoursUpToThisPoint else 0 end) RegularHours, sum(case when InOvertime = 'Y' then case when HoursUpToThisPoint > 40 then TotalHoursInDay else HoursUpToThisPoint + TotalHoursInDay - 40 end else 0 end) OTHours FROM ( SELECT te.EmployeeId, te.TotalHoursInDay, te.DateWorked, sum(previous.TotalHoursInDay) HoursUpToThisPoint, case when te.TotalHoursInDay + SUM(previous.TotalHoursInDay) > 40 then 'Y' else 'N' end InOvertime FROM ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) te LEFT JOIN ( SELECT te.EmployeeId, te.DateWorked, SUM(te.Hours) TotalHoursInDay FROM TimeEntries te GROUP BY te.EmployeeId, te.DateWorked ) previous ON te.EmployeeId = previous.EmployeeId AND te.DateWorked > previous.DateWorked GROUP BY te.EmployeeId, te.TotalHoursInDay, te.DateWorked ) HoursQuery GROUP BY HoursQuery.EmployeeId |
| EmployeeId | RegularHours | OTHours |
|---|---|---|
| 1 | 40 | 9.5 |
| 2 | 40 | 0 |
What we have done here is add some logic that acts on whether or not the a given time entry is in overtime. We group by the employee so we can aggregate hours.
If the record is in overtime, it might mean the employee is already over 40 hours or it might mean that the current record puts the employee over 40 hours. So we have to add a check that will take the difference as regular hours, since the hours that take the employee up to 40 are still regular hours.
The second check is to calculate overtime. If the time entry puts the employee over 40, then take the amount over 40 as overtime. If the employee is already over 40, then the entire entry is overtime.
I hope this post helps you have the mindset of doing as much as possible with your queries, rather than writing code to calculate values which can easily be queried.
Follow Dave on Twitter
Hi Dave,
I have same time entery system and would like to create a report that would give me Regular and overtime hours based on FullName, Department, Hours and Dateworked.
I also would like to (after adding total hours of regular and overtime ) divide that total with Overtime hours to get Overtime percentage.
Can I do it in one SQL?
Thanks., Joseph
That should work. Just put another wrapping query around it that does something like:
select
subquery.*,
case when overtime_hours = 0 then 0 else total_hours / overtime_hours end OTPercentage
from
(
……… the full query here …….
) subquery
I tried bout I am getting different numbers. where Do I put the query ? and
Do I divide Hoursuptothispoint to itself to get OT pertencetage or Hoursuptothispoint + Totalhoursinday / Hoursuptothispoint ? as I am trying to refernce to Regularhours + OTHours / OTHours in your code? Thank you for time and help.
Hey Dave, I have a small project of creating an excel sheet that would collect data offline for a web app I have with a MySQL backend. Please contact me offline and let me know if you would be interested or if you know of someone who would.
Thanks
J.
I have custom time entry system, the only problem I am facing right now is our payroll runs bimonthly which is 15 and last day of month, and overtime calculation is on weekly basis means 40 hours a week.
Please advice how to take contol of this I need report with EmployeeId, RegularHours, OTHours on 15th and last day of month.
Thanks
The system I wrote does exactly that. The company does semi-monthly payments (15 and last day) so I had to have a “status” flag on each time entry. The statuses are:
1. Unapproved
2. Approved
3. Approved By Payroll
4. Payroll Processed
All time entries go in with status of 1. When the employee’s supervisor approves their timesheet, it becomes 2. When the payroll user agrees, the status is 3. When it’s processed by the file, it’s 4.
So what I had to do was sum up all the time entries that occurred before the 1st (or the 16th) that had a status of 4. If that total was more than 40, then the employee was already in overtime. If the sum of the week plus the previous “status 4″ entries was more then 40, then the employee went into overtime later in the week.
You also have to break up weeks by the start date and end date. Since we are going by 40 hours a week, we can’t use the date range of 1-15 and 16-31 for this. The 1st of the month may actually be a Wednesday, so you have to go back to the Sunday of that week to begin totaling. I have a table of weeks from 1990 up until 2030 that I utilize for this.
The report I wrote here (of which this post is A VERY SIMPLIFIED SUBSET) was for a company of about 300 employees and it’s working fantastic for them. In that case, I had the luxury that I had written the entire time entry and payroll system, so I could change it to suit my needs. I wish I could post some code but I can’t because as a consultant I don’t own it. I would really recommend that you NOT use a single SQL query to do this, but rather iterate through entries in code. I had to do so in this case due to some business requirements.
Dave,
Your code works great. I am trying to figure out how to add to your code so that I can get last 4 weeks OT Avg. I have two parameters , workeddate and departmentcode. So when I choose a date from workeddate and departmencode, it should give me last 4 weeks Avg. from the workeddate.
Thank you for your help.
Hi Dave,
Can I get an answer for the request below?
Your code works great. I am trying to figure out how to add to your code so that I can get last 4 weeks OT Avg. I have two parameters , workeddate and departmentcode. So when I choose a date from workeddate and departmencode, it should give me last 4 weeks Avg. from the workeddate.
Thank you for your help.
Do you already have it broken out into weeks? You can’t do so unless you can get something like:
week 1, employee 1, num ot hours
week 2, employee 1, num ot hours
week 3, employee 1, num ot hours
week 4, employee 1, num ot hours
Are you able to get it into a format like that? The code I had above didn’t include any week partitioning.
Hi Dave,
Thank you for your response. No I havent broken it into weeks. I am not sure how to do it. Is is possible for you to write it? In my code I am getting last 4 weeks data from whatever todays date is.
Thank you for your help in advanced.
jk–
You break it into weeks by creating a calendar table and pre-populating it with the data. You then join to that table to determine which week the time entry occured in.