Archive for March, 2010

Using SQL To Accrue Values (Overtime in Time Entries)

Posted in SQL on March 15th, 2010 by Dave Andrews – 11 Comments

I’ve made it an exercise recently to write SQL code that will determine overtime hours and regular hours given a list of time entries. I wanted to do this through a single (well, nested) SQL query and to not use cursors to iterate through the entries.

The problem is that we have a table of time entries, the date the entries were made, and the employee who made them and we need to accrue the hours that employee worked in a given week. This is simple to do with iterators in code or cursors in T-SQL, but what fun would that be?

I’m going to simplify the problem by only looking at a single week of data. In a larger solution which looks over a larger range of dates such as a month you will need a subquery which assigns each week a number and group by that week number. In my full blown implementation of this, I used row_number() over (order by startdate) given a list of weeks with a startdate and enddate. But that is for a later post.

Let’s look at our week’s worth of data, which has 2 employees.

Table TimeEntries

TimeEntryID EmployeeID DateWorked Hours
1 1 2010-01-04 8
2 1 2010-01-05 10
3 1 2010-01-06 7.5
4 1 2010-01-07 10
5 1 2010-01-08 10
6 1 2010-01-09 4
7 2 2010-01-04 10
8 2 2010-01-05 10
9 2 2010-01-06 10
10 2 2010-01-07 10

As you can infer from the above, employee 1 worked overtime (more than 40 hours) starting on the 8th, and employee 2 worked no overtime, since the employee worked exactly 40 hours in that week.

Problem 1: Multiple Records in a Day The first problem we must overcome is calculating the total number of hours worked in a given day. This is because we can have an employee who puts in 4 hours in the morning and 4 hours in the afternoon. If we don’t have this in a single record, it will be difficult down the line when our complicated joins and subqueries come in.

So we build this query which will sum up the hours in a given day. We will use this in a couple of places.

1
2
3
4
5
6
7
8
9
SELECT
	te.EmployeeId,
	te.DateWorked,
	SUM(te.Hours) TotalHoursInDay
FROM
	TimeEntries te
GROUP BY
	te.EmployeeId,
	te.DateWorked

Very simple, we’re just summing up the hours per day. We’re going to use this query in several places.

Problem 2: Finding out which days are in overtime. The next thing we need to overcome is to come up with a means of determining which days should be in overtime. This is where “accrual” of sorts comes into play. We join each day’s record to every record before it. We then group that record and sum up the hours previous to the given day.

Let’s look at the query.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
SELECT
	te.EmployeeId,
	te.TotalHoursInDay,
	te.DateWorked,
	sum(previous.TotalHoursInDay) HoursUpToThisPoint,
	case when te.TotalHoursInDay + SUM(previous.TotalHoursInDay) > 40 then 'Y'
		 else 'N'
	end InOvertime
FROM
	(
		SELECT
			te.EmployeeId,
			te.DateWorked,
			SUM(te.Hours) TotalHoursInDay
		FROM
			TimeEntries te
		GROUP BY
			te.EmployeeId,
			te.DateWorked
	) te
	LEFT JOIN
	(
		SELECT
			te.EmployeeId,
			te.DateWorked,
			SUM(te.Hours) TotalHoursInDay
		FROM
			TimeEntries te
		GROUP BY
			te.EmployeeId,
			te.DateWorked
	) previous
		ON	te.EmployeeId = previous.EmployeeId AND
			te.DateWorked > previous.DateWorked
GROUP BY
	te.EmployeeId,
	te.TotalHoursInDay,
	te.DateWorked

The first subquery, te, creates a list of each day for each employee and the total hours for that day. This is to get the total current hours for the employee for the given day. We join that to the second subquery, previous. The Previous query generates a total number for all days PRIOR to the given day. (The logic comes from our join condition, te.DateWorked > previous.DateWorked)

The InOvertime flag will be used later on to determine overtime hours. It lets us know when the employee has crossed the threshold. That could probably be taken out, but it simplifies the logic later on.

So we get these results:

EmployeeID TotalHoursInDay DateWorked HoursUpToThisPoint InOvertime
1 8 2010-01-04 NULL N
1 10 2010-01-05 8 N
1 7.5 2010-01-06 18 N
1 10 2010-01-07 25.5 N
1 10 2010-01-08 35.5 Y
1 4 2010-01-08 45.5 Y
2 10 2010-01-04 NULL N
2 10 2010-01-05 10 N
2 10 2010-01-06 20 N
2 10 2010-01-07 30 N

As you can see, this subquery determines for us the total number of hours up until the given day, and when combined with the hours worked in the given day we know if the employee is in overtime or not.

Let’s put one more wrapper around this which will total up the hours and tell us how many hours were regular hours and how many were overtime.

Here is our final query.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
SELECT
	EmployeeId,
	sum(case when InOvertime = 'N' then TotalHoursInDay
	         when InOvertime = 'Y' AND HoursUpToThisPoint < 40 then  40 - HoursUpToThisPoint
		 else 0
	end) RegularHours,
	sum(case when InOvertime = 'Y' then
		 case when HoursUpToThisPoint > 40 then TotalHoursInDay
			  else HoursUpToThisPoint + TotalHoursInDay - 40
		 end
		 else 0
	end) OTHours
FROM
(
	SELECT
		te.EmployeeId,
		te.TotalHoursInDay,
		te.DateWorked,
		sum(previous.TotalHoursInDay) HoursUpToThisPoint,
		case when te.TotalHoursInDay + SUM(previous.TotalHoursInDay) > 40 then 'Y'
			 else 'N'
		end InOvertime
	FROM
		(
			SELECT
				te.EmployeeId,
				te.DateWorked,
				SUM(te.Hours) TotalHoursInDay
			FROM
				TimeEntries te
			GROUP BY
				te.EmployeeId,
				te.DateWorked
		) te
		LEFT JOIN
		(
			SELECT
				te.EmployeeId,
				te.DateWorked,
				SUM(te.Hours) TotalHoursInDay
			FROM
				TimeEntries te
			GROUP BY
				te.EmployeeId,
				te.DateWorked
		) previous
			ON	te.EmployeeId = previous.EmployeeId AND
				te.DateWorked > previous.DateWorked
	GROUP BY
		te.EmployeeId,
		te.TotalHoursInDay,
		te.DateWorked
) HoursQuery
GROUP BY
	HoursQuery.EmployeeId
EmployeeId RegularHours OTHours
1 40 9.5
2 40 0

What we have done here is add some logic that acts on whether or not the a given time entry is in overtime. We group by the employee so we can aggregate hours.

If the record is in overtime, it might mean the employee is already over 40 hours or it might mean that the current record puts the employee over 40 hours. So we have to add a check that will take the difference as regular hours, since the hours that take the employee up to 40 are still regular hours.

The second check is to calculate overtime. If the time entry puts the employee over 40, then take the amount over 40 as overtime. If the employee is already over 40, then the entire entry is overtime.

I hope this post helps you have the mindset of doing as much as possible with your queries, rather than writing code to calculate values which can easily be queried.