In week 26 of Workout Wednesday for Power BI, I asked people to calculate the age of Nobel laureates at the time they received the award. I provided some logic, but I didn’t prescribe how to create the age calculation. This inspired a couple of questions and a round of data validation as calculating age may be trickier than you think. In this post, I’ll explore some of the ways people have calculated age in Power BI and the edge cases where those calculations may not work.
In my solution video for Workout Wednesday, I used Power Query to calculate age. This was inspired by several blog posts and videos I had seen previously. There is an Age menu option in the Power Query editor under Date.
When you select a date column and use that Age option, it calculates the duration between the selected date and the current date in days. You must then replace the current date with the second date column. Next you can choose Total Years under Duration, which divides the days by 365. Finally, you must round that number down to the next integer to get years.
If you follow Ruth’s video, you can do all of that in one step that creates a custom column with the final age value.
Number.RoundDown(Duration.TotalDays([Date2] - [Date1])/365)
That is the most common option in Power Query as there is no DateDiff function.
There are a few options for calculating age in DAX. Some people use the DATEDIFF function.
Age DateDiff = DATEDIFF([Date1],[Date2],YEAR)
Another way I have seen is to use YEARFRAC function.
Age YearFrac = INT ( YEARFRAC ( [Date1], [Date2], 1 ) )
The way Marco Russo suggests is to use QUOTIENT.
Age Quotient (DAX): Age Quotient = VAR Birthdate = [Date1] VAR ThisDay = [Date2] VAR IntBirthdate = YEAR ( Birthdate ) * 10000 + MONTH ( Birthdate ) * 100 + DAY ( Birthdate ) VAR IntThisDay = YEAR ( ThisDay ) * 10000 + MONTH ( ThisDay ) * 100 + DAY ( ThisDay ) VAR Age = QUOTIENT ( IntThisDay - IntBirthdate, 10000 ) VAR CheckedAge = DIVIDE ( Age, NOT ISBLANK ( Birthdate ) ) RETURN CheckedAge
As Marco points out, many people were using YEARFRAC, but there is a bug in the DAX implementation that causes it to occasionally return an incorrect answer for this purpose.
Checking the Numbers
I created a Power BI file to demonstrate the differences in these four calculations. You can download the file here. The image below displays the results in several tests. For each row, I’m using Date1 as the birthdate and Date2 as the “as of” date. You’ll notice that I focused on leap years for a few cases.
There are six of ten date ranges that have different results across the different calculation methods.
In the second row, the Power Query age calculation says that Feb 29 to Feb 28 in the following year is a full year. This may or may not be what you want depending on your requirements. I’m noting the difference so you can be aware. A similar thing occurs in the fifth row going from Feb 29, 2016 to Feb 28, 2020, and again on the 9th row going from March 1, 2019 to Feb 29, 2020.
On the third row, notice that the DAX DATEDIFF function calculates Feb 29 to Feb 27 of the following year to be a full year, despite it being a day or two short. Depending on what you do with leap years, you might consider Feb 29 to Feb 28 in the following year to be a full year, but that third row result means DATEDIFF is probably not the calculation I want. We see a similar result going from March 1 to Feb 28 of the following year.
YEARFRAC calculates that Feb 29 to Feb 28 in the following year is not a full year, which may be desirable. But it counts Feb 29, 2016 to Feb 29, 2020 as only three years. And we see that March 1, 2000 to March 1, 2021 is only counted as 20 years. So even without starting on a leap year, we get some incorrect results. Small numbers seem to be correct until it gets to about 13 years.
Using the QUOTIENT function provides what I consider to be the most correct results. It calculates Feb 29 to Feb 28 of the following year to be less than a year. It calculates Feb 29, 2016 to Feb 28, 2016 to be three years and not four. And it calculates March 1 to Feb 29 of the following year to be less than a year.
Which to use?
The QUOTIENT formula produces the most accurate results if you don’t want Feb 29 to Feb 28 the next year to be counted as a year. DATEDIFF and YEARFRAC produce too many incorrect results for me to ever suggest using them. Since there is a DAX option that produces more correct answers, I would just go for QUOTIENT instead of either of these two.
UPDATE: There is a better alternative! Imke Feldmann reminded me that there is an Number.IntegerDivide function in Power Query. So let’s take the logic from Marco’s DAX calculation and move it to Power Query:
(BirthDate as date, EndDate as date) =>
BirthDateInt = Date.Year(BirthDate)10000 + Date.Month(BirthDate)100 + Date.Day(BirthDate),
EndDateInt = Date.Year(EndDate)10000 + Date.Month(EndDate)100 + Date.Day(EndDate),
Age = Number.IntegerDivide((EndDateInt - BirthDateInt),10000)
The Power Query custom column created by invoking this function should produce better compression than a DAX calculated column. This might not be significant for a small dataset, but we should be efficient when we can.