Couret.Venter
From BattleActs Wiki
'''Reading''': Couret, J. and Venter, G., "Using Multi-Dimensional Credibility to Estimate Class Frequency Vectors in Workers Compensation"


'''Synopsis''': It can be difficult to obtain credible data for low-frequency events. In this article Couret &amp; Venter look at Workers' Compensation and how the different injury severity types can be used to improve the predictive power of the data, since there is often only a small (possibly random) difference between an accident being, say, fatal or resulting in a permanent injury.
 
The article comes in three parts. First, we introduce the equations which define the multi-dimensional credibility setup. Then, we discuss how to estimate the required credibilities in order to produce estimates of the Workers' Compensation class means by injury type. Lastly, we look at testing the effectiveness of the multi-dimensional credibility technique.


==Study Tips==
<!--This article is best read after ''[[Robertson.HazardGroups]]'' because it makes use of many of the Workers' Compensation terms introduced there.
-->
This is a challenging article as it is densely written, so we recommend you read the wiki before turning to the source material. If you get stuck on part of the article, skip ahead for a bit and circle back later. It may take two or three readings for the concepts and algebra to come together.


The material hasn't come up very often lately and there is considerable variation in how the CAS has previously tested it. Make sure you can do all of the prior exam questions and know the concepts well. However, if your exam has a question on this material, you may want to leave it towards the end of the exam to focus on easier points first.


'''Estimated study time''': 16 hours ''(not including subsequent review time)''


==BattleTable==
Based on past exams, the '''main things''' you need to know ''(in rough order of importance)'' are:


* Be able to apply the <span style="color:green;">'''Sum of Squared Errors Test'''</span> to a given set of results.
* Be able to apply the <span style="color:blue;">'''Multi-dimensional credibility equations'''</span> to the data.
* Be able to apply the '''Quintiles Test''' in the context of this paper.
* Be able to briefly describe Couret & Venter's results.
 
{|class="wikitable"
|-
| Questions from the Fall 2019 exam are held out for practice purposes. (They are included in the CAS practice exam.)
|}


: {| class='wikitable' style='width: 1000px;'


|-
! style='width: 175px;' | reference !! style='width: 200px;' | part (a) !! style='width: 200px;' | part (b) !! style='width: 200px;' | part (c) !! part (d)
 
|- style="border-bottom: 2px solid;"
|| [https://www.battleacts8.ca/8/pdf/Exam_(2015_2-Fall)/(2015_2-Fall)_(05).pdf <span style='font-size: 12px; background-color: yellow; border: solid; border-width: 1px; border-radius: 5px; padding: 2px 5px 2px 5px; margin: 5px;'>E</span>] <span style='color: red;'>'''(2015.Fall #5)'''</span>
|| <span style="color:blue;">'''Multi-dimensional Credibility'''</span> <br> - calculate ratio
|| '''Quintile Test''' <br> - describe process
|| <span style="color:green;">'''Sum of Squared Errors Test'''</span> <br> - shortcomings
| style="background-color: lightgrey;" |


|-
|| [https://www.battleacts8.ca/8/pdf/Exam_(2014_2-Fall)/(2014_2-Fall)_(01).pdf <span style='font-size: 12px; background-color: yellow; border: solid; border-width: 1px; border-radius: 5px; padding: 2px 5px 2px 5px; margin: 5px;'>E</span>] <span style='color: red;'>'''(2014.Fall #1)'''</span>
|| '''Quintile Test''' <br> - evaluate
|| <span style="color:green;">'''Sum of Squared Errors Test'''</span> <br> - apply &amp; recommend
| style="background-color: lightgrey;" |
| style="background-color: lightgrey;" |


|- style="border-bottom: 2px solid;"
|| [https://www.battleacts8.ca/8/pdf/Exam_(2014_2-Fall)/(2014_2-Fall)_(04).pdf <span style='font-size: 12px; background-color: yellow; border: solid; border-width: 1px; border-radius: 5px; padding: 2px 5px 2px 5px; margin: 5px;'>E</span>] <span style='color: red;'>'''(2014.Fall #4)'''</span>
|| '''Statistical Considerations''' <br> - apply in context
| style="background-color: lightgrey;" |
| style="background-color: lightgrey;" |
| style="background-color: lightgrey;" |


|- style="border-bottom: 2px solid;"
|| [https://www.battleacts8.ca/8/pdf/Exam_(2013_2-Fall)/(2013_2-Fall)_(03).pdf <span style='font-size: 12px; background-color: yellow; border: solid; border-width: 1px; border-radius: 5px; padding: 2px 5px 2px 5px; margin: 5px;'>E</span>] <span style='color: red;'>'''(2013.Fall #3)'''</span>
|| '''Holdout Sample''' <br> - describe purpose
|| '''Holdout Sample''' <br> - recommend
|| <span style="color:green;">'''Sum of Squared Errors Test'''</span> <br> - evaluate
| style="background-color: lightgrey;" | ''Trends'' <br> - ''[[Mahler.Credibility]]''


|- style="border-bottom: 2px solid;"
|| [https://www.battleacts8.ca/8/pdf/Exam_(2012_2-Fall)/(2012_2-Fall)_(05).pdf <span style='font-size: 12px; background-color: yellow; border: solid; border-width: 1px; border-radius: 5px; padding: 2px 5px 2px 5px; margin: 5px;'>E</span>] <span style='color: red;'>'''(2012.Fall #5)'''</span>
|| <span style="color:blue;">'''Multi-dimensional Credibility'''</span> <br> - determine appropriateness
|| '''Expected Loss''' <br> - calculate
| style="background-color: lightgrey;" |
| style="background-color: lightgrey;" |


|}

<!-- ******** BattleBar Code ******** -->
{|style="border: solid; color:lightgrey; border-radius:10px; border-width:2px; align:center;"
|-
<!-- ******** Full BattleQuiz ******** -->
|style="padding:2px"|[https://battleacts8.ca/8/FC.php?selectString=**&filter=both&sortOrder=natural&colorFlag=allFlag&colorStatus=allStatus&priority=importance-high&subsetFlag=miniQuiz&prefix=Couret&suffix=Venter&section=all&subSection=all&examRep=all&examYear=all&examTerm=all&quizNum=all<span style="font-size: 20px; background-color: lightgreen; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 10px;">'''Full BattleQuiz]'''</span>
<!--                                    -->
<!-- ******** Excel BattleQuiz ******** -->
<!--
|style="padding:2px"|[https://battleacts8.ca/8/FC.php?selectString=**&filter=both&sortOrder=natural&colorFlag=allFlag&colorStatus=allStatus&priority=importance-high&subsetFlag=miniQuiz&prefix=Couret&suffix=Venter&section=all&subSection=all&examRep=all&examYear=all&examTerm=all&quizNum=all<span style="color: red; font-size: 20px; background-color: lightgreen; border: solid; border-width: 1px; border-radius: 10px; border-color: darkblue; padding: 2px 10px 2px 10px; margin: 0px;">'''''Excel BattleQuiz]'''''</span>
-->
<!--                                    -->
<!-- ******** Excel PowerPack Files ******** -->
|style="padding:2px"|[[BattleActs_PowerPack#ppCouret| <span style="color: white; font-size: 12px; background-color: indigo; border: solid; border-width: 2px; border-radius: 10px; border-color: indigo; padding: 1px 3px 1px 3px; margin: 0px;">'''''Excel Files '''''</span>]]
<!--                                    -->
<!-- ******** Forum ******** -->
|style="padding:2px"|[https://battleacts8.ca/8/forum/index.php?p=/categories/couret-venter<span style="font-size: 12px; background-color: lightgrey; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 0px;">'''Forum'''</span>]
<!--                                    -->
<!-- ******** Formula Sheet ******** -->
<!--
|style="padding:2px"|[https://battleacts8.ca/8/forum/categories/couret-venter<span style="font-size: 12px; color: darkblue; background-color: lightblue; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 0px;">'''Formula Sheet'''</span>]
-->
|}
<span style="color: red;">'''You must be <u>logged in</u> or this will not work.'''</span>


==In Plain English!==
Claim counts for workers compensation classes are unreliable for serious injuries because of the low frequencies involved. However, serious injury types are correlated with other injuries as the situations which cause fatal (F), permanent total (PT), and major permanent partial (Major) injuries are usually similar. A small change in the situation may result in a significantly different outcome. So a class with a lot of major injuries probably has a higher than average likelihood for permanent total and fatal injuries.
 
Couret &amp; Venter derive a multivariate correlated credibility by estimating the population mean for each injury type by class using a linear function of the sample means for all of the injury types in the class. The coefficients of the linear function are estimated by minimizing the expected squared error.
 
They apply this method to ratios of claim counts by injury type to temporary total impairment (TT) claim counts. That is, they treat a temporary total injury as an exposure which could have produced a higher severity claim (F, PT, Major, or Minor). Let ''V'', ''W'', ''X'', and ''Y'' be the <u>observed ratios</u> for injury types F, PT, Major, and Minor. The paper assumes the distribution of claim counts by injury type is parametrizable for each class but the parameters are unknown. Let ''v<sub>i</sub>'', ''w<sub>i</sub>'', ''x<sub>i</sub>'', and ''y<sub>i</sub>'' be the population (hypothetical) mean ratios. Rather than writing out all of the highly similar equations for each of the four serious injury types we focus on the permanent total injuries (PT) which uses the variable ''W''. You should be able to translate the equations from permanent total injuries to any of the other serious injury types.
 
The observed sample claim count ratio of permanent total (PT) to temporary total (TT) for class ''i'' at time ''t'' is given by <math>m_{i,t}\cdot W_{i,t}=\displaystyle\sum_{j=1}^{m_{i,t}}\left(w_i+\epsilon_{j,t}\right)</math>. Here, there are <math>m_{i,t}</math> TT claims, and the <math>\epsilon_{j,t}</math> are independent perturbations with mean zero and standard deviation <math>\sigma_{W_i}</math> which vary by class but not time. Hence each TT claim is considered an exposure which may or may not produce a PT claim.
 
Rearranging the equation gives <math>W_{i,t}=w_i+\displaystyle\frac{1}{m_{i,t}}\cdot\sum_{j=1}^{m_{i,t}}\epsilon_{j,t}</math> and <math>Var(W_{i,t}\;|\; w_i)=\displaystyle\frac{\sigma^2_{W_i}}{m_{i,t}}</math>.  


Hence, the more TT claims, the smaller the random fluctuations of the annual observed class ratio <math>W_{i,t}</math> from its population mean since <math>\frac{1}{m_{i,t}}</math> goes to 0 as <math>m_{i,t}</math> increases.
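This shrinking-variance effect is easy to verify numerically. Below is a minimal simulation sketch (the function name and all numbers are ours, not the paper's): each of the ''m'' TT claims contributes the class mean plus an independent mean-zero perturbation, and the variance of the observed ratio shrinks roughly in proportion to 1/''m''.

```python
import random

def simulate_observed_ratio(w_i, sigma_w, m, trials=5000, seed=0):
    """Simulate W_{i,t}: each of m TT claims contributes w_i plus an
    independent mean-zero perturbation with standard deviation sigma_w."""
    rng = random.Random(seed)
    observed = []
    for _ in range(trials):
        total = sum(w_i + rng.gauss(0.0, sigma_w) for _ in range(m))
        observed.append(total / m)
    mean = sum(observed) / trials
    var = sum((x - mean) ** 2 for x in observed) / (trials - 1)
    return mean, var

# Quadrupling the TT claim count cuts the variance of the observed
# ratio to roughly a quarter: Var ~ sigma_w^2 / m.
```

Running this with ''m'' = 25 versus ''m'' = 100 reproduces <math>Var(W_{i,t}\;|\;w_i)=\sigma^2_{W_i}/m_{i,t}</math> to within simulation noise.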


{|class="wikitable"
|-
|<span style="color:#236AB9;">'''Key Assumption'''</span>
The variance of the observed F, PT, Major, and Minor claim ratios decreases as the number of TT claims increases. That is, the more TT claims (exposures) a class has, the more stable its observed ratios of the other claim types should be.
|}


Let ''W<sub>i</sub>'' be the sample class mean ratio <u>over all time</u>. Assume there are ''N'' independent time periods. Then <math>W_i=\displaystyle\frac{\sum_{t=1}^N m_{i,t}W_{i,t}}{\sum_{t=1}^N m_{i,t}}</math>.


Let ''m<sub>i</sub>'' be the sum of <math>m_{i,t}</math> over all time, and ''m'' be the sum over all classes ''i'' of ''m<sub>i</sub>''. Then <math>Var(W_i \;|\; w_i)=\displaystyle\frac{\sigma^2_{W_i}}{m_i}</math>.
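As a quick sketch of this pooling (the helper name and numbers are ours):

```python
def pooled_class_ratio(m_by_year, W_by_year):
    """W_i = sum_t(m_{i,t} * W_{i,t}) / sum_t(m_{i,t}): the claim-count-
    weighted average of the annual observed ratios for one class."""
    weighted_sum = sum(m * W for m, W in zip(m_by_year, W_by_year))
    return weighted_sum / sum(m_by_year)
```

For example, two years with 10 and 30 TT claims and observed ratios 0.10 and 0.20 pool to (10 &times; 0.10 + 30 &times; 0.20) / 40 = 0.175.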


The same calculations are also performed for each of the 7 hazard groups. These will become the complements of credibility. The hazard groups use a subscript ''h'', so <math>V_h</math> is the observed mean fatal claim ratio for hazard group ''h''.


By forming a linear combination of the sample means for the injury types within class ''i'', we account for correlations between injury types. For instance, focusing on fatal claims, ''V'', we are estimating the population mean for fatal claims in class ''i'', <math>v_i</math>, by the following equation: <math>\hat{v_i}=V_h + b_{v,i}(V_i-V_h)+c_{v,i}(W_i-W_h)+d_{v,i}(X_i-X_h)+e_{v,i}(Y_i-Y_h)</math>.


Remember, <math>V_h</math> is the observed mean fatal claim ratio for the hazard group ''h'' which contains class ''i''.


Similar equations are formed for the class estimates of ''W'', ''X'', and ''Y''.
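Once the credibilities are known, applying the estimator is mechanical. A sketch (all coefficient and ratio values below are invented for illustration):

```python
def md_credibility_estimate(hg_mean, class_means, hg_means, coefs):
    """v_hat = V_h + b(V_i - V_h) + c(W_i - W_h) + d(X_i - X_h) + e(Y_i - Y_h)."""
    return hg_mean + sum(k * (cls - hg)
                         for k, cls, hg in zip(coefs, class_means, hg_means))

# Hypothetical inputs: class ratios (V_i, W_i, X_i, Y_i), hazard group
# ratios (V_h, W_h, X_h, Y_h), and fitted coefficients (b, c, d, e).
v_hat = md_credibility_estimate(
    hg_mean=0.010,
    class_means=(0.012, 0.030, 0.200, 0.500),
    hg_means=(0.010, 0.025, 0.180, 0.450),
    coefs=(0.10, 0.05, 0.02, 0.01),
)
```

Each injury type's class estimate is pulled toward the hazard group mean, with small nudges from how the ''other'' injury types deviate from their hazard group means.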


The coefficients <math>b_{v,i}, c_{v,i}, d_{v,i}, e_{v,i}, \ldots, b_{y,i}, c_{y,i}, d_{y,i}, e_{y,i}</math> are the multi-dimensional credibilities we need to estimate. Observe that if there were no correlation between the injury types, the cross-type credibilities would be zero and the above equation would reduce to credibility weighting the observed class mean for each injury type against the observed mean for the hazard group containing the class.


''Alice: "So far everything has been quite abstract. Let's remedy that by looking at part (a) of the following example."''
: <span class="newwin">[https://www.battleacts8.ca/8/pdf/Q5_2015.pdf <span style="color: white; font-size: 12px; background-color: green; border: solid; border-width: 2px; border-radius: 10px; border-color: green; padding: 1px 3px 1px 3px; margin: 0px;">'''''Set up and Solve the Multi-dimensional Credibility Equation'''''</span>]</span>


===Credibility Considerations===
Couret &amp; Venter use multivariate B&uuml;hlmann-Straub credibility, which minimizes squared error. We want to minimize the expected squared error over all classes between the linear combination and the hypothetical mean. That is, in the case of permanent total injuries (variable ''w''), minimize <math>E\left[\left(a+bV_i+cW_i+dX_i+eY_i-w_i\right)^2\right]</math>.


The coefficients <math>a, b, c, d, e</math> are determined as follows:


Repeating for ''c'', ''d'', and ''e'' yields three similar equations. Taken together, they form the following matrix equation:
<math>\left(\begin{array}{c}Cov(V_i,w_i)\\Cov(W_i,w_i)\\Cov(X_i,w_i)\\Cov(Y_i,w_i)\end{array}\right)=C\cdot\left(\begin{array}{c}b_{w,i}\\c_{w,i}\\d_{w,i}\\e_{w,i}\end{array}\right)</math>, where ''C'' is the covariance matrix of the class by injury-type sample means.
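In practice this is one small linear solve per injury type. A NumPy sketch with made-up covariance values:

```python
import numpy as np

# Hypothetical values: C is the 4x4 covariance matrix of the class sample
# means (V_i, W_i, X_i, Y_i); rhs holds Cov(V_i, w_i), ..., Cov(Y_i, w_i).
C = np.array([
    [4.0, 1.0, 1.0, 0.5],
    [1.0, 5.0, 2.0, 1.0],
    [1.0, 2.0, 6.0, 1.5],
    [0.5, 1.0, 1.5, 4.0],
]) * 1e-4
rhs = np.array([1.0, 2.0, 1.5, 0.8]) * 1e-4

b, c, d, e = np.linalg.solve(C, rhs)  # credibilities for the w-equation
```

Repeating with the right-hand side built from <math>v_i</math>, <math>x_i</math>, or <math>y_i</math> gives the credibilities for the other injury types.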


The difficulty now is in estimating the covariances. Couret &amp; Venter note <math>Var(V_i\;|\;v_i)=\displaystyle\frac{\sigma^2_{V_i}}{m_i}</math>, where <math>\sigma^2_{V_i}</math> is the process variance for ''V<sub>i</sub>''. The unconditional variance of ''V<sub>i</sub>'' is <math>\frac{EPV_V}{m_i}+VHM_V</math>. The latter term is the ''variance of hypothetical means'' for ''V'' which is the variance of the means ''v<sub>i</sub>'' for the unobserved classes of ''V''. The first term is the ''expected process variance'' which is <math>E[\sigma^2_{V_i}]</math>. Note this is independent of ''i'' because the expectation is over all classes.


{|class="wikitable"
|-
|<span style="color:#236AB9;">'''Key Assumption'''</span>
The observed injury ratios for any year for each type of injury are the class injury ratio plus a random, independent perturbation.
|}


Couret &amp; Venter conclude it is sufficient to estimate the off-diagonal elements by the sample covariances.


The first entry on the leading diagonal of the matrix is <math>Var(V_i)=\frac{EPV_V}{m_i}+VHM_V</math>. Subsequent leading diagonal entries follow by replacing ''V'' with ''W'', ''X'' or ''Y''.

Consequently, formulas are required to estimate ''EPV'' and ''VHM''. The paper uses formulas due to Dean (2005). In the following, recall a hat denotes an estimate.

====Formulas for ''EPV'' and ''VHM''====
<math>\hat{EPV_V}=\frac{\sum_{i=1}^R\sum_{t=1}^N m_{it}\left(V_{it}-V_i\right)^2}{R(N-1)}</math> and <math>\hat{VHM_V}=\frac{\sum_{i=1}^Rm_i\left(V_i-V\right)^2-(R-1)\cdot\hat{EPV_V}}{m-\frac{1}{m}\cdot\sum_{i=1}^Rm_i^2}</math>


Lastly, <math>Cov(V_i,W_i)=\sum_{i=1}^R\frac{\left(V_i-V_h\right)\cdot\left(W_i-W_h\right)\cdot m_i}{m_h}</math>.


Here, ''R'' is the number of classes in the hazard group that contains class ''i''. ''N'' is the number of years used in the data set.


You should make sure you have these formulas memorized and know how to apply them.


Note that ''V'' is a weighted average of the ''V<sub>i</sub>'' 's with weights ''m<sub>i</sub>''. Also, the variance of hypothetical means (VHM) is estimated using the sample variance from each class. This may be negative. If that happens, then set it equal to 0. When it is set equal to 0, the expected process variance accounts for all of the observed variation. That is, there are no individual risk differences.
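A direct transcription of the estimators above (a sketch; the function and variable names are ours). Here `m[i][t]` holds the TT claim counts and `V[i][t]` the observed ratios for class `i` in year `t`; note the floor at zero for a negative VHM estimate:

```python
def estimate_epv_vhm(m, V):
    """Estimate EPV and VHM for one injury type across R classes and N years."""
    R, N = len(m), len(m[0])
    m_i = [sum(row) for row in m]
    m_tot = sum(m_i)
    # Class means V_i and the overall claim-count-weighted mean V.
    V_i = [sum(mt * vt for mt, vt in zip(mr, vr)) / sum(mr)
           for mr, vr in zip(m, V)]
    V_bar = sum(mi * vi for mi, vi in zip(m_i, V_i)) / m_tot
    epv = sum(m[i][t] * (V[i][t] - V_i[i]) ** 2
              for i in range(R) for t in range(N)) / (R * (N - 1))
    vhm = (sum(mi * (vi - V_bar) ** 2 for mi, vi in zip(m_i, V_i))
           - (R - 1) * epv) / (m_tot - sum(mi ** 2 for mi in m_i) / m_tot)
    return epv, max(vhm, 0.0)  # a negative VHM estimate is floored at zero
```

With identical class means and all variation within years, the VHM estimate comes out negative and is floored at zero, matching the remark about no individual risk differences.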
 
[https://battleacts8.ca/8/FC.php?selectString=**&filter=both&sortOrder=natural&colorFlag=allFlag&colorStatus=allStatus&priority=importance-high&subsetFlag=miniQuiz&prefix=Couret&suffix=Venter&section=all&subSection=all&examRep=all&examYear=all&examTerm=all&quizNum=1<span style="font-size: 20px; background-color: aqua; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 10px;">'''mini BattleQuiz 1]'''</span> <span style="color: red;">'''You must be <u>logged in</u> or this will not work.'''</span>


===Performance Testing===
This section of the paper is best skim-read after reading the key points below.


Couret &amp; Venter had 7 policy years of untrended and undeveloped workers compensation data available. They discarded the most recent year as they believed it to be too immature. The data is examined in two ways: by hazard group/class and by injury type. They use the sum of squared errors to measure the performance of the predictions. The predictions are made three ways: by using the hazard group mean, using the "raw" class mean, and using the multi-dimensional credibility process.

To facilitate testing, a holdout sample is created from the data for the odd report years. The hazard group means, class means, and multi-dimensional credibility estimates are all calculated on the even report years.


Couret &amp; Venter note that incident ratios are impacted by unknown effects due to changes in the portfolio of individual insurance policies over time. They also note there is considerable volatility in the class ratios at the state level which means improving the estimate of the mean may only produce a small reduction in the sum of squared errors.
The multi-dimensional credibility procedure is designed to minimize the expected deviation between the true class mean and its sample estimator over the same period. By using the even years to predict the odd years (a form of holdout sample), there is a disconnect between the minimized expectation and the sample statistic.


Three testing approaches taken by Couret &amp; Venter (shown for fatal injuries):
# Hazard group method: <math>SSE=\sum_{\mbox{all classes}}(V_h-V_{i,\mbox{holdout}})^2</math>
# Raw class data method: <math>SSE=\sum_{\mbox{all classes}}(V_i-V_{i,\mbox{holdout}})^2</math>
The lowest sum of squared errors (SSE) is best.


''Alice: "This sounds complicated but don't worry - it's relatively easy to apply. Let's walk through an example now."''
: <span class="newwin">[https://www.battleacts8.ca/8/pdf/Q2_2011.pdf <span style="color: white; font-size: 12px; background-color: green; border: solid; border-width: 2px; border-radius: 10px; border-color: green; padding: 1px 3px 1px 3px; margin: 0px;">'''''Calculate the Sums of Squared Errors'''''</span>]</span>
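The SSE comparison itself can be sketched in a few lines (all class ratios below are made up for illustration):

```python
def sse(predictions, holdout):
    """Sum of squared errors between predicted and holdout-year class ratios."""
    return sum((p - a) ** 2 for p, a in zip(predictions, holdout))

# Hypothetical ratios for three classes in one hazard group:
holdout_ratios = [0.011, 0.008, 0.015]
hazard_group_pred = [0.010, 0.010, 0.010]   # method 1: hazard group mean everywhere
raw_class_pred = [0.013, 0.005, 0.020]      # method 2: training-year class ratios

sse_hg = sse(hazard_group_pred, holdout_ratios)
sse_raw = sse(raw_class_pred, holdout_ratios)
```

Whichever set of predictions yields the lowest SSE against the holdout years is preferred.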


Their credibility method performs only slightly better than the hazard group method. Possible reasons include class data being volatile across years and the estimators being fit to the training dataset rather than the test dataset. The latter point is really saying there could be material differences between even and odd years in the averages. A better approach is to normalize the data by hazard group by year to eliminate differences between hazard groups in the train vs test datasets.
# For each of the three approaches taken by Couret &amp; Venter, analyze the sum of squared errors.


====Why use a quintiles test?====


We could compute the sum of squared errors for each of the methods at the individual class level. However, even after grouping classes into hazard groups, there is a large amount of variation between classes. This noise makes it hard to determine if the credibility method is performing better. By grouping into quintiles and calculating sums of squared errors based on the quintile statistics, we reduce this variation so the data is more credible and we can make a better comparison of the methods.


One way the CAS can test this material is to have you interpret the results of a quintiles test to determine whether or not you should use the multi-dimensional credibility technique or some other method. Take 5 minutes to attempt the following past exam question now.
 
: <span class="newwin">[https://www.battleacts8.ca/8/pdf/Q5_2012.pdf <span style="color: white; font-size: 12px; background-color: green; border: solid; border-width: 2px; border-radius: 10px; border-color: green; padding: 1px 3px 1px 3px; margin: 0px;">'''''Evaluate the Multi-dimensional Credibility Technique'''''</span>]</span>


Another way this can be tested is in parts b and c of 2015 Q5. Give them a try now!
: <span class="newwin">[https://www.battleacts8.ca/8/pdf/Q5_2015.pdf <span style="color: white; font-size: 12px; background-color: green; border: solid; border-width: 2px; border-radius: 10px; border-color: green; padding: 1px 3px 1px 3px; margin: 0px;">'''''Describe the Quintiles Test in the Context of Couret &amp; Venter'''''</span>]</span>


===Couret &amp; Venter's Results===
Using the quintiles test, the multi-dimensional credibility procedure produced a much lower sum of squared errors for all hazard groups/injury types except for hazard group A. Couret &amp; Venter suggest this is because hazard group A is highly homogeneous so different injury types aren't prevalent or are not very predictive of other injury types.


==Pop Quiz Answers==
[https://battleacts8.ca/8/FC.php?selectString=**&filter=both&sortOrder=natural&colorFlag=allFlag&colorStatus=allStatus&priority=importance-high&subsetFlag=miniQuiz&prefix=Couret&suffix=Venter&section=all&subSection=all&examRep=all&examYear=all&examTerm=all&quizNum=2<span style="font-size: 20px; background-color: aqua; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 10px;">'''mini BattleQuiz 2]'''</span> <span style="color: red;">'''You must be <u>logged in</u> or this will not work.'''</span>
 
<!-- ******** BattleBar Code ******** -->
{|style="border: solid; color:lightgrey; border-radius:10px; border-width:2px; align:center;"
|-
<!-- ******** Full BattleQuiz ******** -->
|style="padding:2px"|[https://battleacts8.ca/8/FC.php?selectString=**&filter=both&sortOrder=natural&colorFlag=allFlag&colorStatus=allStatus&priority=importance-high&subsetFlag=miniQuiz&prefix=Couret&suffix=Venter&section=all&subSection=all&examRep=all&examYear=all&examTerm=all&quizNum=all<span style="font-size: 20px; background-color: lightgreen; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 10px;">'''Full BattleQuiz]'''</span>
<!--                                    -->
<!-- ******** Excel BattleQuiz ******** -->
<!--
|style="padding:2px"|[https://battleacts8.ca/8/FC.php?selectString=**&filter=both&sortOrder=natural&colorFlag=allFlag&colorStatus=allStatus&priority=importance-high&subsetFlag=miniQuiz&prefix=Couret&suffix=Venter&section=all&subSection=all&examRep=all&examYear=all&examTerm=all&quizNum=all<span style="color: red; font-size: 20px; background-color: lightgreen; border: solid; border-width: 1px; border-radius: 10px; border-color: darkblue; padding: 2px 10px 2px 10px; margin: 0px;">'''''Excel BattleQuiz]'''''</span>
-->
<!--                                    -->
<!-- ******** Excel PowerPack Files ******** -->
|style="padding:2px"|[[BattleActs_PowerPack#ppCouret| <span style="color: white; font-size: 12px; background-color: indigo; border: solid; border-width: 2px; border-radius: 10px; border-color: indigo; padding: 1px 3px 1px 3px; margin: 0px;">'''''Excel Files '''''</span>]]
<!--                                    -->
<!-- ******** Forum ******** -->
|style="padding:2px"|[https://battleacts8.ca/8/forum/index.php?p=/categories/couret-venter<span style="font-size: 12px; background-color: lightgrey; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 0px;">'''Forum'''</span>]
<!--                                    -->
<!-- ******** Formula Sheet ******** -->
<!--
|style="padding:2px"|[https://battleacts8.ca/8/forum/categories/couret-venter<span style="font-size: 12px; color: darkblue; background-color: lightblue; border: solid; border-width: 1px; border-radius: 10px; padding: 2px 10px 2px 10px; margin: 0px;">'''Formula Sheet'''</span>]
-->
|}
<span style="color: red;">'''You must be <u>logged in</u> or this will not work.'''</span>

Latest revision as of 11:32, 17 June 2024


Estimated study time: 16 Hours (not including subsequent review time)

BattleTable

Based on past exams, the main things you need to know (in rough order of importance) are:

  • Be able to apply the Sum of Squared Errors Test to a given set of results.
  • Be able to apply the Multi-dimensional credibility equations to the data.
  • Be able to apply the Quintiles Test in the context of this paper.
  • Be able to briefly describe Couret & Venter's results.
Questions from the Fall 2019 exam are held out for practice purposes. (They are included in the CAS practice exam.)
| reference | part (a) | part (b) | part (c) | part (d) |
| E (2015.Fall #5) | Multi-dimensional Credibility - calculate ratio | Quintile Test - describe process | Sum of Squared Errors Test - shortcomings | |
| E (2014.Fall #1) | Quintile Test - evaluate | Sum of Squared Errors Test - apply & recommend | | |
| E (2014.Fall #4) | Statistical Considerations - apply in context | | | |
| E (2013.Fall #3) | Holdout Sample - describe purpose | Holdout Sample - recommend | Sum of Squared Errors Test - evaluate | Trends - Mahler.Credibility |
| E (2012.Fall #5) | Multi-dimensional Credibility - determine appropriateness | Expected Loss - calculate | | |

In Plain English!

Claim counts for workers compensation classes are unreliable for serious injuries because of the low frequencies involved. However, serious injury types are correlated with other injuries as the situations which cause fatal (F), permanent total (PT), and major permanent partial (Major) injuries are usually similar. A small change in the situation may result in a significantly different outcome. So a class with a lot of major injuries probably has a higher than average likelihood for permanent total and fatal injuries.

Couret & Venter derive a multi-dimensional credibility estimator: the population mean for each injury type by class is estimated using a linear function of the sample means for all of the injury types in the class. The coefficients of the linear function are estimated by minimizing the expected squared error.

They apply this method to ratios of claim counts by injury type to temporary total (TT) claim counts. That is, they treat a temporary total injury as an exposure which could have produced a higher severity claim (F, PT, Major, or Minor). Let V, W, X, and Y be the observed ratios for injury types F, PT, Major, and Minor. The paper assumes the distribution of claim counts by injury type is parametrizable for each class but the parameters are unknown. Let vi, wi, xi, and yi be the population (hypothetical) mean ratios. Rather than writing out four sets of highly similar equations, one per injury type, we focus on permanent total injuries (PT), which use the variable W. You should be able to translate the equations from permanent total injuries to any of the other injury types.

The observed sample claim count ratio of permanent total (PT) to temporary total (TT) for class i at time t is given by [math]m_{i,t}\cdot W_{i,t}=\displaystyle\sum_{j=1}^{m_{i,t}}\left(w_i+\epsilon_{j,t}\right)[/math]. Here, there are [math]m_{i,t}[/math] TT claims, and the [math]\epsilon_{j,t}[/math] are independent perturbations with mean zero and standard deviation [math]\sigma_{W_i}[/math] which vary by class but not time. Hence each TT claim is considered an exposure which may or may not produce a PT claim.

Rearranging the equation gives [math]W_{i,t}=w_i+\displaystyle\frac{1}{m_{i,t}}\cdot\sum_{j=1}^{m_{i,t}}\epsilon_{j,t}[/math] and [math]Var(W_{i,t}\;|\; w_i)=\displaystyle\frac{\sigma^2_{W_i}}{m_{i,t}}[/math].

Hence, the more TT claims, the smaller the random fluctuations of the annual observed class ratio [math]W_{i,t}[/math] from its population mean since [math]\frac{1}{m_{i,t}}[/math] goes to 0 as [math]m_{i,t}[/math] increases.
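A minimal Python sketch of this point (the process variance below is a made-up illustrative value, not from the paper):

```python
# The conditional variance of the observed PT-to-TT ratio W_{i,t} is
# sigma_{W_i}^2 / m_{i,t}: each of the m_{i,t} TT claims is one exposure.
SIGMA2_W = 0.04   # assumed (hypothetical) process variance for the class

def var_W_given_w(m_it, sigma2=SIGMA2_W):
    """Var(W_{i,t} | w_i) for a class-year with m_it TT claims."""
    return sigma2 / m_it

print(var_W_given_w(10))     # few TT claims: a noisy observed ratio
print(var_W_given_w(1000))   # many TT claims: ratio hugs the class mean w_i
```

As the TT count grows from 10 to 1,000, the variance of the observed ratio falls by a factor of 100.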

Key Assumption

The variance of the observed F, PT, Major, and Minor claim ratios decreases as the number of TT claims increases. That is, the more TT claims a class has, the more stable its observed ratios of the other claim types to TT should be.

Let Wi be the sample class mean ratio over all time. Assume there are N independent time periods. Then [math]W_i=\displaystyle\frac{\sum_{t=1}^N m_{i,t}W_{i,t}}{\sum_{t=1}^N m_{i,t}}[/math].

Let mi be the sum of [math]m_{i,t}[/math] over all time, and m be the sum over all classes i of mi. Then [math]Var(W_i \;|\; w_i)=\displaystyle\frac{\sigma^2_{W_i}}{m_i}[/math].

The same calculations are also performed for each of the 7 hazard groups; these will become the complements of credibility. The hazard groups use a subscript h, so [math]V_h[/math] is the observed mean fatal claim ratio for hazard group h.

By forming a linear combination of the sample means for the injury types within class i, we account for correlations between injury types. For instance, focusing on fatal claims, V, we are estimating the population mean for fatal claims in class i, [math]v_i[/math], by the following equation: [math]\hat{v_i}=V_h + b_{v,i}(V_i-V_h)+c_{v,i}(W_i-W_h)+d_{v,i}(X_i-X_h)+e_{v,i}(Y_i-Y_h)[/math].

Remember, [math]V_h[/math] is the observed mean fatal claim ratio for the hazard group h which contains class i.

Similar equations are formed for the class estimates of W, X, and Y.

The coefficients [math]b_{v,i}, c_{v,i}, d_{v,i}, e_{v,i}, \ldots, b_{y,i}, c_{y,i}, d_{y,i}, e_{y,i}[/math] are the multi-dimensional credibilities we need to estimate. Observe that if there were no correlation between the injury types, the cross-type credibilities would be zero and the above equation would reduce to credibility weighting the observed injury type mean for the class against the observed injury type mean for the hazard group containing the class.
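As a concrete illustration of the estimator, here is a small Python sketch; every numeric value is hypothetical:

```python
# The class estimate of the fatal ratio v_i: the hazard group mean plus
# credibility-weighted deviations of the class sample means from the
# hazard group means, across all four injury types.
def estimate_v(V_h, W_h, X_h, Y_h, V_i, W_i, X_i, Y_i, b, c, d, e):
    """v_hat = V_h + b(V_i-V_h) + c(W_i-W_h) + d(X_i-X_h) + e(Y_i-Y_h)."""
    return (V_h + b * (V_i - V_h) + c * (W_i - W_h)
                + d * (X_i - X_h) + e * (Y_i - Y_h))

# With zero credibilities the estimate collapses to the hazard group mean:
print(estimate_v(0.01, 0.02, 0.10, 0.30,
                 0.015, 0.05, 0.12, 0.33,
                 b=0.0, c=0.0, d=0.0, e=0.0))   # -> 0.01
```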

Alice: "So far everything has been quite abstract. Let's remedy that by looking at part (a.) of the following example."

Set up and Solve the Multi-dimensional Credibility Equation

Credibility Considerations

Couret & Venter use multivariate Bühlmann-Straub credibility, which minimizes expected squared error. That is, in the case of permanent total injuries (variable w), we minimize the expected squared error over all classes between the linear combination and the hypothetical mean: [math]E\left[\left(a+bV_i+cW_i+dX_i+eY_i-w_i\right)^2\right][/math].

The coefficients [math]a, b, c, d, e[/math] are determined as follows:

Differentiating the equation with respect to a and setting equal to 0 yields [math]a=-E\left[bV_i+cW_i+dX_i+eY_i-w_i\right][/math]. Substituting this back into the linear combination gives: [math]E[w_i]+b\left(V_i-E[V_i]\right)+c(W_i-E[W_i])+d(X_i-E[X_i])+e(Y_i-E[Y_i])[/math].

Since [math]E[w_i][/math] is an unconditional mean, it applies across all classes and can be estimated by the hazard group mean. Hence, if a class has no credibility (i.e. [math]b=c=d=e=0[/math]), the hazard group ratio is used.

Also note c corresponds to the traditional credibility factor Z when only injury type PT is considered. This is because [math]cW_i+(1-c)E[W_i]=E[w_i]+c(W_i-E[W_i])[/math] as [math]E[w_i]=E[W_i][/math].

Differentiating the linear combination with respect to b and setting equal to 0 yields [math]aE[V_i]+E[V_i(bV_i+cW_i+dX_i+eY_i-w_i)]=0[/math]. Substituting in the previous expression for a and rearranging gives the following equation: [math]Cov(V_i,w_i)=bVar(V_i)+cCov(V_i,W_i)+dCov(V_i,X_i)+eCov(V_i,Y_i)[/math].

Repeating for c, d, and e yields three similar equations. Taken together, they form the following matrix equation: [math]\left(\begin{array}{c}Cov(V_i,w_i)\\Cov(W_i,w_i)\\Cov(X_i,w_i)\\Cov(Y_i,w_i)\end{array}\right)=C\cdot\left(\begin{array}{c}b_{w,i}\\c_{w,i}\\d_{w,i}\\e_{w,i}\end{array}\right)[/math], where C is the covariance matrix of the class by injury-type sample means.
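Once the covariances are estimated, solving for the credibilities is just a 4x4 linear system. A minimal numpy sketch, with made-up covariance values purely for illustration:

```python
# Solve C @ [b, c, d, e]^T = [Cov(V_i,w_i), Cov(W_i,w_i), Cov(X_i,w_i), Cov(Y_i,w_i)]^T
# for the PT credibilities. All numbers below are invented.
import numpy as np

# Covariance matrix of the class sample means (order: V, W, X, Y)
C = 1e-4 * np.array([[4.0, 1.0, 1.0, 0.0],
                     [1.0, 5.0, 1.0, 1.0],
                     [1.0, 1.0, 6.0, 1.0],
                     [0.0, 1.0, 1.0, 4.0]])
# Covariances of each sample mean with the hypothetical mean w_i
cov_with_w = 1e-4 * np.array([1.0, 2.0, 1.0, 1.0])

b, c, d, e = np.linalg.solve(C, cov_with_w)   # multi-dimensional credibilities
```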

The difficulty now is in estimating the covariances. Couret & Venter note [math]Var(V_i\;|\;v_i)=\displaystyle\frac{\sigma^2_{V_i}}{m_i}[/math], where [math]\sigma^2_{V_i}[/math] is the process variance for Vi. The unconditional variance of Vi is [math]\frac{EPV_V}{m_i}+VHM_V[/math]. The latter term is the variance of hypothetical means for V, i.e. the variance of the unobserved class means vi. The first term is the expected process variance, [math]E[\sigma^2_{V_i}][/math]. Note this is independent of i because the expectation is taken over all classes.

Key Assumption

The observed injury ratios for any year for each type of injury are the class injury ratio plus a random, independent perturbation.

Couret & Venter conclude it is sufficient to estimate the off-diagonal elements by the sample covariances.

The first entry on the leading diagonal of the matrix is [math]Var(V_i)=\frac{EPV_V}{m_i}+VHM_V[/math]. Subsequent leading diagonal entries follow by replacing V with W, X or Y.

Consequently, formulas are required to estimate EPV and VHM. The paper uses formulas due to Dean (2005). In the following, recall a hat denotes an estimate.

Formulas for EPV and VHM

[math]\hat{EPV_V}=\frac{\sum_{i=1}^R\sum_{t=1}^N m_{it}\left(V_{it}-V_i\right)^2}{R(N-1)}[/math] and [math]\hat{VHM_V}=\frac{\sum_{i=1}^Rm_i\left(V_i-V\right)^2-(R-1)\cdot\hat{EPV_V}}{m-\frac{1}{m}\cdot\sum_{i=1}^Rm_i^2}[/math]

Lastly, [math]Cov(V_i,W_i)=\sum_{i=1}^R\frac{\left(V_i-V_h\right)\cdot\left(W_i-W_h\right)\cdot m_i}{m_h}[/math].

Here, R is the number of classes in the hazard group that contains class i. N is the number of years used in the data set.
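These estimators can be sketched in Python as follows; the two-class, two-year data in the example call is made up purely for illustration:

```python
# Sketch of Dean's EPV and VHM estimators for one injury type (V).
# V_it[i][t] = observed ratio for class i in year t; m_it[i][t] = TT counts.
def epv_vhm(V_it, m_it):
    R, N = len(V_it), len(V_it[0])            # classes, years
    m_i = [sum(row) for row in m_it]          # TT counts by class
    m = sum(m_i)                              # total TT count
    # class sample means, TT-weighted over years
    V_i = [sum(mm * vv for mm, vv in zip(m_it[i], V_it[i])) / m_i[i]
           for i in range(R)]
    V_bar = sum(m_i[i] * V_i[i] for i in range(R)) / m
    epv = sum(m_it[i][t] * (V_it[i][t] - V_i[i]) ** 2
              for i in range(R) for t in range(N)) / (R * (N - 1))
    vhm = (sum(m_i[i] * (V_i[i] - V_bar) ** 2 for i in range(R))
           - (R - 1) * epv) / (m - sum(mi ** 2 for mi in m_i) / m)
    return epv, max(vhm, 0.0)                 # a negative VHM estimate is set to 0

epv, vhm = epv_vhm([[0.0, 0.2], [0.4, 0.6]], [[1, 1], [1, 1]])
print(epv, vhm)   # approximately 0.02 and 0.07
```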

You should make sure you have these formulas memorized and know how to apply them.

Note that V is a weighted average of the Vi 's with weights mi. Also, the variance of hypothetical means (VHM) is estimated using the sample variance from each class. This may be negative. If that happens, then set it equal to 0. When it is set equal to 0, the expected process variance accounts for all of the observed variation. That is, there are no individual risk differences.

mini BattleQuiz 1

Performance Testing

This section of the paper is best skim-read after reading the key points below.

Couret & Venter had 7 policy years of untrended and undeveloped workers compensation data available. They discarded the most recent year as they believed it to be too immature. The data is examined in two ways: by hazard group/class and by injury type. They use the sum of squared errors to measure the performance of the predictions. The predictions are made in three ways: using the hazard group mean, using the "raw" class mean, and using the multi-dimensional credibility process.

To facilitate testing, a holdout sample is created using the data for the odd report years. The hazard group means, class means, and multi-dimensional credibility estimates are all calculated on the even report years.

Couret & Venter note that incident ratios are impacted by unknown effects due to changes in the portfolio of individual insurance policies over time. They also note there is considerable volatility in the class ratios at the state level which means improving the estimate of the mean may only produce a small reduction in the sum of squared errors.

The multi-dimensional credibility procedure is designed to minimize the expected deviation between the true class mean and its sample estimator over the same period. By using the even years to predict the odd years (a form of holdout sample), there is a disconnect between the minimized expectation and the sample statistic.

Three testing approaches taken by Couret & Venter (shown for fatal injuries):

  1. Hazard group method: [math]SSE=\sum_{\mbox{all classes}}(V_h-V_{i,\mbox{holdout}})^2[/math]
  2. Raw class data method: [math]SSE=\sum_{\mbox{all classes}}(V_i-V_{i,\mbox{holdout}})^2[/math]
  3. Credibility method: [math]SSE=\sum_{\mbox{all classes}}(\hat{V_i}-V_{i,\mbox{holdout}})^2[/math]

The lowest sum of squared errors (SSE) is best.
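A quick sketch of the comparison, with invented class-level ratios (three classes, fatal injuries):

```python
# Compare the three prediction approaches against the odd-year holdout sample.
def sse(preds, holdout):
    return sum((p - a) ** 2 for p, a in zip(preds, holdout))

V_holdout = [0.012, 0.030, 0.021]   # odd-year (test) class ratios
V_hg      = [0.020, 0.020, 0.020]   # hazard group mean applied to each class
V_raw     = [0.010, 0.045, 0.015]   # raw even-year class means
V_cred    = [0.015, 0.032, 0.019]   # multi-dimensional credibility estimates

for name, preds in [("hazard group", V_hg), ("raw class", V_raw),
                    ("credibility", V_cred)]:
    print(name, sse(preds, V_holdout))   # smallest SSE is best
```

In this made-up example the credibility estimates win, but on real data the three methods must be compared empirically, which is the point of the test.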

Alice: "This sounds complicated but don't worry - it's relatively easy to apply. Let's walk through an example now."

Calculate the Sums of Squared Errors

Their credibility method performs only slightly better than the hazard group method. Possible reasons include class data being volatile across years and the estimators being fit to the training dataset rather than the test dataset. The latter point means there could be material differences between the even-year and odd-year averages. A better approach is to normalize the data by hazard group and year to eliminate differences between hazard groups in the training vs test datasets.

Another key point is their decision to test on ranked portfolios of state-class combinations. This is based on the quintiles test.

The Quintiles Test

We'll apply the quintiles test to a single injury type, V (fatal injuries), within a fixed hazard group.

  1. Split the data into a training data set and a test data set (also known as the holdout sample).
  2. Sort the classes in the training data set into ascending order based on the credibility estimate [math]\hat{v_i}[/math]. Superimpose this ordering on the classes in the test data set.
  3. Group the classes into quintiles using the TT injury counts. That is, split the ordered data set into 5 pieces such that each piece contains approximately the same number of TT injuries.
  4. Calculate [math]V_\mbox{quintile}[/math] and [math]V_\mbox{quintile,holdout}[/math]. Calculate [math]\hat{v}_\mbox{quintile}[/math] and [math]\hat{v_h}[/math] as weighted averages of [math]\hat{v_i}[/math], using the TT injury counts as the weights.
  5. For each of the three approaches taken by Couret & Venter, analyze the sum of squared errors.
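Step 3 above can be sketched as follows; the assignment rule shown (placing each class in the quintile containing the midpoint of its cumulative TT count) is one reasonable implementation, not necessarily the paper's exact procedure, and the data is made up:

```python
# Assign sorted classes to 5 groups with roughly equal TT claim counts.
def quintiles_by_tt(tt_counts):
    total = sum(tt_counts)
    groups, cum = [], 0
    for m in tt_counts:
        cum += m
        # quintile containing the midpoint of this class's cumulative TT count
        groups.append(min(4, int(5 * (cum - m / 2) / total)))
    return groups

print(quintiles_by_tt([10, 10, 10, 10, 10]))   # -> [0, 1, 2, 3, 4]
```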

Why use a quintiles test?

We could compute the sum of squared errors for each of the methods at the individual class level. However, even after grouping classes into hazard groups, there is a large amount of variation between classes. This noise makes it hard to determine if the credibility method is performing better. By grouping into quintiles and calculating sums of squared errors based on the quintile statistics, we reduce this variation so the data is more credible and we can make a better comparison of the methods.

One way the CAS can test this material is to have you interpret the results of a quintiles test to determine whether or not you should use the multi-dimensional credibility technique or some other method. Take 5 minutes to attempt the following past exam question now.

Evaluate the Multi-dimensional Credibility Technique

Another way this can be tested is in parts b and c of 2015 Q5. Give them a try now!

Describe the Quintiles Test in the Context of Couret & Venter

Couret & Venter's Results

Using the quintiles test, the multi-dimensional credibility procedure produced a much lower sum of squared errors for all hazard groups/injury types except for hazard group A. Couret & Venter suggest this is because hazard group A is highly homogeneous so different injury types aren't prevalent or are not very predictive of other injury types.
