Interpretable Machine Learning

Final Group Project

Author

Munir Eberhardt Hiabu

Published

March 27, 2024

The data:

Install the R-package CASdatasets.

install.packages("CASdatasets", repos = "http://cas.uqam.ca/pub/", type="source")

Every group will work with one of the datasets freMPL1-4.

library(CASdatasets)

data(freMPL1) ### Groups 1-4
#data(freMPL2) ### Groups 5-7
#data(freMPL3) ### Groups 8-10
#data(freMPL4) ### Groups 11-13

#### ONLY for freMPL3 & freMPL4: uncomment the line for your dataset
#freMPL3 <- subset(freMPL3, select = -DeducType)
#freMPL4 <- subset(freMPL4, select = -DeducType)

### In the following, freMPL1-4 will simply be called freMPL
freMPL <- freMPL1 ### replace freMPL1 with your group's dataset

Split your data into a training and a test set:

library(splitTools)
set.seed(2024)
#### stratified on ClaimInd, so train and test have (nearly) the same claim frequency
ind <- partition(freMPL$ClaimInd, p = c(train = 0.8, test = 0.2))
train <- freMPL[ind$train, ]
test <- freMPL[ind$test, ]
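
The comment on the partition call states that the training and test sets share the same claim frequency. The following minimal sketch shows one way to check this. It uses a simulated 0/1 vector as a stand-in for the ClaimInd column (the values and the 10% claim rate are assumptions for illustration, not taken from the data), and relies on the default stratified behaviour of partition().

```r
library(splitTools)

# Toy stand-in for the ClaimInd column (simulated, roughly 10% claims)
set.seed(2024)
claim_ind <- rbinom(10000, size = 1, prob = 0.1)

# partition() stratifies on its first argument by default
ind <- partition(claim_ind, p = c(train = 0.8, test = 0.2))

# Share of records with a claim in each part
freq_train <- mean(claim_ind[ind$train])
freq_test  <- mean(claim_ind[ind$test])
print(c(train = freq_train, test = freq_test))
```

On the real data, the same check amounts to comparing mean(train$ClaimInd) with mean(test$ClaimInd).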
  • Using the training data, train an algorithm that estimates the technical price, that is, the conditional expectation of ClaimAmount given Exposure = 1. Besides accuracy, you should also take into account that you want to be able to explain your final model well.

  • Explain your final model.

  • What are your estimates for the following rows of your test data: 11386, 12286, 2119, 2238, 27833, 27988.

  • Explain the predictions of the following rows in your test data: 1386, 12286, 2119, 2238, 27833, 27988.
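
As one possible starting point for the tasks above, and not as the expected solution, the sketch below fits a quasi-Poisson GLM with log link and offset log(Exposure), reads the technical price off at Exposure = 1, and splits individual predictions into per-feature contributions. The data are simulated, and using DrivAge and Gender as the only rating factors is an assumption made purely for illustration. Note that quasipoisson() requires non-negative responses, so any negative claim amounts in the real data would have to be handled first.

```r
# Toy stand-in for the training data (simulated values, illustrative columns)
set.seed(2024)
n <- 5000
toy <- data.frame(
  Exposure = runif(n, 0.1, 1),
  DrivAge  = sample(18:80, n, replace = TRUE),
  Gender   = factor(sample(c("Female", "Male"), n, replace = TRUE))
)
has_claim <- rbinom(n, size = 1, prob = 0.1 * toy$Exposure)
toy$ClaimAmount <- has_claim * rgamma(n, shape = 2, scale = 500)

# Log-link GLM: E[ClaimAmount] = Exposure * exp(linear predictor)
fit <- glm(
  ClaimAmount ~ DrivAge + Gender + offset(log(Exposure)),
  family = quasipoisson(link = "log"),
  data = toy
)

# Global explanation: exp(coef) are multiplicative effects on the price
relativities <- exp(coef(fit))
print(relativities)

# Technical price: expected claim amount with Exposure set to 1
new_rows <- toy[1:3, ]
new_rows$Exposure <- 1
price <- predict(fit, newdata = new_rows, type = "response")
print(price)

# Local explanation: centred per-feature contributions on the log scale
contrib <- predict(fit, newdata = new_rows, type = "terms")
baseline <- attr(contrib, "constant")
print(contrib)

# Baseline plus contributions reproduces the log of the technical price
log_price <- unname(rowSums(contrib) + baseline)
print(exp(log_price))
```

A model of this form is explainable by construction: each coefficient is a multiplicative relativity, and each prediction is the baseline price times one factor per feature. A more flexible model would need a separate explanation method.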

You should submit an HTML file (one file per group, generated from R Markdown) in which you justify and explain your modelling and present your results.