I use the following paper: Yardstick Competition for Multi-product Hospitals: An Analysis of the Proposed Dutch Yardstick Mechanism
This paper introduces several models that capture different aspects of competition within healthcare markets.
The model that I use in this analysis is introduced on pages 56 and 57, on the topic of collective or individual qualities of healthcare.
The paper asks whether it is possible and/or desirable to have different quality levels across insurers and hospitals. It is important to note that I use the same model, except that I use quantity instead of quality (this model with quantity was also used in a healthcare course at Tilburg University). The paper differentiates quality by health status: a healthy person demands a lower quality than an unhealthy person. The same idea applies to quantity: a healthy person has less demand for healthcare, while an unhealthy person has a higher demand. I will briefly explain what the paper discusses, and afterwards I continue with this model with everything related to quantity instead of quality.
The paper introduces a model in which an insurance company can provide an insurance package to consumers. The consumers have a benefit function which depends on the quality (quantity) they receive. Finally, there is a cost function which also depends on the quality (quantity).
The distinction they make in the paper is between collective and individual regimes.
Because the paper introduces this theoretical model only briefly, its discussion of the model is very short and it concludes only the following:
Research Question: How does adverse selection affect the quantity of health that is provided in the simple healthcare model and population healthcare model through rationing and information rent?
In this question the following definitions are used:
The model in the paper consists of different healthcare benefits between consumers, but it does not go into depth about how the healthcare insurers in the market provide different possible quantity packages to maximize their own profit. In practice, people in healthcare markets are often offered different insurance packages. In this analysis I simulate the market with different health levels of the consumers to see how insurers can maximize their profits. The research question is interesting mainly because it shows how insurers react in this model when the simulated market is closer to reality. It also shows how adverse selection, and thus information rent and rationing, can affect equilibrium quantities. The roles of rationing and information rent give valuable insights into the tools insurers have to counter adverse selection by consumers.
In the healthcare market insurers can provide insurance packages for consumers which depend on the health level of the consumer. In this analysis I look at profit maximization of insurers under different market circumstances. From this analysis I conclude the following points:
With different consumers having different levels of health it is beneficial for the insurer to provide different packages as choices for those consumers.
When consumers are free to choose any package in the market, adverse selection becomes possible. Insurers can counter this adverse selection by introducing information rent in the price of the unhealthy package. This information rent reduces the price of the unhealthy package such that what the unhealthy consumers gain from choosing the unhealthy package is equal to or higher than what they gain from swapping to the healthy package.
When information rent is included in the profit (objective) function of the insurer, the insurer can decrease (ration) the quantity provided in the healthy package such that the information rent decreases.
The analysis also shows that the model without adverse selection and information rent can be generalized such that it can be used on a sample population.
I will introduce the model in the paper, but instead of using quality in healthcare I will use quantity in healthcare.
The model of the paper consists of a very simple benefit and cost model based on the quantity provided.
The benefit function is a function with decreasing marginal benefit. The marginal benefit is decreasing because for every consumer the first units of healthcare matter more for their health than extra units when they already receive a lot of healthcare. Think about the average person who benefits a lot from regular checkups or simple doctor visits, but who gets little extra benefit from healthcare they barely need, such as going to the doctor every day.
Benefit function: $$ Benefit_i (q_i, H_i) = H_i \cdot \log(q_i) $$ Here $q_i$ is the quantity of healthcare that consumer $i$ receives and $H_i$ is the health factor of consumer $i$.
The cost function is a function with increasing marginal costs. The marginal cost is increasing because the supply of healthcare is limited: the more healthcare is provided, the fewer doctors are still available. For example, when there is little demand (q) there are still many hospital beds available, but when demand (q) is already so high that almost all hospital beds are filled, an extra patient costs a lot because doctors and nurses already have full schedules.
The cost function: $$ Cost_i (q_i) = costlevel^{q_i} $$ Here $costlevel$ is the cost level in the market and $q_i$ is the quantity provided.
In this model, with only one insurer, the insurer acts as a monopolist and will thus be able to maximize their profits where marginal costs equal marginal benefits: $MC = MB$.
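As a worked illustration of the $MC = MB$ condition, the two functional forms above can be differentiated directly. This is only a sketch under two assumptions that the text does not state explicitly: $\log$ is the natural logarithm and $costlevel > 1$.

$$ MB_i = \frac{\partial Benefit_i}{\partial q_i} = \frac{H_i}{q_i}, \qquad MC_i = \frac{\partial Cost_i}{\partial q_i} = costlevel^{q_i} \cdot \ln(costlevel) $$
$$ MB_i = MC_i \iff \frac{H_i}{q_i} = costlevel^{q_i} \cdot \ln(costlevel) $$

The left-hand side falls in $q_i$ and the right-hand side rises in $q_i$, so under these assumptions there is a single profit-maximizing quantity, and a higher health factor $H_i$ shifts it upward.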
I will extend this model first to the simple healthcare market in which there are two consumers who can choose two different insurance packages from 1 insurer. In this model the price that an insurer can ask for an insurance package is equal to the benefit that the consumer will get from that package.
The profit-maximizing insurer has the following profit function in the simple healthcare market where each consumer is only allowed to choose the package that fits their optimal quantity best (so a market where first-degree price discrimination is possible and the consumer is not allowed to choose another package):
$$ Profit_{insurer} = [Benefit(q_{healthy}, H_{healthy}) - Cost(q_{healthy})] + [Benefit(q_{unhealthy}, H_{unhealthy}) - Cost(q_{unhealthy})] $$
However, in a more realistic market the consumers choose which package they want from the insurance packages that are provided. In that case the health insurer still wants the unhealthy types to choose the package that is optimal for them, simply because they then pay for the unhealthy package at the optimal quantity of the unhealthy type.
To make sure that the unhealthy type chooses the package with the unhealthy optimal quantity and the healthy type chooses their own package, the objective function of the insurer has to change. It has to include information rent in the price of the package for the unhealthy. Information rent is a reduction of the price of the unhealthy package such that what the unhealthy types gain from choosing the unhealthy package is equal to or higher than what they can gain from choosing the healthy package. In the code below I graphically explain the information rent.
The objective function of the insurer with information rent:
$$ InformationRent = Benefit(q_{healthy}, H_{unhealthy}) - Benefit(q_{healthy}, H_{healthy}) $$
$$ Profit_{insurer} = [Benefit(q_{healthy}, H_{healthy}) - Cost(q_{healthy})] + [Benefit(q_{unhealthy}, H_{unhealthy}) - Cost(q_{unhealthy}) - InformationRent] $$
Filling in the information rent into the profit function gives:
$$ Profit_{insurer} = [Benefit(q_{healthy}, H_{healthy}) - Cost(q_{healthy})] + [[Benefit(q_{unhealthy}, H_{unhealthy}) - Cost(q_{unhealthy})] - [Benefit(q_{healthy}, H_{unhealthy}) - Benefit(q_{healthy}, H_{healthy})]] $$
With the profit function that includes information rent, the insurer can change the optimal quantity for the healthy to reduce the information rent lost on the unhealthy. This decrease in the quantity for the healthy is called rationing. In this analysis I use scipy.optimize to show that the optimal quantity differs between the model with and without information rent. This optimization then shows that it is optimal for the insurer to ration the quantity of the healthy to increase the profit on the unhealthy.
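The rationing incentive can also be seen algebraically. The following is a sketch that assumes the benefit function of the paper's model, $Benefit(q, H) = H \cdot \log(q)$, with $H_{unhealthy} > H_{healthy}$. Because the benefit is linear in $H$, the information rent collapses to a single term in $q_{healthy}$:

$$ InformationRent = (H_{unhealthy} - H_{healthy}) \cdot \log(q_{healthy}) $$
$$ Profit_{insurer} = (2 H_{healthy} - H_{unhealthy}) \cdot \log(q_{healthy}) - Cost(q_{healthy}) + H_{unhealthy} \cdot \log(q_{unhealthy}) - Cost(q_{unhealthy}) $$

Under this assumption the terms in $q_{unhealthy}$ are unchanged, so the unhealthy quantity is not distorted. The healthy quantity, however, is chosen as if the healthy type had the lower health factor $2 H_{healthy} - H_{unhealthy} < H_{healthy}$, which is the rationing described above. When $2 H_{healthy} \leq H_{unhealthy}$ the coefficient on $\log(q_{healthy})$ is no longer positive, and the insurer would push $q_{healthy}$ as low as the model allows.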
My final extension to the model in the paper is the population model in which I will create a population sample of consumers with different health levels out of a distribution which is closer to reality where most people need little healthcare and few people need a lot of healthcare.
With this population sample I will create an objective function for the insurer in which the insurer optimizes its profit over the complete sample.
This objective function is the optimization of the sum of profit over the full sample:
$$ \sum_{i=1}^{z} Profit_i $$
in which $z$ is the length of the sample and $i$ is the consumer.
In the population model the $Profit_i$ is calculated using the optimal quantity of each consumer based on their health factor. When their optimal quantity is below the quantity of the unhealthy package they choose the healthy package, and when it is equal to or above the unhealthy quantity they choose the unhealthy package.
This means that the $Profit_i$ in the objective function above consists of the following, in which $H_i$ is the health factor of consumer $i$:
$$ q_{optimal} < q_{unhealthy} : Profit_i = Benefit(q_{healthy}, H_{i}) - Cost(q_{healthy}) $$
$$ q_{optimal} \geq q_{unhealthy} : Profit_i = Benefit(q_{unhealthy}, H_{i}) - Cost(q_{unhealthy}) $$
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats as st
from scipy import optimize
import seaborn as sns
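The piecewise $Profit_i$ rule of the population model can be sketched in a few lines. This is a minimal illustration, not the implementation used in this analysis: it assumes the functional forms $Benefit(q, H) = H \cdot ln(q+1)$ and $Cost(q) = e^{q/40} - 1$, and the helper names, the package quantities and the sample of health factors are hypothetical.

```python
import numpy as np
from scipy import optimize


def benefit(q, health_factor):
    # Concave benefit: H * ln(q + 1)
    return health_factor * np.log(q + 1)


def cost(q):
    # Convex cost: e^(q/40) - 1
    return np.exp(q / 40) - 1


def optimal_quantity(health_factor):
    # Quantity that maximizes benefit minus cost for a single consumer
    result = optimize.minimize_scalar(
        lambda q: -(benefit(q, health_factor) - cost(q)),
        bounds=(0, 160),
        method="bounded",
    )
    return result.x


def profit_i(health_factor, q_healthy, q_unhealthy):
    # A consumer whose optimal quantity is below q_unhealthy takes the healthy package,
    # otherwise the unhealthy package
    if optimal_quantity(health_factor) < q_unhealthy:
        q_chosen = q_healthy
    else:
        q_chosen = q_unhealthy
    return benefit(q_chosen, health_factor) - cost(q_chosen)


def total_profit(sample, q_healthy, q_unhealthy):
    # Sum of Profit_i over the full sample
    return sum(profit_i(h, q_healthy, q_unhealthy) for h in sample)


# Hypothetical sample of health factors and package quantities, for illustration only
sample = [1, 1, 2, 3, 8]
example_total = total_profit(sample, q_healthy=25.0, q_unhealthy=55.0)
```

In this sketch only the consumer with health factor 8 has an optimal quantity above the hypothetical unhealthy package, so the other four are assigned to the healthy package.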
I create the benefit function, which depends on the quantity and on the health level, which in the code is named Health_factor. In this analysis I use functions such that the health factor can be between 1 and 10, in which a health factor of 1 is someone who needs very little healthcare and 10 is someone who is very unhealthy.
The Benefit function is a function with decreasing marginal utility. The function is: $Benefit(q, H) = H \cdot ln(q+1)$
The cost function is a function with increasing marginal costs. The cost level in the market is set to Euler's number $e$, such that the function becomes: $Cost(q) = e^{q/40} - 1$
#Benefit of quantity q for a consumer with the given health factor: H * ln(q + 1)
def Benefit(q, Health_factor):
    return Health_factor*np.log(q+1)
#Cost of providing quantity q: e^(q/40) - 1
def Cost(q):
    return (np.exp(q/40) - 1)
Below I create q such that it becomes a numpy array consisting of 0 to 160 with steps of 1.
q = np.linspace(0,160,161) #161 points, so that the values run from 0 to 160 with steps of 1
In the code below I create a visualization of the market. I start with configuring the plot size.
Then in the for loop I create the different benefit lines with different health factors. The for loop loops over the range(1,11) which is 1 up to 10 with steps of 1. Such that I get the Health factor values of 1 to 10. I plot these lines in the for loop with the q as x value and the benefit of the combination of q and H as the y value. I also give these lines the labels of Benefit at each health factor.
I then also plot the cost function where q is the x value and the cost at q is the y value. This line has the color red and the label Costs.
The plt.legend is needed to show the legend of different lines. And the label of the y axis is simply the value of the benefit or the cost.
plt.figure(figsize=(16, 8), dpi=80)
for H in range(1, 11):
    plt.plot(q, Benefit(q,H), label = f"Benefit H={H}")
plt.plot(q, Cost(q), color = "red", label = "Costs")
plt.xlabel("Quantity (q)")
plt.ylabel("Value")
plt.legend()
plt.show()
Below I create the first objective function to find the optimal point where the insurer maximizes the difference between the benefit (and thus the price the insurer can ask from the consumer) and the cost at the level of q. This objective function is used to find the level of q at which this difference (and thus the profit of the insurer) is maximized.
The objective function returns the negative of benefit - costs because scipy.optimize.minimize_scalar minimizes a function to find the optimum value.
def objective_function(q):
    #Health_factor is read from the global scope, so it has to be set before the optimizer is called
    return -(Benefit(q, Health_factor) - Cost(q))
Below I create a for loop which loops over the values 1 to 10 where in the loop the health factor is set to the value of the loop and then the optimal q is calculated of that health factor. As a result you can see that the optimal q is increasing with the health factor.
for i in range(1,11):
    Health_factor = i
    print("The optimal q with a Health factor of ",Health_factor," is: ",optimize.minimize_scalar(objective_function).x)
The optimal q with a Health factor of 1 is: 22.04946882287033
The optimal q with a Health factor of 2 is: 33.56611349110644
The optimal q with a Health factor of 3 is: 41.5100503475735
The optimal q with a Health factor of 4 is: 47.6340206478508
The optimal q with a Health factor of 5 is: 52.64051156611696
The optimal q with a Health factor of 6 is: 56.88631665631836
The optimal q with a Health factor of 7 is: 60.57884633148013
The optimal q with a Health factor of 8 is: 63.849852728277206
The optimal q with a Health factor of 9 is: 66.78847404917059
The optimal q with a Health factor of 10 is: 69.4579427894475
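The optimal quantities can be cross-checked against the condition $MB = MC$. For $Benefit(q, H) = H \cdot ln(q+1)$ and $Cost(q) = e^{q/40} - 1$ the marginal benefit is $H/(q+1)$ and the marginal cost is $e^{q/40}/40$. The sketch below solves $MB - MC = 0$ with a root finder instead of a minimizer; the helper names are my own and are not used elsewhere in this analysis.

```python
import numpy as np
from scipy import optimize


def marginal_benefit(q, health_factor):
    # Derivative of H * ln(q + 1) with respect to q
    return health_factor / (q + 1)


def marginal_cost(q):
    # Derivative of e^(q/40) - 1 with respect to q
    return np.exp(q / 40) / 40


def foc_root(health_factor):
    # MB - MC is positive at q = 0 and negative at q = 160 for health factors 1 to 10,
    # so a bracketing root finder can be used on the interval [0, 160]
    return optimize.brentq(
        lambda q: marginal_benefit(q, health_factor) - marginal_cost(q), 0, 160
    )


roots = {h: foc_root(h) for h in range(1, 11)}
for h, q_star in roots.items():
    print(f"Health factor {h}: MB = MC at q = {q_star:.4f}")
```

The roots should agree with the minimizer output up to numerical tolerance, which confirms that the optimizer finds the interior optimum and not a boundary point.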
Above is shown that the optimal q is increasing with the health factor. Below I also make a graph to visualize this. In the code below:
I initialize 2 empty lists for the values I later plot.
I then loop over np.arange(1,11,0.01), which means that I loop over the values from 1 up to (but not including) 11 with steps of 0.01. np.arange is used because numpy.arange allows a float as step value (in this case the 0.01). I then add these values of the health factor and the outcome of the optimize function to the lists.
I finally plot these lists where the x values are the health factors and the y values are the value of the optimal q.
List_optimal_q_per_health_factor = []
List_health_factor = []
for i in np.arange(1,11,0.01):
    Health_factor = i
    List_health_factor.append(i)
    List_optimal_q_per_health_factor.append(optimize.minimize_scalar(objective_function).x)
plt.figure(figsize=(8, 6), dpi=80)
plt.plot(List_health_factor, List_optimal_q_per_health_factor)
plt.xlabel("Health factor")
plt.ylabel("optimal q")
plt.show()
The graph shows that the optimal quantity increases with the health factor, but at a decreasing rate: the slope of the line falls, so every further increase in the health factor results in a smaller increase in the optimal quantity.
People can have different health factors. This health factor is the measure of how unhealthy a person is and thus whether a person needs more healthcare.
In the above part I calculated the optimal q in the case that one person was in the market. I now continue with the simple healthcare model where there are two consumers and the insurer can provide two packages for the consumers.
Because healthcare is a market in which an insured person cannot be turned away from the hospital, there is an incentive for both the healthy and unhealthy types to choose the cheapest package. However, it is possible for the insurer to provide different packages with different prices such that each consumer chooses the package intended for their type.
To make sure that unhealthy patients do not choose the cheaper package, the insurer can introduce information rent in the price of the more expensive package. This information rent is a reduction of the price of the package of the unhealthy types such that what they gain from choosing the unhealthy package is equal to or more than what they would gain when choosing the healthy package.
To make sure that this is the case the information rent would be set at the value of:
(Benefit at q_healthy with health factor of unhealthy) - (Benefit at q_healthy with health factor of healthy)
I now shortly introduce this information rent using a graphical explanation:
I use health factor 1 as a healthy type and health factor 3 as an unhealthy type.
In the graph I create the benefit lines of both and the cost line of the insurer. The dotted lines show the optimal q levels for both. The information rent is the purple line.
Further explanation continues at the next graph, where the information rent is applied to the unhealthy types.
Health_factor = 1
Optimal_q_at_H1 = optimize.minimize_scalar(objective_function).x
Health_factor = 3
Optimal_q_at_H3 = optimize.minimize_scalar(objective_function).x
plt.figure(figsize=(10, 6), dpi=80)
plt.plot(q,Benefit(q,1), label = 'Health factor = 1') #plot of benefit with health factor 1
plt.plot(q,Benefit(q,3), label = 'Health factor = 3') #plot of benefit with health factor 3
plt.plot(q,Cost(q)) #plot of cost
#Below are the lines at the optimal q values for each health factor line
# plt.vlines creates a vertical line at an x value with a ymin (start) and a ymax (end) of the line. The ymax here is the value of the benefit function at a certain q with a certain health factor.
plt.vlines(x = Optimal_q_at_H1, ymin = 0, ymax = Benefit(Optimal_q_at_H1,1), label = "optimal q at H = 1", linestyle = ":" , colors = 'coral')
plt.vlines(x = Optimal_q_at_H3, ymin = 0, ymax = Benefit(Optimal_q_at_H3,3), label = "optimal q at H = 3", linestyle = ":" , colors = 'brown')
#Below is the line which shows what the value of the information rent should be
plt.vlines(x = Optimal_q_at_H1, ymin = Benefit(Optimal_q_at_H1,1), ymax = Benefit(Optimal_q_at_H1,3), label = "Information Rent", color = "purple", linewidth = 2)
plt.xlim(0,120) #limit x axis from 0 to 120
plt.ylim(0,15) # limit y axis from 0 to 15
plt.legend()
plt.xlabel('Quantity (q)')
plt.ylabel('value')
plt.show()
In the graph above you can see how the information rent is calculated. In the graph below I continue with almost the same graph, except that the information rent is now applied to the price of the package of the unhealthy types. This results in a new, lower price for the unhealthy package such that the unhealthy type will choose the unhealthy package.
Health_factor = 1
Optimal_q_at_H1 = optimize.minimize_scalar(objective_function).x
Health_factor = 3
Optimal_q_at_H3 = optimize.minimize_scalar(objective_function).x
#calculating the value of the information rent
information_rent = Benefit(Optimal_q_at_H1,3) - Benefit(Optimal_q_at_H1,1)
plt.figure(figsize=(18, 11), dpi=80)
plt.plot(q,Benefit(q,1), label = 'Health factor = 1') #plot of benefit with health factor 1
plt.plot(q,Benefit(q,3), label = 'Health factor = 3') #plot of benefit with health factor 3
plt.plot(q,Cost(q)) #plot of cost
#Below are the lines at the optimal q values for each health factor line
# plt.vlines creates a vertical line at an x value with a ymin (start) and a ymax (end) of the line. The ymax here is the value of the benefit function at a certain q with a certain health factor.
plt.vlines(x = Optimal_q_at_H1, ymin = 0, ymax = Benefit(Optimal_q_at_H1,1), label = "optimal q at H = 1", linestyle = ":", colors = 'coral')
plt.vlines(x = Optimal_q_at_H3, ymin = 0, ymax = Benefit(Optimal_q_at_H3,3), label = "optimal q at H = 3", linestyle = ":", colors = 'brown')
#Below is the line which shows what the value of the information rent should be
plt.vlines(x = Optimal_q_at_H1, ymin = Benefit(Optimal_q_at_H1,1), ymax = Benefit(Optimal_q_at_H1,3), label = "Information Rent Calculation", color = "purple", linewidth = 2, linestyle = ':')
#The information rent part that is subtracted from the original price of the unhealthy types
plt.vlines(x = Optimal_q_at_H3, ymin = Benefit(Optimal_q_at_H3,3)-information_rent, ymax = Benefit(Optimal_q_at_H3,3), label = "Information Rent Subtraction", color = "purple", linewidth = 2)
#below I create the lines of the prices of the packages
plt.hlines(y = Benefit(Optimal_q_at_H1,1), xmin = 0, xmax = Optimal_q_at_H1, label = "optimal price at H = 1", linestyle = ":", colors = 'dimgrey')
plt.hlines(y = Benefit(Optimal_q_at_H3,3), xmin = 0, xmax = Optimal_q_at_H3, label = "price at H = 3 without information rent", linestyle = ":", colors = 'silver')
plt.hlines(y = Benefit(Optimal_q_at_H3,3)-information_rent, xmin = 0, xmax = Optimal_q_at_H3, label = "new price at H = 3 with information rent", linestyle = ":", linewidth = 2, colors = 'navy')
plt.xlim(0,120) #limit x axis from 0 to 120
plt.ylim(0,15) # limit y axis from 0 to 15
plt.legend()
plt.xlabel('Quantity (q)')
plt.ylabel('value')
plt.show()
Below I write a function which automatically calculates the information rent in the simple healthcare market where only 2 different health factors are in the market.
def information_rent(Health_factor_healthy, Health_factor_unhealthy):
    #the healthy health factor is passed to the optimizer directly: assigning to Health_factor inside this function would only create a local variable that objective_function never sees
    res = optimize.minimize_scalar(lambda q: -(Benefit(q, Health_factor_healthy) - Cost(q))) # optimizer to find q_healthy
    q = res.x #set q to q_healthy
    return Benefit(q,Health_factor_unhealthy) - Benefit(q, Health_factor_healthy) # return the information rent at q_healthy with two health factors
round(information_rent(1,6), 3) #example calculation with health factor 1 and 6, rounded to 3 decimals
15.688
In the above two graphs the model of quantity in the healthcare market including information rent is explained. To add further depth I now explain the mechanism that the optimization functions will show. When q_healthy is reduced, the information rent becomes smaller, meaning that the reduction of profit on the unhealthy types also becomes smaller. This is what is called rationing. In the next section the optimization functions are used, and they show that q_healthy decreases such that total profit increases.
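The mechanism can be made concrete with a few numbers. The sketch below assumes the benefit function $Benefit(q, H) = H \cdot ln(q+1)$ and evaluates the information rent on a small, hypothetical grid of healthy quantities for health factors 2 and 3; the function names are illustrative only.

```python
import numpy as np


def benefit(q, health_factor):
    # Benefit(q, H) = H * ln(q + 1)
    return health_factor * np.log(q + 1)


def rent_at(q_healthy, h_healthy, h_unhealthy):
    # Information rent: what the unhealthy type would gain from taking the healthy package
    return benefit(q_healthy, h_unhealthy) - benefit(q_healthy, h_healthy)


# Hypothetical grid of quantities for the healthy package
q_grid = np.array([5.0, 10.0, 20.0, 30.0])
rents = rent_at(q_grid, h_healthy=2, h_unhealthy=3)

for q_value, rent_value in zip(q_grid, rents):
    print(f"q_healthy = {q_value:5.1f} -> information rent = {rent_value:.3f}")
```

With these functional forms the rent equals $(H_{unhealthy} - H_{healthy}) \cdot ln(q_{healthy}+1)$, so it rises with q_healthy, and every unit by which the healthy package is rationed lowers the discount the insurer has to give on the unhealthy package.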
Below I continue by creating a function to optimize the profit of the insurer in the simple healthcare market with information rent.
I start by creating the objective function of the insurer, which consists of the profit on the healthy and the profit on the unhealthy.
The profit on the healthy consists of the benefit and the costs at q_healthy.
The profit on the unhealthy consists not only of the benefit and the costs, but also of the information rent. Note that the information rent is needed because otherwise the unhealthy types would choose the package of the healthy type.
Further explanation can be found in the comments in the code.
def objective_function_insurer(params):
    q_healthy, q_unhealthy = params #the parameters which the optimize function will optimize, in this case the quantities of the two packages
    information_rent_q_healthy = Benefit(q_healthy ,Health_factor_unhealthy) - Benefit(q_healthy, Health_factor_healthy) #the calculation of the information rent
    Profit_on_healthy = Benefit(q_healthy, Health_factor_healthy) - Cost(q_healthy) # the calculation of the profit on the healthy at q_healthy
    Profit_on_unhealthy = Benefit(q_unhealthy, Health_factor_unhealthy) - information_rent_q_healthy - Cost(q_unhealthy) # the calculation of the profit on the unhealthy at q_unhealthy with the information rent
    return -(Profit_on_healthy + Profit_on_unhealthy) #the function returns the negative of the total profit because the optimize function minimizes its objective
I further define the bounds and the example values for the example optimization
bnds = ((0,160),(0,160)) #bounds used in the optimization. In this case both q_healthy and q_unhealthy cannot go below 0 or above 160
#defining the needed parameters of the objective_function_insurer:
Health_factor_healthy = 2
Health_factor_unhealthy = 3
#optimizing the profit of the insurer under the constraints and bounds given the choice functions of the healthy and unhealthy types
optimize_with_information_rent = optimize.minimize(objective_function_insurer, [10,40],method='Nelder-Mead', bounds = bnds)
optimize_with_information_rent
 final_simplex: (array([[22.04945857, 41.51002997],
       [22.04945898, 41.51010004],
       [22.04940566, 41.51007456]]), array([-11.82860443, -11.82860443, -11.82860443]))
           fun: -11.828604426011676
       message: 'Optimization terminated successfully.'
          nfev: 87
           nit: 45
        status: 0
       success: True
             x: array([22.04945857, 41.51002997])
The outcome shows that, for health factors 2 and 3 in the market, q_healthy should be 22.05 and q_unhealthy should be 41.51.
Health_factor = 2
optimal_q_H2 = optimize.minimize_scalar(objective_function).x
print("optimal q without information rent in the model at health factor ",Health_factor,": ",optimal_q_H2)
Health_factor = 3
print("optimal q without information rent in the model at health factor ",Health_factor,": ",optimize.minimize_scalar(objective_function).x)
#the optimum with information rent was computed for Health_factor_healthy = 2 and Health_factor_unhealthy = 3, so those labels are used here
print("optimal q with information rent in the model at health factor ",Health_factor_healthy,": ",optimize_with_information_rent.x[0])
print("optimal q with information rent in the model at health factor ",Health_factor_unhealthy,": ",optimize_with_information_rent.x[1])
print("This shows that the introduction of information rent in the model results in a decrease in q_healthy of: ",optimal_q_H2-optimize_with_information_rent.x[0])
optimal q without information rent in the model at health factor 2 : 33.56611349110644
optimal q without information rent in the model at health factor 3 : 41.5100503475735
optimal q with information rent in the model at health factor 2 : 22.049458570686852
optimal q with information rent in the model at health factor 3 : 41.51002996598079
This shows that the introduction of information rent in the model results in a decrease in q_healthy of: 11.51665492041959
The output above gives an example of rationing in the model when information rent is introduced: q_healthy falls from 33.57 to 22.05, while q_unhealthy stays at 41.51.
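The size of the rationing in this example can be explained with an "effective" health factor. Because $Benefit(q, H) = H \cdot ln(q+1)$ is linear in $H$, the terms of the insurer's profit that contain q_healthy add up to $(2 H_{healthy} - H_{unhealthy}) \cdot ln(q_{healthy}+1) - Cost(q_{healthy})$. The sketch below checks this numerically for health factors 2 and 3. It uses its own helper functions and the L-BFGS-B method, which is my choice for the illustration and not the Nelder-Mead setup used above.

```python
import numpy as np
from scipy import optimize


def benefit(q, health_factor):
    return health_factor * np.log(q + 1)


def cost(q):
    return np.exp(q / 40) - 1


def standalone_optimum(health_factor):
    # Optimal quantity for one consumer without any information rent
    result = optimize.minimize_scalar(
        lambda q: -(benefit(q, health_factor) - cost(q)),
        bounds=(0, 160),
        method="bounded",
    )
    return result.x


def screening_optimum(h_healthy, h_unhealthy):
    # Optimal (q_healthy, q_unhealthy) when the insurer pays information rent
    def negative_profit(params):
        q_healthy, q_unhealthy = params
        rent = benefit(q_healthy, h_unhealthy) - benefit(q_healthy, h_healthy)
        profit_healthy = benefit(q_healthy, h_healthy) - cost(q_healthy)
        profit_unhealthy = benefit(q_unhealthy, h_unhealthy) - cost(q_unhealthy) - rent
        return -(profit_healthy + profit_unhealthy)

    result = optimize.minimize(
        negative_profit, x0=[10.0, 40.0], method="L-BFGS-B", bounds=[(0, 160), (0, 160)]
    )
    return result.x


q_healthy_rent, q_unhealthy_rent = screening_optimum(2, 3)
effective_health_factor = 2 * 2 - 3  # equals 1 for health factors 2 and 3
print(q_healthy_rent, standalone_optimum(effective_health_factor))
print(q_unhealthy_rent, standalone_optimum(3))
```

For health factors 2 and 3 the effective health factor is 1, which is why the rationed q_healthy coincides with the optimal quantity found earlier for a health factor of 1.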
I will now continue with a graphical layout of the result per combination of health factors.
I first create a dataframe of all health factor combinations of 1 to 10.
#create two empty lists for both health factors
H_healthy_list = []
H_unhealthy_list = []
#create two lists with all the possible combinations of 1 to 10 with steps of 1
for i in range(1,11):
    for z in range(1,11):
        H_healthy_list.append(i)
        H_unhealthy_list.append(z)
#create dataframe with healthy and unhealthy health factors list
combinations_df = pd.DataFrame({'H_healthy': H_healthy_list,
'H_unhealthy': H_unhealthy_list})
This dataframe consists of all possible combinations of 1 to 10. Keep in mind that only the rows where the health factors are equal, or where the healthy health factor is lower than the unhealthy health factor, contain meaningful data. That is why the for loop in the next code part sets every case where the unhealthy factor is lower than the healthy factor to 0. Everything that is calculated is appended to a list such that the data can then be added to the dataframe.
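As an aside, the meaningful combinations could also be generated directly. The sketch below is an alternative under the assumption that only the rows with H_healthy <= H_unhealthy are needed; it uses itertools.product, and the names valid_pairs and valid_combinations_df are hypothetical and not used elsewhere in this analysis.

```python
from itertools import product

import pandas as pd

health_factors = range(1, 11)

# Keep only the pairs where the healthy type is at most as unhealthy as the unhealthy type
valid_pairs = [
    (h_healthy, h_unhealthy)
    for h_healthy, h_unhealthy in product(health_factors, repeat=2)
    if h_healthy <= h_unhealthy
]

valid_combinations_df = pd.DataFrame(valid_pairs, columns=["H_healthy", "H_unhealthy"])
print(len(valid_combinations_df))
```

The full 10 by 10 grid is still the more convenient layout here, because the heatmaps further on reshape each column into a 10 by 10 matrix.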
#initializing lists which later will be added as columns to the dataframe
Optimal_q_healthy_without_inforent_list = []
Optimal_q_unhealthy_without_inforent_list = []
Optimal_q_healthy_with_inforent_list = []
Optimal_q_unhealthy_with_inforent_list = []
Information_rent_list = []
for index, row in combinations_df.iterrows(): #loop over the rows in the dataframe to calculate all of the following functions for each combination
    if row['H_unhealthy'] < row['H_healthy']: #setting everything to 0 when unhealthy is lower than healthy because this is not possible
        Optimal_q_healthy_without_inforent_list.append(0)
        Optimal_q_unhealthy_without_inforent_list.append(0)
        Optimal_q_healthy_with_inforent_list.append(0)
        Optimal_q_unhealthy_with_inforent_list.append(0)
        Information_rent_list.append(0)
    else:
        Health_factor_healthy = row['H_healthy']
        Health_factor_unhealthy = row['H_unhealthy']
        Health_factor = Health_factor_healthy #set health factor for original objective function without information rent
        Optimal_q_healthy_without_inforent_list.append(optimize.minimize_scalar(objective_function).x) #optimal q without information rent for the healthy type
        Health_factor = Health_factor_unhealthy #set health factor for original objective function without information rent
        Optimal_q_unhealthy_without_inforent_list.append(optimize.minimize_scalar(objective_function).x) #optimal q without information rent for the unhealthy type
        optimize_with_information_rent = optimize.minimize(objective_function_insurer, [10,40],method='Nelder-Mead', bounds = bnds) #optimize function with information rent
        Optimal_q_healthy_with_inforent_list.append(optimize_with_information_rent.x[0]) #optimal q with information rent for the healthy type
        Optimal_q_unhealthy_with_inforent_list.append(optimize_with_information_rent.x[1]) #optimal q with information rent for the unhealthy type
        #I also add a column with the information rent of each case
        Information_rent_list.append(Benefit(optimize_with_information_rent.x[0] ,Health_factor_unhealthy) - Benefit(optimize_with_information_rent.x[0], Health_factor_healthy))
combinations_df['Optimal_q_healthy_without_inforent'] = Optimal_q_healthy_without_inforent_list #create new column from list
combinations_df['Optimal_q_unhealthy_without_inforent'] = Optimal_q_unhealthy_without_inforent_list #create new column from list
combinations_df['Optimal_q_healthy_with_inforent'] = Optimal_q_healthy_with_inforent_list #create new column from list
combinations_df['Optimal_q_unhealthy_with_inforent'] = Optimal_q_unhealthy_with_inforent_list #create new column from list
combinations_df['Information_rent'] = Information_rent_list #create new column from list
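With the dataframe complete, I now create two extra columns: Q_healthy_rationing, the absolute reduction of the optimal quantity of the healthy package caused by the information rent, and Relative_Q_healthy_rationing, that reduction as a share of the optimal quantity of the healthy package without information rent.
I then show the head of the dataframe.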
combinations_df['Q_healthy_rationing'] = combinations_df['Optimal_q_healthy_without_inforent'] - combinations_df['Optimal_q_healthy_with_inforent']
combinations_df['Relative_Q_healthy_rationing'] = combinations_df['Q_healthy_rationing'] / combinations_df['Optimal_q_healthy_without_inforent']
combinations_df.head(5)
| | H_healthy | H_unhealthy | Optimal_q_healthy_without_inforent | Optimal_q_unhealthy_without_inforent | Optimal_q_healthy_with_inforent | Optimal_q_unhealthy_with_inforent | Information_rent | Q_healthy_rationing | Relative_Q_healthy_rationing |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 22.049469 | 22.049469 | 22.049493 | 22.049447 | 0.0 | -0.000024 | -0.000001 |
| 1 | 1 | 2 | 22.049469 | 33.566113 | 0.000000 | 33.566148 | 0.0 | 22.049469 | 1.000000 |
| 2 | 1 | 3 | 22.049469 | 41.510050 | 0.000000 | 41.510018 | 0.0 | 22.049469 | 1.000000 |
| 3 | 1 | 4 | 22.049469 | 47.634021 | 0.000000 | 47.634043 | 0.0 | 22.049469 | 1.000000 |
| 4 | 1 | 5 | 22.049469 | 52.640512 | 0.000000 | 52.640540 | 0.0 | 22.049469 | 1.000000 |
heatmap_data_informationrent = np.reshape(list(combinations_df['Information_rent']), (10,10)) #reshaping the column into a 10 by 10 matrix of combinations of health factors
index_Health_factors = range(1,11) #health factors as index for the matrix
#create dataframe with column and index as health factors and the data is the matrix with a rounded value of 2 decimals.
informationrent_matrix = pd.DataFrame(heatmap_data_informationrent,columns=index_Health_factors,index=index_Health_factors).round(2)
plt.figure(figsize=(12, 10)) #set figsize
sns.set(font_scale=1) #set fontsize in figure
sns.heatmap(informationrent_matrix, #data
cmap='coolwarm', #colorscheme
annot=True, #number values in the heatmap are shown
fmt='.5g', #formatting code for annotations to correctly show decimals
vmax=15) #maximum value in heatmap to scale correctly
plt.title('Heatmap of information rent per health factor combination',fontsize=17) # set title
plt.xlabel('Health Factor of Unhealthy',fontsize=12) #set xlabel
plt.ylabel('Health Factor of Healthy',fontsize=12) #set ylabel
The lower-left half of the heatmap is 0 because there the health factor of the healthy type is equal to or above that of the unhealthy type. The upper-right half shows the information rent for the remaining combinations; it is highest where the healthy factor is 6 and the unhealthy factor is 10.
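The position of the largest entry can also be looked up programmatically instead of being read off the figure. A minimal sketch, assuming a small hypothetical matrix whose values are made up for illustration (they are not the notebook's results) and whose row and column labels start at 1, as in informationrent_matrix:

```python
import numpy as np

# Hypothetical 3x3 information-rent matrix (illustrative values only).
# Rows: health factor of the healthy type; columns: health factor of the unhealthy type.
toy_rent = np.array([
    [0.0, 1.5, 2.0],
    [0.0, 0.0, 3.5],
    [0.0, 0.0, 0.0],
])

# argmax works on the flattened array; unravel_index converts back to (row, column).
row_pos, col_pos = np.unravel_index(np.argmax(toy_rent), toy_rent.shape)

# Positions start at 0 while the health-factor labels start at 1, so shift by one.
healthy_factor = int(row_pos) + 1
unhealthy_factor = int(col_pos) + 1
max_rent = float(toy_rent[row_pos, col_pos])
```

The same two lines could be applied to heatmap_data_informationrent to confirm the combination named in the text.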
I repeat the code of the information rent heatmap, but now with the rationing column as data. For an explanation of the code, see the information rent heatmap above.
heatmap_data_rationing = np.reshape(list(combinations_df['Q_healthy_rationing']), (10,10))
index_Health_factors = range(1,11)
rationing_matrix = pd.DataFrame(heatmap_data_rationing,columns=index_Health_factors,index=index_Health_factors).round(1)
plt.figure(figsize=(12, 10))
sns.set(font_scale=1)
sns.heatmap(rationing_matrix,
cmap='coolwarm',
annot=True,
fmt='.5g',
vmax=60)
plt.title('Heatmap of rationing of q_healthy per health factor combination',fontsize=17)
plt.xlabel('Health Factor of Unhealthy',fontsize=12)
plt.ylabel('Health Factor of Healthy',fontsize=12)
The lower-left half is again 0 because there the health factor of the healthy type is equal to or above that of the unhealthy type. The upper-right half shows how much the quantity of the healthy package is rationed. Rationing is largest where the healthy health factor is 5 and the unhealthy health factor is 10.
The heatmap above shows the absolute values of rationing. Rationing relative to the optimal quantity of the healthy type without information rent shows by how much information rent lowers that quantity. The heatmap below shows the share of the original quantity that is rationed, which means: relative rationing = (optimal q_healthy without information rent - optimal q_healthy with information rent) / optimal q_healthy without information rent.
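As a small worked example of this share, the sketch below applies the same ratio as the Relative_Q_healthy_rationing column to two rows of the table above. The function name and the guard for a zero denominator are assumptions added for illustration.

```python
def relative_rationing(q_without_rent, q_with_rent):
    """Share of the no-information-rent quantity that is rationed away."""
    if q_without_rent == 0:
        # Assumption: report a share of 0 when there is nothing to ration.
        return 0.0
    return (q_without_rent - q_with_rent) / q_without_rent

# Second table row (H_healthy=1, H_unhealthy=2): the healthy quantity drops to 0.
full = relative_rationing(22.049469, 0.0)

# First table row (H_healthy=1, H_unhealthy=1): the two optima differ only by solver noise.
none = relative_rationing(22.049469, 22.049493)
```

A value of 1 means the healthy package is rationed completely, and a value near 0 means the quantity is essentially unchanged.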
I repeat the heatmap code once more, now with the relative rationing column as data. For an explanation of the code, see the information rent heatmap above.
heatmap_data_relative_rationing = np.reshape(list(combinations_df['Relative_Q_healthy_rationing']), (10,10))
index_Health_factors = range(1,11)
relative_rationing_matrix = pd.DataFrame(heatmap_data_relative_rationing,columns=index_Health_factors,index=index_Health_factors).round(1)
plt.figure(figsize=(12, 10))
sns.set(font_scale=1)
sns.heatmap(relative_rationing_matrix,
cmap='coolwarm',
annot=True,
fmt='.5g',
vmax=1)
plt.title('Heatmap of relative rationing of q_healthy per health factor combination',fontsize=17)
plt.xlabel('Health Factor of Unhealthy',fontsize=12)
plt.ylabel('Health Factor of Healthy',fontsize=12)
The heatmap shows that in many combinations the quantity of the healthy type is fully rationed. In those combinations the insurer maximizes profit by setting the quantity of the healthy package to 0, so that the information rent is 0.
I continue by creating a representative population from a skewed distribution in which most people have a low health factor and higher health factors become progressively less likely. This matches practice, where most people need little healthcare each year and few people need a lot.
I then create a function that optimizes the insurer's profit (by changing q_healthy and q_unhealthy), given that people choose between the package for the healthy types and the package for the unhealthy types.
Using skewnorm.pdf from scipy.stats (st), I create a probability density function (see graph) that puts most weight on low values and decreasing weight on higher values.
skewnorm.pdf evaluates the density at each point of the existing array x. Below, the location (the centre of the distribution) is 0 and the scale parameter (how spread out the distribution is) is 2. The "a" (skewness) parameter is 0, so the density is the same as that of a normal distribution. Because x only covers 0 to 10, only the right half of that normal distribution is used, which gives the decreasing shape.
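To make the role of the "a" parameter concrete, the sketch below writes out the standard skew-normal density formula by hand and checks that a = 0 reproduces the normal density on the same grid. The helper functions are illustrative re-implementations written for this example, not the scipy.stats code itself.

```python
import math
import numpy as np

def skewnorm_pdf(x, a, loc=0.0, scale=1.0):
    """Skew-normal density: (2/scale) * phi(z) * Phi(a*z), with z = (x - loc)/scale."""
    z = (np.asarray(x, dtype=float) - loc) / scale
    phi = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)  # standard normal pdf at z
    # Standard normal cdf evaluated at a*z, written with the error function.
    Phi = 0.5 * (1.0 + np.vectorize(math.erf)(a * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * Phi

def normal_pdf(x, loc=0.0, scale=1.0):
    """Plain normal density, for comparison."""
    z = (np.asarray(x, dtype=float) - loc) / scale
    return np.exp(-0.5 * z**2) / (scale * math.sqrt(2.0 * math.pi))

x = np.linspace(0, 10, 1000)
y_a0 = skewnorm_pdf(x, a=0, loc=0, scale=2)  # same parameters as in the notebook
y_norm = normal_pdf(x, loc=0, scale=2)

same_as_normal = bool(np.allclose(y_a0, y_norm))        # a = 0 removes the skew
strictly_decreasing = bool(np.all(np.diff(y_a0) < 0))   # half-normal shape on [0, 10]
```

The decreasing shape therefore comes from restricting x to non-negative values, not from the skewness parameter. A positive "a" would be one way to get a genuinely skewed density, at the cost of moving the peak away from 0.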
x = np.linspace(0, 10, 1000) #creates a numpy array of 1000 evenly spaced values from 0 to 10 (both included), so the step is 10/999, roughly 0.01.
y1 = st.skewnorm.pdf(x, a= 0, loc = 0, scale = 2)
plt.plot(x, y1) #plot of distribution
plt.xlabel('number')
plt.ylabel('Density')
plt.show()
From this representative population I now draw a random sample using numpy's random choice.
np.random.seed(10) #set random seed such that the random sample is the same when rerunning the analysis
samples_health_factor = np.random.choice(x, size=1000, p=y1/np.sum(y1)) #random sample from x of size 1000 with probability of the pdf created above (y1)
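The sampling step turns density values into probabilities by dividing by their sum. As an alternative sketch, the same idea can be written with NumPy's Generator interface, which keeps the seed local instead of global. The weights below are an assumption that stands in for y1: an unnormalised half-normal with scale 2, which gives the same probabilities after normalisation. The draws will not be identical to those produced with np.random.seed(10).

```python
import numpy as np

x = np.linspace(0, 10, 1000)

# Unnormalised half-normal weights with scale 2, standing in for y1.
weights = np.exp(-0.5 * (x / 2.0) ** 2)
probabilities = weights / weights.sum()  # the p argument must sum to 1

rng = np.random.default_rng(10)  # local, seeded generator
samples = rng.choice(x, size=1000, p=probabilities)
```

Because the normalising constant of the density cancels when dividing by the sum, any positive multiple of y1 yields the same sampling probabilities.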
From these samples I create a dataframe, and for each health factor in the sample I calculate the optimal q, which I then add to the dataframe.
df_samples = pd.DataFrame(data = {'samples_health_factor': samples_health_factor}) #Create dataframe with one column of samples_health_factor with the data of samples_health_factor
optimal_q_list = [] #initialize list
for i in df_samples['samples_health_factor']: #loop through the sampled health factors
    Health_factor = i #set Health_factor to i so that the objective function uses it to calculate the optimal q
    optimal_q_result = optimize.minimize_scalar(objective_function, bounds = (0, 160), method='bounded') #optimize the objective function using minimize_scalar
    optimal_q_list.append(float(optimal_q_result.x)) #append the x attribute of the optimization result (the optimal q) to the optimal_q list
df_samples['optimal_q'] = optimal_q_list #adding the optimal_q list to the dataframe
df_samples.head() #head of dataframe as check that everything worked
| | samples_health_factor | optimal_q |
|---|---|---|
| 0 | 2.402402 | 37.053853 |
| 1 | 0.050050 | 0.954780 |
| 2 | 1.801802 | 31.660280 |
| 3 | 2.292292 | 36.144567 |
| 4 | 1.341341 | 26.595653 |
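The loop above sets Health_factor before each optimization so that the objective function can pick it up. A sketch of an alternative is to pass the health factor explicitly as a function argument. The benefit and cost functions below are hypothetical placeholders (they are not the notebook's Benefit and Cost), and a simple grid search is swapped in for minimize_scalar to keep the example dependency-light.

```python
import numpy as np

def benefit(q, health_factor):
    # Hypothetical concave benefit, NOT the notebook's Benefit function.
    return health_factor * np.log1p(q)

def cost(q):
    # Hypothetical linear cost, NOT the notebook's Cost function.
    return 0.05 * q

def optimal_q(health_factor, q_max=160.0, n_grid=16001):
    """Grid-search the q in [0, q_max] that maximises benefit minus cost."""
    q_grid = np.linspace(0.0, q_max, n_grid)
    surplus = benefit(q_grid, health_factor) - cost(q_grid)
    return float(q_grid[np.argmax(surplus)])

health_factors = [0.5, 1.0, 2.0]
optimal_qs = [optimal_q(h) for h in health_factors]
```

For these placeholder functions the optimum has the closed form q = 20 * health_factor - 1, which gives a quick check that the search behaves as intended. With SciPy, the same explicit-argument style is available through the args parameter of minimize_scalar.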
I continue with the population sample and create a function that calculates the maximum profit for the insurer when it can offer two different packages on the market.
I start with the objective function for the insurer.
The objective function takes the two parameters q_healthy and q_unhealthy, which the optimizer varies. It then calculates the insurer's profit, which depends on the number of people who choose the healthy package and the number who choose the unhealthy package. Keep in mind that this is without information rent.
The profit below is calculated with a for loop over the entire sample.
Scenario 1: the consumer's optimal quantity is below q_unhealthy. The consumer then chooses the q_healthy package, and the insurer's profit is the price (the benefit at q_healthy given the consumer's health factor) minus the cost at q_healthy.
Scenario 2: the consumer's optimal quantity is at or above q_unhealthy. The consumer then chooses the q_unhealthy package, and the insurer's profit is the benefit at q_unhealthy given the consumer's health factor minus the cost at q_unhealthy.
def objective_function_insurer_population(params):
    q_healthy, q_unhealthy = params #the two quantities that the optimizer varies
    profit = 0 #initialize profit at 0
    for index, row in df_samples.iterrows(): #loop over the rows in the dataframe
        if row['optimal_q'] < q_unhealthy: #optimal q below q_unhealthy: the consumer picks the healthy package
            profit_i = Benefit(q_healthy, row['samples_health_factor']) - Cost(q_healthy)
        else: #otherwise the consumer picks the unhealthy package
            profit_i = Benefit(q_unhealthy, row['samples_health_factor']) - Cost(q_unhealthy)
        profit += profit_i #add the profit of the current row to the running total
    return -(profit) #return the negative profit: the optimizer minimizes, so minimizing negative profit maximizes profit
bnds = ((0,160),(0,160)) # again bounds that both are between 0 and 160
#optimizing the profit of the insurer under the bounds given the choice functions of the healthy and unhealthy types
optimize_population = optimize.minimize(objective_function_insurer_population, [10,40], method='Nelder-Mead', bounds = bnds) #10 and 40 are the initial guess and the method is nelder mead
optimize_population
final_simplex: (array([[20.32466095, 33.86141037], [20.32457644, 33.86141037], [20.32469517, 33.86141036]]), array([-4369.47396675, -4369.47396673, -4369.4739667 ]))
fun: -4369.473966745163
message: 'Optimization terminated successfully.'
nfev: 176
nit: 92
status: 0
success: True
x: array([20.32466095, 33.86141037])
The outcome is that q_healthy should be 20.32 and q_unhealthy should be 33.86 (the x values in the optimization output).
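The optimizer calls the objective many times (nfev: 176 in the output above), and each call loops over every row of the sample. A sketch of a vectorised variant is given below, checked against a plain loop on toy data. The benefit and cost functions are hypothetical placeholders rather than the notebook's Benefit and Cost, and the function names are made up for this example.

```python
import numpy as np

def benefit(q, health_factor):
    # Hypothetical concave benefit, NOT the notebook's Benefit function.
    return health_factor * np.log1p(q)

def cost(q):
    # Hypothetical linear cost, NOT the notebook's Cost function.
    return 0.05 * q

def negative_profit_loop(params, health_factors, optimal_qs):
    """Row-by-row version with the same choice rule as the loop above."""
    q_healthy, q_unhealthy = params
    profit = 0.0
    for h, q_opt in zip(health_factors, optimal_qs):
        q_chosen = q_healthy if q_opt < q_unhealthy else q_unhealthy
        profit += benefit(q_chosen, h) - cost(q_chosen)
    return -profit

def negative_profit_vectorised(params, health_factors, optimal_qs):
    """Same calculation using a boolean mask instead of a Python loop."""
    q_healthy, q_unhealthy = params
    picks_healthy = optimal_qs < q_unhealthy               # scenario 1 mask
    q_chosen = np.where(picks_healthy, q_healthy, q_unhealthy)
    return -float(np.sum(benefit(q_chosen, health_factors) - cost(q_chosen)))

health_factors = np.array([0.5, 1.0, 2.0, 3.0])
optimal_qs = 20.0 * health_factors - 1.0  # optimum implied by the toy functions
loop_value = negative_profit_loop((15.0, 40.0), health_factors, optimal_qs)
vector_value = negative_profit_vectorised((15.0, 40.0), health_factors, optimal_qs)
```

Both versions return the same value. The vectorised one only works if the benefit function accepts arrays, which is an assumption about how Benefit is written.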
Important: Under the current assumptions information rent plays no role in this analysis, because I assume that the insurer can perfectly price its package per consumer. Each consumer therefore pays a different price (equal to their benefit at q given their health factor), since they all have different health factors. This also means that there is no distinct lower health factor against which to calculate the information rent. Keep in mind that the information rent is calculated as follows: Benefit(q_healthy, Health_factor_unhealthy) - Benefit(q_healthy, Health_factor_healthy).
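The information rent formula can be illustrated with a small numerical sketch. The benefit function below is a hypothetical placeholder (not the notebook's Benefit), chosen so that the benefit is 0 at q = 0 and increases with the health factor.

```python
import math

def benefit(q, health_factor):
    # Hypothetical concave benefit, NOT the notebook's Benefit function.
    return health_factor * math.log1p(q)

def information_rent(q_healthy, h_healthy, h_unhealthy):
    """Benefit(q_healthy, H_unhealthy) - Benefit(q_healthy, H_healthy)."""
    return benefit(q_healthy, h_unhealthy) - benefit(q_healthy, h_healthy)

# Health factors 1 (healthy) and 2 (unhealthy), at three healthy-package quantities.
rent_at_zero = information_rent(0.0, 1.0, 2.0)
rent_at_ten = information_rent(10.0, 1.0, 2.0)
rent_at_twenty = information_rent(20.0, 1.0, 2.0)
```

Under this placeholder the rent is 0 when q_healthy is 0 and grows with q_healthy, which is the mechanism behind the rationing of the healthy package seen in the heatmaps.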
Because of this limitation it is not possible to build a clear population model that includes information rent. What I do below is an example of how such a model could look if the price of the healthy package were set at the benefit at the mean of all health factors in the healthy group, and the price of the unhealthy package at the benefit at the mean of all health factors in the unhealthy group.
In this case a person chooses the package they are willing to pay for, which means that they choose the package for healthy types as long as their q_optimal is below q_unhealthy. The insurer's profit is then the sum over these two scenarios:
Scenario 1 (healthy group): Benefit(q_healthy, mean health factor of the healthy group) - Cost(q_healthy).
Scenario 2 (unhealthy group): Benefit(q_unhealthy, mean health factor of the unhealthy group) - Cost(q_unhealthy) - information rent.
The information rent is calculated at q_healthy as Benefit(q_healthy, mean health factor of the unhealthy group) - Benefit(q_healthy, mean health factor of the healthy group).
In the code, the group means are obtained by filtering the dataframe on the optimal_q column, which determines whether a person belongs to the healthy or the unhealthy group, and then taking the mean of the health factor column.
def objective_function_insurer_population_with_information_rent(params):
    q_healthy, q_unhealthy = params
    #the group means depend only on q_unhealthy, so they are calculated once per call instead of once per row
    mean_healthy = df_samples[df_samples['optimal_q']<q_unhealthy]['samples_health_factor'].mean()
    mean_unhealthy = df_samples[df_samples['optimal_q']>=q_unhealthy]['samples_health_factor'].mean() #>= so that the group matches the else branch below
    profit = 0
    for index, row in df_samples.iterrows():
        if row['optimal_q'] < q_unhealthy: #Profit of the healthy is benefit-cost at the mean health factor of the healthy group.
            profit_i = Benefit(q_healthy, mean_healthy) - Cost(q_healthy)
        else: #Profit of the unhealthy is benefit-cost-information rent.
            profit_i = Benefit(q_unhealthy, mean_unhealthy) - Cost(q_unhealthy) - (Benefit(q_healthy, mean_unhealthy) - Benefit(q_healthy, mean_healthy))
        profit += profit_i
    return -(profit)
bnds = ((0,160),(0,160)) #again the bounds that both parameters have to be between 0 and 160
#optimizing the profit of the insurer under the constraints and bounds given the choice functions of the healthy and unhealthy types
optimize_population_with_information_rent = optimize.minimize(objective_function_insurer_population_with_information_rent, [10,40],method='Nelder-Mead', bounds = bnds)
optimize_population_with_information_rent
final_simplex: (array([[ 29.13075711, 134.81815466], [ 29.13075712, 134.81806335], [ 29.13075712, 134.81816103]]), array([-4242.49040235, -4242.49040235, -4242.49040235]))
fun: -4242.490402354217
message: 'Optimization terminated successfully.'
nfev: 122
nit: 55
status: 0
success: True
x: array([ 29.13075711, 134.81815466])
The output of the optimization shows that the optimal q_healthy is 29.13 and the optimal q_unhealthy is 134.82.
This means that the whole population chooses the healthy package, and that with this objective function this is also the best outcome for the insurer. The reason is that q_unhealthy at 134.82 is far above the maximum optimal_q in the data, so no one chooses the unhealthy package. The maximum value of optimal_q in the data is shown below.
df_samples['optimal_q'].max()
58.44503446321638
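The same check can be expressed as the share of consumers who pick each package for a given q_unhealthy. A minimal sketch, using a toy array with rounded optimal_q values taken from the sample rows and the maximum shown above (a handful of values, not the full sample); the function name is made up for this example:

```python
import numpy as np

def package_shares(optimal_qs, q_unhealthy):
    """Return (share choosing healthy package, share choosing unhealthy package)."""
    optimal_qs = np.asarray(optimal_qs, dtype=float)
    picks_unhealthy = optimal_qs >= q_unhealthy  # same rule as the else branch
    share_unhealthy = float(picks_unhealthy.mean())
    return 1.0 - share_unhealthy, share_unhealthy

# Rounded optimal_q values from the sample rows and the maximum shown above.
toy_optimal_qs = np.array([0.95, 26.6, 31.7, 36.1, 37.1, 58.4])

shares_with_rent = package_shares(toy_optimal_qs, 134.82)    # q_unhealthy from the run with information rent
shares_without_rent = package_shares(toy_optimal_qs, 33.86)  # q_unhealthy from the run without it
```

With q_unhealthy at 134.82 the unhealthy group is empty. An empty group has no defined mean health factor, which is one more reason the mean-based information rent is hard to interpret in this setup.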
Research Question: How does adverse selection affect the quantity of health that is provided in the simple healthcare model and population healthcare model through rationing and information rent?
In this question the following definitions are used:
The analysis shows that:
Because consumers differ in their level of health, it is beneficial for the insurer to offer different packages for those consumers to choose from.
Offering this choice makes adverse selection possible, however. Information rent therefore has to be introduced in the model, so that the healthy types choose the healthy package and the unhealthy types choose the unhealthy package.
With information rent in the insurer's objective function, the quantity provided to the healthy types decreases, which lowers the information rent and raises total profit. Limiting adverse selection through information rent thus results in rationing of the healthy package.
Finally, the analysis shows possible optimal quantities for a sample of a population, but once information rent is added to that model it remains unclear how the information rent can be calculated when the health levels of all consumers differ.
The analysis in this notebook is based entirely on theory and does not use any actual data. Its objective was to take the model discussed in the paper and extend it so that it comes closer to reality. I did this by adding adverse selection to the simple model and then adding the option of information rent. The analysis showed how information rent results in rationing of the healthy quantity. Finally, I created a model that optimizes the quantities of a healthy and an unhealthy package for a complete population, based on a distribution in which most people have a low demand for healthcare and few people have a high demand.
This theoretical extension of the model therefore still needs a lot of research to see whether it is representative of what happens in reality.
The same holds for the final part of the Python code, which discusses optimal quantities for a full population with information rent: more research is needed on whether information rent would actually play a role there and what prices the insurer can ask for the packages.