Generating model points for ASL

This notebook is modified from generate_model_points.ipynb and generates the sample model points for the BasicTermASL_ME model using random numbers.

Columns:

policy_id: Model point identifier, used as the index of the table.
age_at_entry: Issue age. The samples are distributed uniformly from 20 to 59.
sex: “M” or “F” to indicate the policyholder’s sex. Not used by default.
policy_term: Policy term in years. The samples are evenly distributed among 10, 15 and 20.
policy_count: The number of policies. The samples are uniformly distributed from 0 to 100.
sum_assured: Sum assured. The samples are uniformly distributed from 10,000 to 1,000,000 and rounded to the nearest 1,000.
issue_date: Issue date. Dates are pandas.Timestamp objects.
payment_freq: Premium payment frequency as the number of payments in a year: 1, 2 or 12.
payment_term: Payment term in years. Set equal to or shorter than the policy term.
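
As a minimal sketch of the payment_term rule, assuming the mapping used in the generation cell of this notebook: about half of the model points are randomly flagged for a shortened payment period, which is 5 years for 10-year policies and 10 years for 15- and 20-year policies. The helper name `payment_term_for` is hypothetical and not part of lifelib.

```python
def payment_term_for(policy_term: int, short_paidup: bool) -> int:
    """Scalar version of the payment-term rule (hypothetical helper)."""
    if not short_paidup:
        # Premiums are paid for the whole policy term.
        return policy_term
    # Shortened payment period: 5 years for term 10, 10 years for term 15 or 20.
    return 5 if policy_term == 10 else 10


# Print (policy_term, full payment term, shortened payment term) for each term.
for term in (10, 15, 20):
    print(term, payment_term_for(term, False), payment_term_for(term, True))
```

In every case the payment term is equal to or shorter than the policy term.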

Number of model points:

10000

Click the badge below to run this notebook online on Google Colab. You need to be logged in to a Google account to run it there.

The next code cell is relevant only when you run this notebook on Google Colab. It installs lifelib and creates a copy of the library for this notebook.

[1]:

import sys, os

if 'google.colab' in sys.modules:
    lib = 'basiclife'; lib_dir = '/content/'+ lib
    if not os.path.exists(lib_dir):
        !pip install lifelib
        import lifelib; lifelib.create(lib, lib_dir)

    %cd $lib_dir

[2]:

import numpy as np
from numpy.random import default_rng  # Requires NumPy 1.17 or newer
import pandas as pd

rng = default_rng(12345)

# Number of Model Points
point_size = 10000

# Issue Age (Integer): 20 - 59 years old
age_at_entry = rng.integers(low=20, high=60, size=point_size)

# Sex (Char)
Sex = [
    "M",
    "F"
]

sex = np.fromiter(map(lambda i: Sex[i], rng.integers(low=0, high=len(Sex), size=point_size)), np.dtype('<U1'))

# Policy Term (Integer): 10, 15, 20
policy_term = rng.integers(low=0, high=3, size=point_size) * 5 + 10


# Sum Assured (Float): 10000 - 1000000
sum_assured = np.round((1000000 - 10000) * rng.random(size=point_size) + 10000, -3)

# Issue Date:
# For existing policies, issue dates are set so that the policies are in-force on 2022/1/1.
# For new business policies, issue dates are within 3 years from 2022/1/1.

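# Duration in months since issue, measured at 2022/1/1.
# Negative values (down to -36) are new business issued after that date.
dur_offset = (policy_term + 3) * 12 * rng.random(size=point_size) - 36
# Whole months elapsed (floor) and the remaining fraction, which sets the day of the month.
mth_offset = (dur_offset // 1).astype(int)
day_offset = 1 - (dur_offset - mth_offset)
# Issue month, counted back from December 2021.
issue_mth = pd.Series([pd.Period('2022-01', 'M')] * point_size) -1 - mth_offset
# Day of the month, from 1 to the number of days in the issue month.
day_offset = (np.ceil(issue_mth.dt.days_in_month * day_offset)).astype(int)
issue_date = (issue_mth.dt.to_timestamp().dt.to_period('D') -1 + day_offset).dt.to_timestamp().to_numpy()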

# Policy Count (Integer): 0 - 100
policy_count = np.rint(100 * rng.random(size=point_size)).astype(int)

# Payment Frequency
payment_freq = rng.choice([1, 2, 12], size=point_size)

# Premium Payment Term: the policy term, or 5 (term 10) or 10 (term 15 or 20) years if shortened
short_paidup = pd.Series(rng.choice([True, False], size=point_size))
polterm = pd.Series(policy_term)
payment_term = polterm.mask(short_paidup & (polterm==10), 5).mask(short_paidup & (polterm>=15), 10).to_numpy()
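
The issue date arithmetic in the cell above is easier to follow for a single model point. The following is a scalar sketch of the same steps using only the standard library. The function name `issue_date_from_offset` is hypothetical, and the base month 2022-01 is taken from the cell above.

```python
import calendar
import math
from datetime import date


def issue_date_from_offset(dur_offset: float, base_year: int = 2022, base_month: int = 1) -> date:
    """Scalar sketch of the issue-date steps (hypothetical helper)."""
    # Whole months elapsed since issue, and the remaining fraction of a month.
    mth_offset = math.floor(dur_offset)
    frac = dur_offset - mth_offset

    # Issue month = base month - 1 - mth_offset, counted in months since year 0.
    months = base_year * 12 + (base_month - 1) - 1 - mth_offset
    year, month0 = divmod(months, 12)
    month = month0 + 1

    # Day of the month: a larger fraction elapsed means an earlier day in the month.
    days_in_month = calendar.monthrange(year, month)[1]
    day = math.ceil(days_in_month * (1 - frac))
    return date(year, month, day)


print(issue_date_from_offset(0.5))    # 2021-12-16, issued half a month before 2022/1/1
print(issue_date_from_offset(12.25))  # 2020-12-24, in force for a little over a year
print(issue_date_from_offset(-36.0))  # 2024-12-31, the latest possible new business date
```

Because `dur_offset` ranges from -36 up to just under 12 times the policy term, every existing policy is still within its term on 2022/1/1, and every new business policy is issued by the end of 2024.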

[3]:

import pandas as pd

attrs = [
    "age_at_entry",
    "sex",
    "policy_term",
    "policy_count",
    "sum_assured",
    "issue_date",
    "payment_freq",
    "payment_term"
]

data = [
    age_at_entry,
    sex,
    policy_term,
    policy_count,
    sum_assured,
    issue_date,
    payment_freq,
    payment_term
]

model_point_table = pd.DataFrame(dict(zip(attrs, data)), index=range(1, point_size+1))
model_point_table.index.name = "policy_id"
model_point_table

[3]:

	age_at_entry	sex	policy_term	policy_count	sum_assured	issue_date	payment_freq	payment_term
policy_id
1	47	M	10	86	622000.0	2021-12-15	1	5
2	29	M	20	56	752000.0	2004-07-02	2	20
3	51	F	10	83	799000.0	2020-10-02	12	10
4	32	F	20	72	422000.0	2011-08-05	1	10
5	28	M	15	99	605000.0	2017-05-22	2	10
...	...	...	...	...	...	...	...	...
9996	47	M	20	25	827000.0	2008-12-01	1	10
9997	30	M	15	81	826000.0	2008-01-13	1	15
9998	45	F	20	10	783000.0	2009-11-07	2	10
9999	39	M	20	9	302000.0	2021-01-22	12	10
10000	22	F	15	18	576000.0	2008-03-16	2	10

10000 rows × 8 columns
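
A quick sanity check can confirm that a generated table respects the column descriptions at the top of this notebook. The sketch below is one possible way to do that, not part of lifelib: the function name `check_model_points` is hypothetical, and the sample rows are the first three rows of the output above.

```python
import pandas as pd


def check_model_points(df: pd.DataFrame) -> dict:
    """Return check name -> bool for the documented column ranges (hypothetical helper)."""
    return {
        "age_at_entry in 20..59": bool(df["age_at_entry"].between(20, 59).all()),
        "sex is M or F": bool(df["sex"].isin(["M", "F"]).all()),
        "policy_term in {10, 15, 20}": bool(df["policy_term"].isin([10, 15, 20]).all()),
        "policy_count in 0..100": bool(df["policy_count"].between(0, 100).all()),
        "sum_assured in 10,000..1,000,000": bool(df["sum_assured"].between(10_000, 1_000_000).all()),
        "payment_freq in {1, 2, 12}": bool(df["payment_freq"].isin([1, 2, 12]).all()),
        "payment_term <= policy_term": bool((df["payment_term"] <= df["policy_term"]).all()),
    }


# First three rows of the table shown above.
sample = pd.DataFrame(
    {
        "age_at_entry": [47, 29, 51],
        "sex": ["M", "M", "F"],
        "policy_term": [10, 20, 10],
        "policy_count": [86, 56, 83],
        "sum_assured": [622000.0, 752000.0, 799000.0],
        "issue_date": pd.to_datetime(["2021-12-15", "2004-07-02", "2020-10-02"]),
        "payment_freq": [1, 2, 12],
        "payment_term": [5, 20, 10],
    },
    index=pd.Index([1, 2, 3], name="policy_id"),
)

results = check_model_points(sample)
for name, ok in results.items():
    print(f"{name}: {ok}")
```

In the notebook itself, the same function could be applied to `model_point_table` directly.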

[4]:

model_point_table.to_excel("model_point_table.xlsx")