Generating model points with duration

This notebook is modified from generate_model_points.ipynb and generates sample model points for the BasicTerm_SE and BasicTerm_ME models using random numbers. The model points have the duration_mth attribute, which indicates how many months have elapsed from the issue of each model point to time 0. A negative duration_mth indicates future new business.
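To make the sign convention concrete, the following sketch flags future new business in a tiny hand-made table. The table values and the column names is_future_new_business and months_to_issue are illustrative assumptions; they are not part of this notebook or of lifelib.

```python
import pandas as pd

# Hypothetical three-row table; the values are illustrative only.
sample = pd.DataFrame(
    {"policy_term": [10, 20, 15], "duration_mth": [-12, 0, 55]},
    index=pd.Index([1, 2, 3], name="policy_id"),
)

# A negative duration means the policy is issued after time 0.
sample["is_future_new_business"] = sample["duration_mth"] < 0

# Months from time 0 until issue; 0 for policies already issued.
sample["months_to_issue"] = (-sample["duration_mth"]).clip(lower=0)

print(sample)
```

Only the first row is flagged: it is issued 12 months after time 0, while the other two rows are already in force.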

Columns:

  • policy_id: Model point identifier, used as the index of the table

  • age_at_entry: Issue age. The samples are distributed uniformly from 20 to 59.

  • sex: “M” or “F” to indicate policy holder’s sex. Not used by default.

  • policy_term: Policy term in years. The samples are evenly distributed among 10, 15 and 20.

  • policy_count: The number of policies. Uniformly distributed from 0 to 100.

  • sum_assured: Sum assured. The samples are uniformly distributed from 10,000 to 1,000,000.

  • duration_mth: Months elapsed from issue until t=0. Negative values indicate future new business. Uniformly distributed from -36 to 12 times policy_term.
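The ranges listed above can be turned into a quick sanity check on a generated table. This is a minimal sketch: the helper check_model_points is hypothetical and not part of lifelib, and the two sample rows are copied from the output table shown in this notebook.

```python
import pandas as pd


def check_model_points(table: pd.DataFrame) -> dict:
    """Return one True/False flag per column (hypothetical helper)."""
    term_mth = 12 * table["policy_term"]
    duration = table["duration_mth"]
    return {
        "age_at_entry": bool(table["age_at_entry"].between(20, 59).all()),
        "sex": bool(table["sex"].isin(["M", "F"]).all()),
        "policy_term": bool(table["policy_term"].isin([10, 15, 20]).all()),
        "policy_count": bool(table["policy_count"].between(0, 100).all()),
        "sum_assured": bool(table["sum_assured"].between(10_000, 1_000_000).all()),
        # Upper bound depends on each row's own policy term.
        "duration_mth": bool((duration.ge(-36) & duration.le(term_mth)).all()),
    }


# Two rows copied from the notebook's output table.
rows = pd.DataFrame(
    {
        "age_at_entry": [47, 29],
        "sex": ["M", "M"],
        "policy_term": [10, 20],
        "policy_count": [86, 56],
        "sum_assured": [622000.0, 752000.0],
        "duration_mth": [1, 210],
    },
    index=pd.Index([1, 2], name="policy_id"),
)
results = check_model_points(rows)

# A duration beyond 12 * policy_term (here 250 > 240) should be flagged.
bad_results = check_model_points(rows.assign(duration_mth=[1, 250]))

print(results)
print(bad_results)
```

Returning one flag per column, rather than a single pass/fail value, makes it easier to see which attribute is out of range.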

Number of model points:

  • 10000

Click the badge below to run this notebook online on Google Colab. You need to be logged in to a Google account to run it there.

Run on Google Colab

The next code cell is relevant only when you run this notebook on Google Colab. It installs lifelib and creates a copy of the library for this notebook.

[1]:
import sys, os

if 'google.colab' in sys.modules:
    lib = 'basiclife'; lib_dir = '/content/'+ lib
    if not os.path.exists(lib_dir):
        !pip install lifelib
        import lifelib; lifelib.create(lib, lib_dir)

    %cd $lib_dir
[2]:
import numpy as np
from numpy.random import default_rng  # Requires NumPy 1.17 or newer

rng = default_rng(12345)

# Number of Model Points
MPCount = 10000

# Issue Age (Integer): 20 - 59 year old
age_at_entry = rng.integers(low=20, high=60, size=MPCount)

# Sex (Char)
Sex = [
    "M",
    "F"
]

# Index the label array with random 0/1 integers (same draws as before)
sex = np.array(Sex)[rng.integers(low=0, high=len(Sex), size=MPCount)]

# Policy Term (Integer): 10, 15, 20
policy_term = rng.integers(low=0, high=3, size=MPCount) * 5 + 10


# Sum Assured (Float): 10000 - 1000000
sum_assured = np.round((1000000 - 10000) * rng.random(size=MPCount) + 10000, -3)

# Duration in months (Integer): -36 <= duration_mth <= policy term in months
duration_mth = np.rint((policy_term + 3) * 12 * rng.random(size=MPCount) - 36).astype(int)

# Policy Count (Integer): 0 - 100
policy_count = np.rint(100 * rng.random(size=MPCount)).astype(int)
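A side note on the rounding approach used for policy_count: np.rint(100 * rng.random(...)) does return integers from 0 to 100, but the two endpoints are each drawn about half as often as the interior values, because only half a unit of the continuous range rounds to each of them. The same holds for the endpoints of duration_mth. The sketch below compares this with rng.integers(..., endpoint=True), which draws every integer with equal probability. It is offered as a possible alternative, not as what this notebook does; the seed and sample size are arbitrary choices, and switching methods would change the generated table.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(0)  # arbitrary seed, chosen only for this illustration
n = 1_000_000         # large sample so the frequencies are stable

# Approach used in the notebook: scale a uniform float and round.
rounded = np.rint(100 * rng.random(size=n)).astype(int)

# Possible alternative: draw integers directly, including the upper endpoint.
direct = rng.integers(low=0, high=100, size=n, endpoint=True)

rounded_freq = np.bincount(rounded, minlength=101)
direct_freq = np.bincount(direct, minlength=101)

# Compare how often the endpoints 0 and 100 appear relative to the value 50.
print("rint:    ", rounded_freq[0], rounded_freq[50], rounded_freq[100])
print("integers:", direct_freq[0], direct_freq[50], direct_freq[100])
```

For sample model points the difference is unlikely to matter, but it is worth knowing if the table is used to test behavior at the boundaries.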
[3]:
import pandas as pd

attrs = [
    "age_at_entry",
    "sex",
    "policy_term",
    "policy_count",
    "sum_assured",
    "duration_mth"
]

data = [
    age_at_entry,
    sex,
    policy_term,
    policy_count,
    sum_assured,
    duration_mth
]

model_point_table = pd.DataFrame(dict(zip(attrs, data)), index=range(1, MPCount+1))
model_point_table.index.name = "policy_id"
model_point_table
[3]:
           age_at_entry  sex  policy_term  policy_count  sum_assured  duration_mth
policy_id
1                    47    M           10            86     622000.0             1
2                    29    M           20            56     752000.0           210
3                    51    F           10            83     799000.0            15
4                    32    F           20            72     422000.0           125
5                    28    M           15            99     605000.0            55
...                 ...  ...          ...           ...          ...           ...
9996                 47    M           20            25     827000.0           157
9997                 30    M           15            81     826000.0           168
9998                 45    F           20            10     783000.0           146
9999                 39    M           20             9     302000.0            11
10000                22    F           15            18     576000.0           166

10000 rows × 6 columns

[4]:
model_point_table.to_excel("model_point_table.xlsx")