The BasicTerm_SC Model#
Overview#
The BasicTerm_SC model is a variant of BasicTerm_S that is optimized for generating a compiled model using Cython with modelx-cython.
Like the original model, BasicTerm_SC can be exported as a pure Python model (also called a “nomx” model), which is written as standard Python objects and does not depend on modelx.
This exported model can then be compiled with Cython using the modelx-cython package (see the relevant blog post for more details on modelx-cython).
A compiled version of BasicTerm_SC runs about 7–8 times faster than its nomx model but yields the same results as BasicTerm_S.
With future improvements to modelx-cython, it is expected that the compiled model will run even faster. Although BasicTerm_S can also be compiled by modelx-cython, it achieves only about twice the speed of its nomx model, because it was not written to take full advantage of Cython optimizations.
The BasicTerm_SC model thus serves as an example of how to optimize a modelx model for generating a fast compiled version.
Optimization Strategy#
BasicTerm_SC is derived from BasicTerm_S by applying the following changes to make its cythonized model run faster:
- Use of primitive types: To accelerate cash flow projection over a large number of model points, high-level Python objects (such as strings, lists, dictionaries, and pandas DataFrames) are removed from the Projection space. Formulas in that space are instead written to operate primarily on primitive numeric types, such as int and float.
- Separate data and projection: Cells for reading input data from files have been moved to the Data space. Input data (such as policy attributes) is held in NumPy arrays instead of pandas DataFrames.
- Parameterize with array indices: The Projection space is parameterized by an array index, idx (rather than point_id), to identify model points. idx is used as a key to look up the value of the selected model point in the arrays of policy attributes.
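The following is a minimal sketch of the idea behind these changes, not the model's actual code. It assumes policy attributes are held in NumPy arrays and that a formula receives an integer index and returns a primitive number. The array values are the first five model points of the sample model point table; the function names `term_in_months` and `sum_assured_at` are hypothetical.

```python
import numpy as np

# First five model points of the sample model point table (illustrative subset).
policy_term = np.array([10, 20, 10, 20, 15], dtype=np.int64)
sum_assured = np.array([622000, 752000, 799000, 422000, 605000], dtype=np.float64)


def term_in_months(idx: int) -> int:
    """Return the policy term in months for the model point at array index idx."""
    return int(policy_term[idx]) * 12


def sum_assured_at(idx: int) -> float:
    """Return the sum assured of the model point at array index idx as a float."""
    return float(sum_assured[idx])


print(term_in_months(0))   # 120
print(sum_assured_at(2))   # 799000.0
```

Because the functions take and return only `int` and `float`, a Cython translation of code in this style can, in principle, use C-level types instead of Python objects.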
Basic Usage#
This section explains the steps to create the cythonized version of the BasicTerm_SC model.
Install C compiler, Cython, and modelx-cython#
In addition to lifelib and modelx, you need to install Cython and modelx-cython. Because Cython requires a C compiler, install one if necessary by following the instructions in Cython’s official documentation.
Then install Cython and modelx-cython using pip:
> pip install Cython
> pip install modelx-cython
Or using conda, if you’re using Anaconda:
> conda install Cython
> conda install modelx-cython
Copy the basiclife library#
Create your own copy of the basiclife library by following the steps in the Quick Start page. Within the copied folder, you will find a BasicTerm_SC subfolder that contains the model.
Because BasicTerm_SC is a modelx model, you can load and run it from IPython or Spyder (with the spyder-modelx plugin).
For example, in IPython with your current directory set to the location of BasicTerm_SC:
>>> import modelx as mx
>>> model = mx.read_model("BasicTerm_SC")
If you’re using Spyder, open the MxExplorer pane, right-click on an empty area, choose Read Model, then select the BasicTerm_SC folder.
Export and compile the model#
First, export the modelx model to create a pure-Python (nomx) model.
Use the export method on the model object as follows (IPython example):
>>> model.export("BasicTerm_SC_nomx")
This command exports the model as a Python package named “BasicTerm_SC_nomx” in the current directory.
You can test the exported (nomx) model by importing it and accessing its cells. For instance:
>>> from BasicTerm_SC_nomx import mx_model
>>> mx_model.Projection[0].result_pv()
>>> mx_model.Projection[9999].result_pv()
(Here, mx_model is the nomx model object.)
Cythonize the model#
The modelx-cython package provides a shell command named mx2cy.
To generate a compiled version of BasicTerm_SC, change to the directory containing the exported nomx model (e.g., the folder with BasicTerm_SC_nomx) and pass the path of the exported model to mx2cy:
> mx2cy BasicTerm_SC_nomx
To see help options, run:
> mx2cy --help
usage: mx2cy [-h] [--sample SAMPLE] [--spec SPEC] [--setup SETUP] [--translate-only | --compile-only] model_path
Translate an exported modelx model into Cython and compile it.
positional arguments:
model_path Path to an exported modelx model to translate into Cython
options:
-h, --help show this help message and exit
--sample SAMPLE Path to a sample file to run for collecting type information (default: sample.py)
--spec SPEC Path to a spec file for setting parameters (default: spec.py)
--setup SETUP Path to a setup file for Cython (default: setup.py)
--translate-only Perform translation only (default: False)
--compile-only Perform compilation only (default: False)
The mx2cy command requires two files, sample.py and spec.py, both included in the basiclife library. When run, mx2cy does the following:
- Executes sample.py to collect run-time type information for BasicTerm_SC_nomx.
- Outputs a folder, BasicTerm_SC_nomx_cy, containing Cython-translated files.
- Compiles the translated files into a binary module using Cython, producing the compiled model in BasicTerm_SC_nomx_cy.
Test the compiled model#
Use the run_sc.py script to test speed and memory usage. For the compiled (Cython) model:
> python run_sc.py --cython
{'value': 1448.9630534601538, 'mem_use': 477.5078125, 'time': 1.5274626000027638}
For comparison, run the nomx version:
> python run_sc.py --nomx
{'value': 1448.9630534601563, 'mem_use': 2050.42578125, 'time': 11.355696699989494}
The script calculates the present value of net cash flows for 10,000 model points and outputs the total present value, the maximum memory usage, and the time taken.
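As a quick check of the figures above, the two result dictionaries can be compared directly. This is only a sketch using the numbers printed by the sample runs; the variable names are hypothetical, and timings will differ from machine to machine.

```python
import math

# Results copied from the sample runs of run_sc.py shown above.
cython_run = {'value': 1448.9630534601538, 'mem_use': 477.5078125, 'time': 1.5274626000027638}
nomx_run = {'value': 1448.9630534601563, 'mem_use': 2050.42578125, 'time': 11.355696699989494}

# Ratio of run times and of peak memory use (nomx relative to compiled).
speedup = nomx_run['time'] / cython_run['time']
mem_ratio = nomx_run['mem_use'] / cython_run['mem_use']

# The two totals differ only by floating-point rounding.
same_value = math.isclose(nomx_run['value'], cython_run['value'], rel_tol=1e-12)

print(f"speedup: {speedup:.1f}x, memory ratio: {mem_ratio:.1f}x, same value: {same_value}")
```

For these sample figures the compiled model is a little over 7 times faster, in line with the 7–8 times stated in the overview.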
Model Specifications#
BasicTerm_SC consists of two spaces: Data and Projection.
- The Data space contains references to input files and cells for loading input data. In BasicTerm_S, these tasks were done in the Projection space. In BasicTerm_SC, the model point table is kept as a pandas DataFrame in model_point_table, and each attribute (e.g., policy_term) is stored in a NumPy array. The NumPy array index is used for identifying model points.
- The Projection space contains the projection logic for a single model point, parameterized by an array index idx (instead of point_id). Formulas here reference the NumPy arrays in Data.
The Data Space#
The Data space reads input data from internal files and provides the data to Projection in the form of NumPy arrays. Below is the list of references for input data and their associated files in this space.
Reference | Input File |
---|---|
model_point_table | model_point_table.xlsx |
mort_table | mort_table.xlsx |
disc_rate_ann | disc_rate_ann.xlsx |
Parameters and References
(In all the sample code below, the global variable BasicTerm_SC refers to the BasicTerm_SC model.)
- model_point_table#
This reference holds model point data as a pandas DataFrame read from the internal associated file, model_point_table.xlsx.
>>> BasicTerm_SC.Data.model_point_table
          age_at_entry sex  policy_term  policy_count  sum_assured
point_id
1                   47   M           10             1       622000
2                   29   M           20             1       752000
3                   51   F           10             1       799000
4                   32   F           20             1       422000
5                   28   M           15             1       605000
...                 ..          ...           ...          ...
9996                47   M           20             1       827000
9997                30   M           15             1       826000
9998                45   F           20             1       783000
9999                39   M           20             1       302000
10000               22   F           15             1       576000
[10000 rows x 5 columns]
The columns of the DataFrame represent model point attributes, such as age_at_entry, sex, policy_term, policy_count and sum_assured. The columns are referenced from the cells with the same names, and each of these cells returns its attribute for all model points as a NumPy array. Unlike BasicTerm_S, point_id is not used as the model point identifier. Instead, the array index is used.
- disc_rate_ann#
This reference holds annual discount rates by duration as a pandas Series read from the internal associated file, disc_rate_ann.xlsx.
>>> BasicTerm_SC.Data.disc_rate_ann
year
0      0.00000
1      0.00555
2      0.00684
3      0.00788
4      0.00866
        ...
146    0.03025
147    0.03033
148    0.03041
149    0.03049
150    0.03056
Name: disc_rate_ann, Length: 151, dtype: float64
This is referenced from disc_rate_ann_array(), which converts the Series into a NumPy array. disc_rate_ann_array() is in turn referenced from the Projection space.
- mort_table#
This reference holds a mortality table by age and duration as a DataFrame. The table is read from the associated internal file, mort_table.xlsx.
>>> BasicTerm_SC.Data.mort_table
            0         1         2         3         4         5
Age
18   0.000231  0.000254  0.000280  0.000308  0.000338  0.000372
19   0.000235  0.000259  0.000285  0.000313  0.000345  0.000379
20   0.000240  0.000264  0.000290  0.000319  0.000351  0.000386
21   0.000245  0.000269  0.000296  0.000326  0.000359  0.000394
22   0.000250  0.000275  0.000303  0.000333  0.000367  0.000403
..        ...       ...       ...       ...       ...       ...
116  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
117  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
118  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
119  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
120  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
[103 rows x 6 columns]
This is referenced from mort_table_array(), which converts the DataFrame into a NumPy array. mort_table_array() adds rows filled with nan to align the row index with age. mort_table_array() is referenced from the Projection space.
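The nan padding described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation of mort_table_array(): the helper name `to_age_indexed_array` is hypothetical, the table is reduced to its first three rows, and the index is assumed to hold consecutive integer ages in ascending order.

```python
import numpy as np
import pandas as pd

# First three rows of the sample mortality table (ages 18 to 20, durations 0 to 5).
mort_table = pd.DataFrame(
    [
        [0.000231, 0.000254, 0.000280, 0.000308, 0.000338, 0.000372],
        [0.000235, 0.000259, 0.000285, 0.000313, 0.000345, 0.000379],
        [0.000240, 0.000264, 0.000290, 0.000319, 0.000351, 0.000386],
    ],
    index=pd.Index([18, 19, 20], name="Age"),
)


def to_age_indexed_array(table: pd.DataFrame) -> np.ndarray:
    """Prepend nan rows so that row i of the result holds the rates for age i."""
    min_age = int(table.index.min())
    padding = np.full((min_age, table.shape[1]), np.nan)
    return np.vstack([padding, table.to_numpy(dtype=float)])


mort_array = to_age_indexed_array(mort_table)
print(mort_array.shape)    # (21, 6)
print(mort_array[19, 2])   # 0.000285
```

With this layout a rate can be looked up as `mort_array[age, duration]` using plain integer indices, with no label-based lookup on a DataFrame.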
Model point data#
The model point data is stored in an Excel file named model_point_table.xlsx under the model folder.
- Age at entry
- Sex
- Policy term
- Policy counts
- Point ID
- Sum assured
Assumption data#
The mortality table is stored in mort_table.xlsx under the model folder and read into mort_table as a DataFrame. mort_table_array() converts this DataFrame into a NumPy array, adding rows filled with nan to align row indices with ages. mort_table_array() is then used by mort_rate() in the Projection space.
The discount rate data is stored in disc_rate_ann.xlsx under the model folder and read into disc_rate_ann as a Series, which is then converted into a NumPy array.
- Mortality table as a NumPy array
- Annual discount rates
The Projection Space#
The Projection space includes projection logic for individual model points. Projection is derived from basiclife.BasicTerm_S.Projection by applying changes to make its compiled model run faster.
Policy attributes and other input data are read from Data, which is referenced as data in Projection. In Data, policy attributes, such as policy_term(), are returned as 1-dimensional NumPy arrays. Consequently, Projection is parameterized with idx, which represents the array index to identify model points.
Parameters and References
(In all the sample code below, the global variable BasicTerm_SC refers to the BasicTerm_SC model.)
- idx#
Array index to identify a model point. Policy attributes, such as policy_term(), are returned as 1-dimensional NumPy arrays in Data. idx is defined as a Reference, and its value is used for determining the selected model point. By default, 0 is assigned. To select another model point, assign its array index to it:
>>> BasicTerm_SC.Projection.idx = 2
idx is also defined as the parameter of the Projection Space, which makes it possible to create dynamic child spaces for multiple model points:
>>> BasicTerm_SC.Projection.parameters
('idx',)
>>> BasicTerm_SC.Projection[1]
<ItemSpace BasicTerm_SC.Projection[1]>
>>> BasicTerm_SC.Projection[2]
<ItemSpace BasicTerm_SC.Projection[2]>
Projection parameters#
This model represents new business, with all model points issued at time 0. The time step is monthly. Cash flows and other time-dependent variables are indexed by t. Flows that accrue during the period from t to t+1 are indexed by t, while balance items indexed by t represent the value at time t exactly.
- Projection length in months
- Duration in force in years
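The two quantities listed above can be illustrated with a short sketch. The formulas here are assumptions made for illustration, not taken from the model's source: the projection length is taken as 12 months per policy year plus one step for time 0, and the duration in years as the integer part of t / 12. The function names are hypothetical.

```python
def proj_len_months(policy_term_years: int) -> int:
    """Projection length in months, assuming one extra step for time 0."""
    return 12 * policy_term_years + 1


def duration_years(t: int) -> int:
    """Duration in force in years at month t, assuming integer division by 12."""
    return t // 12


print(proj_len_months(10))                      # 121
print(duration_years(11), duration_years(12))   # 0 1
```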
Model point data#
The same model_point_table.xlsx file under the model folder is referenced to obtain model point data such as ages, sum assured, and terms.
- The attained age at time t
- The age at entry of the selected model point
- The sex of the selected model point
- The sum assured of the selected model point
- The policy term of the selected model point
Assumptions#
mort_rate() reads annual mortality rates from mort_table_array() in Data, and mort_rate_mth() converts them to monthly rates.
disc_rate_mth() reads annual discount rates from disc_rate_ann_array() in Data and converts them to monthly rates, from which disc_factor() derives the discount factors.
lapse_rate() is defined as a simple function of policy duration.
expense_acq() is the acquisition expense per policy at t=0, and expense_maint() is the annual maintenance expense per policy, inflated at a constant rate (inflation_rate()).
- Mortality rate to be applied at time t
- Monthly mortality rate to be applied at time t
- Discount factor at time t
- Monthly discount rate
- Lapse rate
- Acquisition expense per policy
- Annual maintenance expense per policy
- The inflation factor at time t
- Inflation rate
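The annual-to-monthly conversions mentioned in this section can be sketched with the standard compounding formulas. These formulas are an assumption about how such conversions are usually done, not a copy of the model's code, and the function names are hypothetical. A single flat rate is used for simplicity, whereas the model reads rates that vary by duration.

```python
def annual_to_monthly_mort(q_annual: float) -> float:
    """Monthly mortality rate that compounds to q_annual over 12 months."""
    return 1.0 - (1.0 - q_annual) ** (1.0 / 12.0)


def annual_to_monthly_disc(r_annual: float) -> float:
    """Monthly discount rate equivalent to the annual rate r_annual."""
    return (1.0 + r_annual) ** (1.0 / 12.0) - 1.0


def discount_factor(r_monthly: float, t: int) -> float:
    """Discount factor for a cash flow at month t under a flat monthly rate."""
    return (1.0 + r_monthly) ** (-t)


q_mth = annual_to_monthly_mort(0.000231)   # sample annual rate at age 18, duration 0
r_mth = annual_to_monthly_disc(0.00555)    # sample annual discount rate for year 1
print(q_mth, r_mth, discount_factor(r_mth, 12))
```

Compounding the monthly figures over 12 months recovers the annual inputs, which is a convenient sanity check on either conversion.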
Policy values#
By default, the death benefit for each policy (claim_pp()) equals sum_assured. All model points pay monthly premiums for the entire policy term. The monthly premium per policy (premium_pp()) is calculated as (1 + loading_prem) * net_premium_pp, where the net premium is set so that the present value of net premiums equals the present value of claims. This product has no surrender value.
- Claim per policy
- Net premium per policy
- Loading per premium
- Monthly premium per policy
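The premium calculation described in this section can be sketched as below. The input figures are made-up round numbers, and the function names and signatures are hypothetical; the sketch only assumes that the net premium equals the present value of claims divided by the present value of policies in force (the annuity factor), which follows from equating the present values of net premiums and claims.

```python
def net_premium(pv_claims: float, pv_pols_if: float) -> float:
    """Level net premium that equates PV of net premiums with PV of claims."""
    return pv_claims / pv_pols_if


def gross_premium(net_prem: float, loading_prem: float) -> float:
    """Monthly premium per policy: (1 + loading_prem) * net premium."""
    return (1.0 + loading_prem) * net_prem


# Illustrative inputs only, not values produced by the model.
net = net_premium(pv_claims=6000.0, pv_pols_if=80.0)
gross = gross_premium(net, loading_prem=0.5)
print(net, gross)   # 75.0 112.5
```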
Policy decrement#
Initially, each model point is assumed to have one policy in force. The in-force policies decrease by lapses and deaths each month, and any remaining policies at the end of the policy term reach maturity and exit.
- Number of deaths occurring at time t
- Number of policies in-force
- Initial number of policies in-force
- Number of lapses occurring at time t
- Number of maturing policies
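The decrement logic can be illustrated with a simplified roll-forward. This sketch uses constant monthly rates and assumes that deaths are taken before lapses within each month; both are simplifications chosen for illustration, and the function name and rate values are hypothetical.

```python
def project_decrements(term_months, mort_rate_mth, lapse_rate_mth, pols_if_init=1.0):
    """Roll policies in force forward month by month with constant rates."""
    pols_if = [pols_if_init]
    deaths, lapses = [], []
    for t in range(term_months):
        death = pols_if[t] * mort_rate_mth
        lapse = (pols_if[t] - death) * lapse_rate_mth   # lapses apply to survivors
        deaths.append(death)
        lapses.append(lapse)
        pols_if.append(pols_if[t] - death - lapse)
    maturity = pols_if[term_months]   # remaining policies exit at the end of the term
    return pols_if, deaths, lapses, maturity


pols_if, deaths, lapses, maturity = project_decrements(120, 0.0001, 0.008)
total_exits = sum(deaths) + sum(lapses) + maturity
print(len(pols_if), round(total_exits, 10))
```

Every policy leaves through exactly one of the three exits, so deaths, lapses, and maturities add back up to the initial number of policies.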
Cashflows#
Cashflows consist of acquisition expenses at t=0, maintenance expenses thereafter, commissions, premiums, and claims. Commissions are assumed to be 100% of premium during the first policy year and zero afterward.
- Claims
- Commissions
- Premium income
- Acquisition and maintenance expenses
- Net cashflow
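A minimal sketch of the commission rule and the net cashflow follows. The function names and signatures are hypothetical, and the sign convention (premiums less claims, expenses, and commissions) is an assumption, although it agrees with the sample present values in the Results section.

```python
def commission(premium_income: float, t: int) -> float:
    """Commission at month t: 100% of premium in the first policy year, else zero."""
    return premium_income if t // 12 == 0 else 0.0


def net_cashflow(premiums: float, claims: float, expenses: float, commissions: float) -> float:
    """Net cashflow: premium income less claims, expenses, and commissions."""
    return premiums - claims - expenses - commissions


print(commission(94.84, 0))    # 94.84
print(commission(62.43, 12))   # 0.0
print(net_cashflow(100.0, 30.0, 10.0, 100.0))   # -40.0
```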
Present values#
Cells whose names begin with pv_ compute present values of various flows. Although pols_if() is not itself a cashflow, it is used as an annuity factor in net_premium_pp().
- Present value of claims
- Present value of commissions
- Present value of expenses
- Present value of net cashflows
- Present value of policies in-force
- Present value of premiums
- Check present value summation
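The kind of consistency check a present value summation performs can be reproduced by hand from the sample result_pv() output in the Results section. The snippet below is only a sketch on those printed figures: the tolerance allows for their six-decimal rounding, and the variable names are hypothetical.

```python
import math

# Present values copied from the sample result_pv() output.
pv = {
    "Premiums": 8251.931435,
    "Claims": 5501.074678,
    "Expenses": 748.303591,
    "Commissions": 1084.601434,
    "Net Cashflow": 917.951731,
}

# Net cashflow should equal premiums less claims, expenses, and commissions.
net_from_components = pv["Premiums"] - pv["Claims"] - pv["Expenses"] - pv["Commissions"]
consistent = math.isclose(net_from_components, pv["Net Cashflow"], abs_tol=1e-5)

# Each flow as a fraction of the present value of premiums.
ratios = {name: value / pv["Premiums"] for name, value in pv.items()}

print(consistent)   # True
print({name: round(ratio, 4) for name, ratio in ratios.items()})
```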
Results#
result_cf() returns a DataFrame of monthly cashflows and decrements for a selected model point:
>>> result_cf()
Premiums Claims ... Policies Death Policies Exits
0 94.840000 34.180793 ... 0.000055 0.008742
1 94.005734 33.880120 ... 0.000054 0.008665
2 93.178806 33.582091 ... 0.000054 0.008588
3 92.359153 33.286684 ... 0.000054 0.008513
4 91.546710 32.993876 ... 0.000053 0.008438
.. ... ... ... ... ...
116 62.432465 63.534771 ... 0.000102 0.001107
117 62.317757 63.418038 ... 0.000102 0.001105
118 62.203260 63.301519 ... 0.000102 0.001103
119 62.088973 63.185215 ... 0.000102 0.001101
120 0.000000 0.000000 ... 0.000000 0.000000
[121 rows x 8 columns]
result_pv() returns the present values of these cashflows, along with each flow’s percentage relative to the present value of premiums:
>>> result_pv()
Premiums Claims Expenses Commissions Net Cashflow
PV 8251.931435 5501.074678 748.303591 1084.601434 917.951731
% Premium 1.000000 0.666641 0.090682 0.131436 0.111241
- Result table of cashflows
- Result table of present value of cashflows