4. Profiling and Optimizing the Model#

Last Update: 19 August 2023

This notebook demonstrates how to profile a modelx model and subsequently optimize it by using CashValue_ME_EX1 as an example. The optimized model, named CashValue_ME_EX4, is included in the savings library. It is approximately 3 to 4 times faster than CashValue_ME_EX1, though performance may vary depending on the specific hardware environment.

This approach to profiling and optimization is not exclusive to this model; it can be applied to any models in lifelib that use modelx and heavily depend on pandas. The profiling process is carried out through the following steps:

Measuring run time: Use the timeit standard library to measure the run time of CashValue_ME_EX1.
Profiling the Model: Utilize the start_stacktrace, get_stacktrace, and stop_stacktrace functions of modelx to create a profile of the model.
Analyzing Profiling Results: Review the output to identify which cells are consuming the most time.

In our example, we will find that the most time-consuming cells are those heavily relying on pandas operations. This will lead us to the following optimization strategy:

Replace Pandas with Numpy: We will concentrate on replacing pandas DataFrames and Series with numpy arrays to avoid certain time-consuming pandas operations.

It’s important to recognize that these optimizations will come with trade-offs:

Readability: Switching from pandas objects to numpy arrays may reduce the data’s readability, as pandas objects often provide more descriptive indexes than integer-indexed numpy arrays.
Functionality: While pandas offers more complex operations on tabular data, the decision between using numpy for speed or pandas for ease of use will be a choice that needs careful consideration.

Ultimately, these insights and optimization methods will guide us toward a model that balances performance with functionality, based on specific project needs.

Click the badge below to run this notebook online on Google Colab. You need a Google account and need to be logged in to it to run this notebook on Google Colab.

The next code cell below is relevant only when you run this notebook on Google Colab. It installs lifelib and creates a copy of the library for this notebook.

[1]:

import sys, os

if 'google.colab' in sys.modules:
    lib = 'savings'; lib_dir = '/content/'+ lib
    if not os.path.exists(lib_dir):
        !pip install lifelib
        import lifelib; lifelib.create(lib, lib_dir)

    %cd $lib_dir

Measuring the runtime of the model#

The code below loads CashValue_ME_EX1, assigns it to ex1.

[2]:

import timeit
import pandas as pd
import modelx as mx

[3]:

ex1 = mx.read_model('CashValue_ME_EX1')

In the default configuration of CashValue_ME_EX1, we have it set up for 1 model point across 10,000 scenarios. This means the computational demand is equivalent to running the calculations for 10,000 model points in just one scenario.

The default model point is identified by the product spec id (spec_id) A, which doesn’t include any surrender charge. However, for the purpose of this example, we’re interested in observing how surrender charge rates are determined based on different product specs. So, we’ll be adjusting the model point table to refer to a more diverse set of model points with varying product specs.

[4]:

ex1.Projection.model_point_table = ex1.Projection.model_point_moneyness  # Set multiple model points
ex1.Projection.model_point_table['spec_id'] = ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A'] # Set various spec IDs
ex1.Projection.model_point_moneyness

[4]:

	spec_id	age_at_entry	sex	policy_term	policy_count	sum_assured	duration_mth	premium_pp	av_pp_init	accum_prem_init_pp
point_id
1	A	20	M	10	100	500000	0	500000	0	0
2	B	20	M	10	100	500000	0	475000	0	0
3	C	20	M	10	100	500000	0	450000	0	0
4	D	20	M	10	100	500000	0	425000	0	0
5	A	20	M	10	100	500000	0	400000	0	0
6	B	20	M	10	100	500000	0	375000	0	0
7	C	20	M	10	100	500000	0	350000	0	0
8	D	20	M	10	100	500000	0	325000	0	0
9	A	20	M	10	100	500000	0	300000	0	0

The product specs by spec_id are defined in product_spec_table. The is_wl column indicates whether each type is whole life or not. To save run time and memory, let’s set is_wl to False for all the specs.

[5]:

ex1.Projection.product_spec_table['is_wl'] = False
ex1.Projection.product_spec_table

[5]:

	premium_type	has_surr_charge	surr_charge_id	load_prem_rate	is_wl
spec_id
A	SINGLE	False	NaN	0.00	False
B	SINGLE	True	type_1	0.00	False
C	LEVEL	False	NaN	0.10	False
D	LEVEL	True	type_3	0.05	False

For the same reason, we reduce the number of scenarios from 10,000 to 1000.

[6]:

ex1.Projection.scen_size = 1000

Now let’s see how much time the model takes for a run. The code below calculates result_pv() by measuring the run time by timeit. number=1 indicates the run is performed only once.

[7]:

timeit.timeit('ex1.Projection.result_pv()', globals=globals(), number=1)

[7]:

0.8517356999218464

Let’s output the mean of the present value of net cashflows of ex1, as we want to check it against the result of the optimized model.

[8]:

ex1.Projection.result_pv()['Net Cashflow'].mean()

[8]:

44386401.300826795

Profiling the run#

To profile ex1, we use modelx’s feature to trace a run. modelx offers 3 functions, start_stacktrace, get_stacktrace and stop_stacktrace, to start, output and stop tracing the call stack during a run. The code block below is an idiomatic expression for using the functions:

try:
    mx.start_stacktrace(maxlen=None)
    ex1.Projection.result_pv()
    df = pd.DataFrame.from_dict(
        mx.get_stacktrace(summarize=True), orient="index")
finally:
    mx.stop_stacktrace()
    ex1.clear_all()

In this example, we want more concise output on what cells are taking time and how much, so we define our custom function that profiles and reports a run using the code block above.

[9]:

def get_time_info(model):

    try:
        mx.start_stacktrace(maxlen=None)
        model.Projection.result_pv()
        df = pd.DataFrame.from_dict(
            mx.get_stacktrace(summarize=True), orient="index")
    finally:
        mx.stop_stacktrace()
        model.clear_all()

    # Remove model and space names from index
    prefixlen = len(model.name + '.Projection.')
    df.index = [s[prefixlen:] for s in df.index]

    # Add duration %
    total = df['duration'].sum()
    df['dur_perc'] = df['duration'] * 100 / total
    df = df[['calls', 'duration', 'dur_perc']]

    return df.sort_values(['dur_perc'], ascending=False)

The code below performs a profile run, and output 10 most time-consuming cells.

[10]:

ex1.clear_all() # Clear the result from the previous run
get_time_info(ex1).iloc[:10]

UserWarning: call stack trace activated
UserWarning: call stack trace deactivated

[10]:

	calls	duration	dur_perc
premium_pp(t)	121	0.168039	18.700078
surr_charge_rate(t)	121	0.152671	16.989868
net_amt_at_risk(t)	121	0.064587	7.187456
claim_pp(t, kind)	242	0.054906	6.110144
pv_claims(kind)	4	0.046301	5.152598
pols_new_biz(t)	121	0.039833	4.432807
pols_lapse(t)	121	0.034054	3.789694
pols_if_at(t, timing)	364	0.032744	3.643927
claims(t, kind)	484	0.032253	3.589191
expenses(t)	121	0.031707	3.528459

The output tells that surr_charge_rate(t) is consuming time the most, which is more than 40% of the total run time. Its fomula looks like below.

Optimizing the model#

surr_charge_rate(t) represents the surrener charge rates to be applied at time t. The surrender charge rates are defined by rate ID (such as type_1) and duration, and stored in surr_charge_table as a DataFrame.

[11]:

ex1.Projection.surr_charge_table

[11]:

	type_1	type_2	type_3
duration
0	0.10	0.08	0.05
1	0.09	0.07	0.04
2	0.08	0.06	0.03
3	0.07	0.05	0.02
4	0.06	0.04	0.01
5	0.05	0.03	0.00
6	0.04	0.02	0.00
7	0.03	0.01	0.00
8	0.02	0.00	0.00
9	0.01	0.00	0.00
10	0.00	0.00	0.00

surr_charge_table_stacked() transforms the DataFrame into a Series by combining the row and column indexes of the DataFrame into a MultiIndex.

[12]:

ex1.Projection.surr_charge_table_stacked().head()

[12]:

        duration
type_1  0           0.10
        1           0.09
        2           0.08
        3           0.07
        4           0.06
dtype: float64

surr_charge_rate(t) looks up surr_charge_table_stacked() using a MultiIndex key, which is created from surr_charge_id() and duration(t) and other cells as in the formula definition below.

def surr_charge_rate(t):
    idx = pd.MultiIndex.from_arrays(
        [has_surr_charge() * surr_charge_id(),
         np.minimum(duration(t), surr_charge_max_idx())])

    return surr_charge_table_stacked().reindex(idx, fill_value=0).set_axis(
        model_point().index, inplace=False)

Many of the formulas in our model rely heavily on pandas operations. However, we’ve found that we can achieve the same results at a significantly lower computational cost by refactoring these formulas. The key is to use numpy arrays in place of pandas DataFrames or Series.

For instance, the formula surr_charge_rate(t) can be reformulated as:

def surr_charge_rate(t):
    ind_row = np.minimum(duration(t), surr_charge_max_idx())
    return surr_charge_table.values.flat[
        surr_charge_table_column() + ind_row * len(surr_charge_table.columns)]

Here, we introduce a new function, surr_charge_table_column(), which is defined as:

def surr_charge_table_column():
    return surr_charge_table.columns.searchsorted(
        surr_charge_id(), side='right') - 1

In this revised version, surr_charge_rate(t) outputs a numpy array for each t rather than a Series. Similarly, other parts of our model can be adjusted to yield numpy arrays instead of Series or DataFrames.

Furthermore, a contributor conducted an in-depth analysis of the present value logic within the model. He found a way to express it more succinctly, making it both clearer and faster compared to the original implementation. Even though the present value for each cash flow item is computed only once, leading to minimal gains in runtime improvement, this refactoring makes the model more efficient overall.

All these refinements have been integrated into CashValue_ME_EX4, which is now part of our library. For a comprehensive comparison between the original and updated versions, you can view the link below:

View the full comparison here

In many of the updated formulas, you might notice the .values expression at the end. This property, when invoked on a DataFrame or Series, returns the contained data as a numpy array, bypassing the DataFrame or Series container.

With all that said, let’s now evaluate the performance of our optimized model.

[13]:

ex4 = mx.read_model('CashValue_ME_EX4')
timeit.timeit('ex4.Projection.result_pv()', globals=globals(), number=1)

[13]:

0.23794609995093197

[14]:

ex4.Projection.result_pv()['Net Cashflow'].mean()

[14]:

44386401.30082682

[15]:

ex4.clear_all() # Clear the result from the previous run
get_time_info(ex4).iloc[:10]

UserWarning: call stack trace activated
UserWarning: call stack trace deactivated

[15]:

	calls	duration	dur_perc
premium_pp(t)	121	0.031252	12.218230
surr_charge_rate(t)	121	0.025487	9.964533
av_pp_at(t, timing)	485	0.024317	9.507138
pv_claims(kind)	4	0.018868	7.376668
pols_lapse(t)	121	0.015675	6.128363
std_norm_rand()	1	0.015664	6.123889
pols_new_biz(t)	121	0.015619	6.106272
pols_if_at(t, timing)	364	0.015589	6.094620
pv_commissions()	1	0.013675	5.346495
maint_fee_pp(t)	121	0.010906	4.264016