4. Profiling and Optimizing the Model
Last Update: 19 August 2023
This notebook demonstrates how to profile a modelx model and subsequently optimize it, using CashValue_ME_EX1 as an example. The optimized model, named CashValue_ME_EX4, is included in the savings library. It is approximately 3 to 4 times faster than CashValue_ME_EX1, though performance may vary depending on the specific hardware environment.
This approach to profiling and optimization is not exclusive to this model; it can be applied to any models in lifelib that use modelx and heavily depend on pandas. The profiling process is carried out through the following steps:
1. Measuring run time: Use the timeit standard library to measure the run time of CashValue_ME_EX1.
2. Profiling the model: Use the start_stacktrace, get_stacktrace, and stop_stacktrace functions of modelx to create a profile of the model.
3. Analyzing profiling results: Review the output to identify which cells are consuming the most time.
In our example, we will find that the most time-consuming cells are those heavily relying on pandas operations. This will lead us to the following optimization strategy:
Replace Pandas with Numpy: We will concentrate on replacing pandas DataFrames and Series with numpy arrays to avoid certain time-consuming pandas operations.
It’s important to recognize that these optimizations will come with trade-offs:
Readability: Switching from pandas objects to numpy arrays may reduce the data’s readability, as pandas objects often provide more descriptive indexes than integer-indexed numpy arrays.
Functionality: pandas offers richer operations on tabular data, so choosing numpy for speed over pandas for ease of use is a trade-off that needs careful consideration.
Ultimately, these insights and optimization methods will guide us toward a model that balances performance with functionality, based on specific project needs.
Click the badge below to run this notebook online on Google Colab. You need to be logged in to a Google account to run this notebook on Google Colab.
The next code cell below is relevant only when you run this notebook on Google Colab. It installs lifelib and creates a copy of the library for this notebook.
[1]:
import sys, os

if 'google.colab' in sys.modules:
    lib = 'savings'; lib_dir = '/content/'+ lib
    if not os.path.exists(lib_dir):
        !pip install lifelib
        import lifelib; lifelib.create(lib, lib_dir)
    %cd $lib_dir
Measuring the runtime of the model
The code below loads CashValue_ME_EX1 and assigns it to ex1.
[2]:
import timeit
import pandas as pd
import modelx as mx
[3]:
ex1 = mx.read_model('CashValue_ME_EX1')
In the default configuration of CashValue_ME_EX1, we have it set up for 1 model point across 10,000 scenarios. This means the computational demand is equivalent to running the calculations for 10,000 model points in just one scenario.
The default model point is identified by the product spec ID (spec_id) A, which doesn’t include any surrender charge. However, for the purpose of this example, we’re interested in observing how surrender charge rates are determined based on different product specs. So, we’ll be adjusting the model point table to refer to a more diverse set of model points with varying product specs.
[4]:
ex1.Projection.model_point_table = ex1.Projection.model_point_moneyness # Set multiple model points
ex1.Projection.model_point_table['spec_id'] = ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A'] # Set various spec IDs
ex1.Projection.model_point_moneyness
[4]:
| point_id | spec_id | age_at_entry | sex | policy_term | policy_count | sum_assured | duration_mth | premium_pp | av_pp_init | accum_prem_init_pp |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A | 20 | M | 10 | 100 | 500000 | 0 | 500000 | 0 | 0 |
| 2 | B | 20 | M | 10 | 100 | 500000 | 0 | 475000 | 0 | 0 |
| 3 | C | 20 | M | 10 | 100 | 500000 | 0 | 450000 | 0 | 0 |
| 4 | D | 20 | M | 10 | 100 | 500000 | 0 | 425000 | 0 | 0 |
| 5 | A | 20 | M | 10 | 100 | 500000 | 0 | 400000 | 0 | 0 |
| 6 | B | 20 | M | 10 | 100 | 500000 | 0 | 375000 | 0 | 0 |
| 7 | C | 20 | M | 10 | 100 | 500000 | 0 | 350000 | 0 | 0 |
| 8 | D | 20 | M | 10 | 100 | 500000 | 0 | 325000 | 0 | 0 |
| 9 | A | 20 | M | 10 | 100 | 500000 | 0 | 300000 | 0 | 0 |
The product specs by spec_id are defined in product_spec_table. The is_wl column indicates whether each type is whole life or not. To save run time and memory, let’s set is_wl to False for all the specs.
[5]:
ex1.Projection.product_spec_table['is_wl'] = False
ex1.Projection.product_spec_table
[5]:
| spec_id | premium_type | has_surr_charge | surr_charge_id | load_prem_rate | is_wl |
|---|---|---|---|---|---|
| A | SINGLE | False | NaN | 0.00 | False |
| B | SINGLE | True | type_1 | 0.00 | False |
| C | LEVEL | False | NaN | 0.10 | False |
| D | LEVEL | True | type_3 | 0.05 | False |
For the same reason, we reduce the number of scenarios from 10,000 to 1,000.
[6]:
ex1.Projection.scen_size = 1000
Now let’s see how much time the model takes for a run. The code below runs result_pv() and measures its run time with timeit. number=1 indicates the run is performed only once.
[7]:
timeit.timeit('ex1.Projection.result_pv()', globals=globals(), number=1)
[7]:
0.8517356999218464
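A single timing like the one above can be noisy. Although not part of this notebook, timeit.repeat can give a more stable measurement by repeating the timing and taking the minimum. A standalone sketch using a trivial statement in place of the model run:

```python
import timeit

# Time a simple statement 100 times per trial, over 5 trials.
times = timeit.repeat('sum(range(1000))', number=100, repeat=5)

# The minimum over trials is the least noisy estimate, since system
# load can only slow a run down, never speed it up.
print(f'best of 5: {min(times):.6f} seconds')
```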
Let’s output the mean of the present value of net cashflows of ex1, as we want to check it against the result of the optimized model.
[8]:
ex1.Projection.result_pv()['Net Cashflow'].mean()
[8]:
44386401.300826795
Profiling the run
To profile ex1, we use modelx’s feature to trace a run. modelx offers three functions, start_stacktrace, get_stacktrace, and stop_stacktrace, to start, output, and stop tracing the call stack during a run. The code block below is an idiomatic expression for using these functions:
try:
    mx.start_stacktrace(maxlen=None)
    ex1.Projection.result_pv()
    df = pd.DataFrame.from_dict(
        mx.get_stacktrace(summarize=True), orient="index")
finally:
    mx.stop_stacktrace()
    ex1.clear_all()
In this example, we want more concise output on which cells are taking time and how much, so we define a custom function that profiles a run and reports the result, based on the code block above.
[9]:
def get_time_info(model):
    try:
        mx.start_stacktrace(maxlen=None)
        model.Projection.result_pv()
        df = pd.DataFrame.from_dict(
            mx.get_stacktrace(summarize=True), orient="index")
    finally:
        mx.stop_stacktrace()
        model.clear_all()

    # Remove model and space names from index
    prefixlen = len(model.name + '.Projection.')
    df.index = [s[prefixlen:] for s in df.index]

    # Add duration %
    total = df['duration'].sum()
    df['dur_perc'] = df['duration'] * 100 / total
    df = df[['calls', 'duration', 'dur_perc']]

    return df.sort_values(['dur_perc'], ascending=False)
The code below performs a profiled run and outputs the 10 most time-consuming cells.
[10]:
ex1.clear_all() # Clear the result from the previous run
get_time_info(ex1).iloc[:10]
UserWarning: call stack trace activated
UserWarning: call stack trace deactivated
[10]:
| | calls | duration | dur_perc |
|---|---|---|---|
| premium_pp(t) | 121 | 0.168039 | 18.700078 |
| surr_charge_rate(t) | 121 | 0.152671 | 16.989868 |
| net_amt_at_risk(t) | 121 | 0.064587 | 7.187456 |
| claim_pp(t, kind) | 242 | 0.054906 | 6.110144 |
| pv_claims(kind) | 4 | 0.046301 | 5.152598 |
| pols_new_biz(t) | 121 | 0.039833 | 4.432807 |
| pols_lapse(t) | 121 | 0.034054 | 3.789694 |
| pols_if_at(t, timing) | 364 | 0.032744 | 3.643927 |
| claims(t, kind) | 484 | 0.032253 | 3.589191 |
| expenses(t) | 121 | 0.031707 | 3.528459 |
The output shows that premium_pp(t) and surr_charge_rate(t) are the most time-consuming cells, together accounting for more than 35% of the total run time. Both rely heavily on pandas operations. We examine the formula of surr_charge_rate(t) below.
Optimizing the model
surr_charge_rate(t) represents the surrender charge rates to be applied at time t. The surrender charge rates are defined by rate ID (such as type_1) and duration, and are stored in surr_charge_table as a DataFrame.
[11]:
ex1.Projection.surr_charge_table
[11]:
| duration | type_1 | type_2 | type_3 |
|---|---|---|---|
| 0 | 0.10 | 0.08 | 0.05 |
| 1 | 0.09 | 0.07 | 0.04 |
| 2 | 0.08 | 0.06 | 0.03 |
| 3 | 0.07 | 0.05 | 0.02 |
| 4 | 0.06 | 0.04 | 0.01 |
| 5 | 0.05 | 0.03 | 0.00 |
| 6 | 0.04 | 0.02 | 0.00 |
| 7 | 0.03 | 0.01 | 0.00 |
| 8 | 0.02 | 0.00 | 0.00 |
| 9 | 0.01 | 0.00 | 0.00 |
| 10 | 0.00 | 0.00 | 0.00 |
surr_charge_table_stacked() transforms the DataFrame into a Series by combining the row and column indexes of the DataFrame into a MultiIndex.
[12]:
ex1.Projection.surr_charge_table_stacked().head()
[12]:
duration
type_1 0 0.10
1 0.09
2 0.08
3 0.07
4 0.06
dtype: float64
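The stacking itself can be reproduced with plain pandas. Below is a standalone sketch (not the model’s actual formula) that turns a miniature rate table, with a duration index and rate-ID columns, into a Series keyed by (rate ID, duration), matching the layout shown above:

```python
import pandas as pd

# Hypothetical miniature of surr_charge_table: duration rows, rate-ID columns.
table = pd.DataFrame(
    {'type_1': [0.10, 0.09, 0.08], 'type_2': [0.08, 0.07, 0.06]},
    index=pd.Index([0, 1, 2], name='duration'),
)

# Transposing before stacking puts the rate ID in the outer index level
# and the duration in the inner level.
stacked = table.T.stack()
print(stacked[('type_1', 2)])  # -> 0.08
```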
surr_charge_rate(t) looks up surr_charge_table_stacked() using a MultiIndex key, which is created from surr_charge_id(), duration(t), and other cells, as in the formula definition below.
def surr_charge_rate(t):
    idx = pd.MultiIndex.from_arrays(
        [has_surr_charge() * surr_charge_id(),
         np.minimum(duration(t), surr_charge_max_idx())])

    return surr_charge_table_stacked().reindex(idx, fill_value=0).set_axis(
        model_point().index, inplace=False)
Many of the formulas in our model rely heavily on pandas operations. However, we’ve found that we can achieve the same results at a significantly lower computational cost by refactoring these formulas. The key is to use numpy arrays in place of pandas DataFrames or Series.
For instance, the formula surr_charge_rate(t) can be reformulated as:
def surr_charge_rate(t):
    ind_row = np.minimum(duration(t), surr_charge_max_idx())
    return surr_charge_table.values.flat[
        surr_charge_table_column() + ind_row * len(surr_charge_table.columns)]
Here, we introduce a new function, surr_charge_table_column(), which is defined as:
def surr_charge_table_column():
    return surr_charge_table.columns.searchsorted(
        surr_charge_id(), side='right') - 1
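The searchsorted trick maps each rate ID to its column position, provided the column labels are sorted. A standalone sketch with hypothetical IDs:

```python
import numpy as np
import pandas as pd

# Sorted column labels, as in surr_charge_table.
columns = pd.Index(['type_1', 'type_2', 'type_3'])

# For sorted labels, searchsorted(side='right') - 1 returns the position
# of each matching label, vectorized over all model points at once.
ids = np.array(['type_3', 'type_1', 'type_2'])
positions = columns.searchsorted(ids, side='right') - 1
print(positions)  # -> [2 0 1]
```

Note that an ID not present in the labels (for example an empty string) would map to position -1 rather than raise an error, so this pattern trades pandas' safety checks for speed.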
In this revised version, surr_charge_rate(t) outputs a numpy array for each t rather than a Series. Similarly, other parts of the model can be adjusted to yield numpy arrays instead of Series or DataFrames.
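The flat-index lookup works because, in a row-major (C-ordered) array, element (row, col) sits at flat position row * ncols + col. A minimal sketch with hypothetical inputs, verifying the equivalence with ordinary fancy indexing:

```python
import numpy as np

# Hypothetical 2D rate table: rows = duration, columns = rate type.
rates = np.array([[0.10, 0.08, 0.05],
                  [0.09, 0.07, 0.04],
                  [0.08, 0.06, 0.03]])

rows = np.array([0, 2, 1])   # duration index per model point
cols = np.array([1, 0, 2])   # column position per model point

# One flat index per model point, in one vectorized operation.
looked_up = rates.flat[cols + rows * rates.shape[1]]
print(looked_up)  # -> [0.08 0.08 0.04]

# Same result as plain row/column fancy indexing.
assert np.allclose(looked_up, rates[rows, cols])
```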
Furthermore, a contributor conducted an in-depth analysis of the present value logic within the model and found a way to express it more succinctly, making it both clearer and faster than the original implementation. Since the present value of each cash flow item is computed only once, the runtime gain from this refactoring is small, but it makes the model cleaner overall.
All these refinements have been integrated into CashValue_ME_EX4, which is now part of our library. For a comprehensive comparison between the original and updated versions, you can view the link below:
In many of the updated formulas, you might notice the .values expression at the end. This property, when accessed on a DataFrame or Series, returns the contained data as a numpy array, bypassing the DataFrame or Series container.
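The point of dropping down to .values is that plain numpy arithmetic skips pandas' index alignment and the construction of a new Series. A standalone sketch of the pattern, with hypothetical data:

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
s2 = pd.Series([10.0, 20.0, 30.0], index=['a', 'b', 'c'])

# Series arithmetic aligns on the index and builds a new Series.
as_series = s1 + s2

# .values operates on the underlying numpy arrays: same numbers,
# but no alignment and no Series construction.
as_array = s1.values + s2.values
print(as_array)  # -> [11. 22. 33.]
```

The cost of this shortcut is that alignment no longer happens, so the caller must guarantee the arrays are in matching order.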
With all that said, let’s now evaluate the performance of our optimized model.
[13]:
ex4 = mx.read_model('CashValue_ME_EX4')
timeit.timeit('ex4.Projection.result_pv()', globals=globals(), number=1)
[13]:
0.23794609995093197
[14]:
ex4.Projection.result_pv()['Net Cashflow'].mean()
[14]:
44386401.30082682
[15]:
ex4.clear_all() # Clear the result from the previous run
get_time_info(ex4).iloc[:10]
UserWarning: call stack trace activated
UserWarning: call stack trace deactivated
[15]:
| | calls | duration | dur_perc |
|---|---|---|---|
| premium_pp(t) | 121 | 0.031252 | 12.218230 |
| surr_charge_rate(t) | 121 | 0.025487 | 9.964533 |
| av_pp_at(t, timing) | 485 | 0.024317 | 9.507138 |
| pv_claims(kind) | 4 | 0.018868 | 7.376668 |
| pols_lapse(t) | 121 | 0.015675 | 6.128363 |
| std_norm_rand() | 1 | 0.015664 | 6.123889 |
| pols_new_biz(t) | 121 | 0.015619 | 6.106272 |
| pols_if_at(t, timing) | 364 | 0.015589 | 6.094620 |
| pv_commissions() | 1 | 0.013675 | 5.346495 |
| maint_fee_pp(t) | 121 | 0.010906 | 4.264016 |