Deep IV Factor Model
The DIVFM module implements the Deep Implied Volatility Factor Model from Gauthier, Godin & Legros (2025). The IV surface on a given day \(t\) is modelled as a linear combination of \(p\) fixed latent functions learned by a neural network:
\[\mathrm{IV}_t(M, \tau) = \sum_{i=1}^{p} \beta_{t,i}\, f_i(M, \tau; \theta) = \boldsymbol{\beta}_t^\top \mathbf{f}(M, \tau; \theta)\]
where \(M = \frac{1}{\sqrt{\tau}}\log\!\left(\frac{K}{F_{t,\tau}}\right)\) is the time-scaled moneyness, \(\mathbf{f} = (f_1, \ldots, f_p)\) is a feedforward neural network whose weights \(\theta\) are fixed after training and shared across all days, and \(\boldsymbol{\beta}_t\) are daily coefficients fitted in closed form via OLS.
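As a rough illustration of the definitions above, the following sketch computes the time-scaled moneyness and evaluates a surface as a linear combination of factors. The `toy_factors` function is a hypothetical stand-in for the trained network, and the coefficient values are made up for the example.

```python
import numpy as np


def time_scaled_moneyness(strike, forward, ttm):
    """Return M = log(K / F) / sqrt(tau)."""
    strike = np.asarray(strike, dtype=float)
    ttm = np.asarray(ttm, dtype=float)
    return np.log(strike / forward) / np.sqrt(ttm)


def toy_factors(m, ttm):
    """Hypothetical stand-in for the network f, with p = 3 columns."""
    return np.column_stack([np.ones_like(m), np.exp(-ttm), m**2])


strikes = np.array([90.0, 100.0, 110.0])
ttm = np.array([0.25, 0.25, 0.25])
m = time_scaled_moneyness(strikes, forward=100.0, ttm=ttm)

# Illustrative daily coefficients beta_t (not fitted to any data)
beta_t = np.array([0.2, 0.05, 0.1])

# IV_t(M, tau) = sum_i beta_{t,i} * f_i(M, tau)
iv = toy_factors(m, ttm) @ beta_t
```

Only `beta_t` changes from day to day in this picture; the factor functions stay fixed once trained.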
Inference (no torch required)
quantflow.options.divfm.DIVFMPricer
pydantic-model
Bases: OptionPricerBase
Option pricer based on the Deep Implied Volatility Factor Model (DIVFM).
The IV surface on a given day is modelled as a linear combination of p fixed latent functions learned by a neural network:
IV_t(M, tau) = sum_{i=1}^{p} beta_{t,i} * f_i(M, tau)
where M = log(K/F) / sqrt(tau) is the time-scaled moneyness, f is implemented by DIVFMWeights, and beta_t are daily coefficients computed in closed form via OLS.
Call prices are derived from the IV surface via Black-Scholes.
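To make the last step concrete, here is a minimal sketch of turning an implied volatility into a call price with the Black (1976) formula on the forward. It assumes an undiscounted price expressed in forward terms; the function name `black_call` is illustrative and is not taken from the library, which may normalise prices differently.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(forward, strike, ttm, iv):
    """Undiscounted Black (1976) call price for a given implied volatility."""
    std = iv * math.sqrt(ttm)
    d1 = (math.log(forward / strike) + 0.5 * std * std) / std
    d2 = d1 - std
    return forward * norm_cdf(d1) - strike * norm_cdf(d2)


# At-the-money call, one year to maturity, 20% implied volatility
atm = black_call(forward=100.0, strike=100.0, ttm=1.0, iv=0.2)
```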
Usage
- Train a DIVFMNetwork and call its to_weights() method to obtain a DIVFMWeights instance.
- Construct this pricer with those weights.
- Call calibrate with the day's observed implied volatilities to fit beta_t.
- Use maturity, price, etc. as normal.
Fields:
- ttm (dict[int, MaturityPricer])
- weights (DIVFMWeights)
- betas (FloatArray)
- extra (FloatArray | None)
weights
pydantic-field
Extracted weights of the trained DIVFM network. No torch dependency required at inference time
extra
pydantic-field
Current day's observable features X, shape (extra_features,). Broadcast across all grid points in _compute_maturity. Set automatically by calibrate() when extra is provided
reset
maturity
Get a MaturityPricer from cache or compute a new one and return it
Source code in quantflow/options/pricer.py
price
Price a single option
This method will use the cache to get the maturity pricer if possible
| PARAMETER | DESCRIPTION |
|---|---|
| option_type | Type of the option (call or put) |
| ttm | Time to maturity |
| strike | Strike price of the option |
| forward | Forward price of the underlying |
Source code in quantflow/options/pricer.py
call_prices
Price a batch of call options.
Options are grouped by their ttm so each unique maturity pricer is
retrieved (and cached) once and the corresponding log-strikes are
interpolated in a single vectorised np.interp call.
| PARAMETER | DESCRIPTION |
|---|---|
| ttms | Vector of times to maturity |
| log_strikes | Vector of log-strikes log(K/F) |
Source code in quantflow/options/pricer.py
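The grouping strategy described for call_prices can be sketched as follows. This is an illustration of the technique only, not the library's code: `grid_for_ttm` is a hypothetical callable standing in for the cached maturity pricer, returning a log-strike grid and the values tabulated on it.

```python
import numpy as np


def batch_interp(ttms, log_strikes, grid_for_ttm):
    """Interpolate one value per option, visiting each unique maturity once."""
    ttms = np.asarray(ttms, dtype=float)
    log_strikes = np.asarray(log_strikes, dtype=float)
    out = np.empty_like(log_strikes)
    unique_ttms, inverse = np.unique(ttms, return_inverse=True)
    for i, ttm in enumerate(unique_ttms):
        mask = inverse == i
        # Hypothetical lookup: one grid per maturity, fetched a single time
        x_grid, y_grid = grid_for_ttm(ttm)
        # Single vectorised call for all options sharing this maturity
        out[mask] = np.interp(log_strikes[mask], x_grid, y_grid)
    return out


def toy_grid(ttm):
    """Toy grid whose values are linear in the log-strike."""
    x = np.linspace(-1.0, 1.0, 5)
    return x, ttm * x


ttms = np.array([0.5, 1.0, 0.5, 1.0])
log_strikes = np.array([-0.2, 0.0, 0.3, 0.7])
values = batch_interp(ttms, log_strikes, toy_grid)
```

With four options and two distinct maturities, the grid lookup runs twice rather than four times.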
plot3d
Plot the implied volatility surface.
Requires plotly to be installed.
Source code in quantflow/options/pricer.py
calibrate
Fit daily OLS coefficients from observed implied volatilities.
Given a set of options observed on a single day, computes the closed-form OLS estimate:
beta_t = (F^T F)^{-1} F^T IV_t
where F is the (N, p) matrix of factor values from the network.
Parameters
- moneyness_ttm: Shape (N,). Time-scaled moneyness M = log(K/F) / sqrt(tau).
- ttm: Shape (N,). Time-to-maturity tau in years.
- implied_vols: Shape (N,). Observed implied volatilities.
- extra: Shape (N, extra_features) or None. Additional features passed to the network (e.g. time-to-earnings-announcement).
Source code in quantflow/options/divfm/pricer.py
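The closed-form estimate used by calibrate can be reproduced in a few lines of numpy. The sketch below is illustrative: the factor matrix is built from hypothetical stand-in functions rather than a trained network, and `fit_betas` is not a library function.

```python
import numpy as np


def fit_betas(factors, implied_vols):
    """Closed-form OLS: beta_t = (F^T F)^{-1} F^T IV_t."""
    gram = factors.T @ factors
    # Solve the normal equations instead of forming the inverse explicitly
    return np.linalg.solve(gram, factors.T @ implied_vols)


rng = np.random.default_rng(0)
m = rng.uniform(-1.0, 1.0, size=50)
ttm = rng.uniform(0.05, 2.0, size=50)

# Hypothetical (N, p) factor matrix with p = 3
F = np.column_stack([np.ones_like(m), np.exp(-ttm), m**2])

# Synthetic implied vols generated from known coefficients
true_beta = np.array([0.2, 0.05, 0.1])
iv = F @ true_beta

beta_t = fit_betas(F, iv)
```

Because the synthetic vols lie exactly in the span of the factors, the fit recovers `true_beta` up to floating-point error.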
quantflow.options.divfm.DIVFMWeights
pydantic-model
Bases: BaseModel
Extracted weights of a trained DIVFMNetwork.
Implements the full network forward pass in pure numpy so that DIVFMPricer has no torch dependency at inference time.
Obtain an instance from a trained network via DIVFMNetwork.to_weights.
Fields:
- subnet_ttm (SubnetWeights)
- subnet_moneyness (SubnetWeights)
- subnet_joint (SubnetWeights | None)
- num_factors (int)
- extra_features (int)
subnet_joint
pydantic-field
Weights for the joint (M, tau) sub-network (f_4 ... f_p). None when num_factors == 3
extra_features
pydantic-field
Number of additional observable features X beyond (M, tau)
forward
Compute factor values for a batch of options.
Parameters
- moneyness_ttm: Shape (N,). Time-scaled moneyness M = log(K/F) / sqrt(tau).
- ttm: Shape (N,). Time-to-maturity tau in years.
- extra: Shape (N, extra_features) or None. Additional observable features.
Returns
- FloatArray: Shape (N, num_factors). Factor values [f_1, f_2, ..., f_p].
Source code in quantflow/options/divfm/weights.py
quantflow.options.divfm.weights.SubnetWeights
pydantic-model
Bases: BaseModel
Extracted weights for one sub-network (hidden layers + output layer).
Fields:
- layers (list[LayerWeights])
quantflow.options.divfm.weights.LayerWeights
pydantic-model
Bases: BaseModel
Weights for a single linear layer with batch normalization.
Combines the linear transform, optional sigmoid activation, and batch normalization into one unit matching the structure of each block in DIVFMNetwork.
Fields:
- weight (FloatArray)
- bias (FloatArray)
- bn_mean (FloatArray)
- bn_var (FloatArray)
- bn_gamma (FloatArray | None)
- bn_beta (FloatArray | None)
- bn_eps (float)
- apply_activation (bool)
bn_gamma
pydantic-field
Batch norm learnable scale (gamma), shape (out,). None for fixed (affine=False) output normalization
bn_beta
pydantic-field
Batch norm learnable shift (beta), shape (out,). None for fixed (affine=False) output normalization
apply_activation
pydantic-field
Whether to apply sigmoid activation before batch norm. True for hidden layers, False for the output layer
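A pure-numpy forward pass for one such block might look like the sketch below, following the order stated above (linear transform, optional sigmoid, then batch normalization with stored statistics). The function name and the `(out, in)` weight layout are assumptions made for illustration; the layout mirrors the `torch.nn.Linear` convention.

```python
import numpy as np


def layer_forward(x, weight, bias, bn_mean, bn_var, bn_gamma=None,
                  bn_beta=None, bn_eps=1e-5, apply_activation=True):
    """Linear -> optional sigmoid -> batch norm using stored statistics."""
    # Assumed layout: weight has shape (out, in), x has shape (N, in)
    z = x @ weight.T + bias
    if apply_activation:
        z = 1.0 / (1.0 + np.exp(-z))
    # Normalise with the stored mean and variance (inference mode)
    z = (z - bn_mean) / np.sqrt(bn_var + bn_eps)
    # gamma and beta are None for non-affine output normalization
    if bn_gamma is not None:
        z = z * bn_gamma
    if bn_beta is not None:
        z = z + bn_beta
    return z


x = np.zeros((4, 2))
out = layer_forward(
    x,
    weight=np.ones((3, 2)),
    bias=np.zeros(3),
    bn_mean=np.full(3, 0.5),   # sigmoid(0) = 0.5, so the normalised value is 0
    bn_var=np.ones(3),
    bn_gamma=np.full(3, 2.0),
    bn_beta=np.ones(3),
)
```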
Training (requires quantflow[ml])
quantflow.options.divfm.network.DIVFMNetwork
Bases: Module
Neural network implementing the latent factor functions
Produces \(p\) factor functions with the following structural constraints (as in Gauthier, Godin & Legros, 2025):
- \(f_1 = 1\): constant, not learned
- \(f_2(\tau, X)\): depends only on time-to-maturity and the optional extra features \(X\)
- \(f_3(M)\): depends only on time-scaled moneyness
- \(f_4, \ldots, f_p(M, \tau, X)\): unrestricted
These structural constraints improve interpretability by associating each factor with a specific dimension of the implied volatility surface.
The network uses sigmoid activations throughout to ensure the implied volatility surface is twice continuously differentiable in the strike dimension, which is required for a well-defined risk-neutral density.
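The structural constraints listed above amount to routing different inputs to different sub-networks and concatenating the outputs. The sketch below shows that assembly with hypothetical callables in place of the trained sub-networks; extra features are omitted, and the column order fed to the joint function is an assumption.

```python
import numpy as np


def assemble_factors(m, ttm, f_ttm, f_moneyness, f_joint=None):
    """Build the (N, p) factor matrix [f_1, f_2, f_3, f_4, ..., f_p]."""
    ones = np.ones((m.shape[0], 1))           # f_1 = 1, not learned
    cols = [
        ones,
        f_ttm(ttm[:, None]),                  # f_2 sees tau only
        f_moneyness(m[:, None]),              # f_3 sees M only
    ]
    if f_joint is not None:
        # f_4 ... f_p see both inputs (assumed order: M, then tau)
        cols.append(f_joint(np.column_stack([m, ttm])))
    return np.concatenate(cols, axis=1)


m = np.linspace(-1.0, 1.0, 5)
ttm = np.full(5, 0.5)
F = assemble_factors(
    m,
    ttm,
    f_ttm=lambda t: np.exp(-t),               # hypothetical stand-ins
    f_moneyness=lambda x: x**2,
    f_joint=lambda z: z[:, :1] * z[:, 1:],
)
```

With `f_joint=None` the matrix has exactly three columns, which corresponds to the num_factors == 3 case.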
| PARAMETER | DESCRIPTION |
|---|---|
| num_factors | Total number of factors p (including the constant \(f_1\)). Must be greater than or equal to 3 to satisfy the structural constraints |
| hidden_size | Number of neurons per hidden layer |
| num_hidden_layers | Number of hidden layers, L - 2 (the default of 3 gives L = 5 layers in total) |
| extra_features | Number of additional observable features X beyond (M, tau), e.g. time-to-earnings-announcement |
Source code in quantflow/options/divfm/network.py
subnet_ttm
instance-attribute
subnet_ttm = _make_subnet(input_size=1 + extra_features, hidden_size=hidden_size, num_hidden_layers=num_hidden_layers, output_size=1)
subnet_moneyness
instance-attribute
subnet_moneyness = _make_subnet(input_size=1, hidden_size=hidden_size, num_hidden_layers=num_hidden_layers, output_size=1)
subnet_joint
instance-attribute
subnet_joint = _make_subnet(input_size=2 + extra_features, hidden_size=hidden_size, num_hidden_layers=num_hidden_layers, output_size=num_joint)
to_weights
Extract network weights into a DIVFMWeights instance for torch-free inference.
Source code in quantflow/options/divfm/network.py
forward
Compute factor values for a batch of options.
Returns shape (N, num_factors) with factor values [f_1, f_2, ..., f_p].
| PARAMETER | DESCRIPTION |
|---|---|
| moneyness_ttm | Shape (N,). Time-scaled moneyness M = log(K/F) / sqrt(tau) |
| ttm | Shape (N,). Time-to-maturity tau in years |
| extra | Shape (N, extra_features) or None. Additional observable features X |
Source code in quantflow/options/divfm/network.py
quantflow.options.divfm.trainer.DIVFMTrainer
Training loop for DIVFMNetwork.
Implements the mini-batch procedure from Gauthier, Godin & Legros (2025): at each gradient step a random subset of days is sampled from the training set, OLS factor loadings are computed in closed form for each day, and the network weights theta are updated to minimise the total IV residual.
The OLS step is fully differentiable via the normal equations, so gradients flow through beta_t back into the network parameters theta.
| PARAMETER | DESCRIPTION |
|---|---|
| network | The network to train |
| lr | Adam learning rate |
| batch_days | Number of days sampled per gradient step (J=64 in the paper) |
| weight_decay | L2 regularisation for Adam |
| ridge | Ridge penalty added to F^T F before solving the normal equations, for numerical stability |
Source code in quantflow/options/divfm/trainer.py
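The per-day computation inside the training loop can be sketched in numpy as below. This shows the forward computation only; in the trainer the same algebra would run on torch tensors so that gradients can flow through beta_t. The mean-squared-residual form of the loss and the function name `day_loss` are assumptions made for illustration.

```python
import numpy as np


def day_loss(factors, implied_vols, ridge=1e-6):
    """Ridge-stabilised normal equations followed by the IV residual."""
    p = factors.shape[1]
    # Adding ridge * I keeps the system solvable when F^T F is near-singular
    gram = factors.T @ factors + ridge * np.eye(p)
    beta = np.linalg.solve(gram, factors.T @ implied_vols)
    residual = implied_vols - factors @ beta
    # Assumed loss: mean squared residual for the day
    return float(np.mean(residual**2)), beta


rng = np.random.default_rng(1)
m = rng.uniform(-1.0, 1.0, size=40)
F = np.column_stack([np.ones_like(m), m, m**2])
iv = F @ np.array([0.2, -0.03, 0.08])

loss, beta = day_loss(F, iv)

# A duplicated column makes F^T F exactly singular; the ridge term
# still yields a usable solution
F_dup = np.column_stack([F, F[:, 1]])
loss_dup, beta_dup = day_loss(F_dup, iv)
```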
step
Perform a single gradient update step.
Samples batch_days distinct days, computes the OLS loss for each,
and updates the network weights.
Returns the total loss for this step.
| PARAMETER | DESCRIPTION |
|---|---|
| days | Pool of training days to sample from |
Source code in quantflow/options/divfm/trainer.py
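Sampling distinct days for one step amounts to drawing indices without replacement. A minimal sketch, assuming the batch is capped at the pool size when fewer days are available (the trainer's actual behaviour in that case is not documented here); `sample_days` is a hypothetical helper.

```python
import numpy as np


def sample_days(days, batch_days, rng):
    """Draw batch_days distinct items from the pool of training days."""
    size = min(batch_days, len(days))  # assumption: cap at the pool size
    idx = rng.choice(len(days), size=size, replace=False)
    return [days[i] for i in idx]


days = list(range(200))  # integers stand in for per-day option data
batch = sample_days(days, batch_days=64, rng=np.random.default_rng(0))
```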
evaluate
Compute the average per-day loss without updating weights.
| PARAMETER | DESCRIPTION |
|---|---|
| days | Days to evaluate on |
Source code in quantflow/options/divfm/trainer.py
fit
Train the network for num_steps gradient steps.
At each step, batch_days distinct days are sampled from days,
following the mini-batch procedure described in the paper.
Returns the list of per-step training losses.
| PARAMETER | DESCRIPTION |
|---|---|
| days | Training days |
| num_steps | Number of gradient update steps |
| val_days | Optional validation days for loss monitoring |
| log_every | Print a progress line every this many steps (0 to disable) |
Source code in quantflow/options/divfm/trainer.py
to_weights
Extract the trained network into a DIVFMWeights instance ready for torch-free inference.
Source code in quantflow/options/divfm/trainer.py
quantflow.options.divfm.trainer.DayData
dataclass
Option data for a single trading day.
Used as the unit of input for DIVFMTrainer. Each instance holds all options observed on one day.