The Volatility Premium and Black-Scholes Pricing

Volatility cannot be measured instantaneously; as an average, it must be estimated over time. The sample length and the sampling frequency should therefore be chosen with long-term volatility as the objective of estimation:

  • sample length matters because too little data increases noise through sampling error, while too much data relies on information that is no longer relevant to the current state of the market
  • high sampling frequencies introduce market microstructure issues (such as the bid/ask bounce, which is relevant for prices but not for long-term volatility). High-frequency returns are relevant for estimating market impact, but low-frequency volatility is a better predictor of high-frequency volatility; thus, for long-term forecasting, high-frequency data is generally not a good approach

Various measurement methods can be employed to characterize long-term volatility: 

  • Close-to-Close Estimator: has well-understood sampling properties; its bias is easy to correct, and it is easy to convert to a form involving typical daily moves. However, it uses data inefficiently and converges very slowly.
  • Parkinson Estimator: uses the daily range to incorporate additional information, but it is really only appropriate for measuring the volatility of a GBM process, because it cannot handle trends or jumps. Moreover, it systematically underestimates volatility.
  • Garman-Klass Estimator: up to eight times more efficient than the close-to-close estimator, but even more biased than the Parkinson estimator.
  • Rogers-Satchell Estimator: allows for the presence of trends but cannot deal with jumps.
  • Yang-Zhang Estimator: the most efficient, specifically designed to have minimum estimation error, and handles both drift and (opening) jumps. Its performance degrades to that of the close-to-close estimator when the process is dominated by jumps.
  • First Exit Time Estimator: uses fundamentally different information from traditional time-series methods; it is a natural estimator for online use, converges relatively quickly, and is not affected by noisy data or jumps. However, it requires high-frequency data.
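As a minimal sketch of two of these estimators (assuming numpy and a synthetic GBM price series for illustration only):

```python
import numpy as np

def close_to_close_vol(closes, trading_days=252):
    """Annualized close-to-close volatility from a series of closing prices."""
    log_returns = np.diff(np.log(closes))
    return np.std(log_returns, ddof=1) * np.sqrt(trading_days)

def parkinson_vol(highs, lows, trading_days=252):
    """Annualized Parkinson estimator using the daily high/low range."""
    hl = np.log(np.asarray(highs) / np.asarray(lows))
    return np.sqrt(np.mean(hl ** 2) / (4 * np.log(2))) * np.sqrt(trading_days)

# Synthetic GBM with 20% annualized volatility (hypothetical data)
rng = np.random.default_rng(42)
n = 252
closes = 100 * np.exp(np.cumsum(0.20 / np.sqrt(n) * rng.standard_normal(n)))
print(close_to_close_vol(closes))  # should land near 0.20
```

Even on clean simulated data, one year of daily closes leaves visible sampling error in the close-to-close estimate, which is exactly the inefficiency the range-based estimators try to address.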

The Volatility Premium

Implied volatility is generally equal to or significantly greater than forecast volatility; in particular, BSM implied volatility is, in general, an upward-biased estimator. Sellers of implied volatility collect a risk premium as compensation for the many expected and unexpected events that may occur. Market microstructure also suggests that implied volatility should be biased high, because market makers profit from the bid-ask spread in options and slightly raise their quotes (i.e., run slightly long volatility exposure, particularly on the downside). This does not mean, however, that it is always possible to profit by selling implied volatility; the premium should at least be evaluated by subtracting the usual implied/forecast spread. For example, the realized/implied spread for the S&P 500 can be approximated by the spread between the rolling 30-day close-to-close volatility and the VIX index (about 3.15 volatility points on average). For a position trader, context-adjusted volatility cones can help in monitoring volatility until it reaches a high, when market makers may prefer to close their positions in order to reduce their bid-ask spreads.

Index volatility has, however, generally been profitable to short. A study by Coval and Shumway (2001) showed that selling straddles on equity indices earned a weekly return of 3 percent at a Sharpe ratio of 1.2. Hodges et al. (2003) showed that buying calls on the S&P 500 and FTSE 100 was unprofitable from 1985 to 2002, and more recently Broadie, Chernov, and Johannes (2009) showed that even the inclusion of the 1987 crash failed to make put buying profitable. Results can be improved by exploiting the fact that the volatility premium is proportionally greater when implied volatility is low. Another enhancement is to exploit the fact that selling options does better in stable or declining VIX environments: if we only sell options when the VIX is below its moving average (an EWMA with a decay factor of 0.95), performance improves. However, dealing with periods when the VIX oscillates in a range (such as from 2000 to 2003) is not straightforward, because the potential for parameter-induced bias is significant. For instance, the moving-average (MA) filter outperforms the level filter mainly because of results from before 2004; after that, both strategies see their equity double. The MA choice worked well in the oscillatory VIX regime, but there is no guarantee that this would hold in general.
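The MA filter can be sketched as follows, using the decay factor of 0.95 mentioned above (the VIX series here is hypothetical):

```python
import numpy as np

def ewma(series, decay=0.95):
    """Exponentially weighted moving average with the given decay factor."""
    out = np.empty(len(series))
    out[0] = series[0]
    for i in range(1, len(series)):
        out[i] = decay * out[i - 1] + (1 - decay) * series[i]
    return out

def sell_signal(vix, decay=0.95):
    """True on days when the VIX sits below its EWMA: sell options only then."""
    return vix < ewma(vix, decay)

vix = np.array([20.0, 19.0, 18.5, 22.0, 25.0, 24.0, 21.0, 19.5])
print(sell_signal(vix))
```

Note how much the result depends on the single decay parameter; this is precisely the parameter-induced bias the text warns about.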

The Correlation Premium

By selling index volatility and buying component volatility it is possible to construct a short correlation position (dispersion trading), a profitable strategy with Sharpe ratios comparable to selling strangles (see, for example, Driessen, Maenhout, and Vilkov 2009). Dispersion trading suffers during market turmoil, when correlations increase; these are the same periods in which selling index variance suffers.
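A dispersion position is usually monitored through the average implied correlation backed out of index and component volatilities, using the identity that index variance is the correlation-weighted sum of component covariances. A minimal sketch, assuming a single average correlation for all pairs:

```python
import numpy as np

def implied_correlation(index_vol, weights, component_vols):
    """Average implied correlation from index and component volatilities.

    Solves sigma_I^2 = sum_i w_i^2 s_i^2 + rho * sum_{i != j} w_i w_j s_i s_j
    for the single average correlation rho.
    """
    w = np.asarray(weights)
    s = np.asarray(component_vols)
    own = np.sum((w * s) ** 2)              # diagonal (variance) terms
    cross = np.sum(w * s) ** 2 - own        # off-diagonal covariance terms
    return (index_vol ** 2 - own) / cross

# Two equally weighted components at 30% vol: an index vol of ~25.1%
# corresponds to an average correlation of roughly 0.4.
print(implied_correlation(0.2512, [0.5, 0.5], [0.30, 0.30]))
```

Selling index volatility against component volatility is, through this identity, a short position in rho.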

The Skewness Premium 

Indices typically have a much steeper implied volatility skew than single stocks, partly because of a correlation effect. Much of the variance premium comes from selling out-of-the-money puts, which trade at volatilities much higher than the at-the-money level; consider, for example, the results of selling the 30-delta risk reversal (selling the put, buying the call, and delta hedging). Kozhan, Neuberger, and Schneider (2011) examined the profitability of selling skew swaps and showed that for S&P 500 options (from 1996 to 2009) about half of the excess return from selling out-of-the-money puts is due to the correlation between returns and volatility: the realized skewness.

Known Forecasting Instruments

GARCH does capture some elements of the time evolution of variance and can be motivated by a fairly simple microstructure-based argument (Sato and Takayasu 2002). However, GARCH(1, 1) models can only produce term structures that tend exponentially toward the long-term mean; they cannot reproduce the humped volatility term structures often observed in the market, where, for example, the two-month volatility is above both the one-month and three-month levels. GARCH models can be calibrated by maximum likelihood, but the resulting log-likelihood function is very flat, so fitting is difficult because the likelihood changes very little over a wide range of parameters. When a GARCH fit fails, it is generally because of:

  • insufficient data: at least 1,000 data points are required
  • poor initial values for the parameters
  • persistent seasonality in the data
  • the model not fitting the data, for instance because the fat tails in the return distribution are not due to GARCH effects, so that using a normal distribution in the GARCH model does not completely address the problem
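A minimal maximum-likelihood fit of a GARCH(1, 1), sketched with numpy/scipy on simulated data (the parameter values are illustrative, not recommendations):

```python
import numpy as np
from scipy.optimize import minimize

def garch_neg_loglik(params, returns):
    """Negative Gaussian log-likelihood for a GARCH(1, 1):
    sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2
    """
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return np.inf                     # reject non-stationary parameters
    var = np.empty(len(returns))
    var[0] = np.var(returns)              # initialize at the sample variance
    for t in range(1, len(returns)):
        var[t] = omega + alpha * returns[t - 1] ** 2 + beta * var[t - 1]
    return 0.5 * np.sum(np.log(var) + returns ** 2 / var)

# Simulate from a known GARCH(1, 1) and refit
rng = np.random.default_rng(0)
omega, alpha, beta = 2e-6, 0.08, 0.90
n = 2000                                  # comfortably above the 1,000-point rule
r = np.empty(n)
v = omega / (1 - alpha - beta)            # start at the long-run variance
for t in range(n):
    r[t] = np.sqrt(v) * rng.standard_normal()
    v = omega + alpha * r[t] ** 2 + beta * v

fit = minimize(garch_neg_loglik, x0=[1e-5, 0.1, 0.8], args=(r,),
               method="Nelder-Mead")
print(fit.x)  # should be near (2e-6, 0.08, 0.90), but the likelihood
              # surface is flat, so expect sizable estimation error
```

Even with clean simulated data and the correct model, the flatness of the likelihood is visible: quite different parameter triples produce nearly identical likelihoods.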

Fundamental data can provide substantial additional information for forecasting volatility. However, because fundamentals are reported far less frequently than prices, such information mostly benefits long-horizon forecasts. Sridharan (2012) shows that the market underestimates volatility for firms with high research and development costs, high cash-flow volatility, and earnings management, and overestimates volatility for firms with large size, high return on assets, and high leverage.

Implied volatility (IV) rank gauges the current level of IV relative to its range over the past 52 weeks. For example, an IV rank of 50% means the current IV is right in the middle of the 52-week range (a current IV of 15% against a 52-week low of 10% and a 52-week high of 20%). IV percentile instead calculates the percentage of days in the past 52 weeks on which IV was lower than the current level: an IV percentile of 80% means that IV was lower on 80% of those days.
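The two measures can be sketched directly from their definitions (the IV history below is hypothetical):

```python
import numpy as np

def iv_rank(current_iv, iv_history):
    """Position of the current IV within the 52-week high/low range."""
    lo, hi = np.min(iv_history), np.max(iv_history)
    return (current_iv - lo) / (hi - lo)

def iv_percentile(current_iv, iv_history):
    """Fraction of past days on which IV was below the current level."""
    return np.mean(np.asarray(iv_history) < current_iv)

history = [0.10, 0.12, 0.20, 0.11, 0.13]
print(iv_rank(0.15, history))        # 0.5: midway between 10% and 20%
print(iv_percentile(0.15, history))  # 0.8: 4 of the 5 days were lower
```

The distinction matters: a single spike to 20% drags the rank of every subsequent reading down for a year, while the percentile reflects how typical the current level actually is.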

Volatility cones graphically illustrate the range of volatility over different trading horizons. For instance, the close-to-close estimator (though there is no reason another estimator could not be used) is employed over periods of 20, 40, 60, 120, and 240 trading days, corresponding closely to one, two, three, six, and twelve calendar months. The cone shows the tendency for short-term volatilities to fluctuate more widely than longer-dated volatilities: volatility measurements are prone to sampling error, and this affects shorter measurement periods more dramatically. Note that overlapping the data introduces an artificial degree of correlation in the volatility estimates, a source of bias that requires an adjustment factor (Hodges and Tompkins, 2002). The volatility cone is useful for placing current market information (realized volatility, implied volatility, and the spread between them) into historical context. It does not, however, place it in the context of the current overall market; thus, it is important to use the implied/realized spread of the index as a benchmark for the amount of edge required to trade.
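A bare-bones cone over the horizons listed above can be sketched as follows (synthetic GBM prices; note that the rolling windows here overlap, so in practice the extremes would need the Hodges-Tompkins adjustment the text mentions):

```python
import numpy as np

def close_to_close_vol(closes, trading_days=252):
    r = np.diff(np.log(closes))
    return np.std(r, ddof=1) * np.sqrt(trading_days)

def volatility_cone(closes, windows=(20, 40, 60, 120, 240)):
    """Min/median/max realized volatility over rolling windows of each length."""
    closes = np.asarray(closes)
    cone = {}
    for w in windows:
        vols = [close_to_close_vol(closes[i:i + w + 1])
                for i in range(len(closes) - w)]
        cone[w] = (np.min(vols), np.median(vols), np.max(vols))
    return cone

rng = np.random.default_rng(1)
closes = 100 * np.exp(np.cumsum(0.20 / np.sqrt(252) * rng.standard_normal(500)))
for w, (lo, med, hi) in volatility_cone(closes).items():
    print(f"{w:4d} days: {lo:.3f}  {med:.3f}  {hi:.3f}")
```

Plotting the (min, median, max) triples against horizon produces the familiar cone shape: the range narrows as the measurement window lengthens.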

A general scheme to trade volatility includes:

  • Rationale: volatility is more predictable than price; forecast volatility to determine a hedge and its confidence level
  • Place forecasts in context by using a volatility cone
  • Consider the state of the entire market to determine whether the specific implied volatility under examination is at an extreme level or simply at an appropriate one
  • Select an edge that is clear, measurable, and understandable

For instance:

  • the impact of news is perhaps the most important factor in forecasting realized volatility for stocks and futures; strategies should therefore be assessed by comparing implied volatility with the historical volatility distribution given by the volatility cone
  • a well-posed trade is selling one-month implied volatility at 35% because it is in the 90th percentile of one-month volatility over the past two years; selling at 35% because GARCH is forecasting 20% realized volatility instead relies on a point forecast, which is conceptually less important than the possible distribution of results
  • selling Citigroup's implied volatility at 39% when its realized is 26%, with a 6% spread on the relevant index (S&P 500 implied volatility at 24% against realized of 18%), does not seem a well-posed trade, because the context is not particularly favorable. Note, however, that whenever the volatility cone tells you that implied volatility is historically high, you will already be short it; at least you will now know that it might not be the best time to cover.

Hedging Issues

For extensively traded options (i.e., options on indexes or exchange-traded futures) the Black-Scholes-Merton model used to quote prices and volatilities does not introduce relevant distortions, because the market sets the prices. However, if you want to assess prices for arbitrage purposes, trade illiquid exotic options, or hedge your options, BSM may not work as expected. In these cases the model is critical, because the information used to trade (prices and hedge ratios) is model dependent.

Trading the Smile

Smile effects can be tradable if transaction costs are sufficiently small; for products with many strikes, trades can target specific skewness (ratio spreads and risk reversals) and kurtosis (butterflies and condors) exposures in order to find the best strikes to trade. The basis for such trades is the fact that the shape of the implied volatility curve tends to be relatively stable in a given product (although the actual reason may differ among products); the objective is thus to quantify deviations from this equilibrium smile.

Visualizing Smile Dynamics

For any given underlying there is no single implied volatility; indeed, options at each different strike and maturity have their own implied volatility.

However, the at-the-money level of volatility accounts for the majority of changes in the smile: studies have shown that most of the movement of the smile is a simple shift of the level (about 75%), with some minor twist (about 15%). Volatility can be plotted in several ways:

  • As a function of delta: because delta is a function of strike and expiration, it allows one to compare different maturities with respect to a risk-neutral probability of finishing in the money; note, however, that delta is itself a function of volatility
  • As a function of delta, adjusted: in practice, market makers divide all of the volatilities in a given month by that month's at-the-money volatility. The resulting curve allows one to compare smiles across different expiries and to compare products within the same class; such a curve is remarkably constant through time.
  • As a function of expiration: here the curve is more volatile at shorter expirations, because it is mainly driven by the short-term fluctuations of the underlying, and flatter at longer expirations, because it is driven by long-term volatility, which is more stable in character


Two rules of thumb can be employed to describe the movement of the implied volatility smile:

  • sticky strike (or fixed skew), used by the market when the underlying is range-bound, posits that the volatility of a given strike is unchanged as the underlying moves 
  • sticky delta (or swimming skew), used when the market is trending, posits that the smile moves with the underlying so that options with a given delta keep the same volatility 
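The difference between the two rules can be made concrete with a hypothetical linear smile (the slope and levels below are illustrative, not market data):

```python
def smile(strike, spot, atm_vol=0.20, skew=-0.001):
    """Hypothetical linear smile: vol rises for strikes below the spot."""
    return atm_vol + skew * (strike - spot)

spot0, spot1 = 100.0, 95.0   # the market sells off 5 points
strike = 90.0

# Sticky strike: the 90 strike keeps the vol it had at the old spot level.
sticky_strike_vol = smile(strike, spot0)
# Sticky delta: the smile moves with the underlying, so the 90 strike is
# repriced relative to the new spot.
sticky_delta_vol = smile(strike, spot1)

print(sticky_strike_vol)  # 0.21
print(sticky_delta_vol)   # 0.205
```

Under sticky delta the whole curve slides with the underlying, so the same strike is now closer to the money and trades at a lower volatility than under sticky strike.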


Hull (2010) attributes the volatility smile to the non-normal skewness and kurtosis of stock returns. If the distribution is symmetric, the skewness is zero (in particular, a normal distribution has zero skewness); if the left tail is more pronounced than the right tail, the distribution has negative skewness. Kurtosis is a measure of the degree of fat-tailedness of the distribution.

Corrado and Su (1996) extended the Black-Scholes formula to account for non-normal skewness and kurtosis in stock return distributions by including the effect of this deviation in the pricing formula: the skewness and kurtosis coefficients are estimated simultaneously with an implied standard deviation. Brown and Robinson (2002) provide a typographic correction (a minus sign) to the Corrado-Su skewness term, showing that the size of the absolute pricing error from the incorrect formula varies with the moneyness of the option. With the corrected Corrado-Su formula it is possible to compare implied volatilities with the properties of the realized distribution. Corrado and Su performed a Gram-Charlier expansion of the return distribution, so the parameters are time-invariant (unless the distribution really changes shape). The European call price has the form

C = C_BSM + μ3·Q3 + (μ4 − 3)·Q4

where μ3 is the skewness of the returns, μ4 is the kurtosis, C_BSM is the normal BSM call value, and Q3 and Q4 are the corresponding adjustment terms; the put/call parity relationship can be used to find the value of the put. Note, however, that Corrado-Su and its correction might produce biased prices in volatile markets, as suggested by Jurczenko, Maillet, and Negrea (2002).
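A sketch of the call pricer, using the Q3 and Q4 adjustment terms as commonly cited for Corrado-Su with the Brown-Robinson sign correction (the exact terms should be checked against the original papers before any serious use):

```python
import numpy as np
from scipy.stats import norm

def corrado_su_call(S, K, T, r, sigma, skew=0.0, kurt=3.0):
    """European call under a Corrado-Su Gram-Charlier expansion.

    With skew = 0 and kurt = 3 this reduces to plain Black-Scholes.
    """
    st = sigma * np.sqrt(T)
    d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / st
    d2 = d1 - st
    c_bsm = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
    # Skewness adjustment (Brown-Robinson corrected sign on the cdf term)
    q3 = S * st * ((2 * st - d1) * norm.pdf(d1) - st ** 2 * norm.cdf(d1)) / 6
    # Kurtosis adjustment
    q4 = S * st * ((d1 ** 2 - 1 - 3 * st * (d1 - st)) * norm.pdf(d1)
                   + st ** 3 * norm.cdf(d1)) / 24
    return c_bsm + skew * q3 + (kurt - 3) * q4

print(corrado_su_call(100.0, 100.0, 1.0, 0.05, 0.2))  # ~10.45, the plain BSM value
```

The reduction to Black-Scholes at zero skew and zero excess kurtosis is a useful sanity check on any implementation of the expansion.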

By using quoted option prices to back out the parameters, it is possible to obtain, for a given expiry, an implied volatility, skewness, and kurtosis: the parameters are found by minimizing the squared difference between the quoted market prices and the Corrado-Su prices. In this way it is possible to keep records of their normal ranges and values, to construct term structures of skewness and kurtosis, to compare the results directly with measurements of the realized moments, and to construct skew and kurtosis cones. However, the volatility in the Corrado-Su formula is not the at-the-money implied volatility, and the implied skew and kurtosis do not map directly onto the implied smile: the higher moments are somewhat intertwined in their effects, and it is not possible to attribute a shift in the shape of the curve directly to either skewness or kurtosis. The Corrado-Su model applies directly only to European options, but it is fairly straightforward to implement the idea in a tree setting where American (or more general) boundary conditions can be applied; Haug (2007b) provides code for both Edgeworth and Gram-Charlier trees.
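The least-squares calibration can be sketched as follows; this self-contained example repeats a compact Gram-Charlier pricer in the Corrado-Su style (Brown-Robinson corrected sign; treat the adjustment terms as a sketch) and recovers known moments from synthetic "market" quotes:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def corrado_su_call(S, K, T, r, sigma, skew, kurt):
    st = sigma * np.sqrt(T)
    d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / st
    c_bsm = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d1 - st)
    q3 = S * st * ((2 * st - d1) * norm.pdf(d1) - st ** 2 * norm.cdf(d1)) / 6
    q4 = S * st * ((d1 ** 2 - 1 - 3 * st * (d1 - st)) * norm.pdf(d1)
                   + st ** 3 * norm.cdf(d1)) / 24
    return c_bsm + skew * q3 + (kurt - 3) * q4

def implied_moments(strikes, quotes, S, T, r):
    """Back out (sigma, skew, kurt) by least squares against quoted calls."""
    def sse(p):
        sigma, skew, kurt = p
        if sigma <= 0:
            return np.inf
        model = [corrado_su_call(S, K, T, r, sigma, skew, kurt) for K in strikes]
        return np.sum((np.asarray(model) - quotes) ** 2)
    fit = minimize(sse, x0=[0.25, 0.0, 3.0], method="Nelder-Mead")
    return fit.x

# Synthetic "market" quotes from known moments, then recover them
S, T, r = 100.0, 0.25, 0.02
strikes = np.arange(80.0, 125.0, 5.0)
quotes = np.array([corrado_su_call(S, K, T, r, 0.20, -0.5, 4.0) for K in strikes])
print(implied_moments(strikes, quotes, S, T, r))  # near (0.20, -0.5, 4.0)
```

On real quotes the fit is much less clean than in this idealized round trip, precisely because of the intertwined effects of the higher moments noted above.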


Hilpisch, Y. (2017). Derivatives Analytics with Python

Chance, D. M. (2003). Analysis of Derivatives for the CFA Program

Corrado, C. and Su, T. (1996). Skewness and Kurtosis in S&P 500 Index Returns Implied by Option Prices

Hull, J. C. and Basu, S. (2010). Options, Futures and Other Derivatives

Jarrow, R. and Rudd, A. (1982). Approximate Option Valuation for Arbitrary Stochastic Processes

MacBeth, J. D. and Merville, L. J. (1979). An Empirical Examination of the Black-Scholes Call Option Pricing Model

Sinclair, E. (2013). Volatility Trading
