The recently concluded Indian elections to the Lok Sabha threw up many surprises. Arguably, the biggest surprise was the difference between the actual results and the forecasts made by almost all national exit polls just three days earlier. Several of these exit polls forecast that the National Democratic Alliance (NDA) would get more than 360 of the 543 seats, with three of them forecasting an upper range exceeding 400 seats for the NDA. All these polls also forecast that the Bharatiya Janata Party (BJP), the biggest constituent of the NDA, would significantly improve upon its 2019 tally of 303 seats. As it was, the NDA ended with 293 seats while the BJP got 240, thereby losing its individual majority in the Lok Sabha.
Why did the national pollsters get it so wrong? Even more importantly, why was there such unanimity among the pollsters? Conspiracy theorists have naturally had a field day, with explanations ranging from mala fide motivations to all pollsters using the same sample, and other, even wilder, conjectures.
There may, however, be a more benign explanation for both the similarity of the projections and the big errors. Forecasting seats using exit polls of a random sample of actual voters involves two steps. First, the pollster has to use the voter responses to estimate the vote share of the different parties. Second, because of the first-past-the-post electoral system, these vote-share estimates have to be converted into seat projections.
There are significant complications in both steps. Voters are often hesitant to truthfully reveal their vote. Moreover, since the sample of voters is random and small, extrapolating the vote-share estimate into seat estimates across different constituencies involves guesswork and massive uncertainty. Sitting on top of all this uncertainty is the commercial pressure that pollsters face. Their business model depends not just on the accuracy of their forecast but, crucially, also on how well their forecast stacks up against the forecasts of other pollsters.
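To see how the second step amplifies error, consider a back-of-the-envelope simulation. The sketch below is purely illustrative and all its numbers (a hypothetical 44 per cent vote share for one bloc, an assumed constituency-level swing, a notional two-way contest) are my own assumptions; it shows how a two-point error in an estimated national vote share can translate into a swing of many dozens of seats under first-past-the-post.

```python
import random

N_SEATS = 543        # Lok Sabha constituencies
SWING_SD = 0.06      # assumed constituency-level variation in vote share

def expected_seats(national_share, trials=1000):
    """Average seats won when each constituency's vote share equals the
    national share plus local noise, in a notional two-way contest where
    more than 50 per cent of the vote wins the seat."""
    total = 0
    for _ in range(trials):
        total += sum(
            1 for _ in range(N_SEATS)
            if random.gauss(national_share, SWING_SD) > 0.5
        )
    return total / trials

print(expected_seats(0.44))  # seats if the true national share is 44%
print(expected_seats(0.46))  # seats if the poll overstates it by 2 points
```

In this toy setting, a two-percentage-point error in the estimated vote share moves the seat projection by roughly 50 seats, which is why the second step is so treacherous.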
The situation facing these pollsters is one that confronts forecasters in many fields, including economic statistics and weather. The world that forecasters try to forecast is inherently uncertain. They all use some indicators to forecast but the link between the indicators and the actual outcome is imperfect at best. Moreover, forecasters across fields operate in competitive forecasting markets.
I had to briefly navigate the treacherous terrain of economic forecasting many years ago. Early in the job, I received a piece of advice from a veteran of the field that proved to be hugely useful. His approach started with the recognition that the statistical likelihood of getting a forecast exactly right was zero (this is, in fact, a proposition that can be proved). Consequently, he would first use his statistical model to produce a forecast. Let us call this the fundamental forecast. Then, he would try to figure out what his competitors were forecasting. There are, of course, different ways of doing so, ranging from statistical approaches to spending significant time networking. This process would typically give him a range of forecasts in the market. He would then adjust his fundamental, model-based forecast to produce a number within this range. Crucially, he would pick a number towards one of the two ends of the range.
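His rule can be summarised in a few lines of code. The sketch below is my own reconstruction, not something the veteran ever wrote down; the blending weight and the example numbers are assumptions made purely for illustration.

```python
def positioned_forecast(fundamental, competitor_forecasts, weight=0.8):
    """Pull the model-based ('fundamental') forecast inside the range of
    competitor forecasts, landing near whichever end is closer to it."""
    lo, hi = min(competitor_forecasts), max(competitor_forecasts)
    # Pick the end of the market range nearest the fundamental view...
    anchor = lo if abs(fundamental - lo) <= abs(fundamental - hi) else hi
    # ...then blend towards it, staying within the range.
    blended = weight * anchor + (1 - weight) * fundamental
    return min(max(blended, lo), hi)

# Example: the model says 2.1 per cent growth while competitors cluster
# between 2.6 and 3.0; the published number lands at the low end, 2.6.
print(positioned_forecast(2.1, [2.6, 2.7, 2.8, 3.0]))
```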
This forecasting strategy was useful under all outcomes. If the actual number came out inside the range of market forecasts, he could point out that both the actual number and his forecast were within that range. If the actual number breached the range at the opposite end from the one he had picked, he could pass off the error as a one-time special case caused by some data vagary that affected all forecasters; since everyone had erred, there was no loss of market reputation relative to other forecasters. If, however, the actual number breached the range at the end he had picked, he could say that everyone got it wrong but that his forecast was among the closest to the actual number. This would be a marketing coup and enhance his reputation.
This approach often had interesting implications. Since many forecasters used this strategy, the range of forecasts itself tended to be narrow, with everyone trying to forecast based on their forecasts of the forecasts of other forecasters. Additionally, the process often led the range of forecasts to become increasingly de-linked from the fundamental, model-based forecasts.
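A toy simulation makes both implications visible. The starting numbers and the 80 per cent weight on the consensus below are assumptions of mine, chosen only to illustrate the dynamic.

```python
# Five pollsters start from their own model-based ("fundamental") numbers,
# then repeatedly revise towards the consensus of published forecasts.
fundamentals = [280, 310, 340, 360, 300]   # hypothetical seat forecasts
published = list(fundamentals)

for rnd in range(1, 5):
    consensus = sum(published) / len(published)
    # Each pollster keeps 20 per cent of their own number and adopts
    # 80 per cent of the consensus.
    published = [0.2 * p + 0.8 * consensus for p in published]
    width = max(published) - min(published)
    print(f"round {rnd}: range width = {width:.1f} seats")

# The published range collapses towards the consensus even though the
# underlying fundamental forecasts (280 to 360) never changed.
```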
This quirk of forecasting was first pointed out by one of the greatest 20th-century economists, John Maynard Keynes, in the context of judging beauty contests where the winner is the one with the most votes. Hence, a judge trying to pick the winner would try to forecast the choices of the other judges. There are many areas that are prone to this phenomenon, the biggest being stock prices. The price of a stock depends on the collective views of many investors. Hence, forecasting a stock price involves forecasting the forecasts of other investors, which can lead to stock price movements that are unrelated to the fundamentals of the company whose stock is being priced.
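Game theorists formalise this observation as the "guess two-thirds of the average" game, a standard classroom device rather than anything from Keynes's own text: every player tries to forecast the forecasts of the others, and iterated reasoning pulls the answer far from any naive starting point.

```python
# Level-0 players guess naively; each higher level best-responds to the
# level below by guessing two-thirds of its average. The starting guess
# of 50 is an arbitrary assumption.
guess = 50.0
for level in range(1, 6):
    guess = (2 / 3) * guess
    print(f"level {level}: guess {guess:.1f}")
# With enough rounds of forecasting the forecasts, guesses approach zero,
# far from where any individual's naive view began.
```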
What happened on June 1 to the national pollsters in India was most likely the Keynesian curse: they based their forecasts on their forecasts of the forecasts of other pollsters. This would also explain why some of the smaller, state-specific pollsters did not make similar mistakes.
The writer is a Royal Bank Research Professor of Economics, University of British Columbia