## Thursday, 10 September 2015

### Benford's Law in Stock Market Indexes

Benford's Law is an empirical law about the probability distribution of the first digit of numbers. In many everyday datasets, especially those whose values span several orders of magnitude, such as bills, street address numbers, populations and stock prices, the first digit is not uniformly distributed, as one might expect. Instead, the probability of the first digit d is proportional to the width of the interval between d and d+1 on a logarithmic scale. Thus, the probability of a first digit d in [1,9] is a decreasing function of d, which can be calculated as:

$$P(d) = \log_{10}(d+1) - \log_{10}(d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, \ldots, 9$$

For a more detailed description of the law, I strongly recommend the Wikipedia entry on Benford's Law.
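
To make the numbers concrete, here is a small sketch that tabulates the theoretical probabilities, taking the usual statement of the law, P(d) = log10(1 + 1/d). The variable names `d` and `benford` are only illustrative.

```r
# Theoretical first-digit probabilities under Benford's Law
d <- 1:9
benford <- log10(1 + 1 / d)

# Tabulate each digit against its probability, rounded for readability
print(data.frame(FirstDigit = d, Probability = round(benford, 3)))

# The nine probabilities sum to 1, since the terms telescope to log10(10)
stopifnot(isTRUE(all.equal(sum(benford), 1)))
```

A leading 1 is expected about 30.1% of the time, while a leading 9 shows up only about 4.6% of the time.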

Being a little curious about this law, I wrote a short R script to check it against the prices of the stocks in several stock market indexes: S&P 500, DJ EUROSTOXX50E, FTSE 100, IBEX 35, RUSSELL 2000 and NIKKEI 225. To run the script you will need a computer with a Bloomberg connection. I guess it could easily be adapted to get the stock prices from the Yahoo! Finance website instead, but I haven't tried.

library(Rbbg) #Bloomberg Connection

#Connect to Bloomberg API
conn <- blpConnect()

indexes <- c("RTY Index", "SX5E Index", "IBEX Index", "SPX Index", "NKY Index", "UKX Index")

#Get members for index and their RICS
memb_codes <- bds(conn, indexes[6], c("INDX_MEMBERS") )
rics <- paste(memb_codes[ ,1], "Equity", sep= " ")

#Get spots for each member and their first digit
spots <- bdp(conn, rics, c("PX_LAST") )
firstDigit <- as.numeric( substring( spots[ , 1], 1, 1 ) )

#Get histogram for first Digit
histogram <- hist( firstDigit, breaks = 0:9 )

#Benford Law Theoretical Probability
benfordTheoretical <- log10( 1 + 1/(1:9) )

#Result dataframe
results_UKX <- data.frame(FirstDigit = 1:9, Probability = histogram$density, Theoretical = benfordTheoretical)

#For those not having a Bloomberg connection, we can load the data
load("BenfordLaw.RData")

#Plot results
plot(results_UKX$Probability, main='Benford Law for stock prices in FTSE100 Index',
xlab="First Digit", ylab="Probability", col="green", pch=1, xaxt="n")
axis(1, at=seq(1,9))
lines(results_UKX$Theoretical, col="red", pch=0)
legend("topright", c("Results", "Theoretical"), lty=c(0,1), pch=c(1,0), col=c("green", "red"))
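
If you have neither a Bloomberg connection nor the saved data file, the same first-digit pipeline can be tried on simulated numbers. The sketch below is not market data: it assumes hypothetical prices drawn log-uniformly between 1 and 10,000, and the helper `firstSignificantDigit` is my own name, not part of any package. It reads the digit from scientific notation, so a price such as 0.85 gives 8 rather than 0.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical prices: log-uniform between 1 and 10,000 (simulated, not market data)
prices <- 10^runif(5000, min = 0, max = 4)

# First significant digit, read from scientific notation so that
# values below 1 (e.g. 0.85) yield their first non-zero digit
firstSignificantDigit <- function(x) {
  as.integer(substr(sprintf("%.15e", abs(x)), 1, 1))
}
digits <- firstSignificantDigit(prices)

# Observed relative frequency of each digit 1..9
observed <- as.numeric(table(factor(digits, levels = 1:9))) / length(digits)
benfordTheoretical <- log10(1 + 1 / (1:9))

results_sim <- data.frame(FirstDigit = 1:9,
                          Probability = observed,
                          Theoretical = benfordTheoretical)
print(round(results_sim, 3))
```

The resulting data frame has the same columns as `results_UKX`, so the plotting code above should work on it unchanged.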

The results for each index are attached to show how well the observed distribution matches the theoretical Benford distribution.
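
Besides comparing the plots by eye, one way to put a number on "how well" is a chi-squared goodness-of-fit test. This is only a sketch and not part of the original script: the first digits are simulated for a hypothetical index of 100 members, and the p-value is estimated by Monte Carlo because the expected count for digit 9 is below 5.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

benford <- log10(1 + 1 / (1:9))

# Simulated first digits for a hypothetical index with 100 members
digits <- sample(1:9, size = 100, replace = TRUE, prob = benford)
counts <- as.numeric(table(factor(digits, levels = 1:9)))

# Goodness-of-fit test of the observed counts against Benford's probabilities;
# simulate.p.value avoids relying on the chi-squared approximation for small counts
fit <- chisq.test(x = counts, p = benford, simulate.p.value = TRUE, B = 2000)
print(fit)
```

A small p-value would suggest that the first digits depart from Benford's Law. With real data, `counts` would come from the `firstDigit` vector of each index.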