One thing we admire in our cricketers is consistency. Full marks to the gritty player who scores 50 on a minefield, even though he gets out for 50 when well set on a featherbed. But do we admire so much his team-mate who gets a duck in the first instance, but makes amends by crashing an impressive 100 in the second? They have the same average – but do they provide the same value?
Consistency can be measured by calculating the standard deviation, which, in simple terms, measures how far the scores typically deviate from the overall mean. The lower the standard deviation, the lower the variation in the scores.
We can obviously apply this to cricket scores, but a couple of issues need to be resolved: what to do with “not out” scores, and how can we use it to compare the consistency of players with different averages?
To resolve the first, I elected to add any uncompleted innings to the next innings, so that effectively, I was calculating the standard deviation of the runs made between dismissals. If the last innings was a “red ink”, it was ignored.
To allow comparison of consistency between different players, I simply divided the calculated standard deviation by the batting average (ignoring the last innings if it was “not out”).
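For readers who want to try this themselves, the calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the `(runs, dismissed)` input format are made up for the example, and the use of the population standard deviation (rather than the sample version) is an assumption, since either could be used.

```python
from statistics import mean, pstdev


def runs_between_dismissals(innings):
    """Collapse a career into runs made between dismissals.

    `innings` is a chronological list of (runs, dismissed) tuples
    (a hypothetical input format). Runs from a not-out innings are
    carried into the next innings; a trailing not out is ignored.
    """
    totals = []
    carried = 0
    for runs, dismissed in innings:
        carried += runs
        if dismissed:
            totals.append(carried)
            carried = 0
    # Anything still carried here belongs to a final not out and is dropped.
    return totals


def consistency_index(innings):
    """Standard deviation of runs between dismissals divided by their mean."""
    totals = runs_between_dismissals(innings)
    if not totals:
        raise ValueError("no completed innings")
    average = mean(totals)
    if average == 0:
        raise ValueError("average is zero, so the index is undefined")
    # Assumption: population standard deviation.
    return pstdev(totals) / average


# The two batsmen from the opening paragraph, each dismissed twice.
print(consistency_index([(50, True), (50, True)]))   # 0.0
print(consistency_index([(0, True), (100, True)]))   # 1.0
```

Both batsmen average 50, but the first has an index of 0 and the second an index of 1, which is the distinction the index is meant to capture.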
I performed this exercise three times for Test cricketers: for those who scored at least 1000 runs, for those who scored at least 5000 runs, and for those who scored at least 10000 runs.
The first table lists the most consistent Test batsmen who have scored at least 1000 runs. Australia’s Bruce Laird, who scored with such consistency without scoring a century in his brief late-70s career, heads the list, and is followed by the admirable Sutcliffe, whose consistency is astounding given the extent of his career. Alastair Cook and MS Dhoni are notable current players in this list.
Table 1: Consistency Index: Most Consistent (Minimum 1000 runs)
| Batsman | Team | CI | SD | Average | Matches | Innings | Not Out | Runs |
|---|---|---|---|---|---|---|---|---|
| Bruce Laird | Australia | 0.75 | 26.48 | 35.29 | 21 | 40 | 2 | 1341 |
| Herbert Sutcliffe | England | 0.78 | 47.22 | 60.73 | 54 | 84 | 9 | 4555 |
| Douglas Jardine | England | 0.79 | 37.08 | 46.70 | 22 | 33 | 6 | 1296 |
| Ashley Giles | England | 0.80 | 16.81 | 20.90 | 54 | 81 | 13 | 1421 |
| Alastair Cook | England | 0.81 | 34.24 | 42.09 | 36 | 66 | 2 | 2694 |
| Maurice Tate | England | 0.82 | 20.96 | 25.49 | 39 | 52 | 5 | 1198 |
| Rusi Surti | India | 0.83 | 23.72 | 28.70 | 26 | 48 | 4 | 1263 |
| Jock Cameron | South Africa | 0.83 | 25.05 | 30.22 | 26 | 45 | 4 | 1239 |
| George Gunn | England | 0.83 | 33.39 | 40.00 | 15 | 29 | 1 | 1120 |
| Chandika Hathurusingha | Sri Lanka | 0.84 | 24.74 | 29.63 | 26 | 44 | 1 | 1274 |
| Ian Redpath | Australia | 0.84 | 36.62 | 43.46 | 66 | 120 | 11 | 4737 |
| Sid Barnes | Australia | 0.85 | 53.39 | 63.06 | 13 | 19 | 2 | 1072 |
| Mark Richardson | New Zealand | 0.86 | 38.33 | 44.77 | 38 | 65 | 3 | 2776 |
| Taufeeq Umar | Pakistan | 0.87 | 34.22 | 39.30 | 25 | 46 | 2 | 1729 |
| Imran Farhat | Pakistan | 0.88 | 29.02 | 33.10 | 27 | 51 | 1 | 1655 |
| Charles Kelleway | Australia | 0.88 | 32.83 | 37.42 | 26 | 42 | 4 | 1422 |
| Dwayne Bravo | West Indies | 0.88 | 28.74 | 32.73 | 31 | 57 | 1 | 1833 |
| Peter Richardson | England | 0.88 | 33.08 | 37.47 | 34 | 56 | 1 | 2061 |
| Chetan Chauhan | India | 0.89 | 28.07 | 31.58 | 40 | 68 | 2 | 2084 |
| Colin Bland | South Africa | 0.89 | 43.67 | 49.09 | 21 | 39 | 5 | 1669 |
| Trevor Goddard | South Africa | 0.89 | 30.67 | 34.47 | 41 | 78 | 5 | 2516 |
| Deryck Murray | West Indies | 0.89 | 20.40 | 22.91 | 62 | 96 | 9 | 1993 |
| Mahendra Singh Dhoni | India | 0.89 | 32.20 | 36.14 | 35 | 56 | 6 | 1807 |
| David Sheppard | England | 0.89 | 33.70 | 37.81 | 22 | 33 | 2 | 1172 |
| Alan Davidson | Australia | 0.89 | 21.97 | 24.59 | 44 | 61 | 7 | 1328 |
At the other end, we also have some current players in the least consistent category, notably Sinclair, Taibu, and until recently, Atapattu, who mixed a dreadful sequence of low scores early in his career with some heavy scoring later on:
Table 2: Consistency Index: Least Consistent (Minimum 1000 runs)
| Batsman | Team | CI | SD | Average | Matches | Innings | Not Out | Runs |
|---|---|---|---|---|---|---|---|---|
| Matthew Sinclair | New Zealand | 1.62 | 52.70 | 32.55 | 32 | 54 | 5 | 1595 |
| Vinoo Mankad | India | 1.51 | 47.57 | 31.48 | 44 | 72 | 5 | 2109 |
| Jacques Rudolph | South Africa | 1.49 | 53.81 | 36.21 | 35 | 63 | 7 | 2028 |
| Guy Whittal | Zimbabwe | 1.48 | 43.65 | 29.43 | 46 | 82 | 7 | 2207 |
| Tatenda Taibu | Zimbabwe | 1.45 | 42.94 | 29.60 | 24 | 46 | 3 | 1273 |
| Wasim Akram | Pakistan | 1.44 | 32.57 | 22.63 | 104 | 147 | 19 | 2898 |
| Mohammad Ashraful | Bangladesh | 1.43 | 34.10 | 23.88 | 48 | 93 | 4 | 2125 |
| Javagal Srinath | India | 1.43 | 20.31 | 14.21 | 67 | 92 | 21 | 1009 |
| Wasim Jaffer | India | 1.42 | 48.30 | 34.11 | 31 | 58 | 1 | 1944 |
| Vic Pollard | New Zealand | 1.41 | 34.35 | 24.35 | 32 | 59 | 7 | 1266 |
| Dilip Sardesai | India | 1.40 | 55.10 | 39.24 | 30 | 55 | 4 | 2001 |
| Sidath Wettimuny | Sri Lanka | 1.39 | 40.31 | 29.07 | 23 | 43 | 1 | 1221 |
| Marvan Atapattu | Sri Lanka | 1.39 | 54.40 | 39.02 | 90 | 156 | 15 | 5502 |
| Matthew Elliot | Australia | 1.38 | 46.20 | 33.49 | 21 | 36 | 1 | 1172 |
| Madan Lal | India | 1.38 | 31.27 | 22.65 | 39 | 62 | 16 | 1042 |
| Ridley Jacobs | West Indies | 1.37 | 38.70 | 28.32 | 65 | 112 | 21 | 2577 |
| Tim Robinson | England | 1.36 | 49.34 | 36.39 | 29 | 49 | 5 | 1601 |
| Bill Ponsford | Australia | 1.35 | 65.00 | 48.23 | 29 | 48 | 4 | 2122 |
| John Bracewell | New Zealand | 1.35 | 27.56 | 20.43 | 41 | 60 | 11 | 1001 |
| Jimmy Adams | West Indies | 1.35 | 55.72 | 41.26 | 54 | 90 | 17 | 3012 |
Now for the serious Test batsmen:
Table 3: Consistency Index: Most Consistent (Minimum 5000 runs)
| Batsman | Team | CI | SD | Average | Matches | Innings | Not Out | Runs |
|---|---|---|---|---|---|---|---|---|
| Jack Hobbs | England | 0.92 | 52.33 | 56.95 | 61 | 102 | 7 | 5410 |
| Don Bradman | Australia | 0.94 | 93.49 | 99.94 | 52 | 80 | 10 | 6996 |
| Arjuna Ranatunga | Sri Lanka | 0.94 | 33.48 | 35.50 | 93 | 155 | 12 | 5105 |
| John Wright | New Zealand | 0.97 | 36.58 | 37.83 | 82 | 148 | 7 | 5334 |
| Mark Waugh | Australia | 0.97 | 40.58 | 41.82 | 128 | 209 | 17 | 8029 |
| Graham Thorpe | England | 0.98 | 43.25 | 44.23 | 100 | 179 | 28 | 6744 |
| Rohan Kanhai | West Indies | 0.98 | 46.58 | 47.53 | 79 | 137 | 6 | 6227 |
| Clive Lloyd | West Indies | 0.99 | 46.44 | 46.68 | 110 | 175 | 14 | 7515 |
| Denis Compton | England | 1.00 | 49.90 | 50.06 | 78 | 131 | 15 | 5807 |
| Sourav Ganguly | India | 1.00 | 42.22 | 42.18 | 113 | 188 | 17 | 7212 |
| Bill Lawry | Australia | 1.03 | 48.43 | 47.15 | 67 | 123 | 12 | 5234 |
| Ken Barrington | England | 1.03 | 59.90 | 58.28 | 82 | 131 | 15 | 6806 |
| Matthew Hayden | Australia | 1.04 | 52.77 | 50.74 | 103 | 184 | 14 | 8625 |
| Ricky Ponting | Australia | 1.05 | 59.47 | 56.88 | 128 | 215 | 26 | 10750 |
| Michael Slater | Australia | 1.05 | 45.09 | 42.84 | 74 | 131 | 7 | 5312 |
| Doug Walters | Australia | 1.06 | 50.86 | 48.10 | 74 | 125 | 14 | 5357 |
| Marcus Trescothick | England | 1.06 | 46.34 | 43.80 | 76 | 143 | 10 | 5825 |
| Sunil Gavaskar | India | 1.06 | 54.42 | 51.12 | 125 | 214 | 16 | 10122 |
| David Gower | England | 1.07 | 47.29 | 44.25 | 117 | 204 | 18 | 8231 |
| Vivian Richards | West Indies | 1.07 | 53.69 | 50.24 | 121 | 182 | 12 | 8540 |
| Michael Atherton | England | 1.07 | 40.41 | 37.70 | 115 | 212 | 7 | 7728 |
| Len Hutton | England | 1.07 | 60.86 | 56.67 | 79 | 138 | 15 | 6971 |
The higher Consistency Indices show that it is much harder to maintain consistency over a longer career. It is interesting to observe that the two most consistent batsmen are two “old-timers”, Hobbs and Bradman – class will out! And who would have thought that the most consistent Australian after Bradman in this category was Mark Waugh!
At the other end of the scale for this category, we find Waugh’s twin brother prominently placed:
Table 4: Consistency Index: Least consistent (Min 5000 runs)
| Player | For | CI | SD | Ave | M | I | NO | Runs |
|---|---|---|---|---|---|---|---|---|
| Marvan Atapattu | SL | 1.39 | 54.40 | 39.02 | 90 | 156 | 15 | 5502 |
| Zaheer Abbas | Pak | 1.32 | 59.29 | 44.80 | 78 | 124 | 11 | 5062 |
| Kumar Sangakkara | SL | 1.31 | 71.23 | 54.38 | 78 | 129 | 9 | 6525 |
| Virender Sehwag | Ind | 1.27 | 64.81 | 51.06 | 66 | 114 | 4 | 5617 |
| Steve Waugh | Aus | 1.26 | 64.16 | 51.06 | 168 | 260 | 46 | 10927 |
| Shivnarine Chanderpaul | WI | 1.25 | 62.37 | 49.72 | 114 | 196 | 31 | 8203 |
| Brian Lara | WI | 1.24 | 65.33 | 52.89 | 131 | 232 | 6 | 11953 |
| Herschelle Gibbs | SA | 1.24 | 51.85 | 41.95 | 90 | 154 | 7 | 6167 |
| Ian Botham | Eng | 1.24 | 41.69 | 33.55 | 102 | 161 | 6 | 5200 |
| Sanath Jayasuriya | SL | 1.23 | 49.15 | 40.07 | 110 | 188 | 14 | 6973 |
| VVS Laxman | Ind | 1.22 | 54.24 | 44.46 | 102 | 169 | 24 | 6446 |
| Aravinda de Silva | SL | 1.21 | 52.21 | 42.98 | 93 | 159 | 11 | 6361 |
| Mark Taylor | Aus | 1.19 | 51.55 | 43.50 | 104 | 186 | 13 | 7525 |
| Wally Hammond | Eng | 1.19 | 69.46 | 58.46 | 85 | 140 | 16 | 7249 |
| Jacques Kallis | SA | 1.19 | 64.91 | 54.58 | 128 | 216 | 33 | 9988 |
| Mahela Jayawardene | SL | 1.18 | 61.73 | 52.36 | 100 | 164 | 12 | 7959 |
| Carl Hooper | WI | 1.18 | 43.09 | 36.47 | 102 | 173 | 15 | 5762 |
| Sachin Tendulkar | Ind | 1.18 | 64.28 | 54.28 | 156 | 256 | 27 | 12429 |
| Rahul Dravid | Ind | 1.17 | 61.07 | 52.28 | 131 | 227 | 26 | 10509 |
| Stephen Fleming | NZ | 1.17 | 47.05 | 40.07 | 111 | 189 | 10 | 7172 |
The case of Chanderpaul is interesting. Ten years ago, he was heading towards being one of the most consistent batsmen ever, with a CI of 0.82. Over the last decade, while he has been one of the Windies' few shining lights, there has also been much greater variation in his scoring.
This group also contains a few batsmen who play more aggressively than most: Sehwag, Jayasuriya and Botham are notable here. One would expect, naturally, their consistency to suffer as a result of their aggression.
Finally, a table just for the mega-stars, those who have scored 10000 Test runs, plus Kallis, who will surely join them the next time he goes to bat:
Table 5: Consistency Index: Top eight run-scorers
| Player | For | CI | SD | Ave | M | I | NO | Runs |
|---|---|---|---|---|---|---|---|---|
| Ricky Ponting | Aus | 1.05 | 59.47 | 56.88 | 128 | 215 | 26 | 10750 |
| Sunil Gavaskar | Ind | 1.06 | 54.42 | 51.12 | 125 | 214 | 16 | 10122 |
| Allan Border | Aus | 1.08 | 54.45 | 50.37 | 156 | 265 | 44 | 11174 |
| Rahul Dravid | Ind | 1.17 | 61.07 | 52.28 | 131 | 227 | 26 | 10509 |
| Sachin Tendulkar | Ind | 1.18 | 64.28 | 54.28 | 156 | 256 | 27 | 12429 |
| Jacques Kallis | SA | 1.19 | 64.91 | 54.58 | 128 | 216 | 33 | 9988 |
| Brian Lara | WI | 1.24 | 65.33 | 52.89 | 131 | 232 | 6 | 11953 |
| Steve Waugh | Aus | 1.26 | 64.16 | 51.06 | 168 | 260 | 46 | 10927 |
I for one was surprised to find the Aussie captain heading this list, and Tendulkar so far down the table. And perhaps Gavaskar was a better player than he is given credit for.
I hope the browsers of this site find this a worthwhile exercise. I would value their comments.
I find these things a little confusing. Maybe stats reveal a part of cricketing ability. If you look at players who have scored more than 10,000 runs, none of them has a CI less than 1, which shows that you will have fluctuations when you play over a longer period. So you can't really take these CIs seriously. This never brings out anything unique! All it does is get people confused.
Increase the number of runs by 1000 from 5000. It will reveal that the lowest CI is consistently increasing. Well Don being Don is an exception.
Ric's comment: The idea is not to view the CI's themselves, but to see how batsmen with similar records compare in their consistency. It doesn't make sense to compare the CI of someone who has played three Tests with someone who has played 150. That's why I have grouped them as I have. The fact that the CIs increase as they play more is interesting, but irrelevant.
Posted by: smilingbuddha at January 30, 2009 12:37 PM
Great post, found it very satisfying.
Posted by: D.V.C. at January 30, 2009 12:42 PM
Mark Waugh makes perfect sense. His highest score was just 143 and his average was about 5 runs less than his team mates as a result of him never going on to big hundreds. He needed to be consistent to hold his spot, and he was, he almost never looked out of form in Tests.
Ric's comment: It's good when the stats match the perceptions!
Posted by: Ajay Nair at January 30, 2009 12:43 PM
Why would a batsman more consistent according to the formula defined here be better than one who is less consistent, especially among the serious batsmen (5000+ runs)? Most of these guys with one notable exception, average between 45-60. In that case someone who has less standard deviation has more scores in that range, I'd guess; while a Lara or Tendulkar will have more extremes. However, a 45-60 score is of almost no use in a test match - it's certainly unlikely to have a match-winning impact. Meanwhile, a batsman who'd make substantial scores (80+) more often than the 'more consistent' batsmen are likely to have a match-winning/saving impact in these games.
Posted by: Ross at January 30, 2009 12:49 PM
That's quite surprising - particularly Jacques Kallis, who I've always thought is so consistent is quite prominently on the least consistent list?
Posted by: Jeff at January 30, 2009 12:49 PM
Hi Ric,
Thanks for the analysis.
On Chanderpaul, until 1998 i'd say he was consistent but average. Then from 1999 to 2006 i'd say he was good but inconsistent. But from 2007 onwards he has been consistently brilliant - I had a quick look at his scores over the past 2 years and he's averaging over 100 with a CI of about 1.
As for whether it's a good thing to be consistent, I think this is partly related to your average... the lower the players average, the more valuable the inconsistency becomes.
At the extreme end, i'd rather a player who averages 1 in 100 inns scores all of those runs in a single innings rather than 1 run in every inns.
However, at the other extreme, if a player averages 100 over 4 innings, i'd rather they were 4 innings of 100 rather than one Lara-esque inns of 400 and 3 ducks.
I hope this makes sense.
Ric's comment: I can see what you are getting at! You make a good point! You are saying it is good to be consistently good, but bad to be consistently bad!
Posted by: ram at January 30, 2009 1:02 PM
was really insane..these statistics are mere description of ones class
Posted by: Pak at January 30, 2009 1:07 PM
I feel that Ricky Ponting is underrated, as shown by this list. He is one of the most consistent batsmen and as good as Tendulkar and Lara.
Posted by: TomC at January 30, 2009 1:10 PM
In the originally proposed formula the correspondent suggested using SD (standard deviation) to measure consistency. Shouldn't SD be used only for normal distributions? Batsmen's runs follow a positively skewed distribution, i.e. where the distribution curve tapers further on the right of the curve than on the left. Is SD then the appropriate statistic?
Posted by: Ashutosh Sinha at January 30, 2009 1:11 PM
Just one question? How do you take into account the conditions of the pitch as you had mentioned this in your problem statement at the beginning of the article?
Posted by: PMG at January 30, 2009 1:22 PM
Seems that a lot of the emotion and excitement of batting is clustered around those in the high level inconsistency block.
Posted by: Henry at January 30, 2009 1:26 PM
Wouldn't skew have a dis-proportionate effect on the s.d. rather than the mean? Did you assess the distributions of the data? Perhaps a transform might reduce the impact of outliers?
Posted by: Koos van Zyl at January 30, 2009 1:55 PM
I've thought about a consistency index for batsmen before, but I do think the simple StDev/Ave calculation is too naive. For one, scoring a triple hundred, say, will seriously affect your Consistency Index in a bad way.
When I think of consistent, I think of someone who makes the runs, every time. Someone who will always cross the 40-run mark, almost guaranteed.
I don't have the resources to check which indicator would be a good one, but I have a few suggestions which might work better?
* Use the median score instead of the average score in the standard deviation calculation.
* Calculate the standard deviation only for scores less than the player's average/median.
* Maybe even a simple [#scores>50 between dismissals]/[#dismissals] will do the trick, where 50 can maybe be replaced by some other number, like average-5 or something.
* perhaps a study of the type of distribution scores take (I don't think it's the bell curve...) and using some parameter as the index?
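The first three suggestions above could be sketched in Python roughly as follows. The function names are hypothetical, the scores are assumed to be runs between dismissals, and the exact definitions (for example, dividing the below-average deviation by the average) are one possible reading of the list rather than anything tested against real careers.

```python
from math import sqrt
from statistics import mean, median


def median_based_index(scores):
    """Root-mean-square deviation from the median, divided by the median."""
    mid = median(scores)
    if mid == 0:
        raise ValueError("median is zero, so the index is undefined")
    rms = sqrt(mean([(s - mid) ** 2 for s in scores]))
    return rms / mid


def downside_index(scores):
    """Deviation measured only over scores below the player's average."""
    average = mean(scores)
    below = [s for s in scores if s < average]
    if not below:
        return 0.0  # every score equals the average
    rms = sqrt(mean([(s - average) ** 2 for s in below]))
    return rms / average


def threshold_rate(scores, threshold=50):
    """Share of dismissals preceded by more than `threshold` runs."""
    return sum(1 for s in scores if s > threshold) / len(scores)


# Invented scores, used only to show the three measures side by side.
scores = [0, 100, 50, 50]
print(median_based_index(scores))
print(downside_index(scores))
print(threshold_rate(scores))
```

A big score above the average leaves `downside_index` and `threshold_rate` unpunished, which is the behaviour the suggestions are after.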
Ric's Comment: Sure, Koos, there are many ways to do this, probably, but my gut feeling is that doing it this way gives a reasonably accurate reflection of actuality - as may do the ways you suggest.
Scoring a triple hundred will only affect your CI adversely if you don't do it regularly. If you don't, then surely you are being inconsistent! Bradman's CI is quite low, despite his high scores, because he did it regularly.
Posted by: Anonymous at January 30, 2009 1:59 PM
Actually it's not surprising to find Tendulkar down on that list.
Out of the 10000 run plus club Tendulkar has more injuries than all the rest.
I think if you take Tendulkar as preinjury ,say before 2001 and post injuries after 01/02 that's where you will get some difference.
In 2003 I think he scored some 150 runs at an avg of 15!!
I remember jokes going around that time about how even Murli had more runs and a better avg. than Tendulkar!!
Posted by: David brennan at January 30, 2009 2:01 PM
Good article. I am surprised Steve Waugh does not feature more prominently. From 1995 to the 5th Test at the Oval in 2001 he averaged 70 in first innings, and any time Australia struggled he always seemed to score at least a half century.
Posted by: A G at January 30, 2009 2:13 PM
Would be interesting to see something similar for ODI's.
Ric's comment: I can certainly do that!
Posted by: Waqas at January 30, 2009 2:29 PM
The definition of the consistency index is a little biased. Extreme values, for example a score of zero and a score of 220, will create a lot of discrepancy in the standard deviation.
In my opinion, the best way to check consistency is by the crossing rate around a particular value or values. In plain words, the higher the crossing rate, the lower the consistency.
Posted by: sdr at January 30, 2009 2:50 PM
What follows is a bit statsy:
Interesting idea. However, by computing mean and sd in this manner, you are implicitly assuming that scores can be modeled as a Gaussian distribution. The biggest problem with this is that you are not taking into account the fact that scores cannot be negative. A Gamma distribution might be more appropriate? Once fitted, you could still compute the variance. It would also better model the occasional high scores, which will be punished by the mean, sd method.
Posted by: Nishith Prabhakar at January 30, 2009 3:50 PM
Not convincing at all. What you have calculated is simply called the coefficient of variation in statistics. Again, as long as you were calling it a consistency index, it probably was acceptable. But the moment you equated that to "greatness" or the measure of how good a batsman was, the analysis became quite unnecessary.
Anyway, the coefficient of variation only measures the dispersion of the data, but should ONLY be used with ratio scales. Cricket batting scores are ordinal scale.
Ric's comment: I was measuring consistency, not greatness. The two, however, are not mutually exclusive in my view. Bradman was both great and consistent. I don't think that was an accident.
Posted by: Pratik at January 30, 2009 4:37 PM
Nice analysis of consistency, but you might want to look at cases where a batsman performs well in a crisis, but not when scoring is easy. Some players (like Steve Waugh) have played consistently well in tough situations, but have got out without scoring much in easier situations. Hence, they have a relatively poor consistency record, from a mathematical viewpoint. But, from the opening paragraph of this article I had the impression that one of the motives behind this exercise was "who brings more value" to the side. Despite Mark Waugh showing higher consistency, almost everyone would rather have Steve instead of Mark in a crisis situation.
By the way, it is interesting to see the much maligned Sourav Ganguly leading the list of Indian batsmen with more than 5000 Test runs. Perhaps dada is a better player than many would make him out to be? It is also a bit surprising to find Dravid so far down. Is this mainly due to his recent form slump? Any idea about his CI pre-2007?
Ric's comment: Dravid's CI prior to 1905 was 1.16. By the end of the 2006-07 season, it was 1.14. It is now 1.17!
I think Mark Waugh performed pretty well in crisis situations, and his low CI bears this out. I can recall some crucial innings he played in tight situations. Steve certainly did have a better reputation in this respect though, deservedly or otherwise. His CI is possibly a bit higher than expected by the high number of not outs he had, a result, some cynics would say, of often batting through the tail from a relatively low batting position.
Posted by: knight at January 30, 2009 4:40 PM
Another question is whether consistency necessarily means a better batsman. If a batsman fails in one Test but makes up with a huge hundred in the next, then chances are that he might win the second Test for his team. But someone who regularly makes around 40-50 runs cannot be called a match winner.
Ric's comment: ...unless everyone else is only scoring 20!
Posted by: Siddhartha at January 30, 2009 4:57 PM
I agree with Koos above. The way you have measured it is appropriate to measure 'consistency'
However, a far more meaningful analysis would probably come out of measuring how the standard deviation of scores below the median compares with the median. (can use any other percentile instead of median also)
After all, no one has a problem with batsmen scoring more runs than the expectation. Why not redesign the measure to disregard outliers in that direction?
Ric's comment: The whole exercise is about the presence or otherwise of outliers - why would you want to ignore them? Those high scores contribute to the mean; they therefore have to be considered when measuring consistency.
Posted by: aj at January 30, 2009 5:13 PM
i think that this is a result of ponting's consistency over the last 8 years or so. he has been exceptionally consistent. dravid wld be higher if it wasn't for last couple of years. sachin hasn't been as consistent for last 8 years, in contrast to ponting.
Posted by: V Prabhu at January 30, 2009 5:26 PM
I think that this analysis is flawed. Good batsmen are those who keep on scoring when they go above 20. Sehwag will always have a high standard deviation because all of his last 100s are above 150. I think that a better measure of consistency would be to estimate the probability that a person scores 50. Although you can do curve fitting and estimate the area of the distribution of scores below 50, a simple measure would be to compute the ratio of the number of 50-plus or 30-plus scores to all scores, or the gaps between 50-plus scores, and take the mean or standard deviation of that. That way we can say a batsman has consistently scored above 30, and so on. By your logic, a lower order batsman with a highest score of 40 will never have a standard deviation higher than 40. That does mean that he is consistent, but is that what we are looking for? Or rather, is he more consistent than Atapattu in scoring 30 runs?
Posted by: Mohamed Z. Rahaman at January 30, 2009 5:34 PM
This is useless stats. It's impossible to make such lists because you have to take each innings in the context of the playing conditions, how the rest of the batsmen performed, what kind of support the player had, etc. How do you take Wasim Akram, for example? He bats relatively late in the innings, and if his team fields first, then he must have spent enormous energy bowling, etc. And as for Chanderpaul, I don't need stats to tell me that over the last 5 years he's been the most consistent batsman in the WI if not the world. His innings must be taken in the context of the team he's playing for and the role he's asked to play, e.g. opening the batting versus batting at #5 or 6.
Posted by: Raghav Bihani at January 30, 2009 6:21 PM
If you remove 2 scores of 375 and 400* from Lara's career, it will make him a much more consistent batsman, still with over 11000 runs. But it will not make him a better player.
Once you are looking at a list of greats, the number of runs, hundreds, consistency, longevity are nothing but mere stats. All that matters is your ability to win matches for your team especially when under pressure.
Posted by: Chitraj Singh at January 30, 2009 6:55 PM
I think this is a fantastic approach to measuring a player's caliber. Oddly enough, as a project between friends, I devised the exact same model of dividing the standard deviation by the mean, but called it the "risk" of a player.
I then benchmarked this index with the overall "risk" of current batsmen who bat in the similar positions as the batsman being compared.
The problem I noticed with this approach was that an erratically poor player, i.e. high SD but low average, can get the same score as someone who is "consistent" with a high average. And as Koos van said, a triple hundred can in fact adversely affect your index, since standard deviation pays no attention to the direction of the deviation. I haven't found a solution to this, but I was considering a formula along the lines of
Risk = [1-(players score/team total)]* SD/Mean
Although this does not necessarily solve the problem, it adds credibility to scoring more. Obviously this is after you adjust for not outs like you mentioned.
Posted by: TropicalSky at January 30, 2009 8:24 PM
Consistency Index calculated via Standard Deviation is never going to be a good index. For example, a player who is consistently bad will have an excellent CI. If you still want to use Standard Deviation, then you have to categorise batsmen on the basis of the number of innings they have played (with a minimum of 50, minimum of 100 and so on) rather than the number of runs scored.
Simply speaking, with your analysis a batsman who has scored 20 to 25 runs in every innings and went on to score a total of 1000 runs in 50 innings will look like a much more consistent batsman than another who scored "at least" 20 to 25 runs in every innings, but has also converted a few of them into centuries and scored 1000 runs in a much less 30 innings.
You know what i mean?
Ric's comment: Yes, the CI I used only measures the variation of each player around their own mean. No account is taken of consistently bad v consistently good. In other words, it doesn't seek to show how good players are - just how much or little variation there is in their scoring.
Posted by: Rommel Ramotar at January 30, 2009 8:27 PM
I was pleasantly surprised to see the great Rohan Kanhai listed in your stats. Rohan was simply among the top 5 greatest batsmen of all time. Stats were never a part of his game. His daring, breathtaking strokeplay has said it all. He was always so correct and proper while beating an attack to the dust. He invented the famous falling sweep and I saw him with my own eyes playing a back-handed sweep for four against Neville McCoy of Jamaica (he went on to score 187 retired in that game). Many players including Zaheer
Posted by: deep at January 30, 2009 9:04 PM
ha ha ganguly is in the 'most consistent' list while dravid joins tendulkar in the 'least consistent' list. now i am the biggest fan of our dada, but this is certainly news to those who have followed indian cricket the last 2 decades, and the only reason is the large number of scores between 30-70 range that ganguly seems to have (and only 16 centuries in 113 tests which is quite low for a batsman of his class). while dravid and tendlya both have careers liberally interspersed with purple patches and mammoth innings which lead to lot of deviations from the mean-thus, ironically, somehow making them 'inconsistent' by the same token as it makes them consistent - (in turning out big scores regularly). Thus the really brilliant batsmen who scores 100s and 200s often will always be considered less 'consistent' (except the Don because he scored a 100 less than every two tests. So in your study, consistency is synonymous with average - read "score around your average"
Posted by: Azfar Alam at January 30, 2009 10:58 PM
Ric, I am sorry to say I am not at all impressed with this analysis. This proves nothing and just confuses people. Consistency in cricket is not about standard deviation as in mathematics. No wonder the 'most consistent' batsmen in your analysis are mostly non-batsmen or players who played little Test cricket. You could have defined consistency as players who most consistently cross a score of (say) 40, which means every time that player goes out to bat, the team can bank upon him making a good contribution. By your yardstick, perhaps the most consistent players of the modern era, Steve Waugh and Dravid, come out as the least consistent.
Ric's comment: I think you will find that players "who consistently cross a score of (say) 40" do well in this analysis. Dravid certainly doesn't come out as being the least consistent in the tables above, while I invite you to check out the number of ducks Steve Waugh made compared with the others in the 10000+ club
Posted by: David Barry at January 31, 2009 12:13 AM
sdr, using the co-efficient of variation sort-of assumes an exponential or geometric distribution, rather than a Gaussian distribution. Cricket scores are reasonably close to a geometric distribution, though it's skewed towards zero, and there are more large scores. So this should be a reasonable measure for a batsman's consistency.
Whether or not it's as good as, say, percentage of scores greater than half the average, I don't know. But I agree with Ric, those two measures would probably be well correlated.
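The point about the geometric distribution can be made concrete with a short derivation. Assume, purely as an idealisation, that a batsman is equally likely to be dismissed on every run, so that the runs between dismissals follow a geometric distribution on 0, 1, 2, ... with parameter p (a symbol introduced here for illustration). Then the ratio of standard deviation to mean works out as:

```latex
P(X = k) = (1-p)^{k}\,p, \qquad k = 0, 1, 2, \dots

\mu = \frac{1-p}{p}, \qquad
\sigma^{2} = \frac{1-p}{p^{2}} = \mu(\mu + 1)

\mathrm{CI} = \frac{\sigma}{\mu} = \sqrt{1 + \frac{1}{\mu}}
```

For a mean of 40 this gives roughly 1.01. Under that assumption, an index close to 1 is about what a batsman with a constant chance of dismissal would produce, and values well below or above 1 indicate less or more spread than the idealised model predicts.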
Posted by: Azfar Alam at January 31, 2009 12:39 AM
Ric, the reactions you are getting from the readers including me is due to the fact the word 'consistent' in Cricket has an entirely different meaning than in Mathematics or Statistics. It is just unacceptable to find Steve Waugh as the most 'inconsistent' among the top run-getters. Any such statistical analysis is done with a purpose. But from your analysis, no logical conclusions can be drawn. Don't get me wrong I am Cricket Stat buff myself.I love doing and reading this kind of analysis..and quite saddened by the news of Bill Frindall's death.
Ric's comment: Someone has to be last in any list, Azfar, and in the list you speak of, it happens to be Steve Waugh. There is a good mathematical reason for this - his scores varied more widely from his mean than did the others from theirs. I don't necessarily agree that the meaning of "consistent" in cricket is different from that in other contexts. A batsmen who scores 7 ducks in a row is surely consistent - he certainly couldn't be branded as inconsistent! I think you are taking the word consistency to mean a run of what we would consider to be "good" scores. My meaning of the word is wider than that.
Bill Frindall was certainly a legend in his own right. I loved his spirit of independence that didn't tie him to bureaucratic decisions of a statistical nature that he regarded as ridiculous.
Posted by: TropicalSky at January 31, 2009 1:18 AM
Ric, Thanks for your earlier response. I think
your analysis will look much more reasonable if you compare batsmen with similar averages. (above 50, 45 to 50, 40 to 45 etc).What do you think?
Ric's comment: I agree, but you need to be careful that the players you are lumping together in the one average range (eg 45-50) have also played around the same amount of cricket, because as someone else noted ,the CIs tend to sneak up as they play more. That's why I prepared the tables as I did, on runs, rather than average, since there is more likelihood that two players with a similar number of runs have played a similar amount of cricket.
Posted by: TropicalSky at January 31, 2009 1:46 AM
OK, How about 50 to 75 with average between 40 to 45, 45 to 50 and >50; 75 to 100 tests with average between 40 to 45, 45 to 50 and so on? That should pretty much do it; I guess? Then we can figure out who are the consistently good batsmen among a particular group.
Posted by: TropicalSky at January 31, 2009 1:50 AM
Anyways, on the whole I'm buying your
"Consistency Index: Top eight run-scorers" part of the analysis; given that they are all good batsmen who have played same amount of cricket and scored similar amount of runs; and all have averages above 50. Thanks.
Posted by: Marcus at January 31, 2009 3:46 AM
I've always just judged a batsman's consistency by their innings/50 ratio (ie. No. of innings/no. of 50+ scores). This is a little more sophisticated! But it is interesting that Mark Waugh has a higher CI than Steve, because his ratio is 3.12, whereas Steve's is 3.17 - so I'm glad I'm not completely on the wrong track!
Ric's comment: Well done! The method I have used allows us to measure the consistency of players who have never scored a 50.
Posted by: Rajin at January 31, 2009 3:52 AM
Well personally consistency depends simply on a high avg. hovering on or about 50 and your conversion rate i.e the ability to convert 50s to 100s and 100s into big scores.You see to me a batsman can score many hundreds but have an avg. of under 40 so that's not consistent also he can have an avg. close to 50 but not many 100s meaning he might have many not outs to push his avg. up so on my take it needs to be a combination of the 2.To me lara and Tendulkar is the most consistent bcuz they have very high conversion rates,gr8 avgs. and carry on to turn 100s into big ones on a fairly regular basis.
Posted by: Anand at January 31, 2009 5:09 AM
Ric, you made a good point. But if someone is consistently making 40-50s in Test matches, it's of no use. You need someone to score big hundreds to win matches. Those who consistently do that are matchwinners and great players. Saurav, Thorpe, Atherton, Arjuna Ranatunga and John Wright might be more consistent, but they never produced big hundreds consistently to win matches. So this, while an interesting analysis, does not hold much water. Yes, you can look at the 10K+ club to see who is more consistent. Maybe another way is to look at folks who are averaging between 40-45, 45-50 and 50-55, with more than a certain number of runs, and then look at the CI. That, for me, would be a real assessment of consistency.
I still liked the different analysis.
Ric's comment: I think you are underestimating the value of someone who reliably scores 40-50 in Tests. Put two of them together, and you have a partnership of 100 runs nearly every time - I think most teams would take that. As I commented before, you are right in looking at certain groups based on similar averages and runs - I chose not to present it that way, because I wanted to show the whole range of results without a long ponderous post with a squillion tables. But you can pick players of similar experience out of the tables I have presented - eg Kanhai, Lawry, Walters.
If this exercise has made you think deeply about this modelling, then I am happy - that's all I want! Thanks.
Posted by: Saurabh Somani at January 31, 2009 5:21 AM
I have often thought of doing the same exercise myself, but there is a problem with the not out scores. When you add them to the next dismissed score, what you're effectively doing is increasing the batsman's standard deviation. E.g. Sir Garry Sobers's scores have actually varied from 0 to 365, but in the analysis his scores would have a range of 0 to 490 (this is 365 + 125 that he made in the next innings). Increasing the range would perforce increase the standard deviation.
I would be very grateful if you could respond to this with your thoughts.
Ric's comment: You are absolutely right - what, in effect, I have done is to measure the variation of scores made between dismissals rather than between innings. Mostly these coincide, but as you point out, sometimes they don't. The Sobers example you refer to treats him as having one score of 490.
Other options would be to ignore not outs completely, or to treat them as dismissed scores (eg 0 not out becomes 0). I believe what I have done is better than those options.
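The treatment of not outs described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used for the tables: the function names and the `(runs, not_out)` input format are invented here, and the choice of population standard deviation (`pstdev`) rather than sample standard deviation is an assumption, since the post does not say which was used.

```python
from statistics import mean, pstdev


def runs_between_dismissals(innings):
    """Merge each not-out score into the next innings.

    `innings` is a chronological list of (runs, not_out) pairs
    (a hypothetical input format chosen for this sketch).
    A trailing not out is dropped, as described in the post.
    """
    totals = []
    carry = 0
    for runs, not_out in innings:
        carry += runs
        if not not_out:
            totals.append(carry)
            carry = 0
    return totals


def consistency_index(innings):
    """Standard deviation of runs between dismissals divided by their mean.

    Assumption: population SD is used; the post does not specify.
    """
    totals = runs_between_dismissals(innings)
    if not totals or mean(totals) == 0:
        raise ValueError("need at least one dismissal and a non-zero average")
    return pstdev(totals) / mean(totals)


# The Sobers case from the comment: 365 not out followed by 125.
print(runs_between_dismissals([(365, True), (125, False)]))  # [490]
```

With this definition, two dismissals for 50 each give an index of 0, while a duck followed by 100 gives an index of 1, even though both pairs average 50.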
Posted by: Raghu at January 31, 2009 5:36 AM
Nice.
Can you do a Sharpe ratio as well? Which is pretty much what you've done, I'd think, but not quite.
Cheers!
Posted by: Aaron at January 31, 2009 9:31 AM
Ha! No one from New Zealand will be surprised to see Mathew Sinclair waaaay ahead of the pack when it comes to inconsistency. With a double century on debut and another one soon after, he was looking likely to be a linchpin of the side for years to come. Since then, however, he's been consistently bad. The good news is he's got a few more years left to reinforce his position in the number one spot.
Although I wonder, if he stays consistently bad for another 5 years (which is quite likely), and the selectors keep giving him another go, might that actually bring his inconsistency rating down a little?
Posted by: Shriram at January 31, 2009 12:27 PM
I think the word "consistent" is generally associated with consistently good scores only. Therefore, this analysis is more about the "predictability" of a batsman. While certainly interesting, I think this analysis is geared towards answering the question "What's the confidence with which we can predict that a batsman would score his average score in a particular innings?"
Another suggestion for the similar analysis for ODIs...throw strike rates into the mix as well and see what you get!
Ric's comment: Good comment, Shriram. Not sure how the strike rates suggestion would work, though....
Posted by: Sumit Sanghai at January 31, 2009 2:14 PM
I think the simple notion of % of N+ scores where N could be 30, 50, 70, etc would do the trick. As many have already said STD deviation isn't the best performance indicator. Also, one should take the best consecutive N years (N could be 10) of a batsman to do this analysis. The reason being that some batsmen really lose form as they age and some are blooded too early which will lead to less consistency.
Posted by: Harish Raj at January 31, 2009 6:20 PM
I just glanced at the article. And I have to say I am most pleasantly surprised to see Ganguly is the most consistent of the Indian batting greats including the golden generation. Especially considering he's valued to be the least in terms of batting talent among them all. But then,numbers dunt tell the real story..!!
Posted by: GMnorm at January 31, 2009 6:29 PM
yes Ponting is very consistent vs Harbhajan
Posted by: Vinay at January 31, 2009 6:57 PM
Well, this is just another statistic that doesn't actually tell you anything. For example, the stats say that Sachin is not as consistent as we think, but Sachin Tendulkar has received the maximum number of wrong decisions. What about them? If the decisions had gone his way he would have had at least 2000 runs more than he actually has.
Posted by: KarachiFrog at January 31, 2009 6:59 PM
After looking at these stats the only conclusion I can come to is SO WHAT! When Ashley Giles figures (4th) in the list of the world's most consistent batsmen, then the list has no relevance to much at all - and given Giles's 13 'not outs' thanks to batting in the tail, it suggests that the whole concept is skewed. This is not a list of history's most valuable or useful players (though there are a number of quality batsmen in the list). Apart from Herbie Sutcliffe & Sid Barnes, none of the 'consistent' list averaged over 50 and fall into the 'high quality' bracket. Your comment "One thing we admire in our cricketers is consistency" is not necessarily shared by all - don't speak for me, Ric. I'd sooner see Afridi go out there and belt 75 off 45 balls, though he doesn't do it every innings, rather than watch some dude who plods out 25 or 30 runs time after time. The guys that are going to create cricket excitement in the coming years are Duminy & Warner & their ilk. I look forward to them doing it in Tests as well as the shorter games.
Go back to work, Ric. Your 'early retirement' from mathematics teaching seems to have suckered you into the inane. Take up golf & get a life.
Ric's comment: I'm not sure I ever claimed I was measuring "history's most valuable or useful players" - it's simply a ranking of those whose scores varied little from their average (or varied widely, depending which table you look at). Nor is it a list of cricket's excitement machines.
The golf's fine, thanks!
Posted by: waterbuffalo at January 31, 2009 7:49 PM
Interesting to see Mark Waugh at 5 in the Most Consistent list and Steve Waugh at 5 in the Least Consistent list. I have always thought that Mark Waugh was underrated whilst Steve was overrated. One only has to look at how difficult Steve made batting look and how easy Mark made it look. Add to that the dozens of LBWs not given against Steve by Aussie umpires and you can see that the reason why Steve Waugh is held in such high esteem is he simply had a far bigger mouth.
Posted by: Sankar Vasudevan at January 31, 2009 10:34 PM
It's quite surprising that many who have posted are not able to understand the exact meaning of this analysis, in spite of being well informed about statistics.
This analysis is not exclusively about pitch state, crisis situations or the quality of the attack. The more matches one plays (the runs being representative of that), the more these factors even out between players.
So by combining the standard deviation and the average together (don't look at SD alone..), we can determine to a reasonable extent, who is more consistent.
Of course, there could be possible exceptions like when a batsman scores a string of 3 zeroes in a match heading for a draw and coming up a double hundred to clinch a game. His S.D is going to be high of course and statistics cannot judge these things...but then, these are exceptions and not the rules.
Although I would agree with one of the posts here about providing the median and mode for a better analysis. Good work all in all.
Posted by: Henry at February 2, 2009 1:46 PM
"The whole exercise is about the presence or otherwise of outliers - why would you want to ignore them? Those high scores contribute to the mean; they therefore have to be considered when measuring consistency."
Well yes, but it would be nice to get more of a feel for the distribution. Two batsmen with the same CV/S.D. might have differently shaped distributions (i.e. narrow with a few large outliers, fat with no outliers). The intuition for this analysis is as follows - the utility of runs is rarely linear - obviously big hundreds are better than small hundreds, but when the score is very high, there is always a suspicion of a weak/understrength bowling attack and/or a flat pitch, and the consequent result is often a draw. Clearly you want to be as model-free as possible, but it could be acknowledged that there are multiple ways to increase one's CV/S.D.
Posted by: Charles Davis at February 2, 2009 11:15 PM
Interesting analysis, Ric, and interesting that some seem to miss the point. I suppose it is because cricket commentators (mis)use the word 'consistent' when they mean 'consistently good'.
The question of whether consistency is a desirable quality is still open: I don't know the answer, although it appears that batsmen who are predictable and those who are not can both be rated very highly by judges of the game. But perhaps someone like Lara would be less well remembered if he had traded some of those double centuries for a whole bunch of fifties, so in his case 'inconsistency' was a positive thing. Given the plight of the West Indies, 50s from Lara were almost never match-winning innings.
In Chanderpaul's case, his 'inconsistency' seems to derive from his tendency to string together consecutive not outs.
Posted by: Ambuj Saxena at February 2, 2009 11:41 PM
Interesting analysis, though I do not agree with the procedure. If I were to analyze consistency, I would have used the following formula:
C(X) = percentage of times a batsman scores less than X% of his average.
As you might have noticed, this formula allows for a lot of different standards of consistency measurement. For example, one can measure the C(50) consistency of all batsmen, which would mean calculating the percentage of times each batsman scores less than half of his average. I have not been able to figure out what would be a good value of X for the most objective analysis, and I believe this will remain subjective. What I like most about this formula is that it doesn't punish the batsman for scoring big in a few innings. It also doesn't punish the batsman for scores within a few runs of his average, and is universal enough to compare most batsmen in a single statistics pool.
Can you please do an analysis based on this formula and share the results?
Ric's comment: I've quickly used your method to determine the percentage of scores below 50% of the Test averages of Ranatunga (consistent in my analysis) and Atapattu (inconsistent), two players who had similar overall averages and aggregates. Ranatunga had 39% of his scores below 50% of his average, while Atapattu had 53%. I suppose this means (on the basis of a very small sample!) that both methods are going to produce more or less the same outcome. With my method though, every single score counts, whereas with yours, all scores below 50% of the average carry the same weight (eg, a duck carries the same weight as a score of 18) while those above it are not taken into account at all. I think your method, if applied to all players, would produce an interesting point of discussion, but I'm not sure it is as effective in determining overall consistency. Thanks for your input.
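For anyone wanting to try the C(X) measure on their own data, here is a minimal Python sketch. The function name `c_measure` and the sample scores are made up for illustration (they are not any real player's record), and the average is passed in separately because a batting average is runs per dismissal, not runs per innings.

```python
def c_measure(scores, average, x=50):
    """Percentage of scores that fall below x% of the batting average.

    `scores` is a list of individual scores and `average` the batting
    average to compare against (both supplied by the caller).
    """
    if not scores:
        raise ValueError("need at least one score")
    threshold = average * x / 100
    below = sum(1 for s in scores if s < threshold)
    return 100 * below / len(scores)


# Invented scores, average assumed to be 30: the threshold is 15,
# so only the 0 and the 10 count as failures.
print(c_measure([0, 10, 18, 40, 82], average=30, x=50))  # 40.0
```

The sketch also shows the weighting point raised in the reply: the 0 and the 10 count identically, and the 82 has no more effect than the 18.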
Posted by: keyur at February 3, 2009 5:50 AM
Good analysis, but I have a few queries regarding it. Firstly, I believe not out innings must be excluded. Even if a batsman made an unbeaten 50 last time, he does not start his next innings on 50 but on a fresh zero, and the batting conditions and opposition bowling also vary. Further, this method punishes those who have a streak of not outs, and hence not outs should be excluded. Admittedly, leaving out not outs will reduce the avg. as well, but as we are measuring consistency it shouldn't make a difference. Secondly, as has been noted in the very first comment, the CI tends to rise with the no. of innings played or no. of dismissals. I believe it is easier to maintain consistency over, say, 50 innings or dismissals than over 200. So to correct this error the consistency index as obtained by you should be further divided by the square root of the no. of dismissals (I think this is how variance is measured in stats). This will allow fair comparison of all players irrespective of the no. of innings played.
Ric's comment: I did actually do it ignoring incomplete innings - Hobbs was still top of the 5000+ group, but Bradman dropped, Clive Lloyd was second - Laird dropped to 3rd in the 1000+ group, David Hookes was top - Tendulkar and Gavaskar came down to equal Ponting, Lara was still high in the 10000+ group, Kallis was even lower. But I question the validity of doing this - whole slabs of players' careers are ignored, and anyway, does not the common old batting average measure the runs scored between dismissals, rather than in each innings? Given that, I still reckon the way I presented it is the best methodology. Thanks for your input, Keyur!
Posted by: Tom at February 7, 2009 5:07 PM
Doug Walters on the list of consistent batsmen surprised me; as much as I loved to watch Doug bat, consistency was not one of the traits he was known for. Therefore, I think some of the criticism regarding exactly what is being measured here needs looking at.
Posted by: Jeremy Gilling at February 17, 2009 3:09 AM
Two surprising omissions from the least consistent (1000 run minimum) table are Ken Rutherford (NZ) and Bill Edrich (England), both of whom had Atapattu-like horror starts to their careers.
Ric's comment: Edrich had an index of 1.18, reasonably high, and Rutherford only 1.06. The latter scored quite consistently in the second half of his career, and with only 3 centuries, had little at the top end to stretch his standard deviation.
Posted by: Andrew at February 26, 2009 10:51 PM
Mark "Audi" Waugh ranked consistent despite the Sri Lanka debacle, but Atapattu has not been able to shrug off his horror start. Presumably he was of middling consistency once he actually got going.
This tends to punish batsmen for going on with big scores - ME Waugh never made the massive scores that drag Bradman down to only 2nd place.
Maybe too technical, but how would a "semi-variance" measure go, penalising only for downside scores below the average?
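As a rough illustration of what a "semi-variance" measure might look like, the Python sketch below penalises only scores below the mean. It is one possible convention, not something calculated in the post: the function name is invented, the input is assumed to be runs between dismissals, and the squared shortfalls are averaged over all scores (some definitions divide by the number of below-mean scores instead).

```python
import math
from statistics import mean


def downside_deviation(scores):
    """Square root of the semi-variance below the mean.

    Scores at or above the mean contribute zero, so big innings are
    not penalised directly (they still raise the mean itself).
    Assumption: the sum is divided by the total number of scores.
    """
    if not scores:
        raise ValueError("need at least one score")
    avg = mean(scores)
    shortfalls = [min(0, s - avg) ** 2 for s in scores]
    return math.sqrt(sum(shortfalls) / len(scores))


# A duck and a hundred: only the duck's 50-run shortfall is counted.
print(round(downside_deviation([0, 100]), 2))  # 35.36
```

Dividing the result by the batting average would give a downside-only analogue of the consistency index, allowing players with different averages to be compared.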
Y Anantha Narayanan has over 35 years of IT background. Over the past 15 years, he has been concentrating on Cricket analysis and software development. He has been involved with StumpVision, Wisden, Hallmark Software and his own site www.thirdslip.com during this period.
David Barry was cricket-starved when teaching English in France, and
study of cricket stats was his only way to stay sane. He is now back
in Brisbane, Australia, and working towards a PhD in Physics. He once
played for the worst team in the G-division of Muscat's cricket
league.
After doing an MBA in marketing and working in an advertising agency, S Rajesh decided that his skills might be put to better use by number-crunching on cricket. He hasn’t regretted that decision in the last six years, and edits the Numbers Game column on cricinfo.com every Friday.
Andrew Samson had his moments with bat and ball, once scoring 43 and taking 3 for 14 with his legbreaks, but he was much better at arithmetic, which explains why he is where he is today. Andrew has been keeping cricket stats since the days when it used to be done with pen and paper, and has been involved in scoring/stats for Radio and TV since 1987. He has been Cricket South Africa's official statistician since 1994.
A former scientist and occasional TV quiz champion, Charles Davis now works full time at sports statistics in Melbourne.
His only real contribution to the Test record books came at age 4, when he formed part of the record 90,800 crowd
who saw West Indies at the MCG in 1961. He has two books to his credit, and claims to be the only cricket statistician
ever who has been quoted in the New York Times and in Australian Federal Parliament on the same day. Not to be
confused with the West Indian batsman Charlie Davis, especially in terms of ability.
Having just taken early retirement as a Mathematics teacher in Hobart, Ric
Finlay now fully devotes his time to recording cricket, both past and
present, for the popular CSW cricket database, along with his colleague
David Fitzgerald (www.tastats.com.au). His interest in the game is
inversely proportional to his ability as a player, but he did once score a
century after being dropped at 3 and running out three of his team-mates.
His first memory of international cricket is the 1962-63 MCC tour of
Australia, described as one of the most boring ever. Totally fascinated, he
was instantly hooked, and has never looked back. Author of three books on
cricket of a historical nature, he has provided statistics and scored for
radio and television cricket coverage since 1983.