Publicity for this photograph, currently on display in a San Francisco museum, set me to noodling around with numbers connected with hitting safely in 56 consecutive games, a feat performed by Joe DiMaggio during the 1941 baseball season. Many say it is the most amazing and improbable record in sports. So I wondered what the probability might actually be.

Of course you have to make assumptions. They don't have to be perfect in order to give a good idea of what magnitude of improbability we are looking at. Let us assume a batting average of .400 and four at-bats each game (over the 56-game streak, DiMaggio went 91-for-223, essentially four at-bats per game and an average of .408). It's obviously very difficult to bat for an average higher than that. In the same year that DiMaggio hit in 56 consecutive games, Ted Williams batted .406 for the season--and it hasn't been done since then. I remember reading somewhere that over the time that DiMaggio was compiling his legendary streak--it began on May 15 and ended on July 17--Williams hit for a higher average. That's not too surprising. His average for the whole season was only two points lower than DiMaggio's during the streak.

Anyway, if you have a 0.4 chance of getting a hit in each at-bat, the likelihood of getting no hits in a game in which you have four official at-bats is given by 0.6^{4}, which is 0.1296. The likelihood of getting at least one hit is therefore 1 - 0.1296 = 0.8704. If there is a 0.8704 chance of something happening in each independent trial, what is the chance of it happening 56 times in a row? That is given by (0.8704)^{56}, which is quite a small number: 0.00042--or about 1-in-2500.
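Here is a minimal Python sketch of that arithmetic, assuming (as above) independent at-bats, a fixed .400 average, and exactly four at-bats in every game. The function names are my own, made up for illustration.

```python
def p_hit_in_game(avg, at_bats=4):
    """Chance of at least one hit in a game, treating at-bats as independent."""
    p_hitless = (1 - avg) ** at_bats  # e.g. 0.6 ** 4 for a .400 hitter
    return 1 - p_hitless


def p_hit_in_every_game(avg, games=56, at_bats=4):
    """Chance of a hit in each game of one particular run of `games` games."""
    return p_hit_in_game(avg, at_bats) ** games


p_game = p_hit_in_game(0.400)
p_56 = p_hit_in_every_game(0.400)

print(f"{p_game:.4f}")  # 0.8704
print(f"{p_56:.5f}")    # 0.00042
```

Changing `avg` or `at_bats` shows how sensitive the 56-game figure is to the assumptions.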

Now, that is **not** a rough answer to the question: what is the probability of a .400 hitter ever duplicating DiMaggio's streak? DiMaggio hit in 56 straight games played from May 15 to July 16, but we'd be no less impressed if he had done it in 56 consecutive games played from late July to sometime in September. The season is now 162 games long, so a player who plays in almost every game has right around 100 different groupings of 56 consecutive games (games 1 to 56, 2 to 57, 3 to 58, and so on). Consequently, if we treat each grouping as a separate, independent trial, it seems reasonable to expect that the chance of our .400 hitter getting at least one hit in 56 straight games played within the same season is something like 1 - (2499/2500)^{100}, which is about 0.039--a little less than a 4% chance.
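The shortcut above treats the roughly 100 overlapping groupings as independent trials. They are not (games 1 to 56 and games 2 to 57 share 55 games), so the shortcut tends to overstate the chance. As a rough check, the sketch below compares it with an exact calculation under the same assumptions: a 0.8704 chance of a hit in each game, 162 games, and no games missed. The function name `p_streak_in_season` and its bookkeeping are mine, purely for illustration.

```python
def p_streak_in_season(p_game, streak=56, games=162):
    """Exact chance of at least one run of `streak` straight games with a hit,
    assuming independent games that each have probability `p_game` of a hit."""
    # state[j] = chance that the current run stands at j games
    # and no run of length `streak` has been completed so far.
    state = [0.0] * streak
    state[0] = 1.0
    done = 0.0  # chance that the streak has already been completed
    for _ in range(games):
        new = [0.0] * streak
        new[0] = sum(state) * (1 - p_game)     # hitless game resets the run
        for j in range(streak - 1):
            new[j + 1] = state[j] * p_game     # run grows by one game
        done += state[streak - 1] * p_game     # run reaches `streak` games
        state = new
    return done


p_game = 1 - 0.6 ** 4                   # 0.8704, from the text
shortcut = 1 - (2499 / 2500) ** 100     # the independent-groupings estimate
exact = p_streak_in_season(p_game)

print(f"shortcut: {shortcut:.4f}")
print(f"exact:    {exact:.4f}")
```

Under these assumptions the exact figure comes out noticeably lower than the shortcut, at well under 1% per season.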

I think the above calculations are sufficient to demystify DiMaggio's streak. Yes, it was improbable, but not freakishly so. A batter who could stay healthy and consistently hit .400 over the course of ten seasons would have a roughly 1-in-20 chance of equalling DiMaggio's record sometime over the course of the ten years. As averages fall away from .400, however, probabilities become vanishingly small. The highest lifetime batting average among active players is Ichiro Suzuki's .333. His chance of getting a hit in a game is about 0.8. His chance of getting a hit in 56 consecutive games somewhere over the course of 1500 games is about one-half of one per cent, or 1-in-200, which is just one-tenth the chance a .400 hitter has. The longest hitting streak of Ichiro's career is 27 games.
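The Ichiro figures can be reproduced with the same independent-groupings shortcut used earlier. This is a sketch under the text's assumptions (a .333 average, four at-bats per game, a per-game chance rounded to 0.8, and 1500 games played without a break); the helper name `shortcut_estimate` is hypothetical.

```python
def shortcut_estimate(p_game, streak=56, games=1500):
    """Rough chance of a `streak`-game hitting streak somewhere in `games` games,
    treating every grouping of `streak` consecutive games as independent."""
    groupings = games - streak + 1   # games 1-56, 2-57, ..., 1445-1500
    p_one_grouping = p_game ** streak
    return 1 - (1 - p_one_grouping) ** groupings


# Per-game chance for a .333 hitter with four at-bats; close to the 0.8 in the text.
ichiro_game = 1 - (1 - 0.333) ** 4
print(f"{ichiro_game:.3f}")

# Using the text's rounded value of 0.8 per game.
ichiro_career = shortcut_estimate(0.8)
print(f"{ichiro_career:.4f}")
```

With the rounded 0.8 this lands at roughly one-half of one per cent. Because the groupings overlap, it should be read as a rough estimate rather than an exact probability.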

## Comments