The Language of Numbers
There is a language of descriptive statistics used in our industry to interpret, analyze, and at times mislead investors. Though a full understanding of these dynamics often seems daunting, there is a core group of statistics that, if fully understood, will shed light on the majority of potential pitfalls, help breeders quickly sift through the hype, and improve the performance of their bloodstock portfolio. By understanding these building blocks of the language, breeders can make significantly better decisions in all areas of their breeding program. For the purposes of this article, we’ll address eight of the more commonly used statistics.
Average Earnings Per Starter - One of the most commonly used statistics to describe the earning power of a sire’s progeny, it is also one of the least discriminating and most easily skewed of all the numbers used in stallion advertising. A very simple computation, it is derived by dividing a sire’s total progeny earnings by his number of starters.
Mildly effective as a starting point for stallion evaluation, a sire’s average earnings per starter allows mare owners to get a general idea of how a sire compares to his counterparts. But because such a large portion of the stallion population falls into the $25,000 - $40,000 average earnings range, this is frequently a non-descriptive statistic for sires and tells mare owners very little. Also, this statistic is subject to being heavily skewed by the sire’s top earner, best illustrated in the case of Skip Trial, where Skip Away accounts for nearly 30% of his total progeny earnings. Knowing this, Skip Trial’s average earnings per starter of $91,704 can hardly be taken as an accurate indicator of his foals’ quality.
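To get a feel for how much one runner can move this number, the rough sketch below sets the top earner aside and recomputes the average. The $91,704 figure and the share of roughly 30% come from the Skip Trial example; the starter count of 200 is an invented placeholder, since the actual number is not given here, and the function name is ours.

```python
def average_without_top_earner(avg_per_starter, n_starters, top_share):
    """Average earnings per starter once the top earner is set aside.

    avg_per_starter: published average earnings per starter
    n_starters:      number of starters (must be at least 2)
    top_share:       fraction of total progeny earnings owed to the top earner
    """
    if n_starters < 2:
        raise ValueError("need at least two starters")
    total = avg_per_starter * n_starters       # rebuild total progeny earnings
    remaining = total * (1.0 - top_share)      # drop the top earner's share
    return remaining / (n_starters - 1)        # spread over the other starters


# $91,704 and the ~30% share come from the article's Skip Trial example;
# the starter count of 200 is a made-up figure for illustration only.
adjusted = average_without_top_earner(91_704, n_starters=200, top_share=0.30)
print(f"Average without the top earner: ${adjusted:,.0f}")
```

Under these assumptions the average falls to somewhere in the neighborhood of $64,000 to $65,000, and for any large group of starters the result stays close to 70% of the published figure.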
Median Earnings - A highly useful number for the astute breeder, the median earnings figure marks the midpoint of a sire’s progeny: half have earned more than this amount, and half have earned less. To understand this, imagine a sire with only 11 starters. Individually, they have earned the following amounts:
Starter #1 $100,000
Starter #2 $85,000
Starter #3 $80,000
Starter #4 $79,000
Starter #5 $68,000
Starter #6 $40,000
Starter #7 $21,000
Starter #8 $18,000
Starter #9 $7,000
Starter #10 $6,000
Starter #11 $3,000
In this scenario, the median earnings for this sire is $40,000. Five of his starters have earned less than $40,000, and five have earned more. This figure is a terrific indicator of a sire who gets a large number of poor individuals. If we’re researching a sire and discover his median earnings to be just $7,500, we know that half of that sire’s progeny have earned $7,500 or less and are unlikely to pay their way, indicating that investors should look elsewhere.
The obvious benefit is that median earnings are immune to heavy skewing by a single runner. The one shortcoming is that a sire lacking class in his progeny can achieve an inflated median earnings by way of durable individuals. Though they’re not fast enough to possess class, they aren’t fast enough to hurt themselves either, leading to skewed median earnings that may convince some that a sire’s foals have more class than they actually have.
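A quick way to see this immunity is to compute both numbers for the 11 imagined starters listed above, then swap the top earner for a far richer one. The sketch below uses Python’s standard statistics module; the $2,000,000 runner is invented purely for illustration.

```python
from statistics import mean, median

# Earnings of the 11 imagined starters from the example above.
earnings = [100_000, 85_000, 80_000, 79_000, 68_000, 40_000,
            21_000, 18_000, 7_000, 6_000, 3_000]

print(f"Average per starter: ${mean(earnings):,.0f}")
print(f"Median earnings:     ${median(earnings):,.0f}")

# Replace the top earner with an invented $2,000,000 runner
# to see which of the two statistics moves.
with_star = [2_000_000] + earnings[1:]

print(f"Average with a star: ${mean(with_star):,.0f}")
print(f"Median with a star:  ${median(with_star):,.0f}")
```

The average jumps from about $46,000 to about $219,000, while the median stays at $40,000 in both cases.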
Standard Starts Index (SSI) - The SSI tells us the earning power for an individual based on their average earnings per start relative to peers of the same sex and age, with 1.00 representing the average for each crop.
Because the SSI is relative to an individual’s peers, it quantifies a horse’s earning power without regard to inflation. A horse with an SSI of 1.00 in 1975 theoretically equates to a runner with an SSI of 1.00 thirty years later, in 2005. Also, the SSI effectively weeds out slow, durable types that campaign at the lower levels of racing. It is used primarily for evaluating an individual’s racing class.
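The idea behind the SSI can be sketched in a few lines. This is a simplified illustration and not the official computation: it assumes the peer average is simply total peer earnings divided by total peer starts, and every figure and name in it is invented.

```python
def standard_starts_index(horse, peers):
    """Simplified SSI-style index.

    horse: (earnings, starts) for the individual
    peers: list of (earnings, starts) for runners of the same sex and age

    Assumption: the peer average is total peer earnings divided by
    total peer starts. The official method may differ.
    """
    horse_earnings, horse_starts = horse
    peer_earnings = sum(e for e, _ in peers)
    peer_starts = sum(s for _, s in peers)
    peer_avg_per_start = peer_earnings / peer_starts
    return (horse_earnings / horse_starts) / peer_avg_per_start


# Invented crop: $90,000 earned over 30 starts, or $3,000 per start.
peers = [(60_000, 10), (20_000, 8), (10_000, 12)]

classy = standard_starts_index((45_000, 6), peers)    # $7,500 per start
durable = standard_starts_index((36_000, 24), peers)  # $1,500 per start

print(f"Lightly raced, high earner per start: {classy:.2f}")
print(f"Durable, low earner per start:        {durable:.2f}")
```

The durable runner banked $36,000 in total, not far behind the classier horse’s $45,000, yet it rates only half the crop average per start. This is how a per-start index weeds out the slow, durable type.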
Average Earnings Index (AEI) - One of the most commonly used indexes to measure the performance of a sire’s progeny relative to the progeny of other stallions, the Average Earnings Index (AEI) is the average earnings for a sire’s progeny during a calendar year, with 1.00 being the average for the breed. Like the SSI, the AEI allows comparisons of stallions from different time periods, but is subject to skewing by a leading runner. Also, it’s imperative that breeders understand that the AEI favors sires who throw durable types who can make more starts during the year, even if they are competing at the lower levels of racing.
Comparable Index (CI) - An attempt to measure the quality of mares bred to a particular stallion, the CI is the average earnings index for foals out of the same mares, but sired by different stallions. For instance, if a group of mares sent to a first-year stallion had previously produced foals with an AEI of 1.50, that same number would represent the new sire’s CI. The idea here is to quantify the quality of mares being sent to any given sire, allowing future assessments as to whether a sire is improving on his opportunities or riding the coattails of his mares.
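The comparison the CI is meant to support can be sketched as follows. This is a bare-bones illustration under stated assumptions: all earnings and the breed average are invented, and the real indexes involve more careful grouping than a single breed-wide average.

```python
def earnings_index(group_earnings, breed_average):
    """Average earnings of a group of runners relative to the breed average."""
    group_average = sum(group_earnings) / len(group_earnings)
    return group_average / breed_average


# All figures below are invented for illustration.
breed_average = 20_000                                  # average earnings per runner for the year
sire_foals = [90_000, 30_000, 15_000, 5_000]            # runners by the sire in question
same_mares_other_sires = [50_000, 40_000, 20_000, 10_000]  # foals of the same mares by other sires

aei = earnings_index(sire_foals, breed_average)             # 35,000 / 20,000
ci = earnings_index(same_mares_other_sires, breed_average)  # 30,000 / 20,000

print(f"AEI: {aei:.2f}  CI: {ci:.2f}")
if aei > ci:
    print("On these numbers, the sire is improving on his mares.")
else:
    print("On these numbers, the sire is not improving on his mares.")
```

Here the sire’s AEI of 1.75 sits above a CI of 1.50, the pattern one hopes to see from a stallion who is adding something to his mares.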
The primary shortcoming of the CI is that we never know the quality of sires previously bred to a group of mares. Such is likely the case with most highly touted stallion prospects, such as Mineshaft. Many of the mares in his first two books had previously seen the likes of Danzig, Mr. Prospector, and Seattle Slew. Those opportunities would definitely raise the earning power of the resulting foals, creating a disproportionately high CI that may create the illusion that Mineshaft is dragging his mares down. But the fact that he can’t raise his mares to the extent that Mr. Prospector did shouldn’t be held against him.
Sire Index (SI) - Similar to the AEI except that it measures earning power based on average earnings per start, and not a calendar year. Like the AEI, the SI also categorizes according to sex and year of birth, but does not allow a group of lower-level, durable types to skew a sire’s index.
One of the best illustrations of how the SI differs from the AEI is Airdrie Stud’s Indian Charlie. A known source of unsound individuals, Indian Charlie’s SI is 2.74, almost three times the breed’s average, a clear indicator that he can sire talented individuals based on his progeny’s relative earning power per start. But when we employ the AEI, measuring his progeny’s relative earning power over a calendar year, the figure drops to 2.14. Not surprisingly, his sire emulates this pattern very closely.
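The gap between a per-start index and a per-year index is easy to reproduce with invented numbers. The sketch below is only an illustration of the mechanism: it ignores the grouping by sex and year of birth, the breed averages are made up, and the runners are not Indian Charlie’s actual progeny.

```python
def per_year_index(runners, breed_earnings_per_runner):
    """AEI-style: average earnings per runner for the year, relative to the breed."""
    total_earnings = sum(e for e, _ in runners)
    return (total_earnings / len(runners)) / breed_earnings_per_runner


def per_start_index(runners, breed_earnings_per_start):
    """SI-style: average earnings per start, relative to the breed."""
    total_earnings = sum(e for e, _ in runners)
    total_starts = sum(s for _, s in runners)
    return (total_earnings / total_starts) / breed_earnings_per_start


# Invented breed averages: 8 starts per runner at $2,500 per start.
breed_earnings_per_start = 2_500
breed_earnings_per_runner = 20_000

# Invented progeny as (earnings, starts): talented but lightly raced.
fragile_sire = [(42_000, 6), (28_000, 4), (35_000, 5)]

si_style = per_start_index(fragile_sire, breed_earnings_per_start)
aei_style = per_year_index(fragile_sire, breed_earnings_per_runner)

print(f"Per-start index: {si_style:.2f}")
print(f"Per-year index:  {aei_style:.2f}")
```

These runners earn $7,000 per start, 2.8 times the invented breed figure, but because they average only five starts apiece their per-year index is just 1.75.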
Comparable Sire Index (ComSI) - Used in conjunction with a sire’s SI to measure the quality of mares being sent to a stallion, the ComSI also assigns an index based on average earnings per start, not a calendar year. The same problems that exist within the CI also pertain to the ComSI, the only difference being the unit of measurement that is averaged out for each sire.
Percentage of Stakes Winners from Foals/Starters - This is typically one of the most abused numbers in the industry. Depending on what publication you’re reading, the percentages may or may not be limited to actual black-type races. Both Thoroughbred Times and The Blood-Horse use non-black-type stakes in their annual stallion registers, often misleading newcomers about the potency of a stallion’s stakes production. A prime example of this can be found in the 2005 Thoroughbred Times Stallion Register for the stallion Truckee. He is shown to have two stakes winners from 36 starters. When looking closer, one notices that one of his ‘stakes winners’ is Appealing Wayz, winner of the Rocky Mountain Futurity in Wyoming with career earnings of just $9,103. With the minimum purse value for accredited black-type being $40,000, we clearly have a case where a sire is being given credit for a stakes winner that doesn’t meet cataloging standards. If an industry newcomer can learn just one thing early on, he or she would be wise to understand what actually constitutes a stakes race, and how most stallions’ stakes production is inflated by non-black-type stakes races.
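The effect of counting non-black-type stakes can be sketched with a simple filter. The 36 starters and the $40,000 minimum purse come from the example above; the runner names and purses are invented placeholders, and using purse as the only test for black-type is a simplification.

```python
MIN_BLACK_TYPE_PURSE = 40_000  # minimum purse cited above


def stakes_winner_pct(stakes_wins, n_starters, min_purse=0):
    """Percentage of starters with at least one stakes win at or above min_purse.

    stakes_wins: list of (horse_name, purse) pairs, one per stakes win
    """
    winners = {horse for horse, purse in stakes_wins if purse >= min_purse}
    return 100.0 * len(winners) / n_starters


# Invented records: names and purses are placeholders, not real race data.
stakes_wins = [("Runner A", 75_000), ("Runner B", 15_000)]

advertised = stakes_winner_pct(stakes_wins, n_starters=36)
black_type = stakes_winner_pct(stakes_wins, n_starters=36,
                               min_purse=MIN_BLACK_TYPE_PURSE)

print(f"All stakes winners:      {advertised:.1f}%")
print(f"Black-type winners only: {black_type:.1f}%")
```

With two stakes winners from 36 starters the advertised figure is about 5.6%. If only one of them qualifies as black-type, as assumed here, the figure is cut in half.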
As is the case with all statistical inferences, it is the responsibility of the user to become familiar with the methodology and language behind the numbers, as well as their strengths and weaknesses in accurately describing a phenomenon. Only after breeders have familiarized themselves with the appropriate statistical language can they start using these statistics effectively to avoid poor bloodstock investments.