This article by Barry Meadow appeared in the December edition of Horseplayer Monthly.
Barry Meadow has spent more than 30 years in the gambling world. He wrote his first book, Success at the Harness Races, in 1967. He's also written Money Secrets at the Racetrack, which has been lauded as the definitive guide to money management at the track. Meadow's eclectic resume includes serving in Vietnam, writing television sitcoms, playing the professional tennis circuit in India, doing standup comedy in California, and, of course, playing blackjack at the professional level in his spare time.
http://www.huntingtonpress.com/go/authors/barry-meadow
A trainer
wins 18% first off the claim. A handicapping system hits 29%
winners. A jockey's year-to-date win percentage is 6%. Will any of
these stats, or others, help your bottom line? Or will they simply
mislead you?
Until
William Quirin's Winning at the Races
was published in 1979, few handicapping books offered much in the way of
statistics, mainly because compiling them was an exercise in tedium.
You'd have to buy every Racing Form, every day, and then go through each race
searching for some characteristic you wanted to research. When you
finally found a qualified selection, you'd grab a different Form to check the
chart, and then record each result. Doing even the simplest work took
incredible patience, or a staff of unpaid students.
All that
changed with the introduction not only of the personal computer, but more
recently with the availability of daily downloads. Now, for just a few
dollars a day, anyone can download every past performance line for every horse
in the nation, write a simple query, and find out if horses really do yield a
flat-bet profit if they return in exactly five days (they don't) or whether you
can make money by playing every dropper from a straight maiden into a
maiden claimer who showed early speed last out (ditto).
The gathering of horsey data is no longer much of a problem. Ask the computer a question, and it will spit out an answer.
However,
while accumulating data is one thing, interpreting it correctly is something
else altogether. The
essential problem is that while ideas should be forward-tested (you state a
hypothesis, then test it), many data miners work backwards, falling victim to
what is known as "hindsight bias." They start with
already-known results, and then look for patterns that might have contributed
to these results. Typical: A player notes that many recent winners
at his track were dropping in class, so he decides to check the last three
months' results. Sure enough, class droppers did well, but because the
survey includes the recent results that he already knows, his sample will be
skewed.
Let's look
at some basic principles. Understand these, and you won't be misled by
handicapping stats:
* The larger the sample size, the more likely the percentages are to be accurate. Conversely, anything goes when looking at tiny sample sizes.
* The less often a
result occurs and the higher the payoffs, the greater the sample size you need
to measure the validity of the idea.
* Unlike groups cannot be lumped together: 3-5 shots cannot be lumped in
with 7-1 shots.
* Check the actual
number of plays, not simply the number of races investigated to obtain those
plays.
* Rules that appear arbitrary (horse's last race must have taken place within the past 21 days, horse must go off at odds of 5-1 or above, etc.) indicate that the system came from back-fitting, with the arbitrary rules added to get rid of a bunch of losers.
* Whenever an idea
has been developed from one set of results, it must be tested on a
completely separate group of results.
* Once a result has been proven (e.g., coin flips win 50%), you can use a statistical formula known as standard deviation to predict the range of results; however, if a result is merely recorded and not proven, you cannot accurately predict the range of results since you do not know whether the result is typical or atypical.
* Return-on-investment
statistics are often skewed by a handful of longshot winners--sometimes even by
one such winner.
* Any
study of race results should look at what the usual results are for the
particular odds category, and compare the usual ratio of wins, places, and
shows to the results in question.
* Streaks, both positive and negative, often happen for no reason other than the statistical fluctuations that are part of any long mathematical series.
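The points above about sample size and standard deviation can be put into numbers. What follows is a minimal sketch, not a definitive method: it assumes every play is an independent trial with the same true win probability, which real races only approximate, and the function name `win_rate_range` is made up for illustration. It shows how wide the range of observed win percentages could be, purely by chance, if a trainer's true rate really were 18%.

```python
import math

def win_rate_range(true_rate, plays, num_sd=2.0):
    """Range of observed win rates within num_sd standard deviations.

    Assumption: each play is an independent trial with the same
    true win probability (a simplification of real racing).
    """
    sd = math.sqrt(true_rate * (1.0 - true_rate) / plays)
    low = max(0.0, true_rate - num_sd * sd)
    high = min(1.0, true_rate + num_sd * sd)
    return low, high

# How much could an 18% stat wobble at different sample sizes?
for plays in (50, 500, 5000):
    low, high = win_rate_range(0.18, plays)
    print(f"{plays:>5} plays: {low:.1%} to {high:.1%}")
```

Under these assumptions, 50 plays leaves a range of roughly 7% to 29%, which is why tiny samples say almost nothing. The calculation only works when the true rate is known; a rate that was merely recorded cannot be plugged in with the same confidence.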
Whenever
you see a handicapping statistic, ask these questions:
1. Could
it be false?
Years ago, betting every favorite lost only half the track take. However, my own survey of 400,000 more recent favorites showed conclusively that you would lose the full track take by betting every favorite today. Yet some authors still mistakenly tell their readers that the old stat remains valid.
2. Who
says so?
A man touting his own system might tell you that it had an ROI of 37% last year at Belmont. Nice (if it's true), but what about every other track? Did it lose everywhere except Belmont? Often, it's the information that isn't being revealed that is the most revealing.
3. How
many plays were there?
A sample
size of 1,000 plays for a system whose average winning payoff is $24 is just
about useless. If a guy tells you he bet 417 longshots last year
and showed a 15% profit, don't be surprised if he does the same this year and
shows a 30% loss.
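To see why high-priced systems need so many plays, here is a rough sketch of how much ROI can swing by luck alone. It rests on assumptions that are not in the article: flat $2 bets, independent plays, every winner paying exactly the average payoff (real payoffs vary, which widens the swings further), and a system that truly breaks even in the long run. The function name `roi_standard_deviation` is hypothetical.

```python
import math

def roi_standard_deviation(avg_payoff, plays, win_rate=None, bet=2.0):
    """One standard deviation of ROI over a series of flat bets.

    Assumptions: independent plays, and every winner pays exactly
    avg_payoff. If win_rate is not given, the break-even rate
    (bet / avg_payoff) is used.
    """
    if win_rate is None:
        win_rate = bet / avg_payoff  # break-even win rate
    per_play_sd = (avg_payoff / bet) * math.sqrt(win_rate * (1.0 - win_rate))
    return per_play_sd / math.sqrt(plays)

for plays in (417, 1000, 10000):
    sd = roi_standard_deviation(24.0, plays)
    print(f"{plays:>6} plays: one standard deviation of ROI is about {sd:.1%}")
```

With a $24 average payoff, one standard deviation over 1,000 plays comes to roughly 10 percentage points of ROI under these assumptions, and more than that over 417 plays, so large year-to-year swings are to be expected even when nothing about the system has changed.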
4. How
was the number derived?
Who
compiled the numbers? How far back? Which tracks? What were
the odds? What was the 1-2-3 record, and what was the expected 1-2-3
record for horses at those odds?
5. If an
ROI figure is not included, is the number of any use?
If a stat has an impact value of 2.3 (horses with the characteristic win 2.3 times their fair share of races), that's good--but if they average a $3.80 payoff, who cares?
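The arithmetic behind that "who cares?" is simple. This is a small sketch assuming flat $2 win bets; the function names and the 46% win rate are hypothetical, chosen only for illustration.

```python
def break_even_win_rate(avg_payoff, bet=2.0):
    """Win rate needed to break even at a given average payoff."""
    return bet / avg_payoff

def flat_bet_roi(win_rate, avg_payoff, bet=2.0):
    """ROI of flat bets given a win rate and an average winning payoff."""
    return win_rate * avg_payoff / bet - 1.0

needed = break_even_win_rate(3.80)
print(f"Win rate needed to break even at $3.80: {needed:.1%}")

example_rate = 0.46  # hypothetical win rate, not a figure from the article
print(f"ROI at a {example_rate:.0%} win rate: {flat_bet_roi(example_rate, 3.80):+.1%}")
```

At a $3.80 average payoff, a factor has to win more than half the time just to break even, so a strong impact value by itself says nothing about profit.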
6. If an
ROI figure is included, how many plays is it based on, and did a few big
payoffs skew the results?
A 500-play
report that shows a 7% profit is worthless if its two biggest winners accounted
for all the profit.
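One quick way to check for this is to recompute the ROI with the biggest winners removed. The sketch below assumes flat $2 bets; the payoff list is invented purely to reproduce a 500-play, 7% profit and is not real data, and the function names are hypothetical.

```python
def roi(payoffs, plays, bet=2.0):
    """ROI from a list of winning payoffs over a number of flat bets."""
    return sum(payoffs) / (plays * bet) - 1.0

def roi_without_top(payoffs, plays, top=2, bet=2.0):
    """ROI after dropping the `top` largest winning payoffs and their plays."""
    kept = sorted(payoffs)[:-top] if top > 0 else list(payoffs)
    return roi(kept, plays - top, bet)

# Invented example: 102 winners from 500 plays, two of them large.
payoffs = [90.0, 60.0] + [9.20] * 100
plays = 500

print(f"ROI, all plays:         {roi(payoffs, plays):+.1%}")
print(f"ROI, top 2 winners out: {roi_without_top(payoffs, plays):+.1%}")
```

In this made-up report the 7% profit turns into a loss once the two biggest payoffs are set aside, which is the pattern to watch for in any ROI figure.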
7. Is it
possible that the result is simply a fluke?
If horses
from post 6 showed a net profit for a particular meeting but posts 5 and 7 were
losers, it's likely the result is nothing more than a statistical
anomaly.
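Part of the reason is that any meeting offers many post positions to look at, and the more categories you inspect, the more likely one of them stands out by luck. A minimal sketch, assuming each category is independent and has the same small chance of showing a misleading profit; the 5% figure is illustrative rather than taken from the article, and the function name is hypothetical.

```python
def chance_of_at_least_one_fluke(categories, fluke_rate=0.05):
    """Probability that at least one category looks good purely by chance.

    Assumption: categories are independent and each has the same
    fluke_rate chance of producing a misleading result.
    """
    return 1.0 - (1.0 - fluke_rate) ** categories

for posts in (1, 10, 12):
    chance = chance_of_at_least_one_fluke(posts)
    print(f"{posts:>2} post positions: {chance:.0%} chance of at least one fluke")
```

Under these assumptions, ten post positions give about a 40% chance that at least one looks like a winner for no reason at all.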
8. Have
others, using different races, found similar results?
If you
based a method on the results of certain races, you need to test it on
different races - as many as possible. Better yet, have somebody else
test it.
9. Is
there evidence that the tested factor was more successful than can usually be
expected, less so, or about average?
That includes not only the win percentage, but
whether the prices were better or worse than usual.
These are starter questions. If you really want to get serious about the subject, study books like How to Lie with Statistics (Darrell Huff), Fooled by Randomness (Nassim Nicholas Taleb), Innumeracy (John Allen Paulos) and Statistics for Dummies (Deborah Rumsey).
Don't
believe everything you read - even if it's got a number attached.