# Hack 63. Sense the Real Randomness of Life

[…] think is most likely to occur?

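Table 6-2. Coin-flip patterns, with probabilities not shown

| Pattern | Sequence | Probability |
| --- | --- | --- |
| A | … | ? |
| B | Tails, Tails, Tails, Tails, Tails | ? |
| C | … | ? |
| D | … | ? |

"The others are too ordered."

"A is more mixed up, so it's more likely."

"A looks more random, like it could really happen."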

Even though you know that coin flipping is random (assuming the coin isn't weighted), looking random doesn't make something more probable. All of these patterns of coin flips are actually equally probable, as shown by the math in Table 6-3.

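Table 6-3. Coin-flip patterns, with probabilities

| Pattern | Sequence | Probability |
| --- | --- | --- |
| A | … | 1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125 |
| B | Tails, Tails, Tails, Tails, Tails | 1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125 |
| C | … | 1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125 |
| D | … Tails | 1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125 |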

[…] flips, all possible outcomes must be equal, because each flip of the coin is independent of the other flips. In other words, the […] there is no way that the coin can know which side it is supposed to land on the next time it is flipped. A coin, like dice or a roulette wheel, has no memory.
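The independence argument can be checked with a few lines of Python. This is only a minimal sketch, assuming a fair coin (probability 1/2 for each side); the name `sequence_probability` and the example sequence `HTHTT` are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import product

P_SIDE = Fraction(1, 2)  # assumption: a fair, unweighted coin


def sequence_probability(sequence):
    """Probability of one exact sequence of independent fair flips."""
    prob = Fraction(1)
    for _ in sequence:
        prob *= P_SIDE  # each flip contributes 1/2, whatever came before
    return prob


# Every exact ordering of five flips, such as "TTTTT" or "HTHTT".
all_sequences = ["".join(flips) for flips in product("HT", repeat=5)]
probabilities = {seq: sequence_probability(seq) for seq in all_sequences}

print(len(all_sequences))       # 32
print(probabilities["TTTTT"])   # 1/32
print(probabilities["HTHTT"])   # 1/32
```

Because the loop never looks at which side came up, an orderly-looking sequence and a mixed-up one necessarily get the same probability.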

## How to Spot Random Outcomes

To know an unusual sequence of events when you see it, you need to decide whether you are supposed to be paying attention to a combination or a permutation. In probability […] Tails in any order) and the probabilities of certain permutations […] particular order). […] likely, or whether a given outcome could have occurred by […] for example, or the number of different ways of drawing five […] are possible. Here are the important distinctions between the two:

Combinations

A combination is the total number of ways that one could end up with a particular number of values when drawing randomly from some population. Coin flips are samples […] combinations varies, depending on the number of a certain value one is interested in. In other words, with five draws or […]

Permutations

Permutations are the number of ways that a given number of elements could be arranged. In other words, they are the number of exact sequences. In our coin-flip example, 5 elements that can each be 1 of 2 values results in 32 different possible orders of arrangement. So, each of the permutations shown in Table 6-3 will occur 1 out of every 32 times.

## How to Calculate Combinations

The number of possible combinations is calculated by taking the number of possible values for one draw (e.g., two values for a […] There are 32 possible combinations of 5 coin flips ($2^5$).

The equation for computing the number of ways to get a […] elements drawn from a population is:

$$\frac{n!}{r!\,(n-r)!}$$

The previous equation requires these variables:

n

The number of elements or draws (e.g., 5 coin flips).

r

[…]

!

Factorial, which means to take the number and multiply it by that number minus 1, then by that number minus 2, and so on, all the way down to 1. For example, 5! represents 5 x 4 x 3 x 2 x 1 = 120 (which, by the way, is why there are 120 possible orderings of the five cards in a poker hand [Hack #62]).

[…] is: […]

10 combinations out of 32 possible combinations means that […] coin flips, you could use the brute force method of listing all the possible […]

HHHHH THHHH HHHHT THHHT HHTTH THTTH HHTTT THTTT
HHHTH THHTH HHHTT THHTT HHTHH THTHH HHTHT THTHT
HTHHH TTHHH HTHHT TTHHT HTTTH TTTTH HTHTT TTHTT
HTTHH TTTHH HTTTT TTTTT HTHTH TTHTH HTTHT TTTHT
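Both routes to the count, the factorial formula and the brute-force listing, can be compared in a short script. This is a sketch under stated assumptions: `r` is taken to mean the number of draws showing the value of interest (for example, Tails), which is the usual reading of n! / (r! (n - r)!), and the function name `ways` is hypothetical.

```python
from itertools import product
from math import comb, factorial


def ways(n, r):
    """Number of ways to get r of one value in n two-valued draws: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))


n_flips = 5
# Brute force: list every exact sequence of five flips.
sequences = ["".join(flips) for flips in product("HT", repeat=n_flips)]

for tails in range(n_flips + 1):
    brute_force = sum(1 for seq in sequences if seq.count("T") == tails)
    # The formula, the standard library's comb(), and the listing must agree.
    assert brute_force == ways(n_flips, tails) == comb(n_flips, tails)
    print(tails, brute_force, brute_force / len(sequences))
```

For three Tails the script reports 10 of the 32 sequences, a probability of .3125, which matches the "10 combinations out of 32" mentioned above.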

## When to Be Suspicious

Deciding whether a pattern is random (i.e., what one would expect by chance) is a matter of:

- Knowing the chances of certain combinations (not permutations)
- Fighting the psychological tendency to expect chance results to not produce a recognizable pattern
- Setting a standard for how unlikely an event must be before questioning the data

Let's return to our table of coin flips, shown now in Table 6-4 […]

Table 6-4. Coin-flip outcomes and probabilities

| Order | Order probability | Outcome | Outcome probability |
| --- | --- | --- | --- |
| … Tails | .03125 | Three … | .31250 |
| Tails, Tails, Tails, Tails, Tails | .03125 | Five Tails | .03125 |
| … Tails | .03125 | Three Tails | .31250 |
| … Tails | .03125 | Four … | .15625 |

[…] 3 times for every 100 times you produce five coin flips. It is unlikely to happen by chance on a given attempt, but it will happen occasionally across a series of attempts. If it happens frequently across a series of attempts, something might be up.

What level of likelihood are you comfortable with? How rare must an event be before you decide it did not occur by chance? Scientists have set a standard of 5 percent. If study results suggest an outcome that would occur by chance only 5 percent or less of the time, it is usually considered to be significant, and is probably evidence that something other than chance is in play.
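One way to make "something might be up" concrete is to ask how likely it is to see a rare outcome at least a given number of times across a series of attempts if only chance is at work. The sketch below assumes independent attempts with a fair coin, uses the 1/32 probability of five Tails in a row, and applies the 5 percent standard. The function name `prob_at_least` and the scenario of 20 attempts are hypothetical, chosen only for illustration.

```python
from math import comb


def prob_at_least(k, attempts, p):
    """Binomial tail: chance of k or more successes in `attempts` independent tries."""
    return sum(
        comb(attempts, i) * p**i * (1 - p) ** (attempts - i)
        for i in range(k, attempts + 1)
    )


P_FIVE_TAILS = 1 / 32   # probability of Tails five times in a row with a fair coin
THRESHOLD = 0.05        # the 5 percent standard

# Hypothetical scenario: someone produces 20 sets of five flips.
for times_seen in (1, 2, 3):
    p_value = prob_at_least(times_seen, 20, P_FIVE_TAILS)
    verdict = "suspicious" if p_value <= THRESHOLD else "plausibly chance"
    print(times_seen, round(p_value, 4), verdict)
```

Under these assumptions, seeing five Tails once or twice in 20 attempts is unremarkable, while seeing it three or more times has a chance probability below 5 percent.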

You get to decide for yourself, though, when you want to accuse someone of being a cheat. Good luck on making that decision! It should result in fistfights less than 5 percent of the time.

Jill Lohmeier with Bruce Frey

# Hack 64. Spot Faked Data

If you haven't given it much thought before, it might be quite natural to assume that all digits are equally likely to show up in most random data sets. But according to Benford's law, for many types of naturally occurring data, the lower the digit, the more frequently it will occur […] check the authenticity of any data set.

In the 19th century, long before the age of electronic calculators, scientists used tables published in books to find values of logarithms. A particularly observant 19th-century astronomer and mathematician, Simon Newcomb, noticed that the pages of logarithm tables were more worn in the first pages than in the last pages. Newcomb concluded that numbers beginning with 1 occur more frequently than numbers beginning with 2, numbers beginning with 2 occur more frequently than numbers beginning with 3, and so on.

Newcomb published an empirical result based on his observations in the American Journal of Mathematics in 1881, which stated the probabilities of a number in many types of naturally occurring data, beginning with digit d for d = 1, 2, ... […] and was largely forgotten until over 50 years later when Frank Benford, a physicist at General Electric, noticed the same pattern of wear and tear of logarithm tables.

After extensive testing (20,229 observations!) on a wide variety of data including atomic weights, drainage areas of rivers, census figures, baseball statistics, and financial data, among other things, Benford published the same probability law concerning the first significant digit in the Proceedings of the American Philosophical Society (Benford, 1938). This time, the first significant digit law attracted greater attention and became known as Benford's law. Although Benford's law became fairly well known after the 1938 paper, which included substantial statistical evidence, it lacked a rigorous mathematical foundation until one was provided by Georgia Tech Mathematics professor Theodore Hill in 1996 (Hill, 1996).

Today, Benford's law is routinely applied in several areas in which naturally occurring data arise. Perhaps the most practical application of Benford's law is in detecting fraudulent data (or unintentional errors) in accounting, an application pioneered by professor Mark Nigrini (http://www.nigrini.com/).

The detection of fabricated data is important not only in accounting, but also in a wide variety of other applications (for example, clinical trials in drug testing). This hack describes Benford's law, shows you how to apply it, provides some intuitive justification on why it works, and gives some guidelines on when Benford's law can be applied.

## How It Works

In its simplest form, Benford's law states that in many naturally occurring numerical data sets, the distribution of the first (nonzero) significant digit follows a logarithmic probability distribution described as follows. Following Hill (1997), let $D_1(x)$ denote the first base 10 significant digit of a number $x$. For example, $D_1(9108) = 9$, and $D_1(0.025108) = 2$.

Then, according to Benford's law, the probability that $D_1(x) = d$, where $d$ can equal 1, 2, 3, ..., 9, is given by the following equation:

$$P(D_1(x) = d) = \log_{10}\left(1 + \frac{1}{d}\right)$$

Thus, Table 6-5 gives the probabilities of the first significant digits.

Table 6-5. Probabilities of first digits under Benford's Law

| First nonzero digit | Probability according to Benford's law |
| --- | --- |
| 1 | 0.301 |
| 2 | 0.176 |
| 3 | 0.125 |
| 4 | 0.097 |
| 5 | 0.079 |
| 6 | 0.067 |
| 7 | 0.058 |
| 8 | 0.051 |
| 9 | 0.046 |
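The probabilities in the table can be recomputed from the logarithmic form of Benford's law, log10(1 + 1/d). The following is a minimal sketch; the function name `benford_probability` is illustrative rather than taken from the text.

```python
from math import log10


def benford_probability(d):
    """Probability that the first significant digit equals d (1 through 9)."""
    if d not in range(1, 10):
        raise ValueError("first significant digit must be 1..9")
    return log10(1 + 1 / d)


for digit in range(1, 10):
    print(digit, round(benford_probability(digit), 3))
```

Rounded to three places, the nine values reproduce the table, and they sum to 1 because the product of the terms (1 + 1/d) for d = 1 to 9 telescopes to 10.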

## Laying Down the Law

To demonstrate Benford's law, I'll consider two examples that you can verify yourself.

To see Benford's law in action, open the phone book of your city or town to any page, and record the number of house numbers that begin with each nonzero decimal digit. Two pages should […] town, the relative frequencies should resemble the respective probabilities predicted by Benford's law.
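The tallying step of the phone book exercise might be sketched as follows. The helper names are hypothetical, and the short list of house numbers is made up purely to show the mechanics; it is not data from the text. The first-digit helper scans the number's decimal string for its first nonzero digit, which is one simple way to implement $D_1(x)$.

```python
from collections import Counter


def first_significant_digit(x):
    """Return the first nonzero decimal digit of a nonzero number."""
    for ch in str(x):
        if ch in "123456789":
            return int(ch)
    raise ValueError(f"{x!r} has no nonzero digit")


def first_digit_frequencies(values):
    """Relative frequency of each first digit 1..9 among the given values."""
    counts = Counter(first_significant_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Made-up house numbers for illustration only, not data from the text.
house_numbers = [12, 148, 27, 1031, 9, 305, 16, 77, 210, 1450]
print(first_digit_frequencies(house_numbers))
```

With a real sample of a few hundred house numbers, the resulting frequencies can be set side by side with the Benford probabilities.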

Table 6-6 shows results computed from the 413 house numbers taken from two pages of the 2005-2006 Narragansett/Newport/Westerly, RI Yellow Book (White Pages section).

| First nonzero digit | Relative frequency for first digit of house number | Probability according to Benford's law |
| --- | --- | --- |
| 1 | 0.334 | 0.301 |
| 2 | 0.174 | 0.176 |
| 3 | 0.143 | 0.125 |
| 4 | 0.075 | 0.097 |
| 5 | 0.073 | 0.079 |
| 6 | 0.075 | 0.067 |
| 7 | 0.046 | 0.058 |
| 8 | 0.043 | 0.051 |
| 9 | 0.036 | 0.046 |

Figure 6-1 shows the pattern more clearly.

Although the agreement with Benford's law is not perfect, you can see a reasonably good fit. If you take a larger sample of […] to the probabilities predicted by Benford's law.
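"Reasonably good fit" can be given a rough number with a chi-square goodness-of-fit statistic. The text itself does not use this test, so treat the following as one possible check rather than the author's method. It uses the relative frequencies and sample size reported for the house numbers, and borrows the 5 percent standard as the cutoff (critical value 15.507 for 8 degrees of freedom).

```python
from math import log10

N_HOUSE_NUMBERS = 413  # sample size reported for Table 6-6

# Relative frequencies of first digits 1..9 from Table 6-6.
observed = [0.334, 0.174, 0.143, 0.075, 0.073, 0.075, 0.046, 0.043, 0.036]
expected = [log10(1 + 1 / d) for d in range(1, 10)]

# Pearson chi-square statistic computed from proportions:
# n * sum((observed - expected)^2 / expected)
chi_square = N_HOUSE_NUMBERS * sum(
    (o - e) ** 2 / e for o, e in zip(observed, expected)
)

# 5 percent critical value of the chi-square distribution, 8 degrees of freedom.
CRITICAL_VALUE = 15.507

print(round(chi_square, 2))
print("consistent with Benford's law" if chi_square < CRITICAL_VALUE else "suspicious")
```

The statistic for the house-number sample falls well below the critical value, so by this yardstick the deviations from Benford's law are the kind that chance alone would produce.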

### Stock prices

The stock market is known to follow Benford's law. You can verify this yourself by obtaining up-to-the-minute NASDAQ Securities prices at http://quotes.nasdaq.com/reference/comlookup.stm.

Figure 6-2 and Table 6-7 show the relative frequencies of the first nonzero decimal digits for NASDAQ Securities as of January 27, 2006, compared to the probabilities predicted by Benford's law.

Figure 6-2. The stock market following Benford's law
