Hack 57. Plot Histograms in Excel


Let's think of each range as a bucket. Every player-season goes into a bucket. For example, in 1959, Hank Aaron had a .354 average, so we'll put that season in the .350-.355 bucket. So, here's our plan: we'll put each player-season into a bucket, count the number of player-seasons in each bucket, and draw a graph showing (in ascending order) the number of players in each bucket. This single diagram is a histogram.



The Code

In this example, I wanted to look at the distribution of batting average. I used a table containing the total batting statistics for each player in each year (and the list of all teams for which each player played), and I called the table b_and_t. I selected only batters with enough plate appearances to qualify for a league title, and only those players who played between 1955 and 2004:

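    SELECT b.playerID, M.nameLast, M.nameFirst, b.yearID, b.teamG,
           b.teamIDs, b.AB, b.H,
           b.H / b.AB AS AVG,                  -- batting average
           b.AB + b.BB + b.HBP + b.SF AS PA    -- plate appearances
    FROM b_and_t b INNER JOIN Master M
         ON b.playerID = M.playerID
    WHERE b.yearID > 1954
      -- qualifiers only: more than 3.1 plate appearances per team game
      AND b.AB + b.BB + b.HBP + b.SF > b.teamG * 3.1;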



After running this query, I saved the results to an Excel file named batting_averages.xls.
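The qualification test in the query's WHERE clause can be sketched outside the database. This is a minimal illustration, not the author's code: the function names and the two sample player-seasons are hypothetical, and the numbers are made up rather than real statistics. Only the formula (AB + BB + HBP + SF compared with 3.1 times team games) comes from the query.

```python
def plate_appearances(ab, bb, hbp, sf):
    """Plate appearances as the query computes them: AB + BB + HBP + SF."""
    return ab + bb + hbp + sf


def qualifies(ab, bb, hbp, sf, team_games, pa_per_game=3.1):
    """True when a player-season clears the query's plate-appearance cutoff."""
    return plate_appearances(ab, bb, hbp, sf) > team_games * pa_per_game


# Hypothetical player-seasons (made-up numbers, not real statistics).
seasons = [
    {"playerID": "example01", "AB": 600, "BB": 50, "HBP": 3, "SF": 5, "H": 180, "teamG": 162},
    {"playerID": "example02", "AB": 300, "BB": 20, "HBP": 1, "SF": 2, "H": 90, "teamG": 162},
]

qualified = [s for s in seasons
             if qualifies(s["AB"], s["BB"], s["HBP"], s["SF"], s["teamG"])]
for s in qualified:
    # Batting average is hits divided by at-bats, as in the query's AVG column.
    print(s["playerID"], round(s["H"] / s["AB"], 3))
```

With a 162-game schedule the cutoff works out to 502.2 plate appearances, so only the first sample season passes the filter.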

One way to draw histograms in Excel is to use the Analysis ToolPak add-in. You can add this by selecting Add-Ins... from the Tools menu, and then selecting Analysis ToolPak. This adds a new menu item to the Tools menu, called Data Analysis, which introduces several new functions, including a Histogram function. But I find this interface confusing and inflexible, so I do something else.

Here is my method for creating a histogram:

1. In the data worksheet, create a new column called Range.

2. In the first cell of this column, use a function to round the value for which you would like to plot the distribution. The simple way to do this is to use the second argument of the ROUND function, which sets the number of decimal places to keep. In my worksheet, column I contained the value for which I wanted to calculate the distribution (batting average), so I could use a formula such as ROUND(I2,2) to round to the nearest .010. Personally, I find a bucket size of .005 to be more descriptive, so I use a trick. You can multiply a value inside the ROUND function and then divide outside the function to get buckets of almost any size. Inside the ROUND function, I multiply by the reciprocal of the bucket size; in this case, 1/.005 = 200. Outside the function, I multiply by the bucket size (which is the same as dividing by 200). In my worksheet, column I contained the average values, so I used ROUND(I2*200,0)/200 as my formula. Copy and paste this formula into every row of the worksheet. (You can double-click the bottom-right corner of the cell to do this quickly.)

3. Now, we're ready to count the number of players in each bucket. Select all of the data in the worksheet, including the new Range column. From the Data menu, select PivotTable and PivotChart Report. Select PivotChart Report and click Finish (we'll use all the defaults). We will select two fields for our pivot table. From the PivotTable Field List palette, select Range. Drag-and-drop this onto the Drop Row Fields Here part of the pivot table. Next, drag-and-drop "playerID" onto the Drop Data Item Here part of the pivot table. By default, Excel will count the number of playerIDs in the underlying data that match each range value. The pivot table is now showing the number of items in each bucket. You should see a (very ugly) graph with the number of players in each bucket.

4. Clean up the graph. (I like to erase the background fill and lines and change the width of the columns.) Figure 5-5 shows an example of a cleaned-up graph.



Figure 5-5. Histogram from a pivot chart report

Looking at the histogram, we see that the distribution looks similar to a bell curve; it skews toward the right and is centered at around .275.
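The bucketing formula from step 2 and the count from step 3 can also be sketched outside Excel. This is only an illustration under stated assumptions: the function names and the sample averages are hypothetical, and Decimal arithmetic with ROUND_HALF_UP stands in for Excel's ROUND because Python's built-in round() rounds ties to the nearest even number. Note that ROUND(I2*200,0)/200 rounds to the nearest multiple of .005, so a .354 average lands in the bucket labeled .355 (covering .3525 up to just under .3575).

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_UP


def bucket(value, bucket_size="0.005"):
    """Round value to the nearest multiple of bucket_size.

    Mirrors ROUND(I2*200, 0)/200 when bucket_size is .005.
    """
    size = Decimal(bucket_size)
    # Multiply by the reciprocal of the bucket size (divide by the size),
    # round to a whole number of buckets, then scale back.
    steps = (Decimal(str(value)) / size).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * size


def histogram(values, bucket_size="0.005"):
    """Count values per bucket, like the pivot table's count per Range value."""
    return Counter(bucket(v, bucket_size) for v in values)


# Hypothetical batting averages, for illustration only.
averages = [0.354, 0.355, 0.271, 0.2749, 0.276, 0.301]
for label, count in sorted(histogram(averages).items()):
    # One '#' per player-season gives a rough text histogram.
    print(f"{label:.3f} {'#' * count}")
```

Changing the bucket_size argument plays the same role as changing the 200 in the worksheet formula.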



Hacking the Hack

One of the nice things about calculating bins with formulas is that you can easily change the formula for binning. Here are a few suggestions for other formulas:



ROUNDDOWN(number, num_digits) and ROUNDUP(number, num_digits)
The ROUNDDOWN function rounds a number down (toward zero) to the given number of decimal places. For example, ROUNDDOWN(3.59,0) equals 3, and ROUNDDOWN(3.59,1) equals 3.5. Similarly, ROUNDUP rounds up (away from zero) to the given number of decimal places. ROUNDUP(3.59,0) equals 4, and ROUNDUP(3.59,1) equals 3.6.

LOG(number, base)
Sometimes it's useful to plot a value on a logarithmic scale, and to use logarithmic-size bins. You can combine LOG functions with ROUND functions to create variable-size bins.

CONCATENATE(...)
The CONCATENATE function doesn't compute numbers; it puts text together. If you want to explicitly list ranges (such as 3.500-3.599), you can use the CONCATENATE function to create these; for example, CONCATENATE(ROUNDDOWN(3.59,1)," to ",ROUNDUP(3.59,1)-0.01) returns 3.5 to 3.59.

If you want to take this to the next level, you can replace the bin size with a named value. (For example, name cell A1 bin_size.) This makes it easy to change the bin size dynamically and experiment with different numbers of bins.
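For readers who want to try these alternative binning formulas outside a worksheet, here is a rough Python sketch. Every function name below is hypothetical; each is only a stand-in for the Excel function it mentions, written with Decimal arithmetic to avoid floating-point surprises at bin edges. The bin_size parameter plays the role of the named cell.

```python
import math
from decimal import Decimal, ROUND_DOWN, ROUND_UP


def _step(digits):
    """Smallest unit at `digits` decimal places, e.g. 1 -> Decimal('0.1')."""
    return Decimal(1).scaleb(-digits)


def round_down(value, digits):
    """Stand-in for ROUNDDOWN(value, digits): drop extra digits (toward zero)."""
    return Decimal(str(value)).quantize(_step(digits), rounding=ROUND_DOWN)


def round_up(value, digits):
    """Stand-in for ROUNDUP(value, digits): round away from zero."""
    return Decimal(str(value)).quantize(_step(digits), rounding=ROUND_UP)


def floor_bin(value, bin_size):
    """Lower edge of the bin that holds value, for a non-negative value."""
    size = Decimal(str(bin_size))
    return (Decimal(str(value)) // size) * size


def log_bin(value, base=10):
    """Logarithmic bin: the whole-number power of `base` nearest to value."""
    return round(math.log(value, base))


def range_label(value, digits):
    """Text label such as '3.5 to 3.59', in the spirit of the CONCATENATE example.

    The 0.01 offset is hard-coded as in that example, so it suits digits=1.
    """
    low = round_down(value, digits)
    high = low + _step(digits) - Decimal("0.01")
    return f"{low} to {high}"


print(round_down(3.59, 1), round_up(3.59, 1))
print(floor_bin(0.354, 0.005))
print(log_bin(250), log_bin(5000))
print(range_label(3.59, 1))
```

One design choice differs from the worksheet formula: range_label derives the upper edge from the lower edge instead of calling round_up, so a value that sits exactly on a bin boundary (such as 3.5) still gets a sensible label. Likewise, floor_bin puts a .354 average in the bin that starts at .350, which matches the .350-.355 bucket described at the start of this hack.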

Joseph Adler







Hack 58. Go for Two

In football, when is the two-point conversion attempt the right choice? Regardless of which "chart" you're using, the problem gets even more complicated when statisticians enter the debate.

A few years back, I was enjoying watching my local professional football team as they were losing a close game. I wasn't entertained by my team's dismal performance as much as I was delighted by my team's befuddled coach as he attempted to read and understand a two-point conversion chart.



In football, after a touchdown is scored (the touchdown itself is worth six points), the scoring team has two options for scoring an "extra point" or two. Usually, the team chooses to kick a single extra point through the uprights (like a short-distance field goal), but they might also choose to "go for two" points (known as the two-point conversion), which involves the offense rushing or passing for another trip into the end zone.



At the time, as was later "confirmed" by sportswriters, it was clear that he wasn't sure how to read the chart. Specifically, when interpreting the column on the chart that listed how many points behind or ahead a team was, he thought this meant how many points ahead or behind a team would be if they made the point-after conversion.

As I mused about how an NFL head coach might never have learned to read such a chart, I began to wonder who produced this "chart" and what principles it was based on. Later, as I searched for the "official chart," I found two "official" charts, and they didn't always agree.

More recently, I ran across a chart based on a statistical analysis of the probability of possible outcomes and on the amount of time remaining (as indicated by the number of possessions remaining). This chart didn't agree with either of the earlier charts I discovered.

This hack is for you, Coach. It examines from a statistical perspective when to go for two points and when to settle for one.



Traditional Two-Point Conversion Charts

When you see a coach on TV holding a plastic laminated card and studying it before deciding whether to go for two, sportscasters like to refer to the card as the chart, though, as mentioned in the previous section, there's more than one chart in use. The slight differences might be due to the fact that one is identified as being used in the NFL and the other is identified as a classic set of standard decisions used in college football. The differences might also be based on the fact that the college chart was produced for a certain team that may have had a more aggressive or confident style. The college chart seems to play for a victory, not a tie. Though college ball now has overtime rules, they are a fairly recent development, whereas the pros have had overtime for a while.

The NFL chart is provided on Norm Hitzges' web site (Norm is a broadcaster in Dallas and an all-around sports guru) at http://www.normhitzges.com/thechart.htm. The college chart (found at http://www.NFL.com/fans/twopointconv.html) is identified as the one used in the 1970s and developed at the University of California, Los Angeles (UCLA). Table 5-14 provides the suggested decisions from both charts and is condensed a bit.

Table 5-14. Classic decision making for two-point attempts

Points behind   Behind   Behind      Ahead   Ahead
or ahead        (NFL)    (College)   (NFL)   (College)
 0              1        -           1       -
 1              1        2           2       2
 2              2        2           1       1
 3              1        1           1       1
 4              1        -           2       2
 5              2        2           2       2
 6              1        1           1       1
 7              1        1           1       1
 8              1        1           1       1
 9              1        2           1       1
10              2        1           1       1
11              1        2           2       1
12              1        2           2       2



The UCLA chart does not provide suggestions for when the score is tied or when your team is behind by four points. The NFL chart, on the other hand, is full of advice for all occasions. As discussed, the primary difference seems to be whether you're willing to play for the tie or not. UCLA clearly did not wish to play for the tie, while the NFL chart has no such hesitancy.



Modern Super-Scientific Chart

In the real world, a set of statistical probabilities controls the outcome of a sporting event, and the decision about whether to go for two or take the extra point should be based on more information than just the score and whether your team is winning or losing. In actual game situations, smart coaches take the following additional factors into account:

- The likelihood that their field goal kicker will make the field goal
- The likelihood that their team will score on a given two-point conversion play
- The current health, attitude, and skill of their players
- How many more possessions their team will receive

Past statistics show that the average NFL football team makes about 98 percent of its extra points and about 40 percent of its two-point attempts. Coaches must use their experience and intuition to gauge their players' current ability level, and a chart isn't much help on that score.
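As a quick arithmetic aside, those two success rates can be turned into average points per attempt. The snippet below is only a sketch: the function name is hypothetical, and the rates are the league-wide figures quoted above, not values for any particular team.

```python
P_KICK = 0.98  # extra-point success rate quoted in the text
P_TWO = 0.40   # two-point success rate quoted in the text


def expected_points(p_success, points):
    """Average points per attempt: chance of success times points awarded."""
    return p_success * points


kick = expected_points(P_KICK, 1)
two = expected_points(P_TWO, 2)
print(f"kick: {kick:.2f} points per attempt")
print(f"two-point try: {two:.2f} points per attempt")
```

On average points alone the kick comes out ahead (.98 versus .80), which is why the score and the number of possessions left matter so much to the decision.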

As for possessions left, however, this is exactly the type of information that decision systems based on probability need to take into account. Based on a process of working backward from the ending of a hypothetical football game that takes the probability of success on either option (98 percent for one-point plays and 40 percent for two-point plays) into account, statisticians have produced a chart based not only on the current score, but also on the total number of possessions remaining for both teams.

In a 2000 issue of Chance magazine (Vol. 13, No. 3), Harold Sackrowitz presented the results of such an analysis using a process called dynamic programming. Table 5-15 shows a portion of Dr. Sackrowitz's chart.

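Table 5-15. Modern decision making for two-point attempts

Possessions              Points behind or ahead
remaining                0  1  2  3  4  5  6  7  8  9  10  11  12
1            Behind      1  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?   ?   ?
1            Ahead       1  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?   ?   ?
2            Behind      1  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?   ?   ?
2            Ahead       1  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?   ?   ?
3            Behind      1  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?   ?   ?
3            Ahead       1  2  1  1  1  2  1  1  1  1  1   1   ?
4            Behind      1  1  2  1  1  2  1  1  2  2  2   1   ?
4            Ahead       1  2  2  1  1  2  1  1  1  1  1   1   ?
5            Behind      1  1  2  1  1  2  1  1  2  2  2   1   ?
5            Ahead       1  2  1  1  1  2  1  1  1  1  1   1   ?
6            Behind      1  1  2  1  1  2  1  1  2  2  2   1   ?
6            Ahead       1  2  2  1  1  2  1  1  1  1  1   1   ?

(1 = kick the extra point; 2 = go for two; ? = entry not recoverable from this copy)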



This two-point conversion chart is based on the branching possibilities starting at different points in the game and assuming basic probabilities of success for either an extra point or a two-point conversion. An average NFL quarter sees six possessions in total, so think of this chart as being most useful in the fourth quarter. Sackrowitz also assumes a 50 percent chance for overtime victories.



How It Works

The calculations for Table 5-15 work something like this simple example:

1. Imagine you have just scored a touchdown and are down by one point without much chance of getting the ball again.

2. You have a 98 percent chance of making an extra point kick and a 50 percent chance of winning in overtime. Going for the extra point results in a victory 49 percent of the time (.98 x .50 = .49).

3. You have a 40 percent chance of converting a two-point play, so going for two points results in a victory 40 percent of the time. Failure ends the game, and success wins the game.

4. 49 percent is better than 40 percent, so you should elect to go for the extra point. Notice that if you believe your team's chances of converting the two-point play are better than 49 percent, you should go for it. Calculations like these, but over a longer series of possessions, result in the decision tree reflected in Table 5-15.
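The four steps above can be sketched in a few lines of Python. This covers only the single-decision example (down by one, no possessions left), not the full dynamic programming analysis behind Table 5-15; the function names are hypothetical, and the default probabilities are the ones quoted in the text.

```python
def win_prob_kick(p_kick=0.98, p_overtime=0.50):
    """Kick to tie: the kick must be good and the team must then win overtime."""
    return p_kick * p_overtime


def win_prob_two(p_two=0.40):
    """Go for two: success wins the game outright, failure loses it."""
    return p_two


def best_choice(p_kick=0.98, p_two=0.40, p_overtime=0.50):
    """Return 1 (kick) or 2 (go for two), whichever gives the higher win chance."""
    return 1 if win_prob_kick(p_kick, p_overtime) >= win_prob_two(p_two) else 2


print(best_choice())            # league-average rates from the text
print(best_choice(p_two=0.55))  # a hypothetical team that converts 55 percent
```

With the default rates the kick wins out (.49 versus .40). Raising the two-point rate above .49, as in the second call, flips the decision, which is the break-even point mentioned in step 4.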

Which chart should you use the next time you find yourself coaching in a crucial football game with a key decision to make? That's up to you, but just remember that befuddled football coach I watched on TV a few years ago. Not only was he replaced the next year by Dick Vermeil, considered one of the brighter football coaches around, but it was Vermeil who helped develop the UCLA two-point conversion chart shown in Table 5-14. Now you know the rest of the story!



Hack 59. Rank with the Best of Them

There are many ways to use data to make judgments about who is best in any sport. All the intuitive ways to compare performance in individual sports have validity concerns, however.

My friends and I are a competitive lot. Our arena of combat, most recently, has been poker. On a regular basis, my friends and I gather at my home and take part in a Texas Hold 'Em poker tournament. It's an informal affair, but we all take it very seriously. The way our poker tournaments work, everyone starts with the same amount of chips, and when they are gone they are gone. There is a first one out, a last one out, and everything in between. So, for example, if seven people play, someone comes in first, second, third, fourth, fifth, sixth, and seventh.

We all think of ourselves as pretty good and, being competitive, we have longed for an objective method of comparing performance across tournaments. As one of the statisticians in the group, I took it upon myself to devise various ways of producing some sort of objective index that would allow all participants to compare their performance with each other to decide once and for all who is the best player and who is only lucky now and again. This is the story of my quest and the statistical solutions I chose. Not to give the ending away, but I learned that there is no single best solution.



How to Rank Fairly

This business of how to identify the best is a common problem


