Chapter 26. System Performance and Profiling


Keeping interactive sessions responsive

Processing batch jobs promptly

Maximizing CPU utilization[1]

Cranking through as many processes per hour as possible

Preventing any particular process from dominating CPU time

System performance degrades when one of these goals overwhelms the others. These problems are very intuitive: if there are five times the normal number of users logged into your system, chances are that your session will be less responsive than at less busy times.

Performance tuning is a multifaceted problem. At its most basic, performance issues can be looked at as being either global or local problems. Global problems affect the system as a whole and can generally be fixed only by the system administrator. These problems include insufficient RAM or hard drive space, inadequately powerful CPUs, and scanty network bandwidth. The global problems are really the result of a host of local issues, which all involve how each process on the system consumes resources. Often, it is up to the users to fix the bottlenecks in their own processes.

Global problems are diagnosed with tools that report system-wide statistics. For instance, when a system appears sluggish, most administrators run uptime (Section 26.4) to see how many processes were recently trying to run. If these numbers are significantly higher than normal usage, something is amiss (perhaps your web server has been slashdotted).

If uptime suggests increased activity, the next tool to use is either ps or top to see if you can find the set of processes causing the trouble. Because it shows you "live" numbers, top can be particularly useful in this situation. I also recommend checking the amount of available free disk space with df, since a full filesystem is often an unhappy one, and its misery spreads quickly.
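As a rough illustration of the df check, the sketch below filters the portable `df -P` report for filesystems at or above a chosen fullness. The function name `full_filesystems` and the 90 percent threshold are made up for this example, and the parsing assumes mount points without spaces.

```sh
# Print "mountpoint capacity" for every filesystem at or above PERCENT full.
# Hypothetical helper; reads `df -P` style output on standard input.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5                 # Capacity column, e.g. "95%"
        sub(/%/, "", use)        # strip the percent sign for comparison
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example: list filesystems that are at least 90 percent full.
df -P | full_filesystems 90
```

An empty result means no filesystem crossed the threshold; lowering the number gives earlier warning.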

Once particular processes have been isolated as being problematic, it's time to think locally. Process performance suffers when either there isn't enough CPU time available to finish a task (this is known as a CPU-bound process) or the process is waiting for some I/O resource (i.e., I/O-bound), such as the hard drive or network. One strategy for dealing with CPU-bound processes, if you have the source code for them, is to use a profiler like GNU's gprof. Profilers give an accounting for how much CPU time is spent in each subroutine of a given program. For instance, if I want to profile one of my programs, I'd first compile it with gcc and use the -pg compilation flag. Then I'd run the program. This creates the gmon.out data file that gprof can read. Now I can use gprof to give me a report with the following invocation:



$ gprof -b executable gmon.out

Here's an abbreviated version of the output:



Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  n
  0.00      0.00     0.00        2     0.00     0.00  d
  0.00      0.00     0.00        1     0.00     0.00  g
  0.00      0.00     0.00        1     0.00     0.00  p

Here, we see that three subroutines defined in this program (die_if_fault_occurred, get_double, and print_values) were called. In fact, the first subroutine was called twice. Because this program is neither processor- nor I/O-intensive, no significant time is shown to indicate how long each subroutine took to run. If one subroutine took a significantly longer time to run than the others, or one subroutine is called significantly more often than the others, you might want to see how you can make that problem subroutine faster. This is just the tip of the profiling iceberg. Consult your language's profiler documentation for more details.
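The compile, run, and report steps described above can be strung together in a small wrapper. This is only a sketch: the function name `profile_program` and the `.prof` suffix are invented here, and it assumes gcc and gprof are installed and that the program takes no arguments and exits normally (gmon.out is written only on a normal exit).

```sh
# Hypothetical wrapper: build a single C source file with profiling
# enabled, run it once, and print gprof's brief report.
profile_program() {
    src=$1
    [ -f "$src" ] || { echo "profile_program: no such file: $src" >&2; return 2; }
    exe=$(basename "$src" .c).prof
    gcc -pg -o "$exe" "$src" || return 1   # -pg adds the profiling hooks
    "./$exe" || return 1                   # a normal exit writes ./gmon.out
    gprof -b "./$exe" gmon.out             # -b keeps the report brief
}

# Usage (myprog.c is a placeholder name):
#   profile_program myprog.c
```

Because gmon.out lands in the current directory, running the wrapper from a scratch directory avoids overwriting an earlier profile.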

One less detailed way to look at processes is to get an accounting of how much time a program took to run in user space, in kernel space, and in real time. For this, the time (Section 26.2) command exists as part of both C and bash shells. As an external program, /bin/time gives a slightly less detailed report. No special compilation is necessary to use this program, so it's a good tool to use to get a first approximation of the bottlenecks in a particular process.

Resolving I/O-bound issues is difficult for users. Only administrators can both tweak the low-level system settings that control system I/O buffering and install new hardware, if needed. CPU-bound processes might be improved by dividing the program into smaller programs that feed data to each other. Ideally, these smaller programs can be spread across several machines. This is the basis of distributed computing.

Sometimes, you want a particular process to hog all the system resources. This is the definition of a dedicated server, like one that hosts the Apache web server or an Oracle database. Often, server software will have configuration switches that help the administrator allocate system resources based on typical usage. This, of course, is far beyond the scope of this book, but do check out Web Performance Tuning and Oracle Performance Tuning from O'Reilly for more details. For more system-wide tips, pick up System Performance Tuning, also from O'Reilly.

As with so many things in life, you can improve performance only so much. In fact, by improving performance in one area, you're likely to see performance degrade in other tasks. Unless you've got a machine that's dedicated to a very specific task, beware the temptation to over-optimize.

—JJ



26.2 Timing Programs

Two commands, time and /bin/time, provide simple timings. Their information is highly accurate, because no profiling overhead distorts the program's performance. Neither program provides any analysis on the routine or trace level. They report the total execution time, some other global statistics, and nothing more. You can use them on any program.

time and /bin/time differ primarily in that time is built into many shells, including bash. Therefore, it cannot be used in safely portable Bourne shell scripts or in makefiles. It also cannot be used if you prefer the Bourne shell (sh). /bin/time is an independent executable file and therefore can be used in any situation. To get a simple program timing, enter either time or /bin/time, followed by the command you would normally use to execute the program. For example, to time a program named analyze (that takes two command-line arguments, an input file and an output file), enter the following command:



% time analyze inputdata outputfile
9.0u 6.7s 0:30 18% 23+24k 285+148io 625pf+0w

This result (in the default C shell format) indicates that the program spent 9.0 seconds on behalf of the user (user time), 6.7 seconds on behalf of the system (system time, or time spent executing Unix kernel routines on the user's behalf), and a total of 30 seconds elapsed time. Elapsed time is the wall clock time from the moment you enter the command until it terminates, including time spent waiting for other users, I/O time, etc.
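For scripts that need the three figures separately, one approach is to ask for the POSIX -p report format, which prints one labeled value per line. The sketch below assumes bash is installed (its built-in time accepts -p and writes the report to standard error); `sleep 1` simply stands in for the program being timed.

```sh
# Capture bash's "time -p" report (written to stderr) for a stand-in command.
# LC_ALL=C keeps the decimal point a period regardless of locale.
report=$(LC_ALL=C bash -c 'time -p sleep 1' 2>&1)

# Pick the three labeled figures out of the report.
real=$(printf '%s\n' "$report" | awk '$1 == "real" { print $2 }')
user=$(printf '%s\n' "$report" | awk '$1 == "user" { print $2 }')
sys=$(printf '%s\n' "$report" | awk '$1 == "sys" { print $2 }')

echo "elapsed: ${real}s  user CPU: ${user}s  system CPU: ${sys}s"
```

A sleeping process uses almost no CPU, so the elapsed figure should come out near one second while the user and system figures stay close to zero.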

By definition, the elapsed time is greater than your total CPU time and can even be several times larger. You can set programs to be timed automatically (without typing time first) or change the output format by setting shell variables.

The example above shows the CPU time as a percentage of the elapsed time (18 percent). The remaining data reports virtual memory management and I/O statistics. The meaning varies, depending on your shell; check your online csh manual page or article.

In this example, under SunOS 4.1.1, the other fields show the amount of shared memory used, the amount of nonshared memory used (k), the number of block input and output operations (io), and the number of page faults plus the number of swaps (pf and w). The memory management figures are unreliable in many implementations, so take them with a grain of salt.

/bin/time reports only the real time (elapsed time), user time, and system time. For example:



% /bin/time analyze inputdata outputfile
       60.8 real        11.4 user         4.6 sys

[If you use a shell without a built-in time command, you can just type time. -JP] This reports that the program ran for 60.8 seconds before terminating, using 11.4 seconds of user time and 4.6 seconds of system time, for a total of 16 seconds of CPU time. On Linux and some other systems, that external time command is in /usr/bin/time and may make a more detailed report.
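The arithmetic behind these figures is simple enough to script: add user and system time to get total CPU time, then divide by the elapsed time. The helper below is a sketch with an invented name, `cpu_percent`; it assumes a nonzero elapsed time and uses the /bin/time figures from the example above.

```sh
# Hypothetical helper: CPU time as a percentage of elapsed time.
# usage: cpu_percent REAL USER SYS   (all in seconds; REAL must be nonzero)
cpu_percent() {
    awk -v real="$1" -v user="$2" -v sys="$3" \
        'BEGIN { printf "%.1f\n", 100 * (user + sys) / real }'
}

# 11.4 user + 4.6 sys = 16.0 CPU seconds out of 60.8 elapsed.
cpu_percent 60.8 11.4 4.6    # prints 26.3
```

A low percentage suggests the program spent most of its life waiting, whether for I/O or for other processes, rather than computing.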

There's a third timer on some systems: timex. It can give much more detail if your system has process accounting enabled. Check the timex(1) manpage.

—ML



26.3 What Commands Are Running and How Long Do They Take?

When your system is sluggish, you will want to see what users are on the system along with the processes they're running. To get a brief snapshot of this information, the tersely named w can show you who is logged in, from where, how long they've been idle, and what programs they're running. For instance, when I run w on my Red Hat box at home, I get this result:



  3:58pm  up 38 days,  4:37,  6 users,  load average: 0
USER     TTY      FROM              LOGIN@   IDLE   JCP
jjohn    tty2     -                13Feb02  7:03m  1.32
jjohn    pts/1    :0                8:55am  7:02m  0.06
jjohn    pts/3    :0                8:55am  0.00s 51.01
jjohn    pts/0    :0                8:55am  7:02m  0.06
jjohn    pts/4    :0                8:55am  2:25m  2:01
jjohn    pts/2    mp3.daisypark.ne Tue 4pm  3:41m  0.23

Originally, I logged in at the console and started X. Most of the sessions are xterminals except for the last, which is an ssh session. The JCPU field accounts for the CPU time used by all the processes at that TTY. The PCPU simply accounts for the process named in the WHAT field. This is a quick and simple command to show you the state of your system, and it relies on no special process accounting from the kernel.

When you're debugging a problem with a program, trying to figure out why your CPU usage bill is so high [in the days when CPU cycles were rented -JJ], or curious what commands someone (including yourself) is running, the lastcomm command on Berkeley-like Unixes can help (if your computer has its process accounting system running, that is). Here's an example that lists the user lesleys:



% date
Mon Sep  4 16:38:13 EDT 2001
% lastcomm lesleys
emacs      lesleys  ttyp1    1.41 secs Wed Sep  4
cat     X  lesleys  ttyp1    0.06 secs Wed Sep  4
stty       lesleys  ttypa    0.02 secs Wed Sep  4
tset       lesleys  ttypa    0.12 secs Wed Sep  4
sed        lesleys  ttypa    0.02 secs Wed Sep  4
hostname   lesleys  ttypa    0.00 secs Wed Sep  4
quota      lesleys  ttypa    0.16 secs Wed Sep  4
...

The processes are listed in the order completed, most recent first. The emacs process on the tty (Section 2.7) ttyp1 started 10 minutes ago and took 1.41 seconds of CPU time. Sometime while emacs was on ttyp1, lesleys ran cat and killed it (the X shows that). Because emacs ran on the same terminal as cat but finished later, Lesley might have stopped emacs with CTRL-z (Section 23.3) to run cat. The processes on ttypa are the ones run from her .cshrc and .login files (though you can't tell that from lastcomm). You don't see the login shell for ttypa (csh) here because it hasn't terminated yet; it will be listed after Lesley logs out of ttypa.

lastcomm can do more. See its manual page.

Here's a hint: on a busy system with lots of users and commands being logged, lastcomm is pretty slow. If you pipe the output or redirect it into a file, like this:



(For tee, see Section 43.8.)

% lastcomm lesleys > lesley.cmds &
% cat lesley.cmds
...nothing...
% lastcomm lesleys | tee lesley.cmds
...nothing...

the lastcomm output may be written to the file or pipe in big chunks instead of line-by-line. That can make it look as if nothing's happening. If you can tie up a terminal while lastcomm runs, there are two workarounds. If you're using a window system or terminal emulator with a "log to file" command, use it while lastcomm runs. Otherwise, to copy the output to a file, start script (Section 37.7) and then run lastcomm:



% script lesley.cmds
Script started, file is lesley.cmds
% lastcomm lesleys
emacs      lesleys  ttyp1    1.41 secs Wed Sep  4
cat     X  lesleys  ttyp1    0.06 secs Wed Sep  4
...
% exit
Script done, file is lesley.cmds
%

A final word: lastcomm can't give information on commands that are built into the shell (Section 1.9). Those commands are counted as part of the shell's execution time; they'll be in an entry for csh, sh, etc. after the shell terminates.

-JP and JJ



26.4 Checking System Load: uptime



Go to http://examples.oreilly.com/upt3 for more information on: uptime

The BSD command uptime, also available under System V Release 4, AIX, and some System V Release 3 implementations, will give you a rough estimate of the system load:



% uptime
  3:24pm  up 2 days,  2:41,  16 users,  load average: 1.90, 1

uptime reports the current time, the amount of time the system has been up, and three load average figures. The load average is a rough measure of CPU use. These three figures report the average number of processes active during the last minute, the last 5 minutes, and the last 15 minutes. High load averages usually mean that the system is being used heavily and the response time is correspondingly slow. Note that the system's load average does not take into account the priorities and niceness (Section 26.5) of the processes that are running.
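To use the three figures in a script, they first have to be separated from the rest of the line. The sketch below does that with sed; the function name `load_averages` is invented, the sample line uses made-up numbers, and the pattern assumes the label reads "load average:" or "load averages:" with a period as the decimal point.

```sh
# Hypothetical helper: read uptime-style text on stdin and print the
# 1-, 5-, and 15-minute load averages separated by spaces.
load_averages() {
    sed -n 's/.*load averages\{0,1\}: *//p' | tr -d ','
}

# Sample line with invented figures, in the layout shown above.
sample='  3:24pm  up 2 days,  2:41,  16 users,  load average: 1.90, 1.43, 1.38'
printf '%s\n' "$sample" | load_averages
```

On a live system the same helper can read the real thing with `uptime | load_averages`.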

What's high? As usual, that depends on your system. Ideally, you'd like a load average under, say, 3, but that's not always possible given what some systems are required to do. Higher load averages are usually more tolerable on machines with more than one processor. Ultimately, "high" means high enough that you don't need uptime to tell you that the system is overloaded; you can tell from its response time.
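One common way to account for multiple processors is to divide the load average by the number of CPUs, so that a value near 1.0 means roughly one runnable process per processor. The helper name `load_per_cpu` below is invented for this sketch, and treating 1.0 per CPU as a reference point is a rule of thumb, not a hard limit.

```sh
# Hypothetical helper: normalize a load average by the processor count.
# usage: load_per_cpu LOAD NCPU
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# A load average of 3.0 on a four-processor machine.
load_per_cpu 3.0 4    # prints 0.75
```

On Linux and macOS, `getconf _NPROCESSORS_ONLN` usually reports the processor count to pass as the second argument, though that variable name is not guaranteed on every Unix.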

Furthermore, different systems behave differently under the same load average. For example, on some workstations, running a single CPU-bound background job at the same time as the X Window System (Section 1.22) will bring response to a crawl even though the load average remains quite "low." In the end, load averages are significant only when they differ from whatever is "normal" on your system.

—AF



26.5 Know When to Be "nice" to Other Users...and When Not To



The BSD-System V split isn't so obvious in modern Unixes, but the different priority systems still live in various flavors. This article should help you understand the system in whatever version you have.

If you are going to run a CPU-bound (Section 26.1) process that will monopolize the CPU from other processes, you may reduce the urgency of that more intensive process in the eyes of the process scheduler by using nice before you run the program. For example:



$ nice executable_filename

On most systems, no user can directly change a process's priority (only the scheduler does that), and only the administrator can use nice to make a process more urgent. In practice, nice is rarely used on multiuser systems (the tragedy of the commons), but you may be able to get more processes running simultaneously by judicious use of this program.

If you're not familiar with Unix, you will find its definition of priority confusing: it's the opposite of what you would expect. A process with a high nice number runs at low priority, getting relatively little of the processor's attention; similarly, jobs with a low nice number run at high priority. This is why the nice number is usually called niceness: a job with a lot of niceness is very kind to the other users of your system (i.e., it runs at low priority), while a job with little niceness hogs the CPU. The term "niceness" is awkward, like the priority system itself. Unfortunately, it's the only term that is both accurate (nice numbers are used to compute priorities but are not the priorities themselves) and avoids horrible circumlocutions ("increasing the priority means lowering the priority...").

Many supposedly experienced users claim that nice has virtually no effect. Don't listen to them. As a general rule, reducing the priority of an I/O-bound job (a job that's waiting for I/O a lot of the time) won't change things very much. The system rewards jobs that spend most of their time waiting for I/O by increasing their priority. But reducing the priority of a CPU-bound process can have a significant effect. Compilations, batch typesetting programs (troff, TeX, etc.), applications that do a lot of math, and similar programs are good candidates for nice. On a moderately loaded system, I have found that nice typically makes a CPU-intensive job roughly 30 percent slower and consequently frees that much time for higher priority jobs. You can often significantly improve keyboard response by running CPU-intensive jobs at low priority.

Note that System V Release 4 has a much more complex priority system, including real-time priorities. Priorities are managed with the priocntl command. The older nice command is available for compatibility. Other Unix implementations (including HP and Concurrent) support real-time scheduling. These implementations have their own tools for managing the scheduler.

The nice command sets a job's niceness, which is used to compute its priority. It may be one of the most nonuniform commands in the universe. There are four versions, each slightly different from the others. BSD Unix has one nice that is built into the C shell, and another standalone version can be used by other shells. System V also has one nice that is built into the C shell and a separate standalone version.

Under BSD Unix, you must also know about the renice(8) command (Section 26.7); this lets you change the niceness of a job after it is running. Under System V, you can't modify a job's niceness once it has started, so there is no equivalent.

Think carefully before you nice an interactive job like a text editor. See Section 26.6.



We'll tackle the different variations of nice in order.



26.5.1 BSD C Shell nice

Under BSD Unix, nice numbers run from -20 to 20. The -20 designation corresponds to the highest priority; 20 corresponds to the lowest. By default, Unix assigns the nice number 0 to user-executed jobs. The lowest nice numbers (-20 to -17) are unofficially reserved for system processes. Assigning a user's job to these nice numbers can cause problems. Users can always request a higher nice number (i.e., a lower priority) for their jobs. Only the superuser (Section 1.18) can raise a job's priority.

To submit a job at a greater niceness, precede it with the modifier nice. For example, the following command runs an awk command at low priority:



% nice awk -f proc.awk datafile > awk.out

By default, the csh version of nice will submit this job with a nice level of 4. To submit a job with an arbitrary nice number, use nice one of these ways, where n is an integer between 0 and 20:



% nice +n command
% nice -n command

The +n designation requests a positive nice number (low priority); -n requests a negative nice number. Only a superuser may request a negative nice number.



26.5.2 BSD Standalone nice

The standalone version of nice differs from C shell nice in that it is a separate program, not a command built into the C shell. You can therefore use the standalone version in any situation: within makefiles (Section 11.10), when you are running the Bourne shell, etc. The principles are the same. nice numbers run from -20 to 20, with the default being 0. Only the syntax has been changed to confuse you. For the standalone version, -n requests a positive nice number (lower priority) and --n requests a negative nice number (higher priority; superuser only). Consider these commands:



$ nice -6 awk -f proc.awk datafile > awk.out
# nice --6 awk -f proc.awk datafile > awk.out

The first command runs awk with a high nice number (i.e., 6). The second command, which can be issued only by a superuser, runs awk with a low nice number (i.e., -6). If no level is specified, the default argument is -10.
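A quick way to see what the standalone command actually does is to have it report the niceness it hands to its child. The sketch below uses the POSIX spelling, `nice -n increment`, in place of the older `-6` form shown above, and it assumes GNU coreutils nice, which prints the current niceness when run with no command; other implementations may not support that.

```sh
# Assumes GNU coreutils nice: with no command, it prints the current niceness.
base=$(nice)                  # niceness of this shell, normally 0
default=$(nice nice)          # no increment given: nice adds 10
explicit=$(nice -n 6 nice)    # POSIX form of an increment of 6

echo "base=$base default=$default explicit=$explicit"
```

The increments are added to whatever niceness the shell already has, and the system caps the result at its maximum (19 on Linux), so the figures shift if the shell itself was started under nice.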



26.5.3 System V C Shell nice

System V takes a slightly different view of nice numbers. nice levels run from 0


