Chapter 34. The sed Stream Editor


that" when you actually do.

sed (stream editor) amazes me. Why? It's not just that sed can edit data as it streams through a pipe (like all well-behaved Unix filters (Section 1.5) do). sed can test and branch and substitute and hold and exchange data as it streams through, but so can almost any scripting language. Maybe it's the minimalist in me that loves a tiny program (by today's standards, at least) with just a few operations, but operations so well chosen that they make the tool powerful for its size. Sure, sure, Perl probably can do everything that sed can, and do each of those things in twenty different ways. Ah, I've got it: when I'm trying to do anything more than a simple substitution on data streaming by, sed's elegant simplicity almost forces me to strip a problem to its basics, to think of what I really need to do. No functions, no libraries, nothing except beautifully simple functionality.

[As someone who learned Perl regular expressions before I learned sed, I can relate to what Jerry is saying. One of the things I like about the classic Unix toolbox programs like sed is that they really do force you into a sort of Shaker-like elegant simplicity; the best programs, no matter what the language, have a quality like a Shaker chair: pure function, but with a respect for the fact that function doesn't have to be ugly. —SJC]

End of sermon. ;-) Even if you aren't into elegance and simplicity, and you just wanna get the job done, what do we cover about sed that might be useful?

In this chapter, we start out with the basics: Section 34.2, Section 34.3, Section 34.4, Section 34.5, Section 34.6, and Section 34.7 show you how to get started, how to test your scripts, and how to structure more advanced scripts. Section 34.8 through Section 34.14 cover regular expressions and complex transformations. Section 34.15 through Section 34.24 deal with advanced topics such as multiline matching and deletions, tests, and exiting a script when you're done.

—JP and SJC



34.2 Two Things You Must Know About sed



If you are already familiar with global edits in other editors like vi or ex, you know most of what you need to know to begin to use sed. There are two things, though, that make it very different:

1. It doesn't change the file it edits. It is just what its name says: a "stream editor," designed to take a stream of data from standard input (Section 43.1) or a file, transform it, and pass it to standard output (Section 43.1). If you want to edit a file, you have to write a shell wrapper (Section 34.4) to capture standard output and write it back into your original file.

2. sed commands are implicitly global. In an editor like ex, the command:



s/old/new/

will change "old" to "new" only on the current line unless you use the global command or various addressing symbols to apply it to additional lines. In sed, exactly the opposite is true. A command like the one above will be applied to all lines in a file. Addressing symbols are used to limit the extent of the match. (However, as in ex, only the first occurrence of a pattern on a given line will be changed unless the g flag is added to the end of the substitution command.)
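Both points are easy to check at a shell prompt. The following is only a small sketch; the scratch file created with mktemp and its two lines of sample text are made up for the demonstration.

```shell
# Hypothetical scratch file, used only for this demonstration.
demo=$(mktemp)
printf 'old old\nold\n' > "$demo"

# The edit reaches every line, but only the first match on each line.
first_only=$(sed 's/old/new/' "$demo")

# With the g flag, every match on every line is changed.
every_match=$(sed 's/old/new/g' "$demo")

# sed wrote its results to standard output; the file itself is untouched.
after=$(cat "$demo")

printf '%s\n---\n%s\n---\n%s\n' "$first_only" "$every_match" "$after"
rm -f "$demo"
```

The first result still contains one "old" on its first line, the second contains none, and the third shows the file exactly as it was written.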

If all you want to do is make simple substitutions, you're ready to go. If you want to do more than that, sed has some unique and powerful commands.

This chapter makes no attempt to cover everything there is to know about sed. For the most part, this chapter simply contains advice on working with sed and extended explanations of how to use some of its more difficult commands.

—TOR



34.3 Invoking sed

If you were using sed on the fly, as a stream editor (Section 34.2), you might execute it as simply as this:

% somecommand | sed 's/old/new/' | othercommand

Given filenames, sed will read them instead of standard input:

% sed 's/old/new/' myfile

A simple script can go right on the command line. If you want to execute more than one editing command, you can use the -e option:

% sed -e 's/old/new/' -e '/bad/d' myfile

Or you can use semicolons (;), which are a sed command separator:

% sed 's/old/new/;/bad/d' myfile

Or (especially useful in shell scripts (Section 1.8)) you can use the Bourne shell's ability to understand multiline commands:

sed '
s/old/new/
/bad/d' myfile

Or you can put your commands into a file and tell sed to read that file with the -f option:



% sed -f scriptfile myfile
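The four forms above are interchangeable. As a quick check, this sketch runs the same two commands each way and keeps the results for comparison; the scratch directory and the three sample lines in myfile are assumptions made for the demonstration.

```shell
# Hypothetical scratch directory with a small input file and a script file.
dir=$(mktemp -d)
printf 'old one\nbad line\nold two\n' > "$dir/myfile"
printf 's/old/new/\n/bad/d\n' > "$dir/scriptfile"

# 1. One -e option per editing command.
with_e=$(sed -e 's/old/new/' -e '/bad/d' "$dir/myfile")

# 2. Commands separated by a semicolon.
with_semicolon=$(sed 's/old/new/;/bad/d' "$dir/myfile")

# 3. A multiline script inside one pair of quotes.
with_newlines=$(sed '
s/old/new/
/bad/d' "$dir/myfile")

# 4. The same commands read from a file.
with_f=$(sed -f "$dir/scriptfile" "$dir/myfile")

printf '%s\n' "$with_e"
rm -rf "$dir"
```

All four variables end up holding the same two lines.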

There's only one other traditional command-line option: -n. (Modern versions of sed, such as GNU sed, add others.) sed normally prints every line of its input (except those that have been deleted by the editing script). But there are times when you want only lines that your script has affected or that you explicitly ask for with the p command. In these cases, use -n to suppress the normal output.
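Here is a short sketch of the difference -n makes. The three input lines are invented for the example; the p flag on the substitution and the p command with an address are standard sed.

```shell
# Hypothetical three-line input.
input=$(printf 'alpha\nold beta\ngamma\n')

# Default behavior: every line is printed, whether or not it was edited.
all_lines=$(printf '%s\n' "$input" | sed 's/old/new/')

# -n suppresses the normal output; the p flag prints only lines
# on which the substitution succeeded.
changed_only=$(printf '%s\n' "$input" | sed -n 's/old/new/p')

# -n with an addressed p command prints just the matching lines.
matching=$(printf '%s\n' "$input" | sed -n '/^g/p')

printf '%s\n---\n%s\n---\n%s\n' "$all_lines" "$changed_only" "$matching"
```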

—TOR



34.4 Testing and Using a sed Script: checksed, runsed



All but the simplest sed scripts are often invoked from a "shell wrapper," a shell script (Section 35.2) that invokes sed and also contains the editing commands that sed executes. A shell wrapper is an easy way to turn what could be a complex command line into a single-word command. The fact that sed is being used might be transparent to users of the command.

Two shell scripts that you should immediately arm yourself with are described here. Both use a shell for loop (Section 35.21) to apply the same edits to any number of files. But the first just shows the changes, so you can make sure that your edits were made correctly. The second writes the edits back into the original file, making them permanent.



34.4.1 checksed

Go to http://examples.oreilly.com/upt3 for more information on: checksed

The shell script checksed automates the process of checking the edits that sed makes. It expects to find the script file, sedscr, in the current directory and applies these instructions to the input files named on the command line. The output is shown by a pager program; the default pager is more.



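#!/bin/sh
# Show, file by file, what the edits in sedscr would change (nothing is written).
script=sedscr

for file
do
    echo "********** < = $file > = sed output **********"
    # The "-" makes diff compare the file with sed's output on standard input.
    sed -f "$script" "$file" | diff "$file" -
done | ${PAGER-more}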

For example:

$ cat sedscr
s/jpeek@ora\.com/jpeek@jpeek.com/g
$ checksed home.html new.html
********** < = home.html > = sed output **********
102c102
< Email it or use th
---
> Email it or use
124c124
< Page created by: jpeek
---
> Page created by: jpe
********** < = new.html > = sed output **********
22c22
< Send comments to m
---
> Send comments to

If you find that your script did not produce the results you expected, perfect the editing script and run checksed again.
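To try the idea without touching real files, here is a minimal sketch of the same preview step on a scratch file. The directory, the file name page.txt, and its two lines of text are assumptions for the demonstration; only the sed-into-diff pipeline comes from checksed.

```shell
# Hypothetical scratch directory holding a script file and one input file.
dir=$(mktemp -d)
cat > "$dir/sedscr" <<'EOF'
s/jpeek@ora\.com/jpeek@jpeek.com/g
EOF
printf 'Welcome\nMail: jpeek@ora.com\n' > "$dir/page.txt"

# Preview the edits: diff compares the file with sed's output ("-").
# diff exits nonzero when it finds differences, hence the "|| true".
preview=$(sed -f "$dir/sedscr" "$dir/page.txt" | diff "$dir/page.txt" - || true)
printf '%s\n' "$preview"

# The input file has not been changed by the preview.
unchanged=$(cat "$dir/page.txt")
rm -rf "$dir"
```

Only the second line shows up in the diff, and page.txt still holds the old address afterward.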



34.4.2 runsed

Go to http://examples.oreilly.com/upt3 for more information on: runsed

The shell script runsed was developed to make changes to a file permanently. It applies your sedscr to an input file, creates a temporary file, then copies that file over the original. runsed has several safety checks:

• It won't edit the sed script file (if you accidentally include sedscr on the command line).

• It complains if you try to edit an empty file or something that isn't a file (like a directory).

• If the sed script doesn't produce any output, runsed aborts instead of emptying your original file.



runsed only modifies a file if your sedscr made edits. So, the file's timestamp (Section 8.2) won't change if the file's contents weren't changed.

Like checksed, runsed expects to find a sed script named sedscr in the directory where you want to make the edits. Supply the name or names of the files to edit on the command line. Of course, shell metacharacters (Section 33.2) can be used to specify a set of files:



$ runsed *.html
runsed: editing home.html:
runsed: done with home.html
runsed: editing new.html:
runsed: done with new.html
runsed: all done

runsed does not protect you from imperfect editing scripts. You should use checksed first to verify your changes before actually making them permanent with runsed. (You could also modify runsed to keep backup copies of the original versions.)
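The runsed script itself isn't reproduced in this section. As a rough sketch only, a wrapper with the safety checks described above might look something like the following. The function name runsed_sketch, its messages, and the use of mktemp and cmp are assumptions made for illustration; this is not the script from the examples archive.

```shell
#!/bin/sh
# Hypothetical sketch of a runsed-style wrapper; NOT the book's actual script.
script=sedscr

runsed_sketch() {
    for file
    do
        # Refuse to edit the sed script itself.
        if [ "$file" = "$script" ]; then
            echo "runsed_sketch: not editing $script itself" >&2
            continue
        fi
        # Only non-empty regular files are edited.
        if [ ! -f "$file" ] || [ ! -s "$file" ]; then
            echo "runsed_sketch: $file is empty or not a regular file" >&2
            continue
        fi
        tmp=$(mktemp) || return 1
        sed -f "$script" "$file" > "$tmp"
        # An empty result probably means a broken script: stop here.
        if [ ! -s "$tmp" ]; then
            echo "runsed_sketch: no output for $file; aborting" >&2
            rm -f "$tmp"
            return 1
        fi
        # Copy back only if something changed, so timestamps stay meaningful.
        if cmp -s "$file" "$tmp"; then
            echo "runsed_sketch: $file needs no changes"
        else
            cp "$tmp" "$file"
            echo "runsed_sketch: done with $file"
        fi
        rm -f "$tmp"
    done
}

# Demonstration in a scratch directory (all names here are made up).
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 's/old/new/\n' > sedscr
printf 'old text\n' > a.txt
printf 'nothing to do\n' > b.txt
runsed_sketch a.txt b.txt sedscr
```

In the demonstration, a.txt is rewritten, b.txt is left alone because the edits don't change it, and sedscr is skipped. Keeping a backup copy would be a matter of one extra cp before the copy-back.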

—DD, JP, and TOR



34.5 sed Addressing Basics

A sed command can specify zero, one, or two addresses. An address can be a line number, a line addressing symbol, or a regular expression (Section 32.4) that describes a pattern.

• If no address is specified, the command is applied to each line.

• If there is only one address, the command is applied to any line matching the address.

• If two comma-separated addresses are specified, the command is performed on the first matching line and all succeeding lines up to and including a line matching the second address. This range may match multiple times throughout the input.

• If an address is followed by an exclamation mark (!), the command is applied to all lines that do not match the address.

To illustrate how addressing works, let's look at examples using the delete command, d. A script consisting of simply the d command and no address:

d

produces no output since it deletes all lines.

When a line number is supplied as an address, the command affects only that line. For instance, the following example deletes only the first line:



1d

The line number refers to an internal line count maintained by sed. This counter is not reset for multiple input files. Thus, no matter how many files were specified as input, there is only one line 1 in the input stream.

Similarly, the input stream has only one last line. It can be specified using the addressing symbol, $. The following example deletes the last line of input:



$d

The $ symbol should not be confused with the $ used in regular expressions, where it means the end of the line.
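To see that line 1 and $ refer to the whole input stream rather than to each file, you can try something like this sketch. The two scratch files and their contents are invented for the demonstration.

```shell
# Two hypothetical input files of two lines each.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a"
printf 'b1\nb2\n' > "$dir/b"

# 1d removes only the first line of the first file.
first_deleted=$(sed '1d' "$dir/a" "$dir/b")

# $d removes only the last line of the last file.
# (Single quotes keep the shell from treating $d as a variable.)
last_deleted=$(sed '$d' "$dir/a" "$dir/b")

printf '%s\n---\n%s\n' "$first_deleted" "$last_deleted"
rm -rf "$dir"
```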

When a regular expression is supplied as an address, the command affects only the lines matching that pattern. The regular expression must be enclosed by slashes (/). The following delete command:



/^$/d

deletes only blank lines, that is, lines with no characters on them at all. All other lines are passed through untouched.
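A small sketch of what that pattern does and doesn't match; the four input lines are made up, and the third one holds a single space.

```shell
# Input: "one", an empty line, a line holding one space, and "two".
kept=$(printf 'one\n\n \ntwo\n' | sed '/^$/d')

# Only the truly empty line is gone; the line with a space is still there.
printf '%s\n' "$kept" | sed -n 'l'
```

The final sed -n 'l' just lists each surviving line with a visible end-of-line marker, so the space-only line is easy to spot.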



If you supply two addresses, you specify a range of lines over which the command is executed. The following example shows how to delete all lines surrounded by a pair of XHTML tags, in this case, <ul> and </ul>, that mark the start and end of an unordered list:

/^<ul>/,/^<\/ul>/d

It deletes all lines beginning with the line matched by the first pattern up to and including the line matched by the second pattern. Lines outside this range are not affected. If there is more than one list (another pair of <ul> and </ul> after the first), those lists will also be deleted.

The following command deletes from line 50 to the last line in the file:

50,$d

You can mix a line address and a pattern address:

1,/^$/d

This example deletes from the first line up to the first blank line, which, for instance, will delete the header from an email message.

You can think of the first address as enabling the action and the second address as disabling it. sed has no way of looking ahead to determine if the second match will be made. The action will be applied to lines once the first match is made. The command will be applied to all subsequent lines until the second match is made. In the previous example, if the file did not contain a blank line, then all lines would be deleted.
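The following sketch shows both outcomes. The two tiny "messages" are invented; the second one deliberately has no blank line.

```shell
# A message with a header, a blank line, and a body.
body_only=$(printf 'From: a\nSubject: b\n\nbody\n' | sed '1,/^$/d')

# The same text without the blank line: the second address never
# matches, so the range runs to the end and everything is deleted.
no_blank=$(printf 'From: a\nSubject: b\nbody\n' | sed '1,/^$/d')

printf 'first: [%s]\nsecond: [%s]\n' "$body_only" "$no_blank"
```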

An exclamation mark following an address reverses the sense of the match. For instance, the following script deletes all lines except those inside XHTML unordered lists:

/^<ul>/,/^<\/ul>/!d
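As a sketch of the reversed match, the same kind of made-up document can be reduced to just its lists. The sample lines are assumptions for the demonstration.

```shell
# Hypothetical document containing two unordered lists.
doc=$(printf '%s\n' 'intro' '<ul>' '<li>one</li>' '</ul>' \
                    'middle' '<ul>' '<li>two</li>' '</ul>' 'end')

# The ! applies d to every line that is NOT inside a list.
lists_only=$(printf '%s\n' "$doc" | sed '/^<ul>/,/^<\/ul>/!d')

printf '%s\n' "$lists_only"
```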

Curly braces ({}) let you give more than one command with an address. For example, to search every line of a list, capitalize the word Caution on any of those lines, and delete any line with <br />:

/^<ul>/,/^<\/ul>/{
s/Caution/CAUTION/g
/<br \/>/d
}
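Here is a sketch of a grouped command at work. The five sample lines are invented, and <br /> is used as the tag to delete purely as an assumption for the demonstration.

```shell
# Hypothetical input: one line outside the list, then a short list.
doc=$(printf '%s\n' 'Caution: outside' '<ul>' '<li>Caution: hot</li>' \
                    '<br />' '</ul>')

# Both commands inside the braces apply only to lines within the list.
edited=$(printf '%s\n' "$doc" | sed '/^<ul>/,/^<\/ul>/{
s/Caution/CAUTION/g
/<br \/>/d
}')

printf '%s\n' "$edited"
```

The Caution outside the list is left in lowercase because it falls outside the address range.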

—DD



34.6 Order of Commands in a Script

Combining a series of edits in a script can have unexpected results. You might not think of the consequences one edit can have on another. New users typically think that sed applies an individual editing command to all lines of input before applying the next editing command. But the opposite is true. sed applies every editing command to the first input line before reading the second input line and applying the editing script to it. Because sed is always working with the latest version of the original line, any edit that is made changes the line for subsequent commands. sed doesn't retain the original. This means that a pattern that might have matched the original input line may no longer match the line after an edit has been made.

Let's look at an example that uses the substitute command. Suppose someone quickly wrote the following script to change pig to cow and cow to horse:

s/pig/cow/
s/cow/horse/

The first command would change pig to cow as expected. However, when the second command changed cow to horse on the same line, it also changed the cow that had been a pig. So, where the input file contained pigs and cows, the output file has only horses!

This mistake is simply a problem of the order of the commands in the script. Reversing the order of the commands (changing cow into horse before changing pig into cow) does the trick.
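You can watch the effect with a two-line input. This is only a sketch; the input, one animal per line, is made up for the demonstration.

```shell
# Hypothetical input: one pig and one cow.
animals=$(printf 'pig\ncow\n')

# Wrong order: the pig becomes a cow, and then that cow becomes a horse.
wrong_order=$(printf '%s\n' "$animals" | sed 's/pig/cow/;s/cow/horse/')

# Right order: cows become horses first, and only then do pigs become cows.
right_order=$(printf '%s\n' "$animals" | sed 's/cow/horse/;s/pig/cow/')

printf 'wrong:\n%s\nright:\n%s\n' "$wrong_order" "$right_order"
```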

Another way to deal with this effect is to use a pattern you know won't be in the document except when you put it there, as a temporary placeholder. Either way, you know what the "document" looks like after each step in the program.

s/pig/cXXXoXXXw/
s/cow/horse/
s/cXXXoXXXw/cow/
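A quick sketch of the placeholder script on the same kind of made-up two-line input shows that the original command order can stay as it was:

```shell
# The placeholder cXXXoXXXw does not contain the string "cow", so the
# middle command cannot touch the lines that started out as pigs.
placeholder=$(printf 'pig\ncow\n' | sed -e 's/pig/cXXXoXXXw/' \
                                        -e 's/cow/horse/' \
                                        -e 's/cXXXoXXXw/cow/')

printf '%s\n' "$placeholder"
```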

Some sed commands change the flow through the script. For example, the N command (Section 34.16) reads another line into the pattern space without removing the current line, so you can test for patterns across multiple lines. Other commands tell sed to exit before reaching the bottom of the script or to go to a labeled command. sed also maintains a second temporary buffer called the hold space. You can copy the contents of the pattern space to the hold space and retrieve it later. The commands that make use of the hold space are discussed in Section 34.14 and other articles after it.
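As a small taste of both ideas, here is a sketch on a made-up four-line input. The h, n, and G commands it uses are standard sed commands that this section doesn't name; they appear here only for illustration.

```shell
# N appends the next input line to the pattern space, so a single s
# command can see (and replace) the newline between the two lines.
joined=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')

# h copies the pattern space to the hold space, n replaces the pattern
# space with the next input line, G appends the held line to it, and
# p prints the result, so each pair of lines comes out swapped.
swapped=$(printf 'a\nb\nc\nd\n' | sed -n 'h;n;G;p')

printf '%s\n---\n%s\n' "$joined" "$swapped"
```

The input has an even number of lines on purpose, so every N and n finds a next line to read.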

—DD



34.7 One Thing at a Time

I find that when I begin to tackle a problem using sed, I do best if I make a mental list of all the things I want to do. When I begin coding, I write a script containing a single command that does one thing. I test that it works, then I add another command, repeating this cycle until I've done all that's obvious to do. I say what's obvious because my list is not always complete, and the cycle of implement-and-test often adds other items to the list. Another approach involves actually typing the list of tasks into a file, as comments, and then slowly replacing them with sed commands. If you're one of the rare but highly appreciated breed that actually documents their code, you can just leave the comments in the script or expand on them.
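The comments-first approach might go something like this sketch. The task list, the input lines, and the scratch directory are all invented for the demonstration.

```shell
# Hypothetical scratch directory and input file.
dir=$(mktemp -d)
printf 'old pig\nbad apple\nold cow\n' > "$dir/input"

# Pass 1: the whole task list is typed in as comments; one step is coded.
cat > "$dir/sedscr" <<'EOF'
# Step 1: change "old" to "new".
s/old/new/
# Step 2 (not written yet): delete lines containing "bad".
EOF
pass1=$(sed -f "$dir/sedscr" "$dir/input")

# Pass 2: step 1 has been tested, so the next comment gets its command.
cat > "$dir/sedscr" <<'EOF'
# Step 1: change "old" to "new".
s/old/new/
# Step 2: delete lines containing "bad".
/bad/d
EOF
pass2=$(sed -f "$dir/sedscr" "$dir/input")

printf 'pass 1:\n%s\npass 2:\n%s\n' "$pass1" "$pass2"
rm -rf "$dir"
```

Testing after each pass makes it obvious which command is responsible when the output isn't what you expected.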

It may seem to be a rather tedious process to work this way, and indeed there are a number of scripts where it's fine to take a crack at writing the whole script in one pass and then begin testing it. However, the one-step-at-a-time method is highly recommended for beginners, because you isolate each command and get to easily see what is working and what is not. When you try to do several commands at once, you might find that when problems arise, you end up recreating the recommended process in reverse; that is, removing or commenting out commands one by one until you locate the problem.

—DD



34.8 Delimiting a Regular Expression

Whether in sed or vi, when using the substitution command, a delimiter is required to separate the search pattern from the replacement string. The delimiter can be any character except blank or a newline (vi seems to be more restrictive than sed, although vim is extremely flexible). However, the usual practice is to use the slash (/) as a delimiter (for example, s/search/replacement/).

When either the search pattern or the replacement string contains a slash, it is easier to change the delimiter character than to escape the slash. Thus, if the pattern was attempting to match Unix pathnames, which contain slashes, you could choose another character, such as a colon, as the delimiter:

s:/usr/mail:/usr2/mail:

Note that the delimiter appears three times and is required after the replacement. Regardless of which delimiter you use, if it does appear in the search pattern or the replacement, put a backslash (\) before it to escape it.

If you don't know what characters the search pattern might have (in a shell program that handles any kind of input, for instance), the safest choice for the delimiter can be a control character.
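The choices for the substitution delimiter can be compared side by side. This is only a sketch: the sample text is made up, and Ctrl-A (octal 001) is just one possible control character.

```shell
# Hypothetical line of input containing a pathname.
line='mail is kept in /usr/mail/jane'

# Slash delimiter: every slash in the pathnames has to be escaped.
with_slashes=$(printf '%s\n' "$line" | sed 's/\/usr\/mail/\/usr2\/mail/')

# Colon delimiter: the pathnames can be written as they are.
with_colons=$(printf '%s\n' "$line" | sed 's:/usr/mail:/usr2/mail:')

# If the chosen delimiter occurs in the pattern, escape it with a backslash.
escaped_colon=$(printf 'key:value\n' | sed 's:key\:value:key=value:')

# A control character (here Ctrl-A) is unlikely to turn up in the text.
d=$(printf '\001')
with_ctrl=$(printf '%s\n' "$line" | sed "s${d}/usr/mail${d}/usr2/mail${d}")

printf '%s\n%s\n%s\n%s\n' "$with_slashes" "$with_colons" \
                          "$escaped_colon" "$with_ctrl"
```

The slash, colon, and control-character versions all produce the same line; only the amount of escaping differs.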

You can use any delimiter for a pattern address (not just a slash). Put a backslash before the first delimiter. For example, to delete all lines containing /usr/mail, using a colon (:) as the delimiter:


