Hour 23. Performing Stream Analytics with Storm


ZooKeeper handles coordination between Nimbus and Supervisors (see Figure 23.1).

FIGURE 23.1 Storm cluster components.

Storm includes some important abstractions:

Stream: A stream is an unbounded sequence of Storm tuples. A Storm tuple is a named list of values.

Spout: A spout feeds a data stream into the Storm system. A spout can read data from external sources (for example, event queuing systems such as Azure Event Hubs or Apache Kafka, the de facto messaging system for Hadoop) and can also generate data (for example, random numbers for testing).

Bolt: A bolt processes one or more input streams and produces zero or more output streams. A bolt performs computations and transformations on input streams. A bolt can also write data to external destinations, such as a SQL Azure table or blob storage.

Topology: A network of spouts and bolts forms a topology. Spouts feed a data stream to a topology. Bolts subscribe to one or more input streams and perform continuous computations on streaming data.

Figure 23.2 illustrates a Storm topology.

FIGURE 23.2 Storm topology.



Stream groupings control the way tuples are sent to bolts. The main types of groupings follow:

Shuffle grouping: Each tuple is sent to a randomly chosen bolt instance so that the load is distributed evenly among instances.

Fields grouping: Each tuple is sent to a bolt instance chosen according to the values of selected tuple fields. Tuples with the same values for the chosen fields are directed to the same bolt instance. Fields groupings are useful in counting scenarios and are discussed with an example later in the "Analyzing Speed Limit Violation Incidents with Storm" section.

All grouping: A copy of each tuple is sent to every bolt instance.

Global grouping: All tuples in the stream are sent to a single bolt instance.

Direct grouping: The stream source decides which bolt instance receives each tuple.
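The difference between shuffle and fields grouping can be sketched in plain C#. The following is a simplified model, not Storm's actual routing code: the class name GroupingSketch and both method names are made up for illustration, and the hash function is an arbitrary stable hash chosen for the example.

```csharp
using System;

// Simplified model of how a grouping picks a bolt instance (task).
// Assumption: GroupingSketch, ShuffleTask, and FieldsTask are
// illustrative names, not part of Storm or SCP.NET.
public static class GroupingSketch
{
    // Shuffle grouping: any task may receive the tuple.
    public static int ShuffleTask(Random random, int taskCount)
    {
        return random.Next(taskCount);
    }

    // Fields grouping: the task is derived from the grouping field,
    // so equal field values always map to the same task.
    public static int FieldsTask(string fieldValue, int taskCount)
    {
        int hash = 17;
        foreach (char c in fieldValue)
        {
            hash = unchecked(hash * 31 + c);
        }
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7FFFFFFF) % taskCount;
    }
}
```

With four counter tasks, every call to GroupingSketch.FieldsTask("TimesSquare", 4) returns the same index, which is the property a per-location counter relies on. ShuffleTask makes no such promise.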

Try It Yourself: Provisioning HDInsight Storm Cluster

Complete the following steps to provision the HDInsight Storm cluster:

1. Log in to the Microsoft Azure Management portal and select HDInsight from the left pane.

2. Select the +NEW option from the bottom-left corner of the portal.

3. Select Storm from the list of options (see Figure 23.3). In the appropriate spaces, enter a cluster name, the number of worker nodes required, and the password for the admin account, and select an existing storage account.

FIGURE 23.3 Provisioning the HDInsight Storm cluster.

4. Click the Create HDInsight cluster button to provision the cluster.



Using SCP.NET to Develop Storm Solutions

The Stream Computing Platform (SCP) provides the .NET libraries needed to develop Storm solutions that target the HDInsight Storm cluster. SCP eases the process of creating Storm applications, with abstractions such as spouts, bolts, and topologies, using the .NET Framework.

HDInsight tools for Visual Studio install the Storm project templates required for developing HDInsight Storm solutions using SCP.NET. HDInsight tools work with the following:

Visual Studio 2012 with Update 4

Visual Studio 2013 with Update 4

Visual Studio 2013 Community Edition

Visual Studio 2015 CTP6

Azure SDK 2.5.1 or higher is also required. The Azure SDK and HDInsight tools for Visual Studio can be installed using the Microsoft Web Platform Installer (see Figure 23.4).

FIGURE 23.4 Installing HDInsight tools for Visual Studio.

After HDInsight tools for Visual Studio are installed, the Storm Application template becomes available in the list of installed HDInsight project templates (see Figure 23.5).

FIGURE 23.5 Storm Application project template.



Analyzing Speed Limit Violation Incidents with Storm

To appreciate the use of SCP.NET in developing stream computing applications, consider the following scenario for analyzing speed limit violation incidents at 10 different locations in New York City. Assume that license plate reader devices installed at the 10 locations provide tuples with the following information for incoming vehicle traffic:

Event timestamp

Location

Vehicle registration number

Speed

The scenario involves filtering the incoming stream to extract tuples with a speed of more than 60 mph and counting the number of extracted tuples by location (see Figure 23.6).

FIGURE 23.6 Storm topology for analyzing speed limit violations.
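Before building the topology, it can help to see the same filter-and-count logic applied to a small in-memory batch. This is only a sketch of the computation, assuming each reading is reduced to a location name and a speed; ViolationSketch and CountViolations are illustrative names, and a real Storm topology performs these two stages continuously in separate bolts rather than in one LINQ query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory sketch of the scenario: keep readings above the limit,
// then count them per location.
// Assumption: readings are (location, speed) pairs; the type and
// method names here are made up for illustration.
public static class ViolationSketch
{
    public const int SpeedLimitMph = 60;

    public static Dictionary<string, int> CountViolations(
        IEnumerable<KeyValuePair<string, int>> readings)
    {
        return readings
            .Where(r => r.Value > SpeedLimitMph)          // filter stage
            .GroupBy(r => r.Key)                          // group by location
            .ToDictionary(g => g.Key, g => g.Count());    // count stage
    }
}
```

A reading of exactly 60 mph is not counted, because the scenario asks for speeds of more than 60 mph.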

Try It Yourself: Creating a New Storm Application

Launch Visual Studio and create a new project named LicensePlateReaderApplication using the Storm Application project template, available under HDInsight project templates. This creates a new project containing a predefined bolt and spout (see Figure 23.7).

FIGURE 23.7 New project with a predefined spout and bolt.

Note

Refer to the accompanying code samples for this hour for the complete code listing.

Complete the following steps to create a license plate reader spout:

1. Use the following code snippet to create a new enum for Locations. The enum defines 10 city locations where license plate reader devices have been installed.

    public enum Locations
    {
        TimesSquare = 0,
        LexingtonAvenue = 1,
        BrooklynBridge = 2,
        PennStation = 3,
        CentralPark = 4,
        ColumbusCircle = 5,
        GrandCentral = 6,
        FifthAvenue = 7,
        WallStreet = 8,
        MadisonSquare = 9
    }



2. Use the following code snippet to create the LPREvent class, which represents a tuple from a license plate reader device that contains information such as the timestamp, device location, vehicle registration number, and speed:


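    using System;

    [Serializable]
    public class LPREvent
    {
        public DateTime TimeStamp { get; set; }
        public Locations Location { get; set; }
        public string VehicleRegNo { get; set; }
        public int Speed { get; set; }

        // Shared random number generator used to simulate readings
        static Random random = new Random();

        // Creates an event with a random location (0 to 9) and speed (0 to 99)
        public static LPREvent CreateEvent()
        {
            return new LPREvent
            {
                TimeStamp = DateTime.Now,
                Location = (Locations)random.Next(10),
                VehicleRegNo = "xxxxx",
                Speed = random.Next(100)
            };
        }
    }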



The static function CreateEvent of the LPREvent class creates a new event object by populating random data for event location and speed to simulate an incoming vehicle.

3. Rename the existing spout to LPRSpout from the Solution Explorer. This also renames the class and all its references. Use the following code snippet to define the LPRSpout class:


    using System;
    using System.Collections.Generic;
    using Microsoft.SCP;

    public class LPRSpout : ISCPSpout
    {
        private Context ctx;
        private Random r = new Random();

        public LPRSpout(Context ctx)
        {
            this.ctx = ctx;
            // Define output stream schema
            Dictionary<string, List<Type>> outputSchema = new Dictionary<string, List<Type>>();
            outputSchema.Add("default", new List<Type>() { typeof(LPREvent) });
            this.ctx.DeclareComponentSchema(new ComponentStreamSchema(null, outputSchema));
        }

        public static LPRSpout Get(Context ctx, Dictionary<string, Object> parms)
        {
            return new LPRSpout(ctx);
        }

        public void NextTuple(Dictionary<string, Object> parms)
        {
            // Emit a new LPR event to output stream
            ctx.Emit(new Values(LPREvent.CreateEvent()));
        }

        // Ack is not implemented in this sample
        public void Ack(long seqId, Dictionary<string, Object> parms)
        {
            throw new NotImplementedException();
        }

        // Fail is not implemented in this sample
        public void Fail(long seqId, Dictionary<string, Object> parms)
        {
            throw new NotImplementedException();
        }
    }



The code snippet defines the spout's output stream schema. The NextTuple function emits a new tuple, with random values for speed and location, to the output stream by calling the LPREvent class's CreateEvent function. In a real-world scenario, the NextTuple function would read data from external sources (such as an Azure Service Bus queue or Event Hubs) and emit the data to the stream.
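The shape of a NextTuple that reads from an external source, rather than generating random data, can be sketched without any Storm dependency. In the sketch below, an in-process ConcurrentQueue stands in for the external queue and a delegate stands in for ctx.Emit. Both are assumptions made so that the example is self-contained, and QueueBackedSource is an illustrative name, not an SCP.NET type.

```csharp
using System;
using System.Collections.Concurrent;

// Sketch of a spout-like source that emits at most one item per call.
// Assumption: the ConcurrentQueue stands in for an external queue and
// the emit delegate stands in for ctx.Emit; neither is the SCP.NET API.
public class QueueBackedSource<T>
{
    private readonly ConcurrentQueue<T> queue;
    private readonly Action<T> emit;

    public QueueBackedSource(ConcurrentQueue<T> queue, Action<T> emit)
    {
        this.queue = queue;
        this.emit = emit;
    }

    // Returns true if an item was emitted, false if nothing was waiting.
    // The call never blocks, so the caller can poll it repeatedly.
    public bool NextTuple()
    {
        T item;
        if (queue.TryDequeue(out item))
        {
            emit(item);
            return true;
        }
        return false;
    }
}
```

Returning promptly when the queue is empty matters because the method is polled in a loop. A blocking read would stall the caller.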

Next, complete the following steps to create a filter bolt to extract violations:

1. Rename the Bolt.cs file from Solution Explorer to FilterBolt.cs and use the following code snippet to define the FilterBolt class. The code snippet defines input and output stream schemas. The Execute method filters out tuples with no violations from the input stream and emits only tuples that violate the speed limit to the output stream.


    using System;
    using System.Collections.Generic;
    using Microsoft.SCP;

    public class FilterBolt : ISCPBolt
    {
        private Context ctx;

        public FilterBolt(Context ctx)
        {
            this.ctx = ctx;
            // Define input stream schema
            Dictionary<string, List<Type>> inputSchema = new Dictionary<string, List<Type>>();
            inputSchema.Add("default", new List<Type>() { typeof(LPREvent) });
            // Define output stream schema
            Dictionary<string, List<Type>> outputSchema = new Dictionary<string, List<Type>>();
            outputSchema.Add("default", new List<Type>() { typeof(LPREvent) });
            // Declare both schemas in a single call
            this.ctx.DeclareComponentSchema(new ComponentStreamSchema(inputSchema, outputSchema));
        }

        public static FilterBolt Get(Context ctx, Dictionary<string, Object> parms)
        {
            return new FilterBolt(ctx);
        }

        public void Execute(SCPTuple tuple)
        {
            LPREvent lprEvent = (LPREvent)tuple.GetValue(0);

            // Emit only tuples violating the speed limit to the output stream
            if (lprEvent.Speed > 60)
            {
                this.ctx.Emit(new Values(lprEvent));
            }
        }
    }



To count the tuples that violate the speed limit, you must add a counter bolt to the topology. Complete the following steps to add a new bolt to the solution:

2. Right-click the project in Solution Explorer and select the New Item option.

3. Select Storm Bolt as the template for the new item and name it CounterBolt.

4. Use the following code snippet to define the CounterBolt class:

Note

The bolt defines the schema only for the input stream. Because the bolt is a terminal node in the topology, an output schema definition is not required. The Execute method of the counter bolt updates a SQL Azure table with a count of violations, by location.


    using System;
    using System.Collections.Generic;
    using Microsoft.SCP;

    public class CounterBolt : ISCPBolt
    {
        private Context ctx;

        public CounterBolt(Context ctx)
        {
            this.ctx = ctx;
            // Define input stream schema
            Dictionary<string, List<Type>> inputSchema = new Dictionary<string, List<Type>>();
            inputSchema.Add("default", new List<Type>() { typeof(LPREvent) });
            this.ctx.DeclareComponentSchema(new ComponentStreamSchema(inputSchema, null));
        }

        public static CounterBolt Get(Context ctx, Dictionary<string, Object> parms)
        {
            return new CounterBolt(ctx);
        }

        public void Execute(SCPTuple tuple)
        {
            LPREvent lprEvent = (LPREvent)tuple.GetValue(0);
            string location = lprEvent.Location.ToString();
            // Update SQL Azure table with violation counts by location
            DBHelper.UpdateCounters(location);
        }
    }



Creating the Storm Topology

The ITopologyBuilder interface defines the Storm topology. Use the following code snippet to define the topology illustrated in Figure 23.6. The code snippet adds an LPR spout to the topology, which emits the vehicle tuples.

The speed filter bolt consumes vehicle tuples and extracts those that violate the speed limit. Because it does not matter which tuples are processed by which bolt instance, shuffle grouping distributes the load evenly among bolt instances.

Finally, the counter bolt consumes filtered tuples and acts as the terminal node in the topology. The counter bolt pushes violation counts by location into a SQL Azure table. Fields grouping routes incoming tuples so that tuples with the same location are directed to the same bolt instance. This ensures that two or more bolt instances don't try to update the same row in the SQL Azure table at the same time.


    public ITopologyBuilder GetTopologyBuilder()
    {
        // Create LicensePlateReaderApplication storm topology
        TopologyBuilder topologyBuilder = new TopologyBuilder("LicensePlateReaderApplication");

        // Add 'vehicles' lpr spout to the topology
        // The spout emits vehicle tuples
        topologyBuilder.SetSpout(
            "vehicles",
            LPRSpout.Get,
            new Dictionary<string, List<string>>()
            {
                { Constants.DEFAULT_STREAM_ID, new List<string>() { "vehicle" } }
            },
            1);

        // Add speed filter bolt to the topology
        // The bolt emits over-speeding vehicle tuples
        // Shuffle grouping is used to route each incoming vehicle tuple
        // to a randomly chosen bolt instance
        topologyBuilder.SetBolt(
            "speedfilter",
            FilterBolt.Get,
            new Dictionary<string, List<string>>()
            {
                { Constants.DEFAULT_STREAM_ID, new List<string>() { "overspeedingvehicle" } }
            },
            1).shuffleGrouping("vehicles");

        // Add vehicle counter bolt to the topology
        // The bolt pushes the counts to a SQL Azure table
        // Fields grouping is used to route incoming tuples, to ensure that
        // tuples with the same location fields are directed to the same bolt instance
        topologyBuilder.SetBolt(
            "counter",
            CounterBolt.Get,
            new Dictionary<string, List<string>>()
            {
                { Constants.DEFAULT_STREAM_ID, new List<string>() { "location", "count" } }
            },
            1).fieldsGrouping("speedfilter", new List<int>() { 0 });

        // Add topology configuration
        topologyBuilder.SetTopologyConfig(new Dictionary<string, string>()
        {
            { "topology.kryo.register", "[\"[B\"]" }
        });
        return topologyBuilder;
    }



Creating the SQL Azure Table to Store Violation Counts

Use the following code snippet to create a SQL Azure table to hold speed limit violation counts by location:


    CREATE TABLE [dbo].[SpeedLimitViolationCount](
        [Location] [varchar](20) NOT NULL PRIMARY KEY,
        [VehicleCount] [int] NULL)



The following code snippet creates the stored procedure invoked by the counter bolt to update speed limit violation counts by location. The stored procedure checks for the existence of a location in the table. If the location does not exist in the table, it creates a new record; otherwise, it updates the violation count for the existing record.


    CREATE PROC [dbo].[usp_IncrementCount]
        @LOCATION VARCHAR(20)
    AS
    IF EXISTS (SELECT 1 FROM SpeedLimitViolationCount WHERE Location = @LOCATION)
    BEGIN
        UPDATE SpeedLimitViolationCount SET VehicleCount = VehicleCount + 1 WHERE Location = @LOCATION
    END
    ELSE
    BEGIN
        INSERT INTO SpeedLimitViolationCount (Location, VehicleCount) VALUES (@LOCATION, 1)
    END
    GO
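The counter bolt calls DBHelper.UpdateCounters, whose body is not listed in this hour. The following is one possible sketch of such a helper, assuming it simply invokes the usp_IncrementCount stored procedure through ADO.NET. The connection string placeholders and the BuildIncrementCommand helper are hypothetical, and the accompanying code samples may implement the class differently.

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical sketch of the DBHelper class used by CounterBolt.
// Assumption: the placeholders in the connection string must be
// replaced with real SQL Azure server, database, and login values.
public static class DBHelper
{
    private const string ConnectionString =
        "Server=tcp:<server>.database.windows.net;Database=<database>;" +
        "User ID=<user>;Password=<password>;Encrypt=True;";

    // Builds the stored procedure call; kept separate so it can be
    // inspected without opening a connection.
    public static SqlCommand BuildIncrementCommand(SqlConnection connection, string location)
    {
        var command = new SqlCommand("dbo.usp_IncrementCount", connection);
        command.CommandType = CommandType.StoredProcedure;
        // Parameter name and size match the stored procedure definition.
        command.Parameters.Add("@LOCATION", SqlDbType.VarChar, 20).Value = location;
        return command;
    }

    // Increments the violation count for one location.
    public static void UpdateCounters(string location)
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = BuildIncrementCommand(connection, location))
        {
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

Opening a connection for every tuple keeps the sketch short. A bolt handling heavy traffic would more likely batch updates or reuse a connection, which is a design choice outside what this hour covers.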



Submitting the Topology to the HDInsight Storm Cluster

To submit the Storm topology for execution, right-click the project and select Submit to Storm on HDInsight (see Figure 23.8).

FIGURE 23.8 Submitting a topology to Storm.

At this point, you might be prompted to sign in to the Azure subscription associated with the Storm cluster. After signing in, select your Storm cluster from the drop-down list (see Figure 23.9).

FIGURE 23.9 Selecting a Storm cluster.

To monitor the topology after submission, select the LicensePlateReader topology from the list of Storm topologies (see Figure 23.10). This brings up more information about the topology, options to perform additional actions (for example, pausing or terminating a topology, or rebalancing the topology to adjust parallelism after changing the number of nodes in the Storm cluster), topology statistics, and information on spouts and bolts.


