YARN – Future and Support


Chapter 10. YARN – Future and Support

YARN is the data operating system of Hadoop 2. It acts as a central orchestrator that supports mixed workloads and programming models, multiple engines, and multiple access patterns such as batch, interactive, streaming, and real-time processing.

In this chapter, we will talk about YARN’s journey and its present and future in the big data industry.



What YARN means to the big data industry

YARN has been a boon to the big data industry. Without YARN, the entire big data industry would have been at serious risk. As the industry started working with big data, new kinds of problems came into the picture, and with them came new frameworks built to solve those problems.

YARN’s support for running these new and emerging frameworks lets them focus on solving the problems for which they were specifically meant, while YARN takes care of resource management and related concerns (resource allocation, job scheduling, fault tolerance, and so on).

Had there been no YARN, these frameworks would have had to do all the resource management on their own. Many big data projects have failed in the past because of unrealistic expectations placed on immature technologies.

YARN is the enabler for porting mature, enterprise-class technologies directly onto Hadoop. Without YARN, the only processing option in Hadoop was MapReduce.



Journey – present and future

YARN was introduced with the Hadoop 0.23 release on November 11, 2011.

Since then, there has been no looking back, and a number of releases have followed.

Finally, on October 15, 2013, Apache Hadoop 2.2.0 became the GA (General Availability) release of Apache Hadoop 2.x.

In October 2013, Apache Hadoop YARN won the Best Paper award at ACM SoCC (Symposium on Cloud Computing) 2013.

Apache Hadoop 2.x, powered by YARN, is no doubt the best platform for all of the Hadoop ecosystem components such as MapReduce, Apache Hive, Apache Pig, and so on that use HDFS as the underlying data storage.

YARN has also been embraced by other open source communities, which use it to run frameworks such as Apache Giraph, Apache Tez, Apache Spark, Apache Flink, and many others.

Vendors such as HP, Microsoft, SAS, Teradata, SAP, and Red Hat, among many others, are moving towards YARN to run their existing products and services on Hadoop.

People willing to modify their applications can already use YARN directly, but many customers and vendors do not want to modify their existing applications. For them, there is Apache Slider, another open source project from Hortonworks, which can deploy existing distributed applications without requiring them to be ported to YARN.

Apache Slider allows you to bridge existing always-on services and makes sure they work really well on top of YARN, without having to modify the application itself.

Slider makes it easier to run many long-running services and applications, such as Apache Storm, Apache HBase, Apache Accumulo, and so on, on YARN.

This initiative should expand the spectrum of applications and use cases that can run on Hadoop and YARN in the future.



Present on-going features

Now, let’s discuss the work currently in progress in YARN.

Long Running Applications on Secure Clusters (YARN-896)

Support long-lived applications and long-lived containers. Refer to https://issues.apache.org/jira/browse/YARN-896.

Application Timeline Server (YARN-321, YARN-1530)

Currently, MapReduce has its own Job History Server, which needs to be deployed as a trusted server in sync with the MapReduce runtime. Every new application type would need a similar application history server. Having to deploy O(T*V) trusted servers (where T is the number of application types and V is the number of versions of each application) is clearly not scalable.

These JIRAs aim to create a single trusted application history server, which can have a generic UI. Refer to the following links for more information:

https://issues.apache.org/jira/browse/YARN-321
https://issues.apache.org/jira/browse/YARN-1530
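
The O(T*V) argument can be made concrete with a little arithmetic. The following is only an illustrative sketch: the function name and the sample numbers (5 application types, 3 versions each) are hypothetical and do not come from YARN itself.

```python
def history_servers_needed(app_types: int, versions_per_type: int, generic: bool) -> int:
    """Count the trusted history servers required under each deployment model."""
    if generic:
        # One shared, application-agnostic history server covers every framework.
        return 1
    # One dedicated server per (application type, version) pair: T * V.
    return app_types * versions_per_type


# Hypothetical cluster: 5 application types, 3 versions of each in use.
per_framework = history_servers_needed(5, 3, generic=False)
shared = history_servers_needed(5, 3, generic=True)
print(per_framework, shared)  # prints "15 1"
```

The per-framework count grows with every new framework or version that is deployed, while the shared model stays constant.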

Disk scheduling (YARN-2139)

Support for disk as a resource in YARN. YARN should consider disk as another resource for scheduling tasks on nodes, isolation at runtime, and spindle locality. Refer to https://issues.apache.org/jira/browse/YARN-2139.
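
To illustrate what treating disk as another resource means for placement, here is a minimal toy model. It is not YARN code: the `Resource` class, its `disk_mbps` field, and `pick_node` are hypothetical names, and the model simply assumes that a request fits a node only if every dimension fits.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int
    disk_mbps: int = 0  # hypothetical third dimension: disk bandwidth

    def fits_in(self, free: "Resource") -> bool:
        # A request fits only if it fits in every dimension.
        return (self.memory_mb <= free.memory_mb
                and self.vcores <= free.vcores
                and self.disk_mbps <= free.disk_mbps)


def pick_node(request: Resource, free_by_node: Dict[str, Resource]) -> Optional[str]:
    """Return the first node whose free resources cover the request, or None."""
    for name, free in free_by_node.items():
        if request.fits_in(free):
            return name
    return None


nodes = {
    "node1": Resource(memory_mb=8192, vcores=8, disk_mbps=50),
    "node2": Resource(memory_mb=4096, vcores=4, disk_mbps=200),
}

# Memory and CPU alone would place this task on node1.
print(pick_node(Resource(2048, 2), nodes))        # prints "node1"
# With a disk requirement, node1 is skipped because its disk is too busy.
print(pick_node(Resource(2048, 2, 100), nodes))   # prints "node2"
```

In this sketch, a scheduler that ignores disk would keep sending I/O-heavy tasks to a node with spare memory and CPU but a saturated disk.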

Reservation-based scheduling (YARN-1051)

This extends the YARN RM (ResourceManager) to handle time explicitly, allowing users to reserve capacity over time. This is an important step towards SLAs, long-running services, and workflows, and it helps with gang scheduling.
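
The idea of reserving capacity over time can be sketched as an admission check against a timeline. This is a simplified illustration under stated assumptions, not the design proposed in the JIRA: the `ReservationPlan` class, discrete time steps, and a single capacity number are all hypothetical.

```python
class ReservationPlan:
    """Toy plan that tracks committed capacity for each discrete time step."""

    def __init__(self, capacity: int, horizon: int):
        self.capacity = capacity
        self.committed = [0] * horizon  # capacity already promised per step

    def try_reserve(self, start: int, end: int, amount: int) -> bool:
        """Reserve `amount` for steps [start, end); accept all or nothing."""
        if amount <= 0 or not (0 <= start < end <= len(self.committed)):
            return False
        # Reject if any step in the window would exceed cluster capacity.
        if any(self.committed[t] + amount > self.capacity for t in range(start, end)):
            return False
        for t in range(start, end):
            self.committed[t] += amount
        return True


plan = ReservationPlan(capacity=100, horizon=10)
print(plan.try_reserve(0, 5, 60))   # prints "True"
print(plan.try_reserve(3, 8, 60))   # prints "False" (steps 3 and 4 would need 120)
print(plan.try_reserve(5, 10, 60))  # prints "True"
```

Because a reservation is accepted only if the whole window fits, a job that needs all of its containers at once can be guaranteed to get them together, which is how this approach relates to gang scheduling.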



Future features

Let’s discuss the work planned for future YARN releases.

Container Resizing (YARN-1197)

The current YARN resource management logic assumes that the resources allocated to a container are fixed during its lifetime. When users want to change the resources of an allocated container, the only way is to release it and allocate a new container with the expected size. Allowing runtime changes to the resources of an allocated container will give us better control of resource usage on the application side. Refer to https://issues.apache.org/jira/browse/YARN-1197.
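
The difference between the two approaches can be shown with a toy node model. This is only a sketch: the `Node` class and its `allocate`, `release`, and `resize` methods are hypothetical and track memory only; they are not YARN APIs.

```python
class Node:
    """Toy node that tracks the memory held by each container."""

    def __init__(self, total_mb: int):
        self.total_mb = total_mb
        self.containers = {}  # container id -> allocated MB
        self._next_id = 1

    def free_mb(self) -> int:
        return self.total_mb - sum(self.containers.values())

    def allocate(self, mb: int) -> int:
        if mb > self.free_mb():
            raise ValueError("not enough free memory")
        cid = self._next_id
        self._next_id += 1
        self.containers[cid] = mb
        return cid

    def release(self, cid: int) -> None:
        del self.containers[cid]

    def resize(self, cid: int, new_mb: int) -> bool:
        """Change a running container's size in place if the node has room."""
        delta = new_mb - self.containers[cid]
        if delta > self.free_mb():
            return False
        self.containers[cid] = new_mb
        return True


node = Node(total_mb=8192)
old = node.allocate(2048)

# Fixed-size model: growing means releasing and allocating a new container,
# so the work gets a new container id and has to start again.
node.release(old)
new = node.allocate(4096)
print(old, new)                    # prints "1 2"

# Resizable model: the same container grows in place and keeps its id.
print(node.resize(new, 6144))      # prints "True"
print(node.resize(new, 10000))     # prints "False" (node has only 2048 MB free)
```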

Admin labels (YARN-796)

Support for admins to specify labels for nodes. Examples of labels are the OS, processor architecture, and so on. Refer to https://issues.apache.org/jira/browse/YARN-796.
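
Conceptually, labels let a request be restricted to nodes that carry every required label. The following minimal sketch shows that matching rule only; the function name, node names, and label values are hypothetical, and this is not how YARN represents labels internally.

```python
from typing import Dict, List, Set


def nodes_matching(node_labels: Dict[str, Set[str]], required: Set[str]) -> List[str]:
    """Return the nodes (sorted by name) that carry every required label."""
    return sorted(name for name, labels in node_labels.items()
                  if required <= labels)  # subset test


# Hypothetical cluster with admin-assigned OS and architecture labels.
cluster = {
    "node1": {"linux", "x86_64"},
    "node2": {"linux", "arm64"},
    "node3": {"windows", "x86_64"},
}

print(nodes_matching(cluster, {"linux"}))            # prints "['node1', 'node2']"
print(nodes_matching(cluster, {"linux", "x86_64"}))  # prints "['node1']"
```

A request with no required labels matches every node in this model, so unlabeled workloads are unaffected.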

Container Delegation (YARN-1488)

Allow containers to delegate resources to another container. This would allow external frameworks to share not just YARN’s resource-management capabilities, but also its workload-management capabilities.

This also shows that YARN is not only focused on the Apache Hadoop ecosystem components, but also on any existing external non-Hadoop products and services that want to use Hadoop.

Work is also under way to bring together the worlds of data and PaaS by using Docker, Google Kubernetes, and Red Hat OpenShift on YARN, so that resources can be managed in a common way across data and PaaS workloads.



YARN-supported frameworks

The following is the current list of frameworks that run on top of YARN, and this list will keep growing in the future:

Apache Hadoop MapReduce and its ecosystem components
Apache HAMA
OpenMPI
Apache S4
Apache Spark
Apache Tez
Impala
Storm
HOYA (HBase on YARN)
Apache Samza
Apache Giraph
Apache Accumulo
Apache Flink
KOYA (Kafka on YARN)
Solr



Summary

In this chapter, we briefly talked about YARN’s journey since its inception. YARN has completely changed Hadoop from the way it was in the Hadoop 1.x version. YARN is now a first-class resource management framework for supporting mixed workloads and processing frameworks.

From what can be seen and predicted, YARN is surely a hit in the big data industry and has many more new and promising features to come. Currently, YARN handles memory and CPU, and it will coordinate additional resources such as disk and network I/O in the future.



Index

A

Access Control List (ACL)
about / NodeManager (NM), The capacity scheduler
administrative tools
about / Administrative tools
commands / Administrative tools
generic options, supporting / Administrative tools
/ Administrative tools
anagrams / Practical examples of MRv1 and MRv2
Apache Giraph
about / Apache Giraph
URL / Apache Giraph
Apache Hadoop 2.2.0
about / Journey – present and future
Apache Samza
about / Apache Samza
Kafka / Apache Samza
Apache YARN / Apache Samza
ZooKeeper / Apache Samza
Kafka producer, writing / Writing a Kafka producer
hello-samza project, writing / Writing the hello-samza project
Apache Samza, layers
processing layer / Apache Samza
streaming layer / Apache Samza
execution layer / Apache Samza
Apache Slider
about / Journey – present and future
Apache Software Foundation
about / Mesos
Apache Spark
about / Apache Spark
features / Apache Spark
running, on YARN / Why run on YARN?
Apache Tez
about / Apache Tez
URL / Apache Tez
Application Context (AppContext) / The MapReduce ApplicationMaster
ApplicationMaster
about / The MapReduce ApplicationMaster
ApplicationMaster (AM) / ApplicationMaster (AM)
restarting / The MapReduce ApplicationMaster


