Hour 14. Consuming HDInsight Data from Microsoft BI Tools over Hive ODBC Driver: Part 1

32-Bit Versus 64-Bit Hive ODBC Driver

You need to understand these two variants of the Hive ODBC Driver to properly configure the Hive ODBC Driver or Data Source Names (DSNs). Otherwise, it might not work because it’s loading the wrong kind of driver.

You can install and configure both variants of the Hive ODBC Driver (32-bit and 64-bit) side by side on a 64-bit Windows machine. You should know, however, that 64-bit applications can use only the 64-bit driver or DSN, and 32-bit applications can use only the 32-bit driver or DSN. For example, if you have a 64-bit Windows machine with 32-bit Excel and 64-bit SQL Server, to use the driver correctly from Excel, you must install and configure the 32-bit Hive ODBC Driver. Similarly, to use the driver from SQL Server, you must install and configure the 64-bit Hive ODBC Driver.
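The matching rule between application and driver bitness can be checked from inside a client process. The following is a minimal Python sketch, not part of the driver or of this hour's exercises; the function names `process_bitness` and `required_driver` are hypothetical and exist only for illustration. It reports whether the running Python interpreter is a 32-bit or 64-bit process and names the driver variant such a process would need.

```python
import struct


def process_bitness() -> int:
    """Return 32 or 64, based on the pointer size of this Python process."""
    # struct.calcsize("P") is the size of a C pointer in bytes (4 or 8).
    return struct.calcsize("P") * 8


def required_driver(app_bitness: int) -> str:
    """Name the Hive ODBC Driver variant an application of this bitness needs."""
    if app_bitness not in (32, 64):
        raise ValueError("bitness must be 32 or 64")
    # A 32-bit application can load only the 32-bit driver and DSN,
    # and a 64-bit application only the 64-bit driver and DSN.
    return f"{app_bitness}-bit Hive ODBC Driver and DSN"


if __name__ == "__main__":
    print(required_driver(process_bitness()))
```

The same check applies to any client: what matters is the bitness of the application process, not the bitness of Windows itself.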



Setting Up the Hive ODBC Driver

The Microsoft Hive ODBC Driver is a free download that you can find at this link, based on your 32-bit or 64-bit Windows platform and the applications that will be using it (see Figure 14.1): http://www.microsoft.com/en-us/download/details.aspx?id=40886.



FIGURE 14.1 Hive ODBC Driver deployment options.

After installing the Hive ODBC Driver, you can type Data Source in the Search box on Windows 8 to see options to configure both the 32-bit and 64-bit ODBC sources (see Figure 14.2).

FIGURE 14.2 Setting up ODBC data sources: 32-bit versus 64-bit.

You can choose either of these, based on your specific need or the bit-ness of the applications that will be accessing Hive-based data.

If you are using other versions of the Windows operating system, you can go to the Control Panel and then open either ODBC Data Sources (32-Bit) or ODBC Data Sources (64-Bit) under Administrative Tools.

By default, on a 64-bit Windows machine, you can access the 64-bit ODBC Data Source Administrator from the Control Panel. If you want to use the 32-bit version of the ODBC Data Source Administrator on a 64-bit Windows machine, you can execute this command in Run: C:\WINDOWS\SysWOW64\odbcad32.exe.
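Both administrators share the file name odbcad32.exe, which is easy to mix up. As a small Python sketch (the helper name `odbc_admin_path` and the default Windows directory are assumptions for illustration), the following builds the path for each variant on a 64-bit Windows machine: the 32-bit tool lives under SysWOW64, as stated above, and the 64-bit tool lives under System32.

```python
import ntpath  # Windows path rules, usable on any platform


def odbc_admin_path(bitness: int, windows_dir: str = r"C:\WINDOWS") -> str:
    """Return the path of the ODBC Data Source Administrator for a bitness.

    Applies to 64-bit Windows, where System32 holds the 64-bit tool and
    SysWOW64 holds the 32-bit tool, despite what the folder names suggest.
    """
    folders = {64: "System32", 32: "SysWOW64"}
    if bitness not in folders:
        raise ValueError("bitness must be 32 or 64")
    return ntpath.join(windows_dir, folders[bitness], "odbcad32.exe")


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, odbc_admin_path(bits))
```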

Try It Yourself: Configuring the 64-Bit Driver

In the ODBC Data Source Administrator (64-Bit) dialog box, go to the System DSN tab and click the Add button (see Figure 14.3).

FIGURE 14.3 ODBC Data Source Administrator (64-Bit) dialog box.

Tip

A system DSN is available to all users who log in to the machine. A user DSN is available only to the user who creates the DSN.
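The difference in scope between the two DSN types follows from where Windows stores them in the registry. This Python sketch is an illustration only; the function name `dsn_registry_key` is hypothetical, and the key paths shown are the standard ODBC locations on 64-bit Windows rather than anything specific to the Hive driver.

```python
def dsn_registry_key(scope: str, bitness: int = 64) -> str:
    """Return the registry key under which a DSN of this scope is stored.

    Paths are the standard ODBC locations on 64-bit Windows.
    """
    if scope == "user":
        # Stored in the current user's hive, so only that user sees it.
        return r"HKEY_CURRENT_USER\Software\ODBC\ODBC.INI"
    if scope == "system":
        # Stored machine-wide, so every user who logs in sees it.
        if bitness == 64:
            return r"HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI"
        if bitness == 32:
            return r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\ODBC\ODBC.INI"
        raise ValueError("bitness must be 32 or 64")
    raise ValueError("scope must be 'user' or 'system'")


if __name__ == "__main__":
    print(dsn_registry_key("system", 64))
    print(dsn_registry_key("system", 32))
    print(dsn_registry_key("user"))
```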

In the Create New Data Source Wizard, select Microsoft Hive ODBC Driver (see Figure 14.4) and click Finish. If you don’t find Microsoft Hive ODBC Driver listed there, then the 64-bit Hive ODBC Driver is not yet installed.

FIGURE 14.4 Create New Data Source Wizard (64-bit).

Clicking Finish brings up the Microsoft Hive ODBC Driver DSN Setup page for 64-bit DSN setup (see Figure 14.5). You must specify the following settings for the DSN:

FIGURE 14.5 Microsoft Hive ODBC Driver DSN setup (64-bit).

Data Source Name: This is the name for your DSN.

Description: This is an optional description for your DSN.

Host: Specify the cluster name, IP address, or host name of the Hive server. For the HDInsight service, it should be in the form of <cluster name>.azurehdinsight.net.

Port: Specify the port number on which the service is listening. The default is 443.

Database: Specify the name of the database to refer to when a query does not explicitly specify it. You can still execute queries involving other databases by explicitly specifying the database name in the query with a two-part naming convention (<database name>.<table name>).

Hive Server Type: You can keep the default of Hive Server 2.

Note

Hive Server 2 is the latest implementation that addresses the concurrency limitation imposed by Hive Server 1. Moreover, authentication is available only for Hive Server 2.

Authentication: Depending on whether you are connecting to the HDInsight Service or the HDInsight emulator, select either Windows Azure HDInsight Service or Windows Azure HDInsight Emulator. You can specify a username and password to connect to the HDInsight cluster.

Click the Test button to verify connectivity with the HDInsight cluster, based on the details provided.
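The DSN settings described above can also be gathered into a DSN-less ODBC connection string. The Python sketch below only assembles such a string; it does not open a connection. The function name, the cluster name `mycluster`, and the credentials are hypothetical. The attribute names (Host, Port, Schema, HiveServerType, AuthMech, UID, PWD) and the AuthMech value are assumptions that should be verified against the documentation of the driver version you installed.

```python
def build_hive_connection_string(
    cluster_name: str,
    user: str,
    password: str,
    database: str = "default",
    port: int = 443,
) -> str:
    """Assemble a DSN-less connection string mirroring the DSN dialog fields.

    Attribute names and the AuthMech value are assumptions; check them
    against your driver's documentation before relying on them.
    """
    settings = {
        "Driver": "{Microsoft Hive ODBC Driver}",
        "Host": f"{cluster_name}.azurehdinsight.net",  # Host field
        "Port": str(port),                             # default is 443
        "Schema": database,                            # Database field
        "HiveServerType": "2",                         # Hive Server 2
        "AuthMech": "6",                               # assumed: HDInsight Service
        "UID": user,
        "PWD": password,
    }
    for key, value in settings.items():
        if ";" in value:
            raise ValueError(f"{key} must not contain ';'")
    return ";".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical cluster and credentials, for illustration only.
    print(build_hive_connection_string("mycluster", "admin", "secret"))
```

A string like this could be handed to any ODBC client library in place of a DSN name, but creating the DSN through the dialog box remains the approach this hour follows.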

Click Advanced Options to specify advanced settings for the Hive ODBC DSN (see Figure 14.6).

FIGURE 14.6 Advanced options of Hive ODBC DSN.

Use Native Query: By default, the Hive ODBC Driver uses SQL Connector to translate standard SQL-92 queries into equivalent HiveQL queries. You can check this option to turn off this feature if you are directly passing HiveQL queries.

Fast SQLPrepare: Check this option to defer HiveQL query execution to retrieve the result set metadata for SQLPrepare. If the result set metadata is not required after calling SQLPrepare, enabling this option will improve performance.

Driver Config Take Precedence: Check this option to allow ODBC driver-wide configurations to take precedence over connection string and DSN settings.

Use Async Exec: Check this option to use the asynchronous version of the API call against the Hive Server when executing a query.

Get Tables with Query: This is applicable only when you are using Hive Server 2. Turn on this option to retrieve the names of tables in a particular database using the SHOW TABLES query.

Unicode SQL Character Types: Check this option to enable the ODBC Driver to return Unicode characters (of type SQL_WVARCHAR or SQL_WCHAR) instead of non-Unicode characters (of type SQL_VARCHAR or SQL_CHAR).

Show HIVE_SYSTEM Table: Check this option to enable the ODBC driver to return the HIVE_SYSTEM pseudotable for catalog function calls such as SQLTables and SQLColumns. The pseudotable is under the pseudoschema HIVE_SYSTEM. The table has two String type columns, ENVKEY and ENVVALUE. We demonstrate this in the section “Accessing Hive Data from SQL Server,” in Hour 15.

Rows Fetched Per Block: You can specify any 32-bit positive integer to indicate the number of rows to bundle into a single block. The default value of 10,000 rows per block works best in most scenarios.

Default String Column Length: Specify the maximum data length for columns, with the string data type, to be considered during query execution.

GO TO We look more at the string column length setting in the section “Accessing Hive Data from SQL Server,” in Hour 15.

Binary Column Length: Specify the maximum data length for columns, with binary data type, to be considered during query execution.

Decimal Column Scale: Specify the maximum number of digits to the right of the decimal point for numeric data types.

Async Exec Poll Interval (ms): Specify the time (in milliseconds) between each poll of the query execution status (asynchronous RPC call used to execute a query against Hive).
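To get a feel for the Rows Fetched Per Block setting, it helps to work out how many blocks a result set is split into. The Python sketch below is plain arithmetic, not a call into the driver; the function name `blocks_needed` is hypothetical, and treating "32-bit positive integer" as a signed 32-bit range is an assumption.

```python
DEFAULT_ROWS_PER_BLOCK = 10_000
MAX_INT32 = 2**31 - 1  # assumption: the setting is a signed 32-bit integer


def blocks_needed(total_rows: int, rows_per_block: int = DEFAULT_ROWS_PER_BLOCK) -> int:
    """Return how many blocks are needed to carry total_rows rows."""
    if not 1 <= rows_per_block <= MAX_INT32:
        raise ValueError("rows_per_block must be a positive 32-bit integer")
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    # Ceiling division: a partly filled final block still counts as a block.
    return (total_rows + rows_per_block - 1) // rows_per_block


if __name__ == "__main__":
    print(blocks_needed(25_000))         # default block size of 10,000
    print(blocks_needed(25_000, 1_000))  # smaller blocks mean more of them
```

With the default of 10,000 rows per block, a 25,000-row result needs 3 blocks; at 1,000 rows per block, the same result needs 25.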

Optionally, you can click the Server Side Properties button (see Figure 14.6) to add, modify, or delete server-side properties. To do so, you specify appropriate keys and values to configure the Hive ODBC Driver to apply each server-side property you set by executing a query when opening a session to Hive Server 2.
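In HiveQL, a configuration property is set for a session with a SET statement of the form SET key=value. The Python sketch below shows how a set of key/value pairs maps to such statements. It is an illustration under the assumption that the queries the driver executes take this form; the function name is hypothetical, and `hive.exec.parallel` is used only as an example of a Hive property.

```python
from typing import Dict, List


def server_side_set_statements(properties: Dict[str, str]) -> List[str]:
    """Turn server-side property key/value pairs into HiveQL SET statements."""
    statements = []
    for key, value in properties.items():
        if not key.strip():
            raise ValueError("property key must not be empty")
        statements.append(f"SET {key}={value}")
    return statements


if __name__ == "__main__":
    # Example property chosen for illustration only.
    for statement in server_side_set_statements({"hive.exec.parallel": "true"}):
        print(statement)
```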



Configuring the 32-Bit Driver

In the ODBC Data Source Administrator (32-Bit) dialog box, go to the System DSN tab and click the Add button (see Figure 14.7).

FIGURE 14.7 Microsoft Hive ODBC Driver DSN setup (32-bit).

In the Create New Data Source wizard, select Microsoft Hive ODBC Driver (refer to Figure 14.4) and click the Finish button. If you don’t find Microsoft Hive ODBC Driver listed there, it indicates that the 32-bit Hive ODBC Driver is not yet installed.

Clicking Finish brings up the Microsoft Hive ODBC Driver DSN Setup page for 32-bit DSN setup (see Figure 14.7). Specify the values as mentioned in the previous section, “Setting Up the Hive ODBC Driver.”

As discussed in the previous section, you can click the Test button to verify connectivity with the HDInsight cluster. Click the Advanced Options button to specify other advanced settings.



Introduction to Microsoft Power BI

Microsoft Power BI is a cloud-based self-service BI solution for the enterprise. It enables users to get insight into virtually any type of data (including data from an HDInsight cluster) in the familiar Microsoft Excel, which users have been using for several decades.

Power BI empowers end users and provides tools that cover the end-to-end business intelligence scenario for a self-service BI solution. For example:

Leveraging Power Query for data discovery and combining data from different sources, including an HDInsight cluster

Using Power Pivot to model the data from different sources (such as defining the relationship, creating hierarchies, using KPIs and measures, and so on)

When the model is ready, visualizing the data from different perspectives with different intuitive and interactive visualization options in Power View and Power Map

Note

For our purpose, end users are people with all levels of skills, including data analysts, power users, business users, data stewards, and folks from the IT department.

Power BI is a broad umbrella that contains four major Excel add-ins (Power Query, Power Pivot, Power View, and Power Map). Power BI also integrates with Office 365, which, in turn, is built on a scalable, manageable, and trusted cloud platform. This integration provides better self-service analytics in the cloud and better collaboration capabilities: Users can now share reports they have created with other folks in the organization with the help of Office 365 online services, access these reports from anywhere in a browser or on mobile devices, ask questions of data in a natural language, and more.

Note

You can learn more about Power BI at http://www.microsoft.com/en-us/server-cloud/solutions/business-intelligence/self-service.aspx.



Accessing Hive Data from Microsoft Excel

Microsoft Excel is one of the most widely used applications globally. People from all departments use Microsoft Excel in their daily lives. To empower all those users, Microsoft supports accessing Hive-based data from an HDInsight cluster in familiar Excel, so it’s not just limited to IT people.

The following Try It Yourself exercise demonstrates how you can use Microsoft Excel 2013 to access Hive-based data from an HDInsight cluster.

Try It Yourself: Accessing Hive-Based Data from an HDInsight Cluster

To begin, open a new Excel file, click the Data tab, click From Other Sources, and then click From Microsoft Query (see Figure 14.8).

FIGURE 14.8 Accessing Hive-based data using Microsoft Query.

Tip

You can also choose From Data Connection Wizard, which has a similar interface, to use the Hive ODBC Driver to retrieve Hive-based data.

On the Choose Data Source screen, you can select the DSN you created earlier in the section “Setting Up the Hive ODBC Driver” (see Figure 14.9). (Because we have 32-bit Excel, we can see only 32-bit DSNs here.) Click OK to continue; you are prompted to specify the username and password to connect to the HDInsight cluster.



FIGURE 14.9 Choose Data Source screen.

On the Query Wizard - Choose Columns screen, you can see all the Hive tables available on the connected HDInsight cluster. You can choose a table from the left and add the columns of the table to be retrieved to the text box on the right (see Figure 14.10).

FIGURE 14.10 Choosing tables and columns.

Tip

In this exercise, we have selected only one table; however, you can select multiple tables and define the relationship between them to get the combined result set.

On the Query Wizard - Filter Data screen, you can specify a filter if you want to restrict the rows to be returned (see Figure 14.11).

FIGURE 14.11 Specify a filter, if any.

On the Query Wizard - Sort Order screen, you can specify column(s) and their sort order, if any. In our case, we want the result set to be sorted on the OnTimeDeparturePercentage column in descending order.
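The choices made on the Choose Columns, Filter Data, and Sort Order screens amount to the parts of a SELECT statement. The Python sketch below shows that correspondence; it is not the query text that Microsoft Query generates, which may differ. The function name, the table name `airline_stats`, and the `Carrier` column are hypothetical; only OnTimeDeparturePercentage comes from the exercise.

```python
from typing import List, Optional


def build_select(
    table: str,
    columns: List[str],
    where: Optional[str] = None,
    order_by: Optional[str] = None,
    descending: bool = False,
) -> str:
    """Compose a SELECT statement from wizard-style choices."""
    if not columns:
        raise ValueError("choose at least one column")
    sql = f"SELECT {', '.join(columns)} FROM {table}"  # Choose Columns
    if where:
        sql += f" WHERE {where}"                       # Filter Data
    if order_by:
        sql += f" ORDER BY {order_by}"                 # Sort Order
        if descending:
            sql += " DESC"
    return sql


if __name__ == "__main__":
    # Hypothetical table and Carrier column, for illustration only.
    print(
        build_select(
            "airline_stats",
            ["Carrier", "OnTimeDeparturePercentage"],
            order_by="OnTimeDeparturePercentage",
            descending=True,
        )
    )
```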



On the Query Wizard - Finish screen, you can specify where you want the retrieved data to be placed. Click the Save Query button to save the query if you want to use it later. Clicking the Finish button starts retrieving data from the HDInsight cluster; depending on your query and the amount of data, it might take some time.

After the data is retrieved from the HDInsight cluster, it should look like Figure 14.12. Bringing in data from an HDInsight cluster is easy in your familiar Excel tool. Depending on your need, you can apply any of the formatting options or features available in Excel.

FIGURE 14.12 Retrieved data in Excel from the Hive table.

For example, based on the data retrieved, we wanted to create a pie chart. We first formatted OnTimeDeparturePercentage to have a single digit after the decimal point. Then we inserted a pie chart (see Figure 14.13).


