Chapter 18.  The Ext2 and Ext3 Filesystems


18.1. General Characteristics of Ext2

Unix-like operating systems use several types of filesystems. Although the files of all such filesystems have a common subset of attributes required by a few POSIX APIs such as stat(), each filesystem is implemented in a different way.

The first versions of Linux were based on the MINIX filesystem. As Linux matured, the Extended Filesystem (Ext FS) was introduced; it included several significant extensions, but offered unsatisfactory performance. The Second Extended Filesystem (Ext2) was introduced in 1994; besides including several new features, it is quite efficient and robust and is, together with its offspring Ext3, the most widely used Linux filesystem.

The following features contribute to the efficiency of Ext2:

When creating an Ext2 filesystem, the system administrator may choose the optimal block size (from 1,024 to 4,096 bytes), depending on the expected average file length. For instance, a 1,024-byte block size is preferable when the average file length is smaller than a few thousand bytes because this leads to less internal fragmentation, that is, less of a mismatch between the file length and the portion of the disk that stores it (see the section "Memory Area Management" in Chapter 8, where internal fragmentation for dynamic memory was discussed). On the other hand, larger block sizes are usually preferable for files greater than a few thousand bytes because this leads to fewer disk transfers, thus reducing system overhead.

When creating an Ext2 filesystem, the system administrator may choose how many inodes to allow for a partition of a given size, depending on the expected number of files to be stored on it. This maximizes the effectively usable disk space.

The filesystem partitions disk blocks into groups. Each group includes data blocks and inodes stored in adjacent tracks. Thanks to this structure, files stored in a single block group can be accessed with a lower average disk seek time.

The filesystem preallocates disk data blocks to regular files before they are actually used. Thus, when the file increases in size, several blocks are already reserved at physically adjacent positions, reducing file fragmentation.

Fast symbolic links (see the section "Hard and Soft Links" in Chapter 1) are supported. If the symbolic link represents a short pathname (at most 60 characters), it can be stored in the inode and can thus be translated without reading a data block.
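To make the block-size trade-off described in the first item above concrete, here is a minimal sketch that counts the blocks a file occupies and the bytes lost to internal fragmentation. The function names are hypothetical, and the model is deliberately simplified: it ignores inodes, indirect blocks, and any other metadata.

```python
def blocks_needed(file_length, block_size):
    """Whole blocks required to store file_length bytes (ceiling division)."""
    return (file_length + block_size - 1) // block_size


def wasted_bytes(file_length, block_size):
    """Internal fragmentation: allocated bytes that hold no file data."""
    return blocks_needed(file_length, block_size) * block_size - file_length


if __name__ == "__main__":
    # Compare a small file and a larger one across the three block sizes.
    for length in (1500, 100_000):
        for block_size in (1024, 2048, 4096):
            print(length, block_size,
                  blocks_needed(length, block_size),
                  wasted_bytes(length, block_size))
```

Under this model, a 1,500-byte file wastes 548 bytes with 1,024-byte blocks but 2,596 bytes with 4,096-byte blocks, while a 100,000-byte file needs 98 small blocks but only 25 large ones, which is the fewer-transfers side of the trade-off.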

Moreover, the Second Extended Filesystem includes other features that make it both robust and flexible:

A careful implementation of file-updating that minimizes the impact of system crashes. For instance, when creating a new hard link for a file, the counter of hard links in the disk inode is increased first, and the new name is added into the proper directory next. In this way, if a hardware failure occurs after the inode update but before the directory can be changed, the directory is consistent, even if the inode's hard link counter is wrong. Deleting the file does not lead to catastrophic results, although the file's data blocks cannot be automatically reclaimed. If the reverse were done (changing the directory before updating the inode), the same hardware failure would produce a dangerous inconsistency: deleting the original hard link would remove its data blocks from disk, yet the new directory entry would refer to an inode that no longer exists. If that inode number were used later for another file, writing into the stale directory entry would corrupt the new file.

Support for automatic consistency checks on the filesystem status at boot time. The checks are performed by the e2fsck external program, which may be activated not only after a system crash, but also after a predefined number of filesystem mounts (a counter is increased after each mount operation) or after a predefined amount of time has elapsed since the most recent check.

Support for immutable files (they cannot be modified, deleted, or renamed) and for append-only files (data can be added only to the end of them).

Compatibility with both the Unix System V Release 4 and the BSD semantics of the user group ID for a new file. In SVR4, the new file assumes the user group ID of the process that creates it; in BSD, the new file inherits the user group ID of the directory containing it. Ext2 includes a mount option that specifies which semantic to use.
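The ordering argument for hard-link creation in the first item above can be illustrated with a toy model. This is only a sketch under stated assumptions: the inode is a dictionary holding a link counter, the directory is a dictionary mapping names to an inode number, and all function names are hypothetical. It is not how the kernel implements either operation.

```python
def create_hard_link(inode, directory, new_name, ino,
                     inode_first, crash_between_steps=False):
    """Perform the two updates in the chosen order, optionally 'crashing' midway."""
    def bump_counter():
        inode["links"] += 1

    def add_entry():
        directory[new_name] = ino

    first, second = ((bump_counter, add_entry) if inode_first
                     else (add_entry, bump_counter))
    first()
    if crash_between_steps:
        return  # simulated hardware failure: the second update is never written
    second()


def unlink(inode, directory, name):
    """Remove a name; return True if the inode and its data blocks are freed."""
    del directory[name]
    inode["links"] -= 1
    return inode["links"] == 0


def crash_scenario(inode_first):
    """Crash during link creation, then delete the original name."""
    inode = {"links": 1}
    directory = {"original": 12}
    create_hard_link(inode, directory, "alias", 12, inode_first,
                     crash_between_steps=True)
    freed = unlink(inode, directory, "original")
    return freed, sorted(directory)  # remaining names still pointing at inode 12
```

With the inode updated first, `crash_scenario(True)` returns `(False, [])`: the blocks are leaked because the counter never reaches zero, but no directory entry is left dangling. With the directory updated first, `crash_scenario(False)` returns `(True, ['alias'])`: the data is freed while a stale name still refers to the inode.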

Although the Ext2 filesystem is a mature, stable program, several additional features have been considered for inclusion. Some of them have already been coded and are available as external patches. Others are just planned, but in some cases, fields have already been introduced in the Ext2 inode for them. The most significant features being considered are:



Block fragmentation

System administrators usually choose large block sizes for accessing disks, because computer applications often deal with large files. As a result, small files stored in large blocks waste a lot of disk space. This problem can be solved by allowing several files to be stored in different fragments of the same block.



Handling of transparently compressed and encrypted files

These new options, which must be specified when creating a file, allow users to transparently store compressed and/or encrypted versions of their files on disk.



Logical deletion

An undelete option allows users to easily recover, if needed, the contents of a previously removed file.



Journaling

Journaling avoids the time-consuming check that is automatically performed on a filesystem when it is abruptly unmounted, for instance, as a consequence of a system crash.

In practice, none of these features has been officially included in the Ext2 filesystem. One might say that Ext2 is a victim of its own success; it was the preferred filesystem adopted by most Linux distribution companies until a few years ago, and the millions of users who relied on it every day would have looked suspiciously at any attempt to replace Ext2 with some other filesystem.

The most compelling feature missing from Ext2 is journaling, which is required by high-availability servers. To provide for a smooth transition, journaling has not been introduced in the Ext2 filesystem; rather, as we'll discuss in the later section "The Ext3 Filesystem," a more recent filesystem that is fully compatible with Ext2 has been created, which also offers journaling. Users who do not really require journaling may continue to use the good old Ext2 filesystem, while the others will likely adopt the new filesystem. Nowadays, most distributions adopt Ext3 as the standard filesystem.



18.2. Ext2 Disk Data Structures

The first block in each Ext2 partition is never managed by the Ext2 filesystem, because it is reserved for the partition boot sector (see Appendix A). The rest of the Ext2 partition is split into block groups, each of which has the layout shown in Figure 18-1. As you will notice from the figure, some data structures must fit in exactly one block, while others may require more than one block. All the block groups in the filesystem have the same size and are stored sequentially, so the kernel can derive the location of a block group on disk simply from its integer index.



Figure 18-1. Layouts of an Ext2 partition and of an Ext2 block group
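Because all block groups have the same size and are stored sequentially, locating a group is simple arithmetic. The following sketch assumes that the first group starts at the block number held in s_first_data_block and that every group spans s_blocks_per_group blocks (both are superblock fields); the function names are hypothetical.

```python
def group_first_block(group_index, blocks_per_group, first_data_block=1):
    """Block number at which the given block group starts."""
    return first_data_block + group_index * blocks_per_group


def group_of_block(block_number, blocks_per_group, first_data_block=1):
    """Index of the block group that contains the given block number."""
    return (block_number - first_data_block) // blocks_per_group


if __name__ == "__main__":
    # With 8,192 blocks per group, print where the first four groups begin.
    for index in range(4):
        print(index, group_first_block(index, 8192))
```

The inverse mapping, from a block number back to its group, is the same arithmetic reversed, which is why no lookup table is needed for either direction.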



Block groups reduce file fragmentation, because the kernel tries to keep the data blocks belonging to a file in the same block group, if possible. Each block in a block group contains one of the following pieces of information:

A copy of the filesystem's superblock

A copy of the group of block group descriptors

A data block bitmap

An inode bitmap

A table of inodes

A chunk of data that belongs to a file; i.e., data blocks

If a block does not contain any meaningful information, it is said to be free.

As you can see from Figure 18-1, both the superblock and the group descriptors are duplicated in each block group. Only the superblock and the group descriptors included in block group 0 are used by the kernel, while the remaining superblocks and group descriptors are left unchanged; in fact, the kernel doesn't even look at them. When the e2fsck program executes a consistency check on the filesystem status, it refers to the superblock and the group descriptors stored in block group 0, and then copies them into all other block groups. If data corruption occurs and the main superblock or the main group descriptors in block group 0 become invalid, the system administrator can instruct e2fsck to refer to the old copies of the superblock and the group descriptors stored in a block group other than the first. Usually, the redundant copies store enough information to allow e2fsck to bring the Ext2 partition back to a consistent state.

How many block groups are there? Well, that depends both on the partition size and the block size. The main constraint is that the block bitmap, which is used to identify the blocks that are used and free inside a group, must be stored in a single block. A block of b bytes holds 8×b bits, and each bit tracks one block. Therefore, in each block group, there can be at most 8×b blocks, where b is the block size in bytes. Thus, the total number of block groups is roughly s/(8×b), where s is the partition size in blocks.

For example, let's consider a 32-GB Ext2 partition with a 4-KB block size. In this case, each 4-KB block bitmap describes 32K data blocks, that is, 128 MB. Therefore, at most 256 block groups are needed. Clearly, the smaller the block size, the larger the number of block groups.
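The arithmetic of this example can be reproduced with a short sketch. It assumes groups of the maximum size allowed by a one-block bitmap and ignores the reserved boot block; the function names are hypothetical.

```python
def max_blocks_per_group(block_size):
    """A one-block bitmap has 8 bits per byte, one bit per tracked block."""
    return 8 * block_size


def block_group_count(partition_bytes, block_size):
    """Number of block groups needed, rounding the last partial group up."""
    total_blocks = partition_bytes // block_size
    per_group = max_blocks_per_group(block_size)
    return (total_blocks + per_group - 1) // per_group


GIB = 2 ** 30

if __name__ == "__main__":
    for block_size in (4096, 2048, 1024):
        print(block_size, max_blocks_per_group(block_size),
              block_group_count(32 * GIB, block_size))
```

For a 32-GB partition this gives 256 groups with 4-KB blocks and 4,096 groups with 1-KB blocks: halving the block size halves both the blocks per group and the bytes per block, so the group count grows fourfold.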



18.2.1. Superblock

An Ext2 disk superblock is stored in an ext2_super_block structure, whose fields are listed in Table 18-1.[*] The __u8, __u16, and __u32 data types denote unsigned numbers of length 8, 16, and 32 bits respectively, while the __s8, __s16, and __s32 data types denote signed numbers of length 8, 16, and 32 bits. To explicitly specify the order in which the bytes of a word or double-word are stored on disk, the kernel also makes use of the __le16, __le32, __be16, and __be32 data types; the former two types denote the little-endian ordering for words and double-words, respectively (the least significant byte is stored at the lowest address), while the latter two types denote the big-endian ordering (the most significant byte is stored at the lowest address).

[*] To ensure compatibility between the Ext2 and Ext3 filesystems, the ext2_super_block data structure includes some Ext3-specific fields, which are not shown in Table 18-1.
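The difference between the little-endian and big-endian types can be seen by encoding the same 32-bit value both ways. This sketch uses Python's standard struct module rather than kernel code, and the helper names are hypothetical.

```python
import struct


def encode_le32(value):
    """Bytes of a 32-bit unsigned value in little-endian order."""
    return struct.pack("<I", value)


def encode_be32(value):
    """Bytes of a 32-bit unsigned value in big-endian order."""
    return struct.pack(">I", value)


def decode_le32(raw):
    """Interpret 4 bytes as a little-endian 32-bit unsigned value."""
    return struct.unpack("<I", raw)[0]


value = 0x12345678
# Bytes are printed in increasing address order.
print(encode_le32(value).hex())  # 78563412
print(encode_be32(value).hex())  # 12345678
```

Reading a little-endian field with the wrong byte order yields a different number rather than an error, which is why the on-disk types carry the ordering in their names.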



Table 18-1. The fields of the Ext2 superblock

Type        Field                     Description
__le32      s_inodes_count            Total number of inodes
__le32      s_blocks_count            Filesystem size in blocks
__le32      s_r_blocks_count          Number of reserved blocks
__le32      s_free_blocks_count       Free blocks counter
__le32      s_free_inodes_count       Free inodes counter
__le32      s_first_data_block        Number of first useful block (always 1)
__le32      s_log_block_size          Block size
__le32      s_log_frag_size           Fragment size
__le32      s_blocks_per_group        Number of blocks per group
__le32      s_frags_per_group         Number of fragments per group
__le32      s_inodes_per_group        Number of inodes per group
__le32      s_mtime                   Time of last mount operation
__le32      s_wtime                   Time of last write operation
__le16      s_mnt_count               Mount operations counter
__le16      s_max_mnt_count           Number of mount operations before check
__le16      s_magic                   Magic signature
__le16      s_state                   Status flag
__le16      s_errors                  Behavior when detecting errors
__le16      s_minor_rev_level         Minor revision level
__le32      s_lastcheck               Time of last check
__le32      s_checkinterval           Time between checks
__le32      s_creator_os              OS where filesystem was created
__le32      s_rev_level               Revision level of the filesystem
__le16      s_def_resuid              Default UID for reserved blocks
__le16      s_def_resgid              Default user group ID for reserved blocks
__le32      s_first_ino               Number of first nonreserved inode
__le16      s_inode_size              Size of on-disk inode structure
__le16      s_block_group_nr          Block group number of this superblock
__le32      s_feature_compat          Compatible features bitmap
__le32      s_feature_incompat        Incompatible features bitmap
__le32      s_feature_ro_compat       Read-only compatible features bitmap
__u8[16]    s_uuid                    128-bit filesystem identifier
char[16]    s_volume_name             Volume name
char[64]    s_last_mounted            Pathname of last mount point
__le32      s_algorithm_usage_bitmap  Used for compression
__u8        s_prealloc_blocks         Number of blocks to preallocate
__u8        s_prealloc_dir_blocks     Number of blocks to preallocate for directories
__u16       s_padding1                Alignment to word
__u32[204]  s_reserved                Nulls to pad out 1,024 bytes
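As an illustration of how the layout in Table 18-1 maps onto raw bytes, the sketch below decodes a 1,024-byte buffer with Python's struct module. The format string is derived only from the types and order listed in the table. Treating s_reserved as 816 opaque bytes, and the names SUPERBLOCK_FORMAT, FIELD_NAMES, and parse_superblock, are choices made for this example; locating the superblock on disk is outside its scope.

```python
import struct

# "<" selects little-endian with no padding; the counts follow Table 18-1.
SUPERBLOCK_FORMAT = "<13I6H4I2HI2H3I16s16s64sI2BH816s"

FIELD_NAMES = (
    "s_inodes_count", "s_blocks_count", "s_r_blocks_count",
    "s_free_blocks_count", "s_free_inodes_count", "s_first_data_block",
    "s_log_block_size", "s_log_frag_size", "s_blocks_per_group",
    "s_frags_per_group", "s_inodes_per_group", "s_mtime", "s_wtime",
    "s_mnt_count", "s_max_mnt_count", "s_magic", "s_state", "s_errors",
    "s_minor_rev_level", "s_lastcheck", "s_checkinterval", "s_creator_os",
    "s_rev_level", "s_def_resuid", "s_def_resgid", "s_first_ino",
    "s_inode_size", "s_block_group_nr", "s_feature_compat",
    "s_feature_incompat", "s_feature_ro_compat", "s_uuid", "s_volume_name",
    "s_last_mounted", "s_algorithm_usage_bitmap", "s_prealloc_blocks",
    "s_prealloc_dir_blocks", "s_padding1", "s_reserved",
)


def parse_superblock(raw):
    """Decode a 1,024-byte superblock image into a dict keyed by field name."""
    expected = struct.calcsize(SUPERBLOCK_FORMAT)
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    return dict(zip(FIELD_NAMES, struct.unpack(SUPERBLOCK_FORMAT, raw)))
```

The listed fields add up to exactly 1,024 bytes, so `struct.calcsize(SUPERBLOCK_FORMAT)` doubles as a check that the format string matches the table.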



The s_inodes_count field stores the number of inodes, while the s_blocks_count field stores the number of blocks in the Ext2 filesystem.

The s_log_block_size field expresses the block size as a power of 2, using 1,024 bytes as the unit. Thus, 0 denotes 1,024-byte blocks, 1 denotes 2,048-byte blocks, and so on. The s_log_frag_size field is currently equal to s_log_block_size, because block fragmentation is not yet implemented.
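The encoding of s_log_block_size amounts to a left shift of 1,024. A minimal sketch, with hypothetical function names:

```python
def block_size_from_log(s_log_block_size):
    """Block size in bytes for a given s_log_block_size value."""
    return 1024 << s_log_block_size


def log_from_block_size(block_size):
    """Inverse mapping; rejects sizes that are not 1,024 times a power of 2."""
    if block_size < 1024 or block_size & (block_size - 1):
        raise ValueError("block size must be 1024 times a power of 2")
    return block_size.bit_length() - 11  # 1024 has a bit length of 11
```

For the block sizes mentioned earlier in the chapter, the field therefore holds 0, 1, or 2 for 1,024-, 2,048-, and 4,096-byte blocks.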

The s_blocks_per_group, s_frags_per_group, and s_inodes_per_group fields store the number of blocks, fragments, and inodes in each block group, respectively.

Some disk blocks are reserved for the superuser (or for some other user or group of users selected by the s_def_resuid and s_def_resgid fields). These blocks allow the system administrator to continue to use the filesystem even when no more free blocks are available for normal users.

The s_mnt_count, s_max_mnt_count, s_lastcheck, and s_checkinterval fields set up the Ext2 filesystem to be checked automatically at boot time. These fields cause e2fsck to run after a predefined number of mount operations has been performed, or when a predefined amount of time has elapsed since the last consistency check. (Both kinds of checks can be used together.) The consistency check is also enforced at boot time if the filesystem has not been cleanly unmounted (for instance, after a system crash) or when the kernel discovers some errors in it. The s_state field stores the value 0 if the filesystem is mounted or was not cleanly unmounted, 1 if it was cleanly unmounted, and 2 if it contains errors.

Thes_statefieldstoresthevalue0ifthefilesystemismounted

orwasnotcleanlyunmounted,1ifitwascleanlyunmounted,

and2ifitcontainserrors.
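The conditions described above can be combined into a single decision, sketched here. This is not e2fsck's actual logic: the function and constant names are hypothetical, and treating a zero s_max_mnt_count or s_checkinterval as "limit disabled" is an assumption of this example.

```python
STATE_CLEAN = 1  # s_state value for a cleanly unmounted filesystem


def check_reasons(sb, now):
    """Return the reasons a boot-time consistency check would be triggered."""
    reasons = []
    if sb["s_state"] != STATE_CLEAN:
        reasons.append("not cleanly unmounted or errors detected")
    # Assumption: a zero limit means the corresponding check is disabled.
    if sb["s_max_mnt_count"] > 0 and sb["s_mnt_count"] >= sb["s_max_mnt_count"]:
        reasons.append("maximum mount count reached")
    if sb["s_checkinterval"] > 0 and now - sb["s_lastcheck"] >= sb["s_checkinterval"]:
        reasons.append("check interval elapsed")
    return reasons
```

Here `sb` is any mapping that carries the five superblock fields involved, and `now` and s_lastcheck are timestamps in the same unit as s_checkinterval. An empty list means none of the conditions applies.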



18.2.2. Group Descriptor and Bitmap

Each block group has its own group descriptor, an ext2_group_desc


