Figure 5.5 SPARC sun4u 32- and 64-Bit Process Address Space


Virtual Address Spaces

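[Figure: Intel x86 process address space, from high to low addresses: 256-MB Kernel Context (0xE0000000 to 0xFFFFFFFF); Libraries; HEAP, grown by malloc() and sbrk(); Executable DATA; Executable TEXT (starting at 0x8048000); Stack (below the text, down toward 0x0).]
Figure 5.6 Intel x86 Process Address Space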
memory allocator manages the heap area; thus, arbitrarily sized memory objects
can be allocated and freed. The general-purpose memory allocator is implemented
with malloc() and related library calls.
A process grows its heap space by making the sbrk() system call. The sbrk()
system call grows the heap segment by the amount requested each time it is
called. A user program does not need to call sbrk() directly because the malloc() library calls sbrk() when it needs more space to allocate from. The
sbrk() system call is shown below.
void *sbrk(intptr_t incr);

The heap segment is virtual memory, so requesting memory with malloc() and
sbrk() does not allocate physical memory; it merely allocates virtual address
space. Physical memory is allocated, one page at a time, only when the first
reference is made to a page within the allocated virtual memory. The memory
system achieves this “zero fill on demand” allocation transparently: a page fault
occurs the first time a page in the heap is referenced, and the segment driver
recognizes the first memory access and creates a page at that location on the
fly.
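
The difference between reserving heap address space and committing pages can be sketched from user space. The helper names grow_heap() and touch_pages() below are hypothetical and exist only for this illustration; the sketch assumes a POSIX-like system that still provides sbrk(). Growing the heap only moves the end of the heap segment, and it is the later stores that cause the kernel to supply pages.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() in glibc headers */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Grow the heap by incr bytes with sbrk() and return the start of the
 * newly added range, or NULL on failure. This call only moves the end
 * of the heap segment (the "break"); it does not touch any page.
 */
static char *grow_heap(intptr_t incr)
{
    void *old_break = sbrk(incr);   /* sbrk() returns the previous break */
    if (old_break == (void *)-1)
        return NULL;
    return (char *)old_break;
}

/*
 * Store to the first byte of every page in [start, start + len).
 * The first store to an untouched page is what makes the kernel
 * provide a page for it. Returns the number of pages touched.
 */
static size_t touch_pages(char *start, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t pagesize = (size_t)ps;
    size_t touched = 0;

    for (size_t off = 0; off < len; off += pagesize) {
        start[off] = 1;
        touched++;
    }
    return touched;
}
```

A caller would typically pair the two, for example growing the heap by a megabyte with grow_heap() and then calling touch_pages() on the returned range; only the second step should change the amount of physical memory the process uses.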

Solaris Memory Architecture

Memory pages are allocated to the process heap by zero-fill-on-demand and then
remain in the heap segment until the process exits or until they are stolen by the
page scanner. Calls to the memory allocator free() function do not return physical memory to the free memory pool; free() simply marks the area within the
heap space as free for later use. For this reason, it is typical to see the amount of
physical memory allocated to a process grow, but unless there is a memory shortage, it will not shrink, even if free() has been called.
The heap can grow until it collides with the memory area occupied by the
shared libraries. The maximum size of the heap depends on the platform virtual
memory layout and differs on each platform. In addition, on 64-bit platforms, processes may execute in either 32- or 64-bit mode. As shown in Figure 5.5 on
page 134, the size of the heap can be much larger in processes executing in 64-bit
mode. Table 5-1 shows the maximum heap sizes and the operating system requirements that affect the maximum size.
Table 5-1 Maximum Heap Sizes

Solaris Version                                Maximum Heap Size                  Notes
Solaris 2.5                                    2 Gbytes
Solaris 2.5.1                                  2 Gbytes
Solaris 2.5.1 with patch 103640-08 or greater  3.75 Gbytes                        Need to be root to increase limit above 2 GB with ulimit(1M).
Solaris 2.5.1 with patch 103640-23 or greater  3.75 Gbytes                        Do not need to be root to increase limit.
Solaris 2.6                                    3.75 Gbytes                        Need to increase beyond 2 GB with ulimit(1M).
Solaris 2.7 32 bit mode                        3.75 Gbytes                        (Non-sun4u platform)
                                               3.90 Gbytes                        (sun4u platforms)
Solaris 2.7 64 bit mode                        16 Tbytes on UltraSPARC-I and -II  Virtually unlimited.
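
At the system-call level, the limit that ulimit adjusts for the heap is the data segment resource limit. As a minimal sketch, assuming a POSIX system that provides getrlimit() and setrlimit() with RLIMIT_DATA, a program can inspect the limit and raise its soft limit up to the hard limit. The helper names heap_limits() and raise_heap_limit_to_max() are hypothetical.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Read the soft and hard limits on the data segment (heap) size.
 * Returns 0 on success, -1 on failure.
 */
static int heap_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/*
 * Raise the soft limit to the current hard limit. An unprivileged
 * process may do this; raising the hard limit requires privilege.
 * Returns 0 on success, -1 on failure.
 */
static int raise_heap_limit_to_max(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_DATA, &rl);
}
```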

5.3.5 The Stack
The process stack is mapped into the address space with an initial allocation and
then grows downward. The stack, like the heap, grows on demand, but no library
grows the stack; instead, a different mechanism triggers this growth.
Initially, a single page is allocated for the stack, and as the process executes and
calls functions, it pushes the program counter, arguments, and local variables onto
the stack. When the stack grows larger than one page, the process causes a page
fault, and the kernel notices that this is a stack segment page fault and grows the
stack segment.

5.3.6 Address Space Management
The Solaris kernel is implemented with a central address management subsystem
that other parts of the kernel call into. The address space module is a wrapper
around the segment drivers, so that subsystems need not know which segment
driver is used for a memory range. The address space object shown in Figure 5.7 is
linked from the process structure (through p_as) and contains pointers to the
segments that constitute the address space.
[Figure: struct proc points through p_as to struct as (fields a_segs, a_size, a_nsegs, a_flags, a_hat, a_tail, a_watchp). The struct as links a list of struct seg objects (fields s_base, s_size, s_as, s_prev, s_next, s_ops), drawn alongside the regions of the address space: 256-MB Kernel Context, Stack, Libraries, HEAP (malloc(), sbrk()), Executable DATA, and Executable TEXT.]
Figure 5.7 The Address Space
The address space subsystem manages the following functions:
• Duplication of address spaces, for fork()
• Destruction of address spaces, for exit()
• Creation of new segments within an address space
• Removal of segments from an address space
• Setting and management of page protection for an address space
• Page fault routing for an address space
• Page locking and advice for an address space
• Management of watchpoints for an address space
Recall that the process and kernel subsystems call into the address space subsystem to manage their address spaces. The address space subsystem consists of a
series of functions, grouped to perform the functions listed above. Although the
subsystem has a lot of entry points, the implementation is fairly simple because
most of the functions simply look up which segment the operation needs to operate on and then route the request to the appropriate segment driver.
A call to the as_alloc() function creates an address space, but as_alloc() is
invoked only once, when the system boots and the init process is created. After
the init process is created, all address spaces are created by duplication of the init
process’s address space with fork(). The fork() system call in turn calls the
as_dup() function to duplicate the address space of the current process as it
creates a new process, and the entire address space configuration, including the
stack and heap, is replicated at this point.
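
The effect of duplicating the address space is visible from user space: after fork(), the child sees the same virtual addresses as the parent, but its stores land in its own copy. The following is a minimal sketch using standard POSIX calls; the function name parent_unaffected_by_child() is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that overwrites one stack variable and one heap variable,
 * then check that the parent's copies still hold their original values.
 * Returns 1 if the parent's values survived, 0 if not, -1 on error.
 */
static int parent_unaffected_by_child(void)
{
    int on_stack = 1;
    int *on_heap = malloc(sizeof *on_heap);

    if (on_heap == NULL)
        return -1;
    *on_heap = 1;

    pid_t pid = fork();
    if (pid < 0) {
        free(on_heap);
        return -1;
    }
    if (pid == 0) {
        /* Child: same virtual addresses, but a private copy of the data. */
        on_stack = 2;
        *on_heap = 2;
        _exit(on_stack + *on_heap == 4 ? 0 : 1);
    }

    /* Parent: wait for the child, then inspect our own copies. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        free(on_heap);
        return -1;
    }
    int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int unchanged = (on_stack == 1 && *on_heap == 1);

    free(on_heap);
    return child_ok && unchanged;
}
```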
The behavior of vfork() at this point is somewhat different. Rather than calling
as_dup() to replicate the address space, vfork() creates a new process by
borrowing the parent’s existing address space. The vfork() function is useful if
the child is going to call exec() immediately, since it saves the effort of
duplicating an address space that would be discarded as soon as exec() is called.
The parent process is suspended while the child is using its address space, until
the child calls exec() or exits.
Once the process is created, the address space object is allocated and set up. The
Solaris 7 data structure for the address space object is shown below.
struct as {
    kmutex_t a_contents;            /* protect certain fields in the structure */
    uchar_t a_flags;                /* as attributes */
    uchar_t a_vbits;                /* used for collecting statistics */
    kcondvar_t a_cv;                /* used by as_rangelock */
    struct hat *a_hat;              /* hat structure */
    struct hrmstat *a_hrm;          /* ref and mod bits */
    caddr_t a_userlimit;            /* highest allowable address in this as */
    union {
        struct seg *seglast;        /* last segment hit on the addr space */
        ssl_spath *spath;           /* last search path in seg skiplist */
    } a_cache;
    krwlock_t a_lock;               /* protects fields below + a_cache */
    int a_nwpage;                   /* number of watched pages */
    struct watched_page *a_wpage;   /* list of watched pages (procfs) */
    seg_next a_segs;                /* segments in this address space. */
    size_t a_size;                  /* size of address space */
    struct seg *a_tail;             /* last element in the segment list. */
    uint_t a_nsegs;                 /* number of elements in segment list */
    uchar_t a_lrep;                 /* representation of a_segs: see #defines */
    uchar_t a_hilevel;              /* highest level in the a_segs skiplist */
    uchar_t a_unused;
    uchar_t a_updatedir;            /* mappings changed, rebuild as_objectdir */
    vnode_t **a_objectdir;          /* object directory (procfs) */
    size_t a_sizedir;               /* size of object directory */
};

Header File

Address space fault handling is performed in the address space subsystem; some
faults are handled by the common address space code, and others are redirected
to the segment handlers. When a page fault occurs, the Solaris trap handlers call
the as_fault() function, which determines which segment the page fault occurred
in by calling the as_segat() function. If the fault does not lie in any of the
address space’s segments, as_fault() sends a SIGSEGV signal to the process. If
the fault does lie within one of the segments, the segment’s fault method is
called and the segment handles the page fault.
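
The lookup-and-route pattern described above can be sketched as a small user-space model. This is not the Solaris implementation: every name beginning with toy_ is hypothetical, the segment list is a plain singly linked list, and the return value SIGSEGV merely stands in for posting the signal to the process.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

struct toy_seg;

/* Per-driver operations vector; only a fault method is modeled. */
struct toy_seg_ops {
    int (*fault)(struct toy_seg *seg, uintptr_t addr);
};

/* A segment maps the address range [base, base + size). */
struct toy_seg {
    uintptr_t base;
    size_t size;
    const struct toy_seg_ops *ops;
    struct toy_seg *next;
    int faults_handled;             /* bookkeeping for this example only */
};

/* A toy address space: nothing more than a list of segments. */
struct toy_as {
    struct toy_seg *segs;
};

/* Return the segment containing addr, or NULL if no segment maps it. */
static struct toy_seg *toy_segat(struct toy_as *as, uintptr_t addr)
{
    for (struct toy_seg *s = as->segs; s != NULL; s = s->next) {
        if (addr >= s->base && addr - s->base < s->size)
            return s;
    }
    return NULL;
}

/*
 * Route a fault to the owning segment's fault method. Returns that
 * method's result, or SIGSEGV if no segment maps the address.
 */
static int toy_as_fault(struct toy_as *as, uintptr_t addr)
{
    struct toy_seg *seg = toy_segat(as, addr);

    if (seg == NULL)
        return SIGSEGV;
    return seg->ops->fault(seg, addr);
}

/* A stand-in fault method that only counts the faults it resolves. */
static int toy_count_fault(struct toy_seg *seg, uintptr_t addr)
{
    (void)addr;
    seg->faults_handled++;
    return 0;
}
```

The address space code never needs to know what kind of segment it found; it only calls through the operations vector, which is the property that lets different segment drivers share one fault path.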
Table 5-2 lists the address space functions in alphabetical order.
Table 5-2 Solaris 7 Address Space Functions

Method                Description
as_addseg()           Creates a new segment and links it into the address space.
as_alloc()            Creates a new address space object (only called from the
                      kernel for the init process).
as_clearwatch()       Clears all watchpoints for the address space.
as_ctl()              Sends memory advice to an address range for the address
                      space.
as_dup()              Duplicates the entire address space.
as_exec()             Special code for exec to move the stack segment from its
                      interim place in the old address to the right place in the
                      new address space.
as_fault()            Handles a page fault in the address space.
as_findseg()          Finds a segment containing the supplied virtual address.
as_free()             Destroys the address space object; called by exit().
as_gap()              Finds a hole of at least the specified size within [base,
                      base + len). If the flag supplied specifies AH_HI, the hole
                      will have the highest possible address in the range.
                      Otherwise, it will have the lowest possible address. If the
                      flag supplied specifies AH_CONTAIN, the hole will contain
                      the address addr. If an adequate hole is found, base and
                      len are set to reflect the part of the hole that is within
                      range, and 0 is returned. Otherwise, −1 is returned.
as_getmemid()         Calls the segment driver containing the supplied address
                      to find a unique ID for this segment.
as_getprot()          Gets the current protection settings for the supplied
                      address.
as_map()              Maps a file into the address space.
as_memory()           Returns the next range within [base, base + len) that is
                      backed with “real memory.”
as_pagelock()         Locks a page within an address space by calling the
                      segment page lock function.
as_pagereclaim()      Retrieves a page from the free list for the address
                      supplied.
as_pageunlock()       Unlocks a page within the address space.
as_rangebroadcast()   Wakes up all threads waiting on the address space
                      condition variable.
as_rangelock()        Locks the pages for the supplied address range.
as_rangeunlock()      Unlocks the pages for the supplied address range.
as_rangewait()        Waits for virtual addresses to become available in the
                      specified address space. AS_CLAIMGAP must be held by
                      the caller and is reacquired before returning to the caller.
as_segat()            Finds a segment containing the supplied address.
as_setprot()          Sets the virtual mapping for the interval from [addr :
                      addr + size) in address space as to have the specified
                      protection.
as_setwatch()         Sets a watchpoint for the address. On a system without
                      watchpoint support, does nothing.
as_swapout()          Swaps the pages associated with the address space to
                      secondary storage, returning the number of bytes actually
                      swapped.
as_unmap()            Unmaps a segment from the address space.

5.3.7 Virtual Memory Protection Modes
We break each process into segments so that we can treat each part of the address
space differently. For example, the kernel maps the machine code portion of the
executable binary into the process as read-only to prevent the process from modifying its machine code instructions. The virtual memory subsystem does this by taking advantage of the hardware MMU’s virtual memory protection capabilities.
Solaris relies on the MMU having the following protection modes:
• Read — The mapping is allowed to be read from.
• Write — The mapping is allowed to be written to.
• Executable — The mapping is allowed to have machine codes executed within
its address range.
The implementation of protection modes is done in the segment and HAT layers.

5.3.8 Page Faults in Address Spaces
The Solaris virtual memory system uses the hardware MMU’s memory management
capabilities. MMU-generated exceptions tell the operating system when a
memory access cannot continue without the kernel’s intervention, by interrupting
the executing process with a trap (see “Entering Kernel Mode” on page 28) and
then invoking the appropriate piece of memory management code. Three major
types of memory-related hardware exceptions can occur: major page faults, minor
page faults, and protection faults.
A major page fault occurs when a process attempts to access a virtual memory
location that is mapped by a segment but has no physical page of memory mapped
to it, and the page does not exist in physical memory. The page fault allows the
virtual memory system to hide the management of physical memory allocation from
the process. The virtual memory system traps accesses to memory that the process
believes is accessible and arranges either to have a new page created for that
address (in the case of the first access) or to have the page copied in from the
swap device. Once the memory system places a real page behind the memory
address, the process can continue normal execution. If a reference is made to a
memory address that is not mapped by any segment, a segmentation violation
signal (SIGSEGV) is sent to the process. The signal is sent as a result of a
hardware exception caught by the processor and translated to a signal by the
address space layer.
A minor page fault occurs when an attempt is made to access a virtual memory
location that resides within a segment and the page is in physical memory, but no
current MMU translation is established from the physical page to the address
space that caused the fault. For example, a process maps in the libc.so library
and makes a reference to a page within it. A page fault occurs, but the physical
page of memory is already present and the process simply needs to establish a
mapping to the existing physical page. Minor faults are also referred to as
attaches.
A page protection fault occurs when a program attempts to access a memory
address in a manner that violates the preconfigured access protection for a memory segment. Protection modes can enable any of read, write, or execute access. For
example, the text portion of a binary is mapped read-only, and if we attempt to
write to any memory address within that segment, we will cause a memory protection fault. The memory protection fault is also initiated by the hardware MMU as
a trap that is then handled by the segment page fault handling routine.
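
A protection fault can be provoked deliberately from user space. The sketch below does not write to program text; as a stand-in it maps an anonymous page, removes write permission with mprotect(), and then attempts a store. It assumes a POSIX system with mmap(), mprotect(), and sigaction(); the names on_fault() and write_to_readonly_page_faults() are hypothetical. SIGBUS is caught as well because some systems report the violation with that signal instead of SIGSEGV.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS in glibc headers */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* Signal handler: abandon the faulting store and resume at sigsetjmp(). */
static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);
}

/*
 * Map one anonymous page read/write, store to it, drop write permission,
 * and store again. Returns 1 if the second store raised a fault, 0 if it
 * did not, and -1 if the setup failed.
 */
static int write_to_readonly_page_faults(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t pagesize = (size_t)ps;

    void *mem = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    volatile char *page = mem;

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    volatile int faulted = -1;
    page[0] = 'a';                          /* allowed: page is writable */
    if (mprotect(mem, pagesize, PROT_READ) == 0) {
        if (sigsetjmp(fault_env, 1) == 0) {
            page[0] = 'b';                  /* violates the read-only mapping */
            faulted = 0;
        } else {
            faulted = 1;                    /* reached through the handler */
        }
    }

    /* Restore the previous handlers and release the page. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(mem, pagesize);
    return faulted;
}
```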

Figure 5.8 shows the relationship between a virtual address space, its segments,
and the hardware MMU.

[Figure: a page fault in the heap of a process address space (Stack, Libraries, HEAP, DATA, TEXT), with numbered callouts 1 through 6. Labeled components: Address Space (points to vnode segment driver), Vnode Segment Driver (segvn), seg_fault(), segvn_fault(), vop_getpage(), sun4u hat layer, sun4u sf-mmu, swapfs, Swap Space. Callout text: a byte is touched in the heap space, causing an MMU page fault (trap); the address space determines from the address of the fault which segment the fault occurred in and calls the segment driver; the segment driver fault handler is called to handle the fault by bringing the page in from swap; the page is copied from swap to memory.]
Figure 5.8 Virtual Address Space Page Fault Example
In the figure, we see what happens when a process accesses a memory location
within its heap space that does not have physical memory mapped to it. This has
most likely occurred because the page of physical memory has previously been stolen by the page scanner as a result of a memory shortage.
1. A reference is made to a memory address that does not map to a physical
page of memory. In this example, the page has been paged out and now
resides on the swap device.
2. When the process accesses the address with no physical memory behind it,
the MMU detects the invalid reference and causes a trap to occur on the processor executing the code of the running thread. The fault handler recognizes
this as a memory page fault and establishes which segment the fault occurred
in by comparing the address of the fault to the addresses mapped by each
segment.
3. The address space as_fault() routine compares the address of the fault
with the addresses mapped by each segment and then calls the page_fault
routine of the segment driver for this segment (in this case, the vnode
segment driver).
4. The segment driver allocates and maps a page of memory by calling into the
HAT layer and then copies the contents of the page from the swap device.
5. The segment driver then reads the page in from the backing store by calling
the getpage() function of the backing store’s vnode.
6. The backing store for this segment is the swap device, so the swap device
getpage() function is called to read in the page from the swap device.
Once this process is completed, the process can continue execution.

5.4 Memory Segments
Another example of the object-oriented approach to memory management is the
memory “segment” object. Memory segments manage the mapping of a linear
range of virtual memory into an address space. The mapping is between the
address space and some type of device. The objective of the memory segment is to
allow both memory and devices to be mapped into an address space. Traditionally,
this required hard-coding memory and device information into the address space
handlers for each device. The object architecture allows different behaviors for different segments.
For example, one segment might be a mapping of a file into an address space
(with mmap), and another segment might be the mapping of a hardware device into
the process’s address space (a graphics framebuffer). In this case, the segment
driver provides a similar view of linear address space, even though the file mapping operation with mmap uses pages of memory to cache the file data, whereas the
framebuffer device maps the hardware device into the address space.
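
The file-mapping case is easy to observe from user space. The sketch below, which assumes a POSIX system with mkstemp() and a writable /tmp directory, writes a short message to a scratch file and then reads it back through a mapping created with mmap(). The function name file_visible_through_mapping() and the scratch file name are hypothetical.

```c
#define _DEFAULT_SOURCE /* exposes mkstemp() in glibc headers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write msg to a temporary file, map the file with mmap(), and compare
 * the mapped bytes with msg. Returns 1 if they match, 0 if they do not,
 * and -1 on error.
 */
static int file_visible_through_mapping(const char *msg)
{
    char path[] = "/tmp/segdemoXXXXXX";     /* hypothetical scratch file */
    size_t len = strlen(msg);

    if (len == 0)
        return -1;                          /* a mapping needs a nonzero length */

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file persists until fd is closed */

    int result = -1;
    if (write(fd, msg, len) == (ssize_t)len) {
        /* The mapping gives a linear view of the file's contents. */
        char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            result = (memcmp(p, msg, len) == 0);
            munmap(p, len);
        }
    }
    close(fd);
    return result;
}
```

To the program, the mapped file is just a range of addresses it can read with ordinary loads; which driver sits behind that range is invisible at this level.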
The flexibility of the segment object allows us to use virtually any abstraction to
represent a linear address space that is visible to a process, regardless of the real
facilities behind the scenes.
struct seg {
    caddr_t s_base;             /* base virtual address */
    size_t s_size;              /* size in bytes */
    struct as *s_as;            /* containing address space */
    seg_next s_next;            /* next seg in this address space */
    struct seg *s_prev;         /* prev seg in this address space */
    struct seg_ops *s_ops;      /* ops vector: see below */
    void *s_data;               /* private data for instance */
};

Header File