Saturday, May 2, 2015

Process Address Space (Linux)

Process Address Space

  • Linux (and all modern OSs) virtualizes physical memory.
  • i.e. processes do not directly address physical memory.
  • Each process gets a unique virtual address space.
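  • This address space is linear: addresses start at zero and increase contiguously up to a maximum set by the architecture (the largest 32-bit value on 32-bit systems; 64-bit systems typically implement fewer than 64 address bits).
  • A part of this address space is reserved for the kernel (the top 1 GB by default on 32-bit x86 Linux).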
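The point that each process has its own address space can be illustrated with a small sketch. It assumes a Linux system with Python 3; the helper name `demo_private_address_space` is made up for this example. After `fork()`, parent and child see the same virtual address for a private page, yet a write in the child does not change what the parent reads there.

```python
import ctypes
import mmap
import os

def demo_private_address_space():
    """Hypothetical helper: compare one virtual address in parent and child."""
    # One private, anonymous page. After fork() both processes have it
    # mapped at the same virtual address.
    buf = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    buf[:6] = b"parent"
    parent_addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: overwrite the page and report the address it sees.
        try:
            os.close(read_fd)
            buf[:6] = b"child!"
            child_addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
            os.write(write_fd, str(child_addr).encode())
        finally:
            os._exit(0)

    # Parent: read the child's address, then check its own copy of the page.
    os.close(write_fd)
    child_addr = int(os.read(read_fd, 64))
    os.close(read_fd)
    os.waitpid(pid, 0)
    return parent_addr, child_addr, bytes(buf[:6])

if __name__ == "__main__":
    parent_addr, child_addr, data = demo_private_address_space()
    print(hex(parent_addr), hex(child_addr), data)
```

The two addresses should match while the parent's data is unchanged, because the same virtual address refers to different physical memory in each process once the child has written to it.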

Memory Pages

  • The virtual address space is broken into memory pages.
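  • The page size depends on the architecture: it is 4 KB on x86 and x86-64, while some other architectures use 8 KB or larger pages. It can be queried at runtime with sysconf(_SC_PAGESIZE).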
  • A page is the smallest unit that the MMU (Memory Management Unit) can manage.
  • A page can be either valid or invalid.
  • A valid page is memory that has been allocated for the process; this page could be in physical memory or on disk (swap partition).
  • An invalid page is memory that has not been allocated (or has been freed).
  • Accessing an invalid page causes a segmentation fault: the kernel sends the process a SIGSEGV signal.
  • Virtual memory pages of different processes can map to the same physical page; e.g. the standard C library has only one copy in physical memory, which is mapped into the virtual address spaces of many processes.
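As a small sketch of how pages divide the address space, the snippet below queries the page size and splits an address into a page number and an offset within that page. The helper name `split_address` is made up for this example, and the sample address is arbitrary.

```python
import os

# Page size reported by the running system, in bytes.
PAGE_SIZE = os.sysconf("SC_PAGESIZE")

def split_address(addr, page_size=PAGE_SIZE):
    """Hypothetical helper: return (page number, offset within the page)."""
    return addr // page_size, addr % page_size

if __name__ == "__main__":
    print("page size:", PAGE_SIZE)
    # With 4 KB pages, address 0x12345 falls in page 0x12 at offset 0x345.
    print(split_address(0x12345, 4096))
```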

Paging

  • A valid page that has been swapped to disk cannot be accessed directly; it must first be brought back into physical memory.
  • This task is handled automatically by the OS, and is not visible to the process. 
  • When a process tries to access a valid page that is on disk, the MMU generates a page fault, which is handled by the OS kernel; the kernel then pages in that page from disk to RAM, transparently.
  • When a page-in is required, some existing pages (typically the least recently used) might be paged out to make room for the new pages.
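Page faults can be observed from user space through the counters returned by `getrusage`. The sketch below, assuming a Linux system, touches freshly mapped pages and reports how the counters changed; the helper name `count_faults_while_touching` is made up for this example. Note that touching new anonymous memory normally produces minor faults (no disk I/O), whereas the swap-in case described above would show up as major faults, which this sketch does not force.

```python
import mmap
import resource

def count_faults_while_touching(num_pages=256):
    """Hypothetical helper: return (minor, major) page faults caused by
    writing one byte to each page of a new anonymous mapping."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    buf = mmap.mmap(-1, num_pages * mmap.PAGESIZE)
    for i in range(num_pages):
        # The first write to a page makes the kernel back it with physical memory.
        buf[i * mmap.PAGESIZE] = 1
    after = resource.getrusage(resource.RUSAGE_SELF)
    buf.close()
    return (after.ru_minflt - before.ru_minflt,
            after.ru_majflt - before.ru_majflt)

if __name__ == "__main__":
    minor, major = count_faults_while_touching()
    print("minor faults:", minor, "major faults:", major)
```

The exact counts vary with the kernel configuration (for example, huge pages can reduce the number of faults), so the numbers are best read as an indication rather than a precise measurement.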

Memory Regions

  • Virtual memory pages of a process are grouped into several blocks called memory regions (or memory areas).
  • Text segment: contains the process's read-only data -- program code, string literals, constant variables etc.
  • Stack: contains the process's execution stack -- local vars etc.
    • The stack dynamically grows/shrinks as the stack depth increases/decreases.
    • Multi-threaded processes contain one stack per thread.
  • Heap (data segment): contains the process's dynamically allocated memory.
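On Linux, the memory regions of a process can be inspected through /proc/<pid>/maps. The sketch below parses that file for the current process; the helper names `parse_maps_line` and `list_regions` are made up for this example, and only a few of the fields on each line are extracted.

```python
def parse_maps_line(line):
    """Hypothetical helper: parse one line of /proc/<pid>/maps."""
    # Fields: address range, permissions, offset, device, inode, optional path.
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": fields[1],
        "name": fields[5].strip() if len(fields) > 5 else "",
    }

def list_regions(path="/proc/self/maps"):
    """Hypothetical helper: return all memory regions of the current process."""
    with open(path) as f:
        return [parse_maps_line(line) for line in f if line.strip()]

if __name__ == "__main__":
    for region in list_regions():
        print(f"{region['start']:016x} {region['size']:>12} "
              f"{region['perms']} {region['name']}")
```

In the output, regions named [heap] and [stack] correspond to the heap and the main thread's stack, and executable mappings of the program file (permissions containing x) correspond to the text segment.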
