Wednesday 18 July 2018

Huge Pages

If you’re a C++ programmer, you know that objects in memory have addresses (i.e. the value of a pointer).
However, these addresses do not necessarily represent physical addresses (i.e. addresses in RAM). They represent addresses in virtual memory. Your CPU has an MMU (memory management unit), a piece of hardware that assists the kernel in mapping virtual memory to physical locations.
This approach has numerous advantages, but mainly it is useful for
  • Performance (for various reasons)
  • Isolating programs, i.e. no program can just read the memory of another program.
The virtual memory space is divided into pages.
Each page maps to a block of physical storage.
Most pages you’re dealing with either map to RAM or are swapped out, i.e. stored on an HDD or an SSD.
On x86-64, normal pages are 4096 bytes long. Hugepages typically have a size of 2 megabytes.
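
To make the page idea a bit more concrete, here is a minimal sketch of how an address splits into a page number and an offset within that page. The helper names (page_number, page_offset, print_page_info) are made up for this example, and sysconf(_SC_PAGESIZE) is the POSIX way to ask for the normal page size, so this assumes a POSIX system.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Index of the page that contains addr (hypothetical helper). */
uintptr_t page_number(uintptr_t addr, uintptr_t page_size) {
    return addr / page_size;
}

/* Byte offset of addr inside its page (hypothetical helper). */
uintptr_t page_offset(uintptr_t addr, uintptr_t page_size) {
    return addr % page_size;
}

/* Print the normal page size and how the address of a local variable splits up. */
void print_page_info(void) {
    long size = sysconf(_SC_PAGESIZE); /* POSIX; usually 4096 on x86-64 */
    int local = 0;
    uintptr_t addr = (uintptr_t)&local;
    printf("page size: %ld bytes\n", size);
    printf("address %p is in page %llu at offset %llu\n",
           (void *)&local,
           (unsigned long long)page_number(addr, (uintptr_t)size),
           (unsigned long long)page_offset(addr, (uintptr_t)size));
}
```

With 4096-byte pages, address 8195 lies in page 2 at offset 3. With 2 MB hugepages, the same address lies in page 0 at offset 8195.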

When a program accesses some memory page, the CPU needs to know which physical page to read the data from (i.e. it needs a virtual-to-physical address map).
The kernel maintains a data structure (the page table) that contains information about all the pages in use. Using this data structure, a virtual address can be mapped to a physical address.
However, walking the page table on every memory access would be slow, so the CPU caches recently used translations in the TLB (translation lookaside buffer). This means that although the page table has to be walked the first time you access a page, all subsequent accesses to that page can be handled by the TLB, which is really fast!
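
As a rough illustration of such a lookup, here is a toy single-level page table. This is only a sketch: real page tables (e.g. on x86-64) are multi-level trees managed by the kernel and walked by the MMU, and the names pte_t, translate, PAGE_SIZE and NUM_PAGES are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 16u /* toy virtual address space of 16 pages */

/* One toy page table entry. */
typedef struct {
    bool present;   /* false: page is not mapped (or is swapped out) */
    uint32_t frame; /* physical frame number, valid only if present */
} pte_t;

/* Translate vaddr via the toy table.
   Returns true and writes the physical address to *paddr on success;
   returns false where a real system would raise a page fault. */
bool translate(const pte_t table[NUM_PAGES], uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr / PAGE_SIZE;    /* virtual page number */
    uint32_t offset = vaddr % PAGE_SIZE; /* offset is kept unchanged */
    if (vpn >= NUM_PAGES || !table[vpn].present) {
        return false;
    }
    *paddr = table[vpn].frame * PAGE_SIZE + offset;
    return true;
}
```

If virtual page 2 is mapped to physical frame 7, virtual address 2 * 4096 + 5 translates to physical address 7 * 4096 + 5. Only the page number changes, the offset within the page stays the same.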
If we assume the TLB has 512 entries, without hugepages the TLB can cover
4096\ \text{B} \cdot 512 = 2\ \text{MB}
but with hugepages it can cover
2\ \text{MB} \cdot 512 = 1\ \text{GB}
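
The same arithmetic as a tiny sketch, in case you want to plug in the numbers for your own CPU. The function name tlb_reach is made up, the 512 entries are just the assumption from above, and the units are binary (1 MB = 1024 * 1024 bytes).

```c
#include <stdint.h>

/* Amount of memory (in bytes) that a TLB with the given number of
   entries can cover when every entry maps one page of page_size bytes. */
uint64_t tlb_reach(uint64_t entries, uint64_t page_size) {
    return entries * page_size;
}
```

For example, tlb_reach(512, 4096) is 2097152 bytes (2 MB), while tlb_reach(512, 2 * 1024 * 1024) is 1073741824 bytes (1 GB).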
So hugepages are great – with far fewer TLB misses, they can lead to significantly increased performance almost without effort.
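
As for the "almost without effort" part: on Linux, one low-effort option is to allocate memory aligned to the hugepage size and hint the kernel with madvise(MADV_HUGEPAGE). The following is a minimal sketch under a few assumptions: a 2 MB hugepage size, a Linux kernel with transparent hugepages enabled, and helper names (alloc_huge_hint, is_huge_aligned, HUGE_PAGE_SIZE) that are made up for this example. The hint is only a request and the kernel is free to ignore it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#define HUGE_PAGE_SIZE (2u * 1024u * 1024u) /* assumed: 2 MB hugepages */

/* True if p starts on a hugepage boundary. */
int is_huge_aligned(const void *p) {
    return ((uintptr_t)p % HUGE_PAGE_SIZE) == 0;
}

/* Allocate at least size bytes, rounded up to whole hugepages and
   aligned to the hugepage size, then ask the kernel to back the range
   with hugepages. Returns NULL if the allocation fails.
   Release the memory with free(). */
void *alloc_huge_hint(size_t size) {
    size_t rounded = (size + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE;
    if (rounded == 0) {
        rounded = HUGE_PAGE_SIZE;
    }
    void *p = aligned_alloc(HUGE_PAGE_SIZE, rounded); /* C11 */
    if (p == NULL) {
        return NULL;
    }
#ifdef MADV_HUGEPAGE
    /* Linux-specific hint; failure is not fatal, we just keep normal pages. */
    (void)madvise(p, rounded, MADV_HUGEPAGE);
#endif
    return p;
}
```

Whether the memory really ends up on hugepages depends on the system configuration, so treat this as a starting point for experiments rather than a guarantee.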

Swapping hugepages

If there is an insufficient amount of physical memory (i.e. RAM) available, your kernel will move less important (i.e. less often used) pages to your hard drive to free up some RAM for more important pages.
In principle, the same goes for hugepages. But the kernel can only swap entire pages – not individual bytes.
Let’s assume we have a program like this:
    char* mymemory = malloc(2*1024*1024); // We'll assume this is one hugepage!
    // Fill mymemory with some data
    // Do lots of other things,
    // causing the mymemory page to be swapped out
    // ...
    // Access only the first byte
    putchar(mymemory[0]);
In that case, the kernel will need to swap in (i.e. read) the entire 2 Megabytes from the HDD/SSD just for you to read a single byte. With normal pages, only 4096 bytes need to be read from the HDD/SSD.
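
To put numbers on this, here is a small sketch of the simplified model used above, where a swapped-out region is always read back in whole pages. The function name swap_in_bytes is made up for this example, and the model ignores things like readahead.

```c
#include <stddef.h>

/* Bytes that have to be read back from swap to access len bytes at
   offset inside a swapped-out region, if only whole pages of
   page_size bytes can be swapped in. */
size_t swap_in_bytes(size_t offset, size_t len, size_t page_size) {
    if (len == 0) {
        return 0;
    }
    size_t first = offset / page_size;            /* first page touched */
    size_t last = (offset + len - 1) / page_size; /* last page touched */
    return (last - first + 1) * page_size;
}
```

Reading the first byte costs swap_in_bytes(0, 1, 4096) = 4096 bytes with normal pages, but swap_in_bytes(0, 1, 2 * 1024 * 1024) = 2097152 bytes with a hugepage, i.e. 512 times as much I/O for the same single byte.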
