GLibcHeap

二進制安全pwn基礎 /

本文將對 Glibc 堆上的內存管理作簡要介紹，部分內容翻譯自參考資料中的文章。略過了許多細節，主要是為了對新手友好。

默認讀者熟悉操作系統、C 語言及其運行機制，并且對于 C 中的函數調用棧有所了解。

什么是堆？

堆是每個程序被分配到的一塊內存區域，和棧的區別主要在于堆內存是動態分配的。也就是說，程序可以從heap段請求一塊內存，或者釋放一塊內存。

另外，堆內存是全局的，即在程序的任意位置都可以訪問到堆，并不一定要在調用malloc的那個函數里訪問。這是因為 C 語言使用指針指向動態分配的內存。但相比訪問棧上的靜態局部變量，使用指針也帶來了一定的開銷。

使用動態分配的內存

GLibc 采用 ptmalloc2 內存分配器管理堆內存，相比前身 dlmalloc，它增加了對多線程的支持。多線程的好處就不多贅述了。

借助stdlib.h我們可以使用malloc和free函數來操作堆內存：

char *buffer = (char *)malloc(10);

strcpy(buffer, "hello");
printf("%s\n", buffer);

free(buffer);

第一行分配了 10 字節給buffer，注意這里的強制類型轉換是必須的；第 2-3 行使用了buffer這塊內存，并在最后一行釋放。

下面是malloc和free函數的注釋：

/*
  malloc(size_t n)
  Returns a pointer to a newly allocated chunk of at least n bytes, or null
  if no space is available. Additionally, on failure, errno is
  set to ENOMEM on ANSI C systems.

  If n is zero, malloc returns a minumum-sized chunk. (The minimum
  size is 16 bytes on most 32bit systems, and 24 or 32 bytes on 64bit
  systems.)  On most systems, size_t is an unsigned type, so calls
  with negative arguments are interpreted as requests for huge amounts
  of space, which will often fail. The maximum supported value of n
  differs across systems, but is in all cases less than the maximum
  representable value of a size_t.
*/

/*
  free(void* p)
  Releases the chunk of memory pointed to by p, that had been previously
  allocated using malloc or a related routine such as realloc.
  It has no effect if p is null. It can have arbitrary (i.e., bad!)
  effects if p has already been freed.

  Unless disabled (using mallopt), freeing very large spaces will
  when possible, automatically trigger operations that give
  back unused memory to the system, thus reducing program footprint.
*/

注意，即使申請 0 字節內存，malloc依然會分配一個最小的 chunk；如果傳給free的參數是空指針，free不會做任何事，而如果傳入的是一個已經free過的指針，那么后果是不可預期的。這里尤其需要注意的是，與Java等語言不同，C 語言中釋放掉分配到的內存的責任在于程序員，并且分配到的內存只應使用一次。

這兩個函數在更底層上是使用brk()和mmap()這兩個系統調用來管理內存的。

兩個系統調用

注意申請內存時，Linux 內核只會先分配一段虛擬內存，真正使用時才會映射到物理內存上去。

brk()

brk()通過增加break location來獲取內存，一開始heap段的起點start_brk和heap段的終點brk指向同一個位置。

ASLR 關閉時，兩者指向 data/bss 段的末尾，也就是end_data
ASLR 開啟時，兩者指向 data/bss 段的末尾加上一段隨機 brk 偏移

GLibcHeap

注：注意與sbrk()的區別，后者是 C 語言庫函數，malloc源碼中的MORECORE就是調用的sbrk()。

mmap()

用于創建私有的匿名映射段，主要是為了分配一塊新的內存，且這塊內存只有調用mmap()的進程可以使用，所以稱之為私有的。與之進行相反操作的是munmap()，刪除一塊內存區域上的映射。

多線程與 Arena

前面提到，ptmalloc2 的一大改進就在于多線程，那么他是如何做到的呢？不難猜到，每個線程必定要維護一些獨立的數據結構，并且對這些數據結構的訪問是需要加鎖的。的確，在 ptmalloc2 中，每個線程擁有自己的freelist，也就是維護空閑內存的一個鏈表；以及自己的arena，一段連續的堆內存區域。特別地，主線程的arena叫做main_arena。注意只有main_arena可以訪問heap段和mmap映射區域，non_main_arena只能訪問mmap映射區域。

注：線程較多時，互斥鎖機制會導致性能下降。

當我們在程序中第一次申請內存時還沒有heap段，因此 132KB 的heap段，也就是我們的main_arena，會被創建（通過brk()），無論我們申請的內存是多大。對于接下來的內存申請，malloc都會從main_arena中嘗試取出一塊內存進行分配。如果空間不夠，main_arena可以通過brk()擴張；如果空閑空間太多，也可以縮小。

那么對于non_main_arena呢？前面提到它只能訪問mmap映射區域，因為在創建時它就是由mmap()創建的——1MB 的內存空間會被映射到進程地址空間，不過實際上只有 132KB 是可讀寫的，這 132KB 就是該線程的heap結構，或者叫non_main_arena。

注：當然了，當申請的空間大于 128KB 且arena中沒有足夠空間時，無論在哪個arena里都只能通過mmap()分配內存。

arena也不是和線程一對一的，實際上有數量限制：

For 32 bit systems:
     Number of arena = 2 * number of cores.
For 64 bit systems:
     Number of arena = 8 * number of cores.

而當我們free一小塊內存時，內存也不會直接歸還給內核，而是給 ptmalloc2 讓他去維護，后者會將空閑內存丟入 bin 中，或者說freelist中也可以。如果過了一會我們的程序又要申請內存，那么 ptmalloc2 就會從 bin 中找一塊空閑的內存進行分配，找不到的話才會去問內核要內存。

維護多個堆

前面提到，main_arena只有一個堆，并且可以靈活地放縮；non_main_arena則只能通過mmap()獲得一個堆。那么如果non_main_arena里分配的堆內存不夠了怎么辦？很簡單，再mmap()一次，創建一個新的堆。

所以，在non_main_arena里，我們必須考慮如何維護多個堆的問題。這里我們會涉及三個頭部：

heap_info：每個堆的頭部，main_arena是沒有的
malloc_state：arena的頭部，main_arena的這個部分是全局變量而不屬于堆段
malloc_chunk：每個 chunk 的頭部

具體一點，heap_info完整定義如下：

typedef struct _heap_info
{
  mstate ar_ptr; /* Arena for this heap. */
  struct _heap_info *prev; /* Previous heap. */
  size_t size;   /* Current size in bytes. */
  size_t mprotect_size; /* Size in bytes that has been mprotected
                           PROT_READ|PROT_WRITE.  */
  /* Make sure the following data is properly aligned, particularly
     that sizeof (heap_info) + 2 * SIZE_SZ is a multiple of
     MALLOC_ALIGNMENT. */
  char pad[-6 * SIZE_SZ & MALLOC_ALIGN_MASK];
} heap_info;

而malloc_state的完整定義如下：

struct malloc_state
{
  /* Serialize access.  */
  mutex_t mutex;

  /* Flags (formerly in max_fast).  */
  int flags;

  /* Fastbins */
  mfastbinptr fastbinsY[NFASTBINS];

  /* Base of the topmost chunk -- not otherwise kept in a bin */
  mchunkptr top;

  /* The remainder from the most recent split of a small request */
  mchunkptr last_remainder;

  /* Normal bins packed as described above */
  mchunkptr bins[NBINS * 2 - 2];

  /* Bitmap of bins */
  unsigned int binmap[BINMAPSIZE];

  /* Linked list */
  struct malloc_state *next;

  /* Linked list for free arenas.  Access to this field is serialized
     by free_list_lock in arena.c.  */
  struct malloc_state *next_free;

  /* Number of threads attached to this arena.  0 if the arena is on
     the free list.  Access to this field is serialized by
     free_list_lock in arena.c.  */
  INTERNAL_SIZE_T attached_threads;

  /* Memory allocated from the system in this arena.  */
  INTERNAL_SIZE_T system_mem;
  INTERNAL_SIZE_T max_system_mem;
};

其中INTERNAL_SIZE_T默認和size_t相同：

#ifndef INTERNAL_SIZE_T
#define INTERNAL_SIZE_T size_t
#endif

在后面介紹 chunk 和 bin 的時候，我們會發現其中幾個字段的作用，malloc_chunk我們也會在后面看到。

對于arena中只有單個堆的情況：

GLibcHeap

對于non_main_arena中有多個堆的情況：

GLibcHeap

注意到有多個堆的情況下，舊的堆的 Top chunk 會被認為是普通的空閑塊。

Chunk 的結構

通俗地說，一塊由分配器分配的內存塊叫做一個 chunk，包含了元數據和用戶數據。具體一點，chunk 完整定義如下：

struct malloc_chunk {
  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */
  struct malloc_chunk* fd;                /* double links -- used only if free. */
  struct malloc_chunk* bk;
  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};

typedef struct malloc_chunk* mchunkptr;

這里出現的6個字段均為元數據。

一個 chunk 可以是以下幾種類型之一：

已分配的（Allocated chunk）
空閑的（Free chunk）
Top chunk
Last Remainder chunk

我們一個一個來看。

Allocated chunk

GLibcHeap

第一個部分（32 位上 4B，64 位上 8B）叫做prev_size，只有在前一個 chunk 空閑時才表示前一個塊的大小，否則這里就是無效的，可以被前一個塊征用（存儲用戶數據）。

這里的前一個chunk，指內存中相鄰的前一個，而不是freelist鏈表中的前一個。PREV_INUSE代表的“前一個chunk”同理。

第二個部分的高位存儲當前 chunk 的大小，低 3 位分別表示：

P: PREV_INUSE 之前的 chunk 已經被分配則為 1
M: IS_MMAPED 當前 chunk 是mmap()得到的則為 1
N: NON_MAIN_ARENA 當前 chunk 在non_main_arena里則為 1

對應源碼如下：

/* size field is or'ed with PREV_INUSE when previous adjacent chunk in use */
#define PREV_INUSE 0x1

/* extract inuse bit of previous chunk */
#define prev_inuse(p)       ((p)->size & PREV_INUSE)


/* size field is or'ed with IS_MMAPPED if the chunk was obtained with mmap() */
#define IS_MMAPPED 0x2

/* check for mmap()'ed chunk */
#define chunk_is_mmapped(p) ((p)->size & IS_MMAPPED)

/* size field is or'ed with NON_MAIN_ARENA if the chunk was obtained
   from a non-main arena.  This is only set immediately before handing
   the chunk to the user, if necessary.  */
#define NON_MAIN_ARENA 0x4

/* check for chunk from non-main arena */
#define chunk_non_main_arena(p) ((p)->size & NON_MAIN_ARENA)

你可能會有幾個困惑：

fd、bk、fd_nextsize、bk_nextsize這幾個字段去哪里了？
對于已分配的 chunk 來說它們沒用，所以也被征用了，用來存儲用戶數據。
為什么第二個部分的低 3 位就這么被吞了而不會影響size？
這是因為malloc會將用戶申請的內存大小轉化為實際分配的內存，以此來滿足（至少）8字節對齊的要求，同時留出額外空間存放 chunk 頭部。由于（至少）8字節對齊了，低3位自然就沒用了。在獲取真正的size時，會忽略低3位：

/*
   Bits to mask off when extracting size

   Note: IS_MMAPPED is intentionally not masked off from size field in
   macros for which mmapped chunks should never be seen. This should
   cause helpful core dumps to occur if it is tried by accident by
   people extending or adapting this malloc.
 */
#define SIZE_BITS (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

/* Get size, ignoring use bits */
#define chunksize(p)         ((p)->size & ~(SIZE_BITS))

malloc是如何將申請的大小轉化為實際分配的大小的呢？
核心在于request2size宏：

/* pad request bytes into a usable size -- internal version */

#define request2size(req)                                         \
  (((req) + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE)  ?             \
   MINSIZE :                                                      \
   ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)

其中用到的其它宏定義：

#  define MALLOC_ALIGNMENT       (2 *SIZE_SZ)

/* The corresponding bit mask value */
#define MALLOC_ALIGN_MASK      (MALLOC_ALIGNMENT - 1)

/* The smallest possible chunk */
#define MIN_CHUNK_SIZE        (offsetof(struct malloc_chunk, fd_nextsize))

/* The smallest size we can malloc is an aligned minimal chunk */
#define MINSIZE  \
  (unsigned long)(((MIN_CHUNK_SIZE+MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK))

這里還有一個mem指針，是做什么用的？
這是調用malloc時返回給用戶的指針。實際上，真正的chunk 是從chunk指針開始的。

/* The corresponding word size */
#define SIZE_SZ                (sizeof(INTERNAL_SIZE_T))

/* conversion from malloc headers to user pointers, and back */

#define chunk2mem(p)   ((void*)((char*)(p) + 2*SIZE_SZ))
#define mem2chunk(mem) ((mchunkptr)((char*)(mem) - 2*SIZE_SZ))

用戶申請的內存大小就是用戶數據可用的內存大小嗎？
不一定，原因還是字節對齊問題。要獲得可用內存大小，可以用malloc_usable_size()獲得，其核心函數是：

static size_t
musable (void *mem)
{
  mchunkptr p;
  if (mem != 0)
    {
      p = mem2chunk (mem);

      if (__builtin_expect (using_malloc_checking == 1, 0))
        return malloc_check_get_size (p);

      if (chunk_is_mmapped (p))
        return chunksize (p) - 2 * SIZE_SZ;
      else if (inuse (p))
        return chunksize (p) - SIZE_SZ;
    }
  return 0;
}

Free chunk

GLibcHeap

首先，prev_size必定存儲上一個塊的用戶數據，因為 Free chunk 的上一個塊必定是 Allocated chunk，否則會發生合并。

接著，多出來的fd指向同一個 bin 中的前一個 Free chunk，bk指向同一個 bin 中的后一個 Free chunk。

這里提到了 bin，我們將在后面介紹。

此外，對于 large bins 中的 Free chunk，fd_nextsize與bk_nextsize會生效，分別指向 large bins 中前一個（更大的）和后一個（更小的）空閑塊。

Top chunk

一個arena頂部的 chunk 叫做 Top chunk，它不屬于任何 bin。當所有 bin 中都沒有空閑的可用 chunk 時，我們切割 Top chunk 來滿足用戶的內存申請。假設 Top chunk 當前大小為 N 字節，用戶申請了 K 字節的內存，那么 Top chunk 將被切割為：

一個 K 字節的 chunk，分配給用戶
一個 N-K 字節的 chunk，稱為 Last Remainder chunk

后者成為新的 Top chunk。如果連 Top chunk 都不夠用了，那么：

在main_arena中，用brk()擴張 Top chunk
在non_main_arena中，用mmap()分配新的堆

注：Top chunk 的 PREV_INUSE 位總是 1

Last Remainder chunk

當需要分配一個比較小的 K 字節的 chunk 但是 small bins 中找不到滿足要求的，且 Last Remainder chunk 的大小 N 能滿足要求，那么 Last Remainder chunk 將被切割為：

一個 K 字節的 chunk，分配給用戶
一個 N-K 字節的 chunk，成為新的 Last Remainder chunk

它的存在使得連續的小空間內存申請，分配到的內存都是相鄰的，從而達到了更好的局部性。

Bin 的結構

bin 是實現了空閑鏈表的數據結構，用來存儲空閑 chunk，可分為：

10 個 fast bins，存儲在fastbinsY中
1 個 unsorted bin，存儲在bin[1]
62 個 small bins，存儲在bin[2]至bin[63]
63 個 large bins，存儲在bin[64]至bin[126]

還是一個一個來看。

fast bins

非常像高速緩存 cache，主要用于提高小內存分配效率。相鄰空閑 chunk 不會被合并，這會導致外部碎片增多但是free效率提升。注意 fast bins 是 10 個 LIFO 的單鏈表。最后三個鏈表保留未使用。

chunk大小（含chunk頭部）：0x10-0x40（64位0x20-0x80）B，相鄰bin存放的大小相差0x8（0x10）B。

GLibcHeap

注：加入 fast bins 的 chunk，它的IN_USE位（準確地說，是下一個 chunk 的PREV_INUSE位）依然是 1。這就是為什么相鄰的“空閑”chunk 不會被合并，因為它們根本不會被認為是空閑的。

關于fastbin最大大小，參見宏DEFAULT_MXFAST：

#ifndef DEFAULT_MXFAST
#define DEFAULT_MXFAST     (64 * SIZE_SZ / 4)
#endif

在初始化時，這個值會被賦值給全局變量global_max_fast。

申請fast chunk時遵循first fit原則。釋放一個fast chunk時，首先檢查它的大小以及對應fastbin此時的第一個chunk old的大小是否合法，隨后它會被插入到對應fastbin的鏈表頭，此時其fd指向old。

unsorted bin

非常像緩沖區 buffer，大小超過 fast bins 閾值的 chunk 被釋放時會加入到這里，這使得 ptmalloc2 可以復用最近釋放的 chunk，從而提升效率。

unsorted bin 是一個雙向循環鏈表，chunk 大小：大于global_max_fast。

GLibcHeap

當程序申請大于global_max_fast內存時，分配器遍歷unsorted bin，每次取最后的一個unsorted chunk。

如果unsorted chunk滿足以下四個條件，它就會被切割為一塊滿足申請大小的chunk和另一塊剩下的chunk，前者返回給程序，后者重新回到unsorted bin。
- 申請大小屬于small bin范圍
- unosrted bin中只有該chunk
- 這個chunk同樣也是last remainder chunk
- 切割之后的大小依然可以作為一個chunk
否則，從unsorted bin中刪除unsorted chunk。
- 若unsorted chunk恰好和申請大小相同，則直接返回這個chunk
- 若unsorted chunk屬于small bin范圍，插入到相應small bin
- 若unsorted chunk屬于large bin范圍，則跳轉到3。
此時unsorted chunk屬于large bin范圍。
- 若對應large bin為空，直接插入unsorted chunk，其fd_nextsize與bk_nextsize指向自身。
- 否則，跳轉到4。
到這一步，我們需按大小降序插入對應large bin。
- 若對應large bin最后一個chunk大于unsorted chunk，則插入到最后
- 否則，從對應large bin第一個chunk開始，沿fd_nextsize（即變小）方向遍歷，直到找到一個chunk fwd，其大小小于等于unsorted chunk的大小
  - 若fwd大小等于unsorted chunk大小，則插入到fwd后面
  - 否則，插入到fwd前面

直到找到滿足要求的unsorted chunk，或無法找到，去top chunk切割為止。

small bins

小于 0x200（0x400）B 的 chunk 叫做 small chunk，而 small bins 可以存放的就是這些 small chunks。chunk 大小同樣是從 16B 開始每次+8B。

small bins 是 62 個雙向循環鏈表，并且是 FIFO 的，這點和 fast bins 相反。同樣相反的是相鄰的空閑 chunk 會被合并。

chunk大小：0x10-0x1f0B（0x20-0x3f0），相鄰bin存放的大小相差0x8（0x10）B。

釋放非fast chunk時，按以下步驟執行：

若前一個相鄰chunk空閑，則合并，觸發對前一個相鄰chunk的unlink操作
若下一個相鄰chunk是top chunk，則合并并結束；否則繼續執行3
若下一個相鄰chunk空閑，則合并，觸發對下一個相鄰chunk的unlink操作；否則，設置下一個相鄰chunk的PREV_INUSE為0
將現在的chunk插入unsorted bin。
若size超過了FASTBIN_CONSOLIDATION_THRESHOLD，則盡可能地合并fastbin中的chunk，放入unsorted bin。若top chunk大小超過了mp_.trim_threshold，則歸還部分內存給OS。

#ifndef DEFAULT_TRIM_THRESHOLD
#define DEFAULT_TRIM_THRESHOLD (128 * 1024)
#endif

#define FASTBIN_CONSOLIDATION_THRESHOLD  (65536UL)

large bins

大于等于 0x200（0x400）B 的 chunk 叫做 large chunk，而 large bins 可以存放的就是這些 large chunks。

large bins 是 63 個雙向循環鏈表，插入和刪除可以發生在任意位置，相鄰空閑 chunk 也會被合并。chunk 大小就比較復雜了：

前 32 個 bins：從 0x200B 開始每次+0x40B
接下來的 16 個 bins：每次+0x200B
接下來的 8 個 bins：每次+0x1000B
接下來的 4 個 bins：每次+0x8000B
接下來的 2 個 bins：每次+0x40000B
最后的 1 個 bin：只有一個 chunk，大小和 large bins 剩余的大小相同

注意同一個 bin 中的 chunks 不是相同大小的，按大小降序排列。這和上面的幾種 bins 都不一樣。而在取出chunk時，也遵循best fit原則，取出滿足大小的最小chunk。

內存分配流程

我覺得這類復雜的流程比較需要靠流程圖來理解，因此我畫了一下：

GLibcHeap

相關宏：

#define NBINS             128
#define NSMALLBINS         64
#define SMALLBIN_WIDTH    MALLOC_ALIGNMENT
#define SMALLBIN_CORRECTION (MALLOC_ALIGNMENT > 2 * SIZE_SZ)
#define MIN_LARGE_SIZE    ((NSMALLBINS - SMALLBIN_CORRECTION) * SMALLBIN_WIDTH)

#define in_smallbin_range(sz)  \
  ((unsigned long) (sz) < (unsigned long) MIN_LARGE_SIZE)

#ifndef DEFAULT_MMAP_THRESHOLD_MIN
#define DEFAULT_MMAP_THRESHOLD_MIN (128 * 1024)
#endif

#ifndef DEFAULT_MMAP_THRESHOLD
#define DEFAULT_MMAP_THRESHOLD DEFAULT_MMAP_THRESHOLD_MIN
#endif

內存釋放流程

GLibcHeap

文章引用自https://github.com/SignorMercurio/Heap-Tutorials

參考資料

Heap Exploitation
Understanding glibc malloc
Syscalls used by malloc
glibc 內存管理 ptmalloc 源代碼分析
Painless intro to the Linux userland heap

本文章首發在網安wangan.com 網站上。