Linux System Programming Study Notes (9): Memory Management

Date: 2022-08-11 00:22:52

1. The Process Address Space

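On Linux, a process does not address physical memory directly. Instead, the kernel gives each process its own virtual address space.
A page is the smallest unit of memory that the memory management unit (MMU) can manage.
The machine architecture determines the page size. 4 KB is the most common size (on both 32-bit x86 and x86-64); some 64-bit architectures use 8 KB or larger pages.
A page is either valid or invalid:
A valid page is associated with an actual page of data, either in RAM or on secondary storage such as a swap partition or a file on disk.
An invalid page is not associated with anything and represents an unused, unallocated piece of the address space. Accessing an invalid page results in a segmentation violation.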
 
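If a valid page is backed by data on secondary storage, the process cannot access it until that data has been brought into physical memory.
When a process accesses such a page, the MMU generates a page fault, and the kernel transparently pages the data in from secondary storage to physical memory.
There is considerably more virtual memory than physical memory, so the kernel may have to move data out of physical memory to make room.
Paging out is the process of moving data from physical memory to secondary storage.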
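Since the page size depends on the architecture, a program can ask for it at run time instead of hard-coding it. The sketch below uses sysconf(_SC_PAGESIZE); the helper names page_size_bytes() and round_up_to_pages() are made up for this illustration and are not from the notes.

```c
#include <unistd.h>

/* Returns the page size in bytes, or -1 if it cannot be determined. */
long page_size_bytes (void)
{
        return sysconf (_SC_PAGESIZE);
}

/* Rounds len up to a whole number of pages.
 * Assumes page_size is a power of two (hypothetical helper). */
unsigned long round_up_to_pages (unsigned long len, unsigned long page_size)
{
        return (len + page_size - 1) & ~(page_size - 1);
}
```

For example, with 4096-byte pages a 4097-byte request occupies two pages (8192 bytes).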
 

2. Dynamic Memory

/* obtaining dynamic memory */
#include <stdlib.h>
void * malloc (size_t size);
The contents of the memory are undefined.
Remember: always check the return value of every dynamic allocation. A return value of NULL means the allocation failed.
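One common way to make the NULL check hard to forget is a small wrapper that checks once, in one place. This is a minimal sketch; the name xmalloc() and the choice to terminate the program on failure are assumptions made for the example, and a nonzero size is assumed because malloc(0) may legitimately return NULL.

```c
#include <stdio.h>
#include <stdlib.h>

/* Like malloc(), but terminates the program if the allocation fails.
 * Hypothetical helper; assumes size > 0. */
void *xmalloc (size_t size)
{
        void *p = malloc (size);

        if (!p) {
                perror ("xmalloc");
                exit (EXIT_FAILURE);
        }
        return p;
}
```

Callers can then write `char *buf = xmalloc (32);` and use buf without a separate check. Programs that can recover from allocation failure should test the return value of malloc() directly instead.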
 
/* returns a pointer to a block of memory suitable for holding an array of nr elements, each of size bytes */
#include <stdlib.h>
void * calloc (size_t nr, size_t size);
Note: calloc() zeros all bytes in the returned chunk of memory.
Prefer calloc() whenever the allocated memory must start out zeroed.
calloc() can be faster than malloc() followed by memset(), because the kernel can hand out memory that is already zeroed.
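
As a small illustration of the zero-initialization guarantee, the sketch below allocates an array of ints with calloc(). The helper name alloc_zeroed_ints() is hypothetical.

```c
#include <stdlib.h>

/* Allocates nr ints, all initialized to zero; returns NULL on failure. */
int *alloc_zeroed_ints (size_t nr)
{
        /* Passing the count and element size separately lets calloc()
         * do the multiplication itself. */
        return calloc (nr, sizeof (int));
}
```

The caller still owns the memory and releases it with free().
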
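/* resizing (making larger or smaller) existing allocations */
#include <stdlib.h>
void * realloc (void *ptr, size_t size);
It returns a pointer to the newly sized memory, which may have moved. On failure it returns NULL and leaves the original block untouched.
If ptr is NULL, realloc() behaves like malloc().
With glibc, if size is 0 and ptr is not NULL, realloc() behaves like free(). This is not portable, so call free() directly instead.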
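Because realloc() returns NULL on failure without freeing the original block, assigning its result straight back to the only pointer you hold would leak that block. A minimal sketch of the usual temporary-pointer pattern follows; resize_ints() is a hypothetical helper, and it rejects a count of 0 to stay away from the non-portable size-0 case.

```c
#include <stdlib.h>

/*
 * Resizes *buf to hold new_count ints.
 * Returns 0 on success; on failure returns -1 and leaves *buf valid.
 */
int resize_ints (int **buf, size_t new_count)
{
        int *tmp;

        /* Reject size 0 and any count whose byte size would overflow. */
        if (new_count == 0 || new_count > (size_t) -1 / sizeof (int))
                return -1;

        tmp = realloc (*buf, new_count * sizeof (int));
        if (!tmp)
                return -1;      /* *buf is still owned by the caller */

        *buf = tmp;
        return 0;
}
```

Starting from `int *v = NULL;`, the first call behaves like malloc(), and later calls preserve the existing elements.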
 

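3. Data Alignment

Take care with data alignment requirements when writing portable code.
In most cases, the compiler and the C library handle alignment transparently. On Linux with glibc, memory returned by the malloc() family is aligned to 8 bytes on 32-bit systems and to 16 bytes on 64-bit systems.
posix_memalign() allocates memory with a larger alignment than that default.
Because of alignment requirements, the actual size of a dynamic allocation is always greater than or equal to the requested size.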
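
When a larger alignment is needed (for example, a cache line), posix_memalign() can be wrapped as below. This is a sketch; alloc_aligned() is a hypothetical helper name.

```c
#include <stdlib.h>

/*
 * Returns size bytes aligned to alignment, or NULL on failure.
 * alignment must be a power of two and a multiple of sizeof (void *).
 */
void *alloc_aligned (size_t alignment, size_t size)
{
        void *p = NULL;

        /* posix_memalign() reports errors through its return value,
         * not through errno. */
        if (posix_memalign (&p, alignment, size) != 0)
                return NULL;
        return p;
}
```

Memory obtained this way is released with free(), like any other allocation.
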
/* actual allocation size of the chunk of memory pointed to by ptr */
#include <malloc.h>
size_t malloc_usable_size (void *ptr);
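
To observe the "actual size is at least the requested size" rule, one can compare a request with what malloc_usable_size() reports. The helper usable_size_for() below is hypothetical, and the exact number returned depends on the allocator.

```c
#include <malloc.h>
#include <stdlib.h>

/* Returns how many bytes malloc() made usable for a request of
 * 'requested' bytes, or 0 if the allocation failed. */
size_t usable_size_for (size_t requested)
{
        void *p = malloc (requested);
        size_t usable;

        if (!p)
                return 0;
        usable = malloc_usable_size (p);
        free (p);
        return usable;
}
```

The result is useful for inspection and debugging; programs should not rely on the extra bytes.
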
Dereferencing a pointer that has been cast from one type of variable to a different type is usually a violation of the strict aliasing rule.
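A common way to inspect an object's bytes as another type without breaking strict aliasing is to copy them with memcpy(). The sketch below assumes a 32-bit float; float_bits() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of a float without violating strict aliasing.
 * Assumes sizeof (float) == sizeof (uint32_t). */
uint32_t float_bits (float f)
{
        uint32_t bits;

        /* The cast-and-dereference form, *(uint32_t *) &f, is the
         * violation; copying the bytes is well defined. */
        memcpy (&bits, &f, sizeof (bits));
        return bits;
}
```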
 

4. Managing the Data Segment

#include <unistd.h>
int brk (void *end);
void * sbrk (intptr_t increment);
brk() sets the break point (the end of the data segment) to the address specified by end.
sbrk() increments the end of the data segment by increment bytes, which may be a positive or negative delta.
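An increment of 0 makes sbrk() a query: it returns the current break point without moving it. A minimal sketch, with current_break() as a hypothetical helper name:

```c
#define _DEFAULT_SOURCE         /* exposes sbrk() under strict -std= modes */
#include <unistd.h>

/* Returns the current break point, or (void *) -1 on error. */
void *current_break (void)
{
        return sbrk (0);
}
```

Application code normally leaves the break point to malloc() and does not move it directly.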
 

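5. Anonymous Memory Mappings

glibc's memory allocator uses a combination of the data segment (brk/sbrk) and memory mappings (mmap).
Buddy memory allocation scheme:
The data segment is divided into a series of power-of-2 partitions, and the allocator returns the block that is the closest fit to the requested size.
When memory is freed, if the adjacent partition is also marked free, the two are coalesced into a single larger block.
Advantages: fast and simple.
Disadvantage: it causes a lot of internal fragmentation.
Internal fragmentation: the block used to satisfy a request is larger than the request, so the allocated memory is used inefficiently.
External fragmentation: enough free memory exists in total to satisfy a request, but it is split across blocks and no single block is large enough, so the free memory is used inefficiently.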
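The internal fragmentation of a power-of-2 scheme is easy to quantify. The sketch below only models the rounding step, not a full buddy allocator; both function names are hypothetical.

```c
#include <stddef.h>

/* Smallest power of two >= size.
 * Assumes 1 <= size <= SIZE_MAX / 2 + 1 so the shift cannot overflow. */
size_t buddy_block_size (size_t size)
{
        size_t block = 1;

        while (block < size)
                block <<= 1;
        return block;
}

/* Bytes wasted inside the chosen block: the internal fragmentation. */
size_t internal_fragmentation (size_t size)
{
        return buddy_block_size (size) - size;
}
```

A 129-byte request lands in a 256-byte block and wastes 127 bytes, which is close to the worst case of almost half the block.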
 
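The heap is not shrunk after each free. Instead, the malloc() implementation keeps freed memory around for a subsequent allocation.
For large allocations, however, holding on to freed memory for later reuse takes that memory away from the rest of the system.
 
For large allocations, glibc does not use the heap. Instead, glibc creates an anonymous memory mapping to satisfy the allocation request.
An ordinary memory mapping is backed by a file on disk; an anonymous memory mapping is not backed by any file and is simply a large, zero-filled block of memory.
Anonymous mappings live outside the heap, so they do not contribute to fragmentation of the data segment.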
 
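Advantages of anonymous memory mappings:
a. No fragmentation concerns: when the mapping is unmapped, the memory is immediately returned to the system.
b. They are resizable and have adjustable permissions.
c. Each allocation exists in a separate memory mapping, so there is no need to manage the global heap.
d. The memory is already zero-filled, because the kernel maps the application's anonymous pages to a zero-filled page via copy-on-write.
 
Disadvantages of anonymous memory mappings:
a. Each mapping is an integer multiple of the system page size, so small allocations waste space.
b. Creating a new mapping incurs more overhead than returning memory from the heap.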
 
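glibc's malloc() uses the data segment to satisfy small allocations and anonymous memory mappings to satisfy large allocations.
The default threshold is 128 KB (M_MMAP_THRESHOLD): requests below it are served from the heap, while requests at or above it that cannot be satisfied from the free list are served by an anonymous mapping.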
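The threshold can be tuned with glibc's mallopt(). The sketch below is glibc-specific; set_mmap_threshold() is a hypothetical wrapper, and setting the parameter explicitly also turns off glibc's dynamic adjustment of the threshold.

```c
#include <malloc.h>

/* Sets glibc's mmap threshold in bytes.
 * Returns 1 on success and 0 on error, as mallopt() does. */
int set_mmap_threshold (int bytes)
{
        return mallopt (M_MMAP_THRESHOLD, bytes);
}
```

For example, `set_mmap_threshold (256 * 1024)` asks glibc to keep serving requests below 256 KB from the heap.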
 
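To create an anonymous memory mapping, pass MAP_ANONYMOUS to mmap() and set the fd parameter to -1, because the mapping is not backed by a file.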
 
#include <stdio.h>
#include <sys/mman.h>

void *p;

p = mmap (NULL,                        /* do not care where */
          512 * 1024,                  /* 512 KB */
          PROT_READ | PROT_WRITE,      /* read/write */
          MAP_ANONYMOUS | MAP_PRIVATE, /* anonymous, private */
          -1,                          /* fd (ignored) */
          0);                          /* offset (ignored) */
if (p == MAP_FAILED)
        perror ("mmap");
else
        /* 'p' points at 512 KB of anonymous memory... */
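
The fragment above can be wrapped into a pair of helpers that also return the memory with munmap(). This is a sketch; map_anonymous() and unmap_anonymous() are hypothetical names.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Maps len bytes of zero-filled anonymous memory; returns NULL on failure. */
void *map_anonymous (size_t len)
{
        void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                        MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

        return p == MAP_FAILED ? NULL : p;
}

/* Unmaps a region obtained from map_anonymous(); returns 0 on success. */
int unmap_anonymous (void *p, size_t len)
{
        return munmap (p, len);
}
```

The caller must remember the length, because munmap() needs it to release the region.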

6. Stack-Based Allocations

/* make a dynamic memory allocation from the stack */
#include <alloca.h>
void * alloca (size_t size);
Usage is identical to malloc(), but you do not need to (indeed, must not) free the allocated memory.
alloca() allocates from the stack, so the memory is released automatically when the calling function returns.
This means you cannot use this memory once the function that calls alloca() returns. In exchange, there is no cleanup to perform, because free() is never called.
 
POSIX does not define alloca(), so it is not suitable for programs that must be portable.
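A typical use of alloca() is a short-lived scratch buffer whose size is only known at run time. The sketch below makes lowercase copies of two strings on the stack and compares them; both function names are hypothetical, and the inputs are assumed to be short enough not to overflow the stack.

```c
#include <alloca.h>
#include <ctype.h>
#include <string.h>

/* Lowercases src into dst, which must hold strlen (src) + 1 bytes. */
static void copy_lowercase (char *dst, const char *src)
{
        while ((*dst++ = (char) tolower ((unsigned char) *src++)) != '\0')
                ;
}

/* Returns 1 if a and b are equal ignoring case, 0 otherwise. */
int equal_ignoring_case (const char *a, const char *b)
{
        /* Both buffers live in this function's stack frame and vanish
         * when it returns, so there is nothing to free(). */
        char *la = alloca (strlen (a) + 1);
        char *lb = alloca (strlen (b) + 1);

        copy_lowercase (la, a);
        copy_lowercase (lb, b);
        return strcmp (la, lb) == 0;
}
```

Returning la or lb to the caller would be a bug, since the memory is gone once the function returns.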
 

7. Variable-Length Arrays

C99 introduced variable-length arrays (VLAs), whose length is determined at run time rather than at compile time.
for (i = 0; i < n; ++i) {
    char foo[i + 1];
    /* use 'foo'... */
}
On each iteration of the loop, foo is dynamically created and automatically cleaned up when it falls out of scope.
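The loop can be made into a complete function to show that each iteration really gets an array of a different size. In this sketch (total_vla_bytes() is a hypothetical name), sizeof is evaluated at run time because foo is a VLA.

```c
#include <stddef.h>

/* Sums the sizes of the VLAs created across n loop iterations. */
size_t total_vla_bytes (int n)
{
        size_t total = 0;
        int i;

        for (i = 0; i < n; ++i) {
                char foo[i + 1];        /* sized at run time */

                foo[i] = '\0';          /* touch the last element */
                total += sizeof (foo);  /* evaluated at run time for a VLA */
        }                               /* foo is released here */
        return total;
}
```

With n = 4 the arrays have sizes 1, 2, 3 and 4, for a total of 10 bytes.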
 

8. Choosing a Memory Allocation Mechanism

 

9. Manipulating Memory

/* memset() sets the n bytes starting at s to the byte c and returns s */
#include <string.h>
void * memset (void *s, int c, size_t n);
/* compares two chunks of memory for equivalence */
#include <string.h>
int memcmp (const void *s1, const void *s2, size_t n);

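Because structures usually contain padding for alignment, and the contents of the padding bytes are indeterminate, comparing two structures with memcmp() is unreliable.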
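
To make the padding concrete, the sketch below uses a hypothetical two-member structure. On most ABIs the compiler inserts padding after tag, so a bytewise comparison also compares bytes that no member owns.

```c
#include <string.h>

/* Hypothetical structure; most ABIs insert padding after 'tag'. */
struct sample {
        char tag;
        int  value;
};

/* Member-by-member comparison: ignores padding. Returns 1 if equal. */
int members_equal (const struct sample *a, const struct sample *b)
{
        return a->tag == b->tag && a->value == b->value;
}

/* Bytewise comparison: also compares whatever sits in the padding. */
int bytes_equal (const struct sample *a, const struct sample *b)
{
        return memcmp (a, b, sizeof (struct sample)) == 0;
}

/* Number of padding bytes in struct sample on this platform. */
size_t sample_padding (void)
{
        return sizeof (struct sample) - sizeof (char) - sizeof (int);
}
```

Two objects can satisfy members_equal() and still fail bytes_equal() when their padding bytes differ.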

/* are two dinghies identical? (BROKEN) */
int compare_dinghies (struct dinghy *a, struct dinghy *b)
{
    return memcmp (a, b, sizeof (struct dinghy));
}

The code above is unsafe. Compare the structure members one by one instead:

/* are two dinghies identical? */
int compare_dinghies (struct dinghy *a, struct dinghy *b)
{
    int ret;

    if (a->nr_oars < b->nr_oars)
        return -1;
    if (a->nr_oars > b->nr_oars)
        return 1;

    ret = strcmp (a->boat_name, b->boat_name);
    if (ret)
        return ret;

    /* and so on, for each member... */

    return 0;   /* every member compared equal */
}

/* copies the first n bytes of src to dst, returning dst */
#include <string.h>
void * memmove (void *dst, const void *src, size_t n);

memmove() safely handles overlapping memory regions (for example, when part of dst lies inside src).

#include <string.h>
void * memcpy (void *dst, const void *src, size_t n);

The behavior of memcpy() is undefined if dst and src overlap.
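
Shifting the contents of a buffer within itself is the classic overlapping case where memmove() is required. A minimal sketch, with insert_front() as a hypothetical helper:

```c
#include <string.h>

/*
 * Inserts c at the front of the len bytes in buf by shifting them right
 * by one position. buf must have room for len + 1 bytes.
 */
void insert_front (char *buf, size_t len, char c)
{
        /* Source [buf, buf + len) and destination [buf + 1, buf + len + 1)
         * overlap, so memcpy() would be undefined here. */
        memmove (buf + 1, buf, len);
        buf[0] = c;
}
```

Passing len as strlen() + 1 moves the terminating NUL along with the characters.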

/* scans the n bytes of memory pointed at by s for the character c */
#include <string.h>
void * memchr (const void *s, int c, size_t n);
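
Unlike strchr(), memchr() does not stop at a NUL byte, which makes it suitable for binary data. The sketch below converts its pointer result into an offset; find_byte() is a hypothetical helper.

```c
#include <string.h>

/* Returns the offset of the first byte equal to c in the n bytes at buf,
 * or -1 if there is none. */
long find_byte (const void *buf, int c, size_t n)
{
        const unsigned char *start = buf;
        const unsigned char *hit = memchr (buf, c, n);

        return hit ? (long) (hit - start) : -1;
}
```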

10. Locking Memory

Linux implements demand paging, which means that pages are paged in from disk as needed and paged out to disk when no longer needed.
/* "locking" one or more pages into physical memory, ensuring that they are never paged out to disk */
#include <sys/mman.h>
int mlock (const void *addr, size_t len);

mlock() locks the virtual memory starting at addr and extending for len bytes into physical memory
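
A typical use is keeping a buffer that holds sensitive data out of swap. The sketch below wraps mlock() and its counterpart munlock(); the wrapper names are hypothetical, and the call can fail (for example with ENOMEM or EPERM) when the process exceeds its RLIMIT_MEMLOCK limit, so the return value must be checked.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Locks the pages containing [addr, addr + len) into physical memory.
 * Returns 0 on success, -1 with errno set on failure. */
int lock_region (const void *addr, size_t len)
{
        return mlock (addr, len);
}

/* Unlocks the same pages so they may be paged out again.
 * Returns 0 on success, -1 with errno set on failure. */
int unlock_region (const void *addr, size_t len)
{
        return munlock (addr, len);
}
```

Locking works on whole pages, so every page that contains part of the range is locked.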

/*  mlockall() locks all of the pages in the current process's address space into physical memory. */
#include <sys/mman.h>
int mlockall (int flags);