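Benefits of enabling HugePages
- Lower page-table management and lookup overhead and a higher TLB hit rate, which improves Oracle's memory access performance.
- No swapping: standard HugePages cannot be swapped out by the operating system.
- In a stress test of the rural credit cooperative's unified payment system, throughput was 580 TPS without HugePages and 760 TPS with HugePages enabled, an improvement of 31%.
Worked example: how HugePages reduce page-table overhead
The default memory page size on Linux is 4 KB: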
# getconf -a | grep -i pagesize
PAGESIZE 4096
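A page-table entry maps a virtual memory address to a physical memory address and records the state of the page. This example assumes about 4 bytes (4 B) per entry; on x86-64 an entry is actually 8 bytes, which doubles the absolute figures below but leaves the 4 KB to 2 MB ratio unchanged.
For 1 GB of memory:
With 4 KB pages, the number of pages is 1*1024*1024 (KB) / 4 (KB) = 262144
Memory consumed by all page-table entries: 4 B * 262144 / 1024 / 1024 = 1 MB
With 2 MB pages, the number of pages is 1*1024 (MB) / 2 (MB) = 512
Memory consumed by all page-table entries: 4 B * 512 / 1024 / 1024 = 0.001953125 MB
For 32 GB of memory:
4 KB pages, memory consumed by all page-table entries: 32 MB
2 MB pages, memory consumed by all page-table entries: 0.0625 MB
On Linux, page tables are private to each process; every process has its own.
With 1000 concurrent processes:
With 4 KB pages: 1000 * 32 MB ~= 32 GB
With 2 MB pages: 1000 * 0.0625 MB ~= 62.5 MB
This is of course an extreme case: it assumes that every process maps all of the memory, which does not happen in practice. A process maps only part of the memory, so it holds only part of the page-table entries.
For example, if each process maps only 2 GB of physical memory, the page tables of 1000 processes consume:
4 KB pages, memory consumed by all page-table entries: 1000 * (1 MB * 2) ~= 2 GB
2 MB pages, memory consumed by all page-table entries: 1000 * (0.001953125 MB * 2) ~= 3.9 MB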
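The arithmetic above can be reproduced with a small shell function. This is only a sketch under the same simplifications as the text (a flat 4 bytes per page-table entry, every page mapped); the function name pagetable_mb is made up for illustration.

```shell
# Estimated page-table size in MB.
# Assumption (taken from the text above): 4 bytes per page-table entry.
# usage: pagetable_mb MEM_GB PAGE_KB [PROCESSES]
pagetable_mb() {
    awk -v gb="$1" -v page_kb="$2" -v procs="${3:-1}" 'BEGIN {
        pages = gb * 1024 * 1024 / page_kb        # pages needed to map MEM_GB
        printf "%.6f\n", pages * 4 / 1024 / 1024 * procs
    }'
}

pagetable_mb 1 4            # 1 GB mapped with 4 KB pages, one process
pagetable_mb 1 2048         # 1 GB mapped with 2 MB pages, one process
pagetable_mb 2 4 1000       # 2 GB per process, 4 KB pages, 1000 processes
pagetable_mb 2 2048 1000    # 2 GB per process, 2 MB pages, 1000 processes
```

The four calls print 1.000000, 0.001953, 2000.000000 and 3.906250 (MB), matching the figures worked out by hand.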
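When to enable HugePages
When the host has a large amount of memory, the alert log written at database startup contains the following message, which recommends enabling HugePages: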
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 64 KB
Total Shared Global Region in Large Pages = 0 KB (0%)
Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB
RECOMMENDATION:
Total System Global Area size is 4786 MB. For optimal performance,
prior to the next instance restart:
1. Increase the number of unused large pages by
at least 2393 (page size 2048 KB, total size 4786 MB) system wide to
get 100% of the System Global Area allocated with large pages
2. Large pages are automatically locked into physical memory.
Increase the per process memlock (soft) limit to at least 4794 MB to lock
100% System Global Area's large pages into physical memory
********************************************************************
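The key figure in this banner is the share of the SGA that ended up in large pages (0% here). A minimal sketch for pulling it out of an alert log with sed follows; the function name sga_large_page_pct is hypothetical, and the pattern assumes the exact banner wording shown above.

```shell
# Print the percentage of the SGA held in large pages, taken from the most
# recent "Large Pages Information" banner in the text read from stdin.
# usage: sga_large_page_pct < ALERT_LOG
sga_large_page_pct() {
    sed -n 's/.*Total Shared Global Region in Large Pages = .*(\([0-9][0-9]*\)%).*/\1/p' |
        tail -n 1
}
```

ALERT_LOG stands for the path of the instance's alert log, which depends on the installation. A result of 100 after a restart means the whole SGA is backed by large pages.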
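How to enable HugePages
Disable AMM
AMM (Automatic Memory Management) is incompatible with HugePages and must be disabled.
Raise the memlock limit
The memlock limit must be larger than the total size of the HugePages; a value of -1 means unlimited.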
# vi /etc/security/limits.conf
oracle soft memlock -1
oracle hard memlock -1
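Because a misspelled user name in this file fails silently, it may be worth checking the entries right after editing. The sketch below is one possible check; memlock_entries is a hypothetical helper, and it only matches lines that name the user literally (not wildcard or group entries).

```shell
# Print the memlock entries ("type value") that apply to USER by name in
# limits.conf-style text read from stdin; exit status 1 if there are none.
# usage: memlock_entries USER < /etc/security/limits.conf
memlock_entries() {
    awk -v user="$1" '
        /^[ \t]*#/ { next }                                  # skip comment lines
        $1 == user && $3 == "memlock" { print $2, $4; found = 1 }
        END { if (!found) exit 1 }
    '
}
```

For the two lines above, `memlock_entries oracle < /etc/security/limits.conf` prints `soft -1` and `hard -1`. After logging in again as oracle, `ulimit -l` should then report unlimited.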
Set the number of HugePages
# vi /etc/sysctl.conf
vm.nr_hugepages=???
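Formula for the number of HugePages: sga_max_size (MB) / 2 + 100
Note: in production, applying the change online with sysctl -p is not recommended; reboot the host instead. On a production system sysctl -p may hang. Watching the HugePages count from another session showed it growing very slowly; in one case a new huge page appeared only every two to three seconds.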
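The formula above can be wrapped in a one-line helper. This is a sketch that assumes 2 MB huge pages; the name nr_hugepages and the round-up for odd sizes are additions for illustration, not part of the original rule of thumb.

```shell
# Suggested vm.nr_hugepages for a given SGA size, using the rule of thumb
# sga_max_size (MB) / 2 + 100. Assumes a 2 MB huge page size.
# usage: nr_hugepages SGA_MAX_SIZE_MB
nr_hugepages() {
    # (n + 1) / 2 rounds an odd MB count up instead of down
    echo $(( ($1 + 1) / 2 + 100 ))
}

nr_hugepages 4786
```

For the 4786 MB SGA from the alert log excerpt this prints 2493: the 2393 pages Oracle asks for plus 100 pages of headroom.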
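Disable Transparent HugePages
Add the kernel parameter transparent_hugepage=never to the grub configuration file. Alternatively, for an immediate effect that does not survive a reboot:
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
Transparent HugePages are described in more detail later in this article.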
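The control file lives under a different directory on Red Hat 6 kernels than on other kernels. A small sketch that picks whichever one exists is shown below; thp_enabled_path is a hypothetical helper, and its optional ROOT argument exists only so it can be tried against a scratch directory.

```shell
# Print the path of the THP "enabled" control file, trying the Red Hat 6
# location first and then the upstream one. Exit status 1 if neither exists.
# usage: thp_enabled_path [ROOT]
thp_enabled_path() {
    for dir in redhat_transparent_hugepage transparent_hugepage; do
        file="${1:-}/sys/kernel/mm/$dir/enabled"
        if [ -e "$file" ]; then
            echo "$file"
            return 0
        fi
    done
    return 1
}
```

As root, `echo never > "$(thp_enabled_path)"` then disables Transparent HugePages on either kind of kernel until the next reboot.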
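After rebooting the host, verify that HugePages are in use
Do not skip this step. On one production system, oracle had been misspelled as orcale in limits.conf, so the HugePages were allocated but never used by Oracle, and the system ran that way for several years.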
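One way to make this check is to count how many huge pages are actually allocated or reserved once the instance is up. The sketch below reads /proc/meminfo-style text; hugepages_committed is a hypothetical name, and the counters are system-wide rather than per instance.

```shell
# Number of huge pages currently allocated or reserved, from /proc/meminfo-style
# text on stdin. HugePages_Free still counts reserved pages, so the reserved
# pages are added back: Total - Free + Rsvd.
# usage: hugepages_committed < /proc/meminfo
hugepages_committed() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^HugePages_Rsvd:/  { rsvd = $2 }
        END { print total - free + rsvd }
    '
}
```

With HugePages_Total 2396, HugePages_Free 1004 and HugePages_Rsvd 1001 the result is 2393, exactly the page count a 4786 MB SGA needs. A result of 0 while the database is running means the SGA did not land in HugePages.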
About Transparent HugePages
Introduction to Transparent HugePages
Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attempts to allocate hugepages whenever possible and any Linux process will receive 2MB pages if the mmap region is 2MB naturally aligned. The main kernel address space itself is mapped with hugepages, reducing TLB pressure from kernel code.
The kernel will always attempt to satisfy a memory allocation using hugepages. If no hugepages are available (due to non availability of physically continuous memory for example) the kernel will fall back to the regular 4KB pages. THP are also swappable (unlike hugetlbfs). This is achieved by breaking the huge page to smaller 4KB pages, which are then swapped out normally.
But to use hugepages effectively, the kernel must find physically continuous areas of memory big enough to satisfy the request, and also properly aligned. For this, a khugepaged kernel thread has been added. This thread will occasionally attempt to substitute smaller pages being used currently with a hugepage allocation, thus maximizing THP usage.
In userland, no modifications to the applications are necessary (hence transparent). But there are ways to optimize its use. For applications that want to use hugepages, use of posix_memalign() can also help ensure that large allocations are aligned to huge page (2MB) boundaries.
Also, THP is only enabled for anonymous memory regions. There are plans to add support for tmpfs and page cache. THP tunables are found in the /sys tree under /sys/kernel/mm/redhat_transparent_hugepage.
Check whether Transparent HugePages are enabled
# cat /sys/kernel/mm/redhat_transparent_hugepage/enabled    (Red Hat only; otherwise use the path below)
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
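In the output, the value in square brackets is the active setting: [always] means Transparent HugePages are enabled, [never] means they are disabled, and [madvise] means THP is used only in VMAs flagged with MADV_HUGEPAGE.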
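For scripted checks, the bracketed word can be extracted with sed. A minimal sketch, with thp_mode as a made-up function name:

```shell
# Print the active THP mode (the bracketed word) from text read on stdin.
# usage: thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

echo 'always madvise [never]' | thp_mode
```

The example call prints never; on a database server any other result means Transparent HugePages still need to be disabled.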
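It can also be checked this way (a non-zero AnonHugePages value indicates that transparent huge pages are in use):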
# grep -i huge /proc/meminfo
AnonHugePages: 2279424 kB
HugePages_Total: 2396
HugePages_Free: 1004
HugePages_Rsvd: 1001
HugePages_Surp: 0
Hugepagesize: 2048 kB
Why Oracle recommends disabling Transparent HugePages
Transparent HugePages memory is enabled by default with Red Hat Enterprise Linux 6, SUSE Linux Enterprise Server 11, and Oracle Linux 6 with earlier releases of Oracle Linux Unbreakable Enterprise Kernel 2 (UEK2) kernels. Transparent HugePages memory is disabled in later releases of Oracle Linux UEK2 kernels. Transparent HugePages can cause memory allocation delays during runtime. To avoid performance issues, Oracle recommends that you disable Transparent HugePages on all Oracle Database servers. Oracle recommends that you instead use standard HugePages for enhanced performance.
Transparent HugePages memory differs from standard HugePages memory because the kernel khugepaged thread allocates memory dynamically during runtime. Standard HugePages memory is pre-allocated at startup, and does not change during runtime.
Transparent HugePages are similar to the HugePages that have been available in previous Linux releases. The main difference is that the Transparent HugePages are set up dynamically at run time by the khugepaged thread in kernel while the regular HugePages had to be preallocated at the boot up time. Because Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may cause problems even in a single-instance database environment with unexpected performance problems or delays. As such, Oracle recommends disabling Transparent HugePages on all Database servers running Oracle.
- Starting with RedHat 6, OEL 6, SLES 11, and UEK2 kernels, Transparent HugePages are enabled by default to improve memory-management performance. They are functionally similar to the HugePages of earlier releases.
- Transparent Huge Pages are not available on the 32-bit version of RHEL 6.
- Oracle officially advises against enabling Transparent HugePages on RedHat 6, OEL 6, SLES 11, and UEK2 kernels, because they cause performance problems and, in RAC, can also cause unexpected node reboots.
- Transparent HugePages must also be disabled when installing Vertica Analytic Database.