Recently, several customers' Oracle databases on Linux have shown serious shared pool waits during peak business hours, including the library cache lock and latch: shared pool wait events, leading to slow response for business users and failures to establish new connections.
For the analysis of such problems, in addition to using ASH wait events to check for blockers and for hard/soft parsing problems, on 11g and later you should also check whether the database alert log reports swap usage, such as: WARNING: Heavy swapping observed on system in last 5 mins. If OSWatcher (OSW) monitoring is deployed, the output of commands such as top and vmstat will show whether there is obvious swap in/out paging activity and whether the kswapd0 process appears near the top of the top output. Usually this kind of problem should first be handled at the Linux memory-configuration level: configure static hugepages (and disable Transparent Huge Pages, THP) to shrink the page-table footprint, and set vm.swappiness = 10 to discourage paging, as sketched below. Once the OS level is set up, SQL execution efficiency and the related parameters can then be tuned at the database level.
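As a reference, a minimal sketch of those OS-level settings on a RHEL-family Linux follows; the hugepage count and memlock values are placeholders to be sized to the actual SGA, not values taken from these incidents:

# 1. Disable Transparent Huge Pages (on RHEL 6 the path may be
#    /sys/kernel/mm/redhat_transparent_hugepage instead)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# 2. Make the kernel much less eager to swap
echo "vm.swappiness = 10" >> /etc/sysctl.conf

# 3. Reserve static hugepages for the SGA (placeholder: 20480 x 2 MB = 40 GB)
echo "vm.nr_hugepages = 20480" >> /etc/sysctl.conf
sysctl -p

# 4. Let the oracle user lock that memory (values in KB, sized to the SGA)
echo "oracle soft memlock 41943040" >> /etc/security/limits.conf
echo "oracle hard memlock 41943040" >> /etc/security/limits.conf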
In practice, the operating-system configuration usually needs to be fixed first. If serious shared pool contention remains afterwards, it is then more appropriate to turn to SQL parsing optimization.
Recently I encountered three similar Oracle-on-Linux incidents: two showed slow response, and in one the library cache lock contention was so severe that neither the application systems nor PL/SQL Developer could establish a connection. In each case, after hugepages were configured, THP was disabled, and vm.swappiness = 10 was set at the OS level, nothing needed to be changed at the database level, and the systems have been running stably since.
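For reference, checks along these lines (a sketch, not taken from the original incidents) confirm that the settings took effect and that the SGA actually landed in hugepages:

# Expect: always madvise [never]
cat /sys/kernel/mm/transparent_hugepage/enabled

# Expect: vm.swappiness = 10
sysctl vm.swappiness

# HugePages_Free should drop once the instance is up;
# PageTables should stay small (MBs, not GBs) when hugepages are in use
grep ^Huge /proc/meminfo
grep ^PageTables /proc/meminfo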
Relevant diagnostic information: 1. Information from ASH
Slot Time (Duration)  Slot Count  Event                    Event Count  % Event
06:52:00 (2.0 min)         3,318  latch: shared pool             1,393     4.56
06:52:00 (2.0 min)         3,318  CPU + Wait for CPU             1,086     3.55
06:52:00 (2.0 min)         3,318  null event                       603     1.97
06:54:00 (2.0 min)        11,338  latch: shared pool             8,310    27.18
06:54:00 (2.0 min)        11,338  null event                     2,073     6.78
06:54:00 (2.0 min)        11,338  CPU + Wait for CPU               578     1.89
06:56:00 (2.0 min)         8,086  latch: shared pool             5,615    18.37
06:56:00 (2.0 min)         8,086  CPU + Wait for CPU             1,441     4.71
06:56:00 (2.0 min)         8,086  null event                       828     2.71
06:58:00 (2.0 min)         2,690  CPU + Wait for CPU             2,277     7.45
06:58:00 (2.0 min)         2,690  db file sequential read          131     0.43
06:58:00 (2.0 min)         2,690  db file parallel read             98     0.32
07:00:00 (2.0 min)         2,079  CPU + Wait for CPU             1,471     4.81
07:00:00 (2.0 min)         2,079  latch: shared pool               166     0.54
07:00:00 (2.0 min)         2,079  db file sequential read          127     0.42
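This slice has the shape of the Top Events section of an ASH report. Roughly the same summary can also be pulled straight from v$active_session_history; a sketch (adjust the time window as needed, and note that pure CPU time appears as a NULL event in ASH):

sqlplus -s / as sysdba <<'EOF'
-- Top events per minute over the last hour
SELECT TRUNC(sample_time, 'MI')         AS sample_min,
       NVL(event, 'CPU + Wait for CPU') AS event,
       COUNT(*)                         AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - 1/24
GROUP  BY TRUNC(sample_time, 'MI'), NVL(event, 'CPU + Wait for CPU')
ORDER  BY sample_min, samples DESC;
EOF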
2. System information collected by OSW
zzz ***Fri Sep 28 06:50:23 CST 2018
Cpu(s):  1.4%us,  4.8%sy,  0.0%ni, 93.5%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  65838444k total, 56004596k used,  9833848k free,    22240k buffers
Swap: 32989180k total,  4589600k used, 28399580k free, 48619576k cached

  PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
  437 root   20  0     0    0    0 R 99.0  0.0 75:55.06 kswapd0
16204 oracle 20  0 24.2g 280m 277m R 18.3  0.4  0:00.19 oracle

zzz ***Fri Sep 28 06:50:23 CST 2018
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b    swpd     free   buff    cache   si    so     bi    bo    in    cs us sy id wa st
 4  0 4589744   905784  80736 57481824    3     2    344    11     0     0  2  1 96  1  0
 1  0 4589644  5919328  75564 52517876  148     0  14864   220  6591  5498  1  5 93  0  0
 2  0 4589592 12053740  20948 46429312   20     0    188   664 11428  5077  2  5 93  0  0

zzz ***Fri Sep 28 06:50:34 CST 2018
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b    swpd     free   buff    cache   si    so     bi    bo    in    cs us sy id wa st
 1 10 4677432 39858184   1748 20807180    3     2    344    11     0     0  2  1 96  1  0
 0  2 4698476 39894948   2992 20783744  156 21104 191336 21180  7454  5922  1  3 90  6  0
 4  0 4722068 39904756   2956 20759608  576 24148 189208 24256  5432  5621  1  2 93  4  0
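The second vmstat sample is the telltale part: heavy sustained swap-out in the so column, combined with kswapd0 pinned at 99% CPU in top. Where OSW is not deployed, the same symptoms can be checked live; a minimal sketch:

# Sustained non-zero si/so columns across samples mean active swapping
vmstat 5 5

# kswapd0 burning CPU means the kernel is reclaiming memory aggressively
top -b -n 1 | head -20
ps -eo pid,comm,pcpu --sort=-pcpu | grep -m1 kswapd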
3. Information in the alert logs of similar database systems (this message does not always appear)
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [4.99%] pct of memory swapped out [8.89%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.