Yesterday the project came online, and after verification, I took a deep breath and went online perfectly without any problems.However, half an hour later, I received an alert from the Operations and Maintenance Team that several ports of a server were not detected.At that time, I heard that I was in a tight heart. No, we have observed the system logs after the server. All transactions are normal. If the system starts abnormally, the logs will not be printed.So I rushed to log on to the server between operations and detect the port.
netstat -anpl | grep 1099
netstat -anpl | grep 8900
The output shows that the port is indeed not started.But the system's ports for external services are all fine.So we found the system boot log and located the following exception:
So I searched the web for articles about Failed to write core dump, many of which were not consistent with the problem we encountered this time.
At the same time, we found an additional log file hs_err_pid13893.log in the startup script directory, which is the thread stack information of the JVM that had an exception at startup, as described below:
# # A fatal error has been detected by the Java Runtime Environment: # # SIGBUS (0x7) at pc=0x00007fa0a94e14e2, pid=13893, tid=0x00007f9e8fefe700 # # JRE version: Java(TM) SE Runtime Environment (8.0_172-b11) (build 1.8.0_172-b11) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.172-b11 mixed mode linux-amd64 compressed oops) # Problematic frame: # C [libzip.so+0x124e2] newEntry+0x62 # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # The crash happened outside the Java Virtual Machine in native code. # See problematic frame for where to report the bug. # --------------- T H R E A D --------------- Current thread (0x00007f9ed4111000): JavaThread "pool-FramePools-thread2" [_thread_in_native, id=14081, stack(0x00007f9e8fdfe000,0x00007f9e8feff000)] siginfo: si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: 0x00007f9efc0cb0fa Registers: RAX=0x00007f9ee830cc60, RBX=0x00007f9f6e0e1e00, RCX=0x0000000000000000, RDX=0x00007f9ee830cc60 RSP=0x00007f9e8fefb4b0, RBP=0x00007f9e8fefb500, RSI=0x0000000000000000, RDI=0x00007f9ee8000020 R8 =0x00007f9ee8000088, R9 =0x00007f9e8fefb580, R10=0x0000000000000048, R11=0x00000000c6bfe5f0 R12=0x00007f9efc0cb0dd, R13=0x00007f9ee830cc60, R14=0x000000001cb26b69, R15=0x00007f9f6e0de110 RIP=0x00007fa0a94e14e2, EFLAGS=0x0000000000010206, CSGSFS=0x0000000000000033, ERR=0x0000000000000004 TRAPNO=0x000000000000000e Top of Stack: (sp=0x00007f9e8fefb4b0) 0x00007f9e8fefb4b0: 00000000c6b13fe0 0000000100000000 0x00007f9e8fefb4c0: 00007f9f6e0de1e0 00007fa0b4c1a02e 0x00007f9e8fefb4d0: 00007f9ee830cc60 00007f9f6e0e1e00 0x00007f9e8fefb4e0: 00007f9f6e0e10b0 00007f9f6e0de110 0x00007f9e8fefb4f0: 000000001cb26b69 000000000000004f 0x00007f9e8fefb500: 00007f9e8fefb560 00007fa0a94e1c3f 0x00007f9e8fefb510: 00007f9ed4111000 0100000000000000 0x00007f9e8fefb520: 00007f9e8fefb580 00007f9e8fefb5d0 0x00007f9e8fefb530: 00007f9e8fefb5ce 00007f9e8fefb580 0x00007f9e8fefb540: 000000000000004f 00007f9ed41111f8 0x00007f9e8fefb550: 00007f9e8fefb9d8 0000000000000001 0x00007f9e8fefb560: 00007f9e8fefb9c0 00007fa0a94d2e90 0x00007f9e8fefb570: 00007fa09a4cf7a7 00007f9f6e0de110 0x00007f9e8fefb580: 672f63642f6d6f63 636e616e7265766f 0x00007f9e8fefb590: 61646174656d2f65 736c706d692f6174 0x00007f9e8fefb5a0: 656d69746e75722f 5f72656b6361502f 0x00007f9e8fefb5b0: 6465525f736e6f63 56626d4362625a6d 0x00007f9e8fefb5c0: 4f534a5f6f745f31 007373616c632e4e 0x00007f9e8fefb5d0: 00007f9e8fefb650 0000000000800000 0x00007f9e8fefb5e0: 00007f9ee8000078 00007f9ee8000020 0x00007f9e8fefb5f0: 00007f9ee8000020 0000000000000000 0x00007f9e8fefb600: 0000000000008010 00000038ad67a567 0x00007f9e8fefb610: 00007f9ed4111000 0000000000008000 0x00007f9e8fefb620: 00007f9e8fefb690 00007f9e8fefb868 0x00007f9e8fefb630: 000000768fefb690 0000000000000200 0x00007f9e8fefb640: 0000000000000040 0000000000000000 0x00007f9e8fefb650: 0000000000000001 0000000000000008 0x00007f9e8fefb660: 0000000000008030 0000023000000801 0x00007f9e8fefb670: 0000007c0000009b 0000007600000078 0x00007f9e8fefb680: 00007fa09a515f20 00007f9ee8000020 0x00007f9e8fefb690: 0000000000008000 0000000000000000 0x00007f9e8fefb6a0: 0000000000000019 00000000c6bfe409 Instructions: (pc=0x00007fa0a94e14e2) 0x00007fa0a94e14c2: 00 48 c7 40 28 00 00 00 00 41 80 7f 30 00 0f 84 0x00007fa0a94e14d2: 8a 02 00 00 4c 8b 63 08 4d 2b 67 28 4d 03 67 18 0x00007fa0a94e14e2: 41 0f b6 5c 24 1d 41 0f b6 44 24 1c c1 e3 08 09 0x00007fa0a94e14f2: c3 41 0f b6 44 24 1e 88 45 bd 41 0f b6 54 24 20 Register to memory mapping: RAX=0x00007f9ee830cc60 is an unknown value RBX=0x00007f9f6e0e1e00 is an unknown value RCX=0x0000000000000000 is an unknown value RDX=0x00007f9ee830cc60 is an unknown value RSP=0x00007f9e8fefb4b0 is pointing into the stack for thread: 0x00007f9ed4111000 RBP=0x00007f9e8fefb500 is pointing into the stack for thread: 0x00007f9ed4111000 RSI=0x0000000000000000 is an unknown value RDI=0x00007f9ee8000020 is an unknown value R8 =0x00007f9ee8000088 is an unknown value R9 =0x00007f9e8fefb580 is pointing into the stack for thread: 0x00007f9ed4111000 R10=0x0000000000000048 is an unknown value R11=
Key points are as follows:
The general meaning is that the java virtual machine had an original exception at startup, and the framework that caused the problem is C [libzip.so+0x124e2] newEntry+0x62.Then I searched the web for C [libzip.so+0x124e2] newEntry+0x62, and finally I found this article: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8214788 This is an exception to JVM. When java operates on zip files, it calls the libzip.so Library of the linux system. If a zip file operation is done while the JVM is running and the zip file is modified, there is a risk that the JVM will crash.Searching in oracle's java bug database found many similar problems:
Of course, Oracle officially offers a solution:
1. Add in the JVM startup parameter: -Dsun.zip.disableMemoryMapping=true, but this scheme has a risk that it will cause a huge performance risk for the JVM when it reads jar files on disk, provided you read them over and over again.However, when I searched the Internet for similar questions, I found that some netizens added this startup parameter but encountered other problems.
2. Upgrade to JDK9.The problem is perfectly solved in this version.
Summary: Our system loads zip files at startup. Perhaps our operators restarted the next server immediately before other servers finished startup. By chance, both servers were operating on the same zip file, which resulted in JVM not starting completely.