SIGBUS memory access error on an mmap mapping

symptom

An open file descriptor is mapped into memory with mmap, and the program reads and writes that region. After running for a long time, the process crashes with SIGBUS, a memory access error. Inspecting the core dump in GDB shows that part of the mapped memory cannot be accessed.

problem analysis

According to man mmap, the call itself can fail with the errors below, and using a mapped region can raise the signals listed after them:

ERRORS
       EBADF  fd is not a valid file descriptor (and MAP_ANONYMOUS was not set).

       EACCES A  file  descriptor  refers  to a non-regular file.  Or MAP_PRIVATE was requested, but fd is not open for reading.  Or
              MAP_SHARED was requested and PROT_WRITE is set, but fd is not open in read/write (O_RDWR) mode.  Or PROT_WRITE is set,
              but the file is append-only.

       EINVAL We don't like start or length or offset.  (E.g., they are too large, or not aligned on a PAGESIZE boundary.)

       ETXTBSY
              MAP_DENYWRITE was set but the object specified by fd is open for writing.

       EAGAIN The file has been locked, or too much memory has been locked.

       ENOMEM No memory is available, or the process's maximum number of mappings would have been exceeded.

       ENODEV The underlying filesystem of the specified file does not support memory mapping.

       Use of a mapped region can result in these signals:

       SIGSEGV
              Attempted write into a region specified to mmap as read-only.

       SIGBUS Attempted  access  to a portion of the buffer that does not correspond to the file (for example, beyond the end of the
              file, including the case where another process has truncated the file).
From the above, SIGBUS means that either the accessed part of the buffer lies outside the file's range, or the mapped file has been truncated. Note that SIGBUS is not a failure of the mmap call itself: in my case mmap succeeded, and the error appeared later, while accessing the buffer. So how can it be analyzed?
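The truncation case from the man page can be reproduced in isolation. The following is a minimal sketch, not the code from my system: the function name sigbus_after_truncate and the temporary file path are made up for illustration, and the ftruncate call stands in for whatever another thread or process does to shrink the file. It maps one page, shrinks the file to zero bytes, touches the page again, and reports whether SIGBUS was raised.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;

/* Jump back into sigbus_after_truncate instead of dying with a core dump. */
static void on_sigbus(int signo) {
    (void)signo;
    siglongjmp(recover_point, 1);
}

/* Hypothetical demo: returns 1 if touching the mapping after truncation
 * raised SIGBUS, 0 if the access went through, -1 on a setup failure. */
int sigbus_after_truncate(void) {
    char path[] = "/tmp/mmap_sigbus_XXXXXX";  /* illustrative path */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);  /* the file lives on until fd is closed */

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* Give the file one page of data and map it shared. */
    void *p = MAP_FAILED;
    if (ftruncate(fd, page) == 0)
        p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }

    volatile char *buf = p;
    buf[0] = 'x';  /* mmap succeeded and the page is usable at this point */

    struct sigaction sa, old;
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old);

    volatile int result = 0;
    if (sigsetjmp(recover_point, 1) == 0) {
        if (ftruncate(fd, 0) != 0)  /* stand-in for the other thread */
            result = -1;
        else
            buf[0] = 'y';           /* the page is now beyond end of file */
    } else {
        result = 1;                 /* we got here through on_sigbus */
    }

    sigaction(SIGBUS, &old, NULL);  /* restore the previous handler */
    munmap(p, (size_t)page);
    close(fd);
    return result;
}
```

Calling sigbus_after_truncate() from a small main and printing the result is enough to see the behaviour. The point of the sketch is that mmap returns successfully and the first access works; the signal only arrives on the access that follows the truncation.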

Solutions and validation

First, I clarified the context: which files the system maps with mmap and how the mapped regions are used. Then I opened the core dump in GDB and looked at the stacks of all threads that access the file. One thread was accessing the mmap buffer while another thread was reopening the same file. The exception log confirmed it: a thread had reopened a file that was still mapped. If the reopen truncates the file, the mapped pages lose their backing, which matches the SIGBUS condition described in the man page.

I added defensive code against this case, reran the test, and the problem disappeared.
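What the defensive code looks like depends on the program. As one possible shape, and only as an assumption about how such a guard could be written, the sketch below puts the mapping behind a mutex so that no thread can touch the buffer while another thread is reopening the file. The struct guarded_map and all guarded_map_* functions are hypothetical names invented for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical wrapper; every name here is illustrative. */
struct guarded_map {
    pthread_mutex_t lock;  /* protects fd, base, and len */
    int fd;
    char *base;
    size_t len;
};

/* Open path, size it to m->len bytes, and map it. Caller holds m->lock. */
static int map_locked(struct guarded_map *m, const char *path, int extra_flags) {
    m->base = NULL;
    m->fd = open(path, O_RDWR | O_CREAT | extra_flags, 0600);
    if (m->fd < 0) return -1;
    void *p = MAP_FAILED;
    if (ftruncate(m->fd, (off_t)m->len) == 0)
        p = mmap(NULL, m->len, PROT_READ | PROT_WRITE, MAP_SHARED, m->fd, 0);
    if (p == MAP_FAILED) { close(m->fd); m->fd = -1; return -1; }
    m->base = p;
    return 0;
}

/* Drop the current mapping, if any. Caller holds m->lock. */
static void unmap_locked(struct guarded_map *m) {
    if (m->base != NULL) { munmap(m->base, m->len); m->base = NULL; }
    if (m->fd >= 0) { close(m->fd); m->fd = -1; }
}

int guarded_map_open(struct guarded_map *m, const char *path, size_t len) {
    assert(m != NULL && path != NULL);
    if (pthread_mutex_init(&m->lock, NULL) != 0) return -1;
    m->len = len;
    pthread_mutex_lock(&m->lock);
    int rc = map_locked(m, path, 0);
    pthread_mutex_unlock(&m->lock);
    return rc;
}

/* Reopen with truncation. The old mapping is removed before the file is
 * truncated, and a fresh mapping is created before the lock is released. */
int guarded_map_reopen(struct guarded_map *m, const char *path) {
    pthread_mutex_lock(&m->lock);
    unmap_locked(m);
    int rc = map_locked(m, path, O_TRUNC);
    pthread_mutex_unlock(&m->lock);
    return rc;
}

/* Store one byte; returns -1 instead of touching an invalid mapping. */
int guarded_map_put(struct guarded_map *m, size_t off, char value) {
    int rc = -1;
    pthread_mutex_lock(&m->lock);
    if (m->base != NULL && off < m->len) { m->base[off] = value; rc = 0; }
    pthread_mutex_unlock(&m->lock);
    return rc;
}

void guarded_map_close(struct guarded_map *m) {
    pthread_mutex_lock(&m->lock);
    unmap_locked(m);
    pthread_mutex_unlock(&m->lock);
    pthread_mutex_destroy(&m->lock);
}
```

A plain mutex serializes every access, which keeps the sketch short; a program with many reader threads might prefer a read-write lock so that only the reopen path is exclusive. The guard only helps against reopen calls that go through it. A truncation done by another process would still need a different defense.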

summary

To track down an mmap exception like this one, examine the stacks of all threads in the core dump together. A single thread's stack is not enough to find the cause.

Posted by helpwanted on Mon, 20 Jan 2020 09:16:34 -0800