Troubleshooting a Segmentation fault when logging in with the mysql command-line client

Keywords: MySQL

Disclaimer: This article only records solutions to problems I encountered and a summary of what I learned. It inevitably contains oversights; criticism and corrections are welcome. Please do not rely on the content of this article for any of your own activities; I accept no responsibility for any consequences arising from doing so.

Environment: CentOS 8.4, gcc 8.4.1, MySQL 8.0.21, x86_64

Problem description: While building MySQL 8.0.21 from source with make, linking failed because the ncurses library was missing. After installing the library, make succeeded. I then started mysqld and connected with mysql -u root -p; after I entered the password and pressed Enter, the mysql client crashed with a Segmentation fault.

The first make produced warnings. They are summarized below:

/opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c: In function 'terminal_set':
/opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:877:6: warning: implicit declaration of function 'tgetent'; did you mean 'getenv'? [-Wimplicit-function-declaration]
  i = tgetent(el->el_terminal.t_cap, term);
      ^~~~~~~
      getenv
/opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:899:15: warning: implicit declaration of function 'tgetflag'; did you mean 'tigetflag'? [-Wimplicit-function-declaration]
   Val(T_am) = tgetflag("am");
               ^~~~~~~~
               tigetflag
/opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:908:15: warning: implicit declaration of function 'tgetnum'; did you mean 'tigetnum'? [-Wimplicit-function-declaration]
   Val(T_co) = tgetnum("co");
               ^~~~~~~
               tigetnum
/opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:917:19: warning: implicit declaration of function 'tgetstr'; did you mean 'tigetstr'? [-Wimplicit-function-declaration]
       char *tmp = tgetstr(strchr(t->name, *t->name), &area);
                   ^~~~~~~
                   tigetstr

Analysis process:

To get a core file, the operating system must be configured first. Append the following line to /etc/rc.d/rc.local and make rc.local executable:

echo "core-%t" > /proc/sys/kernel/core_pattern

This writes the core dump into the current working directory of the crashing program, with a file name that starts with core- and is followed by a timestamp (%t) and the process ID, such as core-1637149273.2955, where 1637149273 is the timestamp and 2955 is the process ID. (The kernel appends the .PID suffix when kernel.core_uses_pid is nonzero and the pattern contains no %p.) With the line in rc.local, the setting is applied at every boot, so a restart makes it permanent. To apply it immediately without restarting, run the same command once as root to change the content of core_pattern.

After the modified core_pattern takes effect, reproducing the Segmentation fault in the mysql client produces a core file.

The gdb bt command shows the following call stack for the core file:

#0  0x00000000004e4eed in terminal_alloc (el=0x286eee0, t=<optimized out>, cap=0x52a9aaa0 <error: Cannot access memory at address 0x52a9aaa0>)
    at /opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:350
#1  0x00000000004e5da7 in terminal_set (el=el@entry=0x286eee0, term=<optimized out>, term@entry=0x0)
    at /opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:900
#2  0x00000000004e5ee1 in terminal_init (el=el@entry=0x286eee0) at /opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/terminal.c:297
#3  0x00000000004ea220 in el_init_internal (prog=0x7ffd52a9c6c9 "./mysql", fin=0x7fcd6f9c09c0 <_IO_2_1_stdin_>, 
    fout=0x7fcd6f9c16e0 <_IO_2_1_stdout_>, ferr=0x7fcd6f9c1600 <_IO_2_1_stderr_>, fdin=0, fdout=fdout@entry=1, fderr=2, flags=128)
    at /opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/el.c:139
#4  0x00000000004e22d5 in rl_initialize () at /opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/readline.c:297
#5  0x00000000004e2b55 in read_history (filename=filename@entry=0x286eec0 "/root/.mysql_history")
    at /opt/resource/mysql-8.0.21/extra/libedit/libedit-20190324-3.1/src/readline.c:1359
#6  0x000000000040924a in main (argc=<optimized out>, argv=<optimized out>) at /opt/resource/mysql-8.0.21/client/mysql.cc:1403

Take a look at the code near line 350 of terminal.c:

Judging from this line, the crash is caused by an access to an invalid memory address. Now look at the code near line 900 of terminal.c:

Judging from line 900, the problem lies in the return value of tgetstr. tgetstr is a function in the ncurses library. To solve this problem properly, some basic concepts of terminal programming are needed.

Run man 3 tgetstr to view the documentation for this function. It shows that there are three kinds of terminal capability: Boolean, numeric, and string, and that tgetstr retrieves string capabilities. It also shows that this function exists for applications written against the termcap library, and that it is emulated on top of the terminfo database behind the scenes.

In CentOS 6, the termcap database describes terminal capabilities. /etc/termcap is an ASCII file, so cat /etc/termcap shows the capabilities of all terminals. In CentOS 7 and 8, the terminfo database is used instead. It is located in /usr/share/terminfo and stores compiled entries, so use infocmp to view the capabilities of the current terminal. For a detailed description of terminal capabilities, see this article: termcap - terminal function database - Fan Weisheng - blog Park.

Here is the description of tgetstr from the termcap manual (source: The Termcap Library - Interrogate):

tgetstr

Use tgetstr to get a string value. It returns a pointer to a string which is the capability value, or a null pointer if the capability is not present in the terminal description. There are two ways tgetstr can find space to store the string value:

You can ask tgetstr to allocate the space. Pass a null pointer for the argument area, and tgetstr will use malloc to allocate storage big enough for the value. Termcap will never free this storage or refer to it again; you should free it when you are finished with it. This method is more robust, since there is no need to guess how much space is needed. But it is supported only by the GNU termcap library.

You can provide the space. Provide for the argument area the address of a pointer variable of type char *. Before calling tgetstr, initialize the variable to point at available space. Then tgetstr will store the string value in that space and will increment the pointer variable to point after the space that has been used. You can use the same pointer variable for many calls to tgetstr. There is no way to determine how much space is needed for a single string, and no way for you to prevent or handle overflow of the area you have provided. However, you can be sure that the total size of all the string values you will obtain from the terminal description is no greater than the size of the description (unless you get the same capability twice). You can determine that size with strlen on the buffer you provided to tgetent. See below for an example. Providing the space yourself is the only method supported by the Unix version of termcap.

In terminal.c, the area argument passed on line 900 points to buf, which is the second usage described in the quoted text above: the caller provides the storage.

Add print statements to terminal.c to observe the values of buf, area, and the tgetstr return value:

char buf[TC_BUFSIZE];
printf("buf addr:%p\n", buf);

...

for (t = tstr; t->name != NULL; t++) {
    /* XXX: some systems' tgetstr needs non const */
    /* Original call, split up below so the intermediate values can be printed:
     * terminal_alloc(el, t, tgetstr(strchr(t->name, *t->name), &area));
     */
    char *tmp = tgetstr(strchr(t->name, *t->name), &area);
    printf("area:%p\n", area);
    printf("tgetstr ret val:%p\n", tmp);
    terminal_alloc(el, t, tmp);
}

The output is as follows:

buf addr:0x7ffe0ec93660

area:0x7ffe0ec93664(1st for loop)

tgetstr ret val:0xec93660(1st for loop)

The return value of tgetstr in the first loop iteration, 0xec93660, is the address of buf, 0x7ffe0ec93660, with its upper 4 bytes cut off. It should have been identical to buf. Dereferencing the truncated address is an illegal memory access, which causes the segmentation fault.

At this point the result is puzzling. Why was the value cut off?

Then I remembered the warnings from the first compilation (listed at the beginning of this article): implicit declaration of function. This warning means that the function prototype is missing. The ncurses library that libedit depends on was not installed during the first compilation, so the header file term.h was missing, and with it the prototype of tgetstr. Is this warning related to the truncated return value?

Test it with the following program:

foo.h

#ifndef __FOO_H__
#define __FOO_H__
void foo();
#endif

foo.c

#include <stdlib.h>
#include <stdio.h>
void foo()
{
  char buffer[1024];
  printf("buffer:%p\n", buffer);
  char *str = bar(buffer);
  printf("str:%p\n", str);
  printf("sizeof pointer:%zu\n", sizeof(str));
  printf("sizeof int:%zu\n", sizeof(int));
}

bar.c

#include <stdio.h>
char *bar(char *buffer)
{
  char *buf = buffer;
  printf("buf:%p\n", buf);
  return buf;
}

main.c

#include "foo.h"
int main(int argc, char *argv[])
{
  foo();
  return 0;
}

Compiling foo.c reports the following warnings:

$ gcc foo.c -c -o foo.o
foo.c: In function 'foo':
foo.c:7:15: warning: implicit declaration of function 'bar' [-Wimplicit-function-declaration]
   char *str = bar(buffer);
               ^~~
foo.c:7:15: warning: initialization of 'char *' from 'int' makes pointer from integer without a cast [-Wint-conversion]

Execution results:

$ ./main
buffer:0x7ffd563a8720
buf:0x7ffd563a8720
str:0x563a8720
sizeof pointer:8
sizeof int:4

The output shows that the return value of bar(), which returns a pointer, is truncated so that only the lower 32 bits are kept. The same problem is discussed in: c - 64 bit function returns 32 bit pointer - Stack Overflow, where an answer explains: "By default all return values are int. So if a prototype is missing for function then compiler treats the return value as 32-bit and generates code for 32-bit return value. That's when your upper 4 bytes gets truncated."

On a 64-bit system, int is 4 bytes and a pointer is 8 bytes, so the truncation occurs and can easily crash the program. A 32-bit system should not have this problem, because int and pointers are both 4 bytes there. So on 64-bit systems, pay attention to the potential problems behind compiler warnings. More generally, develop good compilation habits; ideally the build produces no warnings at all.

Posted by renegade44 on Sun, 05 Dec 2021 21:36:04 -0800