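Hi, Chunge (春哥) here. Today's topic is segmentation faults and how to debug them. Running into segfaults regularly is how your understanding of C gets deeper.
I'd suggest bookmarking this post; it should be one of the more complete overviews of the debugging methods.
First, what is a segmentation fault? It means your program accessed memory it is not allowed to access. One case is that the address is mapped but you lack the permission for that access (for example, writing to read-only memory); the other is that nothing is mapped at that address at all, such as address 0.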
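As we know, the system gives a program its own memory space when it runs it. On x86, the segment descriptors involved live in a table whose location is held in the GDTR register.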

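Method 1: Stepping through with gdb to find the segfault
This is probably the most widely used method. As a student your programs may be too simple to need gdb, but learning to debug with gdb really matters (I never used it at school, yet during my internship I used it almost every day). And not just gdb: every debugging method here is worth learning.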
alex@alex-Yan:~/work$ gcc -g -rdynamic -o a a.c
alex@alex-Yan:~/work$ gdb a
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a...done.
(gdb) run
Starting program: /home/alex/work/a
Program received signal SIGSEGV, Segmentation fault.
0x00005555555547dd in main () at a.c:6
6       *ptr = 0;
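Those are the steps for debugging a segfault with gdb. The output tells us clearly that the fault is at line 6 of a.c, and it even shows the statement. It also says the program received the SIGSEGV signal. According to the documentation (man 7 signal), the default action for SIGSEGV is to terminate the process and produce a core file; the "Segmentation fault" message you see on the terminal is printed by the shell. So besides running under gdb, we can also analyze the core file.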
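Method 2: Analyzing the core file
I had heard of this before, but until my internship I never realized how important solid debugging skills are, so during the internship I mostly learned things on the spot (back then I couldn't settle down and study the way I can now between jobs; I was a bit restless).
The core file usually ends up in the directory the program was run from, which is often where the executable lives; the location can also be configured. Once the core has been generated, inspect it with the command "gdb ./executable core".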
alex@alex-Yan:~/work$ ulimit -c
0
alex@alex-Yan:~/work$ ulimit -c 1000
alex@alex-Yan:~/work$ ulimit -c
1000
alex@alex-Yan:~/work$ ./a
Segmentation fault (core dumped)
alex@alex-Yan:~/work$ ls
a  a.c  core
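By default our system limits the core file size to 0, which is the 0 printed by the first ulimit -c above; it means no core file is generated. To get a core file we have to raise that limit. The command above is only a temporary fix that applies to the current shell session; to have the limit set to 1000 at every boot, you need to change a configuration file.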
Now let's debug the core with gdb:
alex@alex-Yan:~/work$ gdb ./a core
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a...done.
[New LWP 11664]
Core was generated by `./a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00005603d28647dd in main () at a.c:6
6       *ptr = 0;
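The location turns up immediately, just as it did when running under gdb. These two methods are really handy.
Method 3: Launching the debugger when the segfault occurs
Let's look at some code first. Don't ask me to explain it line by line: I copied it and haven't fully worked through it yet, but I'm sharing it for anyone who needs it. Its purpose is to start gdb when a segfault occurs.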
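#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* SIGSEGV handler: attach gdb to the crashing process. */
void dump(int signo)
{
    char buf[1024];
    char cmd[1024];
    FILE *fh;
    size_t len;

    (void)signo;
    /* /proc/<pid>/cmdline starts with the program path, terminated by NUL. */
    snprintf(buf, sizeof(buf), "/proc/%d/cmdline", getpid());
    if (!(fh = fopen(buf, "r")))
        exit(0);
    if (!fgets(buf, sizeof(buf), fh))
        exit(0);
    fclose(fh);
    /* Strip a trailing newline, if there is one. */
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
    /* Run "gdb <program> <pid>" so gdb attaches to this process. */
    snprintf(cmd, sizeof(cmd), "gdb %s %d", buf, getpid());
    system(cmd);
    exit(0);
}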
Then call signal(SIGSEGV, &dump); at the start of main.
alex@alex-Yan:~/work$ gcc -g -o a a.c
alex@alex-Yan:~/work$ sudo ./a
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a...done.
Attaching to program: /home/alex/work/a, process 12048
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.27.so...done.
done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.27.so...done.
done.
0x00007f770e258457 in __GI___waitpid (pid=12049, stat_loc=stat_loc@entry=0x7ffd81d97638, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
30      ../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.
(gdb) bt
#0  0x00007f770e258457 in __GI___waitpid (pid=12049, stat_loc=stat_loc@entry=0x7ffd81d97638, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x00007f770e1c3177 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:149
#2  0x000055ad4af339ba in dump (signo=11) at a.c:25
#3  <signal handler called>
#4  0x000055ad4af339ec in main () at a.c:33
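When compiling, remember to add the -g option on top of what you normally use. As for running the executable a afterwards, I don't know why it has to be run as root; the articles I read from other people did not need elevated privileges. (One likely cause: with Ubuntu's default kernel.yama.ptrace_scope=1 setting, a non-root process may only attach to its own descendants, and here gdb is a child of the process it tries to attach to.) If any reader knows how to solve this, feel free to message me. Let's learn from each other and improve together.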
At this point you'll notice that the first three methods all rely on gdb. Next, here is a debugging method that doesn't use gdb.
Method 4: Analyzing with backtrace and objdump
Rewrite dump:
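#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* SIGSEGV handler: print the call stack, then exit. */
void dump(int signo)
{
    void *array[10];
    int size;
    char **strings;
    int i;

    (void)signo;
    /* Collect up to 10 return addresses from the current stack. */
    size = backtrace(array, 10);
    /* Turn the addresses into printable strings (allocated with malloc). */
    strings = backtrace_symbols(array, size);
    printf("Obtained %d stack frames.\n", size);
    if (strings != NULL) {
        for (i = 0; i < size; i++)
            printf("%s\n", strings[i]);
        free(strings);
    }
    exit(0);
}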
Compile and run:
alex@alex-Yan:~/work$ gcc -g -o a a.c
alex@alex-Yan:~/work$ ./a
Obtained 6 stack frames.
./a(+0x82b) [0x56338059082b]
/lib/x86_64-linux-gnu/libc.so.6(+0x3f040) [0x7f67fda2a040]
./a(+0x8c1) [0x5633805908c1]
./a(+0x8e6) [0x5633805908e6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f67fda0cbf7]
./a(+0x71a) [0x56338059071a]
This output doesn't point at the error in an obvious way, so we disassemble the program with objdump and look up the location 0x5633805908c1:
00000000000008ae <test_dump>:
 8ae:   55                      push   %rbp
 8af:   48 89 e5                mov    %rsp,%rbp
 8b2:   48 8d 05 d7 00 00 00    lea    0xd7(%rip),%rax        # 990 <_IO_stdin_used+0x20>
 8b9:   48 89 45 f8             mov    %rax,-0x8(%rbp)
 8bd:   48 8b 45 f8             mov    -0x8(%rbp),%rax
 8c1:   c6 00 00                movb   $0x0,(%rax)
 8c4:   90                      nop
 8c5:   5d                      pop    %rbp
 8c6:   c3                      retq
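The mismatch comes from how the address is written. 0x5633805908c1 is the runtime address; written out as 16 hex digits, its first 13 digits are the base at which the executable was loaded and are the same for every frame in ./a, so we only look at the last three digits. That gives 0x8c1, the same +0x8c1 offset shown in the backtrace, and at 8c1 in the disassembly we find movb $0x0,(%rax) in test_dump, the store that faulted.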
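You can't say which of the methods above matters more and which matters less; different situations call for different methods, so it's worth mastering all of them.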




