Wednesday 17 May 2017

Linux Debug Tips

Linux Applications Debugging Techniques/Stack corruption

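Stack corruption is rather hard to diagnose. Luckily, gcc (4.1 and later) can instrument the code to check for stack corruption:

  • -fstack-protector (protects functions that call alloca or contain character buffers above a small size threshold, 8 bytes by default)
  • -fstack-protector-all (protects every function)

gcc will add a guard variable to the stack frame and code that checks it when the function exits; if a buffer overflow has overwritten the guard, the program aborts.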
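As a rough illustration of what the instrumentation catches, the sketch below uses a hypothetical function, unchecked_copy (not part of any library), that writes into an 8-byte stack buffer without a bounds check. Built with something like gcc -O0 -fstack-protector-all, a call with len greater than 8 is expected to overwrite the guard and abort when the function returns; the exact behavior depends on compiler version, optimization level and stack layout, so treat that as an assumption rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical demo function (not from the original post).
 * Copies `len` bytes from `src` into an 8-byte stack buffer WITHOUT a
 * bounds check, then returns the sum of the bytes that fit in the buffer.
 * Any len > sizeof(buf) writes past the buffer: undefined behavior, which
 * a stack-protector build is meant to catch when the function exits. */
unsigned unchecked_copy(const char *src, size_t len)
{
    char buf[8];
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < len; i++)   /* deliberate bug: no check against sizeof(buf) */
        buf[i] = src[i];

    for (i = 0; i < len && i < sizeof(buf); i++)
        sum += (unsigned char)buf[i];

    return sum;
}
```

A call such as unchecked_copy("abc", 3) stays inside the buffer and returns normally. Passing a length larger than 8 is the kind of overflow the guard is designed to detect.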

Sysfs or procfs

Procfs is the more generic interface for exposing kernel information to user space. All process-related information, along with some kernel parameters, is passed to user space through the proc file system.

Sysfs is a more modern interface in the same spirit as procfs, but it is structured around the Linux driver model and its hierarchy. Sysfs exports device and driver information rather than process information. As Greg KH put it, sysfs is "1 value per file", so individual attributes are easy to read and write from user space, including from user-level driver code.
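The "1 value per file" rule means that reading an attribute takes very little parsing. A minimal sketch, assuming the attribute holds a single decimal integer; read_long_attr is a hypothetical helper name, not a kernel or libc API:

```c
#include <stdio.h>

/* Hypothetical helper: read a single integer from a "one value per file"
 * attribute. Returns 0 on success, -1 if the file cannot be opened or
 * does not start with an integer. */
int read_long_attr(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;

    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

On a machine with a loopback interface, read_long_attr("/sys/class/net/lo/mtu", &mtu) would be one way to try it; that path is only an example and may not exist on every system.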

Monitoring specific processes

If you want to monitor only the processes with process identifiers (PIDs) 4360 and 4358, type:
$ top -p 4360,4358

$ time top -b -n 1
This shows how long top takes to complete a single iteration in batch mode.

Troubleshooting a “slow” process
[root@oel6 ~]# ps -ef | grep find
root     27288 27245  4 11:57 pts/0    00:00:01 find . -type f
root     27334 27315  0 11:57 pts/1    00:00:00 grep find
 
[root@oel6 ~]# top -cbp 27288
top - 11:58:15 up 7 days,  3:38,  2 users,  load average: 1.21, 0.65, 0.47
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.1%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2026460k total,  1935780k used,    90680k free,    64416k buffers
Swap:  4128764k total,   251004k used,  3877760k free,   662280k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
27288 root      20   0  109m 1160  844 D  0.0  0.1   0:01.11 find . -type f 
Normally, when a process seems to be stuck like that (0% CPU usually means that the process is blocked in a system call, which causes the kernel to put it to sleep), the next step is to run strace on it to see which system call it is currently stuck in. If the process is not completely stuck, but returns from a system call and wakes up briefly every now and then, that would show up in strace as well.
[root@oel6 ~]# strace -cp 27288
Process 27288 attached - interrupt to quit

^C
^Z
[1]+  Stopped                 strace -cp 27288
Oops, the strace command itself got hung too! It didn’t print any output for a long time and didn’t respond to CTRL+C.

Let’s try pstack then (on Linux, pstack is just a shell script wrapper around the GDB debugger). While pstack does not see into kernel-land, it will still give us a clue about which system call was requested (as usually there’s a corresponding libc library call in the top of the displayed userland stack):
 [root@oel6 ~]# pstack 27288
^C
^Z
[1]+  Stopped                 pstack 27288
 
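Pstack also got stuck without returning anything! Both tools have to attach to the target with ptrace, and a process in uninterruptible sleep (state D) does not respond to that until it wakes up. ps can still show the kernel function the process is sleeping in (the WCHAN column), and /proc/PID/wchan gives the untruncated name: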
 
[root@oel6 ~]# ps -flp 27288
 F S UID        PID  PPID  C PRI  NI ADDR SZ  WCHAN  STIME TTY          TIME CMD
 0 D root     27288 27245  0  80   0 - 28070 rpc_wa 11:57 pts/0    00:00:01 find . -type f
 
[root@oel6 ~]# cat /proc/27288/wchan 
 rpc_wait_bit_killable
 
So this process is waiting on an RPC call (rpc_wait_bit_killable belongs to the kernel's SUNRPC layer, which NFS uses).
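The same wait channel can be read from a program, which is handy when sampling it repeatedly. A minimal sketch, assuming /proc is mounted and the kernel exposes the wchan file; read_wchan is a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the wait channel of `pid` into `buf`.
 * Returns 0 on success, -1 if /proc/<pid>/wchan cannot be read. */
int read_wchan(long pid, char *buf, size_t size)
{
    char path[64];
    FILE *f;
    size_t n;

    if (size == 0)
        return -1;

    snprintf(path, sizeof(path), "/proc/%ld/wchan", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    n = fread(buf, 1, size - 1, f);
    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[n] = '\0';
    buf[strcspn(buf, "\n")] = '\0';   /* drop a trailing newline, if any */
    return 0;
}
```

For the find process above, the buffer would be expected to hold rpc_wait_bit_killable. A process that is not sleeping in the kernel typically reports 0.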
 
Next, let’s figure out whether this process is completely stuck or not. On modern kernels, /proc/PID/status can tell us that:
[root@oel6 ~]# cat /proc/27288/status 
Name: find
State: D (disk sleep)
Tgid: 27288
Pid: 27288
PPid: 27245
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups: 0 1 2 3 4 6 10
VmPeak:   112628 kB
VmSize:   112280 kB
VmLck:        0 kB
VmHWM:     1508 kB
VmRSS:     1160 kB
VmData:      260 kB
VmStk:      136 kB
VmExe:      224 kB
VmLib:     2468 kB
VmPTE:       88 kB
VmSwap:        0 kB
Threads: 1
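The State field in this output can also be sampled from a program, for example to check whether the process ever leaves the D state. A minimal sketch, assuming the "State:" line format shown above; state_from_line and read_proc_state are hypothetical helper names:

```c
#include <stdio.h>

/* Hypothetical helper: extract the state letter from a "State:" line of
 * /proc/<pid>/status. Returns '\0' if the line is not a State line. */
char state_from_line(const char *line)
{
    char c;

    if (sscanf(line, "State: %c", &c) == 1)
        return c;
    return '\0';
}

/* Hypothetical helper: return the state letter of `pid` (for example 'R',
 * 'S' or 'D'), or '\0' if the status file cannot be read or parsed. */
char read_proc_state(long pid)
{
    char path[64];
    char line[256];
    char state = '\0';
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '\0';

    /* Stop at the first line that parses as a State line. */
    while (state == '\0' && fgets(line, sizeof(line), f) != NULL)
        state = state_from_line(line);

    fclose(f);
    return state;
}
```

Calling read_proc_state(27288) in a loop with a short sleep between calls would give a rough picture of whether the process stays in D or changes state now and then.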

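Appending a type suffix to constants in macros

UL is commonly appended to a numerical constant to mark it as an unsigned long. The suffix (UL, or L for long) is necessary because an unsuffixed constant such as 0 has type int, so an expression like ~0 would be computed at int width; the suffix tells the compiler to compute it at long width instead.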

#define LONG_MAX ((long)(~0UL>>1)) 

#define ULONG_MAX (~0UL)  
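To see what the suffix changes, the sketch below repeats the two macros under hypothetical DEMO_ names (so they do not clash with <limits.h>) and adds a variant without the L for comparison:

```c
#include <limits.h>

/* Demo copies of the macros above, renamed to avoid clashing with <limits.h>. */
#define DEMO_LONG_MAX  ((long)(~0UL >> 1))
#define DEMO_ULONG_MAX (~0UL)

/* Same shift without the L: ~0U is only as wide as unsigned int. */
#define DEMO_UINT_SHIFTED (~0U >> 1)

long demo_long_max(void)              { return DEMO_LONG_MAX; }
unsigned long demo_ulong_max(void)    { return DEMO_ULONG_MAX; }
unsigned long demo_uint_shifted(void) { return DEMO_UINT_SHIFTED; }

/* Returns 1 if the demo macros agree with the values from <limits.h>. */
int demo_matches_limits(void)
{
    return DEMO_LONG_MAX == LONG_MAX && DEMO_ULONG_MAX == ULONG_MAX;
}
```

On a typical 64-bit Linux system (LP64), demo_long_max() is 0x7fffffffffffffff while demo_uint_shifted() is only 0x7fffffff, which is why the UL suffix matters in these definitions.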
