Linux Applications Debugging Techniques/Stack corruption
Stack corruption is rather hard to diagnose. Luckily, gcc 4.x can instrument the code to check for stack corruption:
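- -fstack-protector: adds a guard value (canary) to functions that GCC considers vulnerable, such as those that call alloca or contain character buffers, and checks it before the function returns.
- -fstack-protector-all: adds the same check to every function.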
Sysfs or procfs
Procfs is the more generic interface for passing information to user space. All process-related information, along with a number of kernel parameters, is exposed to user space through the procfs file system.
Sysfs is the more modern implementation of the same idea, but it follows the Linux driver model and its hierarchy, and it exposes driver and device information. As Greg KH put it, sysfs is "1 value per file", so it can be used for a full user-level driver implementation.
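The difference in style between the two file systems can be sketched in a few lines of Python. This is only an illustration: the helper names are made up for this example, and the sysfs path /sys/class/net/lo/mtu is an assumption that holds on most Linux systems but not necessarily on yours.

```python
from pathlib import Path


def read_single_value(path):
    """Read a sysfs-style attribute: one file holds exactly one value."""
    return Path(path).read_text().strip()


def parse_key_value_report(text):
    """Parse a procfs-style report made of 'Key: value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    status = Path("/proc/self/status")
    if status.exists():
        report = parse_key_value_report(status.read_text())
        print("procfs:", report.get("Name"), report.get("State"))

    mtu = Path("/sys/class/net/lo/mtu")  # assumed path, may not exist everywhere
    if mtu.exists():
        print("sysfs:", read_single_value(mtu))
```

A procfs file usually needs some parsing, while a sysfs attribute can be consumed as a single string.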
Monitoring specific processes
If you want to monitor the processes with process identifiers (PIDs) 4360 and 4358, type:
$ top -p 4360,4358
$ time top -b -n 1
This runs top in batch mode for a single iteration and shows how long one round takes.
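If you want to collect this timing from a script, the same measurement can be approximated in Python. This is a rough sketch: top_batch_argv and time_command are hypothetical helper names, and unlike the shell's time it reports wall-clock time only, not user and system CPU time.

```python
import subprocess
import time


def top_batch_argv(pids=()):
    """Build the argument list for one batch-mode round of top."""
    argv = ["top", "-b", "-n", "1"]
    if pids:
        # -p takes a comma-separated PID list, as in: top -p 4360,4358
        argv += ["-p", ",".join(str(pid) for pid in pids)]
    return argv


def time_command(argv):
    """Run argv once, discard its output, return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, result.returncode
```

For example, time_command(top_batch_argv([4360, 4358])) times one round of top restricted to those two PIDs.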
Troubleshooting a “slow” process
[root@oel6 ~]# ps -ef | grep find
root 27288 27245 4 11:57 pts/0 00:00:01 find . -type f
root 27334 27315 0 11:57 pts/1 00:00:00 grep find
[root@oel6 ~]# top -cbp 27288
top - 11:58:15 up 7 days, 3:38, 2 users, load average: 1.21, 0.65, 0.47
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.1%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2026460k total, 1935780k used, 90680k free, 64416k buffers
Swap: 4128764k total, 251004k used, 3877760k free, 662280k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27288 root 20 0 109m 1160 844 D 0.0 0.1 0:01.11 find . -type f
Normally, when a process seems to be stuck like that (0% CPU usually means that the process is stuck in some blocking system call, which causes the kernel to put the process to sleep), the next step is to run strace on that process to see which system call it is currently stuck in. Also, if the process is not completely stuck but returns from a system call and wakes up briefly every now and then, that would show up in strace.
[root@oel6 ~]# strace -cp 27288
Process 27288 attached - interrupt to quit
^C
^Z
[1]+ Stopped strace -cp 27288
Oops, the strace command itself got hung too! It didn’t print any output for a long time and didn’t respond to CTRL+C.
Let’s try pstack then (on Linux, pstack is just a shell script wrapper around the GDB debugger). While pstack does not see into kernel-land, it will still give us a clue about which system call was requested (as usually there’s a corresponding libc library call at the top of the displayed userland stack):
[root@oel6 ~]# pstack 27288
^C
^Z
[1]+ Stopped pstack 27288
Pstack also got stuck without returning anything!
[root@oel6 ~]# ps -flp 27288
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
0 D root 27288 27245 0 80 0 - 28070 rpc_wa 11:57 pts/0 00:00:01 find . -type f
[root@oel6 ~]# cat /proc/27288/wchan
rpc_wait_bit_killable
So this process is sleeping in the kernel, waiting for some RPC call to complete.
Let’s figure out whether this process is completely stuck or not. The /proc/PID/status file can tell us that on modern kernels.
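The wchan lookup is easy to script when you need it for many processes. The following is a minimal sketch, assuming the usual /proc/PID/wchan layout; sleeping_in and its proc_root parameter are names invented for this example.

```python
from pathlib import Path


def sleeping_in(pid, proc_root="/proc"):
    """Return the kernel symbol a process is sleeping in, or None.

    Reads /proc/PID/wchan. An empty file or "0" is treated here as
    "not currently blocked in the kernel".
    """
    text = (Path(proc_root) / str(pid) / "wchan").read_text().strip()
    return text if text not in ("", "0") else None
```

Calling sleeping_in(27288) on the system above would be expected to return the same string that cat printed. Reading this file needs no ptrace attach, which is why it still works when strace and pstack hang.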
[root@oel6 ~]# cat /proc/27288/status
Name: find
State: D (disk sleep)
Tgid: 27288
Pid: 27288
PPid: 27245
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups: 0 1 2 3 4 6 10
VmPeak: 112628 kB
VmSize: 112280 kB
VmLck: 0 kB
VmHWM: 1508 kB
VmRSS: 1160 kB
VmData: 260 kB
VmStk: 136 kB
VmExe: 224 kB
VmLib: 2468 kB
VmPTE: 88 kB
VmSwap: 0 kB
Threads: 1
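The State line is the part of this output that matters most here. As a rough sketch, it can be extracted and sampled a few times from Python; the sampling loop is an assumption of this example rather than a step from the walkthrough, and state_from_status and sample_states are hypothetical names.

```python
import time
from pathlib import Path


def state_from_status(status_text):
    """Return the one-letter state code from /proc/PID/status text, e.g. 'D'."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].split()[0]
    raise ValueError("no State: line found")


def sample_states(pid, samples=5, interval=0.2, proc_root="/proc"):
    """Read the state several times and return the list of state codes."""
    path = Path(proc_root) / str(pid) / "status"
    seen = []
    for _ in range(samples):
        seen.append(state_from_status(path.read_text()))
        time.sleep(interval)
    return seen
```

If every sample comes back as D, the process was in uninterruptible sleep each time it was observed. That is only a hint, since a process could have woken up briefly between samples.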