Oracle DBA and beyond: practical tips for day-to-day DBA operation and maintenance, and a place to come looking for a quick fix for a burning situation. I hope that by sharing all of these, we will all become better at what we do. And on the way, I hope to save you some sweat :-)
Thursday, 2 May 2013
Linux: How to diagnose a stuck Oracle server process in Oracle 11g
The example below shows how to diagnose a stuck Oracle server process on Linux.
PID 11264 is an Oracle server process that appears to be stuck.
First, we attach to the process with the Linux "strace" command, which is the counterpart of "tusc" on HP-UX systems:
[box1@TESTDB]/u01/app/oracle/admin/TESTDB/diag/rdbms/camssdb/TESTDB/trace >strace -fp 11264
Process 11264 attached - interrupt to quit
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
getrusage(RUSAGE_SELF, {ru_utime={56325, 616204}, ru_stime={6, 55079}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={56325, 616204}, ru_stime={6, 55079}, ...}) = 0
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
read(13, "\0BC\7\320\0\n\0\0\0\1\0\0\0\0e\1hK\363\367\"\24\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
times({tms_utime=5636503, tms_stime=606, tms_cutime=0, tms_cstime=0}) = 3377342164
times({tms_utime=5636503, tms_stime=606, tms_cutime=0, tms_cstime=0}) = 3377342164
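When the trace is long, the interesting I/O calls are easy to miss among the repeated times() and getrusage() lines. The following is a minimal sketch, not part of the original procedure, that lists the file descriptors appearing in read/write-style calls. The function name strace_io_fds and the set of syscalls it looks for are assumptions chosen for illustration; it expects saved strace output on standard input.

```shell
# Hypothetical helper: print the distinct file descriptors used by
# read/write-style system calls found in strace output on stdin.
# Handles plain lines, "[pid N] " prefixes (strace -f to the terminal)
# and "N " prefixes (strace -f -o file).
strace_io_fds() {
    sed -n -E 's/^(\[pid +[0-9]+\] +|[0-9]+ +)?(read|write|pread64|pwrite64|recvfrom|sendto)\(([0-9]+),.*/\3/p' |
        sort -nu
}
```

One possible way to use it: save the trace with "strace -f -p 11264 -o /tmp/strace.11264.out" (the output path is only an example), interrupt strace after a few seconds, then run "strace_io_fds < /tmp/strace.11264.out". For the trace above this would report descriptor 13.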
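Second, the strace output shows the process reading from file descriptor 13, so we use the Linux lsof command to find out what that descriptor points to: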
[box1@TESTDB]/u01/app/oracle/admin/TESTDB/diag/rdbms/camssdb/TESTDB/trace >/usr/sbin/lsof -p 11264 |grep 13
oracle 11264 oracle cwd DIR 253,9 4096 1062513 /u01/app/oracle/product/11.1.0.7/dbs
oracle 11264 oracle DEL REG 0,13 25100301 /3
oracle 11264 oracle mem REG 253,0 139504 229689 /lib64/ld-2.5.so
oracle 11264 oracle mem REG 253,0 615136 229429 /lib64/libm-2.5.so
oracle 11264 oracle mem REG 253,9 2513705 1579856 /u01/app/oracle/product/11.1.0.7/lib/libhasgen11.so
oracle 11264 oracle mem REG 253,9 13159 1579985 /u01/app/oracle/product/11.1.0.7/lib/libskgxn2.so
oracle 11264 oracle mem REG 253,9 1062133 1579956 /u01/app/oracle/product/11.1.0.7/lib/libocr11.so
oracle 11264 oracle 5r DIR 0,3 0 738197513 /proc/11264/fd
oracle 11264 oracle 8r DIR 0,3 0 738197513 /proc/11264/fd
oracle 11264 oracle 11u REG 253,118 2097160192 7913475 /amssdb_petcamssdb/ora_data00/PAMSSDB/system_CAMSSDB_01.dbf
oracle 11264 oracle 13u IPv4 1506971131 TCP anacaj:ncube-lm->box2.qc.bell.ca:17551 (ESTABLISHED) ---------------------> This is what we are looking for
oracle 11264 oracle 14u REG 253,124 20971528192 13336587 /amssdb_petcamssdb/ora_data06/PAMSSDB/pool_data_CAMSSDB_03.dbf
oracle 11264 oracle 15u REG 253,118 10485768192 7913479 /amssdb_petcamssdb/ora_data00/PAMSSDB/pool_ix_CAMSSDB_01.dbf
oracle 11264 oracle 24u REG 253,124 10485768192 13336585 /amssdb_petcamssdb/ora_data06/PAMSSDB/abp_ix_l2_CAMSSDB_05.dbf
oracle 11264 oracle 29u REG 253,118 15728648192 7913478 /amssdb_petcamssdb/ora_data00/PAMSSDB/pool_data_CAMSSDB_01.dbf
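A plain "grep 13" also matches every line that happens to contain "13" in a size, inode or path, as the output above shows. As a hedged alternative, the sketch below filters lsof output on the FD column only. The function name lsof_fd is made up for this example, and it assumes the default lsof column layout, where FD is the fourth field.

```shell
# Hypothetical helper: from lsof output on stdin, keep only the lines
# whose FD column (4th field, e.g. "13u") refers to the given descriptor.
lsof_fd() {
    awk -v fd="$1" '{
        f = $4
        sub(/[^0-9].*$/, "", f)      # strip the mode/lock suffix: "13u" -> "13"
        if (f != "" && f == fd) print
    }'
}
```

Usage would look like "/usr/sbin/lsof -p 11264 | lsof_fd 13". lsof can also do this selection itself with "lsof -a -p 11264 -d 13", where -d selects descriptors and -a ANDs the two conditions.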
Last step: log in to box2 and look for port 17551, the remote end of the connection, to find the client process that owns it:
/usr/sbin/lsof |grep 17551
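The remote host and port were read off the lsof line by eye. A small sketch like the one below could pull them out of a TCP line automatically. The name peer_endpoint is hypothetical, and the parsing assumes the "local->remote:port" form that lsof prints in its NAME column, with a numeric remote port as in this example.

```shell
# Hypothetical helper: from lsof TCP lines on stdin, print
# "<remote host> <remote port>" for each "local->remote:port" entry.
peer_endpoint() {
    sed -n -E 's/.*->([^ ]+):([0-9]+)( .*)?$/\1 \2/p'
}
```

For instance, "/usr/sbin/lsof -a -p 11264 -d 13 | peer_endpoint" should print the remote host and port for the connection shown earlier. Adding -P to lsof keeps ports numeric. On the remote box, "lsof -i TCP:17551" is a narrower alternative to grepping the full lsof listing.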
On HP-UX, use "tusc" instead of strace, since strace works on Linux only.