Tuesday 3 December 2013

How to choose the log sequence number to use for the RMAN restore/duplicate command?

Connect to the target database with RMAN:

 rman target /

 List the available backups; pick the highest archived log sequence number and add one when you run the restore/duplicate DB command.
 In the example below, the highest backed-up sequence is 15628, so I've used 15628+1 = 15629.


 RMAN>list backup;

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ --------------------
282     302.50K    DISK        00:00:00     03-DEC-2013 13:53:14
        BP Key: 284   Status: AVAILABLE  Compressed: YES  Tag: ENV_52_DB_BKP
        Piece Name: /TESTDB11/ora_data02/RMAN_backup/TESTDB11/TESTDB11_UAT52_TBS_DB_BKUP_13:52:58-12-03-2013_1_287.rman

  List of Archived Logs in backup set 282
  Thrd Seq     Low SCN    Low Time             Next SCN   Next Time
  ---- ------- ---------- -------------------- ---------- ---------
  1    15625   10218583388335 03-DEC-2013 13:35:02 10218583388844 03-DEC-2013 13:45:02
  1    15626   10218583388844 03-DEC-2013 13:45:02 10218583389173 03-DEC-2013 13:53:13
  1    15627   10218583389173 03-DEC-2013 13:53:13 10218583389183 03-DEC-2013 13:53:13

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
283     Full    4.47G      DISK        00:14:43     03-DEC-2013 14:08:01
        BP Key: 285   Status: AVAILABLE  Compressed: YES  Tag: ENV_52_DB_BKP
        Piece Name: /TESTDB11/ora_data02/RMAN_backup/TESTDB11/TESTDB11_UAT52_TBS_DB_BKUP_13:52:58-12-03-2013_1_288.rman
  List of Datafiles in backup set 283
  File LV Type Ckp SCN    Ckp Time             Name
  ---- -- ---- ---------- -------------------- ----
  1       Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/system_TESTDB11_01.dbf
  2       Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/sysaux_TESTDB11_01.dbf
  3       Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/undotbs_TESTDB11_01.dbf
  4       Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/tools_TESTDB11_01.dbf
  9       Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/bmluapp52_TESTDB11_01.dbf
  28      Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/bmluref52_TESTDB11_01.dbf
  47      Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/bmluopr52_TESTDB11_01.dbf
  67      Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/bmluref52f_TESTDB11_01.dbf
  70      Full 10218583389193 03-DEC-2013 13:53:18 /TESTDB11/ora_data00/bmluapp52f_TESTDB11_01.dbf

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
284     Full    1.33M      DISK        00:00:01     03-DEC-2013 14:08:04
        BP Key: 286   Status: AVAILABLE  Compressed: YES  Tag: ENV_52_DB_BKP
        Piece Name: /TESTDB11/ora_data02/RMAN_backup/TESTDB11/TESTDB11_UAT52_TBS_DB_BKUP_13:52:58-12-03-2013_1_289.rman
  Control File Included: Ckp SCN: 10218583390120   Ckp time: 03-DEC-2013 14:08:03

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ --------------------
285     1.38M      DISK        00:00:00     03-DEC-2013 14:08:05
        BP Key: 287   Status: AVAILABLE  Compressed: YES  Tag: ENV_52_DB_BKP
        Piece Name: /TESTDB11/ora_data02/RMAN_backup/TESTDB11/TESTDB11_UAT52_TBS_DB_BKUP_13:52:58-12-03-2013_1_290.rman

  List of Archived Logs in backup set 285
  Thrd Seq     Low SCN    Low Time             Next SCN   Next Time
  ---- ------- ---------- -------------------- ---------- ---------
  1    15628   10218583389183 03-DEC-2013 13:53:13 10218583390129 03-DEC-2013 14:08:05

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
286     Full    1.33M      DISK        00:00:01     03-DEC-2013 14:08:08
        BP Key: 288   Status: AVAILABLE  Compressed: YES  Tag: TAG20131203T140807
        Piece Name: /TESTDB11/ora_data02/RMAN_backup/TESTDB11/ctl/CTLBKP_TESTDB11_13:52:58-12-03-2013.CTL
  Control File Included: Ckp SCN: 10218583390148   Ckp time: 03-DEC-2013 14:08:07

 The duplicate command looks like:

 RUN {
SET NEWNAME FOR DATAFILE 1  TO '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/dbf/system01.dbf';
SET NEWNAME FOR DATAFILE 2  TO '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/dbf/sysaux.dbf';
SET NEWNAME FOR DATAFILE 3  TO '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/dbf/undotbs01.dbf';
SET NEWNAME FOR DATAFILE 4  TO '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/dbf/tools01.dbf';
SET NEWNAME FOR DATAFILE 9  TO '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/dbf/bmluapp52.dbf';
set until sequence 15629 thread 1;
DUPLICATE TARGET DATABASE TO AUXU
TABLESPACE test52,TOOLS
  LOGFILE
    GROUP 1 ('/testdb11/ora_data02/RMAN_backup/auxiliary_dest/redo/redo01a.log',
             '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/redo/redo01b.log') SIZE 200M reuse,
    GROUP 2 ('/testdb11/ora_data02/RMAN_backup/auxiliary_dest/redo/redo02a.log',
             '/testdb11/ora_data02/RMAN_backup/auxiliary_dest/redo/redo02b.log') SIZE 200M reuse;

}
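As an alternative to scanning the LIST BACKUP output, the same number can be derived from the catalog views. Below is a sketch (V$ARCHIVED_LOG and its BACKUP_COUNT column are standard, but validate the result against your own backups):

```sql
-- Highest archived log sequence that has been backed up, plus one,
-- ready to feed into SET UNTIL SEQUENCE ... THREAD ...:
select thread#, max(sequence#) + 1 as until_seq
from v$archived_log
where backup_count > 0
group by thread#;
```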

Wednesday 27 November 2013

Oracle analytical functions to the rescue

 Below is a typical example of how an analytic function speeds up a query that contains a correlated subquery.

Original query:

SELECT
DISTINCT CUSTOMER_ID, SUBSCRIBER_NO, TEST3.TEST3, MODEL_DESC,
                        ALT_MODEL_DESC
                        FROM TEST1, TEST2, TEST3
                        WHERE TEST1.CUSTOMER_ID = 517091995
                        AND TEST1.UNIT_ESN = ESN
                        AND TEST1.ESN_SEQ_NO =
                        (SELECT MAX (ESN_SEQ_NO)
                        FROM TEST1 PD1
                        WHERE PD1.CUSTOMER_ID = TEST1.CUSTOMER_ID
                        AND PD1.SUBSCRIBER_NO = TEST1.SUBSCRIBER_NO
                        AND LENGTH(UNIT_ESN) <> 20 )
                        AND TEST2.TEST3 = TEST3.TEST3
                        AND ( TEST3.EXPIRATION_DATE  IS NULL OR TEST3.EXPIRATION_DATE >= SYSDATE          )

;


The new query, using analytic functions (much faster, since it accesses the TEST1 table only once):

SELECT
DISTINCT CUSTOMER_ID, SUBSCRIBER_NO, TEST3.TEST3, MODEL_DESC,
                        ALT_MODEL_DESC
                        FROM
(select UNIT_ESN,ESN_SEQ_NO,CUSTOMER_ID, SUBSCRIBER_NO, max(ESN_SEQ_NO) over (partition by CUSTOMER_ID,SUBSCRIBER_NO)  as max_ESN_SEQ_NO
from TEST1 where LENGTH(UNIT_ESN) <> 20 ) phd, TEST2, TEST3
                        WHERE phd.CUSTOMER_ID = 517091995
                        and ESN_SEQ_NO=phd.max_ESN_SEQ_NO
                        AND UNIT_ESN = ESN
                        AND TEST2.TEST3 = TEST3.TEST3
                        AND ( TEST3.EXPIRATION_DATE  IS NULL OR TEST3.EXPIRATION_DATE >= SYSDATE)
;
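The rewrite above follows a general pattern: compute the per-group MAX once with an analytic function, then filter on it in an outer query. A minimal sketch, with hypothetical table and column names:

```sql
-- Replace a correlated MAX() subquery with a single-pass analytic MAX():
select *
from (select t.*,
             max(seq_no) over (partition by cust_id, sub_no) as max_seq_no
      from some_table t)
where seq_no = max_seq_no;
```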

Tuesday 26 November 2013

ORA-24005: must use DBMS_AQADM.DROP_QUEUE_TABLE to drop queue tables

I was getting this error while running an RMAN duplicate database in 11g. As I was duplicating only selected tablespaces, Oracle internally tried to drop the unnecessary tablespaces and the whole duplicate process failed as shown below:

 sql statement: drop tablespace  "TEST1" including contents cascade constraints
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 11/26/2013 13:06:34
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of sql command on clone_default channel at 11/26/2013 13:06:34
RMAN-11003: failure during parse/execution of SQL statement: drop tablespace  "TEST1" including contents cascade constraints
ORA-00604: error occurred at recursive SQL level 1
ORA-24005: must use DBMS_AQADM.DROP_QUEUE_TABLE to drop queue tables

RMAN> **end-of-file**


I then tried to drop the tablespace manually and got the same error, even though no queue tables whatsoever reside in this tablespace:

SQL> drop tablespace  "TEST1" including contents cascade constraints;
drop tablespace  "TEST1" including contents cascade constraints
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-24005: must use DBMS_AQADM.DROP_QUEUE_TABLE to drop queue tables



SQL> select * from dba_queue_tables where table_name in
         (select segment_name from dba_segments where tablespace_name='TEST1');

no rows selected

 At this point, my goal was to find out which statement is failing exactly.

 Solution:

SQL> ALTER system SET EVENTS='24005 TRACE NAME ERRORSTACK FOREVER, LEVEL 1';

System altered.


I then searched the alert log for ORA-24005 and, inside the trace file it referenced, found the exact statement failing:


ORA-24005: must use DBMS_AQADM.DROP_QUEUE_TABLE to drop queue tables
----- Current SQL Statement for this session (sql_id=0000000000000) -----
drop table "florin"."DDD#DEF$_AQERROR" cascade constraints purge

 In conclusion, Oracle treats a table whose name contains special characters (here, DDD#DEF$_AQERROR) as a queue table.
 I was indeed able to drop this table using DBMS_AQADM.DROP_QUEUE_TABLE, and then I dropped the tablespace without any issue.
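For completeness, the drop via DBMS_AQADM looks roughly like this (a sketch built from the table name in the trace file; QUEUE_TABLE and FORCE are part of the documented DROP_QUEUE_TABLE API):

```sql
begin
  dbms_aqadm.drop_queue_table(
    queue_table => 'florin.DDD#DEF$_AQERROR',
    force       => true);   -- drop even if the table is not empty
end;
/
```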


Sunday 13 October 2013

How to find out fast the partitioning keys in tables or indexes?

The key word in this question is "fast"; there are many ways to find the partitioning key for a table or an index: using dbms_metadata.get_ddl, export/import with INDEXFILE, or using expdp and impdp, but the fastest way is to query the views below:

DBA_PART_KEY_COLUMNS
DBA_SUBPART_KEY_COLUMNS
USER_PART_KEY_COLUMNS
USER_SUBPART_KEY_COLUMNS
ALL_PART_KEY_COLUMNS
ALL_SUBPART_KEY_COLUMNS


They provide info for subpartitions also.

Example:

SQL> select * from DBA_PART_KEY_COLUMNS
  2  where owner='CUSTO'
  3   and NAME='MY_AUDIT_LOG'
  4  order by COLUMN_POSITION;

OWNER      NAME             OBJEC COLUMN_NAME        COLUMN_POSITION
---------- ---------------- ----- ------------------ ---------------
CUSTO      MY_AUDIT_LOG     TABLE PERIOD_KEY                       1
CUSTO      MY_AUDIT_LOG     TABLE SOURCE_MEMBER_ID                 2
CUSTO      MY_AUDIT_LOG     TABLE SUB_APPL_ID                      3
CUSTO      MY_AUDIT_LOG     TABLE PROCESS_TYPE_IND                 4
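The same views cover partitioned indexes too; the OBJECT_TYPE column distinguishes tables from indexes. A sketch, reusing the example owner from above:

```sql
select owner, name, object_type, column_name, column_position
from dba_part_key_columns
where owner = 'CUSTO'
  and object_type = 'INDEX'
order by name, column_position;
```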


Monday 7 October 2013

Oracle Optimizer and the evergreen need for hints

The Oracle optimizer is a piece of software that tries to generate the best execution plan for a query with the information it has. So in theory, if we give it as much information as we can, it should always generate the best plan. But does it? Not always.
Even if we provide the optimizer with fresh statistics on all the tables involved, there is still a risk that it
will make poor decisions, since it does not know how many rows a join will return and has to estimate the
result. Sometimes, wrongly estimating that a two-table join returns a very small number of rows, and then using that result
as the driving row source of a Nested Loops join, can seriously hinder performance.

 The solution in this particular case is to use the USE_HASH hint, to avoid the nested loop altogether.
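As an illustration only (table and column names are hypothetical), the hint references the alias of the table we want probed via a hash join:

```sql
select /*+ use_hash(b) */ a.id, b.val
from small_driver a, big_table b
where a.id = b.id;
```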

Saturday 5 October 2013

GATHER_PLAN_STATISTICS hint and dbms_xplan



In order to get detailed information about how many rows are returned in each step of the execution plan, the GATHER_PLAN_STATISTICS hint comes in very handy; see the simple example below:

SELECT /*+ GATHER_PLAN_STATISTICS */ *
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    e.ename  = 'FRANK';

 You actually have to run the above query to be able to see the runtime statistics.


SET LINE 200 pages 80

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |        |      1 |00:00:00.01 |       8 |
|   1 |  NESTED LOOPS                |         |      1 |        |      1 |00:00:00.01 |       9 |
|   2 |   NESTED LOOPS               |         |      1 |      1 |      1 |00:00:00.01 |       8 |
|*  3 |    TABLE ACCESS FULL         | EMP     |      1 |      1 |      1 |00:00:00.01 |       6 |
|*  4 |    INDEX UNIQUE SCAN         | PK_DEPT |      1 |      1 |      1 |00:00:00.01 |       1 |
|   5 |   TABLE ACCESS BY INDEX ROWID| DEPT    |      1 |      1 |      1 |00:00:00.01 |       1 |
--------------------------------------------------------------------------------------------------

The output above shows the estimated rows returned (E-Rows), the actual rows returned (A-Rows) and the actual time (A-Time) spent during each step of the execution.

Thursday 12 September 2013

tar: Error exit delayed from previous errors

Another Linux related post: I was trying to tar up the oracle software and I was getting a strange message at the end of the tar output:


>tar -cvf 11.2.0.tar 11.2.0

11.2.0/
11.2.0/timingframework/
11.2.0/timingframework/TimingFramework.jar
11.2.0/timingframework/README
11.2.0/OPatch/
11.2.0/OPatch/opatchprereqs/
11.2.0/OPatch/opatchprereqs/prerequisite.properties
11.2.0/OPatch/opatchprereqs/opatch/
11.2.0/OPatch/opatchprereqs/opatch/runtime_prereq.xml
-----
-----

tar: Error exit delayed from previous errors

The challenge here is to find out the root cause of this general error message.

Solution:

Run the tar command again, while redirecting the error output to a log file:

tar -cvf 11.2.0.tar 11.2.0  2> tar.log

In my case, the content of the log file solved the "mystery":

tar: 11.2.0/bin/nmhs: Cannot open: Permission denied
tar: 11.2.0/bin/nmb: Cannot open: Permission denied
tar: 11.2.0/bin/nmo: Cannot open: Permission denied
tar: Error exit delayed from previous errors
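The same redirection can be reproduced with any path tar cannot read. A self-contained sketch (the paths are examples), showing that stderr lands in the log while the verbose file list still goes to stdout:

```shell
# Redirect only stderr (file descriptor 2) to a log file:
tar -cvf /tmp/demo.tar /nonexistent_dir_demo 2> /tmp/tar_err.log || true
cat /tmp/tar_err.log    # the log names exactly what tar could not read
```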





Sunday 8 September 2013

Loading an Excel file using sqlldr and "corrupted" French characters

The issue: we load an Excel file saved as CSV, 1.csv, into the database using the sqlldr utility, and when querying the newly inserted data, the French (accented) characters appear corrupted.

The control file used is as follows:

LOAD DATA
CHARACTERSET UTF8  ---------> solution provided by the developer :-)
infile '1.csv'
APPEND
INTO TABLE  my_table
fields terminated by "," optionally enclosed by '"'
TRAILING NULLCOLS
(empno,alert_end_Date date "mm/dd/yyyy",alert_text char(500000),alert_title,emp_no)


Solution: use CHARACTERSET WE8ISO8859P1 inside the control file, or just remove the characterset entry.

LOAD DATA
CHARACTERSET WE8ISO8859P1  ---------> this has to match the database character set, see note below
infile '1.csv'
APPEND
INTO TABLE  my_table
fields terminated by "," optionally enclosed by '"'
TRAILING NULLCOLS
(empno,alert_end_Date date "mm/dd/yyyy",alert_text char(500000),alert_title,emp_no)


 Note: to find out the character set and NLS settings of the DB:


 select DECODE(parameter, 'NLS_CHARACTERSET', 'CHARACTER SET',
'NLS_LANGUAGE', 'LANGUAGE',
'NLS_TERRITORY', 'TERRITORY') name,
value from v$nls_parameters
WHERE parameter IN ( 'NLS_CHARACTERSET', 'NLS_LANGUAGE', 'NLS_TERRITORY')
/

NAME          VALUE
------------- ------------
LANGUAGE      AMERICAN
TERRITORY     AMERICA
CHARACTER SET WE8ISO8859P1


Thursday 8 August 2013

Foreign keys and "on delete cascade" for a few delete statements only

Let's assume that we have two tables in a parent-child relationship, but the foreign key was created without "on delete cascade"; we want to run a few delete statements on the parent table that should cascade to the child table, and after completion we want to restore the initial setup.

One way to do it is to delete from the child table first and then from the parent table, but this may pose a challenge: we only have delete conditions for the parent table.

The solution is to re-create the existing foreign key with "on delete cascade", run all the delete statements, and then re-create the FK once more, this time exactly as it was before.

Below is a small example:


SQL> create table parent (
  2  col1_parent number not null);

Table created.


SQL> alter table parent add constraint parent_pk primary key (col1_parent);

Table altered.


SQL> insert into parent values(1);

1 row created.

SQL> insert into parent values(2);

1 row created.

SQL> insert into parent values(3);

1 row created.


SQL> create table child (
  2   col1_child number,
  3  FOREIGN KEY (col1_child) REFERENCES parent(col1_parent) );

Table created.


 * To delete 1 row from the parent table, we first must delete from the child:


SQL> delete from parent where COL1_PARENT=1;
delete from parent where COL1_PARENT=1
*
ERROR at line 1:
ORA-02292: integrity constraint (FLORINM.SYS_C001881207) violated - child
record found


SQL> delete from child where col1_child=1;

1 row deleted.

SQL> delete from parent where COL1_PARENT=1;

1 row deleted.


 * Assume we'd like to delete a few rows from parent, that we'll be cascaded to the child, as a one time thing only:


SQL> insert into child values(2);

1 row created.

SQL> insert into child values(3);

1 row created.

SQL> insert into parent values(10);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from parent;

COL1_PARENT
-----------
          2
          3
         10

SQL> select * from child;

COL1_CHILD
----------
         2
         3



SQL> select table_name,constraint_name,constraint_type from user_constraints
  2  where table_name in ('PARENT','CHILD');

TABLE_NAME                     CONSTRAINT_NAME                C
------------------------------ ------------------------------ -
CHILD                          SYS_C001881207                 R
PARENT                         SYS_C001881205                 C
PARENT                         PARENT_PK                      P




SQL> alter table CHILD add constraint CHILD_FK foreign key (col1_child) references parent(col1_parent) on delete cascade;

Table altered.


SQL> select * from parent;

COL1_PARENT
-----------
          2
          3
         10

SQL> select * from child;

COL1_CHILD
----------
         2
         3

SQL> delete parent;

3 rows deleted.

SQL> select * from child;

no rows selected


SQL> alter table CHILD drop constraint CHILD_FK;

Table altered.

SQL> alter table CHILD add constraint CHILD_FK foreign key (col1_child) references parent(col1_parent);

Table altered.



Friday 2 August 2013

How to check and modify the AWR settings?

I've decided to write this post, since every time I need to check/modify the AWR snapshot settings, I never remember the exact views and procedures to use.
So here they are:

1) Getting the current AWR snapshot settings:

SQL> select * from dba_hist_wr_control;

      DBID SNAP_INTERVAL                                                               RETENTION                                                                   TOPNSQL
---------- --------------------------------------------------------------------------- --------------------------------------------------------------------------- ----------
2560013153 +00000 00:10:00.0                                                           +00010 00:00:00.0                                                           DEFAULT


What this means is that snapshots are taken every 10 minutes and the retention time is 10 days.


2) Modifying the snapshot interval to every 15 minutes and the retention time to 31 days:

SQL> execute dbms_workload_repository.modify_snapshot_settings( interval => 15,retention => 44640);


 Both interval and retention are specified in minutes (31 days x 24 x 60 = 44640 minutes).


SQL> select * from dba_hist_wr_control;

      DBID SNAP_INTERVAL                                                               RETENTION                                                                   TOPNSQL
---------- --------------------------------------------------------------------------- --------------------------------------------------------------------------- ----------
2560013153 +00000 00:15:00.0                                                           +00031 00:00:00.0                                                           DEFAULT
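If you want a data point immediately instead of waiting for the next interval, a manual snapshot can be taken with the same package (CREATE_SNAPSHOT is part of the documented DBMS_WORKLOAD_REPOSITORY API):

```sql
SQL> execute dbms_workload_repository.create_snapshot;
```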

Tuesday 9 July 2013

Oracle 11g: datapump with option COMPRESSION=ALL

Test case: we have a relatively big table, around 60 GB of uncompressed data; the table is compressed and takes only 4 GB. We need to copy this data to a different database, as fast as possible, and without taking too much disk space.

Solution:  datapump to the rescue :-)
While the legacy exp/imp utilities do not handle compressed tables well, expdp with COMPRESSION=ALL exports the table while producing a compressed dump file. Sample of the parfile used for the export:

userid=mydbuser/passwd
dumpfile=mytab_extract.dmp
logfile=mytab_extract.exp.log
parallel=16
tables=mytab
directory=big_dmp
compression=ALL


Conclusion: expdp handles compressed tables very well, generating a dump file similar in size to the compressed table itself.
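For the import side, a matching parfile would look roughly like this (a sketch; only the log file name and TABLE_EXISTS_ACTION differ from the export parfile above):

```
userid=mydbuser/passwd
dumpfile=mytab_extract.dmp
logfile=mytab_extract.imp.log
directory=big_dmp
tables=mytab
table_exists_action=replace
```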

Wednesday 26 June 2013

How to remove duplicate rows from a table

Sometimes, the application inserts "duplicate" rows into a table; let's assume uniqueness is supposed to be ensured by column1, column2 and column3.
If the number of duplicate rows is reasonably small, and we don't really care which row we keep out of each set of duplicates, we can use the delete below to eliminate them.
If the number of duplicates is very big, we should avoid the delete and use "create table as select" instead, to keep only the desired rows.


DELETE FROM dup_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM dup_table
GROUP BY column1, column2, column3);
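For the large-volume case mentioned above, the CTAS variant can be sketched like this (names are illustrative; keep one row per key with ROW_NUMBER, then drop and rename):

```sql
create table dup_table_clean as
select column1, column2, column3 /* plus the remaining columns */
from (select t.*,
             row_number() over (partition by column1, column2, column3
                                order by rowid) as rn
      from dup_table t)
where rn = 1;
-- then: drop table dup_table; rename dup_table_clean to dup_table;
```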

Monday 17 June 2013

How to trace remote oracle connections, on the client side?

In order to trace Oracle remote connections on the client side, we first need to add a few entries to the sqlnet.ora file; for example, to trace at the highest level, as typically requested by Oracle Support, we need level 16:

 box1:/u01/app/oracle/client_trace> cat /etc/sqlnet.ora
TRACE_LEVEL_CLIENT=16
TRACE_DIRECTORY_CLIENT=/u01/app/oracle/client_trace
LOG_DIRECTORY_CLIENT=/u01/app/oracle/client_trace
TRACE_TIMESTAMP_CLIENT=ON
DIAG_ADR_ENABLED=OFF

The last line, disabling the ADR diagnostics, is required starting with 11g.

In addition, there are two more steps to follow, to make sure that the trace/log file will be generated as expected:

1) Oracle looks for sqlnet.ora in $ORACLE_HOME/network/admin by default, so either update the file in that location or set TNS_ADMIN in the environment where the connection to the DB is made.
2) Make sure the destination directory for the trace/log files has read/write permission for the owner of the process you want to trace; otherwise Oracle will silently ignore your request and simply won't generate the trace file.
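Both steps can be scripted; the sketch below (the directory under /tmp is an example) creates a private sqlnet.ora and points the client at it via TNS_ADMIN:

```shell
mkdir -p /tmp/client_trace
cat > /tmp/client_trace/sqlnet.ora <<'EOF'
TRACE_LEVEL_CLIENT=16
TRACE_DIRECTORY_CLIENT=/tmp/client_trace
LOG_DIRECTORY_CLIENT=/tmp/client_trace
TRACE_TIMESTAMP_CLIENT=ON
DIAG_ADR_ENABLED=OFF
EOF
export TNS_ADMIN=/tmp/client_trace   # clients started from this shell will read it
```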


Friday 7 June 2013

DBA_HIST_ACTIVE_SESS_HISTORY and locking history information

The AWR historical DBA_HIST* views are a real treasure trove of information. This time, we are interested in historical information related to locking. The question is: who was blocking my session last night, within a given timeframe?
dba_hist_active_sess_history to the rescue!

First step, we create our table as follows, to include only the timeframe we are interested in:

SQL> create table my_hist_active_sess_history nologging parallel (degree 4)
     as select *
        from dba_hist_active_sess_history
        where snap_id in
              (select snap_id from dba_hist_snapshot
               where begin_interval_time >  to_timestamp('06-jun-13 10:00:00','dd-mon-yy hh24:mi:ss')
                 and begin_interval_time <= to_timestamp('06-jun-13 11:00:00','dd-mon-yy hh24:mi:ss'));

The second step is to query this table for the information we need; in my case, I'm looking for sessions that were blocked during this time:

SQL>
select event,sum(time_waited)/1000000 s,session_id,module,BLOCKING_SESSION
from my_hist_active_sess_history
where EVENT like 'enq%'
group by event,session_id,module,BLOCKING_SESSION
order by 2;

This gives us the blocking session, together with the wait event and the time waited.


Thursday 30 May 2013

RMAN-06059: expected archived log not found

RMAN hot backup fails with the following error message:

RMAN>backup full database plus archivelog delete input;


RMAN-03002: failure of backup command at 05/30/2013 14:57:23
RMAN-06059: expected archived log not found, loss of archived log compromises recoverability
ORA-19625: error identifying file /TNXAMAES/RMAN/arc/RMAN1_1_816783266.arc
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

This happened because I had manually deleted this file from the host, but the RMAN metadata still had information about it. The correct thing to do is to follow up such a delete with the RMAN command below:

RMAN> delete expired archivelog all;

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=266 device type=DISK
List of Archived Log Copies for database with db_unique_name RMAN
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
1       1    1       X 30-MAY-13
        Name: /TNXAMAES/RMAN/arc/RMAN1_1_816783266.arc

2       1    2       X 30-MAY-13
        Name: /TNXAMAES/RMAN/arc/RMAN1_2_816783266.arc

3       1    3       X 30-MAY-13
        Name: /TNXAMAES/RMAN/arc/RMAN1_3_816783266.arc


Do you really want to delete the above objects (enter YES or NO)? YES
deleted archived log
archived log file name=/TNXAMAES/RMAN/arc/RMAN1_1_816783266.arc RECID=1 STAMP=816788064
deleted archived log
archived log file name=/TNXAMAES/RMAN/arc/RMAN1_2_816783266.arc RECID=2 STAMP=816788065
deleted archived log
archived log file name=/TNXAMAES/RMAN/arc/RMAN1_3_816783266.arc RECID=3 STAMP=816788068
Deleted 3 EXPIRED objects

 You can double-check that RMAN is now up to date:

RMAN> crosscheck archivelog all;

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=266 device type=DISK
validation succeeded for archived log
archived log file name=/TNXAMAES/RMAN/fra/RMAN1_4_816783266.arc RECID=4 STAMP=816788384
validation succeeded for archived log
archived log file name=/TNXAMAES/RMAN/fra/RMAN1_5_816783266.arc RECID=5 STAMP=816788385
validation succeeded for archived log
archived log file name=/TNXAMAES/RMAN/fra/RMAN1_6_816783266.arc RECID=6 STAMP=816788388
validation succeeded for archived log
archived log file name=/TNXAMAES/RMAN/fra/RMAN1_7_816783266.arc RECID=7 STAMP=816788388
validation succeeded for archived log
archived log file name=/TNXAMAES/RMAN/fra/RMAN/archivelog/2013_05_30/o1_mf_1_8_8th6dcmt_.arc RECID=8 STAMP=816791115
validation succeeded for archived log
archived log file name=/TNXAMAES/RMAN/fra/RMAN/archivelog/2013_05_30/o1_mf_1_9_8th88mbv_.arc RECID=9 STAMP=816793043
Crosschecked 6 objects

or

RMAN> list archivelog all;



Wednesday 22 May 2013

Example of "for" loop in ksh and echo for special characters

Let's assume that we have a set of scripts in the current directory, all with a .sql suffix, and we want to add 2 lines at the beginning of each such script.

We'll use the "for" loop in ksh and we'll also use the escape character "\" to print the special character: "


>for f in *.sql
> do
> echo "set appinfo on" >> 1
> echo "select sys_context('USERENV', 'MODULE')  as \"Running Script\" from dual;" >> 1
> cat $f >> 1
> mv -f 1 $f
> done

How to print the name of the SQL script running, while running it?

Let's assume that, for debugging purposes, we need to print the name of the SQL script we are running, while running it, meaning that we will need this info in the spool file.

What we need to do is to add 2 lines at the beginning of the script:


set appinfo on
select sys_context('USERENV', 'MODULE')  as "Running Script" from dual;


Example:


$>cat /tmp/1.sql

set appinfo on
select sys_context('USERENV', 'MODULE')  as "Running Script" from dual;


select * from dual;


$>sqlplus /

SQL*Plus: Release 11.1.0.7.0 - Production on Wed May 22 15:40:48 2013

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> @/tmp/1

Running Script
--------------------------------------------------------------------------------
01@ /tmp/1.sql


D
-
X





Thursday 16 May 2013

How to trace with binds a query which runs in seconds?

The challenge here is that we have an application which connects to the database, runs a very fast query (a matter of seconds) and then disconnects. How are we going not only to trace these queries, but also to catch the bind variables used?

The answer is, we'll use a logon trigger and "alter session set events".
Please see an example of such a trigger below:




SQL> get cre_logon_trigger.sql
  1  -- To be run as sys ***** --
  2  create or replace trigger apptest_logon after logon on database
  3  begin
  4  if user='APPTEST'
  5  then
  6  execute immediate 'alter session set tracefile_identifier = ''apptest''';
  7  ----execute immediate 'alter session set sql_trace=true';
  8  execute immediate 'alter session set events ''10046 trace name context forever, level 4''';
  9  end if;
 10* end;




A few remarks:

Line #4: we restrict tracing to sessions of the DB user APPTEST.
Line #6: the tracefile_identifier tags the generated trace files so they are easy to find.
Line #7: commented out; it is the simpler alternative when a plain trace file with no binds is enough.
Line #8: we use level 4 for the 10046 event, which captures bind variables.

Tuesday 14 May 2013

Oracle SQL Query Tuning Examples, Part 2

This is the 2nd post in a series, which started with:

 Oracle SQL Query Tuning Part 1


This time, we'll look into a query which does 2 nested loops, and we'll see that by adding a parallel hint on the driving table of the first nested loop executed, the query is sped up considerably.

The unexpected part here is that, by their nature, Nested Loops joins execute serially: the driving table is accessed, and for each row that satisfies the predicates we probe the inner table for matches. By adding a parallel hint on the driving table, even if it is a small table, we still gain performance, because the nested loop itself runs in parallel.

The example below proves this assumption:



 SELECT /*+ parallel(test2,4) */
test3.ban,test3.ban_status,test3.account_type,test3.account_sub_type,
test3.bill_cycle,test1.link_type,test2.*
FROM test1, test2, test3
WHERE test1.address_id = test2.address_id
AND NVL(test1.expiration_date,'31-dec-4700')>SYSDATE
AND test1.id = test3.id
/


593 rows selected.

Elapsed: 00:00:06.17

Execution Plan
----------------------------------------------------------
Plan hash value: 2057958791

---------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                  | Rows  | Bytes | Cost  | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                       |  7223 |  3491K|  3096 |       |       |        |      |            |
|   1 |  PX COORDINATOR                        |                       |       |       |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)                  | :TQ10000              |  7223 |  3491K|  3096 |       |       |  Q1,00 | P->S | QC (RAND)  |
|   3 |    NESTED LOOPS                        |                       |  7223 |  3491K|  3096 |       |       |  Q1,00 | PCWP |            |
|   4 |     NESTED LOOPS                       |                       |  7223 |  3378K|  2962 |       |       |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR                 |                       |       |       |       |       |       |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL                | test2                 | 31937 |    13M|     4 |       |       |  Q1,00 | PCWP |            |
|*  7 |      TABLE ACCESS BY GLOBAL INDEX ROWID| test1                 |     1 |    21 |     1 | ROWID | ROWID |  Q1,00 | PCWP |            |
|*  8 |       INDEX RANGE SCAN                 | test1_7IX             |     5 |       |     1 |       |       |  Q1,00 | PCWP |            |
|   9 |     PARTITION RANGE ITERATOR           |                       |     1 |    16 |     1 |   KEY |   KEY |  Q1,00 | PCWP |            |
|  10 |      TABLE ACCESS BY LOCAL INDEX ROWID | test3                 |     1 |    16 |     1 |   KEY |   KEY |  Q1,00 | PCWP |            |
|* 11 |       INDEX UNIQUE SCAN                | test3_PK              |     1 |       |     1 |   KEY |   KEY |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------------



Statistics
----------------------------------------------------------
         18  recursive calls
          0  db block gets
     225299  consistent gets
      43033  physical reads
          0  redo size
      38215  bytes sent via SQL*Net to client
        917  bytes received via SQL*Net from client
         41  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        593  rows processed



 Without parallel:


SELECT
test3.ban,test3.ban_status,test3.account_type,test3.account_sub_type,
test3.bill_cycle,test1.link_type,test2.*
FROM test1, test2, test3
WHERE test1.address_id = test2.address_id
AND NVL(test1.expiration_date,'31-dec-4700')>SYSDATE
AND test1.id = test3.id;


593 rows selected.

Elapsed: 00:02:17.19

Execution Plan
----------------------------------------------------------
Plan hash value: 2057958791

---------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                  | Rows  | Bytes | Cost  | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                       |  7223 |  3491K|  9284 |       |       |        |      |            |
|   1 |  PX COORDINATOR                        |                       |       |       |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)                  | :TQ10000              |  7223 |  3491K|  9284 |       |       |  Q1,00 | P->S | QC (RAND)  |
|   3 |    NESTED LOOPS                        |                       |  7223 |  3491K|  9284 |       |       |  Q1,00 | PCWP |            |
|   4 |     NESTED LOOPS                       |                       |  7223 |  3378K|  8882 |       |       |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR                 |                       |       |       |       |       |       |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL                | test2                 | 31937 |    13M|    10 |       |       |  Q1,00 | PCWP |            |
|*  7 |      TABLE ACCESS BY GLOBAL INDEX ROWID| test1                 |     1 |    21 |     1 | ROWID | ROWID |  Q1,00 | PCWP |            |
|*  8 |       INDEX RANGE SCAN                 | test1_7IX             |     5 |       |     1 |       |       |  Q1,00 | PCWP |            |
|   9 |     PARTITION RANGE ITERATOR           |                       |     1 |    16 |     1 |   KEY |   KEY |  Q1,00 | PCWP |            |
|  10 |      TABLE ACCESS BY LOCAL INDEX ROWID | test3                 |     1 |    16 |     1 |   KEY |   KEY |  Q1,00 | PCWP |            |
|* 11 |       INDEX UNIQUE SCAN                | test3_PK              |     1 |       |     1 |   KEY |   KEY |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------------



Statistics
----------------------------------------------------------
         27  recursive calls
          3  db block gets
     225016  consistent gets
      43006  physical reads
        728  redo size
      37747  bytes sent via SQL*Net to client
        917  bytes received via SQL*Net from client
         41  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        593  rows processed



A nested loops join that uses an index to join two tables can be fully parallelized, provided that the driving table is accessed by a table scan.

Friday 10 May 2013

Oracle SQL Query Tuning Examples, Part 1

This is the first post in a series of interesting real-life SQL tuning cases. These are, hopefully, more interesting than the usual tuning steps: looking for full table scans, missing indexes, missing statistics, or an obviously wrong execution plan.

The first example queries table1, which is about 100 GB in size and has an index on sys_creation_date.



select * from TABLE1 cs
where actv_code = 'MCN'
and trunc(sys_creation_date)=to_date('20130331','YYYYMMDD')
/


Execution Plan
----------------------------------------------------------
Plan hash value: 1027680822

------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name             | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                  |   101K|    12M| 35815 |        |      |            |
|   1 |  PX COORDINATOR      |                  |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000         |   101K|    12M| 35815 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |                  |   101K|    12M| 35815 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| TABLE1           |   101K|    12M| 35815 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------

The optimizer goes with a full table scan, since we apply a function to sys_creation_date and we don't have a function-based index defined on trunc(sys_creation_date).

The interesting thing here is that, since we are looking for rows whose date equals March 31st 2013, the trunc can be replaced by a range predicate on the bare column, and the optimizer will then choose the index.


  SQL> l
  1  select * from TABLE1 cs
  2  where actv_code = 'MCN'
  3* and sys_creation_date >= to_date('20130331','YYYYMMDD') and sys_creation_date < to_date('20130401','YYYYMMDD')
SQL> /

Execution Plan
----------------------------------------------------------
Plan hash value: 3230845712

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |     1 |   132 |     3 |
|*  1 |  TABLE ACCESS BY INDEX ROWID | TABLE1               |     1 |   132 |     3 |
|*  2 |   INDEX RANGE SCAN           | TABLE1_1IX           |    15 |       |     1 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ACTV_CODE"='MCN')
   2 - access("SYS_CREATION_DATE">=TO_DATE(' 2013-03-31 00:00:00',
              'syyyy-mm-dd hh24:mi:ss') AND "SYS_CREATION_DATE"<TO_DATE(' 2013-04-01
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note
-----
   - cpu costing is off (consider enabling it)


 Note that we cannot use the BETWEEN operator, since it is inclusive and we don't want rows from April 1st.
 So, by rewriting the query, we've managed to avoid an expensive full table scan and to take advantage of the existing index; the two queries are logically identical.



Tuesday 7 May 2013

Why is Oracle reading "much more" data than needed?

Let's take a simple scenario: we query a table, big_table, with a full table scan, and we know the table has about 2 million blocks. However, the query reads much more data than these 2 million blocks. What is the reason? The simple answer is that Oracle is also reading data from the undo tablespace, to ensure read consistency: it is possible that big_table was heavily modified and the changes were not committed yet.

One way to validate this assumption is to run the query below, where 161 is the SID of the session doing the full table scan on big_table:


SQL> select NAME from v$datafile where FILE# in ( select p1 from v$session_wait where sid=161);

NAME
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/testdb_ora/ora_data01/undo01.dbf


We can see from the output that we are indeed reading data from the undo tablespace.

"truncate table" and oracle redo logs


Is the "truncate" statement written to the Oracle redo logs? Yes, but just like any other DDL command: only the statement itself, "truncate table <table_name>", is recorded, not the rows it removes.

Let's look at an example below, where I've used the logmnr oracle utility to search through the redo logs:

First we create a table t and then truncate it:


SQL> create table t as select * from obj;

Table created.


SQL> truncate table t;

Table truncated.

Now we're going to switch the logfile and then initialize the Logmnr:

SQL> alter system switch logfile;

System altered.

SQL> execute DBMS_LOGMNR.ADD_LOGFILE('/pete_pbmlol/redoB1/redo_pbmlol_B1.dbf',options => dbms_logmnr.new);

PL/SQL procedure successfully completed.

SQL> execute DBMS_LOGMNR.START_LOGMNR(options => -
                                dbms_logmnr.dict_from_online_catalog);
>
PL/SQL procedure successfully completed.

We create a table based on the logmnr results, for convenience:

SQL> create table mycontents nologging parallel(degree 8)
  2  tablespace tools
  3   as select * from v$logmnr_contents;

Table created.

The moment of truth: we query sql_redo for statements against our table:


SQL> select sql_redo from mycontents where TABLE_NAME='T';

SQL_REDO
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
create table t as select * from obj;
truncate table t;
Unsupported


 And there is no undo statement, as expected :-)


SQL> select sql_undo  from mycontents where TABLE_NAME='T';

SQL_UNDO
--------------------------------------------------------------------------------


Unsupported

Thursday 2 May 2013

Linux: How to diagnose a stuck Oracle server process, in Oracle 11g



The example below shows how to diagnose a stuck Oracle server process, on Linux.
PID 11264 is the Oracle server process which is getting stuck.

First, we use the Linux "strace" command, the counterpart of "tusc" on HP-UX systems:


[box1@TESTDB]/u01/app/oracle/admin/TESTDB/diag/rdbms/camssdb/TESTDB/trace >strace -fp 11264
Process 11264 attached - interrupt to quit
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
times({tms_utime=5630562, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377336221
getrusage(RUSAGE_SELF, {ru_utime={56325, 616204}, ru_stime={6, 55079}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={56325, 616204}, ru_stime={6, 55079}, ...}) = 0
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
times({tms_utime=5632561, tms_stime=605, tms_cutime=0, tms_cstime=0}) = 3377338220
read(13, "\0BC\7\320\0\n\0\0\0\1\0\0\0\0e\1hK\363\367\"\24\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
times({tms_utime=5636503, tms_stime=606, tms_cutime=0, tms_cstime=0}) = 3377342164
times({tms_utime=5636503, tms_stime=606, tms_cutime=0, tms_cstime=0}) = 3377342164

Second, we use lsof Linux command:


[box1@TESTDB]/u01/app/oracle/admin/TESTDB/diag/rdbms/camssdb/TESTDB/trace >/usr/sbin/lsof -p 11264 |grep 13
oracle  11264 oracle  cwd    DIR      253,9        4096   1062513 /u01/app/oracle/product/11.1.0.7/dbs
oracle  11264 oracle  DEL    REG       0,13              25100301 /3
oracle  11264 oracle  mem    REG      253,0      139504    229689 /lib64/ld-2.5.so
oracle  11264 oracle  mem    REG      253,0      615136    229429 /lib64/libm-2.5.so
oracle  11264 oracle  mem    REG      253,9     2513705   1579856 /u01/app/oracle/product/11.1.0.7/lib/libhasgen11.so
oracle  11264 oracle  mem    REG      253,9       13159   1579985 /u01/app/oracle/product/11.1.0.7/lib/libskgxn2.so
oracle  11264 oracle  mem    REG      253,9     1062133   1579956 /u01/app/oracle/product/11.1.0.7/lib/libocr11.so
oracle  11264 oracle    5r   DIR        0,3           0 738197513 /proc/11264/fd
oracle  11264 oracle    8r   DIR        0,3           0 738197513 /proc/11264/fd
oracle  11264 oracle   11u   REG    253,118  2097160192   7913475 /amssdb_petcamssdb/ora_data00/PAMSSDB/system_CAMSSDB_01.dbf
oracle  11264 oracle   13u  IPv4 1506971131                   TCP anacaj:ncube-lm->box2.qc.bell.ca:17551 (ESTABLISHED)   ---------------------> This is what we are looking for
oracle  11264 oracle   14u   REG    253,124 20971528192  13336587 /amssdb_petcamssdb/ora_data06/PAMSSDB/pool_data_CAMSSDB_03.dbf
oracle  11264 oracle   15u   REG    253,118 10485768192   7913479 /amssdb_petcamssdb/ora_data00/PAMSSDB/pool_ix_CAMSSDB_01.dbf
oracle  11264 oracle   24u   REG    253,124 10485768192  13336585 /amssdb_petcamssdb/ora_data06/PAMSSDB/abp_ix_l2_CAMSSDB_05.dbf
oracle  11264 oracle   29u   REG    253,118 15728648192   7913478 /amssdb_petcamssdb/ora_data00/PAMSSDB/pool_data_CAMSSDB_01.dbf


 The last step is to log in to box2 and look for port 17551:

/usr/sbin/lsof |grep 17551

Tuesday 30 April 2013

"opatch lsinventory" and "opatch apply" errors and easy fix

This is an annoying error message that we sometimes receive when trying to check which patches are applied, or even when trying to apply a DB patch. The output below is for version 11.1:


[mybox@TESTDB]/u01/app/oracle/product/11.1.0.7/OPatch >./opatch lsinventory
Invoking OPatch 11.1.0.6.2

Oracle Interim Patch Installer version 11.1.0.6.2
Copyright (c) 2007, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.1.0.7
Central Inventory : n/a
   from           :
OPatch version    : 11.1.0.6.2
OUI version       : 11.1.0.7.0
OUI location      : /u01/app/oracle/product/11.1.0.7/oui
Log file location : n/a

OPatch cannot find a valid oraInst.loc file to locate Central Inventory.

OPatch failed with error code 104


We get a similar error when trying to apply the patch:

[mybox@TESTDB]/u01/app/oracle/product/9393222 >/u01/app/oracle/product/11.1.0.7/OPatch/opatch apply
Invoking OPatch 11.1.0.6.2

Oracle Interim Patch Installer version 11.1.0.6.2
Copyright (c) 2007, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.1.0.7
Central Inventory : n/a
   from           :
OPatch version    : 11.1.0.6.2
OUI version       : 11.1.0.7.0
OUI location      : /u01/app/oracle/product/11.1.0.7/oui
Log file location : n/a

OPatch cannot find a valid oraInst.loc file to locate Central Inventory.

OPatch failed with error code 104

In both cases, the fix is very easy: point OPatch explicitly at the oraInst.loc file with -invPtrLoc:

[mybox@TESTDB]/u01/app/oracle/product/11.1.0.7/OPatch >./opatch lsinventory -invPtrLoc /u01/app/oracle/oraInventory/oraInst.loc


 The same fix works for the similar error when applying the patch:

[anacaj@CAMSSDB]/u01/app/oracle/product/9393222 >/u01/app/oracle/product/11.1.0.7/OPatch/opatch apply -invPtrLoc /u01/app/oracle/oraInventory/oraInst.loc


Thursday 25 April 2013

Did you know you can retrieve this type of info from within the Oracle DB server?

This is a short post showing how to retrieve some useful data from within the database server.

The first example lets you verify the OS platform name:


SQL> select PLATFORM_NAME from v$database;

PLATFORM_NAME
--------------------------------------------------------------------------------
Linux x86 64-bit


This example gives you the current time in any time zone in the world:

SQL> SELECT systimestamp AT TIME ZONE 'Europe/Bucharest' FROM DUAL;

SYSTIMESTAMPATTIMEZONE'EUROPE/BUCHAREST'
---------------------------------------------------------------------------
14-MAR-13 06.06.31.573548 PM EUROPE/BUCHAREST

Wednesday 24 April 2013

Tracing oracle session to identify errors in query

Sometimes you need to trace an Oracle session in order to identify a failing query.
After you enable the trace, most often using the package dbms_monitor.session_trace_enable, you'll see something similar to the extract below from the generated raw trace file (before running tkprof):



PARSING IN CURSOR #128 len=249 dep=0 uid=548 oct=6 lid=548 tim=1366217870145327 hv=4176214642 ad='c9f69fae0' sqlid='bs24wq7wfrymk'
update TEST1  set SYS_CREATION_DATE=SYS_CREATION_DATE,SYS_UPDATE_DATE=SYSDATE,OPERATOR_ID=:b0:b1,APPLICATION_ID=:b2:b3,DL_SERVICE_CODE=:b4,DL_UPDATE_STAMP=:b5:b6,AUTO_GEN_PYM_TYPE=:b7,BL_BILL_METH_DATE=TO_DATE(:b8,'YYYYMMDD') where BAN=:b9
END OF STMT
PARSE #128:c=0,e=28,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1739222879,tim=1366217870145325
EXEC #128:c=0,e=128,p=0,cr=3,cu=1,mis=0,r=0,dep=0,og=1,plh=1739222879,tim=1366217870145508
ERROR #128:err=1841 tim=1366217870145530

So now we have the failing DB statement and the error number.
All that is left is to run "oerr", to identify the error:

>oerr ora 1841

01841, 00000, "(full) year must be between -4713 and +9999, and not be 0"
// *Cause: Illegal year entered
// *Action: Input year in the specified range


Friday 19 April 2013

How to validate oracle redo/archived logs? ( Oracle 11g R1)

Recently we learned that one of our archived redo logs was corrupted: a third-party tool (CDC) to which we ship the logs could not use it.
The question was: is there a way to validate an archived redo log?

The answer is yes, and the method is:


SQL> alter system dump logfile  '/u01/app/oracle/Dmp/TESTDB_1799337628_4642.arc';
alter system dump logfile  '/u01/app/oracle/Dmp/TESTDB_1799337628_4642.arc'
*
ERROR at line 1:
ORA-00368: checksum error in redo log block
ORA-00353: log corruption near block 1822722 change 10154296873806 time
04/17/2013 03:14:22
ORA-00334: archived log: '/u01/app/oracle/Dmp/TESTDB_1799337628_4642.arc'


This command will generate a detailed trace file in the user dump destination.

We can even generate a smaller trace file, with the error inside, by running the same command on a block range:


SQL> alter system dump logfile  '/u01/app/oracle/Dmp/TESTDB_1799337628_4642.arc' dba min 4642 1822700 dba max 4642 1822800;
alter system dump logfile  '/u01/app/oracle/Dmp/TESTDB_1799337628_4642.arc' dba min 4642 1822700 dba max 4642 1822800
*
ERROR at line 1:
ORA-00368: checksum error in redo log block
ORA-00353: log corruption near block 1820688 change 10154296880830 time
04/17/2013 03:14:24
ORA-00334: archived log: '/u01/app/oracle/Dmp/TESTDB_1799337628_4642.arc'

 More details in My Oracle Support, note ID 1031381.6

Tuesday 9 April 2013

Startup fails with ORA-27154: post/wait create failed

The other day I was getting a strange message trying to bring up one of our Oracle 11g databases; on the same box, we have about 25 instances:


SQL> startup
ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates


The error messages in this case are misleading: there was no disk space issue. Many articles point in the direction of an issue with the number of UNIX semaphores on the system.
In my case, decreasing the PROCESSES initialization parameter from 800 down to 200 solved the issue and the database came up.
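On Linux, the semaphore limits that ORA-27301 is really complaining about (status 28 is ENOSPC) can be inspected directly; a quick sketch, assuming a Linux box (the kernel.sem tunable is Linux-specific):

```shell
# The four values are SEMMSL SEMMNS SEMOPM SEMMNI.
# Each instance allocates semaphore sets sized from PROCESSES, so with
# ~25 instances on one box the semaphore limits are easy to exhaust.
cat /proc/sys/kernel/sem
```

Raising kernel.sem via sysctl is the alternative to lowering PROCESSES, when all instances genuinely need their configured process count.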

Friday 5 April 2013

Oracle 11g: How to disable the default scheduler maintenance window for good?

Some shops choose to cancel the default scheduler maintenance window and prefer to manage statistics manually, thereby having more control; besides, they don't want the machine overloaded at unpredictable times.

So they run:

SQL>ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = '' SCOPE=BOTH;

Surprisingly, the alert log shows that the maintenance window keeps being activated; see the excerpt below:

Setting Resource Manager plan SCHEDULER[0x2C44]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter


Solution:

SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'FORCE:' scope=both;

After this, the maintenance plan default window is gone, for good.

Tuesday 2 April 2013

How to print the current line and the next line at once, using "awk"?

A bit of what I call "awk magic" :-)

I have a simple text file, like this:


$cat my_file.txt

1
2
3
4
5
6

 My desired output is like this:

1 2
3 4
5 6

Solution:


awk '{print $0 p; p=" "$0}' my_file.txt|awk -F" " '{print $2"  "$1}'
  1
1  2
2  3
3  4
4  5
5  6



We need to get rid of the 1st line; the 2nd awk was needed to reverse the order of the two elements returned by the first. Note that this pipeline actually prints every consecutive pair of lines (1 2, 2 3, 3 4, ...), so the overlapping pairs would still have to be filtered out if strictly non-overlapping pairs are needed.

This may be helpful sometimes, when you have a list of numbers and need to prepare a script which takes ranges as parameters.
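If only the non-overlapping pairs (1 2, 3 4, 5 6) are wanted, a single awk can buffer the odd-numbered lines and print on the even ones; a minimal sketch against the same my_file.txt:

```shell
printf '%s\n' 1 2 3 4 5 6 > my_file.txt

# Odd-numbered lines are saved in p; even-numbered lines print the pair.
awk 'NR % 2 { p = $0; next } { print p, $0 }' my_file.txt
# 1 2
# 3 4
# 5 6
```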




Monday 1 April 2013

How to trace "ORA-00942: table or view does not exist"

It sometimes happens that either a third-party application or a developer script/application fails with:
ORA-00942: table or view does not exist
The question is: how do we identify the "missing" table, without enabling some intensive tracing?
The answer is by setting an event on error 942, as in the example below:



SQL> ALTER system SET EVENTS='00942 TRACE NAME ERRORSTACK FOREVER, LEVEL 1';

System altered.

The next step is to ask for the application to be re-run, and then to identify the error and the trace file name in the alert.log.

After the problematic statement is identified, it is a good idea to disable the event:

SQL> ALTER system SET EVENTS='00942 TRACE NAME ERRORSTACK OFF';

System altered.

This is just one example of an error; the method works for many other error messages.




UNIX "find" command: how to avoid "permission denied" error messages?

Sometimes, when we use the UNIX "find" command to locate a file by name, we get a lot of "permission denied" messages, and the output is very hard to read. This is especially true when we search starting from "/", meaning we look through all the directories on the box.

A quick fix is to redirect these messages to /dev/null, giving a clean and readable output, as below:


[host1]/ >find . -type f -name "my_file"  2>/dev/null
./usr/florin/Test_Dir/my_file
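The matches and the complaints travel on separate streams, so discarding stderr loses none of the results; a tiny sketch of the distinction (the echoed strings are made up for the demo):

```shell
# stdout carries the matches, stderr carries the "Permission denied" noise;
# 2>/dev/null drops only the latter.
{ echo "./Test_Dir/my_file"; echo "find: /root: Permission denied" >&2; } 2>/dev/null
# prints only: ./Test_Dir/my_file
```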

Sunday 31 March 2013

How to recover after mistakenly deleting a database file on Linux?

This can be a life saver; let's say that, by mistake, one of the DBAs (not you, of course) deletes a datafile. You are under pressure to bring back, as soon as possible, the tablespace which contained that datafile.
Luckily, on Linux you can recover a deleted file as long as some process is still attached to it; in our case, the database writer (DBWn) still has the file open. The procedure is quite simple, but first of all don't panic, and keep the DB up, otherwise the file is gone for good.

Below is a real test which I've done on one of our databases, together with some embedded comments:



Create a tablespace, and a table in it, dedicated to the test:

SQL> create tablespace test1 datafile '/MYDB/oradata/ora_data00/MYDB_test1_01.dbf' size 200M;

Tablespace created.

SQL> l
  1* create table my_test tablespace test1 as select * from all_objects
        where rownum < 101;

Table created.

SQL> select count(*) from my_test;

  COUNT(*)
----------
       100



SQL> l
  1  select FILE_NAME from dba_data_files
  2* where TABLESPACE_NAME='TEST1'
SQL> /

Datafile Name
----------------------------------------
/MYDB/oradata/ora_data00/MYDB_test1_01
.dbf

Deleting the data file:

rm -f  /MYDB/oradata/ora_data00/MYDB_test1_01.dbf

Identify the PIDs of the DBWn processes and find the fd (file descriptor) pointing to our deleted file:


[host1@BMLUS]/u01/app/oracle >ps -fu oracle|grep dbw

oracle   22343     1  0 17:28 ?        00:00:00 ora_dbw0_MYDB
oracle   22347     1  0 17:28 ?        00:00:00 ora_dbw1_MYDB
oracle   22351     1  0 17:28 ?        00:00:00 ora_dbw2_MYDB
oracle   22355     1  0 17:28 ?        00:00:00 ora_dbw3_MYDB



[host1@MYDB]/proc >ll 10139/fd|grep del
lrwx------ 1 oracle dba 64 Mar  4 01:30 9 -> /u01/app/oracle/product/11.1.0/dbs/lkinstMYDB (deleted)
l-wx------ 1 oracle dba 64 Mar  4 01:30 12 -> /u01/app/oracle/admin/MYDB/diag/rdbms/MYDB/MYDB/trace/MYDB_ora_10104.trm (deleted)
l-wx------ 1 oracle dba 64 Mar  4 01:30 11 -> /u01/app/oracle/admin/MYDB/diag/rdbms/MYDB/MYDB/trace/MYDB_ora_10104.trc (deleted)
lrwx------ 1 oracle dba 64 Mar 31 17:16 32 -> /MYDB/oradata/ora_data00/MYDB_test1_01.dbf (deleted)

Now trying to select from the table my_test:


SQL> select count(*) from my_test;

  COUNT(*)
----------
       100


SQL> alter system flush buffer_cache;

System altered.

SQL>  select count(*) from my_test;
 select count(*) from my_test
                      *
ERROR at line 1:
ORA-01116: error in opening database file 9
ORA-01110: data file 9: '/MYDB/oradata/ora_data00/MYDB_test1_01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

The first bit of magic is to create a symbolic link, as below, pointing to the file descriptor:


[host1@MYDB]/proc >ln -s /proc/10139/fd/32   /MYDB/oradata/ora_data00/MYDB_test1_01.dbf

SQL> select count(*) from my_test;

  COUNT(*)
----------
       100

Make the tablespace read only:

SQL> alter tablespace test1 read only;

Tablespace altered.

Copy the file to a different name:

cp /MYDB/oradata/ora_data00/MYDB_test1_01.dbf /MYDB/oradata/ora_data00/MYDB_test1_01_recovered.dbf

Shutdown abort:

SQL> shutdown abort
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Delete the symbolic link that we've created and rename the data file to its original name:

[host1@MYDB]/proc >ll /MYDB/oradata/ora_data00/MYDB_test1_01.dbf
lrwxrwxrwx 1 oracle dba 17 Mar 31 17:19 /MYDB/oradata/ora_data00/MYDB_test1_01.dbf -> /proc/10139/fd/32
[host1@MYDB]/proc >rm /MYDB/oradata/ora_data00/MYDB_test1_01.dbf
[host1@MYDB]/proc >mv /MYDB/oradata/ora_data00/MYDB_test1_01_recovered.dbf /MYDB/oradata/ora_data00/MYDB_test1_01.dbf

Start up the DB and make the tablespace read write:

SQL> startup
ORACLE instance started.

Total System Global Area 2689212416 bytes
Fixed Size                  2162640 bytes
Variable Size             570425392 bytes
Database Buffers         2097152000 bytes
Redo Buffers               19472384 bytes
Database mounted.
Database opened.
SQL> alter tablespace test1 read write;

Tablespace altered.


SQL> conn /
Connected.
SQL> select count(*) from my_test;

  COUNT(*)
----------
       100

We are done, back to life, in one piece :-)
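The same /proc trick can be rehearsed safely with an ordinary file and a shell file descriptor; a minimal sketch with made-up file names (on Linux, the inode survives as long as one descriptor stays open):

```shell
echo "important data" > /tmp/demo_datafile   # stand-in for the datafile
exec 3</tmp/demo_datafile                    # keep fd 3 open, like DBWn does
rm /tmp/demo_datafile                        # directory entry gone, inode not
cp /proc/$$/fd/3 /tmp/demo_datafile          # recover through the open fd
exec 3<&-                                    # close the descriptor
cat /tmp/demo_datafile
# important data
```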

Thursday 28 March 2013

What was the password used to create db link?

You have a database link in one of your Oracle 11g database accounts and you want to create the exact same db link in a different account. The only issue is that the user password is blank in user_db_links, for security reasons. So how do you find out the password and "duplicate" this db link?
In reality, you don't even need to know the password: you can query sys.link$ and use "identified by values" when creating the db link:


SQL> conn / as sysdba
Connected.
SQL> select NAME,HOST,userid,PASSWORDX
from sys.link$ where name='TEST1';


SQL> /

NAME                 HOST       USERID                         PASSWORDX
-------------------- ---------- ------------------------------ --------------------------------------------------------------------------------
TEST1                MYDB     FLORIN                        05177093C45ABC2976294C0780B2B5873E3CE6D80F94D9A314

So all that is left is to connect and create the db link:


SQL> conn florin2/florin2
Connected.
SQL> create database link test1 connect to florin identified by values '05177093C45ABC2976294C0780B2B5873E3CE6D80F94D9A314' using 'MYDB';


Database link created.


SQL> select * from dual@test1;

D
-
X





Removing '\par' from text file, on Linux

You've just transferred a text file from Windows to Linux or HP-UX, only to realize it has some strange characters inside. Most of the time, "dos2ux" will do the job, but not in this case: the '\par' markers originate from a rich-text (RTF) file on Windows.
In this case, the sed Unix utility comes to the rescue:



sed 's#\\par$##' My_Windows_file.txt
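A quick self-contained check of the command, building a sample file first (GNU sed also accepts -i to edit the file in place):

```shell
# Two lines ending in the literal RTF marker \par
printf 'first line\\par\nsecond line\\par\n' > My_Windows_file.txt

sed 's#\\par$##' My_Windows_file.txt
# first line
# second line
```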

For more details, you may visit the link below: