Thursday, December 31, 2009

Paper about grouping

In 2009 I presented "Alles Over Groeperen" twice in the Netherlands, and the English version "All About Grouping" twice as well: once at the Oracle OpenWorld Unconference and once at UKOUG. If you haven't been able to attend one of these presentations and you are interested in the subject, you can catch up by reading the paper that I just finished.

The usual sequence of events is to first produce a paper and then present it at conferences. While writing the paper, I discovered one or two new things. For example, I gained a more precise understanding of how a cube is implemented. Unfortunately, I did not include that in any of the presentations. So next time, I'll make sure to write the paper first and present it afterwards. A nice lesson at the end of 2009.

Enjoy reading. If you have any remarks about the contents of the paper, please leave a comment here or send an e-mail.

And I wish all readers a healthy 2010 and the wisdom to pursue your dreams.

Wednesday, December 16, 2009

Journaling using flashback data archives in 11.2.0.1?

In version 11.1.0.6, Oracle introduced flashback data archives under the marketing term Total Recall. Previously you could only flash back a table to an earlier point in time as long as the needed undo information was available. With flashback data archives you can extend this period to whatever you specify: you create a flashback data archive, specify a retention period and associate the archive with a table. Version 11.1.0.6 contained a nasty bug, as I described in this post. In version 11.1.0.7, Oracle fixed this bug, and you could safely use flashback data archives for journaling, as described here. However, in 11.1.0.7 you could not modify the schema of a table with a flashback data archive associated, for example to add a constraint. In the mentioned post I described how to overcome that, but it was cumbersome.

And then came 11.2.0.1. The new features guide mentions:

1.5.1.1 Flashback Data Archive Support for DDLs

Oracle Database 11g Release 2 (11.2) users can now use most DDL commands on tables that are being tracked with Flashback Data Archive. This includes:

* Add, Drop, Rename, Modify Column
* Drop, Truncate Partition
* Rename, Truncate Table
* Add, Drop, Rename, Modify Constraint

For more complex DDL (for example, upgrades and split table), the Disassociate and Associate PL/SQL procedures can be used to temporarily disable Total Recall on specified tables. The Associate procedure enforces schema integrity after association; the base table and history table schemas must be the same.

This feature makes it much easier to use the Total Recall option with complex applications that require the ability to modify the schema.


So now it seems the last restriction is removed and we can use this technology for journaling. Let's test how it works in 11.2.0.1, using a setup similar to the one in my first two blog posts about this subject.
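As an aside, the Disassociate and Associate procedures that the new features guide mentions live in the DBMS_FLASHBACK_ARCHIVE package. For complex DDL the workflow would look something like this sketch (owner and table names are those of the test setup below):

```sql
-- Temporarily disable history tracking to allow complex DDL
begin
  dbms_flashback_archive.disassociate_fba(owner_name => 'RWIJK', table_name => 'T');
end;
/

-- ... perform the complex DDL on table T here ...

-- Re-enable tracking; Oracle verifies that the base table and
-- history table schemas still match
begin
  dbms_flashback_archive.reassociate_fba(owner_name => 'RWIJK', table_name => 'T');
end;
/
```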

The tablespace already exists:

rwijk@ORA11GR2> create tablespace my_tablespace datafile 'extra_file.dat' size 10M
2 /
create tablespace my_tablespace datafile 'extra_file.dat' size 10M
*
ERROR at line 1:
ORA-01543: tablespace 'MY_TABLESPACE' already exists

Create the flashback archive, a foreign key table and a table T with the flashback data archive associated:

rwijk@ORA11GR2> create flashback archive flashback_archive_10_years
2 tablespace my_tablespace
3 retention 10 year
4 /

Flashback archive created.

rwijk@ORA11GR2> create table fktable (col number(10) primary key)
2 /

Table created.

rwijk@ORA11GR2> create table t
2 ( pkcol number(10) primary key
3 , fkcol number(10) not null references fktable(col)
4 , description varchar2(11)
5 )
6 flashback archive flashback_archive_10_years
7 /

Table created.

The flashback data archive table is created:

rwijk@ORA11GR2> select archive_table_name
2 from dba_flashback_archive_tables
3 where table_name = 'T'
4 /

ARCHIVE_TABLE_NAME
-----------------------------------------------------
SYS_FBA_HIST_75925

1 row selected.

rwijk@ORA11GR2> exec dbms_lock.sleep(15)

PL/SQL procedure successfully completed.


Add a row to the foreign key table and switch on auditing to capture the user. This is needed because Flashback Version Query provides all journaling information except the user who changed the data. Switching on auditing records much more than just the username (just do a describe of sys.aud$), but at least it gives the username without additional coding.

rwijk@ORA11GR2> insert into fktable (col) values (1)
2 /

1 row created.

rwijk@ORA11GR2> audit insert,update,delete on t by access
2 /

Audit succeeded.

And create the view that represents the journaling table:

rwijk@ORA11GR2> create view v
2 as
3 select tv.versions_operation jn_operation
4 , ao.username jn_user
5 , tv.versions_starttime jn_datetime
6 , tv.versions_startscn jn_scn
7 , tv.pkcol
8 , tv.fkcol
9 , tv.description
10 from t versions between scn minvalue and maxvalue tv
11 , user_audit_object ao
12 where tv.versions_xid = ao.transactionid (+)
13 /

View created.

Now add some data to see if it works:

rwijk@ORA11GR2> insert into t
2 select level
3 , 1
4 , lpad('*',11,'*')
5 from dual
6 connect by level <= 10
7 /

10 rows created.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> pause

rwijk@ORA11GR2> update t
2 set description = 'a'
3 where pkcol = 1
4 /

1 row updated.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> pause

rwijk@ORA11GR2> delete t
2 where pkcol = 7
3 /

1 row deleted.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> pause

rwijk@ORA11GR2> select * from v order by jn_scn
2 /

J JN_USER JN_DATETIME JN_SCN PKCOL FKCOL DESCRIPTION
- ---------- -------------------------------- ---------- ----- ----- -----------
I RWIJK 16-DEC-09 03.02.16 PM 5032930 7 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 2 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 10 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 9 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 8 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 1 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 6 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 5 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 4 1 ***********
I RWIJK 16-DEC-09 03.02.16 PM 5032930 3 1 ***********
U RWIJK 16-DEC-09 03.02.34 PM 5032938 1 1 a
D RWIJK 16-DEC-09 03.02.34 PM 5032941 7 1 ***********

12 rows selected.

Everything is there: 10 inserted rows, 1 updated row and 1 deleted row.

Now, add an extra column with a check constraint on it. Adding a column was possible in 11.1.0.x as well, but adding the constraint was not. Now it is.

rwijk@ORA11GR2> alter table t add (status varchar2(3))
2 /

Table altered.

rwijk@ORA11GR2> alter table t add constraint ck_status check (status in ('NEW','OLD'))
2 /

Table altered.

Note though that adding a constraint to a history-tracked table takes considerably more time than adding one to a regular table. On my laptop it consistently takes approximately 10 seconds.

Now adjust the journaling view to have it contain the extra column:

rwijk@ORA11GR2> create or replace view v
2 as
3 select tv.versions_operation jn_operation
4 , ao.username jn_user
5 , tv.versions_starttime jn_datetime
6 , tv.versions_startscn jn_scn
7 , tv.pkcol
8 , tv.fkcol
9 , tv.description
10 , tv.status
11 from t versions between scn minvalue and maxvalue tv
12 , user_audit_object ao
13 where tv.versions_xid = ao.transactionid (+)
14 /

View created.

And select from the new view:

rwijk@ORA11GR2> select * from v order by jn_scn
2 /

J JN_USER JN_DATETIME JN_SCN PKCOL FKCOL DESCRIPTION STATUS
- ---------- -------------------------------- ---------- ----- ----- ----------- ------
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 1 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 7 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 4 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 2 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 3 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 10 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 5 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 6 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 8 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 9 1 ***********
U RWIJK 16-DEC-09 03.19.18.000000000 PM 5034217 1 1 a

11 rows selected.

11 rows? Before adding the column and constraint, there were 12! The last one - the delete - has disappeared. To find out whether it is the last row or the delete that vanishes, I add another column with a constraint, and do a delete followed by an insert. Will it remove the last row (the insert) or the delete?

rwijk@ORA11GR2> delete t where pkcol in (3,5)
2 /

2 rows deleted.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> insert into t values ( 11, 1, 'bla', 'OLD' )
2 /

1 row created.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> alter table t add (status2 varchar2(3))
2 /

Table altered.

rwijk@ORA11GR2> alter table t add constraint ck_status2 check (status2 in ('NEW','OLD'))
2 /

Table altered.

rwijk@ORA11GR2> pause

rwijk@ORA11GR2> create or replace view v
2 as
3 select tv.versions_operation jn_operation
4 , ao.username jn_user
5 , tv.versions_starttime jn_datetime
6 , tv.versions_startscn jn_scn
7 , tv.pkcol
8 , tv.fkcol
9 , tv.description
10 , tv.status
11 , tv.status2
12 from t versions between scn minvalue and maxvalue tv
13 , user_audit_object ao
14 where tv.versions_xid = ao.transactionid (+)
15 /

View created.

rwijk@ORA11GR2> select * from v order by jn_scn
2 /

J JN_USER JN_DATETIME JN_SCN PKCOL FKCOL DESCRIPTION STATUS STA
- ---------- -------------------------------- ---------- ----- ----- ----------- ------ ---
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 1 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 7 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 3 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 5 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 2 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 9 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 10 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 4 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 6 1 ***********
I RWIJK 16-DEC-09 03.19.15.000000000 PM 5034213 8 1 ***********
U RWIJK 16-DEC-09 03.19.18.000000000 PM 5034217 1 1 a
I RWIJK 16-DEC-09 03.19.53.000000000 PM 5034390 11 1 bla OLD

12 rows selected.

So it is the delete that is automatically removed.

Finally I want to know whether it was the addition of the column or the addition of the constraint that made the delete statement disappear. Since this behaviour was not there in 11.1.0.7, where you could already add a column, I was tempted to believe it is the addition of the constraint. To know for sure, I repeated the above piece of code without adding the constraint:

rwijk@ORA11GR2> select * from v order by jn_scn
2 /

J JN_USER JN_DATETIME JN_SCN PKCOL FKCOL DESCRIPTION STATUS
- ---------- -------------------------------- ---------- ----- ----- ----------- ------
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 6 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 1 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 7 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 2 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 10 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 9 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 8 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 5 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 4 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 3 1 ***********
U RWIJK 16-DEC-09 03.44.04.000000000 PM 5035535 1 1 a
D RWIJK 16-DEC-09 03.44.41.000000000 PM 5035710 3 1 ***********
D RWIJK 16-DEC-09 03.44.41.000000000 PM 5035710 5 1 ***********
I RWIJK 16-DEC-09 03.44.41.000000000 PM 5035713 11 1 bla OLD

14 rows selected.

rwijk@ORA11GR2> pause

rwijk@ORA11GR2> alter table t add (status2 varchar2(3))
2 /

Table altered.

rwijk@ORA11GR2> --alter table t add constraint ck_status2 check (status2 in ('NEW','OLD'))
rwijk@ORA11GR2> --/
rwijk@ORA11GR2> pause

rwijk@ORA11GR2> create or replace view v
2 as
3 select tv.versions_operation jn_operation
4 , ao.username jn_user
5 , tv.versions_starttime jn_datetime
6 , tv.versions_startscn jn_scn
7 , tv.pkcol
8 , tv.fkcol
9 , tv.description
10 , tv.status
11 , tv.status2
12 from t versions between scn minvalue and maxvalue tv
13 , user_audit_object ao
14 where tv.versions_xid = ao.transactionid (+)
15 /

View created.

rwijk@ORA11GR2> select * from v order by jn_scn
2 /

J JN_USER JN_DATETIME JN_SCN PKCOL FKCOL DESCRIPTION STATUS STA
- ---------- -------------------------------- ---------- ----- ----- ----------- ------ ---
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 1 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 7 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 3 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 5 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 2 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 9 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 10 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 4 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 6 1 ***********
I RWIJK 16-DEC-09 03.43.29.000000000 PM 5035521 8 1 ***********
U RWIJK 16-DEC-09 03.44.04.000000000 PM 5035535 1 1 a
I RWIJK 16-DEC-09 03.44.41.000000000 PM 5035713 11 1 bla OLD

12 rows selected.

So it is the addition of a column that made the delete disappear, and not the addition of the constraint.

The conclusion is that flashback data archives still cannot be used for journaling. But since the above looks like a rather obvious bug, a patch set will probably fix it in the near future. And when that happens, this technology finally looks promising for journaling.

Wednesday, December 9, 2009

SYS_GUID()

A colleague of mine sent me this link, and wanted me to have a look at point 5:

Choose sys_guid() instead of sequences for populating meaningless PK-columns.
Traditional sequences are used for populating an ID-column.
Sequences have several disadvantages:
• Sequences have to be created, maintained and deployed.
• Sequences need PL/SQL-code to be fetched (the ‘select into’ construction is obsolete in 11g, but defaulting a column with sequence.nextval still does not work).
• When you introduce triggers to fetch the .nextval value, you introduce significant undesired performance penalties
• When moving your data from one environment to another (e.g. copy PRODUCTION data to ACCEPTANCE), you have to synchronize the .LASTVAL of all sequences.
• Fetching sequences can be a performance-critical in big environment (bulk fetch in Data warehouse for instance) which has to be fixed by increasing the CACHE property
Use guid's to overcome the disadvantages of sequences:
create table employees
( guid raw(32) default sys_guid() not null primary key);

I knew about the SYS_GUID() function, but didn't realize you can use it as a default value for a primary key column. That sure is convenient. Where other RDBMSs all have some sort of autonumber/identity column to automatically fill an ID column, Oracle uses sequences, which are a little more cumbersome to implement. However, I found the story not really balanced. Before addressing each point and mentioning some of the disadvantages of SYS_GUID, let's see how it works. Using sequences, a simple setup can look like this:

rwijk@ORA11GR1> create sequence seq cache 1000
2 /

Sequence created.

Here I create the sequence with "cache 1000", because I want a session to fetch 1000 numbers at once instead of one at a time. This reduces the overhead for Oracle of maintaining the SYS.SEQ$ table. Gaps in the numbers might be large this way, but that shouldn't bother you. If it does bother you, you likely shouldn't be using a sequence, as sequences are never gap-free. A table is created without any reference to the sequence:

rwijk@ORA11GR1> create table t_seq
2 ( id number(9) primary key
3 , filler varchar2(1000)
4 )
5 /

Table created.

To fill the id column with the sequence value, I prefer to just use the nextval function directly in my API. Another option is to use database triggers, but I'm on a mission to avoid those whenever possible. The next procedure represents such an API:

rwijk@ORA11GR1> create procedure p_seq
2 is
3 begin
4 insert into t_seq (id,filler)
5 select seq.nextval
6 , lpad('*',1000,'*')
7 from dual
8 connect by level <= 20000
9 ;
10 end;
11 /

Procedure created.


Using the SYS_GUID() function, a similar setup would look like this:

rwijk@ORA11GR1> create table t_sysguid
2 ( id raw(16) default sys_guid() primary key
3 , filler varchar2(1000)
4 )
5 /

Table created.

rwijk@ORA11GR1> create procedure p_sysguid
2 is
3 begin
4 insert into t_sysguid (filler)
5 select lpad('*',1000,'*')
6 from dual
7 connect by level <= 20000
8 ;
9 end;
10 /

Procedure created.


One object less (the sequence), and you don't need to bother with the ID column in the insert statement. The mentioned disadvantages of sequences, per point:

Sequences have to be created, maintained and deployed
While obviously true, I don't think it is a major point. A sequence never comes alone, so we are already creating, maintaining and deploying database objects. But it does add up a little.


Sequences need PL/SQL-code to be fetched (the ‘select into’ construction is obsolete in 11g, but defaulting a column with sequence.nextval still does not work)
You don't necessarily need PL/SQL to use a sequence, as shown above, so this point is untrue. It is true, though, that you cannot default a column with sequence.nextval.
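To illustrate that no PL/SQL is needed, a sequence can be referenced directly in a plain SQL statement; a minimal sketch against the t_seq table from the setup above:

```sql
-- No PL/SQL involved: nextval is referenced directly in the insert
insert into t_seq (id, filler)
values (seq.nextval, 'some filler text')
/
```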


When you introduce triggers to fetch the .nextval value, you introduce significant undesired performance penalties
So don't introduce database triggers to fetch the .nextval value :-)


When moving your data from one environment to another (e.g. copy PRODUCTION data to ACCEPTANCE), you have to synchronize the .LASTVAL of all sequences
This is true, but a very minor disadvantage in my opinion. If we refresh the acceptance database, we don't only refresh the data, but also the objects themselves.


Fetching sequences can be performance-critical in big environments (bulk fetches in a data warehouse, for instance), which has to be fixed by increasing the CACHE property.
The solution to the possible problem is already presented here.
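Indeed, increasing the cache afterwards does not require recreating the sequence; a minimal sketch (the cache size is just an example value):

```sql
-- Raise the number of values cached per update of SYS.SEQ$
alter sequence seq cache 10000
/
```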


No advantages of sequences over SYS_GUID are mentioned, so here are some of mine:

A RAW(16) column takes up more space than a NUMBER column.
The SYS_GUID is a RAW(16) taking up 16 bytes, whereas an ID populated by a sequence typically is a NUMBER(6), NUMBER(9) or NUMBER(12). A reasonable average is NUMBER(9), which typically takes up only around 4 bytes. When you have foreign key columns in your tables, this effect is multiplied.
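You can verify the storage difference yourself with the VSIZE function; a quick sketch (the exact byte count for the number depends on its value):

```sql
-- Compare the stored size of a GUID with that of a typical sequence value
select vsize(sys_guid()) as guid_bytes    -- 16 bytes
     , vsize(123456)     as number_bytes  -- typically 4 bytes for a 6-digit id
from   dual
/
```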


Ad-hoc querying has become more cumbersome.
Which one would you rather type? This:
select id from t_sysguid where id = '48000DA3C76E41A789C7E8925A91E28A'
Or this:
select id from t_seq where id = 1234


You can no longer see the order of inserts by the ID column.
When using the cache property and multiple sessions, this is not necessarily true for sequence-populated ID columns either, but there is a strong correlation between the number and the time it was generated. This "information" is not there with SYS_GUID() populated columns.


Performance seems to be slightly worse when using SYS_GUID()
The results of my test varied a lot, but generally they favoured the sequence approach. Here is the output of one of the more representative runs:

rwijk@ORA11GR1> begin
2 runstats_pkg.rs_start;
3 p_seq;
4 runstats_pkg.rs_middle;
5 p_sysguid;
6 runstats_pkg.rs_stop(1000);
7 end;
8 /
Run1 ran in 1670 hsecs
Run2 ran in 2220 hsecs
Run1 ran in 75.23% of the time

Name Run1 Run2 Difference
STAT.recursive calls 1,162 49 -1,113
LATCH.enqueue hash chains 1,626 508 -1,118
STAT.hot buffers moved to head of LRU 1,213 75 -1,138
STAT.dirty buffers inspected 5,613 3,266 -2,347
STAT.free buffer requested 6,215 3,482 -2,733
STAT.free buffer inspected 6,678 3,577 -3,101
STAT.redo subscn max counts 4,147 753 -3,394
STAT.heap block compress 3,678 7,119 3,441
LATCH.cache buffer handles 4,922 0 -4,922
STAT.Heap Segment Array Inserts 12,512 7,156 -5,356
STAT.HSC Heap Segment Block Changes 12,582 7,156 -5,426
STAT.calls to get snapshot scn: kcmgss 6,752 152 -6,600
LATCH.checkpoint queue latch 17,879 9,818 -8,061
LATCH.cache buffers lru chain 27,793 13,845 -13,948
LATCH.object queue header operation 40,308 25,913 -14,395
STAT.consistent gets 26,810 7,302 -19,508
STAT.consistent gets from cache 26,810 7,302 -19,508
STAT.consistent gets from cache (fastpath) 19,687 145 -19,542
STAT.redo entries 42,552 71,492 28,940
STAT.db block gets from cache (fastpath) 21,080 57,202 36,122
STAT.session logical reads 98,920 144,900 45,980
LATCH.simulator hash latch 5,752 56,028 50,276
STAT.db block gets from cache 72,110 137,598 65,488
STAT.db block gets 72,110 137,598 65,488
STAT.db block changes 63,379 129,357 65,978
LATCH.sequence cache 150,100 0 -150,100
LATCH.cache buffers chains 396,737 572,383 175,646
STAT.undo change vector size 2,564,000 4,921,872 2,357,872
STAT.physical read total bytes 17,776,640 23,355,392 5,578,752
STAT.physical read bytes 17,776,640 23,355,392 5,578,752
STAT.cell physical IO interconnect bytes 17,776,640 23,355,392 5,578,752
STAT.physical IO disk bytes 17,776,640 23,355,392 5,578,752
STAT.redo size 59,361,376 66,097,504 6,736,128

Run1 latches total versus run2 -- difference and percentage
Run1 Run2 Difference Pct
679,069 707,672 28,603 95.96%

PL/SQL procedure successfully completed.


So while I am certainly not saying no to SYS_GUID(), I thought there was a little more to the story than the all-positive points in the link.

If you know some more advantages or disadvantages from one over the other, please say so in a comment.

Thursday, December 3, 2009

UKOUG 2009

UKOUG Conference Series, Technology & E-Business Suite 2009 is over. Three days were filled with highly technical presentations in several streams. It was my first time, and I hope it wasn't my last time.

The presentations I saw were:

  • John Scott - Building Scalable Applications with Oracle Application Express
  • Joel Goodman - Oracle Database Links part 2 - Distributed Transactions
  • Jonathan Lewis - Writing Optimal SQL
  • Randolf Geist - Understanding the different modes of System Statistics
  • Wolfgang Breitling - Seeding Statistics
  • Anthony Rayner - Building Rich Web Applications! Inside look into the Oracle APEX 4.0 Development Lab
  • Graham Wood - What's new in the Oracle Database?
  • Connor McDonald - 11g features for Developers
  • Jonathan Lewis - Introducing Partitions
  • Julian Dyke - Vital Statistics
  • Andrew Clarke - PL/SQL API Design Workshop
  • Randolf Geist - Everything you always wanted to know about FIRST_ROWS_N but were afraid to ask

On Wednesday I had my own presentation "All About Grouping". Unfortunately I was starting to get sick on Tuesday afternoon. My walk from the hotel to the International Convention Centre included passing a nice Christmas market. On Wednesday morning everyone at this market looked like they thought it was cold (a few degrees Celsius), but I was walking there sweating like crazy and with a runny nose. I skipped a few sessions and ate only the fruit and salad from my lunch bag, hoping I would feel a little better when the presentation started. When I had started my presentation, I noticed I had forgotten to plug in my presenter device. I decided to move on and just stay at the speaker's stand. I started to sweat heavily again, and not (only) because of nerves. Halfway through, my stomach became empty and started making some noise. Nice, with a microphone just above your stomach ... Previously, this presentation took 50 minutes. Because I had an hour, I decided to squeeze in one more example. Still, I finished in 45 minutes this time, so I guess I should slow down some more. If anything, this was a great learning experience for me about presenting. Supposedly, the audience doesn't experience a presentation in the same way the presenter does. I very much hope that's true.

The slightly adjusted presentation and demonstration scripts can be found here and here. A white paper about grouping will appear shortly on this blog.

Tuesday, November 24, 2009

Recursive subquery factoring

Version 11g release 2 introduced recursive subquery factoring, also known as the recursive with clause. This is an extension to the SQL syntax with which you can write recursive/hierarchical queries. However, Oracle has had the connect-by clause for hierarchical queries since version 2. At first glance the connect-by and the recursive with seem very similar in what they can do, but on a second look there are some really interesting differences. This post explores some of the similarities and differences between the two.

Let's start with a query that will be familiar to everyone who has followed a SQL course at the start of their career: the classic example using the EMP table:

rwijk@ORA11GR2>  select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , empno
3 , mgr
4 , level
5 from emp
6 connect by mgr = prior empno
7 start with mgr is null
8 /

ENAME EMPNO MGR LEVEL
-------------------- ---------- ---------- ----------
KING 7839 1
JONES 7566 7839 2
SCOTT 7788 7566 3
ADAMS 7876 7788 4
FORD 7902 7566 3
SMITH 7369 7902 4
BLAKE 7698 7839 2
ALLEN 7499 7698 3
WARD 7521 7698 3
MARTIN 7654 7698 3
TURNER 7844 7698 3
JAMES 7900 7698 3
CLARK 7782 7839 2
MILLER 7934 7782 3

14 rows selected.

The hierarchy is made visible by left-padding the ename with spaces according to the level: the lower in the hierarchy, the more the ename is indented. Using the recursive with clause, the query becomes:

rwijk@ORA11GR2> with emps (ename,empno,mgr,lvl) as
2 ( select ename
3 , empno
4 , mgr
5 , 1
6 from emp
7 where mgr is null
8 union all
9 select emp.ename
10 , emp.empno
11 , emp.mgr
12 , emps.lvl + 1
13 from emp
14 join emps on (emp.mgr = emps.empno)
15 ) search depth first by empno set a
16 select lpad(' ', 2 * lvl - 2, ' ') || ename as ename
17 , empno
18 , mgr
19 , lvl
20 from emps
21 order by a
22 /

ENAME EMPNO MGR LVL
-------------------- ---------- ---------- ----------
KING 7839 1
JONES 7566 7839 2
SCOTT 7788 7566 3
ADAMS 7876 7788 4
FORD 7902 7566 3
SMITH 7369 7902 4
BLAKE 7698 7839 2
ALLEN 7499 7698 3
WARD 7521 7698 3
MARTIN 7654 7698 3
TURNER 7844 7698 3
JAMES 7900 7698 3
CLARK 7782 7839 2
MILLER 7934 7782 3

14 rows selected.

The query has become a lot more verbose, but as you can see, it can do the same kind of hierarchical queries as the good old connect-by. With the recursive with clause you don't have a LEVEL pseudocolumn, but you can easily mimic one as above. The first part of the UNION ALL contains the start-with query, where the newly introduced column contains the number 1. The second part of the query contains the recursion where you refer to the name of the recursive-with clause. The select list contains "emps.lvl + 1" to calculate the level.

Using the same technique you can also mimic other helper functions. On the AMIS blog, Lucas Jellema already wrote how to simulate SYS_CONNECT_BY_PATH, CONNECT_BY_ROOT, and even CONNECT_BY_ISLEAF. Now let's have a look at some of the differences:
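For example, a SYS_CONNECT_BY_PATH equivalent can be built by carrying a path column through the recursion; a sketch on the same EMP table (the cast in the anchor member keeps the column wide enough for the path to grow):

```sql
with emps (ename, empno, path) as
( select ename
  ,      empno
  ,      cast('/' || ename as varchar2(200))
  from   emp
  where  mgr is null
  union all
  select emp.ename
  ,      emp.empno
  ,      emps.path || '/' || emp.ename  -- extend the parent's path
  from   emp
  join   emps on (emp.mgr = emps.empno)
)
select ename
,      path
from   emps
/
```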


Cycle detection

If your data contains a cycle, the query could run forever. So Oracle detects this situation and makes the query fail. We can see that behaviour by making JONES the manager of KING: now KING manages JONES and JONES manages KING. A loop:

rwijk@ORA11GR2> update emp
2 set mgr = 7566
3 where ename = 'KING'
4 /

1 row updated.

rwijk@ORA11GR2> select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , empno
3 , mgr
4 , level
5 from emp
6 connect by mgr = prior empno
7 start with ename = 'KING'
8 /
ERROR:
ORA-01436: CONNECT BY loop in user data


The recursive with clause also has cycle detection, of course:

rwijk@ORA11GR2> with emps (ename,empno,mgr,lvl) as
2 ( select ename
3 , empno
4 , mgr
5 , 1
6 from emp
7 where ename = 'KING'
8 union all
9 select emp.ename
10 , emp.empno
11 , emp.mgr
12 , emps.lvl + 1
13 from emp
14 join emps on (emp.mgr = emps.empno)
15 ) search depth first by empno set a
16 select lpad(' ', 2 * lvl - 2, ' ') || ename as ename
17 , empno
18 , mgr
19 , lvl
20 from emps
21 order by a
22 /
from emps
*
ERROR at line 20:
ORA-32044: cycle detected while executing recursive WITH query

A different error code, but a similar message.

Since version 10, the connect-by has the NOCYCLE attribute and the CONNECT_BY_ISCYCLE pseudocolumn:

rwijk@ORA11GR2>  select lpad(' ', 2 * level - 2, ' ') || ename as ename
2 , empno
3 , mgr
4 , level
5 , connect_by_iscycle
6 from emp
7 connect by nocycle mgr = prior empno
8 start with ename = 'KING'
9 /

ENAME EMPNO MGR LEVEL CONNECT_BY_ISCYCLE
-------------------- ---------- ---------- ---------- ------------------
KING 7839 7566 1 0
JONES 7566 7839 2 1
SCOTT 7788 7566 3 0
ADAMS 7876 7788 4 0
FORD 7902 7566 3 0
SMITH 7369 7902 4 0
BLAKE 7698 7839 2 0
ALLEN 7499 7698 3 0
WARD 7521 7698 3 0
MARTIN 7654 7698 3 0
TURNER 7844 7698 3 0
JAMES 7900 7698 3 0
CLARK 7782 7839 2 0
MILLER 7934 7782 3 0

14 rows selected.

Here you see that the query continues and just stops processing the branch where the cycle was detected. CONNECT_BY_ISCYCLE shows a "1" exactly at the spot where the cycle occurs.

The recursive with clause has a CYCLE clause you can use for something similar:

rwijk@ORA11GR2> with emps (ename,empno,mgr,lvl) as
2 ( select ename
3 , empno
4 , mgr
5 , 1
6 from emp
7 where ename = 'KING'
8 union all
9 select emp.ename
10 , emp.empno
11 , emp.mgr
12 , emps.lvl + 1
13 from emp
14 join emps on (emp.mgr = emps.empno)
15 ) search depth first by empno set a
16 cycle empno set is_cycle to '1' default '0'
17 select lpad(' ', 2 * lvl - 2, ' ') || ename as ename
18 , empno
19 , mgr
20 , lvl
21 , is_cycle
22 from emps
23 /

ENAME EMPNO MGR LVL I
-------------------- ---------- ---------- ---------- -
KING 7839 7566 1 0
JONES 7566 7839 2 0
SCOTT 7788 7566 3 0
ADAMS 7876 7788 4 0
KING 7839 7566 3 1
FORD 7902 7566 3 0
SMITH 7369 7902 4 0
BLAKE 7698 7839 2 0
ALLEN 7499 7698 3 0
WARD 7521 7698 3 0
MARTIN 7654 7698 3 0
TURNER 7844 7698 3 0
JAMES 7900 7698 3 0
CLARK 7782 7839 2 0
MILLER 7934 7782 3 0

15 rows selected.

But there is a clear difference here: the recursive with clause cannot spot the cycle until it has processed another recursion level. So you'll see KING repeated here, and the is_cycle attribute is set to '1' one level lower than where the CONNECT_BY_ISCYCLE pseudocolumn showed it. Quassnoi provided a nice answer to my Stack Overflow question about this topic. He states:
CONNECT_BY_ISCYCLE checks the children (which are yet to be returned), while CYCLE checks the current row (which is already returned).

CONNECT BY is row based, while recursive CTE's are set-based.

I think the results of the recursive with clause are somewhat counterintuitive, but I have heard several people state otherwise. No matter what anyone thinks, what matters is being aware that there is a difference.


Calculate using previously calculated values

In part three of my SQL Model Clause Tutorial, I used this example of a simplified financial product of which I want to calculate the balance at the end of each year:

rwijk@ORA11GR2> select * from deposits
2 /

CUSTOMER AMOUNT THE_DATE
---------- ---------- -------------------
1 1000 01-01-2003 00:00:00
1 200 01-01-2004 00:00:00
1 500 01-01-2005 00:00:00
1 100 01-01-2006 00:00:00
1 800 01-01-2007 00:00:00
2 20 01-01-2003 00:00:00
2 150 01-01-2004 00:00:00
2 60 01-01-2005 00:00:00
2 100 01-01-2006 00:00:00
2 100 01-01-2007 00:00:00

10 rows selected.

rwijk@ORA11GR2> select * from interest_rates
2 /

THE_DATE PERCENTAGE
------------------- ----------
01-01-2003 00:00:00 5
01-01-2004 00:00:00 3.2
01-01-2005 00:00:00 4.1
01-01-2006 00:00:00 5.8
01-01-2007 00:00:00 4.9

5 rows selected.

rwijk@ORA11GR2> select customer
2 , amount
3 , the_date
4 , percentage
5 , balance balance_at_end_of_year
6 from deposits s
7 , interest_rates r
8 where s.the_date = r.the_date
9 model
10 partition by (s.customer)
11 dimension by (s.the_date)
12 measures (s.amount, r.percentage, 0 balance)
13 rules
14 ( balance[any] order by the_date
15 = round
16 ( (nvl(balance[add_months(cv(),-12)],0) + amount[cv()])
17 * (1 + percentage[cv()]/100)
18 , 2
19 )
20 )
21 order by customer
22 , the_date
23 /

CUSTOMER AMOUNT THE_DATE PERCENTAGE BALANCE_AT_END_OF_YEAR
---------- ---------- ------------------- ---------- ----------------------
1 1000 01-01-2003 00:00:00 5 1050
1 200 01-01-2004 00:00:00 3.2 1290
1 500 01-01-2005 00:00:00 4.1 1863.39
1 100 01-01-2006 00:00:00 5.8 2077.27
1 800 01-01-2007 00:00:00 4.9 3018.26
2 20 01-01-2003 00:00:00 5 21
2 150 01-01-2004 00:00:00 3.2 176.47
2 60 01-01-2005 00:00:00 4.1 246.17
2 100 01-01-2006 00:00:00 5.8 366.25
2 100 01-01-2007 00:00:00 4.9 489.1

10 rows selected.

Here 1050 = (1000 * 1.05), and 1290 = (1050 + 200) * 1.032. In other words, you need the calculated value of the previous step to calculate the new balance. You can produce this exact result with the recursive with clause like this:

rwijk@ORA11GR2> with t as
2 ( select s.customer
3 , r.the_date
4 , 1 + r.percentage/100 factor
5 , s.amount
6 from deposits s
7 , interest_rates r
8 where s.the_date = r.the_date
9 )
10 , t_recursive (customer,the_date,amount) as
11 ( select customer
12 , min(the_date)
13 , min(amount * factor) keep (dense_rank first order by the_date)
14 from t
15 group by customer
16 union all
17 select t.customer
18 , t.the_date
19 , round((tr.amount + t.amount) * t.factor,2)
20 from t
21 , t_recursive tr
22 where tr.customer = t.customer
23 and tr.the_date = t.the_date - interval '1' year
24 )
25 select *
26 from t_recursive
27 order by customer
28 , the_date
29 /

CUSTOMER THE_DATE AMOUNT
---------- ------------------- ----------
1 01-01-2003 00:00:00 1050
1 01-01-2004 00:00:00 1290
1 01-01-2005 00:00:00 1863.39
1 01-01-2006 00:00:00 2077.27
1 01-01-2007 00:00:00 3018.26
2 01-01-2003 00:00:00 21
2 01-01-2004 00:00:00 176.47
2 01-01-2005 00:00:00 246.17
2 01-01-2006 00:00:00 366.25
2 01-01-2007 00:00:00 489.1

10 rows selected.

The first with clause is a regular one, used to do the join only once. And as you can see, combining the two clauses is not a problem, although you can't use two recursive with clauses in one statement. The first union all part selects the balance for each customer at the end of their first year (2003). The second part uses recursion to get the other years. The point here is that you can use the expression round((tr.amount + t.amount) * t.factor,2), referencing tr.amount, which is itself a calculated value. With the connect-by clause you can select PRIOR [column_name], but you cannot select PRIOR [calculated_value]. So what used to be impossible with connect-by is now possible with recursive-with (for people who don't like the model clause, that is :-) ). So recursive-with is more powerful.
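The year-over-year recurrence itself is tiny; a minimal Python sketch (with the deposit and interest data hard-coded from the tables above) shows the same chained calculation, where each year's rounded balance feeds the next year's step:

```python
# Hypothetical hard-coded data mirroring the deposits and interest_rates tables.
deposits = {
    1: [1000, 200, 500, 100, 800],   # customer 1, years 2003..2007
    2: [20, 150, 60, 100, 100],      # customer 2, years 2003..2007
}
factors = [1.05, 1.032, 1.041, 1.058, 1.049]  # 1 + percentage/100 per year

def balances(amounts):
    out, balance = [], 0
    for amount, factor in zip(amounts, factors):
        # The previously calculated balance feeds the next year's calculation,
        # exactly the dependency the model clause and recursive-with express.
        balance = round((balance + amount) * factor, 2)
        out.append(balance)
    return out
```

Here balances(deposits[1]) reproduces the BALANCE_AT_END_OF_YEAR column for customer 1.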


Performance

Let's check how our running example query performs:

rwijk@ORA11GR2>  select /*+ gather_plan_statistics */
2 lpad(' ', 2 * level - 2, ' ') || ename as ename
3 , empno
4 , mgr
5 , level
6 from emp
7 connect by mgr = prior empno
8 start with mgr is null
9 /

ENAME EMPNO MGR LEVEL
-------------------- ---------- ---------- ----------
KING 7839 1
JONES 7566 7839 2
SCOTT 7788 7566 3
ADAMS 7876 7788 4
FORD 7902 7566 3
SMITH 7369 7902 4
BLAKE 7698 7839 2
ALLEN 7499 7698 3
WARD 7521 7698 3
MARTIN 7654 7698 3
TURNER 7844 7698 3
JAMES 7900 7698 3
CLARK 7782 7839 2
MILLER 7934 7782 3

14 rows selected.

rwijk@ORA11GR2> select * from table
2 (dbms_xplan.display_cursor(null,null,'basic iostats last'))
3 /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------

EXPLAINED SQL STATEMENT:
------------------------
select /*+ gather_plan_statistics */ lpad(' ', 2 * level - 2,
' ') || ename as ename , empno , mgr , level from
emp connect by mgr = prior empno start with mgr is null

Plan hash value: 763482334

---------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |      |      1 |        |     14 |00:00:00.01 |       7 |
|*  1 |  CONNECT BY NO FILTERING WITH START-WITH|      |      1 |        |     14 |00:00:00.01 |       7 |
|   2 |   TABLE ACCESS FULL                     | EMP  |      1 |     14 |     14 |00:00:00.01 |       7 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("MGR"=PRIOR NULL)
filter("MGR" IS NULL)


22 rows selected.

Just one full table access: a nice optimization for this kind of query. Here you can see what the plan looked like in previous versions. Probably the cost based optimizer realizes that no row can appear more than once in the output - else it would be a cycle - and thus it doesn't necessarily need the "connect-by-pump" access path.

Here is what the plan of the recursive with clause looks like:

rwijk@ORA11GR2> with emps (ename,empno,mgr,lvl) as
2 ( select ename
3 , empno
4 , mgr
5 , 1
6 from emp
7 where mgr is null
8 union all
9 select emp.ename
10 , emp.empno
11 , emp.mgr
12 , emps.lvl + 1
13 from emp
14 join emps on (emp.mgr = emps.empno)
15 ) search depth first by empno set a
16 select /*+ gather_plan_statistics */
17 lpad(' ', 2 * lvl - 2, ' ') || ename as ename
18 , empno
19 , mgr
20 , lvl
21 from emps
22 /

ENAME EMPNO MGR LVL
-------------------- ---------- ---------- ----------
KING 7839 1
JONES 7566 7839 2
SCOTT 7788 7566 3
ADAMS 7876 7788 4
FORD 7902 7566 3
SMITH 7369 7902 4
BLAKE 7698 7839 2
ALLEN 7499 7698 3
WARD 7521 7698 3
MARTIN 7654 7698 3
TURNER 7844 7698 3
JAMES 7900 7698 3
CLARK 7782 7839 2
MILLER 7934 7782 3

14 rows selected.

rwijk@ORA11GR2> select * from table
2 (dbms_xplan.display_cursor(null,null,'basic iostats last'))
3 /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------

EXPLAINED SQL STATEMENT:
------------------------
with emps (ename,empno,mgr,lvl) as ( select ename , empno
, mgr , 1 from emp where mgr is null union all
select emp.ename , emp.empno , emp.mgr , emps.lvl
+ 1 from emp join emps on (emp.mgr = emps.empno) ) search
depth first by empno set a select /*+ gather_plan_statistics */
lpad(' ', 2 * lvl - 2, ' ') || ename as ename , empno , mgr
, lvl from emps

Plan hash value: 3907725112

---------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |      |      1 |        |     14 |00:00:00.01 |      35 |
|   1 |  VIEW                                   |      |      1 |      3 |     14 |00:00:00.01 |      35 |
|   2 |   UNION ALL (RECURSIVE WITH) DEPTH FIRST|      |      1 |        |     14 |00:00:00.01 |      35 |
|*  3 |    TABLE ACCESS FULL                    | EMP  |      1 |      1 |      1 |00:00:00.01 |       7 |
|*  4 |    HASH JOIN                            |      |      4 |      2 |     13 |00:00:00.01 |      28 |
|   5 |     RECURSIVE WITH PUMP                 |      |      4 |        |     14 |00:00:00.01 |       0 |
|*  6 |     TABLE ACCESS FULL                   | EMP  |      4 |     13 |     52 |00:00:00.01 |      28 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("MGR" IS NULL)
4 - access("EMP"."MGR"="EMPS"."EMPNO")
6 - filter("EMP"."MGR" IS NOT NULL)


31 rows selected.

In this plan, step 3 represents the first part of the union all (the start with). Steps 4 to 6 are executed as many times as there are levels: four in this case (note the "Starts" column!), and in step 2 the results are put together. It executes just as the query is written. But it's not optimized in the same way as the connect-by query is. In this case it needs five times as many buffer gets (35 versus 7). Mind you, I'm not saying: don't ever use this technique because of performance. I'm just mentioning that there are differences you should be aware of when writing a hierarchical query.



I'd like to end with two special "use cases":

1) Remember the First International NoCOUG SQL Challenge? You can solve it with the recursive with clause.

2) And you can even solve sudokus with it.

Tuesday, November 17, 2009

OGh's ACE Case: Oracle 11g Release 2 for developers

This year, the Dutch Oracle usergroup OGh has started to offer a set of 4 presentations, called the ACE Cases. Each presentation is given by a Dutch Oracle ACE about his specialty. The first one was about XMLDB by Marco Gralike. The second one was about SOA Governance by Mike van Alst. For the third one, OGh had invited me to talk about something related to the subject "database". Today I had the pleasure to present a session called Oracle 11g Release 2 for developers. You can download the presentation and the demonstration scripts below.

Powerpoint OGh's ACE Case: Oracle 11g Release 2 voor ontwikkelaars

Bijbehorende demonstratiescripts

The usual warning: they are in Dutch, but that doesn't matter much if you are interested in the demonstration scripts.

Thanks to everyone who attended!

Wednesday, October 28, 2009

Three new "hints"

In version 11.2.0.1, Oracle introduced three new hints: IGNORE_ROW_ON_DUPKEY_INDEX, CHANGE_DUPKEY_ERROR_INDEX and RETRY_ON_ROW_CHANGE. These are not normal hints like we know them. Oracle writes:

Note:
The CHANGE_DUPKEY_ERROR_INDEX, IGNORE_ROW_ON_DUPKEY_INDEX, and RETRY_ON_ROW_CHANGE hints are unlike other hints in that they have a semantic effect. The general philosophy explained in "Hints" does not apply for these three hints.

That's strange. Why does Oracle muddle the concept of a hint by introducing three new hints that have a semantic effect? Now you have two kinds of hints: optimizer hints and semantic hints. Both are silently ignored when you make a syntax error. But there is no way to tell which kind you are dealing with, unless you know them by heart. And it becomes harder to explain: it means you can no longer advise to just gather statistics and remove all hints. Couldn't this have been solved by adding an extra keyword? Or were they reluctant to introduce another deviation from the ANSI standard? On the other hand, we now have some extra options available, and that's good. Let's explore them.


IGNORE_ROW_ON_DUPKEY_INDEX

From the documentation: "When a unique key violation is encountered, a row-level rollback occurs and execution resumes with the next input row."

An example:

rwijk@ORA11GR2> create table mytable
2 ( id number(2) primary key
3 , name varchar2(3) not null
4 )
5 /

Table created.

rwijk@ORA11GR2> insert into mytable values (4,'old')
2 /

1 row created.

rwijk@ORA11GR2> insert into mytable values (9,'old')
2 /

1 row created.

rwijk@ORA11GR2> select * from mytable
2 /

ID NAM
---------- ---
4 old
9 old

2 rows selected.

rwijk@ORA11GR2> insert /*+ ignore_row_on_dupkey_index(mytable(id)) */ into mytable
2 ( id
3 , name
4 )
5 select level
6 , 'new'
7 from dual
8 connect by level <= 10
9 /

8 rows created.

rwijk@ORA11GR2> select *
2 from mytable
3 order by id
4 /

ID NAM
---------- ---
1 new
2 new
3 new
4 old
5 new
6 new
7 new
8 new
9 old
10 new

10 rows selected.

So the insert statement with the ignore_row_on_dupkey_index hint was specified to insert 10 rows, but due to the current contents of the table, only 8 rows were actually inserted, since the ids 4 and 9 already existed. In 10g Release 2 you could achieve a similar effect by using the DBMS_ERRLOG package, although that is a bit more cumbersome and behaves differently, since it logs ALL errors, not just the unique key violations.
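For comparison, SQLite exposes similar semantics through INSERT OR IGNORE. A minimal Python sketch (in-memory SQLite, not Oracle, so only the behavior is analogous; OR IGNORE skips any conflicting row rather than rows violating one named index):

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle example above.
con = sqlite3.connect(":memory:")
con.execute("create table mytable (id integer primary key, name text not null)")
con.executemany("insert into mytable values (?, ?)", [(4, "old"), (9, "old")])

# INSERT OR IGNORE silently skips rows that would violate a unique
# constraint, roughly what ignore_row_on_dupkey_index achieves.
con.executemany("insert or ignore into mytable values (?, 'new')",
                [(i,) for i in range(1, 11)])

rows = con.execute("select id, name from mytable order by id").fetchall()
```

As in the Oracle session, 8 of the 10 candidate rows are inserted and the pre-existing rows 4 and 9 keep their 'old' value.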


CHANGE_DUPKEY_ERROR_INDEX

From the documentation: "The CHANGE_DUPKEY_ERROR_INDEX hint provides a mechanism to unambiguously identify a unique key violation for a specified set of columns or for a specified index. When a unique key violation occurs for the specified index, an ORA-38911 error is reported instead of an ORA-001.". An example with a table containing a primary key constraint and a unique key constraint:

rwijk@ORA11GR2> create table t
2 ( id number(2)
3 , name varchar2(3) not null
4 , constraint t_pk primary key (id)
5 , constraint t_uk unique(name)
6 )
7 /

Table created.

rwijk@ORA11GR2> insert into t values (1,'bla')
2 /

1 row created.


Whether the primary key constraint or the unique key constraint is violated, you'll always get an ORA-00001 error message:

rwijk@ORA11GR2> insert into t values (1,'abc')
2 /
insert into t values (1,'abc')
*
ERROR at line 1:
ORA-00001: unique constraint (RWIJK.T_PK) violated


rwijk@ORA11GR2> insert into t values (2,'bla')
2 /
insert into t values (2,'bla')
*
ERROR at line 1:
ORA-00001: unique constraint (RWIJK.T_UK) violated



With the change_dupkey_error_index hint you can reroute one of the ORA-00001 error messages to an ORA-38911 error. Which unique key violation is rerouted depends on the columns or the unique index name specified after the table name in the hint.

rwijk@ORA11GR2> insert /*+ change_dupkey_error_index (t(name)) */ into t
2 values (2,'bla')
3 /
insert /*+ change_dupkey_error_index (t(name)) */ into t
*
ERROR at line 1:
ORA-38911: unique constraint (RWIJK.T_UK) violated


rwijk@ORA11GR2> insert /*+ change_dupkey_error_index (t(id)) */ into t
2 values (1, 'abc')
3 /
insert /*+ change_dupkey_error_index (t(id)) */ into t
*
ERROR at line 1:
ORA-38911: unique constraint (RWIJK.T_PK) violated


This can be handy in PL/SQL to distinguish one unique key violation from the other, without having to parse the error message for the constraint name:

rwijk@ORA11GR2> declare
2 e_name_uk_violated exception;
3 pragma exception_init(e_name_uk_violated,-38911);
4 begin
5 insert /*+ change_dupkey_error_index (t,t_uk) */ into t values (1,'abc');
6 exception
7 when e_name_uk_violated then
8 dbms_output.put_line('Name should be unique, and this name already exists.');
9 when dup_val_on_index then
10 dbms_output.put_line('Some other unique constraint was violated');
11 end;
12 /
Some other unique constraint was violated

PL/SQL procedure successfully completed.

rwijk@ORA11GR2> declare
2 e_id_pk_violated exception;
3 pragma exception_init(e_id_pk_violated,-38911);
4 begin
5 insert /*+ change_dupkey_error_index (t,t_pk) */ into t values (1,'abc');
6 exception
7 when e_id_pk_violated then
8 dbms_output.put_line('Id should be unique, and this Id already exists.');
9 when dup_val_on_index then
10 dbms_output.put_line('Some other unique constraint was violated');
11 end;
12 /
Id should be unique, and this Id already exists.

PL/SQL procedure successfully completed.



RETRY_ON_ROW_CHANGE

This one puzzles me. The documentation says: "When you specify this hint, the operation is retried when the ORA_ROWSCN for one or more rows in the set has changed from the time the set of rows to be modified is determined to the time the block is actually modified.".

Ok, I can reproduce that. First create a table t with 50,000 rows and add a row-level trigger that logs autonomously into a table called logtable. This way we can monitor the progress of the update statement from a separate session. And it slows down the update considerably, which is a good thing for this test.

rwijk@ORA11GR2> create table t
2 ( id number(9) primary key
3 , col number(9) not null
4 )
5 /

Table created.

rwijk@ORA11GR2> insert /*+ append */ into t
2 select level
3 , level
4 from dual
5 connect by level <= 50000
6 /

50000 rows created.

rwijk@ORA11GR2> create table logtable
2 ( last_id number(9)
3 , number_of_processed_ids number(9)
4 )
5 /

Table created.

rwijk@ORA11GR2> insert into logtable values (null,0)
2 /

1 row created.

rwijk@ORA11GR2> create procedure p (p_id in t.id%type)
2 is
3 pragma autonomous_transaction;
4 begin
5 update logtable
6 set last_id = p_id
7 , number_of_processed_ids = number_of_processed_ids + 1
8 ;
9 commit;
10 end;
11 /

Procedure created.

rwijk@ORA11GR2> create trigger t_aru
2 after update on t
3 for each row
4 begin
5 p(:new.id);
6 end;
7 /

Trigger created.


And now we do an update with the retry_on_row_change hint:

rwijk@ORA11GR2> update /*+ retry_on_row_change */ t
2 set col = col * -1
3 /

And while this statement is running, open a second session and run this:

rwijk@ORA11GR2> select *
2 from logtable
3 /

LAST_ID NUMBER_OF_PROCESSED_IDS
---------- -----------------------
14754 14754

1 row selected.

rwijk@ORA11GR2> update t
2 set col = 888888
3 where id = 40000
4 /

1 row updated.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> select *
2 from logtable
3 /

LAST_ID NUMBER_OF_PROCESSED_IDS
---------- -----------------------
17507 17508

1 row selected.

So the row with id 40000 is updated and committed, while our statement from session 1 has not reached this row yet.

After session 1 completed the update it says:

50000 rows updated.


But here is what happened (from session 1):

rwijk@ORA11GR2> select *
2 from t
3 where id between 39999 and 40001
4 /

ID COL
---------- ----------
39999 -39999
40000 -888888
40001 -40001

3 rows selected.

Session 2 has set COL to 888888, and session 1 has set this value to -888888.

rwijk@ORA11GR2> select *
2 from logtable
3 /

LAST_ID NUMBER_OF_PROCESSED_IDS
---------- -----------------------
50000 89922

1 row selected.

The number_of_processed_ids column shows that the statement was restarted once it fetched the block containing ID 40000. I also checked it in session 2 by re-executing the last statement several times: the last_id column dropped back to 1 and the entire statement was re-executed. This leads to a total of 89,922 processed IDs (50,000 plus almost 40,000). So the statement was retried on detecting a row change. And the table doesn't need the ROWDEPENDENCIES clause to make this happen.

But what's the point? Oracle already has its write consistency mechanism, which makes sure a statement is restarted under the covers if, during DML, a row is changed that has not been locked yet. For example, if we add a where clause to the update statement saying "id = col", then the statement sets out to update 50,000 rows. Now, if another session updates the col value of ID 40000 to 888888, the update statement of session 1 will detect that automatically and restart the statement. Without the need for a RETRY_ON_ROW_CHANGE hint:

From session 1 with the same DDL as above:

rwijk@ORA11GR2> update t
2 set col = col * -1
3 where id = col
4 /

And while this statement runs, update the ID 40000 record in session 2:

rwijk@ORA11GR2> select *
2 from logtable
3 /

LAST_ID NUMBER_OF_PROCESSED_IDS
---------- -----------------------
10783 10783

1 row selected.

rwijk@ORA11GR2> update t
2 set col = 888888
3 where id = 40000
4 /

1 row updated.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> select *
2 from logtable
3 /

LAST_ID NUMBER_OF_PROCESSED_IDS
---------- -----------------------
12822 12823

1 row selected.


And then from session 1:

49999 rows updated.

rwijk@ORA11GR2> commit
2 /

Commit complete.

rwijk@ORA11GR2> select *
2 from t
3 where id between 39999 and 40001
4 /

ID COL
---------- ----------
39999 -39999
40000 888888
40001 -40001

3 rows selected.

rwijk@ORA11GR2> select *
2 from logtable
3 /

LAST_ID NUMBER_OF_PROCESSED_IDS
---------- -----------------------
50000 89999

1 row selected.

So here Oracle's write consistency mechanism made the statement rollback and re-execute. Which, by the way, is a good reason to not use autonomous transactions in database triggers: they might execute twice for a single row.
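The restart behavior observed above resembles a generic optimistic-concurrency retry loop. A minimal Python sketch of the idea (purely hypothetical in-memory rows, with an "scn" counter standing in for ORA_ROWSCN; this is an illustration of the concept, not how Oracle implements it):

```python
# Hypothetical in-memory rows; "scn" stands in for ORA_ROWSCN.
rows = {i: {"col": i, "scn": 1} for i in range(1, 6)}
attempts = 0

def concurrent_update():
    # Another session changes row 4 and commits while our update runs.
    rows[4]["col"] = 888888
    rows[4]["scn"] += 1

def update_all(interfere=None):
    global attempts
    while True:
        attempts += 1
        # Determine the set of rows to modify and remember their versions.
        snapshot = {k: v["scn"] for k, v in rows.items()}
        pending = {}
        for k in rows:
            if interfere is not None and k == 3:
                interfere()
                interfere = None
            if rows[k]["scn"] != snapshot[k]:
                break  # a row changed after the set was determined: retry
            pending[k] = rows[k]["col"] * -1
        else:
            for k, new_col in pending.items():
                rows[k]["col"] = new_col
                rows[k]["scn"] += 1
            return

update_all(concurrent_update)
```

The first attempt detects the concurrent change and is discarded; the second attempt sees the committed value 888888 and negates it, matching the -888888 outcome of the hinted update above.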

If someone can enlighten me with a use case for the RETRY_ON_ROW_CHANGE hint, I would be happy.

Sunday, October 18, 2009

All About Grouping at Oracle OpenWorld

On Thursday at 10 AM I gave a presentation at the Unconference called "All About Grouping". I really need to think more about the titles of my presentations, as two people came in with the question "What kind of grouping will you be talking about?". Fortunately, when I mentioned it was about grouping in SQL, and especially the extensions ROLLUP, CUBE and GROUPING SETS, they stayed.

I had given the presentation twice before, but this was the first time I did it in English. In fact, it was the first time I had presented anything in English. Things went well, I guess: it looked like everybody understood what I was saying. And there was some nice interaction as well. So it was a good warm-up for the UKOUG conference.

Thanks to all who attended. You can download the presentation and the scripts below.

Powerpoint of All About Grouping

Demonstration scripts of All About Grouping

Friday, October 16, 2009

Oracle OpenWorld 2009, the last three days

Tuesday was a very rainy day. I decided to take the shuttle bus to the Moscone and was therefore a little late for the first session. The sessions I visited this day were:

Dan Morgan - What's New in Eleven ... Dot Two (that Oracle won't be talking about)
In every release there are two types of features. On one side are the features that are picked up by marketing and given catchy names. On the other side are the features that are not picked up by marketing. Dan talked about those. Some I recognized because they are mentioned in the New Features Guide, but he also mentioned a few others. Dan is involved in the beta program, so he has already worked with 11g Release 2 for a year now.

Richard Foote - Oracle Indexes: Q & A Discussion
Richard had only two slides, which he showed in the first minutes, and then it was time to throw questions at him. About indexes of course, and David Bowie. Jonathan Lewis was also in the room, so they kind of answered the questions together. One of the things that was nice to hear is that neither Richard nor Jonathan had ever seen a genuinely useful application of the index skip scan. Which was something I already suspected.

Steven Feuerstein - Coding Therapy for Software Developers
Another great presentation by Steven. His presentation skills are simply awesome. The subject of the presentation helps of course. However, there were a few statements here that I strongly disagree with. Statements like "SQL statements with 3 table joins, or 7; that's crazy!" or "It doesn't make sense for developers to have to learn the ins and outs of tuning SQL!". Then who is going to write that transactional API or that PL/SQL interface or batch program? I suspect that Steven had some other context in mind when making those statements.

Raj Mattamal - Giving Your Oracle Application Express a Web 2.0 Face-Lift
Raj talked about how to make your APEX application cooler for the end user. With so much jquery code out there, it's easy to do more than just the standard APEX stuff. It makes a big difference in how you interact with the application. Raj did a live demonstration, which I always like. And, man, does Raj talk fast!

Andrew Holdsworth - Current Trends in Real-World Database Performance
This seems to be an annual session, but it was the first one I attended. It's nice to hear from the people who work exclusively on performance what they encounter most often. It seems bad SQL is still number one.

Blogger Meetup
Alex Gorbachev organized the annual bloggers meeting this year. And he made it a special one with t-shirts saying "I met more bloggers than you", where everybody was writing their names on the t-shirts. The wearer of the shirt with the most names won a HP Netbook. This is a great event for meeting the people behind the blogs you read (or don't read yet).


Wednesday I really tried to get some more sleep. Again I was awake at 4 AM, about the same time as every morning here in San Francisco. But unfortunately just staying in bed until late in the morning did not help. The sessions I followed were:

Arup Nanda - Upgrade Case Study: Database Replay, Snapshot Standby, and Plan Baselines
This session mainly outlined the use of new technologies you can use for upgrading your database to 11g. Unfortunately for me, the technologies themselves were not explained here; it was about when to apply which technology.

Rory McLean - Using Row-Level Security with Oracle Application Express
Last year I swore not to attend sessions by speakers unknown to me anymore, but this session was only 30 minutes and there wasn't anything more interesting to choose from at that time, so I decided to give this one a shot. And I wasn't disappointed. Rory McLean gave a nice and well-prepared presentation, including a demonstration of VPD within APEX. I didn't learn much new, though, but that's because I'm already familiar with the subject.

S. Gopalakrishnan - Keynote: Seven Game Changing Trends: How Prepared Are You?
I realized too late that, to get a good press/analysts/blogger seat for Larry's keynote, you'd have to be in the room before this one started. When Andrew Clarke and I arrived at Moscone North's hall D, the room was already full, except for some seats at the side. So we ended up behind a giant pillar, just watching the big screens. This first keynote was aimed at managers, giving a high-level overview of the strategic challenges awaiting them. Nothing much of interest to me.

Larry Ellison and others - Keynote
Larry's keynote did not have a big new introduction, as expected. He first mentioned the success of Oracle Enterprise Linux. Then he went on to talk about Exadata V2 for OLTP databases. Some impressive numbers were shown. I think my current client should definitely check this one out. After they have migrated to 11g Release 2, of course ... And he repeated the $10M challenge and the attack on IBM, which he also did at the Sunday keynote. The third part was about introducing pro-active Oracle support: if you give Oracle all the data about your configuration, then Oracle will inform you about which patches you should run to avoid running into the same problems as other customers did. The fourth part was about the new Fusion Applications. This software is still not ready to ship, but it looked like they are well on their way. Fusion Applications will be the successor of E-Business Suite, Siebel, PeopleSoft, JD Edwards and the like. Oracle will support those applications for a long, long time, but they now offer an alternative based on modern technologies like SOA.

The highlight of the keynote was of course the appearance of the governor of California, Arnold Schwarzenegger. He did a very funny talk about the importance of technology. How it had helped him in his career and how important it is to reduce global warming.


The keynote was a long show that should have ended at 16:30, but ended at 17:20. I therefore missed "SQL gone wild: Taming Bad SQL the Easy Way (or the Hard Way)" by Benoit Dageville about tuning SQL with some automated tools.

Appreciation Event
Andrew invited me to join him and Marco Gralike, Jacco Landlust, Anjo Kolk, Tim Hall and Doug Burns at the Appreciation Event on Treasure Island. An evening with music performances by Aerosmith, Roger Daltrey and the Wailers, and two other bands I did not see. Normally I don't fancy Aerosmith's music that much (with the exception of "Dream On"), but I thought they gave a great show. Roger Daltrey played a couple of songs I did not recognize, but when he played some of The Who classics, the enthusiasm of the crowd - and mine - was greatly increased. After losing everyone from the original group and having a peek at the Wailers, I called it a day.


I was back at the hotel at 1 AM. And again awake at 4 AM. My own Unconference session started at 10 AM (more about that in a separate post). After my session I brought my laptop back to my hotel and returned to see "Real-World Database Performance Roundtable" by five people, among whom Greg Rahn, Graham Wood and Andrew Holdsworth. It was a really nice Q&A session with some interesting questions from the audience. But I was so tired that when I closed my eyes for a moment, I actually started sleeping. Some internal mechanism woke me up instantly and I decided I had to leave before I started disrupting the session with some loud snoring ... I grabbed lunch, had a quick talk with H.Tonguç Yılmaz and Husnu Sensoy (DBA of the year!), and took some photographs to include in a presentation about Oracle OpenWorld for my department. I went back to my hotel and slept from 2 PM to 7 PM. Finally ...

Tuesday, October 13, 2009

Oracle OpenWorld 2009, the first two days

This year I'm visiting Oracle OpenWorld for the second time. Everything is now familiar, except for two differences. The first is the location of Oracle Develop, which has moved back to the Hilton Hotel, instead of the nearby Marriott Hotel. It is a nice walk, but sometimes you cannot chat after a session if you have a next session at the Moscone Center. The second difference is that it seems to me that fewer people are attending the conference than last year. But it's still a lot of people. One thing hasn't changed: my excitement about the conference.

The first two days were very busy. I'll quickly describe the sessions I went to. Sunday started early at 8:30 AM.

Alex Gorbachev - Demystifying Oracle Real Application Clusters Workload
A topic with which I am not too familiar: my shop doesn't use RAC and I'm not a DBA. So I thought I'd use this one to get to know a little bit more. I had some trouble getting the context right in the beginning of the session, but when Alex started demonstrating the stuff on his laptop, everything fell into place. The demos were simply great. The sometimes complex load balancing algorithms were made perfectly clear by showing exactly the right things. A good start of the conference.

Steven Feuerstein - Why you should care about PL/SQL in Oracle Database 11g Now!
This was the first time I attended a session by Steven Feuerstein. Although I heard nothing new about PL/SQL, it was great to see how he presented. Very interactive and with a lot of humour.

Dimitri Gielis - Mastering an Oracle Application Express Page
In this session we learnt about what happens when APEX shows or processes a page. The session really helped to get some of my thoughts straight. One of the new things I heard about was a simple button on the page design screen called "View", where you can see the order of the events that will happen. I have seen that page lots of times, but never bothered to use it.

Andrew Clarke - Designing PL/SQL with Intent
This session was totally different from what I expected. It was a rather philosophical one, comparing design patterns in PL/SQL with design patterns in town planning. At the end a few PL/SQL examples were shown that I liked a lot, since I often preach them myself as well, although not as well as Andrew did.

Michael Hichwa & David Peake - Oracle Application Express 4.0
A lot of nice new features were presented from the APEX development team themselves. The ones I liked most were Dynamic Actions, with which you can create client side validations declaratively, and the APEX listener. I'm not entirely sure, but it looks like the APEX listener opens up easier and supported ways to have proxy database authentication. Other new features include integrated Team Development for feedback and bug reports and the like, improved charts, websheets applications and improved tabular form handling.

Bryn Llewellyn - Online Application Upgrade
This session was renamed to "Edition Based Redefinition". Bryn very concisely talked about 11g Release 2's killer feature: editions. Without using a demo he managed to make it perfectly clear what the feature is about and what it is not about. A few random things I wrote down: from now on an object is not identified by owner and name but by owner, name and edition. Only synonyms, views and PL/SQL units of all kinds are editionable. You need to rename the table and create an editioning view with the same name as the table's former name. Then move all triggers, all privileges and all VPD policies to the editioning view. And last but not least: in 11gR2 it is now possible to add an extra column while the table is in use, whereas previously you would get an ORA-04021 lock timeout error.

Scott McNealy - Extreme innovation
This keynote was mainly aimed at assuring all Sun customers that Oracle is not trying to kill Sun's products, but is instead increasing its investments in them. Larry Ellison did a good job of depicting IBM's range of hardware as inferior, slow, and not very green. He even offered $10M if your Oracle application doesn't run at least twice as fast. Now that's confidence!

ACE dinner
An evening at the Sens Restaurant for all Oracle ACEs and ACE Directors, where I've met a lot of old and new friends. Thanks to Lillian and all others at OTN for the perfect organization and a nice evening!


On Monday I attended these sessions:

Jonathan Lewis - How to become an expert
A very entertaining story about what you should do to become an expert. My rather free interpretation of the answer is: learn and really understand the basics first. If you understand the architecture of version 6, when you could still carry the documentation with you, then everything afterwards is really just a variation on the same theme. Understand exactly what Oracle has to do when you send it a SQL statement, and you'll be in a good position to know where the root cause of your problem is.

Tom Kyte - All About Metadata: Why Telling the Database About Your Schema Matters
A good story about why you should have constraints, even in data warehouses. They may slow down DML a little, but they provide extra information for the cost based optimizer to make smarter decisions: extra access paths become available. A new thing for me was to see the effect of creating a dimension object in combination with materialized view query rewrites. I'll have to see that one for myself sometime.

Greg Rahn - Chalk & Talk: The Core Performance Fundamentals Of Oracle Data Warehousing
A story about Exadata, data warehouses and benchmarks. A different world than my usual one, which was nice. For example, the Exadata machine is so fast that Greg recommends not even using indexes anymore in most data warehouses. Of course, it all depends. I must say I was impressed by all the numbers he mentioned. Those are not numbers I usually see in my clients' OLTP databases.

Jonathan Lewis - How to Hint
Again a very good presentation, in which I heard quite a number of new things. To name a few: use_hash is only half a hint; it's better to combine it with the swap_join_inputs hint to explicitly state which query block should act as the in-memory (build) table and which one as the probe table. Jonathan also had a great example of the no_merge hint, which finally became clear to me. And the parallel hint was not what I thought it was: it does not instruct Oracle to always go parallel, but rather to go parallel only if the table is being full scanned AND the parallel plan's cost is lower than the serial plan's. Which means there might be a day your query runs serially even if it ran in parallel before, or vice versa. In this session I also learned what an alien looks like ...

Chen Shapira - Visualization Session - Make your charts more effective for troubleshooting and presenting
This one definitely qualifies as "something completely different". Although the running example was about the duration of exports, the session was about how to use graphs effectively and how they can effectively support your message, or not. Luckily I knew just enough about statistics to understand all the graphs. I especially liked the Q-Q plot.

Lucas Jellema - Continuous Application Evolution with Oracle Database 11g Release 2
Exactly the same topic as Bryn Llewellyn's session yesterday. Lucas also had a clear story and even included a few live demos that really helped. A very well explained introduction. The only unfortunate thing about both sessions is that no downsides were mentioned. And there must be several, like test efforts being multiplied, or worse performance because crossedition triggers are row level triggers. And we know what that means ... But still I liked the session for being very clear and for the nice metaphors.

OTN Night
Here I just grabbed some food, but I could not find any familiar faces, so I decided to leave early and catch some sleep. Which doesn't work, by the way: I'm still awake each morning well before 4AM. Which is not uncommon for Europeans, it seems ...

More to come later.

Monday, September 14, 2009

Check constraints and AND

Remember the opening statement of the MV error series? It was "Never put more than one error condition into one error message". Two months later I realized I didn't practice what I preach.

I'm currently developing a Tour de France application at home. The application has to store the standings. After each stage, a daily and an overall standing become available for each of the following standing types: individual, points, mountain and team. I'm storing standing lines and I chose a supertype implementation: everything in one table. Which means the table looks something like this:

rwijk@ORA11GR1> create table standing_lines
2 ( stage_id number(6) not null
3 , standing_type varchar2(2) not null
4 , overall_or_daily varchar2(1) not null
5 , position number(3) not null
6 , rider_id number(6)
7 , team_id number(6)
8 , duration interval day(2) to second(0)
9 , number_of_points number(4)
10 , constraint pk primary key (stage_id,standing_type,overall_or_daily,position)
11 , constraint ck1 check (standing_type in ('IN','PO','MO','TE'))
12 , constraint ck2 check (overall_or_daily in ('D','O'))
13 )
14 /

Table created.

Note that I deliberately left out some unique, foreign key and other check constraints. For each standing type, different columns should be mandatory and the other columns should stay null. For example: an individual standing line should contain a rider_id and a duration, and no team_id and no number_of_points. So it seems logical to code some additional check constraints like this:

rwijk@ORA11GR1> alter table standing_lines add constraint ck3 check
2 ( standing_type != 'IN'
3 or ( rider_id is not null
4 and team_id is null
5 and duration is not null
6 and number_of_points is null
7 )
8 )
9 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck4 check
2 ( standing_type != 'PO'
3 or ( rider_id is not null
4 and team_id is null
5 and duration is null
6 and number_of_points is not null
7 )
8 )
9 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck5 check
2 ( standing_type != 'MO'
3 or ( rider_id is not null
4 and team_id is null
5 and duration is null
6 and number_of_points is not null
7 )
8 )
9 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck6 check
2 ( standing_type != 'TE'
3 or ( rider_id is null
4 and team_id is not null
5 and duration is not null
6 and number_of_points is null
7 )
8 )
9 /

Table altered.

But when I started coding routines to read some CLOB content pasted from the official Tour de France site, parse all standing lines, and insert the result into the standing_lines table, the inevitable occurred: it didn't work the first time. I received an error message like this:

ORA-02290: check constraint (RWIJK.CK6) violated

So I looked up the definition of this check constraint, and all I could say was that I was trying to insert a team standing line and I either gave rider_id a value, or left team_id null, or left duration null, or gave number_of_points a value. That's four options. And that's certainly not one error condition per error message. To resolve this, I redefined my check constraints and split each of them into four.
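The underlying problem can be illustrated outside the database as well. Below is a minimal sketch in Python (the rules and the row are modeled with plain dicts; all names are hypothetical, not part of the application) showing that a single conjunction of four conditions only reports pass or fail, while evaluating each condition separately pinpoints the culprit:

```python
# The rules for a team ('TE') standing line, modeled as named predicates.
rules = {
    "rider_id must be null":         lambda row: row["rider_id"] is None,
    "team_id must be not null":      lambda row: row["team_id"] is not None,
    "duration must be not null":     lambda row: row["duration"] is not None,
    "number_of_points must be null": lambda row: row["number_of_points"] is None,
}

# A team standing line that forgets to set duration.
row = {"rider_id": None, "team_id": 12, "duration": None, "number_of_points": None}

# One combined constraint: all we learn is True or False.
combined_ok = all(check(row) for check in rules.values())
print(combined_ok)  # False -- but which of the four conditions failed?

# Four separate constraints: each failure names its own condition.
violations = [name for name, check in rules.items() if not check(row)]
print(violations)  # ['duration must be not null']
```

This is exactly what splitting the check constraints buys you: the constraint name in the ORA-02290 message identifies the one condition that failed.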

The first rewrite is by applying the distributivity rule: A or (B and C) == (A or B) and (A or C). In Oracle check constraint logic, it looks like this:

rwijk@ORA11GR1> alter table standing_lines drop constraint ck3
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck4
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck5
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck6
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck3 check
2 ( ( standing_type != 'IN' or rider_id is not null )
3 and ( standing_type != 'IN' or team_id is null )
4 and ( standing_type != 'IN' or duration is not null )
5 and ( standing_type != 'IN' or number_of_points is null )
6 )
7 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck4 check
2 ( ( standing_type != 'PO' or rider_id is not null )
3 and ( standing_type != 'PO' or team_id is null )
4 and ( standing_type != 'PO' or duration is null )
5 and ( standing_type != 'PO' or number_of_points is not null )
6 )
7 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck5 check
2 ( ( standing_type != 'MO' or rider_id is not null )
3 and ( standing_type != 'MO' or team_id is null )
4 and ( standing_type != 'MO' or duration is null )
5 and ( standing_type != 'MO' or number_of_points is not null )
6 )
7 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck6 check
2 ( ( standing_type != 'TE' or rider_id is null )
3 and ( standing_type != 'TE' or team_id is not null )
4 and ( standing_type != 'TE' or duration is not null )
5 and ( standing_type != 'TE' or number_of_points is null )
6 )
7 /

Table altered.

And now it becomes clear that each constraint can be split into four separate constraints, like this:

rwijk@ORA11GR1> alter table standing_lines drop constraint ck3
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck4
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck5
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines drop constraint ck6
2 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck3 check
2 ( standing_type != 'IN' or rider_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck4 check
2 ( standing_type != 'IN' or team_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck5 check
2 ( standing_type != 'IN' or duration is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck6 check
2 ( standing_type != 'IN' or number_of_points is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck7 check
2 ( standing_type != 'PO' or rider_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck8 check
2 ( standing_type != 'PO' or team_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck9 check
2 ( standing_type != 'PO' or duration is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck10 check
2 ( standing_type != 'PO' or number_of_points is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck11 check
2 ( standing_type != 'MO' or rider_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck12 check
2 ( standing_type != 'MO' or team_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck13 check
2 ( standing_type != 'MO' or duration is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck14 check
2 ( standing_type != 'MO' or number_of_points is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck15 check
2 ( standing_type != 'TE' or rider_id is null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck16 check
2 ( standing_type != 'TE' or team_id is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck17 check
2 ( standing_type != 'TE' or duration is not null )
3 /

Table altered.

rwijk@ORA11GR1> alter table standing_lines add constraint ck18 check
2 ( standing_type != 'TE' or number_of_points is null )
3 /

Table altered.

I reran my code and the error message became:

ORA-02290: check constraint (RWIJK.CK17) violated

Ah! I tried to insert a team standing line, but failed to give duration a value. Now that's more like it.

This is a very specific example, but in general you can say:
Whenever a check constraint's predicate contains a top-level AND operator, you should split the constraint.

To know whether your expression contains a top-level AND operator, it's essential that you can calculate with boolean expressions. I mentioned the distributivity rule above, and at that link you'll find some more useful rules. I find that I apply De Morgan's laws very often as well. These allow you to rewrite your expressions with fewer mandatory brackets, so the AND and OR operators move up in the hierarchy.
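These rewrite rules are easy to verify mechanically. As a sanity check, the short Python sketch below exhaustively tests the distributivity rule used above and both De Morgan's laws over every combination of truth values:

```python
from itertools import product

# Check each rewrite rule for every combination of truth values.
for a, b, c in product([True, False], repeat=3):
    # Distributivity: A or (B and C) == (A or B) and (A or C)
    assert (a or (b and c)) == ((a or b) and (a or c))
    # De Morgan: not (A and B) == (not A) or (not B)
    assert (not (a and b)) == ((not a) or (not b))
    # De Morgan: not (A or B) == (not A) and (not B)
    assert (not (a or b)) == ((not a) and (not b))

print("all rewrite rules hold")
```

Note that this verifies two-valued logic only; in SQL, a check constraint also accepts a row when the predicate evaluates to UNKNOWN because of nulls, which is worth keeping in mind when rewriting.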

Whether you want to apply this rule to a BETWEEN operator (which also contains an AND :-) ) is subjective.