Index usage with LIKE operator in Oracle & Domain Indexes

Many developers are unsure when Oracle can use an index for a query with the LIKE operator, especially with patterns such as '%SEARCH-STRING%'. This post shows how an index behaves with each form of LIKE pattern.

Problem Statement

  • While optimizing high-CPU queries in a third-party application, I found that most of them used the LIKE operator with '%' wildcards.
  • Interestingly, some of these queries used an INDEX RANGE SCAN while others used a FULL TABLE SCAN.

Brief about LIKE operator

  • LIKE determines whether a character string matches a specified pattern.
  • % matches any string of any length (including zero length).

Before starting, you should know that:

  • Only the part of the pattern before the first wildcard serves as an access predicate.
  • The remaining characters do not narrow the scanned index range; they only act as a filter that discards non-matching results.

We can use LIKE operator in 4 ways in our queries:

  1. SEARCH-STRING%
  2. %SEARCH-STRING
  3. %SEARCH-STRING%
  4. SEARCH%STRING

1. SEARCH-STRING%

With SEARCH-STRING%, the optimizer can perform an INDEX RANGE SCAN, which retrieves the data in the least possible time when the prefix is selective.

set autotrace traceonly;
select * from sac where object_type like 'TAB%';

Execution Plan
----------------------------------------------------------
   0       SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=85 Card=3 K Bytes=543 K)
   1    0    TABLE ACCESS BY INDEX ROWID EMPL101.SAC (Cost=85 Card=3 K Bytes=543 K)
   2    1      INDEX RANGE SCAN EMPL101.SAC_INDX (Cost=11 Card=3 K)

Here the optimizer knows where the string starts (the prefix 'TAB' is the access predicate), so it uses an Index Range Scan.
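
As a rough illustration of why a prefix works as an access predicate: with the default binary sort order, a prefix search such as 'TAB%' covers the same rows as a closed-open range on the column. The query below is only a sketch of that idea, not what Oracle literally rewrites the statement to. The upper bound 'TAC' is simply the prefix with its last character incremented.

```sql
-- Sketch: same result as object_type LIKE 'TAB%' under binary comparison.
-- 'TAC' is the prefix 'TAB' with its last character incremented by one.
SELECT *
FROM   sac
WHERE  object_type >= 'TAB'
AND    object_type <  'TAC';
```

Both forms give the index a clear start and stop point, which is what a range scan needs.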

2. %SEARCH-STRING

With %SEARCH-STRING, Oracle accesses the FULL table.

set autotrace traceonly;
select * from sac where object_type like '%TAB';

Execution Plan
----------------------------------------------------------
   0       SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=398 Card=16 Bytes=2 K)
   1    0    TABLE ACCESS FULL EMPL101.SAC (Cost=398 Card=16 Bytes=2 K)

This is the opposite case: a LIKE expression that starts with a wildcard. Such a LIKE expression cannot serve as an access predicate.

The database has to scan the entire table if the WHERE clause does not provide another access path.

3. %SEARCH-STRING%

With %SEARCH-STRING%, Oracle also accesses the FULL table.

set autotrace traceonly;
select * from sac where object_type like '%TAB%';

Execution Plan
----------------------------------------------------------
   0       SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=398 Card=3 K Bytes=543 K)
   1    0    TABLE ACCESS FULL EMPL101.SAC (Cost=398 Card=3 K Bytes=543 K)

Here, too, the optimizer does not know which letter the string starts with, so it scans the whole table.

4. SEARCH%STRING

SEARCH%STRING will perform an INDEX RANGE SCAN and generate an initial result set containing the values that match the first part, i.e. SEARCH%. It then filters these values against the second part, i.e. %STRING.

set autotrace traceonly;
select * from sac where object_type like 'TA%BLE';

Execution Plan
----------------------------------------------------------
   0       SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=85 Card=3 K Bytes=543 K)
   1    0    TABLE ACCESS BY INDEX ROWID EMPL101.SAC (Cost=85 Card=3 K Bytes=543 K)
   2    1      INDEX RANGE SCAN EMPL101.SAC_INDX (Cost=11 Card=3 K)
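
To see which part of the pattern is used for index access and which part only filters, one option is to display the plan with DBMS_XPLAN, whose output includes a Predicate Information section. A minimal sketch, assuming the sac table and sac_indx index from this article and a PLAN_TABLE available to the session:

```sql
EXPLAIN PLAN FOR
SELECT * FROM sac WHERE object_type LIKE 'TA%BLE';

-- The Predicate Information section at the bottom of the output lists
-- the access() and filter() predicates for the index range scan step.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```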

Let's look at the chart below for various searches using wildcard characters:

The more selective the part before the first wildcard is, the smaller the scanned index range becomes.
That, in turn, makes the index lookup more efficient.

[Figure: oracle-like-query-result]

  • The first expression has two characters before the wildcard.
  • They limit the scanned index range to 18 rows. Only one of them matches the entire LIKE expression; the other 17 are discarded.
  • The second expression has a longer prefix, which narrows the scanned index range to two rows.
  • With this expression, the database just reads one extra row that is not relevant for the result.
  • The last expression does not have a filter predicate at all.
  • The database just reads the entry that matches the entire LIKE expression.
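
The effect of the prefix length can be checked on your own data by counting how many rows fall into the prefix range versus how many match the whole pattern. A small sketch against the sac table used in this article (the patterns are only illustrative and are not the ones from the chart):

```sql
-- rows_in_prefix_range: entries a range scan on the prefix 'TA' would visit
-- rows_matching:        entries that also satisfy the full pattern
SELECT COUNT(*) AS rows_in_prefix_range,
       COUNT(CASE WHEN object_type LIKE 'TA%BLE' THEN 1 END) AS rows_matching
FROM   sac
WHERE  object_type LIKE 'TA%';
```

The closer the two numbers are, the less work is wasted on entries that are read and then discarded.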

This is how the normal LIKE operator works. But what if you want to use an index in the 2nd and 3rd cases above?

This is where the Oracle*Text utility comes into the picture :)

Oracle*Text utility

The Oracle*Text utility (formerly called Oracle ConText and Oracle Intermedia) allows us to parse through a large text column and index on the words within the column.

Unlike ordinary b-tree or bitmap indexes, an Oracle Text context index is not necessarily updated as the content changes (a ctxcat index, by contrast, is maintained transactionally with DML).

For a context index, such as the one used in the example below, you must therefore decide on a refresh interval.

Oracle provides the SYNC parameter for this. The default is SYNC (MANUAL), which means you must synchronize the index yourself with CTX_DDL.SYNC_INDEX.

SYNC (MANUAL | EVERY "interval-string" | ON COMMIT)

Hence, Oracle Text indexes are only useful for removing full-table scans when the tables are largely read-only and/or the end-users don’t mind not having 100% search recall:

  • The target table is relatively static (e.g. nightly batch updates)
  • Your end-users would not mind “missing” the latest row data
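
If stale results are a concern, the SYNC setting can be given when the index is created. The statement below is a hedged variant of the CREATE INDEX used later in this article; it assumes the SUBSTRING_PREF preference already exists and that the session has the privileges Oracle Text needs. ON COMMIT trades extra work at commit time for fresher search results.

```sql
-- Variant: synchronize the text index automatically at each commit
CREATE INDEX sac_indx ON sac (object_type)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('wordlist SUBSTRING_PREF sync (on commit)');
```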

Let's work through this with an example:

SQL> CREATE TABLE sac AS SELECT * FROM all_objects;
   
   Table created.


SQL> CREATE INDEX sac_indx ON sac(object_type);
   
   Index created.


SQL> set autotrace trace explain

SQL> select * from sac where object_type LIKE 'TAB%';

     Execution Plan
     ----------------------------------------------------------
     0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=128)
     1 0 TABLE ACCESS (BY INDEX ROWID) OF 'SAC' (TABLE) (Cost=1 Card=1 Bytes=128)
     2 1 INDEX (RANGE SCAN) OF 'SAC_INDX' (INDEX) (Cost=1 Card=1)

The example above shows that a % wildcard at the end of the pattern lets Oracle use an index search.

But if the wildcard is at the start of the pattern, the index will not be used. And sensibly so: Oracle does not know where in the index to start searching, because the value could begin with anything from ‘A’ to ‘Z’, ‘a’ to ‘z’, or any digit.

See this.

SQL> SELECT *
     FROM sac
     WHERE object_type LIKE '%ABLE';


	Execution Plan
	----------------------------------------------------------
	0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=148 Card=1004 Bytes=128512)
	1 0 TABLE ACCESS (FULL) OF 'SAC' (TABLE) (Cost=148 Card=1004 Bytes=128512)

So how can an index be used for LIKE-style searches with a leading wildcard? The answer is Domain Indexes.


SQL> connect / as sysdba
Connected.

SQL> grant execute on ctx_ddl to public;
Grant succeeded.


SQL> connect sac;
Connected.

SQL> begin
       -- create a wordlist preference and enable substring indexing on it
       ctx_ddl.create_preference('SUBSTRING_PREF', 'BASIC_WORDLIST');
       ctx_ddl.set_attribute('SUBSTRING_PREF', 'SUBSTRING_INDEX', 'TRUE');
     end;
     /

PL/SQL procedure successfully completed.

  • ctx_ddl.create_preference: Creates a preference in the Text data dictionary.
  • ctx_ddl.set_attribute: Sets a preference attribute. Use this procedure after you have created a preference with CTX_DDL.CREATE_PREFERENCE.

SQL> drop index sac_indx;
Index dropped.

SQL> create index sac_indx on sac(object_type) indextype is ctxsys.context parameters ('wordlist SUBSTRING_PREF memory 50m');
Index created.

SQL> set autotrace trace exp
SQL> select * from sac where contains (OBJECT_TYPE,'%PACK%') > 0;

	Execution Plan
	----------------------------------------------------------
	0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=8 Card=19 Bytes=1786)
	1 0 TABLE ACCESS (BY INDEX ROWID) OF 'SAC' (TABLE) (Cost=8 Card=19 Bytes=1786)
	2 1 DOMAIN INDEX OF 'SAC_INDX' (INDEX (DOMAIN)) (Cost=4)

In this case the index is used, even though the search pattern starts with a wildcard.

Index re-synchronization

Because rebuilding an Oracle Text index (context, ctxcat, ctxrule) requires a full-table scan and lots of internal parsing, it is not practical to use triggers for instantaneous index updates. 

Updating Oracle Text indexes is easy, and the updates can be scheduled using dbms_job or the Oracle 10g dbms_scheduler package. Oracle Text provides the CTX_DDL package with the sync_index and optimize_index procedures:

SQL> EXEC CTX_DDL.SYNC_INDEX('sac_indx');
SQL> EXEC CTX_DDL.OPTIMIZE_INDEX('sac_indx','FULL');

For example, if you create a nightly dbms_scheduler job to call sync_index, your index will be refreshed, but its structure will become sub-optimal over time. Oracle recommends that you periodically run the optimize_index procedure to remove old data and reduce index fragmentation. Index optimization can be performed in several modes, including FAST, FULL and TOKEN.
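
A nightly synchronization job along these lines could be sketched as follows. The job name and the 2 a.m. schedule are assumptions made for illustration; the job owner needs the CREATE JOB privilege and execute rights on CTX_DDL.

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'SAC_INDX_NIGHTLY_SYNC',  -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN CTX_DDL.SYNC_INDEX(''sac_indx''); END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=2; BYMINUTE=0; BYSECOND=0',  -- assumed schedule
    enabled         => TRUE);
END;
/
```

A second job of the same shape, on a less frequent schedule, could call CTX_DDL.OPTIMIZE_INDEX instead.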

In sum, Oracle Text indexes are great for removing unnecessary full-table scans from static Oracle tables, and they can reduce I/O by several orders of magnitude, greatly improving overall SQL performance.

Conclusion

For proximity, soundex and fuzzy searches, use domain indexes.



