(B) if You Were to Write a Truth Table for F1, How Many Rows Evaluate to True? Why?
Helena Markova, April 03, 2001 - 4:05 am UTC
phrase it
A reader, September 26, 2001 - 2:20 pm UTC
Kindly phrase the following query for us
select count(*) from emp where empno not in ( select mgr from emp );
Reader
A reader, September 30, 2002 - 10:01 pm UTC
Tom, Instead of SQL> select count(*) from emp T1 2 where not exists ( select null from emp T2 where t2.mgr = t1.empno ); you could have used SQL> select count(*) from emp T1 2 where not exists ( select mgr from emp T2 where t2.mgr = t1.empno ); Could you tell what circumstances do we use "select null" instead of "select <value>". Are there any advantages
October 01, 2002 - 9:54 am UTC
why select mgr?
I find select null to be semantically more meaningful. You are NOT selecting anything really -- so say that.
Why problem if NULL is there
Tony, October 02, 2002 - 2:02 am UTC
Hi Tom,
Your answer is superb. Can you tell us why there is no record selected for NOT IN when there is NULL?
October 02, 2002 - 10:25 am UTC
Because NULL means -- gee, I don't know. (litterally, null means Unknown) So, the predicate where x not in ( NULL ) evaluates to neither TRUE, nor FALSE ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from dual where dummy not in ( NULL ); no rows selected ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from dual where NOT( dummy not in (NULL) ); no rows selected (you would think one of the two queries would return a row -- but there is a third state for a boolean expression in sql -- "I don't know what the answer is")
select rows using exists
Madhuri Bondre, March 31, 2003 - 6:44 am UTC
Your solution is very useful
but i have one query
i have two tables
first table with item_cd, sub_item_cd (both the columns are in primary key)
second table with sub_item_cd (no primary key)
i want to filter only those records from first table where all sub_item_cd exists in second table
ex.
rows of first table
item_cd sub_item_cd
1010 A
1010 B
1010 C
1011 A
1011 D
1012 A
rows of second table
sub_item_cd
A
A
A
B
B
C
B
C
i need output as
item_cd
1010
1012
how can I get it in one select ?
can exists will solve it?
thanks in advance
March 31, 2003 - 9:02 am UTC
ps$tkyte@ORA920> select t1.item_cd, t1.sub_item_cd, decode( t2.sub_item_cd, NULL, 1, 0 ) 2 from t1, (select distinct sub_item_cd from t2 ) t2 3 where t1.sub_item_cd = t2.sub_item_cd(+) 4 / ITEM_CD S DECODE(T2.SUB_ITEM_CD,NULL,1,0) ---------- - ------------------------------- 1010 a 0 1011 a 0 1012 a 0 1010 b 0 1010 c 0 1011 d 1 6 rows selected. ops$tkyte@ORA920> select t1.item_cd 2 from t1, (select distinct sub_item_cd from t2 ) t2 3 where t1.sub_item_cd = t2.sub_item_cd(+) 4 group by t1.item_cd 5 having max(decode( t2.sub_item_cd, NULL, 1, 0 )) = 0 6 / ITEM_CD ---------- 1010 1012 ops$tkyte@ORA920>
Thanks
Madhuri bondre, April 01, 2003 - 12:31 am UTC
Mirjana, May 05, 2003 - 2:51 am UTC
NOT IN or NOT EXISTS or
Gautam, May 24, 2003 - 11:27 pm UTC
Hi Tom, Your points are very helping, however, in case of having 20 Mill rows in outer table and Mill in inner table none is working, both are too slow. Is there a way to avoid the subquery being executed for retrieval of every row from the outer query.
May 25, 2003 - 9:59 am UTC
give me a "for example" and define "too slow".
on my pc, to full scan a 1,000,000 row table (a copy of all_objects -- the WIDTH * HEIGHT of a table is relevant -- not just the height. I have 1,000,000 tables that consume lots less space then a 10,000 row one) takes about 7 seconds. a 20million -- 20 times as long.
As long as you are using the CBO, and the tables are analyzed, it will use a hash anti join and be very fast.
Gautam, May 27, 2003 - 4:54 pm UTC
Hi Tom, Thanks for your reply.
I have following observations :
Query:
SELECT ind_id, lst_ord_ss, dec_sc_is, lst_ord_is,
ss_group, dec_sc_ss, list_pri, cont_lvl,
site_upd_dt, source_cd, sd_custtyp, site_id,
ss_subgrp
FROM multi_mv
WHERE country_cd = :sys_b_00
AND dsf_index IN (:sys_b_01, :sys_b_02)
AND mail_cont IN (:sys_b_03, :sys_b_04)
AND mail_site = :sys_b_05
AND ss_group IN
(:sys_b_06, :sys_b_07, :sys_b_08, :sys_b_09, :sys_b_10)
AND NOT ind_id IN (SELECT DISTINCT ind_id
FROM ssw.promo
WHERE drop_cd = :sys_b_11)
1. The plan of above query. NL's
SELECT STATEMENT Cost = 193604
FILTER
TABLE ACCESS FULL MULTI_MV
TABLE ACCESS BY INDEX ROWID PROMO
INDEX RANGE SCAN PROM_DROP_CD
2. Adding "ORDER BY int_id"into above Query changes
the plan to seems like MERGE_AJ
SELECT STATEMENT Cost = 118609
TABLE ACCESS BY INDEX ROWID MULTI_MV
INDEX FULL SCAN MV_MULTI_IND_ID
TABLE ACCESS BY INDEX ROWID PROMO
INDEX RANGE SCAN PROM_DROP_CD
3. Changing "NOT IN" to "NOT EXISTS"
SELECT STATEMENT Cost = 198944
SORT ORDER BY
HASH JOIN ANTI
TABLE ACCESS FULL MULTI_MV
TABLE ACCESS FULL PROMO
Why is the CBO choosing Anti join only in case of NOT EXISTS ?
Internally the Anti Hash Join retrives the full row or only the join column ?
The approx time for the query to finish in first 2 cases is 5 days.
I would appreciate any comments on this ?.
May 27, 2003 - 5:51 pm UTC
obviously NO BIND variables on your system eh.
I'll guess (in light of no creates, no knowledge of indexes, no dbms_stats.set_table_stats to let me know how big/how many rows things are, etc etc etc)
Is ind_id in ssw.promo NULLABLE
and if so, what happens when you say:
AND NOT ind_id IN (SELECT DISTINCT ind_id
FROM ssw.promo
WHERE drop_cd = :sys_b_11
and IND_ID IS NOT NULL )
instead (you do know of course that NOT EXISTS and NOT IN are *not* interchangeable, they'll give different results!)
Not In Vs Not Exists !
ramki, May 27, 2003 - 7:41 pm UTC
Tom,
I have 2 tables. Both tables have around 8000 records.
Table 1:
ID
parentID
col1
col2
col3
Table1: Primary key: ID
Table2
Module
ID
processDate
ParentID
dataCol1
dataCol2
source_column
source_fg
Table2 Primary key: Module, Id, processDate
Table 1 is a superset of all Ids that Table 2 has.
ParentIds for Ids in Table1 and Table2 would be same for same Id.
Now, I need to pick those Ids from Table1 that are
- not in Table2
- have same ParentID from Table2.
- for records that have source_column in table2 is not null
- for records in Table2 that have source_fg as null
Table1:
ID ParentId Col1 col2 col3
101 111 10 20 30
102 111 20 40 60
103 112 10 10 10
104 113 10 20 30
105 113 30 40 50
106 114 40 40 50
107 113 50 50 50
Table2:
Module ID ParentId ProcessDate datacol1 datacol2 source_column source_fg
1 101 111 27-MAY-2003 1000 1000 2000
1 103 112 27-MAY-2003 2000 2000 2000 Y
1 104 113 27-MAY-2003 3000 3000 3000
1 106 114 27-MAY-2003 2300 2300 3500
Now, I need the out to be:
102
105
107
Select t1.Id
from Table1 t1, Table2 t2
where not Exists (select null from Table2 where table2.module=1 and
table2.id = t1.id and table2.processDate = sysdate)
and t1.parentId = t2.parentId
and t2.Module = 1
and t2.processDate = sysdate
and t2.source_column is not null
and t2.source_fg is null
Actually, I need to insert these Ids in Table2 and replicate data from their parentIDs in Table2.
so, New records would be
Module ID ParentId ProcessDate datacol1 datacol2 source_column source_fg
1 102 111 27-MAY-2003 1000 1000 2000
1 105 113 27-MAY-2003 3000 3000 3000
1 107 113 27-MAY-2003 3000 3000 3000
I am trying to use, Insert into table2
select ......
Which one would be better: (using RBO)
Select t1.Id, t2.datacol1, t2.datacol2..........
from Table1 t1, Table2 t2
where not Exists (select null from Table2 where table2.module=1 and
table2.id = t1.id and table2.processDate = sysdate)
and t1.parentId = t2.parentId
and t2.Module = 1
and t2.processDate = sysdate
and t2.source_column is not null
and t2.source_fg is null
or
Select t1.Id, t2.datacol1, t2.datacol2..........
from Table1 t1, Table2 t2
where t1.Id not in (select Id from Table2 where table2.module=1 and
table2.processDate = sysdate)
and t1.parentId = t2.parentId
and t2.Module = 1
and t2.processDate = sysdate
and t2.source_column is not null
and t2.source_fg is null
Please advice. Thanks
May 27, 2003 - 8:01 pm UTC
neither, rbo is the wrong technology.
In anycase, it seems you have the test data, benchmark it? it is what I would have to do in order to show you.
rbo would be wrong. rbo would use the least efficient plans for these types of queries. cbo would do it most efficiently.
Problem with my NOT EXISTS statement
Godwin, May 30, 2003 - 8:03 am UTC
Hi Tom,
Its difficult getting you meanwhile this question is similar what is being discussed and I will be grateful if you will mail the reply to godwinofoe@yahoo.com. Please don't ignore reading this.I need an urgent reply to be able to get this report to management.
The problem is in generating a report in Report6i from a University HR database Oracle7.3.2 Server.
The problem is like this:
In a University, staff who retired are taken on contract to work for some time. Also staff who resigned from the university and want to come back are re-engaged.
Now management requested for a report on all active staff (i.e to exclude retirees and include staff re-engaged).
When I wrote a query to produce such report I was not getting the correct result. The query failed to exclude the retired staff. The logic i was using is that the subquery should select all staff who retired and resigned from the university. At that instance of creating their service record, their start date is the same as their end date so the subquery would select by using the condition where the start date equals the end date. When such records of staff are found by the subquery,the outer query would compare its result with the inner query and then exclude any matching records found in the inner query result. I hope i made myself clear.Here is the detail of my query:
SELECT distinct id,name
,INITCAP(LOWER(dname))
,disc,descrip
,startdate,endate
from staff a
,dept b
,jobhist c
,dictator d
,postab e
WHERE a.id= c.stid
AND a.deptno= b.deptno
AND c.indic = d.code
AND c.jobtitle = e.postid
AND NOT EXISTS
(SELECT 1
FROM jobhist j
WHERE j.stid=a.id
AND j.startdate=(SELECT MAX(NVL k.endate,k.startdate))
FROM jobhist k,jobhist n
WHERE k.stid = a.id
AND k.stid=n.stid
AND c.indic IN('01','02','04')
)
)
ORDER BY name asc
This is the result got from the above query:
DeptFile# NameService Indicator PostStart Date End Date
Operations 10003 Mana NEW APPOINTMENT LECTURER 1-Dec-87 20-Jan-99
Operations 10003 Mana re-enganged SENIOR LECTURER 1-Jun-01
Research 10001 Ofoe NEW APPOINTMENT LECTURER 15-Aug-02 15-May-03
Research 10001 Ofoe PROMOTION SENIOR LECTURER 15-May-03
Sales 10002 John NEW APPOINTMENT Junior Admin Asst 14-Jun-75 15-May-88
Sales 10002 John PROMOTION LECTURER 16-May-90 11-Jun-00
Sales 10002 John PROMOTION SENIOR ADMIN 15-May-88 30-May-90
Sales 10002 John PROMOTION SENIOR LECTURER 13-Jun-00 30-Jul-02
The correct output should have excluded staff called John since he is retired.
Here are the records stored in the various tables:
Jobhist Table
File# Ser.Indicator Postcode StartDate End Date
10001 A1 5 15-Aug-02 15-May-03
10001 A2 6 15-May-03
10002 A3 7 14-Jun-75 15-May-88
10002 A3 6 15-May-88 30-May-90
10002 A1 6 16-May-90 11-Jun-00
10002 A2 6 13-Jun-00 30-Jul-02
10002 A2 1 30-May-03 30-May-03
10003 A1 5 1-Dec-87 20-Jan-99
10003 A2 4 1-Jul-00 1-Jul-00
10003 A2 3 1-Jun-01
Staff table:
File# Name Hiredate StaffType Deptno Birthdate
10001 Ofoe 12-Aug-02 3 20
10004 Kofi 1-Jun-75 3 10 15-Jun-40
10002 John 12-May-14 3 30 12-Jun-70
10003 Mana 12-Sep-87 2 40 17-Jul-56
Service Indicator Table(dictator):
Code Description
1 retired
2 contract
3 re-enganged
4 resigned
5 NEW APPOINTMENT
6 PROMOTION
Is there a mistake I made, is there a different way of achieving the result or is SQL incapable of achieving the result? I need your help please. Thanks for your help.
May 30, 2003 - 8:40 am UTC
well, first - i don't read every review, most but not all. Also, I read them as I have time -- not on any sort of set schedule
second, I'm not emailing people on individual followups. I'd spend the other half of my life doing that (leaving no half for me)
third, the example you put forth is really really really hard to follow. Your sample tables for example haven't many columns at all that match the columns in your query. I don't understand your datamodel (as you should). So, I'll make a simple guess
sounds like you want STAFF records such that their last (most recent) record in JOBHIST does not have an indic code in ( 1, 2, 4 )
so, I think we can ignore enddate all togther, just find the max startdate for each person. we can start with:
select file#, max( to_char(startdate,'yyyymmddhh24miss') || indic )
from jobhist
group by file#
Now, that will give us a set of file# (which I assume is the key into STAFF) and their max startdate value ALONG WITH THE INDIC code for that max date.
We can then:
select file#
from (
select file#, max( to_char(startdate,'yyyymmddhh24miss') || indic ) data
from jobhist
group by file#
)
where substr( data, 15 ) not in ( '01', '02', '04' )
/
That'll be the set of ACTIVE STAFF file#'s.
Now you can just
select *
from staff
where file# in ( <that query> );
You got what i mean but not getting the actual result I want
Godwin, June 02, 2003 - 8:59 am UTC
Hi Tom,
Thanks for the response.
With reference to your understanding "sounds like you want STAFF records such that their last (most recent) record in
JOBHIST does not have an indic code in ( 1, 2, 4 )"
What i mean is that staff who resign(4) or retired(1)
shouldn't have their names or previous service records in the JOBHIST table appeared in the report at all.
Take it as that the STAFF TABLE is the MASTER Table with the primary key fileNo and the JOBHIST table is the DETAIL TABLE with FileNo as the foreign Key.
Your solutions you gave produced the names and previous service records of staff excluding their most recent records that has indic code in (1,2,4).
We don't the records of those whose most recent records that has indic code in (1,2,4) appeared.
Can't there be away of achieving that?
Thanks for your help.
June 02, 2003 - 9:16 am UTC
you say
"we don't the records ... with 1,2,4 appeared"
If I read that to mean "why don't" -- then it is because I filtered them:
select file#
from (
select file#, max( to_char(startdate,'yyyymmddhh24miss') || indic ) data
from jobhist
group by file#
)
where substr( data, 15 ) not in ( '01', '02', '04' )
/
Look at the concept there. (you should be able to fix this, it is after all your set of tables???)
That should -- if you've explained things accurately -- return the set of primary keys from jobhist for joining to the other table. The set of records this returns is the MAX startdate record for everyone in jobhist (their LAST ENTRY) if and only if their indic code isn't in 1, 2, 4
Take the technique, verify the results -- apply the technique to your problem.
Since I
a) don't know your tables
b) don't know your keys
c) am having a hard time following your explaination
this is the best I can do.
It was wonderful
Godwin, June 02, 2003 - 10:53 am UTC
Hi Tom,
It is working perfectly.
Thanks for your patience although my questions were not explanatory enough,your guess were correct.I made some mistakes in implementing the solution.
Just as you said,I will now fine tune the query so i can get the records from more than one tables. I will do my best to get accurate result but please bear with me if i should get back to you for additional help.
God bless!!
avoid correlated subquery
umesh, August 10, 2003 - 8:55 am UTC
SELECT
count(*)
FROM
AER A,
AER_PRODUCT B ,
AER_PRODUCT_APPROVAL_NO P ,
AER_REACT E
WHERE
A.AER_DELETED = '0' AND
B.RECORD_DELETED != '1' AND
E.RECORD_DELETED != '1' AND
(B.NOT_RELEVANT IS NULL) AND
(E.NOT_RELEVANT IS NULL) AND
B.TRADE_NAME LIKE :1 AND
(
( B.PRODUCT_TYPE_MULTI = '0'
AND
(B.PRODUCT_TYPE = '1' OR B.PRODUCT_TYPE = '2' OR B.PRODUCT_TYPE = '4' )
)
OR
( B.PRODUCT_TYPE_MULTI = '1'
AND
( ( P.PRODUCT_TYPE = '1' OR P.PRODUCT_TYPE = '2' OR P.PRODUCT_TYPE = '4' )
AND P.COUNTRY_CODE = NVL(A.AER_COUNTRY, '078')
AND B.SEQ_PRODUCT = P.SEQ_PRODUCT AND B.AER_ID = P.AER_ID
)
OR
(NOT EXISTS
( SELECT Y.APPROVAL_NO FROM AER_PRODUCT_APPROVAL_NO Y
WHERE Y.COUNTRY_CODE = NVL(A.AER_COUNTRY, '078')
AND B.SEQ_PRODUCT = Y.SEQ_PRODUCT AND B.AER_ID = Y.AER_ID
)
AND
B.PRODUCT_TYPE != '3'
))
)
AND
(B.PRODUCT_FLAG = '1' OR B.PRODUCT_FLAG = '4') AND
B.AER_ID = A.AER_ID AND
E.AER_ID = A.AER_ID AND
E.EVENT_TYPE != '2'
ORDER BY A.AER_NO, A.COMPOSITE_VERSION_NO DESC , A.AER_ID, E.AUTO_RANK, B.AUTO_RANK
Tables P.KEY
AER AER_ID
AER_PRODUCT AER_ID,SEQ_PRODUCT
AER_APPROVAL_NO
AER_ID,SEQ_PRODUCT,SEQ_APPROVAL_NUMBER
i WANT TO ELIMINATE THE CORRELATED SUBQRY
PLEASE HELP
A reader, September 04, 2003 - 6:39 pm UTC
Tom, Can you please take a look and advice on this query select map, todmdunit, todmdgroup, todfuloc, tomodel, fromdmdunit, fromdmdgroup, fromdfuloc, frommodel, eff from stsc.dfumap a where scen = 0 and not exists ( select * from stsc.dfu b where scen = 0 and a.fromdmdunit = b.dmdunit and a.fromdmdgroup = b.dmdgroup and a.fromdfuloc = b.loc and a.frommodel = b.model ) / no rows selected Elapsed: 00:00:30.29 Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=59587 Bytes=3634807) 1 0 FILTER 2 1 INDEX (FAST FULL SCAN) OF 'DFUMAP1' (NON-UNIQUE) (Cost=2 Card=59587 Bytes=36 34807) 3 1 INDEX (UNIQUE SCAN) OF 'DFU1' (UNIQUE) (Cost=2 Card=1 Bytes=23) Statistics ---------------------------------------------------------- 0 recursive calls 50 db block gets 3590173 consistent gets 0 physical reads 0 redo size 686 bytes sent via SQL*Net to client 319 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed SQL> desc dfu Name Null? Type ----------------------------------------------------------------------- -------- ---------------------------- DMDUNIT NOT NULL VARCHAR2(30) DMDGROUP NOT NULL VARCHAR2(10) LOC NOT NULL VARCHAR2(10) HISTSTART NOT NULL DATE EFF NOT NULL DATE DISC NOT NULL DATE FCSTHOR NOT NULL NUMBER(38) DMDCAL NOT NULL VARCHAR2(18) DMDPOSTDATE NOT NULL DATE MODELDATE NOT NULL DATE STATMSE NOT NULL FLOAT(126) MAXHIST NOT NULL NUMBER(38) TOTFCSTLOCK NOT NULL NUMBER(38) LOCKDUR NOT NULL NUMBER(38) REFITDATE NOT NULL DATE MASK NOT NULL VARCHAR2(18) SCEN NOT NULL NUMBER(38) SCENBITS NOT NULL NUMBER(38) MAPUSED NOT NULL VARCHAR2(18) NETFCSTMSE NOT NULL FLOAT(126) NETFCSTMSESMCONST NOT NULL FLOAT(126) NETFCSTERROR NOT NULL FLOAT(126) NEGFCSTSW NOT NULL NUMBER(38) AUTOADJFACTOR NOT NULL FLOAT(126) MODEL NOT NULL VARCHAR2(18) AUTOADJMIN NOT NULL FLOAT(126) AUTOADJMAX NOT NULL FLOAT(126) PUBLISHDATE NOT NULL DATE UDC_DFU_HIERARCHY VARCHAR2(50) UDC_MODELNUM NOT NULL NUMBER(38) UDC_ABCINDICATOR VARCHAR2(50) UDC_DELETERECSW NOT NULL NUMBER(38) UDC_IMPORTSW NUMBER(38) UDC_NEWAUTOGENSW NOT NULL NUMBER(38) SQL>desc dfumap Name Null? Type ----------------------------------------------------------------------- -------- ---------------------------- MAP NOT NULL VARCHAR2(18) TODMDUNIT NOT NULL VARCHAR2(30) TODMDGROUP NOT NULL VARCHAR2(10) TODFULOC NOT NULL VARCHAR2(10) FROMDMDUNIT NOT NULL VARCHAR2(30) FROMDMDGROUP NOT NULL VARCHAR2(10) FROMDFULOC NOT NULL VARCHAR2(10) EFF NOT NULL DATE DISC NOT NULL DATE FACTOR NOT NULL FLOAT(126) LEVELNUM NOT NULL NUMBER(38) FCSTTYPE NOT NULL NUMBER(38) HISTTYPE NOT NULL NUMBER(38) SCEN NOT NULL NUMBER(38) SCENBITS NOT NULL NUMBER(38) CONVFACTOR NOT NULL FLOAT(126) LOCKSTART NOT NULL DATE LOCKEND NOT NULL DATE TOMODEL NOT NULL VARCHAR2(18) FROMMODEL NOT NULL VARCHAR2(18) UDC_NEWAUTOGENSW NOT NULL NUMBER(38) SQL>select count(*) from dfu; COUNT(*) ---------- 770946 SQL>select count(*) from dfumap; COUNT(*) ---------- 1191740 parameters for optimizer_index_caching integer 80 optimizer_index_cost_adj integer 25
September 05, 2003 - 3:33 pm UTC
try using NOT IN instead of EXISTS
from stsc.dfumap a
where scen = 0
and (fromdmdunit, ... ) not in
( select b.dmdunit, b.dmdgroup, b.loc, b.model
from stsc.dfu b
where scen = 0
)
Sometimes not exists does not work!!!
Aldo bravo, October 06, 2003 - 4:31 pm UTC
Hi Tom, See this example: SQL> drop table t1; drop table t1 * ERROR at line 1: ORA-00942: table or view does not exist SQL> SQL> create table t1 (f1 varchar(10), 2 f2 varchar(10), 3 f3 varchar(10)); Table created. SQL> SQL> insert into t1 values ('1','5','3'); 1 row created. SQL> insert into t1 values ('4','4','6'); 1 row created. SQL> commit; Commit complete. SQL> SELECT f1,f2,f3 2 FROM t1 a 3 WHERE a.f2 <= '5' 4 AND NOT EXISTS (SELECT '1' 5 FROM t1 b 6 WHERE b.f2 <= '5' 7 AND b.f2 > a.f2); F1 F2 F3 ---------- ---------- ---------- 1 5 3 SQL> analyze table t1 compute statistics; Table analyzed. SQL> SELECT f1,f2,f3 2 FROM t1 a 3 WHERE a.f2 <= '5' 4 AND NOT EXISTS (SELECT '1' 5 FROM t1 b 6 WHERE b.f2 <= '5' 7 AND b.f2 > a.f2); no rows selected What can you say about it? If I delete the stats it works: SQL> exec dbms_stats.delete_table_stats(tabname=> 'T1',ownname=>'ESCMANAGER'); PL/SQL procedure successfully completed. SQL> / F1 F2 F3 ---------- ---------- ---------- 1 5 3 It is a 9.2.0.4 database on Windows 2003. Thanks,
October 06, 2003 - 6:22 pm UTC
first -- thank you thank you thank you for such a nice simple testcase. bravo
Second, I filed Bug No: 3178622 on this -- obviously, a problem.
implicit cast?
freek, October 07, 2003 - 8:04 am UTC
Tom,
Could the problem above be explained by an implicit cast?
Where the optimizer is after the analyze aware of the fact the the datatype of the column is varchar2 and does not convert the literal to a number?
freek@DEV01> select (case when to_char(x) > to_char(y) then 'X bigger' else 'Y bigger' end)
2 from ( select 2 x, 10 y
3 from dual
4 );
(CASEWHE
--------
X bigger
Elapsed: 00:00:00.00
if you modify the example to use different values, you get following result
freek@DEV01> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
Elapsed: 00:00:00.00
freek@DEV01> drop table t;
Table dropped.
Elapsed: 00:00:00.00
freek@DEV01> create table t1
2 ( f1 varchar(10),
3 f2 varchar(10),
4 f3 varchar(10)
5 );
Table created.
Elapsed: 00:00:00.00
freek@DEV01> insert into t1 values ('1','2','3');
1 row created.
Elapsed: 00:00:00.00
freek@DEV01> insert into t1 values ('4','10','6');
1 row created.
Elapsed: 00:00:00.00
freek@DEV01> commit;
Commit complete.
Elapsed: 00:00:00.00
freek@DEV01> select f1,f2,f3
2 from t1 a
3 where a.f2 <= '20'
4 and not exists ( select '1'
5 from t1 b
6 where b.f2 <= '20'
7 and b.f2 > a.f2
8 );
F1 F2 F3
---------- ---------- ----------
1 2 3
Elapsed: 00:00:00.00
freek@DEV01> analyze table t1 compute statistics;
Table analyzed.
Elapsed: 00:00:00.00
freek@DEV01> select f1,f2,f3
2 from t1 a
3 where a.f2 <= '20'
4 and not exists ( select '1'
5 from t1 b
6 where b.f2 <= '20'
7 and b.f2 > a.f2
8 );
F1 F2 F3
---------- ---------- ----------
1 2 3
Elapsed: 00:00:00.00
now if you change the query to use an to_number
freek@DEV01> select f1,f2,f3
2 from t1 a
3 where to_number(a.f2) <= 20
4 and not exists ( select 1
5 from t1 b
6 where to_number(b.f2) <= 20
7 and to_number(b.f2) > to_number(a.f2)
8 );
F1 F2 F3
---------- ---------- ----------
4 10 6
October 07, 2003 - 8:36 am UTC
nope, it is a bug - if the SAME query against the SAME data returns DIFFERENT answers based solely on the plan developed -- that is a bug.
RE: sometimes not exists does not work...
Mark A. Williams, October 07, 2003 - 5:35 pm UTC
Tom, Is there some environmental/config setting that could make a difference? I have 9.2.0.4.0 on a WinXP laptop and can not duplicate... Is this behavior specific to Windows Server 2003? I could not duplicate on 9.2.0.4.0 on Solaris, AIX, or Linux as well... (I am going to install Windows Server 2003 and test on that tomorrow) SQL> select * from v$version; BANNER ---------------------------------------------------------- Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production PL/SQL Release 9.2.0.4.0 - Production CORE 9.2.0.3.0 Production TNS for 32-bit Windows: Version 9.2.0.4.0 - Production NLSRTL Version 9.2.0.4.0 - Production 5 rows selected. SQL> create table t1 2 ( 3 f1 varchar(10), 4 f2 varchar(10), 5 f3 varchar(10) 6 ); Table created. SQL> insert into t1 values ('1','5','3'); 1 row created. SQL> insert into t1 values ('4','4','6'); 1 row created. SQL> commit; Commit complete. SQL> SELECT f1, 2 f2, 3 f3 4 FROM t1 a 5 WHERE a.f2 <= '5' 6 AND NOT EXISTS (SELECT '1' 7 FROM t1 b 8 WHERE b.f2 <= '5' 9 AND b.f2 > a.f2); F1 F2 F3 ---------- ---------- ---------- 1 5 3 1 row selected. SQL> analyze table t1 compute statistics; Table analyzed. SQL> SELECT f1, 2 f2, 3 f3 4 FROM t1 a 5 WHERE a.f2 <= '5' 6 AND NOT EXISTS (SELECT '1' 7 FROM t1 b 8 WHERE b.f2 <= '5' 9 AND b.f2 > a.f2); F1 F2 F3 ---------- ---------- ---------- 1 5 3 1 row selected. SQL> Thanks, Mark
October 07, 2003 - 6:18 pm UTC
absolutely -- any non-default optimizer related parameters will affect this. It's all about the plan that was generated and many things will affect that.
I can replicate on Windows 2000 as well
Martin, October 08, 2003 - 2:52 am UTC
Workaround
Aldo Bravo, October 08, 2003 - 4:16 pm UTC
If you use /*+ nounnest +/ hint or if you set the parameter _unnest_subquery=false in the spfile the problem is solved.
There are many bugs related to this issue (2739068-2700474-3085399-2829723) and in one of them I found the workaround.
Thanks!!
rewriting "not in"
A reader, February 19, 2004 - 9:17 am UTC
I have seen a "not in" query written as:
select a.key
from t1 a,
t2 b
where a.key = b.key(+) and
b.key is null
/
What do you think of this?
February 19, 2004 - 11:32 am UTC
it is the "poor mans hash anti-join", a good idea in the olden days of the RBO, not so good in the days of a smarter CBO. start with: ops$tkyte@ORA920PC> create table t1 as select * from all_objects where rownum <= 5000; Table created. ops$tkyte@ORA920PC> create table t2 as select * from all_objects where rownum <= 4950; Table created. ops$tkyte@ORA920PC> create index t2_idx on t2(object_id); Index created. select count(*) from t1 rbo where object_id not in ( select object_id from t2 ) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 8.63 8.53 129066 197295 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 8.64 8.53 129066 197295 0 1 select count(*) from t1 rbo where NOT EXISTS (select null from t2 where t2.object_id = rbo.object_id ) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.08 0.13 83 10075 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.09 0.13 83 10075 0 1 select count(*) from t1, t2 rbo where t1.object_id = rbo.object_id(+) and rbo.object_id IS NULL call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.08 0.11 72 5087 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.09 0.11 72 5087 0 1 <b>so, it looks "good" right? better than the alternatives in RBO, yes. but, analyze and</b> select count(*) from t1 cbo where object_id not in ( select object_id from t2 ) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.04 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.06 0.08 0 91 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.06 0.12 0 91 0 1 <b>hmmm, rbo not so good -- "user written hash anti join under rbo" not so good... the cbo does OK though with the user written one:</b> select count(*) from t1, t2 cbo where t1.object_id = cbo.object_id(+) and cbo.object_id IS NULL call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.08 0.10 0 91 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.08 0.11 0 91 0 1 <b>but I find the not in to be infinitely easier to code and alot more "informational" (clear,easy to understand)</b>
Union in NOT EXISTS
Ishaque Hussain, February 19, 2004 - 10:46 pm UTC
Hi Tom, I don't have any statistics to provide you but could you look at query A and query B and advise which would be more efficient to write.
Query A)
select *
from contr_prod_price cpp,
where cpp.contr in (
select contr
from contr
where co = 'E')
and cpp.prod_cd = 'PX'
and not exists
(select 'x'
from contr_prod_prc_low_lvl_ind cpli
where cpp.contr = cpli.contr
and cpp.contr_frmla = cpli.contr_frmla
and cpp.contr_lvl = cpli.contr_lvl
union
select 'x'
from contr_prod_prc_low_lvl_dir cpld
where cpp.contr = cpld.contr
and cpp.contr_frmla = cpld.contr_frmla
and cpp.contr_lvl = cpld.contr_lvl )
Query B)
select *
from contr_prod_price cpp,
where cpp.contr in (
select contr
from contr
where co = 'E')
and cpp.prod_cd = 'PX'
and not exists
(select 'x'
from contr_prod_prc_low_lvl_ind cpli
where cpp.contr = cpli.contr
and cpp.contr_frmla = cpli.contr_frmla
and cpp.contr_lvl = cpli.contr_lvl )
union
select *
from contr_prod_price cpp
where cpp.contr in (
select contr
from contr
where co = 'E')
and cpp.prod_cd = 'PX'
and not exists
(select 'x'
from contr_prod_prc_low_lvl_dir cpld
where cpp.contr = cpld.contr
and cpp.contr_frmla = cpld.contr_frmla
and cpp.contr_lvl = cpld.contr_lvl )
Thanks
February 20, 2004 - 7:22 am UTC
well, given that they do not return the same results "in general" (the 2cnd will do a distinct on the result set)....
and that I might really consider a NOT IN....
I would probably go with the first, but I would benchmark them (since I had both)
INTERSECT , MINUS
Arangaperumal, April 28, 2004 - 7:57 am UTC
Hi Tom,
1.EXISTS plays same role as INTERSECT
2.NOT EXISTS plays same role as MINUS.
am i right?
So, the same rule can be applied?
NOT IN Query
Reader, May 25, 2004 - 1:19 am UTC
Hi Tom, The following are 2 tables,(external_documents and external_document_links).They are linked by external_documents.NAME and external_document_links.EXTERNAL_DOCUMENT_NAME columns. SQL> select * from external_documents --(Primary Key: NAME) NAME ---------------------- corp-bro2.jpg cvs_structure.doc corp-bro.jpg Raid.pdf AutomaticUndoInternals.pdf 5 rows selected ------------------------------------------------------------- SQL> select * from external_document_links; --(All the 4 columns are part of the primary key) FORM DOCUMENT_NAME DOCUMENT_VERSION EXTERNAL_DOCUMENT_NAME ---------- ----------------------- ----------------------------- form_1 document_1 1 corp-bro2.jpg form_1 document_1 1 Raid.pdf form_2 document_2 1 AutomaticUndoInternals.pdf form_2 document_1 1 Raid.pdf PO po_4000542_2_69 1 Raid.pdf PO po_4000542_2_69 1 AutomaticUndoInternals.pdf form_3 document_1 1 Raid.pdf form_1 document_2 1 corp-bro.jpg PO po_4000542_2_69 1 corp-bro.jpg PO po_4000542_2_69 1 corp-bro2.jpg PO po_4000542_2_69 1 cvs_structure.doc 11 rows selected. ------------------------------------------------------------- The input to the query will be external_document_links.form,external_document_links.document_name and external_document_links.document_version. Based on this criteria I need to retrive documents linked to the above 3 columns with a flag as 'Y' and rest of the documents from external_documents table with the flag as 'N' I have written a query which goes like this :- SQL> select a.name NAME,'Y' PRESENT from external_documents a,external_document_links b where a.name=b.EXTERNAL_DOCUMENT_NAME AND b.form ='form_1' and b.DOCUMENT_VERSION =1 and b.DOCUMENT_NAME='document_1' union select name ,'N' from external_documents where name not in (select a.name NAME from external_documents a,external_document_links b where a.name=b.EXTERNAL_DOCUMENT_NAME AND b.form ='form_1'and b.DOCUMENT_VERSION =1 and b.DOCUMENT_NAME='document_1') So I get the right output as follows :- -------------------------------------------------------------------------------- NAME P ------------------------------ - AutomaticUndoInternals.pdf N Raid.pdf Y corp-bro.jpg N corp-bro2.jpg Y cvs_structure.doc N 5 rows selected ---------------------------------------------------------------- Can you please tell me if there is any other way in which this query can be re-written, i.e without using the 'NOT IN' clause.
May 25, 2004 - 7:12 am UTC
sigh, no table creates. no insert intos. no easy way to play with the data.
seems you want every row from external_documents, and if that document has a mate in external_document_links -- show Y else show N
select name, nvl( (select 'Y'
from external_document_links A
where a.name = b.external_document_name
and b.form = :bind1
and b.document_version = :bind2
and b.document_name = :bind3 ), 'N' ) present
from external_documents;
probably works, but I couldn't really run it or anything.
NOT IN Query
Reader, May 26, 2004 - 1:00 am UTC
Hi Tom
Thanks a TON, Sorry I did not provide the necesssary scripts for table and data creation.
You are the greatest.
NOT IN for REMOTE site
Erik, May 27, 2004 - 5:27 am UTC
Hello Tom,
I would like to ask you for one our bad sql which uses "not in" for remote site. For some sites we have slow network and the package takes about 10min. (it transfers about 100rows, runs once per hour). Do you have any suggestion for us how can we improve performance of the bad sql?
Thank you.
SET AUTOTRACE TRACEONLY
-- insert into monitor.promon@tesa
select
PROINUM1,
PROINUM2,
PROAPPL,
--lc_host,
PROSTAT,
upper(PROPROG),
PROMESS,
PRODATE,
upper(PROUSER),
PROTYPE
from
r4.promon
where
prodate > sysdate-1/24 and --ld_prodate and
prodate < sysdate and --ld_currentDate and
UPPER(proprog) not in (select UPPER(batch) from monitor.promon_badbatch@tesa);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2069 Card=1 Bytes=100)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'PROMON' (Cost=2069 Card=1 Bytes=100)
3 1 REMOTE* (Cost=1 Card=1 Bytes=17) TESA.WORLD
3 SERIAL_FROM_REMOTE SELECT "BATCH" FROM "MONITOR"."PROMON_BADBATCH" "PROMON_BADBATCH"
WHERE LNNVL(UPPER(:1)<>UPPER("BATCH"))
May 27, 2004 - 9:27 am UTC
might try
from promon a, (select distinct upper(batch) batch
from monitor.promon_badbatch@tesa ) b
where upper(a.proprog) = b.batch(+)
and b.batch is null
and .....
(and consider removing the need to upper things like that all of the time by standardizing on upper or lower case for those fields....)
NOT IN for REMOTE site
Erik, May 31, 2004 - 12:40 pm UTC
Tom, thank you very much.
But, could you explain me please, why does my "NOT IN" is processed as correlated subquery? I thought that it should not be correlated.
May 31, 2004 - 2:11 pm UTC
why did you think that? it is all just syntax, the optimizer can and does do whatever it wants under the covers.
Intreasting
Riaz Shahid, June 01, 2004 - 3:10 am UTC
Thats a pretty intresting thread. Tom would you please through some light on query:
select count(*)
from t1, t2 cbo
where t1.object_id = cbo.object_id(+)
and cbo.object_id IS NULL
We are (outer) joing the two tables on the basis of object_id and on the other hand we are saying "cbo.object_id IS NULL".
I can't understand how it will be executed. Can you please tell us in psuedo code ???
June 01, 2004 - 8:34 am UTC
if you understand an outer join -- basically every row in T1 will be in the result set, regardless of whether there was a 'mate' in t2 to be joined to. So, t1 will be retrieved from, and if there is a row(s) in t2 for that object_id -- the join will happen (and t2.object_id will be NOT NULL of course since we did the join). Else, the column values for t2 will be set to NULL and the row from t2 will be output. consider: ops$tkyte@ORA9IR2> create table t1 ( x int ); Table created. ops$tkyte@ORA9IR2> create table t2 ( x int ); Table created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> insert into t1 values(1); 1 row created. ops$tkyte@ORA9IR2> insert into t1 values(2); 1 row created. ops$tkyte@ORA9IR2> insert into t2 values(1); 1 row created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> select * from t1,t2 where t1.x = t2.x(+); X X ---------- ---------- 1 1 2 <b>the outer join says "all rows in t1 will be output, no matter what". So, all we need to do is keep the rows where t2.x (cbo.object_id) is NULL -- those are the rows in t1 such that there was no row in T2 to join to -- just like a not in</b>
Excellent Explanation
Riaz Shahid, June 01, 2004 - 9:37 am UTC
I got it.
Love the way you described it...
NOT IN for REMOTE site
Erik, June 01, 2004 - 11:20 am UTC
I thought that this should not be processed as correlated:
... UPPER(proprog) not in (select UPPER(batch) from monitor.promon_badbatch@tesa);
But the excecution plan showed (I think that :1 = promon.proprog):
SELECT "BATCH" FROM "MONITOR"."PROMON_BADBATCH" "PROMON_BADBATCH" WHERE LNNVL(UPPER(:1)<>UPPER("BATCH"))
I did little test:
create table t1 (id number); create table t2 (id number);
insert into t1 values (1); insert into t1 values (2);
insert into t2 values (2);
analyze table t1 compute statistics;
analyze table t2 compute statistics;
I)
select * from t1 where id not in (select id from t2);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'T1'
3 1 TABLE ACCESS (FULL) OF 'T2'
II)
select * from t1 where id not in (select id from t2@ee);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'T1' (Cost=3)
3 1 REMOTE* (Cost=3) EE.WORLD
3 SERIAL_FROM_REMOTE SELECT "ID" FROM "T2" "T2" WHERE LNNVL("ID"<>:1)
1) Is the first SQL processed same as the second - but only explain plan did not show it?
2) If not - why the 2nd is processed as correlated (I think that :1 = t1.id)?
Thank you.
June 01, 2004 - 2:59 pm UTC
no it is not. you are missing a NOT NULL on there (on t2.id) (would make it better).
different queries, different plans.
Into the Wayback Machine (wayback up the page)
Duke Ganote, September 04, 2004 - 12:02 pm UTC
Your two basic queries for comparison were: SQL> select count(*) from emp where empno not in ( select mgr from emp ); COUNT(*) ---------------- 0 SQL> select count(*) from emp T1 2 where not exists ( select null from emp T2 where t2.mgr = t1.empno ); COUNT(*) ---------------- 8 The first fails because of 3-value logic; it's easy to forget that NULL != NULL. The query might as well be: SQL> select count(*) from emp where empno not in ( select null from dual ); COUNT(*) ---------------- 0 The first query could be rectified to account for 3-value logic; for example: SQL> select count(*) from emp where to_char(empno) 2 NOT IN ( select NVL(to_char(mgr),'null') from emp ); COUNT(*) ---------------- 8 Is that correct? And the reason anti-joins can be efficient if there are no NULLs is because a standard index doesn't include NULL values?
September 04, 2004 - 1:04 pm UTC
it would be more correct to code:
not in ( select mgr from emp where mgr is not null )
converting a number to a string would not be a good thing and it is more "self explanatory" what is happening with the "is not null"
It is not a direct cause/effect. create index on emp(mgr,empno) would have an entry for each and every row since EMPNO is not null -- so it is not correct to say that indexes do not include null values -- starndard b*tree indexes do not include ENTIRELY null key entrys but in general they have nulls.
Into the Halfwayback Machine...
Duke Ganote, September 05, 2004 - 11:37 pm UTC
Looking at the "poor mans hash anti-join", there were 3 alternatives evaluated:
select count(*) from t1 rbo
where object_id not in ( select object_id from t2 );
select count(*) from t1 rbo
where NOT EXISTS (select null from t2 where t2.object_id = rbo.object_id );
select count(*) from t1, t2 rbo
where t1.object_id = rbo.object_id(+) and rbo.object_id IS NULL;
Yet none seemed "set oriented". I tried another query that, to me, seems clearer:
select count(*) from t1 rbo,
( select object_id from t1 MINUS select object_id from t2 ) SET_OP
where SET_OP.object_id = RBO.object_id
/
Under RBO, it won with only 225 LIOs (compared with 197295, 10075, and 5087 LIOs):
select count(*) from t1 rbo,
( select object_id from t1 MINUS select object_id from t2 ) SET_OP
where SET_OP.object_id = RBO.object_id
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.06 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.04 0.04 0 225 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.04 0.10 0 225 0 1
However, under the CBO, it came in 3rd with 166 LIOs (more than 91 for 2 others).
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.03 0.03 0 166 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.03 0.03 0 166 0 1
I just find it interesting that logically-equivalent queries have such different implementations.
September 06, 2004 - 10:05 am UTC
select count(*) from t1 rbo
where object_id not in ( select object_id from t2 );
isn't set oriented? actually -- they are all set oriented, no procedural code, no ordered array semantics (as you get with analytics)
you have "give me everything in this set that isn't in this other set", vs "get me everything in this set such that is doesn't exist in the other set", vs "outer join set A to set B and keep rows such that .... is true"
that last one you added really seems "cleaner" than
select count(*) from t1 rbo
where object_id not in ( select object_id from t2 );
does?!?! really?
in & exists
A reader, September 06, 2004 - 11:41 am UTC
hi
in one of your first replies you mentioned this
+++++++++++++++++++++++++
So, the predicate
where x not in ( NULL )
evaluates to neither TRUE, nor FALSE
How about
where x in ( NULL )
that evaluates to neither true or false right?
So isnt IN and EXISTS different as well (like NOT IN and NOT EXISTS)?
September 06, 2004 - 11:59 am UTC
where in and where exists work the same.
where not in and where not exists work differently.
NULL does not affect the behaviour of "in ( set with nulls )" whereas NULL does affect the behaviour of "not in ( set with nulls )"
Clean is in the eyes of the beholder...
Duke Ganote, September 06, 2004 - 11:34 pm UTC
Truly all examples were non-procedural, but I think of set operations in terms of Venn diagrams and operations like MINUS, UNION, and INTERSECT. To my taste, the cleanest query for a simple count is:
select count(*) from
( select object_id from t1 MINUS select object_id from t2 ) RBO_SET
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.05 0.18 143 150 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.06 0.20 143 150 0 1
which has a scant 150 LIOs under the RBO, and an equal-to-the-best 91 LIOs under CBO:
select count(*) from
( select object_id from t1 MINUS select object_id from t2 ) CBO_SET
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.02 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.02 0.02 0 91 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.02 0.04 0 91 0 1
The other 3 queries could easily obtain information from T1 by replacing the "count(*)", so I augmented the simple MINUS with a join back to T1 in my previous query.
BTW, I appreciate your new "LINKS I LIKE" tab! Thank you.
Strange behaviour of IN with OR
Mike, November 10, 2004 - 8:48 am UTC
I have the most bizarre CBO'ness (for what of a better word!) that I can not fathom. Below are the tear down and set up needed.
select * from v$banner
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
PL/SQL Release 9.2.0.4.0 - Production
CORE 9.2.0.3.0 Production
TNS for HPUX: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
--
-- Clear down
--
drop table HAS_PERMISSION_UNIT_TEST
/
drop table PLT_NODE_PERMISSIONS
/
drop table PLT_SECURITY_NODES
/
drop table PLT_USER_DELEGATES
/
drop table PLT_XDI_PERMISSIONS
/
CREATE TABLE PLT_SECURITY_NODES
( NODE_ID VARCHAR2(16) NOT NULL,
PARENT_NODE_ID VARCHAR2(16),
IS_DELETED NUMBER NOT NULL,
IS_SYSTEM NUMBER NOT NULL,
LAST_MODIFIED_DATETIME DATE,
LAST_MODIFIED_USER_ID VARCHAR2(32)
)
/
CREATE INDEX PSN_PSN_FK_I ON PLT_SECURITY_NODES
(PARENT_NODE_ID)
/
CREATE UNIQUE INDEX PSN_PK ON PLT_SECURITY_NODES
(NODE_ID)
/
CREATE INDEX NODE_AND_PARENT_NODE ON PLT_SECURITY_NODES
(NODE_ID, PARENT_NODE_ID)
/
ALTER TABLE PLT_SECURITY_NODES ADD (
CONSTRAINT PSN_PK PRIMARY KEY (NODE_ID))
/
ALTER TABLE PLT_SECURITY_NODES ADD (
CONSTRAINT PSN_PSN_FK FOREIGN KEY (PARENT_NODE_ID)
REFERENCES PLT_SECURITY_NODES (NODE_ID))
/
------------------------------------------------------------------------------------------------
CREATE TABLE PLT_NODE_PERMISSIONS
( NODE_ID VARCHAR2(32) NOT NULL,
USER_ID VARCHAR2(32) NOT NULL,
ROLE_ID VARCHAR2(32) NOT NULL,
IS_DELETED NUMBER NOT NULL,
IS_SYSTEM NUMBER NOT NULL,
LAST_MODIFIED_DATETIME DATE,
LAST_MODIFIED_USER_ID VARCHAR2(32)
)
/
CREATE UNIQUE INDEX PNP_PK ON PLT_NODE_PERMISSIONS
(NODE_ID, USER_ID, ROLE_ID)
/
CREATE INDEX NODE_AND_USER ON PLT_NODE_PERMISSIONS
(NODE_ID, USER_ID)
/
CREATE INDEX PNP_PSN_FK_I ON PLT_NODE_PERMISSIONS
(NODE_ID)
/
CREATE INDEX PNP_PPR_FK_I ON PLT_NODE_PERMISSIONS
(ROLE_ID, USER_ID)
/
ALTER TABLE PLT_NODE_PERMISSIONS ADD (
CONSTRAINT PNP_PK PRIMARY KEY (NODE_ID, USER_ID, ROLE_ID))
/
ALTER TABLE PLT_NODE_PERMISSIONS ADD (
CONSTRAINT PNP_PSN_FK FOREIGN KEY (NODE_ID)
REFERENCES PLT_SECURITY_NODES (NODE_ID))
/
----------------------------------------------------------------------------
CREATE TABLE PLT_USER_DELEGATES
( USER_ID VARCHAR2(32) NOT NULL,
DELEGATE_USER_ID VARCHAR2(32) NOT NULL,
EFFECTIVE_FROM DATE NOT NULL,
EFFECTIVE_TO DATE,
IS_DELETED NUMBER NOT NULL,
IS_SYSTEM NUMBER NOT NULL,
LAST_MODIFIED_DATETIME DATE,
LAST_MODIFIED_USER_ID VARCHAR2(32)
)
/
CREATE INDEX PUD_PP_FK_I ON PLT_USER_DELEGATES
(USER_ID)
/
CREATE INDEX PUD_PP_DELEGATE_FK_I ON PLT_USER_DELEGATES
(DELEGATE_USER_ID)
/
CREATE UNIQUE INDEX PUD_PK ON PLT_USER_DELEGATES
(USER_ID, DELEGATE_USER_ID, EFFECTIVE_FROM)
/
ALTER TABLE PLT_USER_DELEGATES ADD (
CONSTRAINT PUD_PK PRIMARY KEY (USER_ID, DELEGATE_USER_ID, EFFECTIVE_FROM))
/
---------------------------------------------------------------------------------------
CREATE TABLE PLT_XDI_PERMISSIONS
(
XDI VARCHAR2(255) NOT NULL,
USER_ID VARCHAR2(32) NOT NULL,
ROLE_ID VARCHAR2(16) NOT NULL,
IS_DELETED NUMBER NOT NULL,
IS_SYSTEM NUMBER NOT NULL,
LAST_MODIFIED_DATETIME DATE,
LAST_MODIFIED_USER_ID VARCHAR2(32)
)
/
CREATE INDEX PXP_PPR_FK_I ON PLT_XDI_PERMISSIONS
(ROLE_ID, USER_ID)
/
CREATE UNIQUE INDEX PXP_PK ON PLT_XDI_PERMISSIONS
(XDI, USER_ID, ROLE_ID)
/
ALTER TABLE PLT_XDI_PERMISSIONS ADD (
CONSTRAINT PXP_PK PRIMARY KEY (XDI, USER_ID, ROLE_ID))
/
begin
--
-- Set up 4 delegates for this user to have access via
--
for idx in 1..4
loop
insert into plt_user_delegates
( user_id, delegate_user_id, effective_from, effective_to, is_deleted, is_system, last_modified_datetime, last_modified_user_id)
values
('MJONES', 'MJONES_DELEGATE_'||idx, sysdate -100, sysdate + 100, 0, 0, sysdate, 'MJONES');
--
end loop;
--
-- Loop 500 times to create some realistic volume of data that this user has
--
for idx in 1..500
loop
insert into plt_security_nodes
( node_id, parent_node_id, is_deleted, is_system, last_modified_datetime, last_modified_user_id )
values
('NODE_'||idx||'_TOP_LVL', null, 0, 1, sysdate, 'MJONES');
--
insert into plt_security_nodes
( node_id, parent_node_id, is_deleted, is_system, last_modified_datetime, last_modified_user_id )
values
('NODE_'||idx||'_LEVEL_2', 'NODE_'||idx||'_TOP_LVL', 0, 1, sysdate, 'MJONES');
--
insert into plt_security_nodes
( node_id, parent_node_id, is_deleted, is_system, last_modified_datetime, last_modified_user_id )
values
('NODE_'||idx||'_LVL_3', 'NODE_'||idx||'_LEVEL_2', 0, 1, sysdate, 'MJONES');
--
-- Create permission into Node
--
insert into plt_node_permissions ( node_id, user_id, role_id, is_deleted, is_system ) values (
'NODE_'||idx||'_LVL_3', 'MJONES', 'UNIT_TEST', 0, 0);
end loop;
--
-- Now wedge up some XDI Permissions. Have a total of 100 XDI's
--
for idx in 1..100
loop
insert into plt_xdi_permissions
( xdi, user_id, role_id, is_deleted, is_system)
values
( idx||'_JONES_THE_XDI', 'MJONES', 'UNIT_TEST', 0, 0);
end loop;
end;
/
analyze table plt_node_permissions compute statistics for table for all indexes for all indexed columns
/
analyze table plt_security_nodes compute statistics for table for all indexes for all indexed columns
/
analyze table plt_user_delegates compute statistics for table for all indexes for all indexed columns
/
analyze table plt_xdi_permissions compute statistics for table for all indexes for all indexed columns
/
--
-- Now build up a table of 100,000 rows to base all this malarkey on... Set the access up to start with so that we have access too all 100,000 records in bunches of 200 (500 access records we have access to)
--
create table has_permission_unit_test as
select
to_char(rownum) primary_key,
owner, object_name, subobject_name, object_id, data_object_id, object_type, created, last_ddl_time, timestamp, status, temporary, generated, secondary,
'NODE_'||(mod(rownum,200)+1)||'_LVL_3' security_node_id,
--decode(mod(rownum,1000),0, mod(rownum,1000)||'JONES_THE_XDI',rownum||'_XDI_'||object_name) xdi,
rownum||'_XDI_'||object_name xdi,
0 is_deleted,
sysdate last_modified_datetime,
'BOB' last_modified_user_id
from
( select * from all_objects
union all
select * from all_objects
union all
select * from all_objects
union all
select * from all_objects
union all
select * from all_objects
)
where rownum <= 100000-- 100K
/
--
alter table has_permission_unit_test modify primary_key varchar2(32)
/
--
alter table has_permission_unit_test add primary key (primary_key)
/
--
analyze table has_permission_unit_test compute statistics for table for all indexes for all indexed columns
/
--
-- END OF SET UP
--
The query that behaves strangley is...
select count(*)
from has_permission_unit_test main
where main.primary_key <= :i_volume
and
(
main.security_node_id in
( select /*use_hash(sn) index_ffs( sn node_and_parent_node ) use_hash(np) index_ffs(np node_and_user)*/ np.node_id
from plt_security_nodes sn, plt_node_permissions np
where sn.node_id = np.node_id
and np.user_id in
( select user_id
from plt_user_delegates ud_first
where ud_first.delegate_user_id = 'MJONES'
and sysdate between ud_first.effective_from and ud_first.effective_to
and ud_first.is_deleted = 0
union all
select 'MJONES' user_id
from dual dual_first
)
connect by prior sn.parent_node_id = sn.node_id
)
-- COMMENT OUT FROM HERE
or
( main.xdi in
( select xi.xdi
from plt_xdi_permissions xi
where xi.user_id in
( select user_id
from plt_user_delegates ud_second
where ud_second.delegate_user_id = 'MJONES'
and sysdate between ud_second.effective_from and ud_second.effective_to
and ud_second.is_deleted = 0
union all
select 'MJONES' user_id
from dual dual_second
)
)
)
-- DOWN TO HERE
)
Running as is with :i_volume = 1000 takes around 4 seconds, and the matches are being done with the security nodes (I beleive), but if you comment out the bottom bit the plan radically changes to start FFS with a couple of indexes I can not see why the "or" changes this. Even if I change this to "or 1 in (select null from dual where 1=0) it still won't FFS. I tried hinting (the + is removed so it won't pick it up to no avail. Why has the performance so dropped off when I stick any OR in?
Thanks in advance, and hope this is enough info.
Mike.
November 10, 2004 - 12:05 pm UTC
sorry, this is just "not in scope" here and way too "big" for a review -- I have to answer these things relatively "quick" -- they are in a fifo queue, i either answer it or skip it. I don't go back.
this is a "question" for when I'm taking them.
Fair Enough!
A reader, November 10, 2004 - 12:59 pm UTC
I'll await a question time...
Reducing time
Yogesh, December 04, 2004 - 1:00 pm UTC
I've 2 tables cust(5G) and archive(1G), I want to create third table which will have cust - archive data so I used following qry
create table newcust
nologging
as
select * from cust where c1 not in
(select c1 from archive)
parallel(Degree 8)
/
Cust table have index on c1 column ... archive table do not have any indexes.
This query was running for 7 hours .. I killed this qry as it was taking almost all resources.
Can you please suggest some alternative to speed up this query?
December 04, 2004 - 1:46 pm UTC
make sure to use cbo, make sure c1 is NOT NULLable or you say "where c1 is not null" in the subquery.
not sure that parallel would be necessary - big juicy hash anti-join is what I'd be looking for.
Null
Yogesh, December 06, 2004 - 4:58 am UTC
Can you please explain part?
make sure c1 is null or you say "where c1 is not null" in
the subquery.
should it be
"make sure c1 is not null or you say "where c1 is not null" in the subquery."?
One more thing to add. C1 is not null in both the tables. In cust table it is PK and archive table is extract of cust.
December 06, 2004 - 11:43 am UTC
thanks, i corrected that -- it should be "make sure c1 is NOT NULL'able"
NULL
Yogesh, December 06, 2004 - 6:57 am UTC
I guess in the above answer you was relating NULL and NOT NULL to /*+ HASH_AJ */.
There are 2 options we can use
/*+ HASH_AJ */
/*+ MERGE_AJ */
which one we should use?
December 06, 2004 - 12:09 pm UTC
hash_aj probably. hash one table and then join to it -- versus sort both and merge.
REPORT 6i
Elena, March 18, 2005 - 3:59 am UTC
I live in Ukraine and have a question to you:
"Do you answer on questions about REPORT 6i?"
Or you can give us mail when I would ask my questions.
Please,answer me.
With best regards
Elena
March 18, 2005 - 7:04 am UTC
nope -- i've never used reports in my life.
otn.oracle.com -> discussion forums. there is a forum for reports.
HASH_AJ
Yogesh, April 04, 2005 - 6:58 am UTC
I'm using following query
CREATE TABLE ARCHIVE
AS SELECT * FROM SMALL WHERE CNO NOT IN
(SELECT /*+ HASH_AJ */ CNO FROM LARGE WHERE CNO IS NOT NULL)
Following is the query plan
CREATE TABLE STATEMENT Optimizer Mode=CHOOSE
LOAD AS SELECT
FILTER
TABLE ACCESS FULL SMALL 14 K 564 K 236
INDEX FULL SCAN PK_LARGE 867 K 21 M 48303
Size of small table is 20M, and large table is 3225M. Here I was expecting HASH ANTI JOIN.
But if I use following query ...
CREATE TABLE ARCHIVE
AS SELECT * FROM SMALL WHERE CNO IS NOT NULL AND CNO NOT IN
SELECT /*+ HASH_AJ */ CNO FROM LARGE)
It is using ANTI HASH JOIN.
My questions are,
1. Do we have to place NOT NULL in inner query or it can be a part of outer query?
2. Does table size matter?
April 04, 2005 - 7:43 am UTC
are your stats up to date, versions are always useful.
HASH_AJ
Yogesh, April 04, 2005 - 9:23 am UTC
DB version 8.0.4.
Stats for small table is up to date. Large table was analyzed 7 days back. Should I analyze it again ?
April 04, 2005 - 9:27 am UTC
do this,
select table_name, num_rows from user_tables where table_name in ( 'A, 'B' )
using your table names of course. Anywhere close to reality? post it.
svrmgrl> show parameter hash
as well
HASH_AJ
Yogesh, April 04, 2005 - 10:31 am UTC
TABLE_NAME : LARGE
TABLESPACE_NAME :
CLUSTER_NAME :
IOT_NAME :
PCT_FREE :
PCT_USED :
INI_TRANS :
MAX_TRANS :
INITIAL_EXTENT :
NEXT_EXTENT :
MIN_EXTENTS :
MAX_EXTENTS :
PCT_INCREASE :
FREELISTS :
FREELIST_GROUPS :
LOGGING :
BACKED_UP : N
NUM_ROWS : 17351339
BLOCKS : 399153
EMPTY_BLOCKS : 12287
AVG_SPACE : 1305
CHAIN_CNT : 0
AVG_ROW_LEN : 153
AVG_SPACE_FREELIST_BLOCKS : 0
NUM_FREELIST_BLOCKS : 0
DEGREE : 1
INSTANCES : 1
CACHE : N
TABLE_LOCK : ENABLED
SAMPLE_SIZE : 0
LAST_ANALYZED : 28-mar-2005 20:31:03
PARTITIONED : YES
IOT_TYPE :
TEMPORARY : N
NESTED : NO
BUFFER_POOL :
###############################################################
TABLE_NAME : SMALL
TABLESPACE_NAME : DB1
CLUSTER_NAME :
IOT_NAME :
PCT_FREE : 10
PCT_USED : 40
INI_TRANS : 1
MAX_TRANS : 255
INITIAL_EXTENT : 5242880
NEXT_EXTENT : 5242880
MIN_EXTENTS : 1
MAX_EXTENTS : 2147483645
PCT_INCREASE : 0
FREELISTS : 1
FREELIST_GROUPS : 1
LOGGING : YES
BACKED_UP : N
NUM_ROWS : 282211
BLOCKS : 2449
EMPTY_BLOCKS : 110
AVG_SPACE : 2792
CHAIN_CNT : 0
AVG_ROW_LEN : 41
AVG_SPACE_FREELIST_BLOCKS : 4293
NUM_FREELIST_BLOCKS : 9
DEGREE : 1
INSTANCES : 1
CACHE : N
TABLE_LOCK : ENABLED
SAMPLE_SIZE : 0
LAST_ANALYZED : 03-apr-2005 18:24:31
PARTITIONED : NO
IOT_TYPE :
TEMPORARY : N
NESTED : NO
BUFFER_POOL : DEFAULT
SVRMGR> show parameter hash
NAME TYPE VALUE
----------------------------------- ------- ------------------------------
hash_area_size integer 0
hash_join_enabled boolean TRUE
hash_multiblock_io_count integer 1
sequence_cache_hash_buckets integer 10
April 04, 2005 - 12:02 pm UTC
just wanted num rows....
you said:
Size of small table is 20M, and large table is 3225M. Here I was expecting
HASH ANTI JOIN.
stats say 282,211 and 17,351,339
they are "off" a tad. something isn't adding up.
HASH_AJ
Yogesh, April 04, 2005 - 12:39 pm UTC
M -> MB. Sorry about that ...
April 04, 2005 - 1:44 pm UTC
so you didn't have row counts before, they were sizes?
what is your sort area size?
HASH_AJ
Yogesh, April 04, 2005 - 2:00 pm UTC
Yes, 20MB is the size of small table, and large table size is 3225MB.
sort_area_size = 10000000
April 04, 2005 - 5:19 pm UTC
guess you can do the "do it yourself" way if you believe it'll be faster:
select small.*
from large, small
where large.key(+) = small.key
and large.key is null;
HASH_AJ
Yogesh, April 05, 2005 - 5:15 am UTC
I was trying to understand why it is not able to use HASH_AJ hint in this specific case. Because it is working with other queries. If I'm using two tables small - 1GB+ and Large - 5GB+, it works fine .....
"poor mans hash anti-join" ? Good idea ..
April 05, 2005 - 7:32 am UTC
8.0.4 could have some definite impact on this. really old stuff.
HASH_AJ
Yogesh, April 05, 2005 - 8:43 am UTC
Displaying the column of inner query into outer query
Baskar, May 31, 2005 - 11:26 am UTC
Hi Tom,
Please look into this query.
SELECT
statement_date,
revenue_centre,
bill_create_date,
jnl_end_date,
company_code,
revenue
FROM ods_merc_bill_rev stg
WHERE NOT EXISTS (SELECT 1
FROM tmp_rtmr_accounts tra
WHERE stg.gl_account_code||' 000000' = tra.account_code_n106)
OR NOT EXISTS (SELECT account_code_n106,product_code
FROM tmp_rtmr_accounts tra
WHERE stg.gl_account_code||' 000000' = tra.account_code_n106
and EXISTS (SELECT 1
FROM tmp_rtmr_products trp
WHERE
tra.product_code = trp.product_code ))
Now I want the product code which is used in inner most query to be displayed in main query..
How to bring this into main query. Actually It should
return only three rows. If I insert the product code into main query, It is replicated. How to do!..Please advice me..
Table structure for ods_merc_bill_rev:
PROCESS_KEY NUMBER (12)
COMPANY_CODE VARCHAR (6)
REVENUE_CENTRE VARCHAR (6)
STATEMENT_DATE VARCHAR (8)
BILL_CREATE_DATE VARCHAR (8)
JNL_END_DATE VARCHAR (8)
BILL_PERIOD VARCHAR (3)
GL_ACCOUNT_CODE VARCHAR (10)
REVENUE NUMBER (20,2)
Table structure for tmp_rtmr_accounts:
ACCOUNT_CODE_N106 VARCHAR (17)
PRODUCT_CODE VARCHAR (20)
Table structure for tmp_rtmr_products:
PRODUCT_CODE VARCHAR (20)
Please have a look at this and give me a solution as soon as possible..
Thanks in Advance!
Regards,
baskar
May 31, 2005 - 6:34 pm UTC
you'll be joining - but it doesn't seem to make sense to me since you say "or not exists ( product code )"
how can you show something that doesn't exist.
(i'm not here to write SQL for you, I'll be glad to answer questions -- but please don't think I'm going to write your SQL as soon as possible, that is just not reasonable)
hash_aj in 10G
Yogesh, August 05, 2005 - 11:53 am UTC
What is the replacement of hash_aj in 10g? or do we don't have to use hash_aj anymore? Sorry I don't have installed version of 10g. I read this in one document.
August 05, 2005 - 2:14 pm UTC
the "hint" seems to be not documented, but the anti join lives on ops$tkyte@ORA10G> create table t1 ( x int not null, y int ); Table created. ops$tkyte@ORA10G> create table t2 ( x int not null, y int ); Table created. ops$tkyte@ORA10G> ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'T1', numrows=>1000000, numblks=> 10000 ); PL/SQL procedure successfully completed. ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'T2', numrows=>1000000, numblks=> 10000 ); PL/SQL procedure successfully completed. ops$tkyte@ORA10G> ops$tkyte@ORA10G> @plan "select * from t1 where x not in (select x from t2 )"; ops$tkyte@ORA10G> delete from plan_table; 4 rows deleted. ops$tkyte@ORA10G> explain plan for &1; old 1: explain plan for &1 new 1: explain plan for select * from t1 where x not in (select x from t2 ) Explained. ops$tkyte@ORA10G> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------- Plan hash value: 629543484 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | | 8525 (2)| |* 1 | HASH JOIN RIGHT ANTI| | 1 | 39 | 23M| 8525 (2)| | 2 | TABLE ACCESS FULL | T2 | 1000K| 12M| | 2742 (2)| | 3 | TABLE ACCESS FULL | T1 | 1000K| 24M| | 2742 (2)| -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("X"="X") 15 rows selected.
anti join
Yogesh, August 09, 2005 - 5:14 am UTC
so optimizer is smart enough to make this decision without specifying any hint?
August 09, 2005 - 10:02 am UTC
Strange outcome of NOT IN
SAF, August 11, 2005 - 4:08 am UTC
Hi Tom, I'm using NOT IN to clear up a table and turning up with some strange results. Hope this is not too stupid... anyway here goes. The basic idea is to clean up an old large table, based on a smaller new one. Both tables have no null elements. --create a "small" table first. CREATE TABLE m1_msisdns_int (msisdn varchar2(18) NOT NULL); --it's filled with some data... --create table of unmatched keys CREATE TABLE unmatched_msisdns (msisdn varchar2(18) NOT NULL); --fill it up INSERT INTO unmatched_msisdns (SELECT mi.msisdn FROM medapp.msisdn_imei mi WHERE mi.msisdn NOT IN (SELECT msisdn FROM m1_msisdns_int)); OK, now records in unmatched_msisdns *should* be in msisdn_imei and *should not* be in m1_msisdns_int. But this is not the case. For example, picking some random records: SQL> select * from unmatched_msisdns where rownum < 5; MSISDN ------------------ 6581000016 6581000017 6581000032 6581000033 SQL> select msisdn from medapp.msisdn_imei where msisdn = '6581000016'; MSISDN ------------------ 6581000016 SQL> select msisdn from m1_msisdns_int where msisdn = '6581000016'; MSISDN ------------------ 6581000016 Why is the record in the m1_msisdns_int table too? What am I doing wrong? Thanks.
August 11, 2005 - 9:55 am UTC
show us from start to finish -- eg: table unmatched starts EMPTY.
do you have a test case I can reproduce with?
Strange outcome of NOT IN: details
SAF, August 12, 2005 - 4:37 am UTC
Hi Tom, Thanks for response. Here are more details of the situation. I run this script first: ======================== DROP TABLE m1_msisdns_ext; DROP TABLE m1_msisdns_int; DROP TABLE unmatched_msisdns; CREATE OR REPLACE DIRECTORY data_dir AS '/home/oracle'; CREATE TABLE m1_msisdns_ext (msisdn varchar2(18)) ORGANIZATION EXTERNAL (TYPE oracle_loader DEFAULT DIRECTORY data_dir ACCESS PARAMETERS (RECORDS DELIMITED BY NEWLINE fields terminated by ',') location ('msisdns_20050811.dat') ) PARALLEL 5 REJECT LIMIT UNLIMITED; CREATE TABLE m1_msisdns_int (msisdn varchar2(18) NOT NULL); INSERT INTO m1_msisdns_int (SELECT msisdn FROM m1_msisdns_ext); DROP TABLE m1_msisdns_ext; commit; CREATE TABLE unmatched_msisdns (msisdn varchar2(18) NOT NULL); INSERT INTO unmatched_msisdns (SELECT msisdn FROM medapp.msisdn_imei WHERE msisdn NOT IN (SELECT msisdn FROM m1_msisdns_int)); commit; ======================== Now, 1. The data file to create m1_msisdns_ext (msisdns_20050811.dat) is just a single column file with strings e.g.: 6596886272 6597600746 6593686340 6596875161 6594745290 6597902848 6598493940 6590933396 6597440556 6593672611 You can generate it yourself, or I can pass one to you. 2. The table medapp.msisdn_imei also contains a list of strings like m1_msisdns_int, but a much larger set. 3. We're using Oracle 10g, on Lintel. All results shown here using SQLPlus. Just to be sure, I do these checks: SQL> select count(*) from medapp.msisdn_imei; COUNT(*) ---------- 1652979 SQL> select count(*) from m1_msisdns_int; COUNT(*) ---------- 1204505 SQL> select count(*) from unmatched_msisdns; COUNT(*) ---------- 449069 SQL> select * from m1_msisdns_int where msisdn is null; no rows selected SQL> select * from medapp.msisdn_imei where msisdn is null; no rows selected However, when I run the NOT IN based insertion, I still see some (some, not all) records in the unmatched_msisdns where some records belong to both tables (m1_msisdns_int and medapp.msisdn_imei). Finally, as an aside, when I select with quote a query does not work, but works without quotes!! Weird, isn't it??: SQL> select * from unmatched_msisdns where msisdn = 6581000016; MSISDN ------------------ 6581000016 SQL> select * from unmatched_msisdns where msisdn = '6581000016'; no rows selected Sorry for the long post, but hope it makes things clear. Thanks for your time and effort.
August 12, 2005 - 8:45 am UTC
test case, something I can cut and paste and just @runit
Check for trailing spaces
Jono, August 12, 2005 - 9:12 am UTC
I would suggest checking if the unmatched_msisdns table contains any trailing spaces after the digits. These would be ignored when an implicit conversion to a number took place and will not be noticeable in the normal SQL*plus query output.
NOT IN: details
SAF, August 14, 2005 - 10:02 pm UTC
For the test case, I'd need to pass you the data file. How do I do that?
Jono:
Thanks for your input. However there are no trailing spaces in the string. Is there a way to turn this behaviour off?
August 15, 2005 - 7:00 am UTC
develop a method to generate, to reproduce the ISSUE. However, I just noticed you said: SQL> select * from unmatched_msisdns where msisdn = 6581000016; MSISDN ------------------ 6581000016 SQL> select * from unmatched_msisdns where msisdn = '6581000016'; no rows selected I disagree with your analysis of the trailing whitespace. When you compare the string to a number, there is an implicit to_number applied to the string: ops$xp10gr1\tkyte@ORA10GR1> create table t ( x varchar2(18) ); Table created. ops$xp10gr1\tkyte@ORA10GR1> @plan "select * from t where x = 1" 0 rows deleted. old 1: explain plan for &1 new 1: explain plan for select * from t where x = 1 Explained. PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------- Plan hash value: 1601196873 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 11 | 2 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T | 1 | 11 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- <b> 1 - filter(TO_NUMBER("X")=1) </b> Note ----- - dynamic sampling used for this statement 17 rows selected. <b>when you compare a string to a string, you do NOT have that happening:</b> ops$xp10gr1\tkyte@ORA10GR1> @plan "select * from t where x = '1'" 2 rows deleted. old 1: explain plan for &1 new 1: explain plan for select * from t where x = '1' Explained. PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------- Plan hash value: 1601196873 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 11 | 2 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T | 1 | 11 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- <b> 1 - filter("X"='1') </b> Note ----- - dynamic sampling used for this statement 17 rows selected. So, I believe you DO have trailing whitespace on some of those. Look again. select * from t where x like '% ' or x like '%'||chr(9) or x like '%' || chr(10) or x like '%' || chr(13) / for example.
spaces
SAF, August 16, 2005 - 6:08 am UTC
The trailing spaces were the culprit indeed.
Thanks a lot.
A reader, August 24, 2005 - 3:54 pm UTC
Tom,
Sorry for dragging this thread back to the basics a bit, but going back to your first reply
select count(*) from emp where empno not in (Select nvl(mgr,0) from emp)
appears to bring the correct result.
What I am trying to understand is when the "NOT EXISTS" works surely it should also say 'Hey, I see a Null value and I don't know if it maybe or may not be so I can ignore'. I mean why is NOT IN alone returning count(*)=0, comparitively.
Thanks
August 25, 2005 - 3:25 am UTC
the where exists DOESN'T see the null. That is the point :)
we said "not exists ( select null from emp T2 where t2.mgr = t1.empno );"
and sure enough for that t1.empno, there were no rows.
Great Stuff
yogi, September 09, 2005 - 2:57 pm UTC
Thanks this was very useful .I used this to bring down the response time of a query from 40 minutes to some few seconds .Keep the good work going.
Not Exists
V, October 27, 2005 - 4:09 pm UTC
Tom I have the following which takes forever to run:
..
..
AND NOT EXISTS (SELECT null FROM tm tp WHERE tp.id = cd.id)
AND NOT EXISTS (SELECT null FROM t1 tp WHERE tp.id = cd.id)
can this be changed to:
AND NOT EXISTS (SELECT null FROM tm tp,t1 tp1 WHERE tp.id = cd.id OR tp1.id = cd.id )
Would this result in the same result?
October 28, 2005 - 1:42 am UTC
why do you believe the not exists is the problem? (you are using the CBO right?)
No, it cannot be changed to what you have, that query is not the same as the prior two.
Using the cbo, you might:
and cd.id not in ( select id from tm where id is not null
union all
select id from t1 where id is not null )
Is this way of existence check valid?
Ford Desperado, November 29, 2005 - 4:17 pm UTC
Tom,
I've come across an application where the client checks if there are any rows meeting some criteria. Here is how it is done:
select count(*) from some_table where (some complex criteria) and rownum=1
/* also in some places rownum<2 is used */
Is is a safe way for existence checks?
Isn't it better to check existence like this:
select case when exists(select 1 from some_table where (some complex criteria)) then 1 else 0 end a from dual
Thank you
November 30, 2005 - 11:07 am UTC
here is the best way to do this:
THIS SPACE INTENTIONALLY LEFT BLANK
I hate that check for existence, hardware vendors love them since it increases sales, but in software it is typically "not useful"
I take the approach of "rather than see if there is work to be done, I'll try to do work and at that point discover one of two things:
a) work is there to be done, lets do it
b) work is not there to be done, skip it
the rownum check works however.
great, thanks!
Ford Desperado, November 30, 2005 - 1:33 pm UTC
A reader, January 04, 2006 - 9:52 am UTC
How can I get a hash mechanism working for this query
Naresh, March 30, 2006 - 12:01 pm UTC
Hello Tom, I have the query below - service_agreement_tas_d and cm1_agreement_param_tas_d have about 2 million rows each, csm_offer_Param has about 1000 rows. Tables are not analyzed as we are doing a data conversion where the tables do not have indexes. The join in the sub query is on key fields (meaning agreement_no, param_name, offer_instance_id define a unique row in cm1_agreement_param): 1 select 2 mod(agreement_no, 100) agreement_key, 3 agreement_no, soc, 30000000 + rownum param_seq_no, 4 sa.effective_date, sa.expiration_date, sa.source_agr_no, sa.soc_seq_no, 5 param_instance_level, param_name, param_values, 6 row_number() over 7 (partition by agreement_no, param_name order by soc_seq_no) rn 8 from service_agreement_tas_d sa, 9 (select distinct soc_cd, param_name, populate_level, param_values, 10 mandatory_ind, param_instance_level from csm_offer_param ) op 11 where sa.soc = op.soc_cd 12 and not exists (select /*+ HASH_AJ */ NULL from cm1_agreement_param_tas_d p 13 where p.agreement_no = sa.agreement_no 14 and p.param_name = op.param_name 15 and ( (op.param_instance_level = 'O' 16 and p.offer_instance_id = sa.soc_seq_no) 17 or (op.param_instance_level = 'S' 18 and p.offer_instance_id is NULL) 19 ) 20* ) SQL> / Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=124560227 Card=19271 2 Bytes=425700808) 1 0 WINDOW (SORT) (Cost=68275 Card=192712 Bytes=425700808) 2 1 COUNT 3 2 FILTER 4 3 HASH JOIN (Cost=9984 Card=192712 Bytes=425700808) 5 4 VIEW (Cost=4340 Card=14703 Bytes=31508529) 6 5 SORT (UNIQUE) (Cost=4340 Card=14703 Bytes=315967 47) 7 6 TABLE ACCESS (FULL) OF 'CSM_OFFER_PARAM' (Cost =6 Card=14703 Bytes=31596747) 8 4 PARTITION LIST (ALL) 9 8 TABLE ACCESS (FULL) OF 'SERVICE_AGREEMENT_TAS_D' (Cost=1156 Card=3854235 Bytes=254379510) 10 3 PARTITION LIST (ALL) 11 10 TABLE ACCESS (FULL) OF 'CM1_AGREEMENT_PARAM_TAS_D' (Cost=646 Card=1 Bytes=155) Is it possible for the last full table access in the sub-query to use a hash mechanism? I tried using HASH_AJ in inner query and not exists (select /*+ HASH_AJ */ NULL from cm1_agreement_param_tas_d p where p.agreement_no = sa.agreement_no But that did not work. Hash related settings: 1* select name, value from v$parameter where name like '%hash%' SQL> / NAME VALUE ------------------------------ ------------------------------ hash_join_enabled TRUE hash_area_size 50000000 Thanks, Naresh
March 31, 2006 - 11:28 am UTC
code it as an outer join instead of a subquery and add
and p.agreement_no is null
to the query.
HASH_AJ for not in
Naresh, March 31, 2006 - 12:56 am UTC
Hello Tom, Version used is 9.2.0.6 I tried a variation for above query with a not in - still does not use hash_aj. I put nvl for each column in the inner select and also set always_anti_join=hash (I do not know if this parameter is relevant in the version we use - 9.2.0.6). What can be done to get a hash join working for this? 1 select 2 agreement_no, soc, 30000000 + rownum param_seq_no, 3 sa.effective_date, sa.expiration_date, sa.soc_seq_no, 4 param_instance_level, param_name, param_values, 5 row_number() over 6 (partition by agreement_no, param_name order by soc_seq_no) rn 7 from service_agreement_tas_d sa, 8 (select distinct soc_cd, param_name, populate_level, param_values, 9 mandatory_ind, param_instance_level from csm_offer_param ) op 10 where sa.soc = op.soc_cd 11 and (nvl(agreement_no,0), nvl(param_name,' '), nvl(soc_seq_no, -1)) not in 12 (select /*+ HASH_AJ */ nvl(agreement_no,0), nvl(param_name,' '), 13 nvl(offer_instance_id, -1) 14* from cm1_agreement_param_tas_d) SQL> / Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=226118412 Card=19271 2 Bytes=423195552) 1 0 WINDOW (SORT) (Cost=67236 Card=192712 Bytes=423195552) 2 1 COUNT 3 2 FILTER 4 3 HASH JOIN (Cost=9276 Card=192712 Bytes=423195552) 5 4 VIEW (Cost=4340 Card=14703 Bytes=31508529) 6 5 SORT (UNIQUE) (Cost=4340 Card=14703 Bytes=315967 47) 7 6 TABLE ACCESS (FULL) OF 'CSM_OFFER_PARAM' (Cost =6 Card=14703 Bytes=31596747) 8 4 PARTITION LIST (ALL) 9 8 TABLE ACCESS (FULL) OF 'SERVICE_AGREEMENT_TAS_D' (Cost=1156 Card=3854235 Bytes=204274455) 10 3 PARTITION LIST (ALL) 11 10 TABLE ACCESS (FULL) OF 'CM1_AGREEMENT_PARAM_TAS_D' (Cost=1173 Card=489 Bytes=75795) The Card=489 for 'CM1_AGREEMENT_PARAM_TAS_D is different from the one below: SQL> select count(*) from CM1_AGREEMENT_PARAM_TAS_D; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=RULE (Cost=1173 Card=1) 1 0 SORT (AGGREGATE) 2 1 PARTITION LIST (ALL) 3 2 TABLE ACCESS (FULL) OF 'CM1_AGREEMENT_PARAM_TAS_D' (Co st=1173 Card=3910431) And this is what the user_tables shows SQL> SQL> select lasT_analyzed, num_rows, USER_STATS from user_tables where table_name in 2 ('SERVICE_AGREEMENT_TAS_D', 'CM1_AGREEMENT_PARAM_TAS_D'); LAST_ANAL NUM_ROWS USE --------- ---------- --- NO NO Thanks, Naresh
Adding != NULL in sub-query did it
naresh, March 31, 2006 - 5:40 am UTC
Writing the sub-query thus, gave the desired result
(select /*+ HASH_AJ */ nvl(agreement_no,0), nvl(param_name,' '),
nvl(offer_instance_id, -1)
from cm1_agreement_param_tas_d
where agreement_no is not NULL and param_name is not null and nvl(offer_instance_id,-1) is not null)
Thanks,
Naresh.
Query Optimization related to EXIST
Soni, May 20, 2006 - 12:55 am UTC
Please look into the part of query below. I need to optimize the query. This looks very big and not optimized. Please advice.
AND EXISTS ((SELECT 1
FROM springs.si_jde_schedrcpts_work sr
WHERE t.order_id = sr.t.order_id
AND t.item = sr.item
AND t.fromloc = sr.fromloc
AND sr.order_status < 90
AND sr.supplysource = 1)
OR EXISTS (SELECT 1
FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.order_staus > 90
AND sr.supplysource = 1)
OR EXISTS (SELECT 1
FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.order_staus < 90
AND sr.supplysource <> 1)
)
May 20, 2006 - 4:52 pm UTC
how do you know it is not optimized? I know it is not syntactically "meaningful" ops$tkyte@ORA10GR2> select * 2 from dual 3 where 1=1 4 and exists ( (select 1 from dual) 5 or exists (select 1 from dual) 6 or exists (select 1 from dual) 7 ) 8 / or exists (select 1 from dual) * ERROR at line 5: ORA-00907: missing right parenthesis
Optimization
Soni, May 21, 2006 - 11:01 am UTC
Thanks TOM, I know it might be wrong. I am new to Oracle and have to get this done using EXIST. I understand it might be wrong. I am learning and working too. Please help me modify this. I was not getting a way to solve this query having OR conditions with EXIST. Please advice.
May 21, 2006 - 8:12 pm UTC
these two:
OR EXISTS (SELECT 1
FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.order_staus > 90
AND sr.supplysource = 1)
OR EXISTS (SELECT 1
FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.order_staus < 90
AND sr.supplysource <> 1)
can obviously be rolled into one - but it would require an OR which wouldn't be much different from what you have if an index on qty,status,supplysource were being used.
basically "not sufficient data to really comment"
exists is not inheritly "bad"
Cursor Modification
Soni, May 22, 2006 - 1:18 am UTC
Hi Tom, I am here writting the whole code for the above query I have to make the changes in a Cursor.
The change description is :
In the c_recship cursor, add an edit to first of all check to see if the t.order_id, t.item and t.fromloc exist on the springs.si_jde_schedrcpts_work sr table, joining t.order_id to sr.order_id, t.item to sr.item and t.fromloc to sr.loc, where the sr. order_status < 90 and the sr.supplysource = 1. If found, create the recship. If it exists and the order_status > 90 or the t.order_id, t.item and t.fromloc does not exist at all as a supplysource = 1 row on the si_jde_schedrcpts_work table, then check the stsc.inventory table joining t.item to inventory.item, t.fromloc to inventory.loc and create recship if inventory.qty = > t.stqty.
Please suggest for the changes done by me.
CURSOR c_recship IS
SELECT t.item item, t.toloc dest, t.fromloc source, s.transmode
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needarrivedate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedarrivedate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') availtoshipdate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedshipdate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needshipdate
, 4 type, 0 actionallowedsw, 1 firmsw , ROUND(t.stqty,2) qty, 0 orderid
, t.item primaryitem, 1 sys_info, t.order_id
FROM
(SELECT a.item, a.duedate
, b.fromloc, a.toloc, stqty, a.order_number order_id
FROM (SELECT DISTINCT item, TRUNC(duedate) duedate
, toloc, order_number
FROM springs.si_jde_intransit_work
WHERE TRIM(UPPER(order_type)) = 'OT'
) a
, (SELECT item, TRUNC(duedate) duedate, fromloc
, rel_ot_ord_nbr, SUM(NVL(net_st_qty,0)) stqty
FROM springs.si_jde_intransit_work
WHERE TRIM(UPPER(order_type)) = 'ST'
GROUP BY item, TRUNC(duedate), fromloc, rel_ot_ord_nbr
HAVING SUM(NVL(net_st_qty,0)) <> 0
) b
WHERE a.order_number = b.rel_ot_ord_nbr
AND a.item = b.item
) t, stsc.sourcing s
WHERE NOT EXISTS (SELECT 1 FROM stsc.loc
WHERE scen = 0 AND loc = t.toloc
AND NVL(TRIM(si_loc_type),' ') = 'DC'
)
--Change by Soni
AND EXISTS((SELECT 1 FROM springs.si_jde_schedrcpts_work sr
WHERE t.order_id = sr.t.order_id AND t.item = sr.item
AND t.fromloc = sr.fromloc AND sr.order_status < 90
AND sr.supplysource =1 )
OR EXISTS(SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND inventory.qty => t.stqty AND sr.order_staus > 90)
OR EXISTS(SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND inventory.qty => t.stqty AND sr.supplysource=1))
--Change by Soni till here
AND s.transmode=springs.si_transmode_lookup(t.fromloc
,t.toloc
,' ',t.item
)
AND s.item = t.item
AND s.source = t.fromloc
AND s.dest = t.toloc
AND s.scen = 0;
May 22, 2006 - 7:43 am UTC
suggest you ask "why the heck would I have to do this"
WHERE TRIM(UPPER(order_type)) = 'OT'
what would possibly cause you to have to use trim and upper. ugh.
Optimization
Soni, May 22, 2006 - 8:26 am UTC
The other part of query is old and I don't have to change it. I have got the change to edit only a part of it. Please help as this is too urgent and I am new to Oracle.
Thanks for your patient.
May 22, 2006 - 3:50 pm UTC
is it working, that is the question.
it'll do what you programmed it to - is it working for you.
Checking for optimization
Soni, May 23, 2006 - 3:05 am UTC
CURSOR c_recship IS
SELECT t.item item, t.toloc dest, t.fromloc source, s.transmode
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needarrivedate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedarrivedate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') availtoshipdate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedshipdate
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needshipdate
, 4 type, 0 actionallowedsw, 1 firmsw , ROUND(t.stqty,2) qty, 0 orderid , t.item primaryitem, 1 sys_info, t.order_id
FROM
(SELECT a.item, a.duedate
, b.fromloc, a.toloc, stqty, a.order_number order_id
FROM (SELECT DISTINCT item, TRUNC(duedate) duedate
, toloc, order_number
FROM springs.si_jde_intransit_work
WHERE TRIM(UPPER(order_type)) = 'OT'
) a
, (SELECT item, TRUNC(duedate) duedate, fromloc
, rel_ot_ord_nbr, SUM(NVL(net_st_qty,0)) stqty
FROM springs.si_jde_intransit_work
WHERE TRIM(UPPER(order_type)) = 'ST'
GROUP BY item, TRUNC(duedate), fromloc, rel_ot_ord_nbr
HAVING SUM(NVL(net_st_qty,0)) <> 0
) b
WHERE a.order_number = b.rel_ot_ord_nbr
AND a.item = b.item
) t, stsc.sourcing s
WHERE NOT EXISTS (SELECT 1 FROM stsc.loc
WHERE scen = 0 AND loc = t.toloc
AND NVL(TRIM(si_loc_type),' ') = 'DC'
)
--Changes by Soni from here
AND (EXISTS ( SELECT 1 FROM springs.si_jde_schedrcpts_work
sr WHERE t.order_id = sr.t.order_id AND t.item = sr.item AND t.fromloc = sr.fromloc AND sr.order_status < 90
AND sr.supplysource =1 )
OR EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND
inventory.qty => t.stqty
AND sr.order_staus > 90
)
OR EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND
inventory.qty => t.stqty
AND sr.supplysource=1
)
)
--Changes by Soni till here
AND s.transmode=springs.si_transmode_lookup(t.fromloc
,t.toloc ,' ',t.item
)
AND s.item = t.item AND s.source = t.fromloc
AND s.dest = t.toloc AND s.scen = 0;
**********
The changes done by me are between commnets. Please tell me if this correct. I have tested this. Its not giving any compilation error. Please check the part modified by me.
The description of the change is in the question asked before this. I want to put them not in three select statement but atleast in only two.
Thanks.
May 23, 2006 - 7:33 am UTC
looks ok to me if it is getting the *right answer*.
any thoughts on this query
A reader, June 15, 2006 - 11:03 am UTC
Tom,
Please keep in mind that main_report_tbl contains 4261053 records
select yard_area , area_name , chassis_lt_description , visit , gate_id , reference_id ,
ssl_user_code , container , lane_out_by , cont_ck , container_lht_description ,
gross_weight , seals , temp_mins , temp_maxs , location , loc_type , booking ,
trucker_description , driver , chassis , genset , hazardous , guard_out_date , voided_date ,
visit_voided_date , haz , vessel_voy , tir , ckr , sec , action , eir_id ,
foreign , sealink_code , lane_in_date , ssl_user_description , port_of_destination_desc ,
port_of_discharge_description , reefer , net_elapsed_time , net_time , rpm_time , clerical_time ,
clerical_services_time , remarks , damage_description , action ,
chassis , loc_type
from main_report_tbl
where guard_out_date is not null
and voided_date is null
and visit_voided_date is null
and substr ( action , 2 , 1 ) not in ( 'C' , 'G' )
and lane_in_date between to_date( '13-JUN-2005 00:00:00', 'dd-mon-yyyy hh24:mi:ss' )
and to_date( '13-JUN-2005 23:59:59', 'dd-mon-yyyy hh24:mi:ss' )
June 16, 2006 - 5:48 pm UTC
looks great!
as long as you want that data that is...
I mean, it doesn't get "much simplier than this"
cannot suggest anything unless you were to say about how many records you might generally anticipate, to see if an index makes sense or not.
Brad, July 09, 2006 - 10:33 pm UTC
Tom..
i have some trouble with a sql...
insert into table T
select /*+ parallel(x,4) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (2,3)
group by a
the above sql runs fine..in about 10 mins...
But i have a limitation that i have to write the above sql
like this
insert into table T
select /*+ parallel(x,4) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (select b from table Y)
group by a..
table Y has 2 rows (2 and 3)
Now this SQL takes 40 mins to complete...any ideas improve this performance...please let me know if i am being vague...Thanks
July 10, 2006 - 7:44 am UTC
is b in Y "NOT NULL" or is it "NULLABLE"
and I assume that statistics are "there" and accurate?
Brad, July 10, 2006 - 4:19 pm UTC
Both the tables are analyzed and the plans are as shown below ...table X has around 7.4 million rows and table y has 2 rows.
The column b in table y does not have null values but its nullable....
the output of the select statement will be arond 170,000 rows....
SQL1:
-----
The below sql takes 25 mins to comeback
select /*+ parallel(x,2) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (select b from table Y)
group by a..
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 73 K 162987
SORT GROUP BY 73 K 44 M 16217
FILTER
TABLE ACCESS FULL ODS.X 369 K 224 M 10175 :Q1638707000 P->S QC (RANDOM)
TABLE ACCESS FULL ODS.Y 1 4 2
SQL2:
----
The below SQL takes 6 mins to comeback
select /*+ parallel(x,2) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (2,3)
group by a..
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 73 K 96849
SORT GROUP BY 73 K 44 M 96849 :Q1638710001 P->S QC (RANDOM)
SORT GROUP BY 73 K 44 M 96849 :Q1638710000 P->P HASH
TABLE ACCESS FULL ODS.X 7 M 4G 10175 :Q1638710000 PCWP
July 11, 2006 - 7:34 pm UTC
add "and y is not null" to the subquery.
Brad, July 11, 2006 - 10:30 pm UTC
Tom..i tried what you said..still no change...
select /*+ parallel(x,2) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (select b from table Y where b is not null)
group by a..
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 73 K 162987
SORT GROUP BY 73 K 44 M 16217
FILTER
TABLE ACCESS FULL ODS.X 369 K 224 M 10175 :Q1640640000 P->S QC (RANDOM)
TABLE ACCESS FULL ODS.Y 1 4 2
July 12, 2006 - 3:34 pm UTC
make it so the plan fits on a line (so we humans can read it easily)
are the ESTIMATED CARDINALITIES correct
and how long does it take to just full scan this table on your instance? I mean, does it take 6 minutes to do that full scan?
not in vs exists
jas, July 11, 2006 - 11:16 pm UTC
I have a query like
select request_id from request
where request_id not in ( select ref from refer_tab)
and reuest_id not in ( select sed from sed_tab)
in whose explian plan ( select ref from refer_tab) query was not using index.
when i write it like .....
select request_id from request
where exists( select ref from refer_tab)
and reuest_id not in ( select sed from sed_tab)
it gives me the same output and also costs is some 10000 times low and query executes fast.
I can't understand the reason that exists and not in gives same result.
July 12, 2006 - 3:37 pm UTC
they don't in general give the same results (eg: they are NOT semantically equivalent) ops$tkyte@ORA10GR2> create table t1 ( x int ); Table created. ops$tkyte@ORA10GR2> create table t2 ( x int ); Table created. ops$tkyte@ORA10GR2> create table t3 ( x int ); Table created. ops$tkyte@ORA10GR2> ops$tkyte@ORA10GR2> insert into t1 values ( 1 ); 1 row created. ops$tkyte@ORA10GR2> insert into t2 values ( NULL ); 1 row created. ops$tkyte@ORA10GR2> insert into t3 values ( 3 ); 1 row created. ops$tkyte@ORA10GR2> ops$tkyte@ORA10GR2> select * 2 from t1 3 where x not in (select x from t2) 4 and x not in (select x from t3) 5 / no rows selected ops$tkyte@ORA10GR2> ops$tkyte@ORA10GR2> select * 2 from t1 3 where NOT EXISTS (select null from t2 where t2.x = t1.x) 4 and x not in (select x from t3) 5 / X ---------- 1 is ref in refer_tab NULLABLE. if it is, add the predicate "where ref is not null" to the not in so that NOT IN and NOT EXISTS (I'm assuming your second example, which is totally "syntax free" was really a not exists and not exists...) all of a sudden become semantically "the same"
CBO not choosing hash anti join
Morten, August 01, 2006 - 6:18 pm UTC
Having worked with Oracle since v.6 I have never really gotten comfortable with the CBO.
I find that often (always?) I would like my queries to behave predictably and consistently rather than potentially (rarely) faster.
Having said that, I realize that I am being backwards about this, and thus I have set out to explore your brilliant site in search of the one thing that will win me over.
This thread is at the core of my issue with the CBO: I have a query that is clearly well suited for a hash anti join, and it runs in about 2 mins using the "poor man's hash anti join".
Both tables involved are indexed and analyzed, so I set out to rewrite my query to use NOT IN, hoping that the CBO would do the rest for me. The result: Indexes used, decrease of perfomance by a factor 3. NOT EXISTS has the same problem. No NULL columns involved.
Normally I would just shake my head and use my hard-coded anti join, but I have a little bit of time on my hands to experiment, so ...
Could you possibly list a few reasons why the CBO would not pick the hash anti join for my query (looking for new records from mta_tab_2 to add to mta_tab_1)?
mta_tab_1 has 2819134 records (acct_id,start_dt unique)
mta_tab_2 has 1844852 records
SELECT x.acct_id
,x.start_dt
FROM mta_tab_2 x
,(SELECT a.acct_id
,a.start_dt
FROM (SELECT acct_id
,start_dt
,end_dt
FROM mta_tab_1
) a
,(SELECT acct_id
,MAX(start_dt) start_dt
FROM mta_tab_1
GROUP BY acct_id
) b
WHERE b.acct_id = a.acct_id
AND :dt BETWEEN a.start_dt AND a.end_dt
AND a.start_dt = b.start_dt
) y
WHERE :dt BETWEEN x.start_dt AND x.end_dt
AND y.start_dt IS NULL
AND y.acct_id(+) = x.acct_id;
August 02, 2006 - 10:56 am UTC
get used to the cbo - it is a fact, in 10g, it is just a plain and simple fact of life.
I always start with comparing the "guess" with "reality"
compare an autotrace traceonly explain
with
a tkprof
see where the optimizer gets estimated cardinalities wrong and ask yourself "is there a logical reason why they are wrong". From that we can usually apply corrective actions.
Looks like you really want to learn about analytics!!!!!
(SELECT a.acct_id
,a.start_dt
FROM (SELECT acct_id
,start_dt
,end_dt
FROM mta_tab_1
) a
,(SELECT acct_id
,MAX(start_dt) start_dt
FROM mta_tab_1
GROUP BY acct_id
) b
WHERE b.acct_id = a.acct_id
AND :dt BETWEEN a.start_dt AND a.end_dt
AND a.start_dt = b.start_dt
) y
could be written in a single pass as:
select acct_id, start_dt
from (
select acct_id, start_dt, end_dt,
row_number() over (partition by acct_id order by start_dt DESC) rn
from mta_tab_1
)
where rn = 1
and start_date <= :dt
and end_dt >= :dt
You just want to get the "max row by acct_id, where the max row is designated by start_dt"
And use that in a NOT IN
Seems this is the query:
select *
from mta_tab_2
where start_dt <= :dt
and end_dt >= :dt
and (acct_id) NOT IN
(
select acct_id
from (
select acct_id, start_dt, end_dt,
row_number() over (partition by acct_id order by start_dt DESC) rn
from mta_tab_1
)
where rn = 1
and start_date <= :dt
and end_dt >= :dt
and acct_id IS NOT NULL
)
more info
Morten, August 01, 2006 - 7:10 pm UTC
btw the query returns ~10K records
Wow - fast response
Morten, August 02, 2006 - 2:14 pm UTC
Thanks for the incredible response time!
I am onto the analytics. I tried the approach out with similar queries without any significant performance improvement, and in the end decided to modify the code as little as possible (I inherited the code).
So you are saying that by comparing the autotrace and the tkprof I should be able to pinpoint where the CBO goes wrong? (The tricky part is getting access to the trace file)
August 02, 2006 - 4:01 pm UTC
you compare the "row source operation" (the truth) with REAL output row counts
to the explain plan
which is the guess - to see if they are "way off"
How about finding all non-primary key indexes ?
A reader, August 25, 2006 - 12:02 pm UTC
SELECT uc.constraint_name,
uc.constraint_type,
uic.*
FROM user_indexes uix,
user_ind_columns uic,
(SELECT index_name,
uc.constraint_type,
uc.table_name,
uc.constraint_name
FROM user_constraints uc
WHERE uc.constraint_type = 'P') uc
WHERE uix.index_name = uic.index_name
AND uix.table_name = uic.table_name
AND uix.index_name = uc.index_name(+)
AND uc.constraint_type IS NULL
order by uic.column_position
any better idea ? can JDBC do it ?
note that the query must run from a user schema with only connect,resource prives.
August 27, 2006 - 8:46 pm UTC
select select index_name from user_indexes where index_name not in (select index_name from user_constraints where index_name is not null );
jdbc can run valid sql, so I would guess "yes" jdbc would be capable of running a query.
Scalability
yuvraj, September 05, 2006 - 3:16 am UTC
Hi Tom,
What about the scalability of each of these ? I have a situation where in the data in the inner query resultset will grow almost exponentially over time whereas the outer query data will almost remain constant (comparitively speaking). Currently, the plans do not look very different from each other (I am using NOT IN & NOT EXISTS).
Thanks a lot!
September 05, 2006 - 4:58 pm UTC
using the CBO, it'll change plans as the volumes of data change over time - unlike the RBO
Scalability
yuvraj, September 05, 2006 - 3:20 am UTC
I am using Oracle 10g Release 10.1.0.4.0.
scalability
A reader, September 06, 2006 - 4:32 am UTC
Tom,
Thank you for the quick response.
So, can we predict (or do we know) which one out of the two will scale better in the situation given? Or is there no way to say ? Given my situation where the plans look alike, which condition should I use keeping in mind the growth of volume of data?
September 06, 2006 - 7:51 am UTC
use the cbo and we don't really *care*, that is the point.
scalability-thanks
A reader, September 07, 2006 - 1:06 am UTC
Thank you Tom. I understand now..
OK
Kumar, February 02, 2007 - 1:50 am UTC
Hi Tom,
Any better way to rewrite the query given below?
SELECT ol.line_id
,ol.created_by
,ol.last_updated_by
,ad.document_id
,ad.pk1_value
,ad.seq_num
FROM oe_order_lines_all ol
,fnd_attached_documents ad
,(SELECT *
FROM so_lines_all
) sl
WHERE ad.entity_name = 'OE_ORDER_LINES'
AND ol.split_from_line_id = ad.pk1_value
AND sl.header_id = ol.Header_id
AND sl.line_id = ol.split_from_line_id
AND NOT EXISTS
( SELECT 1
FROM fnd_attached_documents ad2
WHERE ol.line_id = ad2.pk1_value
AND ad2.entity_name = 'OE_ORDER_LINES'
Please do reply.
February 02, 2007 - 10:45 am UTC
looks okey dokey to me. as long as it gets the right answer of course.
Does null always mean "UNKNOWN"?
Robert Simpson, February 07, 2007 - 12:36 pm UTC
A null value was added to a column, and suddenly no data was returned from queries with a NOT IN clause. We figured out why, but then figured that other queries using things like join conditions and IN to compare the same values were probably returning different results, but that ended up not being the case.
If NULL really means "gee, I don't know. (litterally, null means Unknown)" per October 02, 2002 above, and "US" matches NULL in a NOT IN clause, why don't those two values also match in a join condition, an IN clause, a NOT EXISTS or a MINUS?
It almost seems like NOT IN is very unique in its treatment of NULLs. What are we missing? Thanks.
February 07, 2007 - 6:51 pm UTC
because the sql standard says so - it is the way it was defined to work.
but it makes sense.
where x not in ( Q )
and Q contains nulls - is X not in there? We don't know, there is no way to compare X to nulls...
where x in ( Q )
well, regardless of what Q has as far as nulls go - if X=5 and Q has 5, X is in Q.
CBO: Result not consistent with Dukes..
Jens, February 08, 2007 - 4:18 am UTC
Back to the 3 easy examples:
select count(*) from t1 rbo
where object_id not in ( select object_id from t2 );
select count(*) from t1 rbo
where NOT EXISTS (select null from t2 where t2.object_id = rbo.object_id );
select count(*) from t1, t2 rbo
where t1.object_id = rbo.object_id(+) and rbo.object_id IS NULL;
On my system (10.2.0.1), all these run with the same plan, and thus the same amount of LIO when using the CBO:
--------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 8 | 25 (24)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 8 | | | |* 2 | HASH JOIN ANTI | | 50 | 400 | 25 (24)| 00:00:01 | | 3 | TABLE ACCESS FULL | T1 | 5000 | 20000 | 17 (12)| 00:00:01 | | 4 | INDEX FAST FULL SCAN| T2_IDX | 4950 | 19800 | 5 (20)| 00:00:01 | --------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"="OBJECT_ID") Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 80 consistent gets 0 physical reads 0 redo size 227 bytes sent via SQL*Net to client 234 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
Then Duke Ganote claimed his approach using MINUS to be better, even using CBO. On my system its not:
------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 17 | 52 (35)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 17 | | | |* 2 | HASH JOIN | | 5000 | 85000 | 52 (35)| 00:00:01 | | 3 | TABLE ACCESS FULL | T1 | 5000 | 20000 | 17 (12)| 00:00:01 | | 4 | VIEW | | 5000 | 65000 | 32 (41)| 00:00:01 | | 5 | MINUS | | | | | | | 6 | SORT UNIQUE | | 5000 | 20000 | | | | 7 | TABLE ACCESS FULL | T1 | 5000 | 20000 | 17 (12)| 00:00:01 | | 8 | SORT UNIQUE | | 4950 | 19800 | | | | 9 | INDEX FAST FULL SCAN| T2_IDX | 4950 | 19800 | 5 (20)| 00:00:01 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("SET_OP"."OBJECT_ID"="RBO"."OBJECT_ID") Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 144 consistent gets 0 physical reads 0 redo size 226 bytes sent via SQL*Net to client 234 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed
The execution plan is much more complex, T1 is FTS'ed twice and his query needs to do 2 sorts to get to the result.
I guess this could be for a number of reasons, like:
* DBMS version
* init parameters
* difference in the way statistics where gathered
Any comments on this?
Jens
Need Help only seeing a forest behind the trees..
Jan Solca, April 27, 2007 - 11:28 am UTC
Hi Tom
I tried to solve a simple problem with the IN /EXISTS functions but I was not successfull yet.
Let's assume 2 tables T1 and T2:
T1 has attributes workerUnit, valid_from, valid_to
T2 has attributes workerUnit, year (not date but string)
T1 identifies service providers for a certain period and T2 identifies if the service providers had a budget allocated for a specific year.
What I'm trying to do is to generate a view (parametrized or not) which would allow me to perform the following query "Give me the list of valid workerUnits for a given date and also if a budget was allocated to it for that given year.
Sketch for a result set input date 24.04.2007
workerUnit | valid_from | valid_to | hasBudget 945 01.01.1988 31.12.9999 1 888 03.04.2007 31.12.9999 1 874 08.05.2007 31.03.2010 0
where 0 and 1 in the 'hasBudget' column identify a workerUnit which got budget for the year. Let's exercise it for the first row for instance
T1 workerUnit | valid_from | valid_to 945 01.01.1988 31.12.9999 222 01.01.1972 24.03.2006 T2 workerUnit | Year 945 2004 945 2005 945 2007
Thus if the input date was 22.03.2006(1) the record would have a 0 in the hasBudget. As another example this input date(1) would also retrieve the workerUnit 222 which has a valid_to of 24.03.2006 which is older than (1) but no budget has been allocated (record missing in T2) thus hasBudget 0
the record 222 looks like (still (1) has input date)
workerUnit | valid_from | valid_to | hasBudget ... 222 01.01.1972 24.03.2006 0 ...
Thanks your help ..
Don't Use IN/Exists
RobH, May 16, 2007 - 2:26 pm UTC
Maybe try using a join with a case
Select workerUnit, valid_from , valid_to, case when to_date('YYYY',year) between valid_from and valid_to then 1 else 0 end from t1 outer join t2 on t1.workerUnit = t1.workerUnit
or....(since you have many t2 rows)
RobH, May 16, 2007 - 2:31 pm UTC
Select workerUnit, valid_from , valid_to,
max(case when to_date('YYYY',year) between valid_from and valid_to then 1 else 0 end)
from t1
left outer join t2 on t1.workerUnit = t1.workerUnit
Not Exists vs Not IN
A reader, January 09, 2008 - 10:14 am UTC
Tom,
I want to select all the rows from a table where one column values are not in another table.
CREATE TABLE T1
(
A1 INTEGER,
ANOTHER_COL NUMBER DEFAULT 1,
ANOTHER_COL1 INTEGER,
ANOTHER_COL2 NUMBER(5)
)
/
Insert into T1
(A1, ANOTHER_COL, ANOTHER_COL1, ANOTHER_COL2)
Values
(1, 1, 1, 1);
Insert into T1
(A1, ANOTHER_COL, ANOTHER_COL1, ANOTHER_COL2)
Values
(23, 2, 3, 34);
COMMIT;
create table t3 as
select * from t1;
insert into t1 values (12,3,4,12);
commit;
I want to select all the rows in t1 where another_col1 values are not in t3 table.
select * from t1 where another_col1 not in (select another_col1 from t3) ;
How do we frame the query using Not exists?
Also, If I insert another set of values in T3:
insert into t3 values (13,4,'',23);
Why can't I get any rows when I have some values in T3 with null values ?
I can understand comparing with nulls is a problem. But it should return all rows whose values are not null, avoiding that null row, shouldn't it?
It's always been a confusion for me..can you clarify?
Thanks,
January 09, 2008 - 10:33 am UTC
where column in (set that contains non-null and some nulls)
where, we can say that column is either in that set or not (we only need look at the non-nulls to see if that value is in that set)
where column NOT IN (same set)
well, the presence of nulls - as defined by the SQL language itself, a rule if you will - makes it impossible to see if that value is "not in" that set - we cannot tell if column not in (null) is true or false - it evaluates UNKNOWN.
so, you would in general where column not in (select * from (that set) where that_column is not null )
if that is what you want....
As for the not exists, that is pretty trivial - give that bit of work to yourself as an exercise....
not able to tune it...
Reene, April 04, 2008 - 9:30 am UTC
Hi Tom
I have this query,it retuns 12 rows,runs for 90 minutes.
trying to tune it , but could not,spent 5 days but to no avail.
Parsing user id: 65 (APPS)
********************************************************************************
SELECT /*+ ordered use_hash(mtp) push_subq */ ooha.org_id "OPERATING UNIT ID"
FROM apps.oe_transaction_types_all ott,
apps.oe_transaction_types_tl ottt_l,
apps.oe_order_lines_all oola,
apps.mtl_parameters mtp,
apps.oe_order_headers_all ooha,
apps.hz_cust_accounts hca
WHERE ooha.header_id = oola.header_id AND
ooha.org_id = oola.org_id AND
ooha.order_source_id = 1263 AND
ott.transaction_type_id = ottt_l.transaction_type_id AND
ott.org_id = ooha.org_id AND
oola.line_type_id = ottt_l.transaction_type_id AND
oola.line_type_id = ott.transaction_type_id AND
mtp.organization_id = oola.ship_from_org_id AND
hca.cust_account_id = ooha.sold_to_org_id AND
(ottt_l.NAME LIKE 'GPO%FE_CONSIGNMENT_LINE' OR
ottt_l.NAME LIKE 'GPO%FE_SHIPMENT_LINE') AND
oola.flow_status_code IN ('SHIPPED', 'CLOSED') AND
hca.attribute9 IS NOT NULL
and NOT EXISTS
( SELECT /*+ NO_UNNEST */ 1
FROM apps.mtl_material_transactions mmt ,
apps.mtl_transaction_lot_numbers mtnl ,
apps.mtl_parameters mp ,
apps.mtl_item_locations mil
WHERE mmt.inventory_item_id = oola.inventory_item_id AND
mmt.transaction_type_id IN (420, 40, 41, 42) AND
mmt.transaction_quantity > 0 AND
mtnl.lot_number = ooha.order_number AND
mtnl.transaction_id = mmt.transaction_id AND
mmt.transaction_date > (TRUNC(SYSDATE) -90) AND
mmt.organization_id = mp.organization_id AND
mp.attribute7 = 'PARTS' AND
mp.attribute11 ='SERVICE' AND
mil.inventory_location_id = hca.attribute9 AND
mil.organization_id = mp.organization_id
) AND
NVL(oola.actual_shipment_date, oola.schedule_ship_date) > SYSDATE - 90
/
its tkporf output is
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 690.28 4899.16 1524771 23494672 0 12
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 690.28 4899.16 1524771 23494672 0 12
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 65 (APPS)
Rows Row Source Operation
------- ---------------------------------------------------
12 NESTED LOOPS (cr=23494672 r=1524771 w=0 time=4640213970 us)
157696 NESTED LOOPS (cr=5730245 r=1402270 w=0 time=3699650845 us)
157697 HASH JOIN (cr=5257151 r=1379549 w=0 time=3576187058 us)
157697 TABLE ACCESS BY INDEX ROWID OE_ORDER_LINES_ALL (cr=5257149 r=1379549 w=0 time=3575115599 us)
2789657 NESTED LOOPS (cr=13232 r=12982 w=0 time=90323177 us)
76 HASH JOIN (cr=277 r=82 w=0 time=175347 us)
4597 TABLE ACCESS FULL OE_TRANSACTION_TYPES_ALL (cr=157 r=0 w=0 time=6683 us)
76 TABLE ACCESS FULL OE_TRANSACTION_TYPES_TL (cr=120 r=82 w=0 time=157808 us)
2789580 INDEX RANGE SCAN GEMS_ONT_ORDER_LINES_N99 (cr=12955 r=12900 w=0 time=86301122 us)(object id 1863725)
377 INDEX FULL SCAN MTL_PARAMETERS_U1 (cr=2 r=0 w=0 time=367 us)(object id 9847)
157696 TABLE ACCESS BY INDEX ROWID OE_ORDER_HEADERS_ALL (cr=473094 r=22721 w=0 time=123022055 us)
157697 INDEX UNIQUE SCAN OE_ORDER_HEADERS_U1 (cr=315397 r=1363 w=0 time=11565697 us)(object id 688729)
12 TABLE ACCESS BY INDEX ROWID HZ_CUST_ACCOUNTS (cr=17764427 r=122501 w=0 time=1114177945 us)
157696 INDEX UNIQUE SCAN HZ_CUST_ACCOUNTS_U1 (cr=315395 r=274 w=0 time=3770326 us)(object id 715003)
157621 TABLE ACCESS BY INDEX ROWID MTL_TRANSACTION_LOT_NUMBERS (cr=17291336 r=121205 w=0 time=1096619120 us)
6897874 NESTED LOOPS (cr=14562731 r=114663 w=0 time=1018837983 us)
3448931 NESTED LOOPS (cr=7488435 r=112057 w=0 time=936487677 us)
157633 NESTED LOOPS (cr=788165 r=581 w=0 time=10022817 us)
157633 INDEX RANGE SCAN MTL_ITEM_LOCATIONS_U1 (cr=472899 r=581 w=0 time=7984520 us)(object id 9761)
157633 TABLE ACCESS BY INDEX ROWID MTL_PARAMETERS (cr=315266 r=0 w=0 time=1667395 us)
157633 INDEX UNIQUE SCAN MTL_PARAMETERS_U1 (cr=157633 r=0 w=0 time=770942 us)(object id 9847)
3448931 TABLE ACCESS BY INDEX ROWID MTL_MATERIAL_TRANSACTIONS (cr=6700270 r=111476 w=0 time=924106849 us)
8273225 INDEX RANGE SCAN MTL_MATERIAL_TRANSACTIONS_N1 (cr=695175 r=32781 w=0 time=303092226 us)(object id 6205511)
3448931 INDEX RANGE SCAN MTL_TRANSACTION_LOT_NUMBERS_N1 (cr=7074296 r=2606 w=0 time=74416944 us)(object id 9976)
is there anything which can be done to improve its response time.
is push_subq is the correct choice here.
how does not exists work - please explain using the above query and the tkprof output..
what is wrong here.
thanks
An explanation I use for the IN x NOT IN
Marcus Rangel, May 19, 2008 - 8:54 pm UTC
Tom, there is an explanation I usually give for developers who ask me why "1 in (1,2,null)" returns TRUE but "3 not in (1,2,null)" returns FALSE. Please tell me what you think:
The IN operation is a series of *equality* comparisons that is ultimately resolved by an OR:
1 = 1 ? TRUE
1 = 2 ? FALSE
1 = NULL ? FALSE
TRUE OR FALSE OR FALSE = TRUE
The NOT IN operation is a series of *inequality* comparisons that is ultimately resolved by an AND:
3 != 1 ? TRUE
3 != 2 ? TRUE
3 != NULL ? FALSE
TRUE AND TRUE AND FALSE = FALSE
May 20, 2008 - 10:50 am UTC
with the minor exception that 3!=NULL is not false, it is not true, it is UNKNOWN
ops$tkyte%ORA10GR2> select * from dual where 3 <> NULL; no rows selected ops$tkyte%ORA10GR2> select * from dual where NOT(3 <> NULL); no rows selected
how about in subquery contains null
jian huang zheng, October 02, 2008 - 10:32 am UTC
HI TOM Thanks for this wonderful thread. i read: <Quote> Because NULL means -- gee, I don't know. (litterally, null means Unknown) So, the predicate where x not in ( NULL ) evaluates to neither TRUE, nor FALSE </qoute> but when using in ( subquery), if subquery contains null, it will select rows ,why is that? SQL> desc cc Name A SQL> desc dd Name A SQL> insert into dd values(null); 1 row created. SQL> select * from cc where a in ( select * from dd); A ---------- 3 1 row selected. SQL>
October 02, 2008 - 1:12 pm UTC
because the rules say "where 5 in ( 5, NULL )" is true because 5 is in (5).
But "where 5 NOT IN (1, null)" is unknown - it is unknown if "5 is not in (null)"
ops$tkyte%ORA11GR1> select * from dual where 5 in (5,null); D - X ops$tkyte%ORA11GR1> select * from dual where 5 NOT in (1,null); no rows selected
Could you help with this query, how to evalue it?
jian huang zheng, October 05, 2008 - 8:40 am UTC
Hello Tom Thanks for clearing my doubts about the not in. I have following query , could you please help how conceptually evaluate this query? SQL> create table t1 ( x int); Table created. SQL> create table t2 ( x int); Table created. SQL> select * from t1; X ---------- 1 2 3 rows selected. SQL> select * from t2; X ---------- 1 3 3 rows selected. SQL> select * from t1,t2 where exists ( select null from t2 where t1.x=t2.x); X X ---------- ---------- 1 1 1 3 1 3 rows selected. SQL> select * from t1,t2 where exists ( select null from t1 where t1.x=t2.x); X X ---------- ---------- 1 1 2 1 1 3 rows selected. I am not sure how those two query are executed, Most appreciated if you can give an explaination. Thanks~
October 06, 2008 - 2:49 pm UTC
... select * from t1,t2 ...
join every row in t1 with every row in t2.
you have
1
2
and
1
3
that results in
t1.x,t2.x
1,1
1,3
2,1
2,3
then - apply the filter:
where exists ( select null from t2 where t1.x=t2.x);
to each row.
You can always envision:
select *
from <table list>
where <predicate>
as a cartesian join of all tables in the table list (every row in table t1 joined with every row in t2, joined with every row in t3 and so on - so if you had 10 rows in each table - you'd end up with 10x10x10 rows in the result set) AND THEN run the predicate, the filter against them.
very clear explaination, and one more question about 'in' execution
jian huang zheng, October 09, 2008 - 10:22 am UTC
Hi Tom
Thanks for your explaination about the sql execution. one more question i am curious to know is:
suppose i issue sql : select * from table1 where a in (select a from table2),suppose hash semi-join is used , does oracle have to finish the hash join for all rows before feeding back the resultset to the client or during the execution?
Or is there a universal rule for oracle to feed back the result set, feed back during the sql execution or after the execution?
October 09, 2008 - 11:21 am UTC
with a hash (semi or not) join, the plan is typically like this:
hash join
full scan t1
full scan t2
One of the tables will be full scanned and hashed (hopefully into memory, but not necessarily)
Once that step has taken place, we can start full scanning the other table and probing the hash table for the matches
and start returning them.
So, you will full scan and hash one table BEFORE you get the first row back.
So, does Oracle have to finish the hash join for all rows? - No
exists
A reader, October 09, 2008 - 10:58 pm UTC
Tuning
vikram, November 12, 2008 - 10:56 am UTC
Total Records deleteing 20,000(how to tune the query)
delete from H6054 where entryid IN
(Select C1 from T6054 Where C536870913=LoginName); ---- consumes 8 secs
delete from T6054 where C536870913=LoginName; ---- consumes 6 secs
--COMMIT;
November 13, 2008 - 4:12 pm UTC
sounds good to me - but then I have no information of any use, so I get to make it up...
there are lots of indexes, and the delete hits them all.
there is no index on entryid and you are removing 20,000 out of 2,000,000 records
there is no index on C536870913 and you are finding 1 record out of 100,000
things like that - I'll just assume they are true - so therefore, 8 and 6 seconds is really fast.
no tables
no indexing scheme
no plans
no tkprofs
in short - nothing to really look at.
IN & Exists
Jyothsna, May 03, 2010 - 1:44 am UTC
In Rooms table ive 6 rows (2,4,8,16,32,64 )
if i fire
select roomsize from rooms
where 30>roomsize
Then o/p is 4 rows (2,4,8,16)
But the following gives
select roomsize from rooms
where exists (select 1 from rooms
where 30 > roomsize)
It gives all the 6 rows
Why it happens???
May 06, 2010 - 11:35 am UTC
what the heck is a rooms table?
anyway, if there are any rows in rooms such that "30 > roomsize" then the subquery "(select 1 from rooms
where 30 > roomsize)" obviously returns a record - doesn't it (we started with the supposition that there are some rows in rooms such that 30>roomsize).
Now, if that row exists - then "select x from t where exists (that subquery we know returns AT LEAST one row)" - would return every row from T - since the subquery by our definition here returns at least one row.
explain plan for in and exists
Chirayu, June 30, 2010 - 10:47 pm UTC
Hi Tom,
Oracle Database Version 11G
We had a bug in our application where in the original query was like this..
select bank_branch_name, bank_name, bank_branch_id
from xx_bank_branches
WHERE BANK_BRANCH_ID IN
(SELECT BANK_BRANCH_ID
from xx_bank_accounts a
WHERE a.ACCOUNT_TYPE <> 'EXTERNAL' )
order by bank_branch_name, bank_name ;
I thought of re-writing the query as
select bank_branch_name, bank_name, bank_branch_id
from xx_bank_branches b
where EXISTS ( SELECT 'x' FROM
xx_BANK_ACCOUNTS A
WHERE A.ACCOUNT_TYPE <> 'EXTERNAL'
and a.bank_branch_id = b.bank_branch_id )
order by bank_branch_name, bank_name;
However when I compared the explain plan.. the plan generated by both the queries was exactly the same..
My question is did CBO treat both the queries as similar ?
July 06, 2010 - 2:13 pm UTC
answered this exact question elsewhere.
Yes, here
Sutaria Chirayu, July 07, 2010 - 10:02 pm UTC
Query Tuning
Vicky, March 15, 2011 - 1:43 am UTC
/* Formatted on 3/15/2011 10:37:11 AM (QP5 v5.126.903.23003) */
SELECT --gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 account_code
, --gcc.segment4,--RICO
SUBSTR(ap_main.description, 1, 40) journal_description
,(CASE WHEN gcc.segment3 IN ('1001', '1002') THEN SUBSTR(ap_main.doc_description, 1, 16) ELSE NULL END) document_1
,NULL document_2, SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_code_combinations gcc
,--gl.gl_period_statuses gps ,
( SELECT ap.JE_HEADER_ID, ap.je_line_num, MAX(ap.description) description, MAX(ap.doc_description) doc_description
FROM (SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, aid.description, NULL doc_description
FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
,apps.gl_je_headers gjh, apps.gl_je_lines gjl, apps.Ap_Invoice_Distributions_All aid
WHERE aeh.ae_header_id = ael.ae_header_id
AND AEL.SOURCE_TABLE = 'AP_INVOICE_DISTRIBUTIONS'
AND AEL.SOURCE_ID = AID.INVOICE_DISTRIBUTION_ID
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.je_line_num = imp.je_line_num
AND aeh.gl_transfer_flag = 'Y'
AND aeh.set_of_books_id = 1001
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
UNION
--select GJH.JE_HEADER_ID,gjl.je_line_num ,gjh.period_name ,aid.description
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, aid.description, b.description doc_description
FROM apps.ap_ae_headers_all aeh
,apps.ap_ae_lines_all ael
,apps.gl_import_references imp
,apps.ap_invoice_payments_all pay
,apps.ap_invoices_all inv
,apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.Ap_Invoice_Distributions_All aid
,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
b
WHERE aeh.ae_header_id = ael.ae_header_id
AND ael.source_table = 'AP_INVOICE_PAYMENTS'
AND ael.source_id = pay.invoice_payment_id
AND pay.invoice_id = inv.invoice_id
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
AND pay.invoice_id = jcca.TRX_id(+)
AND jcca.SOURCE(+) = 'AP'
AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
AND inv.invoice_id = aid.invoice_id
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, aid.description, NULL doc_description
FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
,apps.ap_invoices_all inv, apps.gl_je_headers gjh, apps.gl_je_lines gjl
,apps.Ap_Invoice_Distributions_All aid
WHERE aeh.ae_header_id = ael.ae_header_id
AND ael.source_table = 'AP_INVOICES'
AND ael.source_id = inv.invoice_id
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
AND inv.invoice_id = aid.invoice_id
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, gjl.description, b.description doc_description
FROM apps.ap_ae_headers_all aeh
,apps.ap_ae_lines_all ael
,apps.gl_import_references imp
,apps.AP_CHECKS_ALL AC
,apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
b
,apps.ap_invoice_payments_all pay
WHERE aeh.ae_header_id = ael.ae_header_id
AND ael.source_table = 'AP_CHECKS'
AND ael.source_id = AC.CHECK_ID
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
AND ac.check_id = pay.check_id
AND pay.invoice_id = jcca.TRX_id(+)
AND jcca.SOURCE(+) = 'AP'
AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM) ap
GROUP BY ap.JE_HEADER_ID, ap.je_line_num) ap_main
WHERE gjh.set_of_books_id = 1001
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
--and gjh.period_name =gps.period_name
AND gjh.je_header_id = gjl.je_header_id
AND gjl.code_combination_id = gcc.code_combination_id
AND ap_main.je_header_id = gjh.je_header_id
AND ap_main.je_line_num = gjl.je_line_num
UNION ALL
--ar
SELECT --gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 ccount_code
, --gcc.segment4,--RICO
SUBSTR(gjl.description, 1, 40) journal_description
,(CASE WHEN gcc.segment3 IN ('1001', '1002') THEN SUBSTR(ar_main.doc_description, 1, 16) ELSE NULL END) document_1
,NULL document_2, SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_code_combinations gcc
,( SELECT ar.JE_HEADER_ID, ar.je_line_num, MAX(ar.description) doc_description
FROM (SELECT GJH.JE_HEADER_ID, gjl.je_line_num, b.description
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_import_references gir
,-- ar_cash_receipts_all acr,
apps.AR_CASH_RECEIPTS_ALL ACR
,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
b
WHERE gjh.je_source = 'Receivables'
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gir.je_header_id = gjh.je_header_id
AND gir.je_line_num = gjl.je_line_num
-- NTP 4-Mar-11
--AND SUBSTR (gir.reference_2, 1, INSTR (gir.reference_2, 'C') - 1) =
-- TO_CHAR (acr.CASH_RECEIPT_ID)
AND TO_NUMBER(SUBSTR(gir.reference_2, 1, INSTR(gir.reference_2, 'C') - 1)) = acr.CASH_RECEIPT_ID
AND ACR.CASH_RECEIPT_ID = jcca.TRX_id(+)
AND jcca.SOURCE(+) = 'AR'
AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, b.description
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_import_references gir
,apps.ar_receivable_applications_all ara
,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
b
WHERE --gjh.je_source= 'Receivables'
gjh .je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gir.je_header_id = gjh.je_header_id
AND gir.je_line_num = gjl.je_line_num
AND NVL(gir.reference_2, 0) = TO_CHAR(ara.applied_customer_trx_id)
AND ara.applied_customer_trx_id = jcca.TRX_id(+)
AND jcca.SOURCE(+) = 'AR'
AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM) ar
GROUP BY ar.JE_HEADER_ID, ar.je_line_num) ar_main
WHERE gjh.set_of_books_id = 1001
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
--and gjh.period_name =gps.period_name
AND gjh.je_header_id = gjl.je_header_id
AND gjl.code_combination_id = gcc.code_combination_id
AND ar_main.je_header_id = gjh.je_header_id
AND ar_main.je_line_num = gjl.je_line_num
UNION ALL
--gl
SELECT --gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 account_code
, ---gcc.segment4,--RICO
SUBSTR(gjl.description, 1, 40) journal_description
,DECODE(gcc.segment3, '1001', SUBSTR(b.description, 1, 16), '1002', SUBSTR(b.description, 1, 16), NULL) document_1
,NULL document_2, SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_code_combinations gcc
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
WHERE gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.global_attribute1 = B.FLEX_VALUE
AND gjl.code_combination_id = gcc.code_combination_id
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
UNION ALL
SELECT ---gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 account_code
, ---gcc.segment4,--RICO
SUBSTR(gjl.description, 1, 40) journal_description, NULL document_1, NULL document_2
,SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
FROM apps.gl_je_headers gjh, apps.gl_je_lines gjl, apps.gl_code_combinations gcc
-- gl.gl_period_statuses gps
WHERE gjh.set_of_books_id = 1001
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
--and gjh.period_name =gps.period_name
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.code_combination_id = gcc.code_combination_id
AND(GJH.JE_HEADER_ID, GJL.JE_LINE_NUM) NOT IN
(SELECT A.JE_HEADER_ID, A.je_line_num
FROM (SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael
,apps.gl_import_references imp, apps.gl_je_headers gjh
,apps.gl_je_lines gjl, apps.Ap_Invoice_Distributions_All aid
WHERE aeh.ae_header_id = ael.ae_header_id
AND AEL.SOURCE_TABLE = 'AP_INVOICE_DISTRIBUTIONS'
AND AEL.SOURCE_ID = AID.INVOICE_DISTRIBUTION_ID
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.set_of_books_id = 1001
AND gjh.je_header_id = gjl.je_header_id
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
,apps.ap_invoice_payments_all pay, apps.ap_invoices_all inv, apps.gl_je_headers gjh
,apps.gl_je_lines gjl, apps.Ap_Invoice_Distributions_All aid
WHERE aeh.ae_header_id = ael.ae_header_id
AND ael.source_table = 'AP_INVOICE_PAYMENTS'
AND ael.source_id = pay.invoice_payment_id
AND pay.invoice_id = inv.invoice_id
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.set_of_books_id = 1001
AND gjh.je_header_id = gjl.je_header_id
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
AND inv.invoice_id = aid.invoice_id
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
,apps.ap_invoices_all inv, apps.gl_je_headers gjh, apps.gl_je_lines gjl
,apps.Ap_Invoice_Distributions_All aid
WHERE aeh.ae_header_id = ael.ae_header_id
AND ael.source_table = 'AP_INVOICES'
AND ael.source_id = inv.invoice_id
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.set_of_books_id = 1001
AND gjh.je_header_id = gjl.je_header_id
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
AND inv.invoice_id = aid.invoice_id
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
,apps.AP_CHECKS_ALL AC, apps.gl_je_headers gjh, apps.gl_je_lines gjl
WHERE aeh.ae_header_id = ael.ae_header_id
AND ael.source_table = 'AP_CHECKS'
AND ael.source_id = AC.CHECK_ID
AND ael.gl_sl_link_id = imp.gl_sl_link_id
AND imp.Gl_Sl_Link_Table = 'APECL'
AND gjh.je_header_id = imp.je_header_id
AND gjh.set_of_books_id = 1001
AND gjh.je_header_id = gjl.je_header_id
AND gjl.je_line_num = imp.je_line_num
AND aeh.set_of_books_id = 1001
AND aeh.gl_transfer_flag = 'Y'
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
--ar
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_import_references gir
,-- ar_cash_receipts_all acr,
apps.AR_CASH_RECEIPTS_ALL ACR
,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS'
AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
WHERE gjh.je_source = 'Receivables'
AND gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gir.je_header_id = gjh.je_header_id
AND gir.je_line_num = gjl.je_line_num
AND TO_NUMBER(SUBSTR(gir.reference_2, 1, INSTR(gir.reference_2, 'C') - 1)) = acr.CASH_RECEIPT_ID
AND ACR.CASH_RECEIPT_ID = jcca.TRX_id(+)
AND jcca.SOURCE(+) = 'AR'
AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM apps.gl_je_headers gjh
,apps.gl_je_lines gjl
,apps.gl_import_references gir
,apps.ar_receivable_applications_all ara
,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS'
AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
WHERE --gjh.je_source= 'Receivables'
gjh .je_header_id = gjl.je_header_id
AND gir.je_header_id = gjh.je_header_id
AND gjh.set_of_books_id = 1001
AND gir.je_line_num = gjl.je_line_num
AND NVL(gir.reference_2, 0) = TO_CHAR(ara.applied_customer_trx_id)
AND ara.applied_customer_trx_id = jcca.TRX_id(+)
AND jcca.SOURCE(+) = 'AR'
AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
--gl
UNION
SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
FROM gl.gl_je_headers gjh
,gl.gl_je_lines gjl
,(SELECT fvv.FLEX_VALUE, fvv.description
FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS'
AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
WHERE gjh.je_header_id = gjl.je_header_id
AND gjh.set_of_books_id = 1001
AND gjl.global_attribute1 = B.FLEX_VALUE
AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)) A
WHERE (A.PERIOD_NAME = :cp_period_name OR A.PERIOD_NAME = :cp_prior_period_name));
Query tuning
Vicky, March 15, 2011 - 1:44 am UTC
hi Tom,
Could you please help me in tuning this query.
Thanks,
Vicky
March 15, 2011 - 8:23 am UTC
in a word - no
in a sentence - it is huge, it is against tables I don't know, containing data I don't have a clue about, answering a question I don't understand. The odds that it is 100% "correct" (that every outer join is really necessary, that the approach taken is sound, that is even returns the right answer) is near 0% in my experience - therefore reverse engineering it - which itself would take a really really long time - isn't fruitful because the query is probably a bad idea in the first place.
In order to "tune a query", one needs:
o intimate knowledge of the schema - all of the constraints, everything.
o intimate knowledge of the question that needs to be answered
What one typically doesn't want (surprisingly) is the
o query you are trying to tune
Not that I'm asking you for the schema, etc - this is not "ask tom to tune your query"
Tuning on Exist
Prince, March 18, 2011 - 12:44 am UTC
Hi Tom,
Please explain me what is wrong in performance in this query and how to tune this. Thanks in advance..
UPDATE customer_table aaa
SET (acc_name, acc_flg, act_date) =
(
SELECT
DISTINCT name,
'Y',
DECODE(to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'), NULL, aaa.status_date, to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'))
FROM
account acc
WHERE
acc.acc_num = aaa.customer_id
AND aaa.success_flg = 'Y'
AND aaa.error_flg = 'N'
)
WHERE EXISTS
(
SELECT
DISTINCT name,
'Y',
DECODE(to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'), NULL, aaa.status_date, to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'))
FROM
account acc
WHERE
acc.acc_num = aaa.customer_id
AND aaa.success_flg = 'Y'
AND aaa.error_flg = 'N'
);
March 18, 2011 - 7:51 am UTC
If the optimizer is unable to merge the correlated subquery into the update - it might have to execute it at least once per row in the update.
I like MERGE for this
merge into customer_table aaa using account acc ON (aaa.customer_id = acc.customer_id and aaa.success_flg='Y' and aaa.error_flg='N') when matched then update set acc_name = acc.acc_name, acc_flg = 'Y', act_date = nvl(acc.act_date,status_date);
Instead of decoding and converting to strings (I assume that status_date is a date? ) just nvl it.
If the distinct is necessary - there is something really wrong with your data model, in a big way.
There is no way the predicate:
WHERE
acc.acc_num = aaa.customer_id
AND aaa.success_flg = 'Y'
AND aaa.error_flg = 'N'
)
on aaa.success_flg and aaa.error_flg belonged in the subquery - that should have been set in the where clause of the update itself - where exists (...) AND success_flg = 'Y' and error_flg = 'N'.
A reader, March 21, 2011 - 12:48 am UTC
In this case the account acc is the remote table which is having more than 30millions of rec.
earlier I was using the hint driving_site(acc) for the select statement to improve the performance.
So Now how I will use this hint or is there any alternate to improve the performance ?
Will appreciate an earliest reply.
Thanks
Prince
March 21, 2011 - 9:44 am UTC
The driving site hint is probably your best bet.
Doubt in NOT IN
Sridhar, May 27, 2011 - 1:19 am UTC
Dear Tom,
We have been experimenting in the usage of IN and NOT IN and we came across a small scenario:
When we executed the following SQL:
select SYSDATE from dual where '2' not in ('1','3')
We got sysdate as an Output, but when we used the following SQL:
select SYSDATE from dual where '2' not in ('1','3', NULL)
0 Records was selected.
What could be the reason for this?
May 27, 2011 - 10:35 am UTC
it works that way by definition.
When you compare "NON-NULL-VALUE" to "NULL-VALUE" the answer is "I don't know", it is not true, it is not false.
Therefore, it is "unknown" whether '2' is not in ( NULL ).
With in - we can "know"
where '2' in ( '1', '2', null )
we Know 2 is in that set. But we don't know if '3' is NOT IN that set...
It is by definition the way it works. Part of SQL
Very Good
pradeep sharma, August 08, 2011 - 6:32 am UTC
Amazing Question & Answers
why EXISTS ?
A reader, October 24, 2011 - 11:04 am UTC
Hi Tom,
when 'IN' and 'EXISTS' are interchangeable and we can do everything with IN whatever we can do with EXITS –
Why was the operator EXISTS introduced in ORACLE ?
Are there any special cases where we must use EXISTS only and can't use 'IN' ?
Many thanks
October 24, 2011 - 11:12 am UTC
Exists was introduced in SQL - the language specification. It is not an Oracle thing - it is a sql thing.
It is primarily syntactic convenience, semantic pleasantness. Sometimes "IN" is just "better sounding", sometimes exists is.
In and Exists can be used interchangeable.
Not In and Not Exists cannot - they are different (regarding treatment of nulls).
Just like you can say:
select * from t;
select ALL * from t;
interchangeably, or
select distinct * from t;
select unique * from t;
interchangeably...
In and exists both exist for the same reason "for" and "while" loops exist (even with goto which can be used instead of for or while as well)...
subquery
A reader, October 25, 2011 - 4:41 am UTC
Hi Tom,
1.Scalar subquery is a SPECIAL case of Single row return subquery which returns Single COLOUMN from single ROW.
We can use Scalar Subqueries wherever we can use single row subquery with single column and there are NO any differences between Scalar subquery and Single row retun subquery with single coloumn.
Could you please clarify if the above correct ?
2.I already asked the below question in new thread but did not get answer -SORRY for that.. Can you please confirm if this is correct that
a)when the subquery is non-correlated and
b) when subquery is having aggregate functions count(*) with group by and
c) when it returns multiple rows -
First we will execute the subquery once and STORE the results in PGA -then we will execute main query with this as input ??
ex: select t.x,t.y from T
where t.x > (select max(a) from s group by b)
Thanks Tom for your help all the way !!
October 25, 2011 - 7:10 am UTC
1) what is a single row subquery?
You can use scalar subqueries anywhere you can use an expression.
select * from t where x = 10;
select * from t where x = (select 10 from dual);
There, a scalar subquery was used to replace the expression "10".
Now, if you have:
select * from t where x in (select 10 from dual);
the select 10 from dual is not a scalar subquery in that context, it is just a "subquery".
2) No, that is not how it will happen in all cases. We can and will do view merging when we find it to be the appropriate approach.
Never assume some order of operation outside of expressions with ()'s. That is
(5+3)/2
has a definite order of operation - but
where x > (5+3)/2 and (y < 55 and z > 4)
You cannot assume that y&z are evaulated and then x is (or that x is, and then y and z, or etc etc etc).
All you know is (5+3)/2 - IF evaluated - will be done 'correctly' and everything else happens in some order.
Subqueries, inline views, etc etc etc - all can be moved around, rewritten and so on.
Your query in general does not work - select max(a) from s group by b would IN GENERAL return 0 to N records where N can be any number greater than zero. To see "t.x > (select max(a) from s group by b)" work - either
o s would have to be empty, that query returns 0 rows.
o b is unique in S, that query returns 1 rows.
If b has more than one value, that query returns more than 1 rows and will fail in this context. You need a scalar subquery here!
Now, if you wrote instead;
where t.x > ANY(select max(a) from s group by b)
or
where t.x > ALL(select max(a) from s group by b)
then your query would become a subquery and could work even if b was not unique in s.
A reader, October 25, 2011 - 9:30 am UTC
Hi Tom,
thanks for the reply , It was my misatke in the 2nd point - missing ANY..
ok, you said -
select * from t where x = (select 10 from dual);
There, a scalar subquery was used to replace the expression "10".
Now, if you have:
select * from t where x in (select 10 from dual);
the select 10 from dual is not a scalar subquery in that context, it is just a "subquery".
Why - is it because of the operator 'IN' ?
October 25, 2011 - 11:53 am UTC
yes, IN requires a set
= requires an expression.
a scalar subquery is an expression of sorts.
The algorithm by which EXIST, NOT EXISTS, In & NOT IN operates
NB, November 15, 2011 - 3:45 pm UTC
Can u explain the diff between the Output of these queries?? i.e. The algorithm by which EXIST, NOT EXISTS, In & NOT IN operates.
SELECT * FROM ALL_WKSCRATCHPAD_DB.TB1;??
SELECT * FROM ALL_WKSCRATCHPAD_DB.TB2;
Number Name
1 brad 30
2 tom 20
3 John 10
Number Name
1 brad 30
2 tom 20
3 anil 40
LOCKING ROW FOR ACCESS
SELECT *
FROM ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE NOT EXISTS (SELECT ''
FROM ALL_WKSCRATCHPAD_DB.TB2 T2
WHERE T1.Name = T2.name)
;
Ans:
Number Name
1 John 10
---------------------------
LOCKING ROW FOR ACCESS
SELECT *
FROM ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE T1.Name NOT IN (SELECT T2.name
FROM ALL_WKSCRATCHPAD_DB.TB2 T2
WHERE T1.Name = T2.name)
;
Number Name
1 John 10
----------------------------------
LOCKING ROW FOR ACCESS
SELECT *
FROM ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE EXISTS (SELECT ''
FROM ALL_WKSCRATCHPAD_DB.TB2 T2
WHERE T1.Name <> T2.name)
;
Number Name
1 brad 30
2 tom 20
3 John 10
-------------------------------
LOCKING ROW FOR ACCESS
SELECT *
FROM ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE T1.Name IN (SELECT T2.name
FROM ALL_WKSCRATCHPAD_DB.TB2 T2
WHERE T1.Name <> T2.name)
;
Number Name
November 16, 2011 - 9:33 am UTC
IN = Exists, they the same.
NOT IN is a *little* different from NOT EXISTS, but usually the same.
select * from t1 where not exists (select null from t2 where t2.x = t1.x);
select * from t1 where t1.x NOT IN (select t2.x from t2);
IF t2.x is NOT NULL - those two queries are the same.
IF t2.x is NULLABLE and there is at least one value of t2.x that is null in T2 - then the second query returns ZERO rows - whereas the not exists might return some rows.
but if you have:
select * from t1 where t1.x NOT IN (select t2.x from t2 where t2.x is not null)
then they would be the same again.
A reader, November 16, 2011 - 1:30 pm UTC
<quote>
Your query in general does not work - select max(a) from s group by b would IN GENERAL return 0 to N records where N can be any number greater than zero. To see "t.x > (select max(a) from s group by b)" work - either
o s would have to be empty, that query returns 0 rows.
o b is unique in S, that query returns 1 rows.
<quote>
little correction....
for second point it should not be unique but all record should have same value for b in order to return single row
November 16, 2011 - 1:33 pm UTC
for second point it should not be unique but all record should have same value
for b in order to return single row
no, b must be unique in S.
If today - b has only one value - but tomorrow it might have more than one - it would not work.
Therefore, for this query to work, B must be unique in S.
A reader, November 16, 2011 - 2:02 pm UTC
"Therefore, for this query to work, B must be unique in S."
in that case "(select max(a) from s group by b)" will return multiple rows and give multiple rows return error does not execute at all...
November 17, 2011 - 6:43 pm UTC
ah, I see the point now, yes, B must have a single unique value.
not in vs not exists - primary key
Surya, May 01, 2012 - 6:09 pm UTC
Hi Tom,
First of all, thank you very much for spending your valuable time in answering all our questions.
I have a question regarding not in vs not exists.
db version: 11gr2
Suppose I have a query like this,
select * from tab1 t1 where
(col1,col2,col3) not in (select col1,col2,col3 from tab2 t2);
If (col1,col2,col3) in tab2 is a primary key and so doesn't return any NULL values, can I rewrite the above query as below?
select * from tab1 t1 where
not exists(select null from tab2 t2 where t2.col1=t1.col1 and t2.col2=t1.col2 and t2.col3=t1.col3);
Appreciate all your help. Thank you.
May 02, 2012 - 1:07 pm UTC
yes, they would be semantically equivalent as long as c1,c2,c3 are not null
Not Exists computationally expensive
Nagakiran, July 24, 2012 - 8:51 pm UTC
Hi Tom,
I have been trying to fetch rows from table A if it doesn't exist multiple times in table B..
Eg: Table A
ID Value
1 ABC
1 ABC
1 ABC
2 DEF
2 DEF
Table B
1 ABC
1 ABC
2 DEF
When I compare,these two tables,I need to show that there are two rows in the output as
as ID Value
1 ABC
2 DEF
I tried using the following query
select A.ID ,A.value,
from A
where not exists (select * from B where A.id= B.id and A.value= B.value).
However,I don't get the required output.
I tried using LEFT OUTER JOIN too,
SELECT A.ID, A.value LEFT OUTER JOIN B
ON A.ID = B.ID AND A.value = B.value
Even this doesn't work..
Kindly help me on this.
July 30, 2012 - 9:18 am UTC
no creates
no inserts
no look
not exists works strange
pfunk, August 10, 2012 - 2:14 am UTC
Hi Tom,
i have faced with the strange query execution behavior.
In brief, I has two tables.
First - rem3_credit with "LINK" column as primary key.
Second - credit_doc with "LINK" column as primary key and nullable column "LINK_UP" referenced on rem3_credit.link.
Data was loaded via SQL*Loader with setting DIRECT=Y, so reference constraint had not checked during load.
I know about special settings in SQL*Loader which force reference constraints been checked after load but interesting issue was found when I decide to find rows in rem3_credit that are not in credit_doc.
My query was:
SQL> select * from rem3_credit rc where not exists (select * from credit_doc cd where cd.link=rc.link_up); no rows selected
But actually there was one!
SQL> select rc.link,rc.link_up fk_link, cd.link pk_link from rem3_credit rc, credit_doc cd where cd.link(+)=rc.link_up and cd.link is null; LINK FK_LINK PK_LINK ---------- ---------- ---------- 2202 24031
There is the same result if i`d query such way:
SQL> select link,link_up from rem3_credit rc where not exists (select * from credit_doc cd where cd.link=rc.link_up and cd.link is not null); LINK LINK_UP ---------- ---------- 2202 24031
But why I need to write additional predicate if I know (and Oracle knows) that credit_doc.link is primary key and can`t be null?
Let`s take a look at execution plan:
SQL> select * from TABLE(dbms_xplan.display_cursor('','','ALLSTATS LAST')); SQL_ID 1cbvtut47qfv1, child number 0 ------------------------------------- select /*+ GATHER_PLAN_STATISTICS */ * from rem3_credit rc where not exists (select * from credit_doc cd where cd.link=rc.link_up) Plan hash value: 1161649700 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 15 | |* 1 | TABLE ACCESS FULL| REM3_CREDIT | 1 | 1 | 0 |00:00:00.01 | 15 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RC"."LINK_UP" IS NULL)
Execution plan of "correct" query is:
SQL> select * from TABLE(dbms_xplan.display_cursor('','','ALLSTATS LAST')); SQL_ID av9w7ftq1bu06, child number 0 ------------------------------------- select /*+ GATHER_PLAN_STATISTICS */ rc.link,rc.link_up fk_link, cd.link pk_link from rem3_credit rc, credit_doc cd where cd.link(+)=rc.link_up and cd.link is null Plan hash value: 2146757845 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 452 | |* 1 | FILTER | | 1 | | 1 |00:00:00.01 | 452 | | 2 | NESTED LOOPS OUTER| | 1 | 8 | 797 |00:00:00.01 | 452 | | 3 | TABLE ACCESS FULL| REM3_CREDIT | 1 | 798 | 797 |00:00:00.01 | 16 | |* 4 | INDEX UNIQUE SCAN| PK_CREDIT_DOC | 797 | 12208 | 796 |00:00:00.01 | 436 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("CD"."LINK" IS NULL) 4 - access("CD"."LINK"="RC"."LINK_UP")
I suppose there is some kind of query rewrite or transformation. Or simply a bug. By the way:
SQL> select * from v$version; Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
Can you give some clarification about what happening here? Thank you in advance.
August 17, 2012 - 1:39 pm UTC
please utilize support for something like this.
not exists works strange
pfunk, August 10, 2012 - 2:44 am UTC
Sorry, i`ve made a mistake in table description.
Correct description is:
First - rem3_credit with "LINK" column as primary key and nullable column "LINK_UP" referenced on credit_doc.link.
Second - credit_doc with "LINK" column as primary key.
Balaji, August 14, 2012 - 9:04 am UTC
When you use IN / NOT IN, the query retrieves the records from subquery and then checks the existance of the master record in the subquery. It will not consider NULL Records.
Incase of Exists/Not Exists, the where condition will return a boolean (true/false) and display the desired output. It is faster in performance than IN and NOT IN.
Another better method is using Left/Right outer join.
August 17, 2012 - 2:21 pm UTC
there is so much wrong with what you just said. so much. I don't know where to start.
did you know we can and will rewrite your IN to be Exists, your Exists to be In, your not in as not exists, your not exists as not in???
they are the same to us - we use them as we see fit.
http://www.oracle.com/technetwork/issue-archive/2006/06-may/o36asktom-099107.html
When you use IN / NOT IN, the query retrieves the records from subquery and
then checks the existance of the master record in the subquery. It will not
consider NULL Records.
I don't know what you mean by "it will not consider NULL records" - but I have a feeling that whatever you would say it means is probably wrong. Not in and VERY null sensitive.
ops$tkyte%ORA11GR2> create table t1 ( x int ); Table created. ops$tkyte%ORA11GR2> create table t2 ( x int ); Table created. ops$tkyte%ORA11GR2> ops$tkyte%ORA11GR2> insert into t1 values(1); 1 row created. ops$tkyte%ORA11GR2> insert into t1 values(2); 1 row created. ops$tkyte%ORA11GR2> insert into t2 values(1); 1 row created. ops$tkyte%ORA11GR2> ops$tkyte%ORA11GR2> select * from t1 where x not in (select x from t2); X ---------- 2 ops$tkyte%ORA11GR2> insert into t2 values(null); 1 row created. ops$tkyte%ORA11GR2> select * from t1 where x not in (select x from t2); no rows selected ops$tkyte%ORA11GR2>
and you know what, ask yourself, if this were true:
It is faster in performance than
IN and NOT IN.
then why would anyone 'invent' IN and NOT IN????
Jess, August 21, 2012 - 7:57 am UTC
Hi Tom,
What about 'in' vs. '<>'?
I have a query that looks like this:
select txn.* from txn_tbl partition (20100101) txn, attribute_tbl attr
where txn.attrib_fk = attr.attrib_pk and attr.status <> 'A';
Select from big transaction table based on one attribute. The 'status' column in the latter has 3 values and a bitmap index on it. I need to return all transaction with attributes that have any of 2 values.
Run as it is (attr.status <> 'A'), I get the following plan:
--------------------------------------------------------------------------------------------------
Id |Operation |Name |Rows|Bytes|Cost(CPU|Time |Pstr|Pstp|IN-OUT
--------------------------------------------------------------------------------------------------
0|SELECT STATEMENT | | 18M|2912M|6168 (2)|00:01:52| | |
1| PX COORDINATOR | | | | | | | |
2| PX SEND QC (RANDOM) |:TQ10001 | 18M|2912M|6168 (2)|00:01:52| | | P->S
* 3| HASH JOIN | | 18M|2912M|6168 (2)|00:01:52| | | PCWP
4| BUFFER SORT | | | | | | | | PCWC
5| PX RECEIVE | |128 | 512 | 3(34)|00:00:01| | | PCWP
6| PX SEND BROADCAST |:TQ10000 |128 | 512 | 3(34)|00:00:01| | | S->P
7| VIEW |index$_join$_02|128 | 512 | 3(34)|00:00:01| | |
* 8| HASH JOIN | | | | | | | |
9| BITMAP CONV'N TO ROWIDS| |128 | 512 | 1 (0)|00:00:01| | |
*10| BITMAP INDEX FULL SCAN|ATTRIB_TBL_BMI1| | | | | | |
11| INDEX FAST FULL SCAN |ATTRIB_PK_IDX |128 | 512 | 1 (0)|00:00:01| | |
12| PX BLOCK ITERATOR | | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWC
13| TABLE ACCESS FULL |TXN_TBL | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWP
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("TXN"."ATTRIB_FK"="ATTR"."ATTRIB_PK")
8 - access(ROWID=ROWID)
10 - filter("ATTR"."STATUS"<>'A')
In one of the releases, the query was changed to the following "to make it easier to understand which values we're bringing back":
select txn.* from txn_tbl partition (20100101) txn, attribute_tbl attr
where txn.attrib_fk = attr.attrib_pk and attr.status in ('B','C');
The plan for this version is this:
--------------------------------------------------------------------------------------------------
Id |Operation |Name |Rows|Bytes|Cost(CPU|Time |Pstr|Pstp|INOUT
--------------------------------------------------------------------------------------------------
0|SELECT STATEMENT | | 18M|2912M|6169 (2)|00:01:52| | |
1| PX COORDINATOR | | | | | | | |
2| PX SEND QC (RANDOM) |:TQ10001 | 18M|2912M|6169 (2)|00:01:52| | | P->S
* 3| HASH JOIN | | 18M|2912M|6169 (2)|00:01:52| | | PCWP
4| BUFFER SORT | | | | | | | | PCWC
5| PX RECEIVE | |128 | 512 | 4(25)|00:00:01| | | PCWP
6| PX SEND BROADCAST |:TQ10000 |128 | 512 | 4(25)|00:00:01| | | S->P
* 7| VIEW |index$_join$_02|128 | 512 | 4(25)|00:00:01| | |
* 8| HASH JOIN | | | | | | | |
9| INLIST ITERATOR | | | | | | | |
10| BITMAP CONV'N TO ROWIDS| |128 | 512 | 2 (0)|00:00:01| | |
*11| BITMAP IDX SINGL VALUE|ATTRIB_TBL_BMI1| | | | | | |
12| INDEX FAST FULL SCAN |ATTRIB_PK_IDX |128 | 512 | 1 (0)|00:00:01| | |
13| PX BLOCK ITERATOR | | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWC
14| TABLE ACCESS FULL |TXN_TBL | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWP
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("TXN"."ATTRIB_FK"="ATTR"."ATTRIB_PK")
7 - filter("ATTR"."STATUS"='B' OR "ATTR"."STATUS"='C')
8 - access(ROWID=ROWID)
11 - access("ATTR"."STATUS"='B' OR "ATTR"."STATUS"='C')
With an additional access predicate and 'in list' operation, I would've thought that the second version would produce a different plan. But the plans appear virtually identical in terms of cost. I tried it with a larger number of values in the 'in' list against another table, but that didn't seem to affect the plan either. Does that mean that doing <> is no better than doing in ()? I would've thought that saying 'not A' would yield better performance than listing 'B, C, D, E' as possible 'in' values...
Thanks as always.
August 28, 2012 - 1:16 pm UTC
But the plans appear
virtually identical in terms of cost.
that should be expected, not unexpected!! The cost if retrieving the same amount of data should be....
the same.
I'm disappointed when it *doesn't* work out that way!!!
Does that mean that doing <> is no better than doing in ()?
I would've thought that saying 'not A' would yield better performance than
listing 'B, C, D, E' as possible 'in' values...
no, it doesn't "mean" that. It just means that to get 18,000,000 rows from a table - most of the work, the time, the energy is going to be spent doing IO.
Jess, August 29, 2012 - 2:41 pm UTC
Thanks Tom.
I suppose I was expecting the plans to be different (erroneously so) in a sense that one would be better than the other one, which was not the case.
So then with the io being spent either way to get the rows, does it make it a purely sematic debate as to whether <> or () should be used, or is there a meaningful difference between the two?
Thanks as always
August 29, 2012 - 2:53 pm UTC
In general, != will preclude an index range access path whereas "in (...)" would allow it.
in your case, no index range scanning was to be used so it didn't matter.
so, where x <> 42 would probably never use an index range scan to find the rows.
but where x in ( 1, 2, 3 ) would or could.
Jess, August 29, 2012 - 5:41 pm UTC
Thanks Tom.
That's a really neat piece of information. I didn't know this.
EXISTS with multiple OR vs UNION ALL
Prakash Rai, December 19, 2012 - 1:46 pm UTC
Hi Tom -
Someone in the thread asked to tune the query with multiple EXISTS clause with OR. I have the similar case with multiple EXISTS that is producing the wong result. I replaced subsequents EXISTS with UNION ALL and changed to one EXISTS clause and get the correct result.
Though I am not sure why that would make the difference. Do you see any reason?
--This clause misses some records
WHERE
EXISTS ( SELECT 1 FROM springs.si_jde_schedrcpts_work sr
WHERE t.order_id = sr.t.order_id
AND t.item = sr.item
AND t.fromloc = sr.fromloc
AND sr.order_status < 90
AND sr.supplysource =1 )
OR EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND
inventory.qty => t.stqty
AND sr.order_staus > 90
)
OR EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND
inventory.qty => t.stqty
AND sr.supplysource=1
)
VS
--this produces the expected result
WHERE
EXISTS ( SELECT 1 FROM springs.si_jde_schedrcpts_work sr
WHERE t.order_id = sr.t.order_id
AND t.item = sr.item
AND t.fromloc = sr.fromloc
AND sr.order_status < 90
AND sr.supplysource =1
UNION ALL
SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND
inventory.qty => t.stqty
AND sr.order_staus > 90
UNION ALL
SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc AND
inventory.qty => t.stqty
AND sr.supplysource=1
)
Thanks an always!
Prakash
December 20, 2012 - 4:57 pm UTC
please contact support with this one...
(and use two subqueries, not three - just or the last condition on the last two)
If what you say is true, it looks like a bad plan (bug) is being generated for the first query - but I cannot confirm that as I don't have the schema or the entire query and cannot verify that.
EXISTS with multiple OR vs UNION ALL
Prakash Rai, December 20, 2012 - 9:47 am UTC
Follow up to my previous post;
Somewhere I read, use of multiple EXISTS with OR clause like below will not execute subsequent clause and that may vary if a column used in the predicate is indexed. e.g. SR.SUPPLYSOURCE. That's exactly the case I tested, but WHY? Why the standalone index makes the query behave differently?
I see INDEX or UNION ALL are fix or work around but curious what's the reason to Oracle behaves differently?
....
WHERE
EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.supplysource = 1 )
OR EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.supplysource = 2 )
OR EXISTS( SELECT 1 FROM stsc.inventory inventory
WHERE t.item = inventory.item
AND t.fromloc = inventory.loc
AND inventory.qty => t.stqty
AND sr.supplysource =3 )
Thanks as always!
Prakash
December 20, 2012 - 5:23 pm UTC
please see above.
EXISTS with multiple OR vs UNION ALL
Prakash Rai, December 20, 2012 - 10:21 pm UTC
Tom - Thanks for your response. In real scenario I have 5 OR EXISTS and data breaks after 3rd clause. I had my peers and DBA looked into it and came to the similar conclusion that it is a bug. While we open SR with Oracle, I am applying workaround with UNION ALL, that preserves the performance and produce the correct result.
Thanks as always.
Prakash
January 02, 2013 - 8:19 am UTC
please only ever post the REAL QUERY.
You just wasted your time and my time by having me look at a fake example that does not represent what you really are faced with.
Now we have no idea if you are facing a bug or not - because we have never seen the actual query you are running.
The union all might not be giving you the right answer by the way. If more than one branch of the OR is satisfied - you are going to get the same row more than once with a union all.
Rahul, December 29, 2012 - 11:48 am UTC
Hi Tom ,
Your link provided for IN and exists is awesome . Though for NOT IN AND NOT EXISTS you gave us the example of the case which was mishandled by NOT IN .So if i ignore such cases then the working of NOT IN and NOT EXISTS should be similar to IN AND EXISTS. I mean , in case of NOT IN - The sub-query is transformed to a view with distinct values and a join . but an equi join wouldn't be needed here .So how does it work ?
Also for NOT EXISTS ,suppose the inner table is big enough and indexed on the column .Since we arent matching any values, i suppose the index wont be used. so how does NOT EXISTS work ?
January 04, 2013 - 11:23 am UTC
NOT IN and NOT EXISTS are not the same in the way that IN and EXISTS are.
they (not in/not exists) deal with NULLS very differently.
with not in/exists - the subuquery doesn't have to be distincted - we can do a thing called a semi-join - so we don't necessarily need to distinct them. We'd use an outer join and keep only the rows where the joined to table was "null" (not there at all).
and we can use an index easily for both not in and not exists if it makes sense.
ops$tkyte%ORA11GR2> create table t1 ( x int, y int, z int ); Table created. ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 100 ); PL/SQL procedure successfully completed. ops$tkyte%ORA11GR2> create table t2 ( x int not null, y int, z int ); Table created. ops$tkyte%ORA11GR2> create index t2_idx on t2(x); Index created. ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 1000000, numblks => 100000 ); PL/SQL procedure successfully completed. ops$tkyte%ORA11GR2> ops$tkyte%ORA11GR2> set autotrace traceonly explain ops$tkyte%ORA11GR2> select * 2 from t1 3 where NOT EXISTS (select null from t2 where t2.x = t1.x) 4 / Execution Plan ---------------------------------------------------------- Plan hash value: 2403176152 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 5200 | 29 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 100 | 5200 | 29 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| T1 | 100 | 3900 | 29 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | T2_IDX | 1 | 13 | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("T2"."X"="T1"."X") ops$tkyte%ORA11GR2> set autotrace off
EXISTS with multiple OR vs UNION ALL
Prakahs Rai, January 02, 2013 - 12:59 pm UTC
Tom -
I apologize for not posting the actual query.
Here is the query that I replaced "OR" with "UNION ALL".
In the third EXIST clause, if I explicitly convert ics.cac_segment_type (which is number field) to TO_NUMBRE (as "TO_NUMBER(ics.cac_segment_type) = 1.3", the query fully executes the next evaluation "ics.cac_segment_type= 1.4" and returns the correct result, or else only partiatially evaluates the next statmenet and skips about 20% of "ics.cac_segment_type= 1.4" matchings. The use of TO_NUMBER() changed the plan to "INDEX STORAGE FAST FULL SCAN" from "INDEX RANGE SCAN". This solution was found as hit and trial that I don't see the logical reasoning and also impacted the performance unacceptably.
UNION ALL produces the correct result and there is no counter impact. Though I don't see the point as you said "The union all might not be giving you the right answer by the way....". Can you please elaborate why would top query returns multiple records if the EXIST satisfies with multiple clauses. EXIST will only validate EXIST = Yes or No. What am I missing?
Thanks,
Prakash
SELECT
...
FROM
table1 x
WHERE EXISTS
(SELECT 'X'
FROM local_gtt ics
WHERE ics.cac_segment_type = 1.1
AND ics.buyer_bus_org_fk = x.buyerorg
AND ics.ui_buyer_org_exists =1
AND ics.segment_value = x.segment1
UNION ALL
SELECT 'X'
FROM local_gtt ics
WHERE ics.cac_segment_type =1.2
AND ics.buyer_bus_org_fk = x.buyerorg
AND ics.ui_buyer_org_exists = 1
AND ics.segment_value = x.segment2
UNION ALL
SELECT 'X'
FROM local_gtt ics
WHERE ics.cac_segment_type = 1.3
AND ics.buyer_bus_org_fk = x.buyerorg
AND ics.ui_buyer_org_exists = 1
AND ics.segment_value = x.segment3
UNION ALL
SELECT 'X'
FROM local_gtt ics
WHERE ics.cac_segment_type = 1.4
AND ics.buyer_bus_org_fk = x.buyerorg
AND ics.ui_buyer_org_exists = 1
AND ics.segment_value = x.segment4
UNION ALL
SELECT 'X'
FROM local_gtt ics
WHERE ics.cac_segment_type = 1.5
AND ics.buyer_bus_org_fk = x.buyerorg
AND ics.ui_buyer_org_exists = 1
AND ics.segment_value = x.segment5
)
Difference in execution plan and execution time
Amit, February 01, 2013 - 3:34 pm UTC
Hi Tom,
I have following queries with execution plans. Both return exact same data. First query uses normal joins, it shows execution plan with low cost and executes faster. Second query uses EXISTS, it's cost is much higher and execution is 10 times slower compared to first query.
I am trying to understand reason for the difference between the execution plans of these 2 queries. Is there any way so that I can put HINT to second query so that its plan becomes better?
Thanks!
Table details -
Table_name Num_rows
--------------------------------
TBL_CALENDAR 4217
TBL_ACCOUNT 41355
TBL_DATA_SRC 12
TBL_POSN 377259
---------------------------------------------------------------------------------------------
First Query -
SELECT S.ROWID
FROM TBL_POSN S,
TBL_ACCOUNT A,
TBL_DATA_SRC D,
(
SELECT TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY') MMYYYY,
MAX(CASE WHEN TO_CHAR(TBL_CALENDAR_DATE, 'D') IN (2, 3, 4, 5, 6) THEN TBL_CALENDAR_DATE ELSE NULL END) LAST_WORKING_DAY,
MAX(LAST_DAY(TBL_CALENDAR_DATE)) LAST_CALENDAR_DAY
FROM TBL_CALENDAR
GROUP BY TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY')
) LAST_DAY
WHERE 1=1
AND REPORT_DATE < '31-OCT-12'
AND REPORT_DCDE = 'I'
AND A.AC_SK = S.AC_SK
AND A.DATA_SRC_ID = D.DATA_SRC_ID
AND S.REPORT_DATE >= D.BACKLOAD_CUTOFF_DATE
and TO_CHAR(s.REPORT_DATE, 'MMYYYY') = LAST_DAY.MMYYYY
and (
a.DATA_SRC_ID in(1, 2)
OR (
a.DATA_SRC_ID in(3, 4, 5, 6, 10)
AND s.REPORT_DATE <> LAST_DAY.LAST_WORKING_DAY
)
or (
A.DATA_SRC_ID = 8
and s.REPORT_DATE <> LAST_DAY.LAST_CALENDAR_DAY
)
);
Plan hash value: 1112228868
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11926 | 815K| 529 (15)| | |
|* 1 | HASH JOIN | | 11926 | 815K| 529 (15)| | |
|* 2 | HASH JOIN | | 857 | 40279 | 519 (15)| | |
| 3 | NESTED LOOPS | | 8574 | 326K| 362 (19)| | |
|* 4 | TABLE ACCESS FULL | TBL_DATA_SRC | 1 | 12 | 2 (0)| | |
| 5 | PARTITION RANGE ITERATOR| | 14291 | 376K| 360 (19)| KEY | KEY |
|* 6 | INDEX FAST FULL SCAN | XU_TBL_POSN_PK | 14291 | 376K| 360 (19)| KEY | KEY |
| 7 | TABLE ACCESS FULL | TBL_ACCOUNT | 41355 | 323K| 154 (4)| | |
| 8 | VIEW | | 4200 | 96600 | 8 (25)| | |
| 9 | HASH GROUP BY | | 4200 | 33600 | 8 (25)| | |
| 10 | INDEX FULL SCAN | TBL_CALENDAR_PK| 4217 | 33736 | 6 (0)| | |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("LAST_DAY"."MMYYYY"=TO_CHAR(INTERNAL_FUNCTION("S"."REPORT_DATE"),'MMYYYY'))
filter("A"."DATA_SRC_ID"=1 OR "A"."DATA_SRC_ID"=2 OR ("A"."DATA_SRC_ID"=3 OR "A"."DATA_SRC_ID"=4 OR
"A"."DATA_SRC_ID"=5 OR "A"."DATA_SRC_ID"=6 OR "A"."DATA_SRC_ID"=10) AND
"S"."REPORT_DATE"<>"LAST_DAY"."LAST_WORKING_DAY" OR "A"."DATA_SRC_ID"=8 AND
"S"."REPORT_DATE"<>"LAST_DAY"."LAST_CALENDAR_DAY")
2 - access("A"."AC_SK"="S"."AC_SK" AND "A"."DATA_SRC_ID"="D"."DATA_SRC_ID")
4 - filter("D"."BACKLOAD_CUTOFF_DATE"<'31-OCT-12')
6 - filter("REPORT_DCDE"='I' AND "REPORT_DATE"<'31-OCT-12' AND
"S"."REPORT_DATE">="D"."BACKLOAD_CUTOFF_DATE")
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
Second query -
SELECT ROWID
FROM TBL_POSN s
WHERE 1=1
AND REPORT_DATE < '31-OCT-12'
AND REPORT_DCDE = 'I'
and EXISTS
(
SELECT 1
FROM TBL_ACCOUNT A,
TBL_DATA_SRC D,
(
SELECT TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY') MMYYYY,
MAX(CASE WHEN TO_CHAR(TBL_CALENDAR_DATE, 'D') IN (2, 3, 4, 5, 6) THEN TBL_CALENDAR_DATE ELSE NULL END) LAST_WORKING_DAY,
MAX(LAST_DAY(TBL_CALENDAR_DATE)) LAST_CALENDAR_DAY
FROM TBL_CALENDAR
GROUP BY TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY')
) LAST_DAY
WHERE AC_SK = s.AC_SK
AND A.DATA_SRC_ID = D.DATA_SRC_ID
AND REPORT_DATE >= D.BACKLOAD_CUTOFF_DATE
and TO_CHAR(REPORT_DATE, 'MMYYYY') = LAST_DAY.MMYYYY
and (
a.DATA_SRC_ID in(1, 2)
OR (
A.DATA_SRC_ID IN(3, 4, 5, 6, 10)
AND REPORT_DATE <> LAST_DAY.LAST_WORKING_DAY
)
or (
A.DATA_SRC_ID = 8
and REPORT_DATE <> LAST_DAY.LAST_CALENDAR_DAY
)
)
);
Plan hash value: 723147282
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 27 | 456K (8)| | |
|* 1 | FILTER | | | | | | |
| 2 | PARTITION RANGE ITERATOR | | 285K| 7536K| 353 (17)| 1 | KEY |
|* 3 | INDEX FAST FULL SCAN | XU_TBL_POSN_PK | 285K| 7536K| 353 (17)| 1 | KEY |
| 4 | NESTED LOOPS | | 2 | 86 | 4 (0)| | |
| 5 | NESTED LOOPS | | 1 | 20 | 2 (0)| | |
| 6 | TABLE ACCESS BY INDEX ROWID| TBL_ACCOUNT | 1 | 8 | 1 (0)| | |
|* 7 | INDEX UNIQUE SCAN | PK_TBL_ACCOUNT | 1 | | 1 (0)| | |
|* 8 | TABLE ACCESS BY INDEX ROWID| TBL_DATA_SRC | 1 | 12 | 1 (0)| | |
|* 9 | INDEX UNIQUE SCAN | PK_TBL_DATA_SRC| 1 | | 1 (0)| | |
|* 10 | VIEW | | 2 | 46 | 2 (0)| | |
| 11 | SORT GROUP BY | | 42 | 336 | 8 (25)| | |
|* 12 | INDEX FULL SCAN | TBL_CALENDAR_PK| 42 | 336 | 7 (15)| | |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ */ 0 FROM (SELECT /*+ */
TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY') "MMYYYY",MAX(CASE
TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'d')) WHEN 2 THEN "TBL_CALENDAR_DATE" WHEN 3 THEN
"TBL_CALENDAR_DATE" WHEN 4 THEN "TBL_CALENDAR_DATE" WHEN 5 THEN "TBL_CALENDAR_DATE" WHEN 6 THEN "TBL_CALENDAR_DATE"
ELSE NULL END ) "LAST_WORKING_DAY",MAX(LAST_DAY(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"))) "LAST_CALENDAR_DAY" FROM
"TBLDBO"."TBL_CALENDAR" "TBL_CALENDAR" WHERE TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY')=TO_CHAR(:B1,'M
MYYYY') GROUP BY TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY')) "LAST_DAY","TBLDBO"."TBL_DATA_SRC"
"D","TBLDBO"."TBL_ACCOUNT" "A" WHERE "AC_SK"=:B2 AND "A"."DATA_SRC_ID"="D"."DATA_SRC_ID" AND
"D"."BACKLOAD_CUTOFF_DATE"<=:B3 AND (("A"."DATA_SRC_ID"=1 OR "A"."DATA_SRC_ID"=2) OR ("A"."DATA_SRC_ID"=3 OR
"A"."DATA_SRC_ID"=4 OR "A"."DATA_SRC_ID"=5 OR "A"."DATA_SRC_ID"=6 OR "A"."DATA_SRC_ID"=10) AND
"LAST_DAY"."LAST_WORKING_DAY"<>:B4 OR "A"."DATA_SRC_ID"=8 AND "LAST_DAY"."LAST_CALENDAR_DAY"<>:B5)))
3 - filter("REPORT_DCDE"='I' AND "REPORT_DATE"<'31-OCT-12')
7 - access("AC_SK"=:B1)
8 - filter("D"."BACKLOAD_CUTOFF_DATE"<=:B1)
9 - access("A"."DATA_SRC_ID"="D"."DATA_SRC_ID")
10 - filter("A"."DATA_SRC_ID"=1 OR "A"."DATA_SRC_ID"=2 OR ("A"."DATA_SRC_ID"=3 OR "A"."DATA_SRC_ID"=4 OR
"A"."DATA_SRC_ID"=5 OR "A"."DATA_SRC_ID"=6 OR "A"."DATA_SRC_ID"=10) AND "LAST_DAY"."LAST_WORKING_DAY"<>:B1 OR
"A"."DATA_SRC_ID"=8 AND "LAST_DAY"."LAST_CALENDAR_DAY"<>:B2)
12 - filter(TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY')=TO_CHAR(:B1,'MMYYYY'))
February 04, 2013 - 9:17 am UTC
couple of comments on a few really *bad* practices:
a) TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY') MMYYYY
use trunc( dt, 'mm' ). keep the date a date, trunc to the month level. don't conver to a string (hugely expensive). especially when you need it to compare to something later: TO_CHAR(REPORT_DATE, 'MMYYYY') = LAST_DAY.MMYYYY. that should be "trunc(report_date,'mm') = last_day.mmyyyy.
trunc simply resets bytes at the end of a 7 byte field. to_char has to run through probably hundreds of thousands of lines of code!
b) CASE WHEN TO_CHAR(TBL_CALENDAR_DATE, 'D') IN (2, 3, 4, 5, 6)
compare numbers to numbers, dates to dates, strings to string. here you are comparing a string to a number - meaning there is an implicit conversion in there - meaning this code is *bad* (implicit conversions are *evil* - avoid them)
b) TO_CHAR(TBL_CALENDAR_DATE, 'd' )
beware of date functions that are dependent on where you are located.
ops$tkyte%ORA11GR2> alter session set nls_territory = america; Session altered. ops$tkyte%ORA11GR2> select to_char(sysdate,'d'), trunc(sysdate)-trunc(sysdate,'iw')+1 2 from dual; T TRUNC(SYSDATE)-TRUNC(SYSDATE,'IW')+1 - ------------------------------------ 2 1 ops$tkyte%ORA11GR2> alter session set nls_territory = germany; Session altered. ops$tkyte%ORA11GR2> select to_char(sysdate,'d'), trunc(sysdate)-trunc(sysdate,'iw')+1 2 from dual; T TRUNC(SYSDATE)-TRUNC(SYSDATE,'IW')+1 - ------------------------------------ 1 1 ops$tkyte%ORA11GR2>
your code will return different answers in different countries... you might consider using the IW approach - it is based on ISO standards (worldwide) instead of country by country settings.
c) MAX(LAST_DAY(TBL_CALENDAR_DATE)) LAST_CALENDAR_DAY
don't compute things you don't need. You only reference last_calendar_day in the outer query in the predicate and could easily have used last_day(mmyyyy)
d) AND REPORT_DATE < '31-OCT-12'
never compare dates to strings, strings to numbers, etc.
always compare dates to dates, strings to strings, numbers to numbers.
never rely on implicit conversions, consider:
ops$tkyte%ORA11GR2> create table t ( x date ); Table created. ops$tkyte%ORA11GR2> ops$tkyte%ORA11GR2> insert into t values ( to_date( '01-jan-2012', 'dd-mon-yyyy' ) ); 1 row created. ops$tkyte%ORA11GR2> ops$tkyte%ORA11GR2> select * from t where x < '31-oct-12'; no rows selected ops$tkyte%ORA11GR2> alter session set nls_date_format = 'dd-mon-yyyy'; Session altered. ops$tkyte%ORA11GR2> select * from t where x < '31-oct-12'; no rows selected ops$tkyte%ORA11GR2>
whoops, wrong answer all of a sudden.....
e) AND REPORT_DATE < '31-OCT-12'
you must have started programming in about 2001 or 2002 or later.... Anyone that programmed in the 1990's knows - it takes four characters for years. Always and forever.
I'm not a fan of hints at all - so no, I'm not going to give a magical hint - you might try using dynamic sampling set to level 3 or above - the exists query is getting what looks like an incorrect cardinality estimate. the first query gets 11-12,000 rows as a guess, the second gets 1 row as a guess. that is the root cause of the bad plan (incorrect cardinality estimates). I cannot tell really where it is incorrect (because you know your data, I do not - you know what indexes go with what tables - i do not, makes it really hard to read a plan...)
so, looking at these plans - where are the cardinality estimates off by a large amount. we'll start there - figure out how we might get the right cardinalities and then the right plan will just "fall out" from that.
as far as I know, these two queries are not equivalent. the join query could return many more rows than the exists. I don't know your constraints (primary keys, foreign keys, not null, etc) and hence cannot even tell you if these two queries are interchangeable!
Difference in execution plan and execution time
Manoj Kaparwan, February 02, 2013 - 5:33 pm UTC
Dear tom Here tried the following test on Oracle 10.2.0.4 emp table - from scott.emp big_table -- CTAS from all_objects man@ora10g:rac1> select a.rowid from emp a where a.hiredate >to_date('1987-04-19','yyyy-mm-dd') and exists ( select null from big_table b , big_table c where b.object_id = c.object_id ); 2 3 4 ROWID ------------------ AAANjIAAEAACMNcAAK Elapsed: 00:00:00.03 Execution Plan ---------------------------------------------------------- Plan hash value: 4226424926 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7692K| 146M| 223K (3)| 00:44:45 | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS FULL | EMP | 7692K| 146M| 223K (3)| 00:44:45 | | 3 | NESTED LOOPS | | 1 | 10 | 4 (0)| 00:00:01 | | 4 | INDEX FAST FULL SCAN| BIG_TBL_ID_IDX | 50503 | 246K| 2 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | BIG_TBL_ID_IDX | 1 | 5 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( EXISTS (SELECT /*+ */ 0 FROM "BIG_TABLE" "C","BIG_TABLE" "B" WHERE "B"."OBJECT_ID"="C"."OBJECT_ID")) 2 - filter("A"."HIREDATE">TO_DATE(' 1987-04-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 5 - access("B"."OBJECT_ID"="C"."OBJECT_ID") Then, I took tkprof ******************************************************************************** select a.rowid from emp a where a.hiredate >to_date('1987-04-19','yyyy-mm-dd') and exists ( select null from big_table b , big_table c where b.object_id = c.object_id ) call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.02 0 18 0 0 Fetch 2 0.00 0.00 0 4 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.01 0.03 0 22 0 1 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 61 Rows Row Source Operation ------- --------------------------------------------------- 1 FILTER (cr=22 pr=0 pw=0 time=1125 us) 1 TABLE ACCESS FULL EMP (cr=4 pr=0 pw=0 time=273 us) 1 NESTED LOOPS (cr=18 pr=0 pw=0 time=776 us) 1 INDEX FAST FULL SCAN BIG_TBL_ID_IDX (cr=15 pr=0 pw=0 time=638 us)(object id 55243) 1 INDEX RANGE SCAN BIG_TBL_ID_IDX (cr=3 pr=0 pw=0 time=92 us)(object id 55243) ******************************************************************************** So Oracle is doing right thing here - as usual. as soon as the 1st row is returned from the NL.. job is done .. no longer required joining each row from big_table a, to big_table c I can see estimation of row in explain plan is way off the reality... is this the reason we should not rely much on explain plan. so we would not go with the cost as shown in the case in above post - might be tkprof would reveal more?
February 04, 2013 - 9:40 am UTC
how did it come up with over 7,000 rows in EMP??????
ops$tkyte%ORA10GR2> create table emp as select * from scott.emp; Table created. ops$tkyte%ORA10GR2> create table big_table as select * from all_objects; Table created. ops$tkyte%ORA10GR2> alter table big_table add constraint bt_pk primary key(object_id); Table altered. ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'EMP' ); PL/SQL procedure successfully completed. ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'BIG_TABLE' ); PL/SQL procedure successfully completed. ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> set autotrace on explain ops$tkyte%ORA10GR2> select a.rowid 2 from emp a where a.hiredate >to_date('1987-04-19','yyyy-mm-dd') 3 and exists 4 ( select null from big_table b , big_table c where b.object_id = 5 c.object_id ); ROWID ------------------ AAARLfAAEAAASvUAAK Execution Plan ---------------------------------------------------------- Plan hash value: 2260623314 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 20 | 5 (0)| 00:00:01 | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS FULL | EMP | 1 | 20 | 3 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 2 | 20 | 2 (0)| 00:00:01 | | 4 | INDEX FAST FULL SCAN| BT_PK | 48879 | 238K| 2 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | BT_PK | 1 | 5 | 0 (0)| 00:00:01 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( EXISTS (SELECT 0 FROM "BIG_TABLE" "C","BIG_TABLE" "B" WHERE "B"."OBJECT_ID"="C"."OBJECT_ID")) 2 - filter("A"."HIREDATE">TO_DATE(' 1987-04-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 5 - access("B"."OBJECT_ID"="C"."OBJECT_ID") ops$tkyte%ORA10GR2> set autotrace off
post your step by steps to reproduce please.
..Difference in execution plan and execution time
Manoj Kaparwan, February 04, 2013 - 7:32 pm UTC
Dear Tom,
My bad !
emp table in MAN schema had stats way off.
man@ora10g:rac1> select owner, table_name, num_rows from dba_tables where table_name ='EMP';
OWNER TABLE_NAME NUM_ROWS
------------------------------ ------------------------------ ----------
MAN EMP 100000000
later I gather stats.
man@ora10g:rac1> exec dbms_stats.gather_table_stats(user, 'EMP', cascade=>true);
PL/SQL procedure successfully completed.
man@ora10g:rac1> select owner, table_name, num_rows, last_analyzed from dba_tables where table_name ='EMP';
OWNER TABLE_NAME NUM_ROWS LAST_ANALYZED
------------------------------ ------------------------------ ---------- -------------------
MAN EMP 14 2013-02-05 12:25:31
man@ora10g:rac1> explain plan for
2 select a.rowid
from emp a where a.hiredate >to_date('1987-04-19','yyyy-mm-dd')
and exists
( select null from big_table b , big_table c where b.object_id = c.object_id ); 3 4 5
Explained.
man@ora10g:rac1> @xplan
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4226424926
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20 | 7 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 20 | 3 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 10 | 4 (0)| 00:00:01 |
| 4 | INDEX FAST FULL SCAN| BIG_TBL_ID_IDX | 50503 | 246K| 2 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | BIG_TBL_ID_IDX | 1 | 5 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ */ 0 FROM "BIG_TABLE" "C","BIG_TABLE" "B"
WHERE "B"."OBJECT_ID"="C"."OBJECT_ID"))
2 - filter("A"."HIREDATE">TO_DATE(' 1987-04-19 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))
5 - access("B"."OBJECT_ID"="C"."OBJECT_ID")
21 rows selected.
Not in and Not exists
ajit, February 20, 2013 - 6:54 pm UTC
Hi tom,
I read most of the discussions above.
I just want to confirm one thing.
If we are dealing with columns which do not have null values..can not in/not exists be interchanged in that case?
Thanks in advance
Big fan :)
February 25, 2013 - 10:09 am UTC
yes, if the column involved in the NOT IN select list (the projected column(s)) are defined as NOT NULL - then not in and not exists would be semantically equivalent.
why "not in" unable to compare null.
ranjan, August 29, 2013 - 8:38 pm UTC
Hi Tom,
Hope you are fine :)
#############
Create Table A As Select Level A From Dual Connect By Level<=5;
create table b as select level+2 a from dual connect by level<=5 union select null from dual;
Select * From A;
1
2
3
4
5
Select * From B;
3
4
5
6
7
Select * From A Where A not In (Select A From B);
no rows selected
select * from a where a in (select a from b);
3
4
5
##############
You told at the top:
--------
Beware the NULL value
and NOT IN!! (also the reason why NOT IN is sometimes avoided).
---------
That is true but why oracle doesn't handle as it handles in case of "in".
I mean to say how oracle skip the values "1" and "2" in case of "in" which is not correct (as we know null cant be compared with anything (null is unknown), how it calculates that "1" in "null" and "2" in "null" are false).
Many people say that, that is side effect of "not in" , but could you please say something :).
September 04, 2013 - 6:22 pm UTC
because we are beholden to the ANSI standard to tell us how to process the language and the standard says this is the way it is to be done.
it is in the definition of the language, a definition we subscribe to but are not in charge of.
hmm, thank you for the reply
Ranjan, September 08, 2013 - 10:10 pm UTC
Exact use of exists
Boob, June 17, 2014 - 2:43 pm UTC
Hi could you please sujjest me how exactly exit will be used.
NOT IN and NOT EXISTS
Niranjan, September 17, 2014 - 9:15 am UTC
Hi Tom,
You said, NOT IN and NOT EXISTS are not the same!!! logically I understand that but I did not quite get, as the result I get is the same (query below). Are you referring to saying (NOT THE SAME) that, EXISTS evaluates to true immediately on the first occurrence of the value during a scan on the table/index?
Used smaple HR schema.
select department_id from departments d where not exists (select null from employees e where e.department_id = d.department_id);
select department_id from departments d where department_id not in (select department_id from employees e where e.department_id = d.department_id);
getting null values from record
lavanya, July 27, 2016 - 11:46 am UTC
Suppose customer is my table in that I want to find all the c ustomers where we don't have their phone numbers .how can I get it?
July 27, 2016 - 1:23 pm UTC
select *
from customers
where phone_number is null
Not seeing difference in null handling between NOT IN and NOT EXISTS
Stacey Baker, March 27, 2017 - 10:27 pm UTC
Hi Tom,
I'm still struggling with the difference in null handling between NOT IN and NOT EXISTS. The example you gave that worked, only worked because the correlated subquery inner join eliminated the null in the table (emp 7839 has no manager):
select count(*) from emp t1 where not exists ( select null from emp t2 where mgr = t1.empno);
If I add "or m.mgr is null", I get 0 records, the same answer I get with NOT IN.
select count(*) from emp t1 where not exists ( select null from emp t2 where mgr = t1.empno or mgr is null);
Also, if I use NOT IN in your NOT EXISTS query, it returns the correct count of employees who aren't managers:
select count(*) from emp t1 where empno not in (select mgr from emp t2 where mgr = t1.empno);
How can I see the difference?
March 28, 2017 - 1:06 am UTC
You can generally transpose between the two if you *change the sql*, because it is basically about sets and set membership. Tom's point was that simply switching "not in" to "not exists" is not a guarantee it will be the same result.
Smrdíš
A reader, September 26, 2017 - 12:01 pm UTC
Not sure I understand...
Randy Strauss, October 13, 2017 - 8:14 pm UTC
Other answer said/implied that the set in:
select count(*) from emp where empno not in ( select mgr from emp );
evaluated to ( NULL ) - why? It seems like it should evaluate to a set that includes managers and a null (does a set automatically discard redundant values?) . Is that correct? And are you saying this wouldn't work- that NULL would make the set poorly defined?
Would this work better/well?:
select count(*) from emp where empno not in ( select distinct(mgr) from emp where mgr is not null);
October 16, 2017 - 12:27 pm UTC
And are you saying this wouldn't work- that NULL would make the set poorly defined?
Effectively, yes.
Would this work?:
select count(*) from emp where empno not in ( select distinct(mgr) from emp where mgr is not null);
Yes.
(B) if You Were to Write a Truth Table for F1, How Many Rows Evaluate to True? Why?
Source: https://asktom.oracle.com/pls/apex/f?p=100:11:0::NO::P11_QUESTION_ID:442029737684
0 Response to "(B) if You Were to Write a Truth Table for F1, How Many Rows Evaluate to True? Why?"
Publicar un comentario