Community
Participate
Working Groups
Build Identifier: 2.1.1.v20100817-r805 I first posted this in the forum - now here http://www.eclipse.org/forums/index.php?t=rview&goto=638580#msg_638580 I´m using Eclipselink from the latest Glassfish 3.1 bundled package (2.1.1.v20100817-r8050) connected to an Oracle 11g (11.2.0.1.0) database and made a strange observation. I´m quering a table with round about 30.000 entries using the JPA2 criteria API (that doesn´t seem to matter). That table (ZBM) is joined (inner by annotation JoinFetch, ManyToOne) to another table (MEL). However this referenced table (MEL) has a column ("messageStateDescription" - mapped to "MELSTSBEZ") which can contain only ~3 different values and is not null. So you can say that at least more than 10.000 entries share the same value. And here it comes: If you query the table and order it by (only) this column (don´t think about the sense) and you use pagination (firstResult, maxResults) you get a strange behaviour. Pagination works on the first ~~100 entries and on the last ~~100 entries. Between this area you get the *same* result on every query. I know, this sounds weird but I broke it down to the SQL-query. Let´s say, I query with firstResult=24520 and maxResults=20 and later with firstResults=24540 - EclipseLink generates the following SQL-queries (removed the first_rows-hint and added a line break, but this doesn´t matter) SELECT * FROM (SELECT a.*, ROWNUM rnum FROM (SELECT t1.ID AS a1, t1.EGSJAH AS a2, t1.BESNUM AS a3, t1.EGSNUM AS a4, t1.MELJAH AS a5, t1.BESMEL AS a6, t1.MELNUM AS a7, t1.EVENTDATA_ID AS a8, t1.EVENT_ID AS a9, t1.MESSAGE_ID AS a10, t1.LAST_MODIFIED_BY_GROUP AS a11, t1.CREATED_BY_GROUP AS a12, t1.CREATED_AT_DATE AS a13, t1.AENDAT AS a14, t1.CREATED_BY_USER AS a15, t1.AENUSR AS a16 FROM MEL t0, ZBM t1 WHERE (t0.ID = t1.MESSAGE_ID) ORDER BY t0.MELSTSBEZ ASC) a WHERE ROWNUM <= 24520) WHERE rnum > 24500 ; SELECT * FROM (SELECT a.*, ROWNUM rnum FROM (SELECT t1.ID AS a1, t1.EGSJAH AS a2, t1.BESNUM AS a3, t1.EGSNUM AS a4, t1.MELJAH AS a5, t1.BESMEL AS a6, t1.MELNUM AS a7, t1.EVENTDATA_ID AS a8, t1.EVENT_ID AS a9, t1.MESSAGE_ID AS a10, t1.LAST_MODIFIED_BY_GROUP AS a11, t1.CREATED_BY_GROUP AS a12, t1.CREATED_AT_DATE AS a13, t1.AENDAT AS a14, t1.CREATED_BY_USER AS a15, t1.AENUSR AS a16 FROM MEL t0, ZBM t1 WHERE (t0.ID = t1.MESSAGE_ID) ORDER BY t0.MELSTSBEZ ASC) a WHERE ROWNUM <= 24540) WHERE rnum > 24520 ; The queries look good, but when i manually execute them (Quest TOAD) I really get the same result on both queries, only the column RNUM is different. It is irrelevant what number I place in "WHERE rnum > 24520" as long as it´s bigger than ~~100. When i manually modify the query to (just changed the outer WHERE-clause and deleted the inner) SELECT * FROM (SELECT a.*, ROWNUM rnum FROM (SELECT t1.ID AS a1, t1.EGSJAH AS a2, t1.BESNUM AS a3, t1.EGSNUM AS a4, t1.MELJAH AS a5, t1.BESMEL AS a6, t1.MELNUM AS a7, t1.EVENTDATA_ID AS a8, t1.EVENT_ID AS a9, t1.MESSAGE_ID AS a10, t1.LAST_MODIFIED_BY_GROUP AS a11, t1.CREATED_BY_GROUP AS a12, t1.CREATED_AT_DATE AS a13, t1.AENDAT AS a14, t1.CREATED_BY_USER AS a15, t1.AENUSR AS a16 FROM MEL t0, ZBM t1 WHERE (t0.ID = t1.MESSAGE_ID) ORDER BY t0.MELSTSBEZ ASC) a ) WHERE rnum > 24520 and rnum <= 24540 ; The result is as expected correct - it returns different rows depending on rnum. This happens *only* if you order your table by a column with a very low variation. I think that Oracle internally "optimizes" this query in a wrong way because the ordering is not deterministic (because of the low variation). For a workaround I now always add as last ordering-criteria the order by "ID" which works for me. But perhaps it would be possible to change EclipseLinks way of generating the pagination query to the statement i proposed... Reproducible: Always Steps to Reproduce: 1. Create a table with a lot of entries (like me ~30.000) 2. Fill a column with only few different values 3. Try the pagination SQL I posted (or try with EclipseLink directly)
I just made up a small testcase to reproduce the problem - which does it on my Oracle-DB ---------------------------------- create table t (id not null, col1, col2, data) as select rownum id, rownum col1, rownum col2, 'test' data from dual connect by level <= 30000; create table t2 (id not null, ref_id) as select rownum id, rownum ref_id from dual connect by level <= 30000; update t set data = 'tes3' where id < 2000; update t set data = 'tes2' where id < 1000; SELECT * FROM (SELECT a.*, ROWNUM rnum FROM ( select t.id, t.data from t, t2 where (t2.ref_id = t.id) order by data asc ) a WHERE ROWNUM <= 10040 ) WHERE rnum > 10010 ; SELECT * FROM (SELECT a.*, ROWNUM rnum FROM ( select t.id, t.data from t, t2 where (t2.ref_id = t.id) order by data asc ) a WHERE ROWNUM <= 10030 ) WHERE rnum > 10000 ; -------------------------- I formulated the 2 select-statements as eclipselink does - compare the output of the two. And then compare to SELECT * FROM (SELECT a.*, ROWNUM rnum FROM ( select t.id, t.data from t, t2 where (t2.ref_id = t.id) order by data asc ) a ) WHERE rnum > 10010 and rnum <= 10040 ; SELECT * FROM (SELECT a.*, ROWNUM rnum FROM ( select t.id, t.data from t, t2 where (t2.ref_id = t.id) order by data asc ) a ) WHERE rnum > 10000 and rnum <= 10030 ;
Setting target and priority. See the following page for the meanings of these fields: http://wiki.eclipse.org/EclipseLink/Development/Bugs/Guidelines
I will try and look into the test case, but in the mean time maybe you can add why the new query will work on a non-deterministic order by field. The suggested query should have the same problem, assuming that Oracle can return different rnum values for the 10,000 rows matching that order value in two different queries. I believe the only solution is to use a more deterministic order by field, such as the ID as you have mentioned.
The Eclipselink project has moved to Github: https://github.com/eclipse-ee4j/eclipselink