Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 318998 - [DB] Support queryXRefs()
Summary: [DB] Support queryXRefs()
Status: CLOSED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: cdo.db (show other bugs)
Version: 4.0   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Stefan Winkler CLA
QA Contact: Eike Stepper CLA
URL:
Whiteboard: Power to the People
Keywords: noteworthy
Depends on:
Blocks: 300149
  Show dependency tree
 
Reported: 2010-07-06 08:20 EDT by Eike Stepper CLA
Modified: 2011-06-23 03:39 EDT (History)
3 users (show)

See Also:


Attachments
draft (17.81 KB, patch)
2010-07-11 11:42 EDT, Stefan Winkler CLA
no flags Details | Diff
patch-v1 (30.00 KB, patch)
2010-07-13 08:23 EDT, Stefan Winkler CLA
no flags Details | Diff
Patch v2 - ready to be committed (32.65 KB, patch)
2010-07-13 11:48 EDT, Eike Stepper CLA
no flags Details | Diff
Excludes derived references (1.96 KB, patch)
2010-07-21 04:55 EDT, Erwin Betschart CLA
stepper: iplog+
Details | Diff
Fixes xref sql query. (1.39 KB, patch)
2010-07-21 05:22 EDT, Erwin Betschart CLA
stepper: iplog+
Details | Diff
Testcase for queryXRefs (3.77 KB, patch)
2010-07-23 11:06 EDT, Erwin Betschart CLA
stepper: iplog+
Details | Diff
Patch combined - for future reference (8.06 KB, patch)
2010-07-24 08:45 EDT, Eike Stepper CLA
no flags Details | Diff
Patch v4 - test 320690 is failing (6.62 KB, patch)
2010-07-24 09:30 EDT, Eike Stepper CLA
no flags Details | Diff
Patch v5 - for future reference (13.94 KB, patch)
2010-07-25 05:12 EDT, Eike Stepper CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eike Stepper CLA 2010-07-06 08:20:24 EDT
See bug 300149
Comment 1 Stefan Winkler CLA 2010-07-11 11:42:02 EDT
Created attachment 173979 [details]
draft

first draft of infrastructure - not finished yet

I came across two issues I need clarified first.
a.) what about feature-maps? any idea how XRefs and featuremaps could be interrelated? (I suppose, currently we don't support that, but I'm uncertain whether feature-maps could include potential XRefs)

b.) I'm not sure how to formulate the SQL Query:

Approach 1: make one SQL Query for each individual target ID
"SELECT cdo_id from foobar where foobar.ref = 123"
"SELECT cdo_id from foobar where foobar.ref = 456"

Pro: Simple
Con: Slow

Slow because I query objects which simply cannot be in that field, because they do not conform to the feature specification (e.g., foobar.ref may in no case reference OID456 because it's of another type than OID123).
-> Optimization A: query eClass of each target ID and only query those target IDs which could match

Con: Has to be done for each candidate source reference


Approach 2: make one SQL Query for all target IDs
"SELECT cdo_id from foobar where foobar.ref in (123,456)"
Pro: only one SQL statement gets issued
Con: does not scale - there's a limit to the length of SQL statements. If there's a large amount of target IDs, the SQL might get too long.

-> Optimization A could be applied here as well, so that the candidate list gets shorter.
Con: Still does not scale ultimately


Approach 3: make several SQL queries of type 2 but regard the max length of SQL statements
"SELECT cdo_id from foobar where foobar.ref in (123,...,...,..... < awful lot of stuff here >)"
"SELECT cdo_id from foobar where foobar.ref in (456,...,...,..... < more stuff here >)"

Pro: more scalable than alternative 2 and faster than alternative 1
Con: complex, more implementation effort

Eike, do you have another idea, maybe? Or do you prefer the simple solution (1), perhaps with optimization A?
Comment 2 Eike Stepper CLA 2010-07-12 04:11:39 EDT
(In reply to comment #1)
> Approach 2: make one SQL Query for all target IDs
> "SELECT cdo_id from foobar where foobar.ref in (123,456)"
> Pro: only one SQL statement gets issued
> Con: does not scale - there's a limit to the length of SQL statements. If
> there's a large amount of target IDs, the SQL might get too long.

That's what I had in mind.

> -> Optimization A could be applied here as well, so that the candidate list
> gets shorter.

What was Optimization A?

> Con: Still does not scale ultimately
> 
> 
> Approach 3: make several SQL queries of type 2 but regard the max length of SQL
> statements
> "SELECT cdo_id from foobar where foobar.ref in (123,...,...,..... < awful lot
> of stuff here >)"
> "SELECT cdo_id from foobar where foobar.ref in (456,...,...,..... < more stuff
> here >)"

This is something we can consider if/when someone complains. I expect the target set not to get too large in typical scenarios.
Comment 3 Eike Stepper CLA 2010-07-12 04:12:56 EDT
(In reply to comment #2)
> What was Optimization A?

Oh, I found it! I don't think that we need that right now.
Comment 4 Stefan Winkler CLA 2010-07-13 08:23:02 EDT
Created attachment 174133 [details]
patch-v1

Ok, implemented as agreed.

The implementation passes the one XRef test currently in existence.
Needs to be tested further. (e.g. the case of isMany() source references is not tested at all, currently).
Comment 5 Eike Stepper CLA 2010-07-13 11:48:06 EDT
Created attachment 174168 [details]
Patch v2 - ready to be committed
Comment 6 Stefan Winkler CLA 2010-07-14 10:30:24 EDT
Committed to HEAD
Comment 7 Erwin Betschart CLA 2010-07-21 04:55:40 EDT
Created attachment 174821 [details]
Excludes derived references
Comment 8 Erwin Betschart CLA 2010-07-21 04:57:21 EDT
Please reopen this bug.

Attached a patch which excludes derived features from the xref check.

I confirm to
1) The number of lines that you changed is smaller than 250.
2) You are the only author of these changed lines.
3) You apply the EPL to these changed lines.

for attachment 174821 [details].
Comment 9 Erwin Betschart CLA 2010-07-21 05:22:54 EDT
Created attachment 174824 [details]
Fixes xref sql query.

AbstractListTableMapping.queryXRefs(...) causes some problems:

Original sql query:
SELECT l_t.cdo_source, l_t.cdo_value, l_t.cdo_idx FROM base_whitepage_LayerCondition_layers_listl_t, base_whitepage_LayerConditiona_t WHERE  AND a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch AND cdo_branch=0 AND (cdo_revised=0) AND cdo_value IN (4572)

... FROM base_whitepage_LayerCondition_layers_listl_t ...
should be 
... FROM base_whitepage_LayerCondition_layers_list AS l_t ...

Query is then:
SELECT l_t.cdo_source, l_t.cdo_value, l_t.cdo_idx FROM base_whitepage_LayerCondition_layers_list AS l_t, base_whitepage_LayerCondition AS a_t WHERE  AND a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch AND cdo_branch=0 AND (cdo_revised=0) AND cdo_value IN (4572)

... WHERE  AND a_t.cdo...
should be 
... WHERE some clause not AND
is due to listJoin starting with an AND. Therefore changed order of mainTableWhere and listJoin.

Resulting query:
SELECT l_t.cdo_source, l_t.cdo_value, l_t.cdo_idx FROM base_whitepage_LayerCondition_layers_list AS l_t, base_whitepage_LayerCondition AS a_t WHERE cdo_branch=0 AND (cdo_revised=0) AND a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch AND cdo_value IN (4572)

This causes:
org.h2.jdbc.JdbcSQLException: Ambiguous column name CDO_BRANCH; SQL statement:

Changing:
builder.append(mainTableWhere);
to:
builder.append("a_t." + mainTableWhere);
gives working query:
SELECT l_t.cdo_source, l_t.cdo_value, l_t.cdo_idx FROM base_whitepage_LayerCondition_layers_list AS l_t, base_whitepage_LayerCondition AS a_t WHERE a_t.cdo_branch=0 AND (cdo_revised=0) AND a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch AND cdo_value IN (4572)
Comment 10 Erwin Betschart CLA 2010-07-21 05:24:01 EDT
Attached a patch which fixes the sql query in queryXRefs().

I confirm to
1) The number of lines that you changed is smaller than 250.
2) You are the only author of these changed lines.
3) You apply the EPL to these changed lines.

for attachment 174824 [details].
Comment 11 Stefan Winkler CLA 2010-07-21 05:47:05 EDT
Reopened.

Re: Derived References

I think the better way would be to exclude derived references from the calculation of the candidates in the first place. Else we might get inconsistent behaviour in between different store implementations.


Re: SQL
you are right, this is an error. A testcase would be nice, because apparently this error is in there because the case of isMany()-Xref-candidates is not tested.
Comment 12 Erwin Betschart CLA 2010-07-23 02:50:14 EDT
I'll try to provide a testcase may take some time...

I talked with Eike about the problem with derived / transient references and he said that they should be excluded from the given context in which the search for cross references is done.
Comment 13 Erwin Betschart CLA 2010-07-23 11:06:53 EDT
Created attachment 175075 [details]
Testcase for queryXRefs
Comment 14 Erwin Betschart CLA 2010-07-23 11:08:13 EDT
I confirm to
1) The number of lines that you changed is smaller than 250.
2) You are the only author of these changed lines.
3) You apply the EPL to these changed lines.

for attachment 175075 [details].



Btw.: isMany seems to work after sql & transient fixes...
Comment 15 Eike Stepper CLA 2010-07-24 08:45:38 EDT
Created attachment 175140 [details]
Patch combined - for future reference

I've fixed the non-persistent references issue directly in XRefsQueryHandler.
Comment 16 Eike Stepper CLA 2010-07-24 09:30:51 EDT
Created attachment 175147 [details]
Patch v4 - test 320690 is failing

Test 320690 is failing, but also with MEMStore. Investigating...
Comment 17 Eike Stepper CLA 2010-07-25 05:11:27 EDT
Stefan, I've found and fixed twoi issues with scalar xRefs:

1) Detached revisions must be excluded:
      builder.append(CDODBSchema.ATTRIBUTES_VERSION);
      builder.append(">0 AND ");
      
Stefan, does this have to be considered for many refs, too?

2) The idString must be respected:
      builder.append(" AND ");
      builder.append(valueField);
      builder.append(" IN ");
      builder.append(idString);
      
I've added extra checks in QueryXRefsContext.addXRef() to exclude NULL targets, but stores should optimize them away (DBStore does so now).

---

Another possible issue could be that both locking and xrefCheck can take a considerable amount of time. Should we consider async progress monitoring so that big commits do not time out?

---

Committed to HEAD
Comment 18 Eike Stepper CLA 2010-07-25 05:12:01 EDT
Created attachment 175171 [details]
Patch v5 - for future reference
Comment 19 Eike Stepper CLA 2011-06-23 03:39:35 EDT
Available in R20110608-1407