Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 195079 - duplicate detector for jira
Summary: duplicate detector for jira
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Mylyn (show other bugs)
Version: unspecified   Edit
Hardware: PC All
: P3 enhancement (vote)
Target Milestone: 3.0   Edit
Assignee: Eugene Kuleshov CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 197181
Blocks:
  Show dependency tree
 
Reported: 2007-07-02 01:23 EDT by Eugene Kuleshov CLA
Modified: 2007-08-02 01:04 EDT (History)
3 users (show)

See Also:


Attachments
stacktrace-based duplicate detector for jira (17.61 KB, patch)
2007-07-02 01:39 EDT, Eugene Kuleshov CLA
no flags Details | Diff
mylyn/context/zip (10.17 KB, application/octet-stream)
2007-07-02 01:40 EDT, Eugene Kuleshov CLA
no flags Details
stacktrace-based duplicate detector for jira (16.96 KB, patch)
2007-07-02 17:56 EDT, Eugene Kuleshov CLA
no flags Details | Diff
generified duplicate detection (41.97 KB, patch)
2007-07-13 19:40 EDT, Steffen Pingel CLA
no flags Details | Diff
mylyn/context/zip (17.75 KB, application/octet-stream)
2007-07-13 19:40 EDT, Steffen Pingel CLA
no flags Details
reincarnated patch (12.59 KB, patch)
2007-07-14 01:24 EDT, Eugene Kuleshov CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eugene Kuleshov CLA 2007-07-02 01:23:22 EDT
duplicate detector for jira
Comment 1 Eugene Kuleshov CLA 2007-07-02 01:39:57 EDT
Created attachment 72851 [details]
stacktrace-based duplicate detector for jira
Comment 2 Eugene Kuleshov CLA 2007-07-02 01:40:01 EDT
Created attachment 72852 [details]
mylyn/context/zip
Comment 3 Eugene Kuleshov CLA 2007-07-02 17:56:21 EDT
Created attachment 72898 [details]
stacktrace-based duplicate detector for jira

Rob, Mik, I've tried to generalize handling of the duplicate detectors and pushed methods to the super class and added filtering by repository kind. Though it is unclear if we can do that from the binary compatibility point of view. Please review this.
Comment 4 Mik Kersten CLA 2007-07-10 12:26:30 EDT
Tried to apply the patch but I don't quite get the design and it has not been outlined in the bug report.  It does not make sense to me that a duplicate detector would have be bound to a connector kind (e.g. "bugzilla") because the whole point of the current duplicate detectors is that they are generic and only coupled to our search facilities.
Comment 5 Eugene Kuleshov CLA 2007-07-10 12:54:44 EDT
(In reply to comment #4)
> Tried to apply the patch but I don't quite get the design and it has not been
> outlined in the bug report.  It does not make sense to me that a duplicate
> detector would have be bound to a connector kind (e.g. "bugzilla") because the
> whole point of the current duplicate detectors is that they are generic and
> only coupled to our search facilities.

Mik, I'll be happy if you can show me how to use duplicate detector used for Bugzilla, with JIRA connector.

As far as I know, we don't have any generic search facility, so this stuff got to be connector-specific. Maybe somebody had in mind something like you've described, but present implementation (even for Bugzilla) is nowhere near that. Note that if I just "fix" JIRA editors to enable duplicate detectors, I am going to get the bugzilla one in there, which is obviously not going to work.

So, all I did is fixed the implementation of the original design in order to allow other connectors to use this facility and UI. I am fine if you can fix this in some other way.
Comment 6 Steffen Pingel CLA 2007-07-10 21:33:39 EDT
There used to be a method in AbstractRepositoryTaskEditor that returned a query object when passed a string. This provided a very simple but generic way to search across connectors. 

Eugene, I agree that something like that is needed to fix this properly. Merging this patch (which also breaks API) instead of addressing the problem does not make much sense to me.
Comment 7 Eugene Kuleshov CLA 2007-07-10 23:39:22 EDT
I feel really stupid. AbstractDuplicateDetector does return SearchHitCollector (which I didn't change). However, org.eclipse.mylyn.internal.bugzilla.ide.StackTraceDuplicateDetector, which is contributed as extension point now constructs SearchHitCollector that is explicitly using  BugzillaRepositoryQuery. So, I moved the old method from JIRA task editor into similar extension point but specific to JIRA. But to make it work, abstract task editor had to show the right set of detectors, hence connector-specific. What am I missing?

BTW, I did brought issue about breaking API on the one of meetings and Mik said that it is fine to push up methods from an internal class (bugzilla task editor in this case), so it does not break the api.
Comment 8 Steffen Pingel CLA 2007-07-10 23:53:05 EDT
Not sure I understand the first part of your comment. I think the duplicate detectors do not have to be connector specific (but could be if the kind attribute is set). A generic search facility should be either added to the editor or repository connector and be used by the duplicate detector.

 (In reply to comment #7)
> BTW, I did brought issue about breaking API on the one of meetings and Mik said
> that it is fine to push up methods from an internal class (bugzilla task editor
> in this case), so it does not break the api.

Sorry, I wasn't specific enough which change breaks compatibility: Changing the visibility of AbstractRepositoryTaskEditor.getDuplicateSearchCollector() is not backwards compatible.
Comment 9 Eugene Kuleshov CLA 2007-07-11 00:05:57 EDT
(In reply to comment #8)
> Not sure I understand the first part of your comment. I think the duplicate
> detectors do not have to be connector specific (but could be if the kind
> attribute is set).

Right. I don't see why not. However current extension point and corresponding API don't have that. So, I've based my patch around that because I needed to fix detector for JIRA that we lost when new extension point been introduced.

> Sorry, I wasn't specific enough which change breaks compatibility: Changing the
> visibility of AbstractRepositoryTaskEditor.getDuplicateSearchCollector() is not
> backwards compatible.

That one is a mistake (actually a bug in Eclipse's refactoring) and I think I mentioned it somewhere, so that method should stay protected.
Comment 10 Steffen Pingel CLA 2007-07-11 00:13:41 EDT
 (In reply to comment #9)
> > Not sure I understand the first part of your comment. I think the duplicate
> > detectors do not have to be connector specific (but could be if the kind
> > attribute is set).
> 
> Right. I don't see why not. However current extension point and corresponding
> API don't have that. So, I've based my patch around that because I needed to fix
> detector for JIRA that we lost when new extension point been introduced.

Given the API of AbstractDuplicateDetector I don't see why this couldn't be fixed by extending the current API with a generic search facility and moving StackTraceDuplicateDetector out of Bugzilla making it work for any connector?
Comment 11 Eugene Kuleshov CLA 2007-07-11 00:25:00 EDT
(In reply to comment #10)
> Given the API of AbstractDuplicateDetector I don't see why this couldn't be
> fixed by extending the current API with a generic search facility and moving
> StackTraceDuplicateDetector out of Bugzilla making it work for any connector?

Sure. But that will be order of magnitude more difficult and api breaking task (3rd party connectors lose duplicate detection facility), comparing to my patch (which can be a middle ground until big task is done).

If that does not make sense to anyone, you should at least explicitly say in huge letters that current extension point for hyperlink detectors is crafted only for bugzilla task editor (even so it is declared in the tasks.ui plugin).
Comment 12 Steffen Pingel CLA 2007-07-11 00:36:22 EDT
 (In reply to comment #11)
> > Given the API of AbstractDuplicateDetector I don't see why this couldn't be
> > fixed by extending the current API with a generic search facility and moving
> > StackTraceDuplicateDetector out of Bugzilla making it work for any connector?
> 
> Sure. But that will be order of magnitude more difficult and api breaking task
> (3rd party connectors lose duplicate detection facility), comparing to my patch
> (which can be a middle ground until big task is done).

Since there is no time pressure now I don't see why this couldn't be fixed now. From my understanding this can be done without breaking the API. Any 3rd party connector has to implement a custom detector which will continue to work. To me making this type of (potentially painful and extensive) refactoring instead of adding yet another class that duplicates code is the difference between having committer and contributor status.
Comment 13 Eugene Kuleshov CLA 2007-07-11 01:07:41 EDT
(In reply to comment #12)
> To me making this type of (potentially painful and extensive) refactoring
> instead of adding yet another class that duplicates code is the difference
> between having committer and contributor status.

Hmm. Generalizing search been discussed probably from before 1.0 release, but it is been pushed away. Do you think it is for the best wait when it get implemented and leave broken feature? Personally I am not sure if complexity of such task does justify the value (given 750 other open issues), but that is obviously not my call.
Comment 14 Steffen Pingel CLA 2007-07-13 19:40:23 EDT
Created attachment 73784 [details]
generified duplicate detection

This patch introduces a new method in AbstractRepositoryConnector:

 AbstractRepositoryQuery createQuery(TaskRepository repository, RepositoryTaskData taskData, String searchText);

I think this is a better approach than your patch Eugene. We should discuss how we can further extend this to make it a more generic search facility than a simple text search.
Comment 15 Steffen Pingel CLA 2007-07-13 19:40:25 EDT
Created attachment 73785 [details]
mylyn/context/zip
Comment 16 Eugene Kuleshov CLA 2007-07-13 19:43:48 EDT
Steffen, now that there is API policy, what is your idea for 3rd party providers to use this?
Comment 17 Mik Kersten CLA 2007-07-13 20:50:03 EDT
(In reply to comment #12)
Steffen: I agree with the problem that you're outlining (we have taken in too many patches without tests and with duplicated code and have to stop) and agree that committers are on the hook.  But I'm less sure that we have communicated this clearly to committers, so I just posted a guideline here: http://wiki.eclipse.org/Mylyn_Contributor_Reference#Applying_Patches 

On quick inspection this patch does not appear to break any API.  However, this is a significant API addition so we should discuss.  

Rob: if you can do so please post your comments on Monday, otherwise please put this on the agenda for Tuesday's call.

Steffen: note that this patch introduces a modularity problem because mylyn.tasks.ui should not have any dependencies to programming-specific functionality (e.g. stack traces).  The challenge is that we might need a new plug-in for non-connector-specific programming-specific functionality, which I'm of course worried is too heavy weight, but otherwise we could get permanently stuck with this as API.

Comment 18 Eugene Kuleshov CLA 2007-07-13 21:33:06 EDT
Folks, this is all cool, but this bug report was actually created to address specific issue with JIRA connector. I understood that you don't want to fix this in an easy way, but if you want to mess up with a new API, why don't you create separate issue, and maybe make this one depend on it.
Comment 19 Steffen Pingel CLA 2007-07-13 23:51:19 EDT
(In reply to comment #18)
> Folks, this is all cool, but this bug report was actually created to address
> specific issue with JIRA connector. I understood that you don't want to fix
> this in an easy way, but if you want to mess up with a new API, why don't you
> create separate issue, and maybe make this one depend on it.

Okay, I should split the patch into the part that introduces the generic search and the part the refactors the duplicate detectors making them work for all connectors (which includes JIRA and addresses this bug report).

(In reply to comment #16)
> Steffen, now that there is API policy, what is your idea for 3rd party
> providers to use this?

Are you referring to the slightly changed semantics of the duplicate detection code or the generic search facility?

My patch currently is not complete as does not include sufficient checking if the duplicate detection will actually work since this will require connectors need to support the generic search API. I was mostly curious how long it would take to implement it (~30 min) and to support my point that it does not necessarily break API (see your comment #11).

Eugene, I just now realized that the current implementation of adding the duplicate detectors to the Bugzilla editor is broken since it ignores the kind attribute. It would therefore pick up the any other detector that implements the extension point and your patch fixes that, right? 

Could you please provide a cleaned up patch that does not have other unrelated changes (JiraTaskDataHandler, NewQueryAction) and does not break the API so we can review?

I am fine with addressing generic duplicate detection in a separate report.
Comment 20 Eugene Kuleshov CLA 2007-07-14 00:34:32 EDT
(In reply to comment #19)
> > Steffen, now that there is API policy, what is your idea for 3rd party
> > providers to use this?
> Are you referring to the slightly changed semantics of the duplicate
> detection code or the generic search facility?

I was asking how integrators would implement duplicate detectors using 2.0 API.
Comment 21 Eugene Kuleshov CLA 2007-07-14 01:24:26 EDT
Created attachment 73788 [details]
reincarnated patch

Here. 

It would have saved everyone's time if you guys actually looked at the first submission, so I wouldn't have to recreate changes from scratch after dropping them from my workspace.
Comment 22 Steffen Pingel CLA 2007-07-14 11:27:55 EDT
 (In reply to comment #20)
> I was asking how integrators would implement duplicate detectors using 2.0 API.

Duplicate detectors that have a kind attribute specified would only show up in the corresponding editor. All other duplicate detectors would be generic and show up in all editors. All 2.0 detectors need to specify a kind attribute since they can't rely on generic search facilities to be present (unless they use other API that works across all connectors).

 (In reply to comment #21)
> It would have saved everyone's time if you guys actually looked at the first
> submission, so I wouldn't have to recreate changes from scratch after dropping
> them from my workspace.

I did look at your original patch but it contained unrelated changes, API breakage and was not declared as a bug fix but a feature enhancement. Some additional documentation and a little bit of clean-up could have saved everyone a lot of time. You seem to assume that your time is more valuable than everyone else's time which I don't agree with.

The new patch looks good to me and seems to be an acceptable intermediate solution until this is fixed in a more generic way.
Comment 23 Eugene Kuleshov CLA 2007-07-14 13:14:04 EDT
(In reply to comment #22)
> I did look at your original patch but it contained unrelated changes, API
> breakage and was not declared as a bug fix but a feature enhancement. Some
> additional documentation and a little bit of clean-up could have saved everyone
> a lot of time. 

Let me remind what comments I received:

"I think the duplicate detectors do not have to be connector specific (but could be if the kind attribute is set). A generic search facility should be either added to the editor or repository connector and be used by the duplicate detector."

"It does not make sense to me that a duplicate detector would have be bound to a connector kind (e.g. "bugzilla") because the whole point of the current duplicate detectors is that they are generic and only coupled to our search facilities."

Note that I haven't been told about unrelated changes (and those actually hard to notice when you have 4 or 5 outgoing changesets, files just jump randomly). So, it was pretty obvious to me that neither of you wanted these changes and there is no point for me to keep the patch around.

> You seem to assume that your time is more valuable than everyone
> else's time which I don't agree with.

It is not the first time when I get that committers change mind and asking to resubmit already dropped patches. I don't know if you aware of that, but more then 2 outgoing changestes are practically unmanageable and I already have 5 of them sitting there for nearly two weeks now.
Comment 24 Steffen Pingel CLA 2007-07-14 14:53:09 EDT
This will be my last response on this bug report.

(In reply to comment #23)
> Note that I haven't been told about unrelated changes (and those actually hard
> to notice when you have 4 or 5 outgoing changesets, files just jump randomly).

I didn't realize at first that these changes were unrelated. To me it is part of submitting a patch to review it and make sure it is clean before attaching it (and Mylyn's attachment preview makes that easy enough).

> Let me remind what comments I received:
[...]

I stand by the comments I made and I still think we should improve the current API. Your patch contains an important bug fix and should be merged but unfortunately it took a long discussion to figure that out.

> It is not the first time when I get that committers change mind and asking to
> resubmit already dropped patches. I don't know if you aware of that, but more
> then 2 outgoing changestes are practically unmanageable and I already have 5 of
> them sitting there for nearly two weeks now.

I know that managing multiple changesets can be painful. That's one reason to reduce patches to a minimal amount of changes so they can easily be applied even when changes to the code base are made. 

(In reply to comment #18)
> I understood that you don't want to fix
> this in an easy way, but if you want to mess up with a new API, why don't you
> create separate issue, and maybe make this one depend on it.

I have no intentions to mess up or to mess the API up. Your comment is rather insulting to me and against the communication guidelines:  http://wiki.eclipse.org/index.php/Mylar_Contributor_Reference#Communication .
Comment 25 Eugene Kuleshov CLA 2007-07-14 15:36:59 EDT
(In reply to comment #24)
> Your comment is rather insulting to me and against the communication guidelines:
> http://wiki.eclipse.org/index.php/Mylar_Contributor_Reference#Communication .

From your comments I wasn't sure if you have read those, but didn't want to rub your nose into your several violations of your own.

Anyways, I don't know what is the general idea with "this will be my last comment" that seem to becoming a popular habit. Does this mean "won't fix" for the given issue? I hope you all realize that this is not a solution.
Comment 26 Robert Elves CLA 2007-07-17 13:26:16 EDT
In the interest of moving forward with this I think the next action on this report should be my review/comment on Steffen's patch. I will review this week.
Comment 27 Steffen Pingel CLA 2007-07-19 11:14:07 EDT
Since Eugene's most recent patch is essentially a fix to get duplicate detectors working for connectors other than Bugzilla it should be merged first. We can then discuss a more generic solution on a separate bug report.
Comment 28 Mik Kersten CLA 2007-07-19 13:29:40 EDT
Rob: when you're reviewing please note my comment#17 and the first part of Steffen's comment#22 .  The key is that we allow for generic duplicate detectors (e.g. ones that only require the repository search so that they can make simple textual analysis, such as stack trace entry matching, across connectors).  Connector-specific detectors would be a nice thing to support too if we have a driver for that.
Comment 29 Mik Kersten CLA 2007-07-19 13:48:29 EDT
To clarify on my comment above, I'm trying to distinguish between the "kind" of a detector and whether or not a duplicate detector should be applied to a particular editor, that's why I see a problem with the following extension point, which works around a limitation in our API.  Setting this bug for 3.0 because we need to improve the API to support proper modularity for duplicate detectors and this is a good driver.
      
      <detector
            class="org.eclipse.mylyn.internal.jira.ui.StackTraceDuplicateDetector" 
            name="Stack Trace"
            kind="jira">
            
I created bug 197181 for discussing how this should work in a way that does not combine the capability of a connector to do the searches for duplicate detection with custom duplicate detection.
Comment 30 Mik Kersten CLA 2007-07-19 14:06:14 EDT
 (In reply to comment #25)
> (In reply to comment #24)
> > Your comment is rather insulting to me and against the communication
> guidelines:
> > http://wiki.eclipse.org/index.php/Mylar_Contributor_Reference#Communication .
> 
> From your comments I wasn't sure if you have read those, but didn't want to rub
> your nose into your several violations of your own.
> 
> Anyways, I don't know what is the general idea with "this will be my last
> comment" that seem to becoming a popular habit. Does this mean "won't fix" for
> the given issue? I hope you all realize that this is not a solution.

Eugene: if you feel that Steffen did not communicate appropriately feel free to email me pointing out the problematic communication.

This portion of the discussion has unfortuantely degraded into being non technical.  First, please note that the guidelines are intended to be between users and contributors/committers, not committers and committers.  If it is unclear how committers should treat each other I can create additional guidelines but it is a simple as respecting others' opinions and maintaining a technical discussion.

Since there seems to be increasing problem in coming to a consensus, here is the protocol I would like us to use:
1) Each person can outline their technical points as comments.
2) If a discussion is not converging (e.g. after 2 comments back-and-forth) please ask another committer to chime in.
3) If it doesn't converge after that the project lead arbitrates, seeks additional input if needed, and then decides on the course of action.

In general I want to avoid getting to point (3) and I prefer running a project as a cohesive group of peers than with the project lead stepping in to arbitrate things.  But if a discussion gets non-technical or fails to converge due to different view points then we are stuck resorting to (3).
Comment 31 Robert Elves CLA 2007-07-20 12:11:55 EDT
Eugene, would it be possible to re-cut your patch as it isn't taking in my workspace.
Comment 32 Eugene Kuleshov CLA 2007-07-20 13:08:49 EDT
Rob, at this point I am not convinced that my my patch would help. So, why don't you just fix it. It is rather trivial.

Alternatively I can fix this particular issue for jira only, without going into the refactorings. However such fix will make dupe detector for jira appear in Bugzilla editor which you can fix up then, as well as do other cleanup or generalizations/refactorings you like.
Comment 33 Robert Elves CLA 2007-07-20 13:31:46 EDT
 (In reply to comment #32)
> Rob, at this point I am not convinced that my my patch would help. So, why don't
> you just fix it. It is rather trivial.
Okay, I won't have time to look at this today but maybe on weekend
Comment 34 Eugene Kuleshov CLA 2007-08-02 00:06:44 EDT
I created bug 198614 to handle code duplication. Common search infrastructure should be also handled separately.
Comment 35 Eugene Kuleshov CLA 2007-08-02 01:04:16 EDT
I've fixed this particular issue with the stack trace based duplicate detector for JIRA registered trough extension point.