
Bug 346947

Summary: Relevance Sorting for Subwords Completion Engine
Product: z_Archived
Reporter: Marcel Bruch <marcel.bruch>
Component: Recommenders
Assignee: Marcel Bruch <marcel.bruch>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: lerch, stevemash
Version: unspecified
Keywords: helpwanted, plan
Target Milestone: ---
Hardware: PC
OS: Mac OS X - Carbon (unsup.)
Whiteboard: completion, deferred
Bug Depends on: 350991
Bug Blocks:
Attachments:
  Benchmark plugin for relevance sorting (flags: none)
  NGrams algorithm (flags: none)
  EPL Licence headers (flags: marcel.bruch: iplog+)

Description Marcel Bruch CLA 2011-05-24 05:01:28 EDT
The subwords completion engine still uses the internal ranking of Eclipse JDT. However, when matching on subwords or regular expressions, a smarter ranking strategy is needed. For instance, proposals that match the prefix entered by the user should be ranked higher in the list than those that match only the simple regular expression. And completions that contain several of the typed characters in a row (which we call subwords), as in 'foc', which matches 'setFocus' quite well, should be ranked higher than proposals that merely match the individual characters.
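
The ordering described above can be sketched as a three-tier score. This is only an illustration, not code from the Recommenders repository: the class name SubwordRanking, the tier values, and the case-insensitive comparison are all assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; names and tier values are illustrative assumptions.
final class SubwordRanking {

    /** Higher is better: 3 = prefix match, 2 = contiguous subword, 1 = scattered characters, 0 = no match. */
    static int tier(String token, String proposal) {
        String t = token.toLowerCase();
        String p = proposal.toLowerCase();
        if (p.startsWith(t)) return 3;
        if (p.contains(t)) return 2;
        return isSubsequence(t, p) ? 1 : 0;
    }

    /** True if the characters of t occur in p in the same order, not necessarily adjacent. */
    static boolean isSubsequence(String t, String p) {
        int i = 0;
        for (int j = 0; j < p.length() && i < t.length(); j++) {
            if (p.charAt(j) == t.charAt(i)) i++;
        }
        return i == t.length();
    }

    /** Drops non-matching proposals and sorts the rest by tier, best first (stable within a tier). */
    static List<String> rank(String token, List<String> proposals) {
        List<String> matches = new ArrayList<>();
        for (String p : proposals) {
            if (tier(token, p) > 0) matches.add(p);
        }
        matches.sort(Comparator.comparingInt((String p) -> tier(token, p)).reversed());
        return matches;
    }
}
```

With the token 'foc', this places 'focusGained' (prefix) before 'setFocus' (subword) before 'formatCode' (scattered characters), and drops 'toString'.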

A first idea for implementing this is to use the Jaro-Winkler string distance measure and rank the proposals according to this score. Details on this string distance measure can be found here:

  http://en.wikipedia.org/wiki/Jaro–Winkler_distance
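
For reference, a minimal sketch of the Jaro-Winkler measure as it is commonly defined (prefix scaling factor 0.1, common prefix capped at 4 characters). The class name JaroWinkler is an assumption, the comparison is case-sensitive, and this is not the code used in the subwords engine.

```java
// Hypothetical helper; a textbook Jaro-Winkler similarity in the range [0, 1].
final class JaroWinkler {

    /** Jaro similarity: counts matching characters within a sliding window and penalizes transpositions. */
    static double jaro(String a, String b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int from = Math.max(0, i - window);
            int to = Math.min(b.length() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Walk both matched sequences in order; characters that differ are out of order.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) continue;
            while (!bMatched[j]) j++;
            if (a.charAt(i) != b.charAt(j)) outOfOrder++;
            j++;
        }
        double m = matches;
        double t = outOfOrder / 2.0; // each transposition is counted twice
        return (m / a.length() + m / b.length() + (m - t) / m) / 3.0;
    }

    /** Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters. */
    static double similarity(String a, String b) {
        double jaro = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }
}
```

Because of the prefix boost, similarity("foc", "focusGained") comes out higher than similarity("foc", "setFocus"), which is the ordering asked for above. Callers would probably want to lowercase both strings first.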

The code relevant to this feature is in org.eclipse.recommenders.rcp.codecompletion.subwords which is available from here: 

  git://git.eclipse.org/gitroot/recommenders/org.eclipse.recommenders.git


For details, add your comments here or send them to the forum:
  
  http://eclipse.org/forums/eclipse.recommenders


Related sources:
  
  http://code-recommenders.blogspot.com/2011/05/subword-matching-completion-engine-for.html
Comment 1 Paul-Emmanuel Faidherbe CLA 2011-06-03 11:44:46 EDT
Created attachment 197311 [details]
Benchmark plugin for relevance sorting

Here is the plugin I created to compare the Jaro-Winkler and Levenshtein string comparison algorithms and to adjust the relevance of subwords proposals.
For testing purposes only.
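
The attached plugin is not reproduced here. For readers unfamiliar with the second measure, the following is a standard dynamic-programming sketch of Levenshtein edit distance; the class name Levenshtein is an assumption and the code is not taken from the attachment.

```java
// Hypothetical helper; classic two-row Levenshtein edit distance.
final class Levenshtein {

    /** Minimum number of single-character insertions, deletions and substitutions turning a into b. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j; // "" -> b[0..j) needs j insertions
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // a[0..i) -> "" needs i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Unlike Jaro-Winkler, this is a distance, so lower values mean better matches, and it grows with the length difference between the typed token and the proposal.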
Comment 2 Paul-Emmanuel Faidherbe CLA 2011-06-27 13:03:45 EDT
Created attachment 198667 [details]
NGrams algorithm

Added NGrams calculation (with prefix weighting)
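
The attachment is not reproduced here. As a rough illustration of n-gram scoring, the sketch below assumes character bigrams, a Dice overlap coefficient, and a fixed bonus when the proposal starts with the typed token. The class name NGramScore, the bigram size, and the PREFIX_BONUS value are all assumptions, not details from the attached class.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; bigram size and bonus value are illustrative assumptions.
final class NGramScore {

    static final double PREFIX_BONUS = 0.5;

    /** The set of distinct 2-character substrings of s. */
    static Set<String> bigrams(String s) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= s.length(); i++) {
            grams.add(s.substring(i, i + 2));
        }
        return grams;
    }

    /** Dice coefficient over bigram sets: 2 * |common| / (|A| + |B|), case-insensitive. */
    static double dice(String a, String b) {
        Set<String> ga = bigrams(a.toLowerCase());
        Set<String> gb = bigrams(b.toLowerCase());
        if (ga.isEmpty() || gb.isEmpty()) return 0.0;
        Set<String> common = new HashSet<>(ga);
        common.retainAll(gb);
        return 2.0 * common.size() / (ga.size() + gb.size());
    }

    /** Bigram overlap plus a fixed bonus when the proposal starts with the token. */
    static double score(String token, String proposal) {
        boolean isPrefix = proposal.toLowerCase().startsWith(token.toLowerCase());
        return dice(token, proposal) + (isPrefix ? PREFIX_BONUS : 0.0);
    }
}
```

Without the bonus, 'setFocus' would outscore 'focusGained' for the token 'foc' simply because it is shorter; the prefix bonus reverses that.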
Comment 3 Marcel Bruch CLA 2011-06-27 17:18:17 EDT
Could you please state again that you put the code under the EPL, that you authored the code, etc., as you did for the last contribution? Thanks, Marcel
Comment 4 Paul-Emmanuel Faidherbe CLA 2011-06-27 17:49:28 EDT
Created attachment 198692 [details]
EPL Licence headers

Ngram algorithm class with EPL licence

I authored 100% of the content I am contributing
I have the rights to contribute the content
I'm contributing the content under the EPL
Comment 5 Marcel Bruch CLA 2011-08-13 19:35:13 EDT
Relevance sorting requires JDT to run proposal sorting after each keystroke. This has been described in bug 350991, which blocks this one. Currently, no relevance computation is done; "JDT-Score -1" is returned instead.
Comment 6 Marcel Bruch CLA 2011-12-21 20:19:39 EST
Closing this bug for IP log generation. Please open a new bug for similar requests.