Bug 346947 - Relevance Sorting for Subwords Completion Engine
Summary: Relevance Sorting for Subwords Completion Engine
Status: CLOSED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Recommenders
Version: unspecified
Hardware: PC Mac OS X - Carbon (unsup.)
Importance: P3 normal
Target Milestone: ---
Assignee: Marcel Bruch CLA
QA Contact:
URL:
Whiteboard: completion, deferred
Keywords: helpwanted, plan
Depends on: 350991
Blocks:
Reported: 2011-05-24 05:01 EDT by Marcel Bruch CLA
Modified: 2019-07-24 14:36 EDT
CC List: 2 users

See Also:


Attachments
Benchmark plugin for relevance sorting (47.26 KB, application/zip)
2011-06-03 11:44 EDT, Paul-Emmanuel Faidherbe CLA
no flags
NGrams algorithm (2.01 KB, application/octet-stream)
2011-06-27 13:03 EDT, Paul-Emmanuel Faidherbe CLA
no flags
EPL Licence headers (2.63 KB, application/octet-stream)
2011-06-27 17:49 EDT, Paul-Emmanuel Faidherbe CLA
marcel.bruch: iplog+

Description Marcel Bruch CLA 2011-05-24 05:01:28 EDT
The subwords completion engine still uses the internal ranking of Eclipse JDT. However, when matching on subwords or regular expressions, a smarter ranking strategy is needed. For instance, proposals that match the prefix entered by the user should be ranked higher in the list than those that only match the simple regular expression. Likewise, completions that contain several of the typed characters in a row (which we call subwords), as in 'foc', which matches 'setFocus' quite well, should be ranked higher than proposals that merely match the individual characters.
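
The ordering described above (prefix match first, then a contiguous subword, then scattered characters) could be sketched roughly as follows. This is only an illustration: the class SubwordTiers, its constants, and its methods are hypothetical names and are not taken from the Recommenders or JDT code. Matching is assumed to be case-insensitive.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the Recommenders code base. */
final class SubwordTiers {

    static final int PREFIX = 3;    // proposal starts with the typed token
    static final int SUBWORD = 2;   // token occurs as one contiguous run
    static final int SCATTERED = 1; // token characters occur in order, with gaps
    static final int NO_MATCH = 0;

    /** Classifies how well the typed token matches a proposal (case-insensitive). */
    static int tier(String token, String proposal) {
        String t = token.toLowerCase(Locale.ROOT);
        String p = proposal.toLowerCase(Locale.ROOT);
        if (p.startsWith(t)) return PREFIX;
        if (p.contains(t)) return SUBWORD;
        return isSubsequence(t, p) ? SCATTERED : NO_MATCH;
    }

    /** True if all characters of t appear in p in the same order. */
    static boolean isSubsequence(String t, String p) {
        int i = 0;
        for (int j = 0; j < p.length() && i < t.length(); j++) {
            if (p.charAt(j) == t.charAt(i)) i++;
        }
        return i == t.length();
    }

    /** Returns a copy of the proposals, best tier first; ties keep their input order. */
    static List<String> rank(String token, List<String> proposals) {
        List<String> sorted = new ArrayList<>(proposals);
        sorted.sort(Comparator.comparingInt((String p) -> tier(token, p)).reversed());
        return sorted;
    }
}
```

With the token 'foc', this puts 'focusGained' ahead of 'setFocus', and both ahead of a proposal such as 'fireOnChange' that only contains the characters in order. A finer-grained score would still be needed to order proposals within one tier.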

A first idea for implementing this is to use the Jaro-Winkler string distance measure and rank the proposals according to this score. Details on this string distance measure can be found here: 

  http://en.wikipedia.org/wiki/Jaro–Winkler_distance
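
For reference, a minimal sketch of the textbook Jaro-Winkler similarity is given below. It is not the code from the subwords plugin; the class name JaroWinkler is made up for this example. The prefix scaling factor of 0.1 and the prefix cap of 4 characters are the commonly used defaults, assumed here. The comparison is case-sensitive, so callers would probably want to lowercase both strings first.

```java
/** Illustrative textbook implementation; not the code used by the plugin. */
final class JaroWinkler {

    /** Jaro similarity in [0, 1]; 1 means identical strings. */
    static double jaro(String a, String b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        // Characters only count as matching if they are at most `window` positions apart.
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order in the two strings.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) continue;
            while (!bMatched[j]) j++;
            if (a.charAt(i) != b.charAt(j)) outOfOrder++;
            j++;
        }
        double m = matches;
        double transpositions = outOfOrder / 2.0;
        return (m / a.length() + m / b.length() + (m - transpositions) / m) / 3.0;
    }

    /** Jaro similarity boosted for a shared prefix of up to 4 characters. */
    static double jaroWinkler(String a, String b) {
        double jaro = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }
}
```

The prefix boost is what makes this measure attractive for this use case: proposals that start with the typed token get a higher score than proposals that contain the same characters later in the name.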

The code relevant to this feature is in org.eclipse.recommenders.rcp.codecompletion.subwords which is available from here: 

  git://git.eclipse.org/gitroot/recommenders/org.eclipse.recommenders.git


For details, add your comments here or send them to the forum:
  
  http://eclipse.org/forums/eclipse.recommenders


Related sources:
  
  http://code-recommenders.blogspot.com/2011/05/subword-matching-completion-engine-for.html
Comment 1 Paul-Emmanuel Faidherbe CLA 2011-06-03 11:44:46 EDT
Created attachment 197311 [details]
Benchmark plugin for relevance sorting

Here is the plugin I created to compare the Jaro-Winkler and Levenshtein string-comparison algorithms for adjusting the relevance of subwords proposals.
For testing purposes only.
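
For readers unfamiliar with the second measure, here is a minimal sketch of the standard Levenshtein edit distance, plus one possible way to turn it into a similarity in [0, 1]. It is not taken from the attached benchmark plugin; the class name Levenshtein and the normalization in similarity() are assumptions made for this example.

```java
/** Illustrative textbook implementation; not the code from the attachment. */
final class Levenshtein {

    /** Minimum number of single-character insertions, deletions, and substitutions. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization: 1 - distance / length of the longer string. */
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }
}
```

Note that a short token compared against a long proposal always has a large edit distance, because every character that was not typed counts as an insertion. Unlike Jaro-Winkler, this measure also has no built-in preference for prefix matches.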
Comment 2 Paul-Emmanuel Faidherbe CLA 2011-06-27 13:03:45 EDT
Created attachment 198667 [details]
NGrams algorithm

Added NGrams calculation (with prefix weighting)
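
The attachment itself is not reproduced in this report. As a rough idea of what an n-gram based score can look like, the sketch below uses character bigrams with a Dice coefficient and adds a bonus when the proposal starts with the typed token. This assumes that "prefix pound" refers to such a prefix weight. The class BigramScore, the choice of bigrams, and the prefixWeight parameter are all assumptions of this example and may differ from the attached code.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the attached NGrams class may work differently. */
final class BigramScore {

    /** All overlapping two-character substrings of s, in order. */
    static List<String> bigrams(String s) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + 2 <= s.length(); i++) {
            grams.add(s.substring(i, i + 2));
        }
        return grams;
    }

    /** Dice coefficient over bigram multisets: 2 * shared / (|A| + |B|). */
    static double dice(String a, String b) {
        List<String> ga = bigrams(a);
        List<String> gb = bigrams(b);
        if (ga.isEmpty() || gb.isEmpty()) return 0.0;
        int total = ga.size() + gb.size();
        int shared = 0;
        for (String g : ga) {
            // Remove each matched bigram so it is not counted twice.
            if (gb.remove(g)) shared++;
        }
        return 2.0 * shared / total;
    }

    /** Dice score, moved toward 1 by prefixWeight (0..1) if proposal starts with token. */
    static double score(String token, String proposal, double prefixWeight) {
        double base = dice(token, proposal);
        return proposal.startsWith(token) ? base + prefixWeight * (1.0 - base) : base;
    }
}
```

Because bigrams only capture adjacent characters, this kind of score rewards contiguous subwords such as 'foc' in 'setfocus' and gives nothing for characters that are merely scattered through the proposal.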
Comment 3 Marcel Bruch CLA 2011-06-27 17:18:17 EDT
Could you please state again that you put the code under the EPL, that you authored the code, etc., as you did in the last contribution? Thanks, Marcel
Comment 4 Paul-Emmanuel Faidherbe CLA 2011-06-27 17:49:28 EDT
Created attachment 198692 [details]
EPL Licence headers

Ngram algorithm class with EPL licence

I authored 100% of the content I am contributing
I have the rights to contribute the content
I'm contributing the content under the EPL
Comment 5 Marcel Bruch CLA 2011-08-13 19:35:13 EDT
Relevance sorting requires JDT to re-run proposal sorting after each keystroke. This has been described in bug 350991, which blocks this one. Currently, no relevance computation is done; instead, "JDT-Score -1" is returned.
Comment 6 Marcel Bruch CLA 2011-12-21 20:19:39 EST
Closing this bug for IP log generation. Please open a new bug for similar requests.