Bug 444487 - [search] Lucene special characters are double escaped
Summary: [search] Lucene special characters are double escaped
Status: RESOLVED WONTFIX
Alias: None
Product: Orion
Classification: ECD
Component: Server
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox CLA
Reported: 2014-09-18 10:25 EDT by Michael Ochmann CLA
Modified: 2015-01-19 15:46 EST (History)

Description Michael Ochmann CLA 2014-09-18 10:25:20 EDT
Some characters in Lucene queries, like "-" and "*", must be escaped [1]. The client does that escaping in the function _generateLuceneQuery in org.eclipse.orion.client.ui/web/plugins/filePlugin/fileImpl.js. However, the server does the same escaping again in SearchServlet at line 131 with ClientUtils#escapeQueryChars().

Steps to reproduce:

1. start the server in debug mode and set a breakpoint in SearchServlet on the line:

  String processedTerm = ClientUtils.escapeQueryChars(term.toLowerCase());

2. create a file in the client with just the string foo-bar as content
3. do a global search for foo-bar while you watch the HTTP traffic
4. notice that the request sent to the server is something like
  
  http://localhost:8080/filesearch?sort=Path%20asc&rows=40&start=0&q=foo%5C-bar+Location:/file*

=> the dash is escaped as %5C- , i.e. URL-encoded \-

5. notice on the server that term=foo\-bar and processedTerm=foo\\\-bar 
6. Lucene interprets that as the literal term foo\-bar (backslash included) => no match
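
The effect in steps 5 and 6 can be reproduced without a server. The following is a minimal sketch, not the Orion or SolrJ code: escape() is a hypothetical stand-in for ClientUtils#escapeQueryChars() that uses a simplified set of special characters, and unescape() is a hypothetical approximation of how a query parser strips one level of backslash escaping. The server's toLowerCase() call is left out because it does not affect the result here.

```java
final class DoubleEscapeDemo {
    // Simplified set of Lucene query-syntax characters (an assumption for
    // illustration; the real SolrJ implementation covers more cases).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Hypothetical stand-in for ClientUtils#escapeQueryChars():
    // prefixes every special character with a backslash.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Approximates what the query parser does: a backslash makes the
    // following character literal, and the backslash itself is dropped.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                c = s.charAt(++i);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "foo-bar";
        String term = escape(raw);            // escaped by the client
        String processedTerm = escape(term);  // escaped again by the server
        System.out.println("term          = " + term);
        System.out.println("processedTerm = " + processedTerm);
        System.out.println("searched for  = " + unescape(processedTerm));
    }
}
```

With a single escape, unescape(term) gives back foo-bar, which matches the file content. With the second escape, the parser is left with a term that contains a literal backslash, which does not occur in the file.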

My gut feeling tells me that the client should not assume anything about the search technology deployed on the server side, so I am inclined to remove the escaping on the client side. However, I can't get a clear picture from the git history about why and when this escaping was introduced, so I am probably missing some of the history behind this code.
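
To illustrate the split of responsibilities suggested above, here is a rough sketch under stated assumptions. The class and method names are hypothetical and are not part of Orion. The client side applies only URL encoding, and the server side decodes and then escapes exactly once. It requires Java 10 or later for the Charset overloads of URLEncoder and URLDecoder. In a real servlet the container would normally decode the parameter already, so the explicit decode step is shown only to keep the example self-contained.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ServerSideEscapeSketch {
    // Simplified set of Lucene query-syntax characters (an assumption).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Client side (hypothetical): transport encoding only, no knowledge
    // of the search technology on the server.
    static String encodeForUrl(String rawTerm) {
        return URLEncoder.encode(rawTerm, StandardCharsets.UTF_8);
    }

    // Server side (hypothetical): undo the transport encoding, lowercase
    // as SearchServlet does, then escape special characters exactly once.
    static String toLuceneTerm(String urlEncodedTerm) {
        String term = URLDecoder.decode(urlEncodedTerm, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (char c : term.toLowerCase().toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String onTheWire = encodeForUrl("foo-bar");
        System.out.println("q=" + onTheWire);
        System.out.println("processedTerm = " + toLuceneTerm(onTheWire));
    }
}
```

One trade-off of this design is that escaping every special character on the server also neutralizes wildcards such as "*" typed by the user, so a real implementation would need to decide which characters remain query syntax.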


[1] http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Escaping%20Special%20Characters
Comment 1 Anthony Hunter CLA 2015-01-19 15:46:00 EST
We are no longer using Apache Solr on the Orion server for search.