I was searching for who defines the command "Show in Navigator" but when I typed it in the search box, it searched for "showinnavigator" and therefore got no hits.
As far as I know, whitespace search is not yet supported. I even tried changing it to "show in navigator" in the URL. The server does return hits, but it seems to be searching on the first word instead of the whole phrase, so if you open one of the files and do a find, you get nothing for "show in navigator". But it is not the end of the world. A trick we may want to play: if the query contains whitespace (or other special characters), we can ask the server to return results for the first word only. Because we know we are getting more than expected, we are then forced to do an in-file search on every file. One side effect might be a mismatched total count, and you may see fewer than 40 results per page while still having to show them across multiple pages. I know it is expensive on the client side, but it is better than nothing.
I think this "refining a coarse result on the client" approach can also be applied to case-sensitive search, where the server has to give a looser result. The cost is that the client has to request the file content for each file. But we are already making a file metadata request for each file anyway, which may be cheaper, though I think the cost is of the same order. We can definitely skip the metadata request if we do this refining.
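The refining idea above can be sketched roughly as follows. This is only an illustration in Python, not Orion client code: the names `refine_by_phrase` and `paginate` and the sample file contents are hypothetical, and the dictionary stands in for the file contents the client would have to fetch. It filters first and paginates afterwards, which is one way to avoid the mismatched totals mentioned above.

```python
def refine_by_phrase(candidates, phrase, case_sensitive=False):
    """Keep only the files whose content contains the whole phrase.

    `candidates` maps a file path to its text content. It stands in for the
    coarse result the server returned for the first word of the query.
    """
    if not case_sensitive:
        phrase = phrase.lower()
    kept = []
    for path, text in candidates.items():
        haystack = text if case_sensitive else text.lower()
        if phrase in haystack:
            kept.append(path)
    return sorted(kept)


def paginate(items, page_size=40):
    """Split the refined list into pages of at most `page_size` entries."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


# Hypothetical coarse result for the first word "show".
coarse = {
    "a.js": 'name: "Show in Navigator"',
    "b.js": "Show all files",
    "c.js": "tooltip: 'show in navigator'",
}

loose = refine_by_phrase(coarse, "show in navigator")
strict = refine_by_phrase(coarse, "Show in Navigator", case_sensitive=True)
pages = paginate(loose)
```

Here the case-insensitive pass keeps `a.js` and `c.js`, while the case-sensitive pass keeps only `a.js`. The cost is exactly the one described above: every candidate file's content has to be available on the client.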
I haven't had a chance to try, but according to Solr documentation using double-quotes around a set of words should perform a phrase query. The issue might be that we are escaping quote characters on the client?
(In reply to comment #3)
> I haven't had a chance to try, but according to Solr documentation using
> double-quotes around a set of words should perform a phrase query. The issue
> might be that we are escaping quote characters on the client?

We can change that. For instance, if the user typed "foo" in the search box, we still respect that by passing the query as \"foo\". But if {"foo" bar} was typed, we can use "\"foo\" bar" as the query. I tried this in the Orion project by changing the URL to q="return this", and the server gave back more than expected. Then I googled and found this article: http://stackoverflow.com/questions/7887820/solr-dismax-handler-whitespace-and-special-character-behaviour It seems that if we use double quotes and replace the whitespace with -, it returns the right thing. I changed it to q="return-this", and this time it seemed to return the right result. Out of curiosity I then used "return this.open", which returned only one file, supporting my guess. I believe there are other configurations we can poke at on the server side, but I think for now this should be OK.
It would help me tremendously if we implemented this pretty soon. I'm doing a lot of searches where I'm trying to find out who declared a particular command or tooltip, and the space matters. Such as "Open with" "Show in" etc....
I tried the "foo-bar" theory again today, but it seems it is not always true. E.g. q="open-with" gives back results but q="shown-in" does not (Eclipse hits "shown in"). q="open with" is even worse: it gives back nothing at all. Two thoughts here:

1. When the user types {foo bar}, we can ask the server to search on "foo", which will give back more results. Then on the client side, in the in-file search, we walk through all the result files and search on "foo bar". The files that do not contain "foo bar" will be marked as stale. Pros and cons: when the search string contains whitespace in the middle, the in-file search will be forced, but the performance should be equivalent to expand all. A lot of stale files will be introduced, but the result will be accurate.

2. How about searching again within the search results? When you type foo, it gives you 100 files. Then in the search result page you can search on "foo bar" within the 40 files. Actually, I have sometimes done this with the browser search in the result page.
(In reply to comment #6)
> 2.How about search again in search result?
> When you type foo , it gives you 100 files. Then in the search result page you
> can search on "foo bar" within the 40 files.
> Actually I sometimes used this by the browser search in the result page.

To clarify, I meant to say: We can think about adding a "search within the results" command in the tool bar. It will not behave like the browser's CTRL+F but will generate a subset of the original results.
(In reply to comment #7)
> (In reply to comment #6)
> > 2.How about search again in search result?
> > When you type foo , it gives you 100 files. Then in the search result page you
> > can search on "foo bar" within the 40 files.
> > Actually I sometimes used this by the browser search in the result page.
>
> To clarify, I meant to say:
> We can think about adding "search within the results" command in the tool bar.
> It will not behave like the browser's CTRL+F but will generate a subset of the
> original results.

There might be some cases where this would be useful as a regular use case. But it seems kind of like a hack to make the user do this to solve this problem. I think it would be good to get to the bottom of the server side...why the cases that aren't returning results are failing. Then we can strategize a client side workaround once we understand what's going on.
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > 2.How about search again in search result?
> > > When you type foo , it gives you 100 files. Then in the search result page you
> > > can search on "foo bar" within the 40 files.
> > > Actually I sometimes used this by the browser search in the result page.
> >
> > To clarify, I meant to say:
> > We can think about adding "search within the results" command in the tool bar.
> > It will not behave like the browser's CTRL+F but will generate a subset of the
> > original results.
>
> There might be some cases where this would be useful as a regular use case.

Right. It is a kind of generic way to narrow down the results.

> But it seems kind of like a hack to make the user do this to solve this
> problem.

The original issue should be resolved in the first place. But my use case is: when I start a search I only know there is something related to "foo". Then in the result page I realize "foo bar" is the exact term I wanted to search on. But I agree this is another story.

> I think it would be good to get to the bottom of the server side...why the
> cases that aren't returning results are failing. Then we can strategize a
> client side workaround once we understand what's going on.

I am holding off on the client workaround until we completely understand what happens on the server side.
We talked about different options before, but we have not yet settled on a solution. I will keep this as a reminder for RC2 tasks, but if there is no quick fix I will move it to post 0.5.
Talked to John briefly. Lucene possibly has support for phrase queries. If not, we will detect whitespace and switch to the crawler.
Some pre-processing we are doing on the server is preventing this from working.
Released a fix:
http://git.eclipse.org/c/orion/org.eclipse.orion.server.git/commit/?id=606aaf04161aa800cd4584e47d3e19b3670fed81
Regression tests:
http://git.eclipse.org/c/orion/org.eclipse.orion.server.git/commit/?id=24143f5b2b43b2223304b580eb4ec64cb08e216e

However the client is doing a number of things before sending to the server that prevent this from working. In particular it is encoding the quotes, and removing whitespace between words. Moving back to Libing for that part.
Fixed client side with http://git.eclipse.org/c/orion/org.eclipse.orion.client.git/commit/?id=96331b9881498b59c632d9b60a54e5ed73224b0a. I did some tests, but it seems the server is still giving back redundant results.

Test case 1:
1. In the navigator, drill into the Orion client code.
2. Type 'string type' in the search box.
3. It gives back 7 files, but only one file contains 'string type'.

This is the request and the response from the server:

Request URL: http://libingw.orion.eclipse.org:8080/filesearch?sort=Path%20asc&rows=40&start=0&q=%22string%20type%22+Location:/file/R/OrionClient/*
Request Method: GET
Status Code: 200 OK

Request Headers
Accept: application/json
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Connection: keep-alive
Cookie: JSESSIONID=15hw0wqveur9i1vfs48ki167s7
Host: libingw.orion.eclipse.org:8080
Orion-Version: 1
Referer: http://libingw.orion.eclipse.org:8080/plugins/fileClientPlugin.html
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.81 Safari/537.1
X-Requested-With: XMLHttpRequest

Query String Parameters
sort: Path asc
rows: 40
start: 0
q: "string type" Location:/file/R/OrionClient/*

Response Headers
Content-Encoding: gzip
Content-Length: 644
Server: Jetty(8.1.3.v20120522)
Via: 1.1 (jetty)
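For anyone who wants to reproduce the request in the test case above outside the browser, here is a minimal Python sketch that rebuilds the same URL. The function name `build_search_url` and its parameters are hypothetical and not part of the Orion client. The URL layout is simply copied from the request shown above, so treat it as an observation of this one request rather than a description of the server API.

```python
from urllib.parse import quote


def build_search_url(base, phrase, location, rows=40, start=0):
    """Rebuild a /filesearch request that keeps the phrase quoted.

    The double quotes and the inner whitespace are percent-encoded
    (%22 and %20) instead of being stripped, so the phrase reaches
    the server intact.
    """
    q = quote('"' + phrase + '"', safe="")
    sort = quote("Path asc", safe="")
    return (
        f"{base}/filesearch?sort={sort}&rows={rows}&start={start}"
        f"&q={q}+Location:{location}"
    )


url = build_search_url(
    "http://libingw.orion.eclipse.org:8080",
    "string type",
    "/file/R/OrionClient/*",
)
print(url)
```

The key point is that the quotes and the space inside the phrase survive as `%22` and `%20`.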
If you do the same search with regEx turned on, you will get 5 files that all contain the search term.
Good example: if you search on 'items instanceof Array', both the indexer and the crawler (regEx) give the same result.
I think the "string type" example is a tokenizer issue. All of the false matches contain "string" followed by "type", with only special characters in between. For example:

commands.js: {String} type
messages.js: string):": "Type
csslint.js: {String} type
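To illustrate why a tokenizer could produce these false matches, here is a simplified model in Python. It assumes the analyzer lowercases the text and splits on every character that is not a letter or digit. That is an assumption made for illustration, not the actual behaviour of the analyzer configured on the Orion server, and `tokenize` and `phrase_matches` are hypothetical helpers.

```python
import re


def tokenize(text):
    """Assumed analyzer: lowercase, then split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_matches(text, phrase):
    """True if the phrase tokens appear as consecutive tokens in the text."""
    tokens = tokenize(text)
    wanted = tokenize(phrase)
    n = len(wanted)
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))


# The three false matches listed above.
samples = ["{String} type", 'string):": "Type', "{String} type"]

for sample in samples:
    indexed = phrase_matches(sample, "string type")  # what a token index sees
    literal = "string type" in sample.lower()        # what an in-file find sees
    print(repr(sample), indexed, literal)
```

Under this model each sample tokenizes to `string`, `type`, so the phrase query matches even though the literal text "string type" never appears in the file, which is consistent with the in-file find coming up empty.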
Since basic phrase searching is working in 1.0 M2, I am going to mark this fixed. I have opened bug 390393 for the case in comment #17.