Bug 166131 - ContainerContext Index Manager
Status: CLOSED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Corona
Version: unspecified
Hardware: PC Windows XP
Importance: P5 enhancement
Target Milestone: ---
Assignee: Glenn Everitt CLA
Reported: 2006-11-28 17:48 EST by Glenn Everitt CLA
Modified: 2010-04-08 11:13 EDT
Description Glenn Everitt CLA 2006-11-28 17:48:59 EST
Create a Collaboration Context Index Manager.  A first thought is to use Apache Lucene to index the chat logs, Mylar histories, and CVS commit history that have been stored in the Jackrabbit repository.  The Index Manager is responsible for knowing what to index (file types, MIME types) and for managing the location and update frequency of the indexes.
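
The responsibilities listed above could be captured in a small policy object. This is only a rough sketch, not a design that exists in Corona: the class name IndexPolicy, its methods, and the example MIME types are all hypothetical, and the actual indexing would still be left to Lucene and Jackrabbit.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical policy holder: what to index, where the index lives,
// and how often it should be refreshed. None of these names come from Corona.
class IndexPolicy {
    private final Set<String> indexedMimeTypes = new HashSet<>();
    private final Path indexLocation;
    private final Duration updateInterval;

    IndexPolicy(Collection<String> mimeTypes, Path indexLocation, Duration updateInterval) {
        // Store MIME types in lower case so lookups are case-insensitive.
        for (String type : mimeTypes) {
            indexedMimeTypes.add(type.toLowerCase(Locale.ROOT));
        }
        this.indexLocation = indexLocation;
        this.updateInterval = updateInterval;
    }

    // True if content of this MIME type should be handed to the indexer.
    boolean shouldIndex(String mimeType) {
        return mimeType != null
                && indexedMimeTypes.contains(mimeType.toLowerCase(Locale.ROOT));
    }

    // True once at least one full update interval has passed since the last run.
    boolean isUpdateDue(Instant lastIndexed, Instant now) {
        return Duration.between(lastIndexed, now).compareTo(updateInterval) >= 0;
    }

    Path getIndexLocation() {
        return indexLocation;
    }

    public static void main(String[] args) {
        IndexPolicy policy = new IndexPolicy(
                Arrays.asList("text/plain", "application/msword"),
                Paths.get("index"),
                Duration.ofMinutes(10));
        System.out.println(policy.shouldIndex("text/plain"));
        System.out.println(policy.shouldIndex("image/png"));
    }
}
```

Keeping the policy separate from the indexer would let the set of file types and the update frequency change without touching the Lucene integration.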
Comment 1 Glenn Everitt CLA 2007-01-15 17:36:37 EST
The Lucene support inside Jackrabbit handles MS Word, Excel, and PowerPoint documents as well as OpenOffice documents. We currently have little experience with this support in Jackrabbit.

Lucene provides a way to add indexers for different file types.  To add these indexers to the Jackrabbit environment, edit repository.xml and/or workspace.xml (is it both or only one?) and add the name of your Lucene filter class to the SearchIndex param named textFilterClasses:

<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="textFilterClasses" value="org.apache.jackrabbit.core.query.lucene.TextPlainTextFilter,org.apache.jackrabbit.core.query.MsExcelTextFilter,org.apache.jackrabbit.core.query.MsPowerPointTextFilter,org.apache.jackrabbit.core.query.MsWordTextFilter,org.apache.jackrabbit.core.query.PdfTextFilter,org.apache.jackrabbit.core.query.HTMLTextFilter,org.apache.jackrabbit.core.query.XMLTextFilter,org.apache.jackrabbit.core.query.RTFTextFilter,org.apache.jackrabbit.core.query.OpenOfficeTextFilter"/>

    <!-- These are all default values. You can change them if you want -->
    <param name="useCompoundFile" value="true"/>
    <param name="minMergeDocs" value="100"/>
    <param name="volatileIdleTime" value="3"/>
    <param name="maxMergeDocs" value="100000"/>
    <param name="mergeFactor" value="10"/>
    <param name="bufferSize" value="10"/>
    <param name="cacheSize" value="1000"/>
    <param name="forceConsistencyCheck" value="false"/>
    <param name="autoRepair" value="true"/>
    <param name="analyzer" value="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
    <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl"/>
    <param name="idleTime" value="-1"/>
    <!-- end of default values -->

    <param name="respectDocumentOrder" value="false"/>
</SearchIndex>
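
Since the textFilterClasses value is one long comma-separated string, it is easy to lose track of which filters are actually configured. The following is a minimal sketch that lists them using only the JDK's XML parser. The class name TextFilterLister and its method are made up for illustration and are not part of Jackrabbit; it also assumes the SearchIndex element has been copied out of the config file as a standalone fragment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Jackrabbit: lists the filter classes
// named in the textFilterClasses param of a SearchIndex fragment.
class TextFilterLister {

    static List<String> listTextFilters(String searchIndexXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            searchIndexXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse SearchIndex fragment", e);
        }

        List<String> filters = new ArrayList<>();
        NodeList params = doc.getElementsByTagName("param");
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            if ("textFilterClasses".equals(param.getAttribute("name"))) {
                // The value is a comma-separated list of class names.
                for (String name : param.getAttribute("value").split(",")) {
                    String trimmed = name.trim();
                    if (!trimmed.isEmpty()) {
                        filters.add(trimmed);
                    }
                }
            }
        }
        return filters;
    }

    public static void main(String[] args) {
        String xml =
              "<SearchIndex class=\"org.apache.jackrabbit.core.query.lucene.SearchIndex\">"
            + "<param name=\"path\" value=\"${wsp.home}/index\"/>"
            + "<param name=\"textFilterClasses\" value=\""
            + "org.apache.jackrabbit.core.query.lucene.TextPlainTextFilter,"
            + "org.apache.jackrabbit.core.query.MsWordTextFilter\"/>"
            + "</SearchIndex>";
        // Print one configured filter class per line.
        for (String filter : listTextFilters(xml)) {
            System.out.println(filter);
        }
    }
}
```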
 
Comment 2 Glenn Everitt CLA 2007-02-14 18:35:36 EST
Lucene is integrated with Jackrabbit, and search using the Lucene index is supported through the JCR interface to Jackrabbit.  So the only work is adding the indexing extensions to Lucene.  I added the Apache POI project for reading and indexing Microsoft documents, and the jar for OpenOffice indexing. However, I could not add the PDF indexing since it is under the GPL license.
Comment 3 Dennis O'Flynn CLA 2007-02-21 10:25:35 EST
Marked as closed for the 1.0.0M8 build.
Comment 4 Eclipse Webmaster CLA 2010-04-08 11:13:47 EDT
Project is archived.