Bug 357802 - Web Crawler: depends on fixed "Url" attribute mapping (NullPointerException)
Status: CLOSED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Smila
Version: unspecified
Hardware: PC Windows 7
Importance: P3 normal
Target Milestone: ---
Assignee: Juergen Schumacher
Reported: 2011-09-15 09:47 EDT by Nadine Ausländer
Modified: 2022-07-07 11:31 EDT
Description Nadine Ausländer 2011-09-15 09:47:07 EDT
The Web Crawler throws a NullPointerException when the attribute mapping in the data source configuration does not contain the fixed mapping of an attribute named "Url" to the field attribute "Url".

To reproduce, I changed the mapping in the default data source configuration "web.xml" (see "configuration\org.eclipse.smila.connectivity.framework")

from:

<DataSourceConnectionConfig ...>
  <DataSourceID>web</DataSourceID>
  <SchemaID>org.eclipse.smila.connectivity.framework.crawler.web</SchemaID>
  <Attributes>
    <Attribute Type="String" Name="Url" KeyAttribute="true">
      <FieldAttribute>Url</FieldAttribute>
    </Attribute>
    ...
  </Attributes>
</DataSourceConnectionConfig>


to:

<DataSourceConnectionConfig ...>
  <DataSourceID>web</DataSourceID>
  <SchemaID>org.eclipse.smila.connectivity.framework.crawler.web</SchemaID>
  <Attributes>
    <Attribute Type="String" Name="MyUrl" KeyAttribute="true">
      <FieldAttribute>Url</FieldAttribute>
    </Attribute>
    ...
  </Attributes>
</DataSourceConnectionConfig>


The error message is:

errorBuffer: "[--- 2011-09-15 15:34:11.587 ---
org.eclipse.smila.connectivity.framework.CrawlerException: java.lang.NullPointerException
    at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.getMetadata(WebCrawler.java:353)
    at org.eclipse.smila.connectivity.framework.util.internal.DataReferenceImpl.getRecord(DataReferenceImpl.java:126)
    at org.eclipse.smila.connectivity.framework.impl.CrawlThread.updateDataReference(CrawlThread.java:389)
    at org.eclipse.smila.connectivity.framework.impl.CrawlThread.processDataReference(CrawlThread.java:342)
    at org.eclipse.smila.connectivity.framework.impl.CrawlThread.processDataReferences(CrawlThread.java:308)
    at org.eclipse.smila.connectivity.framework.impl.CrawlThread.run(CrawlThread.java:194)
Caused by: java.lang.NullPointerException
    at org.apache.commons.codec.digest.DigestUtils.md5(DigestUtils.java:86)
    at org.apache.commons.codec.digest.DigestUtils.md5Hex(DigestUtils.java:108)
    at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.getRecord(WebCrawler.java:575)
    at org.eclipse.smila.connectivity.framework.crawler.web.WebCrawler.getMetadata(WebCrawler.java:351)
    ... 5 more
, --- 2011-09-15 15:34:12.041 ---
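
The trace shows a null value reaching DigestUtils.md5Hex from WebCrawler.getRecord. The following is a minimal sketch of how that could happen, assuming (this is a guess, the crawler source is not part of this report) that the URL value is looked up under the hard-coded attribute name "Url". The class UrlHashSketch and all of its methods are hypothetical, and java.security.MessageDigest stands in for commons-codec so the example runs without extra libraries.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not the SMILA WebCrawler source. */
final class UrlHashSketch {

  /** Hex-encoded MD5 of a string. Throws NullPointerException for null input. */
  static String md5Hex(String data) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      // data.getBytes(...) is where a null argument blows up.
      byte[] digest = md5.digest(data.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /** Assumed failure mode: the value is fetched under the fixed name "Url". */
  static String hashByFixedName(Map<String, String> valuesByAttributeName) {
    return md5Hex(valuesByAttributeName.get("Url"));
  }

  /** Alternative: use whatever attribute name the config maps to the Url field. */
  static String hashByConfiguredName(Map<String, String> valuesByAttributeName,
      String urlAttributeName) {
    String url = valuesByAttributeName.get(urlAttributeName);
    if (url == null) {
      throw new IllegalArgumentException(
          "No value for attribute '" + urlAttributeName + "'");
    }
    return md5Hex(url);
  }

  public static void main(String[] args) {
    // Record as it might look after renaming the attribute to "MyUrl".
    Map<String, String> record = new LinkedHashMap<>();
    record.put("MyUrl", "http://example.org/");

    System.out.println(hashByConfiguredName(record, "MyUrl"));
    try {
      hashByFixedName(record);
    } catch (NullPointerException e) {
      System.out.println("NullPointerException from the fixed-name lookup");
    }
  }
}
```

Resolving the attribute name from the configuration, or at least failing with a descriptive message, would avoid the bare NullPointerException. Whether the actual fix took this approach is not stated in this report.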

As a workaround, I kept the original "Url" attribute and added "MyUrl" as a second attribute mapped to the same field attribute:

<DataSourceConnectionConfig ...>
  <DataSourceID>web</DataSourceID>
  <SchemaID>org.eclipse.smila.connectivity.framework.crawler.web</SchemaID>
  <Attributes>
    <Attribute Type="String" Name="Url" KeyAttribute="true">
      <FieldAttribute>Url</FieldAttribute>
    </Attribute>
    <Attribute Type="String" Name="MyUrl" KeyAttribute="true">
      <FieldAttribute>Url</FieldAttribute>
    </Attribute>
    ...
  </Attributes>
</DataSourceConnectionConfig>
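
A pre-flight check along the following lines could flag a configuration that lacks the "Url" to "Url" mapping before a crawl is started. This is only a sketch: the class UrlMappingCheck and its methods are hypothetical and not part of SMILA, and the sample XML is a simplified stand-in for the configs above, with the elided parts left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper; not part of SMILA. */
final class UrlMappingCheck {

  /** Returns the Name of every Attribute whose FieldAttribute text is "Url". */
  static List<String> namesMappedToUrl(String configXml) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(configXml.getBytes(StandardCharsets.UTF_8)));
    List<String> names = new ArrayList<>();
    // Matches <Attribute> elements only; <Attributes> has a different tag name.
    NodeList attributes = doc.getElementsByTagName("Attribute");
    for (int i = 0; i < attributes.getLength(); i++) {
      Element attribute = (Element) attributes.item(i);
      NodeList fields = attribute.getElementsByTagName("FieldAttribute");
      if (fields.getLength() > 0
          && "Url".equals(fields.item(0).getTextContent().trim())) {
        names.add(attribute.getAttribute("Name"));
      }
    }
    return names;
  }

  /** True if some attribute named "Url" is mapped to the field attribute "Url". */
  static boolean hasFixedUrlMapping(String configXml) throws Exception {
    return namesMappedToUrl(configXml).contains("Url");
  }

  public static void main(String[] args) throws Exception {
    // Simplified stand-in for the renamed mapping from this report.
    String renamed =
        "<DataSourceConnectionConfig>"
        + "<DataSourceID>web</DataSourceID>"
        + "<Attributes>"
        + "<Attribute Type='String' Name='MyUrl' KeyAttribute='true'>"
        + "<FieldAttribute>Url</FieldAttribute>"
        + "</Attribute>"
        + "</Attributes>"
        + "</DataSourceConnectionConfig>";

    System.out.println("Names mapped to Url: " + namesMappedToUrl(renamed));
    System.out.println("Has fixed Url mapping: " + hasFixedUrlMapping(renamed));
  }
}
```

For the renamed mapping the check reports that the fixed mapping is missing, while a config shaped like the workaround passes because it still contains an attribute named "Url".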
Comment 1 Juergen Schumacher 2011-09-21 07:20:52 EDT
Fixed in rev. 1683.
Comment 2 Andreas Weber 2013-04-15 11:50:07 EDT
Closing this.