| Summary: | Cannot crawl a single page | ||||||
|---|---|---|---|---|---|---|---|
| Product: | z_Archived | Reporter: | Andrej Rosenheinrich <andrej.rosenheinrich> | ||||
| Component: | Smila | Assignee: | Andreas Weber <Andreas.Weber> | ||||
| Status: | CLOSED FIXED | QA Contact: | |||||
| Severity: | minor | ||||||
| Priority: | P4 | CC: | igor.novakovic, svoigt.brox, tmenzel | ||||
| Version: | unspecified | ||||||
| Target Milestone: | --- | ||||||
| Hardware: | PC | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Attachments: |
|
||||||
|
Description
Andrej Rosenheinrich
Created attachment 176302
webcrawler configuration
The configuration I use for the web crawler.

The same effect happens, by the way, when I use MaxIterations with value 1 instead of MaxDepth.

I can reproduce this problem. I will fix it in the next few days.

Sebastian, have you managed to fix this already?

I've tested my fix and committed it to svn.
MaxDepth=1 will crawl the seed page.
MaxDepth=2 will crawl the seed page and the pages linked from the seed page,
etc.

Talking of rev. 719, it seems to me that the problem is solved only for MaxDepth. For MaxIterations it is still not possible to crawl a single page by providing one seed and setting MaxIterations to 1. Is this the correct and intended behavior?
Greets, Andrej

I am reopening this bug.
@Thomas: Since Sebastian is unfortunately no longer involved in our project and this piece of code was initially contributed by brox, could you please take a look at it?
Cheers, Igor

Hi Tom, any chance of looking at this soon?
Cheers, Igor

Not really, unfortunately. Since I have no clue about this code either, there is probably a substantial learning cost involved too, and I have more pressing issues at the moment. Since nobody on the remaining team has any prior knowledge of the code, I'm unassigning this issue, so anybody with the time or need may pitch in.

The Connectivity framework was replaced by the new Importing framework. With the current Web Crawler, a single (root) URL can be crawled (parameter: "maxCrawlDepth" : "0") without following any links. Closing this. |
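
To make the two depth conventions mentioned in this thread concrete, here is a minimal sketch of a depth-limited breadth-first crawl. It is not SMILA code: the link graph `LINKS` and the functions `crawl` and `crawl_by_link_depth` are hypothetical names made up for illustration. It assumes that the fixed MaxDepth counts levels of pages (1 means the seed only), while "maxCrawlDepth" counts how many links may be followed from the root (0 means the root only), as described in the comments above.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for real web pages.
LINKS = {
    "seed": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["seed"],
}


def crawl(seed, max_depth, links=LINKS):
    """Visit pages breadth-first; max_depth counts page levels (1 = seed only)."""
    visited = []
    seen = {seed}
    queue = deque([(seed, 1)])  # the seed page sits at level 1
    while queue:
        url, depth = queue.popleft()
        if depth > max_depth:
            continue  # page lies beyond the allowed level, do not fetch it
        visited.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


def crawl_by_link_depth(seed, max_crawl_depth, links=LINKS):
    """Same crawl, but the limit counts followed links (0 = root only)."""
    return crawl(seed, max_crawl_depth + 1, links)


print(crawl("seed", 1))                # seed page only
print(crawl("seed", 2))                # seed plus pages linked from it
print(crawl_by_link_depth("seed", 0))  # root URL only, no links followed
```

Under these assumptions, a level-based limit of 1 and a link-based limit of 0 both describe the "crawl a single page" case this bug is about.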