Placeholder for performance work I will detail later
Chris, please add more information to this; I need to update the design document and set the sizing.
Identify the top 20 slowest methods in the LTA code for importing log files and improve their performance by at least 10%.
*** Bug 78355 has been marked as a duplicate of this bug. ***
This feature was rejected for 4.1 and is being reused as a 4.2 feature.
One performance improvement that should be considered is the import of large log files with filtration. The overhead of initializing an import appears to be proportional to the size of the log file, which makes importing a large log file with a very narrow filter very time-consuming. For example, compare the performance of importing an Apache Access log file with only 100 records against importing an Apache Access log file that is over 20 MB with a filter set to import the first 100 records: the second import will be substantially slower than the first. I tried to import the first 100 records of a large Windows security log file and it took approximately 2 minutes, while importing an Apache Access log file with only 100 records completed in about 3 seconds. Importing a large log file with a small-to-medium filter set is a common use case that we should address in this feature. Ideally, the time complexity of the import process should be proportional to the number of records being imported.
Ali, the problem that you see with the Windows security log file is related to the fact that that log type (like all the other Windows-related logs) is pre-parsed by an external application, which generates a text-based file that is then parsed by the corresponding GLA parser. I suspect your security log is considerably large if it takes 2 minutes for the first step (pre-parsing). I think the GLA should have support for direct parsing of binary logs (then the above scenario shouldn't take more than 3-5 seconds); as I understand it, the provided sensor/extractor blades work only on text-based files.
The issues you are raising, Ali, are GLA/parser related. Since this feature was assigned to me, all I can do is improve UI performance, not runtime (GLA/parsers, etc.). I'm not sure what you mean by making the time complexity proportional to the number of records imported. If you have N records in your log and the result of the filtering operation is a set of M records, are you suggesting that the complexity of the import operation should be O(M)? For predicates like "the first M records" this is possible, but I'm afraid that, generally speaking, it is unlikely you can perform the import in O(M) when the predicate is complex, unless the set of records is indexed.
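To illustrate the O(M) case for a "first M records" predicate: a minimal sketch of an importer that stops reading as soon as the quota is met, so cost is bounded by the number of matching records read rather than the total file size. This is not the TPTP/GLA API; the class, method names, and line-per-record format are hypothetical, for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/**
 * Hypothetical sketch: an import loop that short-circuits once maxRecords
 * matches are collected, instead of scanning the whole log file first.
 */
public class EarlyStopImporter {

    /** Reads records (one per line here) until maxRecords matches are found. */
    public static List<String> importRecords(Reader source,
                                             Predicate<String> filter,
                                             int maxRecords) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // Stop as soon as the quota is met; the rest of the file is never read.
            while (result.size() < maxRecords && (line = reader.readLine()) != null) {
                if (filter.test(line)) {
                    result.add(line);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Simulated log; only enough lines are read to satisfy the filter quota.
        String log = "GET /a\nPOST /b\nGET /c\nGET /d\n";
        List<String> first2Gets =
                importRecords(new StringReader(log), r -> r.startsWith("GET"), 2);
        System.out.println(first2Gets); // [GET /a, GET /c]
    }
}
```

As noted above, this only works when the predicate can be evaluated record-by-record on a stream; complex predicates over the whole record set would still need a full scan or an index.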
We can eventually move this feature from the UI to the GLA after finishing the UI improvements.
Theme: Scaling Up
Updating the component, since this requirement targets the log import function.
Setting target to future so it doesn't show up in 4.2 feature query.
This has been addressed by adding GLA sensor filtering support. You can close this requirement.
Closing as done per comment #12.
Changing target to when it was closed.
As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant enhancements/defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement/defect is verified/closed by the Project Lead since this enhancement/defect has been resolved and unverified for more than 1 year and considered to be fixed. If this enhancement/defect is still unresolved and reproducible in the latest TPTP release (http://www.eclipse.org/tptp/home/downloads/), please re-open.