
Bug 116696

Summary: [Performance] Provide a GLA outputter that will output data in Large Resource format and load it directly in the data store
Product: z_Archived
Reporter: Marius Slavescu <slavescu>
Component: TPTP
Assignee: Marius Slavescu <slavescu>
Status: CLOSED FIXED
QA Contact:
Severity: enhancement
Priority: P1
CC: apnan, cmaier, labadie, popescu, sluiman
Version: unspecified
Keywords: plan
Target Milestone: ---
Hardware: PC
OS: Windows XP
URL: http://www.eclipse.org/tptp/groups/Architecture/documents/features/hf_116696.html
Whiteboard: closed460

Description Marius Slavescu CLA 2005-11-16 12:28:36 EST
To improve import time in the large resource (log) case, we could build an
outputter that generates (and also runs) the SQL statements required to
build a large log resource or, even better, dumps files in CSV format for each
affected table and then loads them all at the end of GLA processing.
Comment 1 Harm Sluiman CLA 2005-11-16 12:46:01 EST
First this is clearly an enhancement request ;-)
It should also be separated into three parts.
1. A query driven RDB sensor
2. A CSV outputter
3. An RDB outputter

I am not sure I completely understand the importance or implication of each, 
but let's treat them separately.
The GLA has a very specific use case and should not be twisted into use cases 
where it does not belong.
Comment 2 Marius Slavescu CLA 2005-11-16 13:57:51 EST
I intended this feature to handle only the outputter case (for large log
resources only). I expect the sensor to be for other data sources (different
than Large Log data sources), so please extend the description on point 1.

The idea is to take the producer's CommonBaseEvent and push it directly into
the database as a Large Log resource (avoiding XML serialization,
deserialization, event loading and the related database operations to persist
the model).

We could provide one outputter per output type or have a parameterized one which
can support any database/output format that is currently supported in the large
log scenarios.

I think performance can be improved further (compared with an SQL dump or
loading directly through JDBC) by using CSV files (where applicable) and doing
the load at the end of the parsing process. In that case, however, the user
would not be able to view the resource until the whole log has been parsed and
loaded, which is possible when JDBC is used and the resource is updated
incrementally.
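
The CSV variant described above could look roughly like the following. This is only a sketch under stated assumptions, not the TPTP implementation: the class name CsvTableDump, its methods, and the table name used in the usage note are all hypothetical. Rows are buffered per table while parsing and written out as one CSV file per table at the end, ready for a database bulk-load step that is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: buffers rows per table, then writes one CSV file per table. */
final class CsvTableDump {
    private final Map<String, List<String>> rowsByTable = new LinkedHashMap<>();

    /** Quote a field in the common CSV style: wrap in quotes, double embedded quotes. */
    static String quote(String field) {
        if (field == null) {
            return ""; // assumption: an empty field stands for NULL
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Queue one row for the given table; nothing is written to disk yet. */
    void addRow(String table, String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(quote(fields[i]));
        }
        rowsByTable.computeIfAbsent(table, t -> new ArrayList<>()).add(line.toString());
    }

    /** Call once at the end of parsing: writes <table>.csv into dir for each table. */
    Map<String, Path> flush(Path dir) {
        Map<String, Path> written = new LinkedHashMap<>();
        try {
            Files.createDirectories(dir);
            for (Map.Entry<String, List<String>> entry : rowsByTable.entrySet()) {
                Path file = dir.resolve(entry.getKey() + ".csv");
                Files.write(file, entry.getValue(), StandardCharsets.UTF_8);
                written.put(entry.getKey(), file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        rowsByTable.clear();
        return written;
    }
}
```

An outputter would call addRow("CBE_EVENT", ...) (a made-up table name) for every event and flush(dir) once when parsing completes. Because no data reaches the database until that final step, the resource cannot be viewed before the whole log has been processed, as noted above.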


I think this is an important feature (I tagged it as P1), as I expect big
import performance improvements from using it.
Comment 3 Harm Sluiman CLA 2005-11-16 15:34:32 EST
Let's discuss this in person. These are good ideas, but I am not sure they fit 
GLA as well as they fit CBE and LTA.
Comment 4 Alex Nan CLA 2005-11-16 15:52:27 EST
From our preliminary tests we concluded that, for a scalable load operation 
(and implicitly a scalable log import), the EMF model needs to be bypassed and 
data needs to be loaded in large amounts: each database table that maps to an 
EMF model class needs to be loaded separately in a large chunk. We found that 
the GLA is a good place to add the code that generates the raw data files or 
SQL statement blocks that will then be loaded into or executed against the 
database.
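
One way to picture the "SQL statement blocks" option is the sketch below, which turns the buffered rows of a single table into multi-row INSERT statements of a fixed chunk size. Everything in it is an assumption made for illustration: InsertBlockBuilder, its methods, and the table name in the usage note are hypothetical, and the multi-row VALUES syntax is not accepted by every database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds multi-row INSERT blocks for one table. */
final class InsertBlockBuilder {

    /** Render a value as a SQL string literal, doubling embedded single quotes. */
    static String literal(String value) {
        return value == null ? "NULL" : "'" + value.replace("'", "''") + "'";
    }

    /** Split rows into chunks of at most chunkSize and emit one INSERT per chunk. */
    static List<String> build(String table, List<String[]> rows, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<String> statements = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, rows.size());
            StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES ");
            for (int r = start; r < end; r++) {
                if (r > start) {
                    sql.append(", ");
                }
                sql.append('(');
                String[] row = rows.get(r);
                for (int c = 0; c < row.length; c++) {
                    if (c > 0) {
                        sql.append(", ");
                    }
                    sql.append(literal(row[c]));
                }
                sql.append(')');
            }
            statements.add(sql.toString());
        }
        return statements;
    }
}
```

For example, build("CBE_EVENT", rows, 1000) would yield one statement per thousand rows of that (made-up) table. When the statements are executed over JDBC rather than dumped to a file, a PreparedStatement with addBatch and executeBatch is usually the safer choice, since it avoids hand-written literal escaping.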
Comment 5 Alex Nan CLA 2005-11-16 19:01:37 EST
By RDB outputter, do you mean an outputter that generates bulk SQL INSERT 
statements?
Comment 6 Marius Slavescu CLA 2005-11-18 11:51:08 EST
This feature will also include the UI changes required to make use of this
outputter (Import Log Wizard).
Comment 7 Marius Slavescu CLA 2006-02-21 11:29:54 EST
Add an entry to the user documentation recommending that, to obtain maximum log import performance with this feature, users dedicate a database server in their environment and install the AC on it as well.
Comment 8 Marius Slavescu CLA 2006-04-10 15:18:42 EDT
Moved to i3 for the remaining work, which doesn't impact the UI; the behavior will be transparent to the user.
Comment 9 Sri Doddapaneni CLA 2006-04-13 16:25:10 EDT
Deferred to I3. Marius, please confirm whether this is indeed doable in I3. If not, please propose dropping it from the 4.2 plan.
Comment 10 Sri Doddapaneni CLA 2006-05-23 03:26:01 EDT
Please provide status update.
Comment 11 Marius Slavescu CLA 2006-05-23 10:30:07 EDT
Done.
Comment 12 Paul Slauenwhite CLA 2009-06-30 09:36:40 EDT
As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant enhancements/defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement/defect is verified/closed by the Project Lead since the originator of this enhancement/defect has an inactive Bugzilla account and it is considered to be fixed. If this enhancement/defect is still unresolved and reproducible in the latest TPTP release (http://www.eclipse.org/tptp/home/downloads/), please re-open.