Bug 316276

Summary: [LTTng] Inconsistent results in CFV, RV and SV
Product: z_Archived
Reporter: Francois Chouinard <fchouinard>
Component: LinuxTools
Assignee: Francois Chouinard <fchouinard>
Status: CLOSED FIXED
QA Contact: Francois Chouinard <fchouinard>
Severity: critical    
Priority: P3    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Whiteboard:
Attachments (description: flags):
- Consistency patch: none
- Updated consistency patch: fchouinard: iplog-

Description Francois Chouinard CLA 2010-06-09 08:19:49 EDT
The results in the ControlFlow view, the Resources view and the Statistics view are not consistent and not systematically reproducible.

There seems to be a concurrency issue: the sequence in which events are read varies with the order of the requests to the back-end.

Some background.

Initially, to keep things simple, requests were serviced in sequence and everything was fine, except that "long" requests (i.e. going over the whole set of traces to build indexes, checkpoints, stats, etc.) would adversely affect the application's responsiveness.

To improve the user experience, 2 performance measures were implemented:

1. Request coalescing, where similar requests are merged to limit repeated back-end accesses. This acts essentially as mux/demux and works fine.

2. Request prioritization into "background" vs. "foreground", where FG requests would pre-empt the BG ones.
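As a rough illustration of the coalescing idea in item 1, here is a minimal Java sketch. It is not the code from the patch: the class `CoalescedRequest`, its methods, and the use of plain strings as events are all hypothetical. It only shows the mux/demux shape, where several requests share a single pass over the back-end and each event is fanned out to every requester.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: several similar requests merged into one back-end pass.
final class CoalescedRequest {
    // One handler per original (un-merged) request.
    private final List<Consumer<String>> handlers = new ArrayList<>();
    // Counts how many events were actually read from the back-end.
    private int backEndReads = 0;

    // "Mux": attach another request to the shared pass.
    void add(Consumer<String> handler) {
        handlers.add(handler);
    }

    // Read each event once, then "demux" it to every attached request.
    void execute(List<String> trace) {
        for (String event : trace) {
            backEndReads++;
            for (Consumer<String> handler : handlers) {
                handler.accept(event);
            }
        }
    }

    int backEndReads() {
        return backEndReads;
    }
}
```

With two handlers attached and a three-event trace, the back-end is read three times rather than six, and both handlers see the events in the same order.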


For the prioritization, 3 approaches were tried and all of them turned out to be problematic:

a. Use 2 request queues (FG and BG) and have the FG requests pre-empt the BG requests. The tests showed that switching between the requests was confusing for the library and that there was no obvious point where a request could be pre-empted safely using a single trace object (i.e. file descriptor).

b. Use 2 independent trace objects to service the corresponding requests. By setting the thread priorities properly, it was possible to make the FG requests perform better. However, this was also confusing for the library and led to inconsistent accesses.

c. Use 1 trace object and 1 prioritized request queue where FG requests were given top priority and BG requests (running at lowest priority) were partitioned into smaller requests before being queued. This seriously affected performance and turned out to be as unreliable as the other approaches.


All this suggests insufficient synchronization at the library, the JNI component and/or the trace handler level.
Comment 1 Francois Chouinard CLA 2010-06-16 18:34:19 EDT
Created attachment 172080 [details]
Consistency patch

Some instability in the application was introduced when we started processing event requests concurrently. With this patch, the consistency is restored.

This patch addresses the problem by:
- Funneling all data accesses through the Experiment instead of directly accessing the underlying trace
- Removing the concept of main vs. clone to process FG/BG requests (a bad idea to start with) in favor of a PriorityBlockingQueue processed by a SingleThreadExecutor
- Changing the data queue shared by Producer/Consumer from a LinkedBlockingQueue to a SynchronousQueue (to accommodate LTTng's "special needs")
- Adding some synchronization where concurrent accesses could corrupt things
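For readers unfamiliar with the queue/executor combination named above, the following is a minimal sketch of one way to wire it up, not the code from the patch. The names `PrioritizedRequests`, `Request`, `Priority` and `runDemo` are hypothetical. Since `Executors.newSingleThreadExecutor()` uses its own internal queue, the sketch assumes a `ThreadPoolExecutor` limited to one thread and backed by a `PriorityBlockingQueue`. A sequence number keeps requests of equal priority in submission order.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one worker thread servicing a prioritized request queue.
final class PrioritizedRequests {

    // Declaration order doubles as service order: FOREGROUND sorts first.
    enum Priority { FOREGROUND, BACKGROUND }

    static final class Request implements Runnable, Comparable<Request> {
        private static final AtomicLong SEQUENCE = new AtomicLong();
        private final Priority priority;
        private final long sequence = SEQUENCE.getAndIncrement();
        private final Runnable body;

        Request(Priority priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override
        public void run() {
            body.run();
        }

        // Higher priority first; ties are broken by submission order.
        @Override
        public int compareTo(Request other) {
            int byPriority = priority.compareTo(other.priority);
            return byPriority != 0 ? byPriority : Long.compare(sequence, other.sequence);
        }
    }

    // Queues two BG requests, then one FG request, and returns the service order.
    static List<String> runDemo() {
        List<String> log = new CopyOnWriteArrayList<>();
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<Runnable>());
        CountDownLatch gate = new CountDownLatch(1);

        // Keep the single worker busy so the next requests pile up in the queue.
        executor.execute(new Request(Priority.FOREGROUND, () -> await(gate)));
        executor.execute(new Request(Priority.BACKGROUND, () -> log.add("index")));
        executor.execute(new Request(Priority.BACKGROUND, () -> log.add("stats")));
        executor.execute(new Request(Priority.FOREGROUND, () -> log.add("view")));

        gate.countDown();
        executor.shutdown(); // already-queued requests are still serviced
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    private static void await(CountDownLatch gate) {
        try {
            gate.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this arrangement a FG request jumps ahead of queued BG requests but never interrupts the request currently running, so only one request touches the trace at any time.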

A number of miscellaneous tweaks are also included:
- Added toString() and equals() where needed
- Set the HistogramView window to 1sec (in line with CFV and RV)

Note that some JUnits are now failing. They will need to be further analyzed to ensure no new problem was introduced.

This patch can't be committed before the JUnits pass.
Comment 2 Francois Chouinard CLA 2010-06-21 14:59:04 EDT
Created attachment 172358 [details]
Updated consistency patch

- Misc fixes so the JUnits pass cleanly
- Fixed the read/write timeout (temporary solution, see Bug 216280)
Comment 3 Francois Chouinard CLA 2010-06-21 15:04:23 EDT
Comment on attachment 172358 [details]
Updated consistency patch

Patch committed
Comment 4 Francois Chouinard CLA 2010-06-21 15:04:41 EDT
Patch committed
Comment 5 Francois Chouinard CLA 2010-07-08 16:37:11 EDT
Patch also committed to the Helios maintenance branch.
Comment 6 Francois Chouinard CLA 2011-07-22 14:36:20 EDT
Delivered with 0.6.1
Comment 7 Patrick Tasse CLA 2013-05-24 15:05:44 EDT
*** Bug 313966 has been marked as a duplicate of this bug. ***