Bug 83354 - Sequence diagrams show incorrect sequence of events when clocks not in sync
Summary: Sequence diagrams show incorrect sequence of events when clocks not in sync
Status: CLOSED WONTFIX
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: TPTP
Version: unspecified
Hardware: PC Windows XP
Importance: P2 enhancement with 5 votes
Target Milestone: ---
Assignee: Eugene Chan CLA
QA Contact:
URL: http://www.eclipse.org/tptp/groups/Ar...
Whiteboard: closed471
Keywords:
Depends on:
Blocks:
 
Reported: 2005-01-20 17:59 EST by Curtis d'Entremont CLA
Modified: 2016-05-05 11:01 EDT

See Also:


Attachments
profiling file from machine curtispd (2.85 KB, application/x-zip-compressed)
2005-01-20 18:01 EST, Curtis d'Entremont CLA
profiling file from machine vectra (3.47 KB, application/x-zip-compressed)
2005-01-20 18:01 EST, Curtis d'Entremont CLA

Description Curtis d'Entremont CLA 2005-01-20 17:59:18 EST
We are using the host interactions sequence diagram to view a distributed 
trace, where machine curtispd makes a remote call to machine vectra. However, 
the clocks on these two machines are not in sync with each other (there is a 
12-minute difference, which I set up on purpose to show the problem). The problem is 
that the diagram shows a strange and misleading sequence of events when the 
clocks are out of sync (even by a little bit).

I understand that I can set a time offset on one of the machines to correct 
this problem, and this works. However, users do not know this function exists, 
and even if they do, they don't know the offset to enter unless they look at 
the XML, which they never do (nor should they). And since this will affect 
virtually all users doing distributed tracing, we need a real fix for the 
problem.

I am not sure whether this problem is really in the sequence diagrams or in 
the model loaders. If you feel that this is a loader issue, please reassign 
the bug.

I believe the fix for this problem would involve normalizing the time on one 
of the hosts. There is enough information in the data to know that the times 
are out of sync. This can be done by comparing the time of the methodEntry of 
the source of the remote call, and the time of the methodEntry on the 
destination machine. We should assume that these two things happened at the 
same time (I realize it's not technically true, but there's no way to know 
real latency in the network from the information currently being sent). So we 
can conclude that those two times are really the same time, and use the 
difference as the offset, just as the user would enter it when setting the 
offset manually.
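
As a minimal sketch of the normalization proposed above, in Python: the function and parameter names are hypothetical and not part of TPTP, timestamps are assumed to be milliseconds in each host's own clock, and network latency is treated as zero, as stated above.

```python
def offset_from_remote_call(caller_entry_ms: int, callee_entry_ms: int) -> int:
    """Return the offset to add to the destination host's timestamps.

    Assumes the methodEntry on the source host and the methodEntry on the
    destination host happened at the same instant (zero network latency).
    """
    return caller_entry_ms - callee_entry_ms


def normalize(callee_time_ms: int, offset_ms: int) -> int:
    """Express a destination-host timestamp in the source host's clock."""
    return callee_time_ms + offset_ms
```

For example, if the destination clock runs 12 minutes (720000 ms) ahead, a call entered at 1000000 on the source and 1720000 on the destination yields an offset of -720000, the same value a user would otherwise have to enter by hand.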

We need this in 3.3 as it is affecting a higher product and is having a 
significant impact on the product's usability.

Trace files to be attached.
Comment 1 Curtis d'Entremont CLA 2005-01-20 18:01:02 EST
Created attachment 17350 [details]
profiling file from machine curtispd
Comment 2 Curtis d'Entremont CLA 2005-01-20 18:01:25 EST
Created attachment 17351 [details]
profiling file from machine vectra
Comment 3 Dominique Guilbaud CLA 2005-01-21 06:04:09 EST
We could imagine a contextual menu that would be enabled when the two (and only 
two) messages that are to be made synchronous are selected. The corresponding 
action would be to set the delta time accordingly and automatically.
Curtis, please indicate the priority, because a P3 is not certain to be addressed.
Comment 4 Valentina Popescu CLA 2005-01-21 09:12:56 EST
Let's not forget that the sequence diagram might not be the only view 
displaying async messages so the implementation should not be view specific.
Comment 5 Valentina Popescu CLA 2005-01-21 09:13:33 EST
CC Wayne since this is a usability problem
Comment 6 Curtis d'Entremont CLA 2005-01-21 10:51:49 EST
I don't think the context menu would work in this case, as there is in fact 
only one call in the sequence diagram that the user can see and select (the 
call goes from one host to the other). Also, I probably would not have been  
able to find this action as I wouldn't think to right click to synchronize.

I think we need to automate this and set the offset on each host as the remote 
calls are made - just as if the user did this manually. I'm now thinking this 
should probably be done by the loaders as they find methodEntries with 
invocation context (the element that allows it to correlate remote calls). 
This would make the fix view-neutral.

I feel that this is P1 for our product. Currently, if a user asked me how to 
get the value to set for the host offset, my explanation would probably 
involve diving into some trace XML files and searching manually for some 
timestamps. So we are essentially broken in this respect.
Comment 7 Wayne Ho CLA 2005-02-03 10:41:42 EST
This does appear to be a significant usability issue as the time offsets result
in an incorrect representation of information to the user when comparing hosts
(although I don't think this is purely a sequence diagram issue as this would be
a problem for any view comparing data from the hosts with incorrect time
offsets).  The user has no indication that something may be incorrect.

We should keep in mind that within TPTP's Log Navigator, they also have a
similar problem when comparing logs from various hosts (there may be a need to
synch time for distributed applications).  This is currently done by:
1) Remove filter for hosts in the Log Nav
2) Choosing Properties
3) Setting the "Delta time" property 

Obviously, this isn't a particularly usable solution either (as Curtis
described, how does the user know what the proper delta time should be?). 
However, we may need to find a consistent solution both within the Log Navigator
as well as in the Profiling Monitors view.

I believe it makes sense to set the time offset (automatically or manually by the
user) when they choose to look at a correlation/merge of data from two or more hosts.

Within the Log Navigator, this can be done when the user first chooses to create
a Log Correlation and the time can (ideally) sync automatically.  Alternatively,
the user can set the time offset but we should provide intelligent offset
suggestions given some analysis of the data we have.

Likewise, in the Profiling Monitor view this could be done during an Explicit
Merge (Bug 58385) which is a similar action to creating a Log Correlation. 
Likewise, if an automated mechanism to synch the time is possible, this would
provide improved usability.
Comment 8 Curtis d'Entremont CLA 2005-02-03 12:14:47 EST
Yes, agree this affects more than just the trace sequence diagrams. Log 
correlations are another case where it affects things quite significantly.

I should probably mention that I don't think it is always possible to sync up 
the times automatically, like in the case of logs, or traces where there are no 
distributed calls. There is simply not enough information to be able to do it 
(I think). However, in the case where there are distributed calls, there's always 
an invocation context, which should allow the loaders to know the time 
difference.
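
A rough sketch of that distinction, in Python with hypothetical names (nothing here is a TPTP API): the estimate is only available when at least one correlated remote call exists. Picking the largest difference is my own assumption, on the reasoning that network latency can only make the destination's methodEntry later, so the pair with the largest difference is the one least distorted by latency.

```python
from typing import Iterable, Optional, Tuple


def estimate_callee_offset_ms(
    calls: Iterable[Tuple[int, int]],
) -> Optional[int]:
    """Estimate the offset to add to the called host's timestamps.

    Each item is (caller_entry_ms, callee_entry_ms) for one remote call
    correlated through its invocation context. Returns None when there
    are no distributed calls, since the data then carries no information
    about the clock difference.
    """
    diffs = [caller - callee for caller, callee in calls]
    if not diffs:
        return None
    # Latency only pushes callee_entry_ms later, so the largest
    # difference comes from the call with the least latency.
    return max(diffs)
```

A caller of this function would fall back to the manual offset (or leave the offset at zero) when it returns None, which is the logs-only case described above.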

I like the idea of putting this in the new correlation / new trace merge wizard 
(when we get to that point). The current location is hard to find.
Comment 9 Dominique Guilbaud CLA 2005-02-28 12:10:06 EST
Following this discussion, I'm modifying component and owner. Not sure this is 
you Valentina, sorry.
Comment 10 Marius Slavescu CLA 2005-02-28 13:11:39 EST
The only way to synchronize the time is to set the deltaTime at TRCAgentProxy or
TRCNode level and use it in the viewers.

The Log viewer already does that: all the time values are computed by taking the
deltaTime value into consideration.
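
To illustrate the idea (a Python sketch only; the class and field names below are made up for the example and are not the TRCNode or TRCAgentProxy model classes), the recorded times stay untouched and the viewer adds the node's delta time whenever it orders or displays events:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a host entry that carries a delta time."""
    name: str
    delta_time_ms: int = 0  # correction applied only when viewing


@dataclass
class Event:
    node: Node
    raw_time_ms: int  # as recorded on the host; never modified
    label: str


def display_time_ms(event: Event) -> int:
    """Time the viewer should use: recorded time plus the node's delta."""
    return event.raw_time_ms + event.node.delta_time_ms


def ordered_for_view(events: List[Event]) -> List[Event]:
    """Order events by corrected time without changing the stored data."""
    return sorted(events, key=display_time_ms)
```

Changing `delta_time_ms` on a node immediately changes the order the view computes, while the stored `raw_time_ms` values remain exactly as loaded.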
Comment 11 Valentina Popescu CLA 2005-02-28 13:23:38 EST
Marius, there should be an algorithm defined inside the model to calculate 
clock skew.
Let's talk about this and decide which release it can go into.
Comment 12 Marius Slavescu CLA 2005-02-28 13:59:30 EST
We might be able to suggest the offset in some cases (like J2EE), though certainly
not in the loaders, since they don't know the synchronization context. In most
cases we won't be able to compute that value.

I think it would be more appropriate to have a UI widget that shows the timeline
of each involved agent (in the test case, these would be execution results) and
lets us sync them visually by dragging each agent until we get the expected result
in the viewer, in addition to the automated detection and correction.

The model should never be changed, the only thing that can be changed is the
deltaTime, which could be set by an automatic deskewing algorithm (executed in
the view whenever a synchronization problem is encountered).
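
One possible shape for such a check, sketched in Python under my own assumptions (the function names are hypothetical, and "synchronization problem" is taken to mean a remote call whose destination methodEntry appears earlier than its source methodEntry once the current deltaTime is applied):

```python
from typing import List, Tuple

Call = Tuple[int, int]  # (caller_entry_ms, callee_entry_ms), raw host times


def find_backward_calls(calls: List[Call], callee_delta_ms: int = 0) -> List[Call]:
    """Return calls that appear to arrive before they were made."""
    return [(a, b) for a, b in calls if b + callee_delta_ms < a]


def deskew_if_needed(calls: List[Call], callee_delta_ms: int = 0) -> int:
    """Return a deltaTime for the called host; raw times are not changed.

    If no call goes back in time, the current delta is kept. Otherwise
    the delta is increased just enough that the worst offending call
    becomes simultaneous with its source.
    """
    backward = find_backward_calls(calls, callee_delta_ms)
    if not backward:
        return callee_delta_ms
    worst = max(a - (b + callee_delta_ms) for a, b in backward)
    return callee_delta_ms + worst
```

Note that this only catches the case where the called host's clock is behind. If it is ahead, no call appears to go back in time, so the view would show an inflated delay rather than an ordering problem, and this check alone would not correct it.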
Comment 13 Marius Slavescu CLA 2005-02-28 14:06:29 EST
We should also make the deltaTime more visible, some label or icon decorator on
agent/node entries in navigators might help.
Comment 14 Alex Nan CLA 2005-03-01 11:13:45 EST
The RAC should provide an API to get the delta time on a pair of hosts; we 
shouldn't have to discover it from the model.
Comment 15 Paul Slauenwhite CLA 2005-03-08 12:42:39 EST
Default time correlation is not configurable.

When log and trace data is correlated using the Default time correlation 
schema, the delta time is hard-coded to one second.  This value is not always a 
useful indicator of an association between events.  In order to be usable, this 
time should be customizable based on the data that is being correlated since 
the association between events is data-specific.
Comment 16 Paul Slauenwhite CLA 2005-03-08 12:43:07 EST
*** Bug 87309 has been marked as a duplicate of this bug. ***
Comment 17 Curtis d'Entremont CLA 2005-03-09 10:40:50 EST
I've noticed that when importing and correlating logs from different machines 
with out-of-sync clocks, the log interactions view shows arrows going "back in 
time", making it apparent that the clocks are out of sync. But the trace 
interactions views don't do this, and instead show fragments of a transaction 
back in time with no links to it coming from the "future". Just a thought - 
should these two not be consistent? Maybe this won't be an issue if we can 
automate things for the trace interactions, but if not, then perhaps something 
to at least think about.
Comment 18 Kent D Siefkes CLA 2005-08-31 23:42:38 EDT
Upgrading Priority to P1 per Eric for inclusion in TPTP 4.2 release planning.
Comment 19 Paul Slauenwhite CLA 2005-09-13 08:16:08 EDT
Please omit comments #15 and #16 since this defect is not related to defect 
#87309.
Comment 20 Sri Doddapaneni CLA 2005-12-09 12:57:55 EST
As per PMC F2F discussion, redirecting to Sequence diagram component. It appears to be a regression from 3.0.
Comment 21 Eugene Chan CLA 2006-03-28 11:58:25 EST
PMC approved deferring this enhancement from 4.2 to a later release.
detail: http://dev.eclipse.org/mhonarc/lists/tptp-pmc/msg01319.html
Comment 22 Kathy Chan CLA 2009-02-19 16:47:02 EST
Moving untargeted enhancements to Future target.
Comment 23 Paul Slauenwhite CLA 2009-06-30 06:39:46 EDT
As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As such, TPTP is not delivering enhancements. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement is resolved as WONTFIX. For this enhancement to be considered, please re-open with an attached patch including the Description Document (see http://www.eclipse.org/tptp/home/documents/process/development/description_documents.html), code (see http://www.eclipse.org/tptp/home/documents/resources/TPTPDevGuide.htm), and test cases (see http://www.eclipse.org/tptp/home/documents/process/TPTP_Testing_Strategy.html).
Comment 25 Kathy Chan CLA 2010-11-18 18:53:57 EST
As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant enhancements/defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement/defect is verified/closed by the Project Lead since this enhancement/defect has been resolved and unverified for more than 1 year and considered to be fixed. If this enhancement/defect is still unresolved and reproducible in the latest TPTP release (http://www.eclipse.org/tptp/home/downloads/), please re-open.