Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 277431 - JVM crash when profiling with ThreadProf and "Contention Analysis" checked on WinXP
Summary: JVM crash when profiling with ThreadProf and "Contention Analysis" checked on...
Status: CLOSED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: TPTP (show other bugs)
Version: unspecified   Edit
Hardware: PC Windows XP
: P2 critical (vote)
Target Milestone: ---   Edit
Assignee: Yunan, He CLA
QA Contact: Kathy Chan CLA
URL:
Whiteboard: closed471
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-22 04:42 EDT by Chengrui Deng CLA
Modified: 2016-05-05 10:48 EDT (History)
7 users (show)

See Also:
kathy: pmc_approved? (oec)
kathy: pmc_approved? (chris.l.elford)
sluiman: pmc_approved+
kathy: pmc_approved? (ernest)
kathy: pmc_approved+
kathy: pmc_approved? (paulslau)
ewchan: pmc_approved+
ewchan: review+
jgwest: review+


Attachments
JVM crash log file for ThreadProf with StartStop (10.72 KB, text/plain)
2009-05-22 04:47 EDT, Chengrui Deng CLA
no flags Details
Patch to resolve JVM crash regression on ThreadProf (4.70 KB, patch)
2009-05-27 03:51 EDT, Chengrui Deng CLA
chengrui.deng: review?
Details | Diff
Profilers binary for testing on WindowsXP (1.08 MB, application/x-zip-compressed)
2009-05-27 03:52 EDT, Chengrui Deng CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Chengrui Deng CLA 2009-05-22 04:42:37 EDT
Build ID: I20090430-2300

Steps To Reproduce:
1. Profing InOut/StartStop with ThreadProf and checked "Contention Analysis" option with TPTP 4.6.0 I3 TP1 candidate build all-in-one for Windows.
2. Execute profiling and the crash happens.


More information:
  TPTP build ID:TPTP-4.6.0-200905191217 (TPTP 4.6.0 I3 TP1 candidate)

  This bug seems like a regression. It is found when I execute TPTP 4.6.0 I3 TP1 testing with ThreadProf when "Contention Analysis" is checked on Windows XP. It is not exist before or using earlier TPTP all-in-one.

  You can just use simple InOut or StartStop test cases in TPTP test suite.

  Below is the crash information in command:
#
# An unexpected error has been detected by Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x02543fe9, pid=4240, tid=4484
#
# Java VM: Java HotSpot(TM) Client VM (10.0-b23 mixed mode windows-x86)
# Problematic frame:
# C  [JPIAgent.dll+0x3fe9]
#
# An error report file with more information is saved as:
# C:\Project\Java_workspace\InOut\hs_err_pid4240.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
  The log file produced by JVM for StartStop is attached.
Comment 1 Chengrui Deng CLA 2009-05-22 04:47:47 EDT
Created attachment 136774 [details]
JVM crash log file for ThreadProf with StartStop
Comment 2 Kathy Chan CLA 2009-05-22 10:13:50 EDT
Eugene,

Please take a look.
Comment 3 Eugene Chan CLA 2009-05-22 10:58:15 EDT
According to the log, there is a exception in the JPIAgent.dll+0x3fe9 file. 
Comment 4 Kathy Chan CLA 2009-05-25 13:32:04 EDT
Yunan and Chengrui,

Any update on this defect?  We need this for TPTP 4.6.
Comment 5 Chengrui Deng CLA 2009-05-25 19:23:55 EDT
(In reply to comment #4)
> Yunan and Chengrui,
> 
> Any update on this defect?  We need this for TPTP 4.6.
> 

Hi, Kathy,
  I am trying to find the cause and solve the problem now. If any update, I will report to Bugzilla.

  Thanks,
  Chengrui
Comment 6 Chengrui Deng CLA 2009-05-27 03:48:02 EDT
  It is found this regression is caused by incomplete modification for bug 270767 mainly in CProfEnv::GetStackTrace(TId threadId, SStackTrace_* stackTrace) function.

1. Problems
  In original GetStackTrace code before modification for bug 270767, it is like below:
====================================================
  SMethodInfo* methodInfo = GetMethodData(stackTrace->pStackEntries[i].methodId);
  stackTrace->pStackEntries[i].methodName = 
            	(char*)malloc(strlen(methodInfo->szMethodName)...
====================================================

  The pointer methodInfo is used directly and not decide whether it is NULL. So when I began to implement shadow trace solution for bug 270767, I added the NULL pointer condition in the code. So the original following code below will be executed when methodInfo is not NULL after modification:
====================================================
// It will be executed when methodInfo is not NULL
stackTrace->pStackEntries[i].methodName = 
            	(char*)malloc(strlen(methodInfo->szMethodName) + 
            	strlen(methodInfo->szMethodSig) + 
            	strlen(methodInfo->szClassName) + 3);
====================================================
  Unfortunately, another pointer stackTrace->pStackEntries[i].methodName will be undefined when methodInfo is NULL. It will be used by JPIAgent to output.

  Another problem is in JPIAgent binformat.h code: 
====================================================
inline size_t
getSize(const bf_string_t t[], const size_t length)
{
  ...
  for (size_t i = 0; i < length; ++i) {
    size += strlen(t[i]) + 1;	
  }
  ...
}
====================================================
  The function strlen is used directly on a char pointer t[i]. If t[i] is NULL, it will be equal to strlen(NULL). It will crash on Windows sometimes after testing.

2. Solutions
  According to analysis above, I have added memory initialization code and NULL pointer check condition. It will solve the regression. 
  I have tested the patch with StartStop and InOut (ThreadProf for "Contention Analysis" option checked and not checked). Regression testing for bug 270767 is also done. I will submit the patch and profilers binary for testing soon.

3. Other issues
  For TPTP 4.6.0 re-attach problem bug 270767, we have filed a counterpart as bug 194081. 
  Because bug 194081 is re-targeted to TPTP 4.5.2.1 milestone 3, we can commit a merged patch of bug 270767 and bug 277431 for it. I will update the patch for bug 194081 later.

  Thanks,
  Chengrui
Comment 7 Chengrui Deng CLA 2009-05-27 03:51:03 EDT
Created attachment 137277 [details]
Patch to resolve JVM crash regression on ThreadProf

Eugene & Jonathan,
  Please help to review the patch code. Thanks.
Comment 8 Chengrui Deng CLA 2009-05-27 03:52:47 EDT
Created attachment 137278 [details]
Profilers binary for testing on WindowsXP
Comment 9 Jonathan West CLA 2009-05-27 14:47:58 EDT
I've looked over the patch and it looks good.
Comment 10 Eugene Chan CLA 2009-05-27 15:00:29 EDT
Patch looks okay, +1 provided that all sprintf stmt is removed before committing.
Comment 11 Eugene Chan CLA 2009-05-27 15:09:32 EDT
(In reply to comment #10)
> Patch looks okay, +1 provided that all sprintf stmt is removed before
> committing.
> 

Sorry typo, I mean print out and not sprintf. There are print out to the console when run with IBM JVM 6 sr4.
Comment 12 Kathy Chan CLA 2009-05-28 10:42:17 EDT
I would like to request that the following defect be considered for PMC
approval for 4.6.0.

1.  Explain why you believe this is a stop-ship defect. How does the defect
manifest itself, and how will users of TPTP / consuming products be affected if
the defect is not fixed?

JVM crash when profiling with ThreadProf and "Contention Analysis" checked on. 

2. Is there a work-around? If so, why do you believe the work-around is
insufficient?
No.

3. Is this a regression or API breakage? Explain.
Yes.  This is a regression from previous driver because of bug 270767.

4. Does this require new API?
No.

5. Who performed the code review?
Jonathan and Eugene.

6. Is there a test case attached to the bugzilla record?
No.  Already covered by existing test cases.

7. What's the nature of the fix?  What is the risk associated with this fix? 
Added memory initialization code and NULL pointer check condition.  Risk is low.

8. Is this fix related to any standards that TPTP adheres to? If so, who has
validated that the fix continues to adhere to the standard?
No.

Please note that Eugene has checked the patch again and there are no print statement that needs to be cleaned up.
Comment 13 Kathy Chan CLA 2009-05-28 11:52:12 EDT
Jonathan,

PMC has approved this for TPTP 4.6.  Since the Intel China team is off on national holiday, could you please check in the patch so that we have this in a build for testing as early as possible?  Thanks!
Comment 14 Jonathan West CLA 2009-05-28 12:05:49 EDT
Patch checked into HEAD w/ PMC approval.
Comment 15 Kathy Chan CLA 2010-11-18 23:27:40 EST
As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant enhancements/defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement/defect is verified/closed by the Project Lead since this enhancement/defect has been resolved and unverified for more than 1 year and considered to be fixed. If this enhancement/defect is still unresolved and reproducible in the latest TPTP release (http://www.eclipse.org/tptp/home/downloads/), please re-open.