| Summary: | [CGProf] Crash on multi-core platforms when printing <methodDef> element | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | z_Archived | Reporter: | Asaf Yaffe <asaf.yaffe> | ||||||
| Component: | TPTP | Assignee: | Viacheslav <viacheslav.g.rybalov> | ||||||
| Status: | CLOSED FIXED | QA Contact: | |||||||
| Severity: | critical | ||||||||
| Priority: | P1 | CC: | analexee, guru.nagarajan, ivan.g.popov, stanislav.v.polevic | ||||||
| Version: | unspecified | ||||||||
| Target Milestone: | --- | ||||||||
| Hardware: | PC | ||||||||
| OS: | Windows XP | ||||||||
| Whiteboard: | closed460 | ||||||||
| Bug Depends on: | 168531 | ||||||||
| Bug Blocks: | 190202 | ||||||||
| Attachments: |
|
||||||||
During test execution, the following profiler assertion sometimes fails in debug mode:
[Error: assert "iRes == MRTE_RESULT_OK" failed. File: c:\work\tptp\4.3\jvmtiagent\baseprof\sources\profenv.cpp Line:277]
iRes in this case is -2147483648
This is the code:
SClassInfo* classInfo = new SClassInfo;
TResult iRes = m_pMpiApi->GetClassInfo(m_clientId, classId,
DR_JAVA_NATIVE_CLASS_NAME | DR_SOURCE_FILE_NAME, classInfo);
LOG_ASSERT(iRes == MRTE_RESULT_OK);

(In reply to comment #1)
I saw this assertion failure once but was not able to reproduce it. I am not sure whether it is related to this bug or not. Slava, can you please turn on Martini logging (level 5) and post a log file here in case you reproduce this assertion again?
Thanks,
Asaf

Created attachment 67644 [details]
Martini log file

I was able to reproduce the assertion. The GetClassInfo API is called with an invalid class id. The class id passed to the function looks like a method id (judging by its value, which is too high for a class id). This is another indication that the New Method event handler data is corrupted.

The crash happens because Martini produces multiple NewMethod events (see bug 168531, "JVMTI CG profiler. Martini produces multiple NewMethod events"). In each event handler the profiler tries to store the method data in its internal storage. If the storage already contains a structure for the specified method id, the profiler deletes the previously stored structure and puts the new one into the storage. At the same time, a reference to the deleted structure may still be in use in another thread that previously created this data structure. In the 'PowerWorkload 10' test case the test creates 10 new threads almost simultaneously, and Martini invokes the NewMethodEvent handler for the 'run' method in all threads, also simultaneously.
There are two ways to fix the bug:
1. Fix bug 168531 ("JVMTI CG profiler. Martini produces multiple NewMethod events").
2. Fix the logic of internal data storage in the profiler.
Both approaches may be applied together. An easy way to fix the bug is to not delete stored structures, but this will cause memory leaks. The leaks will be eliminated once bug 168531 is fixed.

Fixing Bug 168531 at this stage is risky. It may introduce regressions in other profilers and event handlers (the New Method Event has multiple implementations for different JVMs and instrumentation scenarios). Therefore, I think this bug should be fixed by modifying the data storage logic in CGProf. It is also advisable to verify that the same problem does not happen in other databases maintained by CGProf and other profilers.
Here's one possible way of fixing the New Method event handler:
1. Check if the method already exists in the database: profenv->GetMethodData(data.methodId)
2. If it does not exist, get the information from Martini and store it in the database: profenv->AddNewMethodData(data.methodId)
This solution is thread safe since both GetMethodData and AddNewMethodData are protected by the same critical section (m_pMethodSDataLockObject).

Created attachment 69508 [details]
quick patch for the bug
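
The storage fix discussed above can be illustrated with a minimal C++ sketch. This is not the attached patch: MethodData, MethodStore and GetOrAdd are hypothetical names, and std::mutex stands in for the critical section mentioned in the comments (m_pMethodSDataLockObject). The sketch assumes that the lookup and the insertion are done under a single lock hold and that an existing record is never deleted or replaced, so a pointer handed to one thread cannot be freed by a later NewMethod event for the same method id.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-method record kept by the profiler.
struct MethodData {
    std::uint64_t methodId;
    std::string name;
};

// Hypothetical store, not the actual CGProf class.
class MethodStore {
public:
    // Returns the record for methodId, creating it on first use.
    // A repeated event for the same id returns the existing record
    // instead of deleting it and storing a new one.
    const MethodData* GetOrAdd(std::uint64_t methodId, const std::string& name) {
        std::lock_guard<std::mutex> guard(lock_);  // one lock hold for check and insert
        auto it = records_.find(methodId);
        if (it == records_.end()) {
            it = records_.emplace(
                methodId,
                std::make_unique<MethodData>(MethodData{methodId, name})).first;
        }
        assert(it->second != nullptr);  // stored records are never null
        return it->second.get();
    }

    // Number of distinct methods currently stored.
    std::size_t Size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return records_.size();
    }

private:
    mutable std::mutex lock_;
    std::unordered_map<std::uint64_t, std::unique_ptr<MethodData>> records_;
};
```

With this shape, ten threads that receive a NewMethod event for the same 'run' method would all get the same record back, and duplicate events neither leak memory nor invalidate pointers already in use. The records live as long as the store, which is the trade-off of never deleting on a duplicate event.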

*** Bug 190194 has been marked as a duplicate of this bug. ***

The patch is checked in CVS.

Resolving as fixed since patch committed

As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant enhancements/defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement/defect is verified/closed by the Project Lead since this enhancement/defect has been resolved and unverified for more than 1 year and considered to be fixed. If this enhancement/defect is still unresolved and reproducible in the latest TPTP release (http://www.eclipse.org/tptp/home/downloads/), please re-open. |
Driver: 4.4.0-200705080100A (TPTP 4.4.i3 Candidate Build, Patch A)
O/S: Windows 2003 Server
Platform: Dual Intel Xeon HT 3.2 GHz (4 virtual cores)
JVM: reproduced with Sun and JRockit 1.5 for Windows IA-32 (latest releases)

Crash in standalone Aggregated CGProf when using the org.eclipse.tptp.scenario.thread.PowerWorkload test scenario.

Command line for reproducing the error:
-cp <test framework dir>\bin -agentlib:JPIBootLoader=JPIAgent:server=standalone,file=trace.trcxml;CGProf org.eclipse.tptp.scenario.thread.PowerWorkload 10

Stack trace of the crashing thread:
strlen() line 66
Martini::JPIAgent::CPrint::FormatName(const char * 0xfeeefeee) line 258 + 9 bytes
Martini::JPIAgent::CPrintXML::printNewMethodElement(unsigned __int64 65875, Martini::MPI::SMethodInfo * 0x4382ade0) line 213 + 15 bytes
PrintMethodDefElement(Martini::JPIAgent::EC_Env * 0x41cdb660, unsigned __int64 65875, Martini::MPI::SMethodInfo * 0x4382ade0) line 282 + 30 bytes
Martini::JPIAgent::EC_Env::PrintMethodDefElement(unsigned __int64 65875, Martini::MPI::SMethodInfo * 0x4382ade0) line 118 + 26 bytes
Martini::CGProf::CNewMethodEvent::HandleEvent(Martini::MPI::SNewMethodEventData & {...}) line 64
Martini::JPI::CNewMethodEventDispatcher::Notify(Martini::JPI::SEmData * 0x43cfe2ac, Martini::MPI::IEventObserver * 0x41d49588, unsigned int 8) line 422 + 17 bytes
Martini::JPI::CEventDispatcher::NotifyObservers(Martini::JPI::SEmData * 0x43cfe2ac) line 248 + 30 bytes
Martini::JPI::CEventManager::NotifyMpiEvent(unsigned int 3, Martini::JPI::SEmData * 0x43cfe2ac) line 881
Martini::JPI::CEventManager::NewMethodEvent(unsigned __int64 65875, const JNIEnv_ * 0x438eaf74) line 1391
MethodEnterHandler(JNIEnv_ * 0x438eaf74, _jobject * 0x43cfe3b4, unsigned char 0, long 65875) line 1449

It seems that the content of the methodInfo pointer passed from the CGProf New Method Event handler is corrupted by another thread. An initial investigation suggests that the Martini GetMethodInfo API returns correct data.
Assigning to Slava for further investigation.