| Summary: | [POG] Fail to attach to process launched with IBM JDK and JVMTI agent. | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | z_Archived | Reporter: | Eugene Chan <ewchan> | ||||||
| Component: | TPTP | Assignee: | Asaf Yaffe <asaf.yaffe> | ||||||
| Status: | CLOSED FIXED | QA Contact: | |||||||
| Severity: | critical | ||||||||
| Priority: | P1 | CC: | analexee, asaf.yaffe, chris.l.elford, jkubasta, rashraf | ||||||
| Version: | unspecified | Flags: | asaf.yaffe: review? (chris.l.elford) | ||||||
| Target Milestone: | --- | ||||||||
| Hardware: | PC | ||||||||
| OS: | Windows XP | ||||||||
| Whiteboard: | |||||||||
| Attachments: | |||||||||
|
Description
Eugene Chan
Created attachment 87565 [details]
patch from Asaf
With the attached patch from Asaf, the attach works as expected.
However, start/stop monitoring after attach does not collect any information.
With the patch, the application intermittently shows problems after attach: an error similar to
Exception in thread "AWT-EventQueue-0" java.lang.InternalError: Non-native Primitive invoked natively
is thrown in the console view, the target application freezes, and no user interaction is possible.
*** Bug 215436 has been marked as a duplicate of this bug. ***
Created attachment 87977 [details]
Updated patch.
A patch to solve the problem (attached for the purpose of code review).
Important notes:
* For execution time analysis (CGProf): the patch guarantees that filtered-out classes will not be redefined. Therefore, if redefinition still does not work, try to filter out the problematic classes using the regular filtering mechanism.
* For Heap analysis (HeapProf): the user-defined filter affects the *types* of objects to track, and not the classes which are instrumented. Therefore, filters cannot be used to control the list of classes being instrumented/redefined. The patch excludes (hard-coded list) all "java.lang.*" classes to overcome the known limitation of the IBM 1.5 JVM.
* There are still some open issues with class redefinition in IBM 1.5 SR6. For example, the Java2D demo application (from the Java JDK) may not work properly after attach/detach.
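For illustration, the hard-coded exclusion described in the HeapProf note could be sketched roughly as follows. This is a minimal, self-contained sketch and not the code in the attached patch: the names kExcludedPackagePrefixes, IsExcludedFromRedefinition and SelectRedefinableClasses are hypothetical, and plain prefix matching on dotted class names (which also covers subpackages of java.lang) is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical hard-coded exclusion list. The bug report only names
// "java.lang.*"; anything else would be an additional assumption.
static const std::vector<std::string> kExcludedPackagePrefixes = {"java.lang."};

// Returns true if className (dotted form, e.g. "java.lang.String") starts
// with one of the excluded package prefixes. Prefix matching also excludes
// subpackages such as java.lang.reflect (assumed behaviour).
bool IsExcludedFromRedefinition(const std::string& className) {
    for (const std::string& prefix : kExcludedPackagePrefixes) {
        if (className.compare(0, prefix.size(), prefix) == 0) {
            return true;
        }
    }
    return false;
}

// Keeps only the classes that may be handed on for redefinition,
// preserving their original order.
std::vector<std::string> SelectRedefinableClasses(
        const std::vector<std::string>& loadedClasses) {
    std::vector<std::string> redefinable;
    for (const std::string& name : loadedClasses) {
        if (!IsExcludedFromRedefinition(name)) {
            redefinable.push_back(name);
        }
    }
    return redefinable;
}
```

In this sketch the exclusion is applied independently of the user-defined filter, which matches the note that HeapProf filters select object types rather than the classes being instrumented.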
Chris, can you please review the patch?
Patch checked in to the 4.4.1 branch. Pre-approved by PMC.
Patch checked in to the 4.5 (HEAD) branch.
I'm not sure if in CGAdapter.cpp
LOG_INFORMATIVE4("CGAdaptor", 0, false,
"Error instrumenting '%s.%s(%s)'. Reason: %x",
classInfo.szClassName, pMethodInstrumentationInfo->szName,
pMethodInstrumentationInfo->szSignature, res);
should be more aggressive than an informative message? If the pMethIter->HasNext() loop wants to instrument 2 methods and the first succeeds but the second fails, won't this still return successfully in some way? I'd think there should be a MRTE_ERROR_FAIL or something here. Unfortunately, I'm not quite sure how to put it in, since you may have already instrumented a few methods...
It seems the assumption is that instrMethodCount will be either zero or the number of methods desired. This may well be correct (since all methods should fail to instrument on flawed classes). In that case, there is no need for a failedMethod flag of some sort to be set before the break, and you can use the count as you are doing.
This question ties into the consumption in JPI/DataManager.cpp... I would think we would want the failing IBM VM case to fall into the
// Error instrumenting class. This is a fatal error.
MARTINI_ERROR1("CDataManager",
"Failed to instrument class '%s'", pClassInfo->szNativeName.Get());
case in this file instead of the
else if (MRTE_ERROR_INSTRUMENTATION_NOT_NEEDED == iRes
|| MRTE_ERROR_UNABLE_TO_INSTRUMENT == iRes)
As I read it, it looks like it will return INSTRUMENTATION_NOT_NEEDED instead of MRTE_ERROR_FAIL. I may be reading this wrong though so feel free to correct me. :-)
There is a related question for JavaInstrumentorManager.cpp but I suspect that you will have a good answer to my confusion above that will also satisfy any concerns I have there as well... :-)
Thx,
Chris
(In reply to comment #8)
> I'm not sure if in CGAdapter.cpp
> LOG_INFORMATIVE4("CGAdaptor", 0, false,
> "Error instrumenting '%s.%s(%s)'. Reason: %x",
> classInfo.szClassName,
> pMethodInstrumentationInfo->szName,
> pMethodInstrumentationInfo->szSignature, res);
>
> should be more aggressive than an informative message?
If the call to InstrumentMethod (just above the logging code) fails, the error code is stored in the 'res' variable, the loop ends, and the DoCallGraphInstrumentation() method returns an error code. It is then up to the consuming code to decide what to do with this failure. This is also the reason for only logging the error and not terminating the application (which is what happens when the LOG_ERROR macro is used).
As for the consuming code:
JavaInstrumentationManager.cpp: in the ClassFileLoadHook event handler, instrumentation failures are logged but the application keeps running (with the non-instrumented class). Based on past experience, this gives a better user experience because most instrumentation failures are actually caused by bugs in the instrumentation (and I do not want my bugs to cause crashes in the profiled application).
In the logic for redefining a class (same case as in DataManager.cpp): failure to redefine means an inconsistent class state. This is a serious error, and therefore the application will be terminated in such a case.
> This question ties into the consumption in JPI/DataManager.cpp... I would
> think we would want the failing IBM VM case to fall into the
>
> // Error instrumenting class. This is a fatal error.
> MARTINI_ERROR1("CDataManager",
> "Failed to instrument class '%s'",
> pClassInfo->szNativeName.Get());
>
> case in this file instead of the
> else if (MRTE_ERROR_INSTRUMENTATION_NOT_NEEDED == iRes
> || MRTE_ERROR_UNABLE_TO_INSTRUMENT == iRes)
>
Do not confuse instrumentation failures with class redefinition failures (redefinition happens only if the class was successfully instrumented).
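The error propagation described in this reply can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual CGAdapter.cpp or JavaInstrumentationManager.cpp code: the Result and ConsumerAction enums and the functions InstrumentAllMethods and OnClassFileLoad are hypothetical stand-ins, and the log is modelled as a plain vector of strings.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result codes standing in for the MRTE_* values in the thread.
enum Result { RESULT_OK, RESULT_FAIL };

// Hypothetical decisions a load-hook consumer can take.
enum ConsumerAction { USE_INSTRUMENTED_CLASS, KEEP_ORIGINAL_CLASS };

// Instruments the methods in order. On the first failure the error is only
// logged (the process is not terminated), the loop ends, and the failing
// code is returned so the caller can decide what to do.
Result InstrumentAllMethods(
        const std::vector<std::string>& methods,
        const std::function<Result(const std::string&)>& instrumentOne,
        std::vector<std::string>& log) {
    Result res = RESULT_OK;
    for (const std::string& method : methods) {
        res = instrumentOne(method);
        if (res != RESULT_OK) {
            log.push_back("Error instrumenting '" + method + "'");
            break;
        }
    }
    return res;
}

// Consumer modelled on the load-hook behaviour described above: a failed
// instrumentation keeps the application running with the original class.
ConsumerAction OnClassFileLoad(Result instrumentationResult) {
    return instrumentationResult == RESULT_OK ? USE_INSTRUMENTED_CLASS
                                              : KEEP_ORIGINAL_CLASS;
}
```

Under these assumptions, a failure on the second of three methods ends the loop after two calls and yields RESULT_FAIL, so a partially instrumented class is not reported as a success.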
The IBM JVM problems are related only to redefinition. I am not aware of any instrumentation limitations.
Hi Asaf,
I am just curious to find out what the known limitations of the IBM JVM were. Are these fixed now? Why was java.lang hardcoded to be filtered out?
thanks
(In reply to comment #10)
Our experiments show that the IBM JVMs (both 1.5 and 1.6) have problems with RedefineClasses for certain classes. A possible indication of this is the fact that the IBM JVM does not support the can_redefine_any_class capability, even in the latest 1.6 version. Some of the problems with RedefineClasses (mainly JVM crashes) were resolved in 1.5 SR6, but there are still problems. We found that if we exclude all classes the VM loads before VMInit (pretty much all java.lang.* classes and a few other packages) we can get some dynamic-attach scenarios to work without problems.
Another interesting observation with the latest IBM JVM versions is that the errors we get when using RedefineClasses are Java-level errors: the JVM does not crash, but it starts to throw Java exceptions about incorrect method identifiers and other weird things (see for example bug 217717). Maybe RedefineClasses causes some inconsistencies between instrumented and non-instrumented classes? I don't have a clue...
We forwarded our findings to the IBM JVM team but didn't get conclusive answers. The JVM team thinks there's a problem with our use of RedefineClasses. What we know for sure is that these problems are specific to IBM JVMs. Sun and BEA 1.5 JVMs support the can_redefine_any_class JVMTI capability and never crash on RedefineClasses...
Thanks,
Asaf
Thanks Asaf,
I think we should restart the discussions with the JVM team. I will discuss this with Eugene.
I am guessing that we saw crashes happening after calls to RedefineClasses from which we concluded that this is the actual problem? Is this specific to enable mode and not the other ones?
(In reply to comment #12)
> I am guessing that we saw crashes happening after calls to RedefineClasses from
> which we concluded that this is the actual problem? Is this specific to enable
> mode and not the other ones?
Correct. These problems happen only in dynamic-attach scenarios (e.g. enabled mode or detach during profiling) and are specific to IBM JVMs.
CLOSE BUG |