| Summary: | [regression] Running a class in enabled mode in the workbench | | |
|---|---|---|---|
| Product: | z_Archived | Reporter: | Navid Mehregani <nmehrega> |
| Component: | TPTP | Assignee: | Igor Alelekov <igor.alelekov> |
| Status: | CLOSED FIXED | QA Contact: | |
| Severity: | major | | |
| Priority: | P1 | CC: | andrew.kaylor, duncan, haggarty, smith, vlegros |
| Version: | unspecified | Keywords: | plan |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Windows XP | | |
| Whiteboard: | closed460 | | |
| Attachments: | | | |

Description
Navid Mehregani
Created attachment 41528: Error Screenshot

Note that the error message also needs to be updated, since there doesn't seem to be a dataChannelSize in serviceconfig.xml for the new AC.

We are getting RC=-3 from ra_attachToShm when the piAgent first tries to connect to the Agent Controller (via BC). It seems to be failing in the shared memory code. A scan of the code suggests that the error is an OSS_ERR_NOTFOUND after a CreateFileMapping(), OpenFileMapping(), or shmget() call from ipcMemAttach in ossipcmemory.cpp. The code seems to be substantially the same for both AC and RAC, but the problem has so far only surfaced with the AC. Navid can reproduce this problem solidly on his machine, but it is intermittent on mine (once out of every several attempts). I do get alternative symptoms solidly: either the missing MethodA as reported by Navid or, often, no data in the execution stats view at all. Transferring to Platform.Collection as a shared memory problem.

I originally overestimated the severity of this bug. This problem is actually just causing the 'ClientConnectionResetTestAttach' test case to fail in the automation framework. This seems to be a regression; I didn't experience this problem before. I'm reducing the severity to major. Note that you can use the automated framework to reproduce this bug. Just run the ClientConnectionResetTestAttach test case.

I've tried this with the old RAC and it does NOT seem to suffer from this problem. It's only the new AC that's experiencing this.

Note that when you attach to the profiling console, Eclipse automatically switches the console view to the console for the agent. There is a button on the console to switch back to the console of StartStop. Please see the attached image.

Created attachment 41921: Console Switch

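To make the "not found" symptom above concrete, here is a minimal sketch in Python of what an attach looks like when the named segment no longer exists. It is an illustration only, not the TPTP code: the thread's code path is C++ (`ipcMemAttach` calling `CreateFileMapping()`, `OpenFileMapping()`, or `shmget()`), whereas this uses Python's standard `multiprocessing.shared_memory` module. The helper names `try_attach` and `demo` are hypothetical.

```python
from multiprocessing import shared_memory


def try_attach(name):
    """Return True if a segment called `name` exists and can be attached.

    Hypothetical helper: attaching to a missing segment raises
    FileNotFoundError, which plays the role of a "not found" return code.
    """
    try:
        shm = shared_memory.SharedMemory(name=name, create=False)
    except FileNotFoundError:
        return False
    shm.close()
    return True


def demo():
    # The "controller" side creates the segment.
    owner = shared_memory.SharedMemory(create=True, size=1024)
    name = owner.name

    # Attaching while the segment exists succeeds.
    before = try_attach(name)

    # The owner destroys the segment before the next attach attempt.
    owner.close()
    owner.unlink()

    # Attaching after destruction fails with "not found".
    after = try_attach(name)
    return before, after
```

If the owner removes the segment between creation and the client's attach, the client sees exactly this kind of failure even though the segment did exist a moment earlier.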
This problem seems to only occur with Sun JDK 1.5. I can't reproduce the problem on Sun 1.4 or IBM 1.4/1.5.

Hendra has also reproduced this problem on his laptop. I am not sure if this is useful information or not, but we observed that the shared memory was created by the AC and then disappeared quickly. It seemed that the shared memory could have been deleted soon after it was created.

Setting target and priority.

I have gone through these steps and been successful each time. It is possible that the work we have done with removing static buffers has fixed this issue.

I'm still able to reproduce this problem with the TPTP-4.2.0-200605190100 driver. Note that I consistently get this problem when I run the 'ClientConnectionResetTestAttach' automated test case. Can you try running this test case to see what happens?

I have been testing using ClientConnectionResetTestAttach... and I am seeing a failure. I believe this to be a regression from earlier runs I have done... but I have not run this test for a while. The test suite tries to verify that a method has been executed, and the trace dump does not look accurate (it doesn't match the expected results, which look correct to me). Investigating further.

There was an error with how BC was setting up shared memory. This allowed a race condition where we started a flusher thread before the agent was ready to monitor, so the memory flusher exited (because nobody was attached), and then we destroyed the shared memory before the agent's attach was completed.

Kevin, I'm still able to reproduce this problem with the TPTP-4.2.0-200605251528 driver and SUN JDK 1.5 specified in my local_config_file.xml. It either gives me the shared memory problem or methodA can't be found. I've also reproduced it on my desktop. If you're not observing the problem, it could be due to the difference in the speed of our computers and how fast context switching is done.

Andy - please revisit this in Kevin's absence.

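The race described above (flusher started before the agent attaches, flusher exits because nobody is attached, shared memory destroyed before the attach completes) can be sketched as a small simulation. This is a hedged illustration in Python, not the BC code and not necessarily the fix that was eventually checked in: `Channel`, `flusher`, and `run` are hypothetical names, and "wait for the attach before deciding to tear down" is just one possible ordering fix.

```python
import threading


class Channel:
    """Hypothetical stand-in for a shared-memory data channel."""

    def __init__(self):
        self.exists = True
        self.attached = threading.Event()

    def attach(self):
        # Mirrors the "not found" failure when the segment is already gone.
        if not self.exists:
            raise FileNotFoundError("shared memory already destroyed")
        self.attached.set()


def flusher(channel, wait_for_attach, timeout=1.0):
    """Flusher reduced to its exit path (assumption: the real loop is omitted)."""
    if wait_for_attach:
        # Possible fix: give the agent a chance to attach first.
        channel.attached.wait(timeout)
    if not channel.attached.is_set():
        # Nobody attached, so the flusher exits and the channel is torn down.
        channel.exists = False


def run(wait_for_attach):
    """Return True if the agent's attach succeeds."""
    channel = Channel()
    worker = threading.Thread(target=flusher, args=(channel, wait_for_attach))
    worker.start()
    if not wait_for_attach:
        # Force the bad interleaving: the flusher finishes before the attach.
        worker.join()
    try:
        channel.attach()
        ok = True
    except FileNotFoundError:
        ok = False
    worker.join()
    return ok
```

Here `run(False)` reproduces the failing interleaving deterministically, while `run(True)` shows the attach succeeding once the teardown decision waits for the agent. On real hardware the bad interleaving depends on scheduling, which would be consistent with the failure being solid on one machine and intermittent on another.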
I just tested this again and I can't seem to reproduce it. For now, I'll mark this as fixed and reopen it if I come across the problem again.

Verified.

This problem is caused by a race condition. In some cases it seems to work fine, but the problem still exists. Kevin has been able to reproduce the problem on his machine, so I'll reopen the bug again.

Retargeting - If time allows, will try to determine if further refinement of the fix can be done in i4.

The failures are rare enough that I don't think this issue needs resolving in 4.2, and we are out of time. It is a candidate for 4.2.1, so setting the target to that for now.

I believe this to be a timing issue. The AC is passing back all the profiled data that it is being sent, but the method counts are incorrect and "methodA" does not have a corresponding methodEntry and methodExit. I have only been able to get this test to fail as part of the test framework (Navid mentioned it is possible to time things to get it to fail using the GUI). I am retargeting this bug to 4.3 and will look into getting some piAgent and probekit assistance to debug the issue further. Looking into old Bugzilla reports, I noticed that the RAC had some similar timing issues that were resolved; this might be a direction to pursue.

Resetting priority to P2. Unless resources free up to assist in looking at this issue, it will not be fixed in 4.3.

Retargeting to 4.4 as 4.3 is closing down to all non-essential bug fixing.

Reassign owner and set priority to P1.

Added effort estimate: 10 days

Created attachment 61096: patch

Andy, could you review the patch?

Checked in Igor's fix.

Allan, can you please verify the fix for this with the automation framework?

Using the 4.4.0GA candidate and Sun jdk 1.5.0_12, I get the error "[ClientConnectionResetTestAttach] methodEntry of methodA was not found" about 1/3 of the time, and not the shared memory error. Note that in 4.4 we are now using jre 1.4.2 to test piagent. I used 1.5 specifically to try this.

Alan, are you using the same test case?

Yes - ClientConnectionResetTestAttach

As of TPTP 4.6.0, TPTP is in maintenance mode and focusing on improving quality by resolving relevant enhancements/defects and increasing test coverage through test creation, automation, Build Verification Tests (BVTs), and expanded run-time execution. As part of the TPTP Bugzilla housecleaning process (see http://wiki.eclipse.org/Bugzilla_Housecleaning_Processes), this enhancement/defect is verified/closed by the Project Lead since this enhancement/defect has been resolved and unverified for more than 1 year and considered to be fixed. If this enhancement/defect is still unresolved and reproducible in the latest TPTP release (http://www.eclipse.org/tptp/home/downloads/), please re-open.