A segmentation fault sometimes occurs when attempting to open an experiment that includes a trace in version 2.3 format.
Environment used:
- In a virtual machine using VirtualBox 3.0.12
- Using Ubuntu 9.10
- 32-bit PC
- Eclipse Version: 3.6.0.v20090930-7b7kFHlFEx2XkxZQja7HFJ3 Build id: I20100312-1448
- JVM Sun 1.5.0.18
The problem has been reproduced by opening the same experiment multiple times, or by switching from an experiment using traces in 2.6 format to an experiment using traces in 2.3 format.
Created attachment 171564 [details] seen when selecting the same experiment multiple times (with traces in v2.3 format)
Created attachment 171565 [details] switching 2.6 to 2.3
The bug is related to the maximum number of open files. For example, a trace with all channels activated on an 8-core system yields a trace directory with 176 files, all of which are opened at reading time. The default open-file limit for a process on Linux is 1024. Since Eclipse itself uses about 240 files, there are 784 file descriptors left for traces, which is enough for only 4 "full" traces on an 8-core system. The provided patch closes the traces of the current experiment before opening new ones. Hence, it prevents the crash caused by opening and reopening traces multiple times. This bug will also occur in experiments that open multiple traces exceeding the open-file limit.
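To make the arithmetic above concrete, here is a minimal sketch of the descriptor budget. The class and method names (`FdBudget`, `remainingDescriptors`, `fullTraces`) are hypothetical and not part of the LTTng plug-in; the figures (1024, 240, 176) are the ones quoted in this comment, and real values will vary by system and trace configuration.

```java
// Hypothetical helper, for illustration only: not part of the LTTng plug-in.
public final class FdBudget {

    private FdBudget() {
    }

    /** Descriptors left for traces once the host process (here Eclipse) has taken its share. */
    public static int remainingDescriptors(int processLimit, int usedByHost) {
        return Math.max(0, processLimit - usedByHost);
    }

    /** Number of complete traces that fit in the budget; integer division drops any partial trace. */
    public static int fullTraces(int processLimit, int usedByHost, int filesPerTrace) {
        if (filesPerTrace <= 0) {
            throw new IllegalArgumentException("filesPerTrace must be positive");
        }
        return remainingDescriptors(processLimit, usedByHost) / filesPerTrace;
    }

    public static void main(String[] args) {
        int limit = 1024;         // default per-process limit quoted in this comment
        int eclipseUsage = 240;   // approximate figure quoted in this comment
        int filesPerTrace = 176;  // all channels enabled, 8 cores

        System.out.println("Descriptors left for traces: "
                + remainingDescriptors(limit, eclipseUsage));
        System.out.println("Full traces that fit: "
                + fullTraces(limit, eclipseUsage, filesPerTrace));
    }
}
```

With these figures, 784 descriptors remain and 784 / 176 rounds down to 4, so a fifth full trace would run past the limit.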
Created attachment 193888 [details] close opened trace before opening new one
I'm not sure about this one. The code you add clears the traces of a 'TmfExperiment' while we are managing an 'LTTngExperiment', two different beasts that shouldn't mix. In fact there should not be any instance of a TmfExperiment in the LTTng application. If there is, we have a much more serious problem (that could explain the multiplication of file handles). I will run a few tests to see if this is the case. I will also try to get a hold of a 2.3 trace. You wouldn't have one handy, would you? If so, could you attach it to the bug? Thanks.
I ran the thing in Debug and everything looks OK. It seems that the only experiment live at any time is an LTTngExperiment (which extends TmfExperiment). So the proposed patch has practically no effect since, in practice, fCurrentExperiment == TmfExperiment.getCurrentExperiment(). There is a slight side effect where the (empty) checkpoints table is cleared again. Another side effect is that the TmfExperimentDisposedSignal is issued twice but that should not be too serious. A test with various trace formats is the next step.
Ugh... last patch is obviously silly, sorry for the inconvenience. I retested and confirm the bug when an experiment contains more traces than the maximum allowed number of open files. I can reproduce with 2.6 traces. The original bug looked a lot like the problem I had because the same function segfaults:
C [liblttvtraceread.so+0x6087] Java_org_eclipse_linuxtools_lttng_jni_JniEvent_ltt_1positionToFirstEvent+0x7
Unsetting target milestone for old bugs.
Legacy LTTng support is being removed in Linux Tools 2.0.