Community
Participate
Working Groups
Build Identifier: Elf executable can have multiple symbols at the same address and they may have zero size. ElfExecutableSymbolicsReader deliberately discards zero length symbols at an address it already has a symbol for. There is a comment which explains why: // Multiple symbol entries for the same address are generated. // Do not add duplicate symbols with 0 size to the list since it confuses // debugger However I need these other discarded symbols as they contain information which is required. Hence I want to patch ExecutableSymbolicsReader to not discard this information. How does it confuse the debugger? It does not cause any tests to fail (except the one which check that the number of symbols is the same as last time). Reproducible: Always
Ed, adding you to the CC: list so that you might comment on this, given your name on the comments in question when i annotate ElfExecutableSymbolicsReader.java .
From what I remember, one of the ARM compilers we use generates a ton of empty symbols and we could not distinguish these from "real" symbols at the given address. The main problem is, there is the API "ISymbol IExecutableSymbolicsReader#getSymbolAtAddress(IAddress)". By design it only returns one entry. So, unless we filter the symbols, this could potentially return a compiler-generated symbol instead of a "real" user symbol. This could potentially impact stack traces or lookups for the startup breakpoint function name, for example. Also, given their large numbers, there was a significant memory impact generating these symbols for some large executables. Since at the time we did not need them, it seemed like a win in both cases to ignore them. Probably the appropriate patch would be to store a multimap (TreeMap<IAddress, List<ISymbol>>) and add an "ISymbol[] IExecutableSymbolsReader#getSymbolsAtAddress(IAddress)" API. The old API should remain, but return only the symbol that the current implementation returns, which I hope would be the first in the List<ISymbol> above. The multimap would be needed in any case. In addition, it would be nice to see: (1) if the number of extra required symbols can be limited, rather than adding every zero-length symbol (due to the memory impact), by modifying the current tests or (2) if there could be some way to pass information about whether the zero-length symbols are needed. (I.e. if you directly use ExecutableSymbolicsReaderFactory, which is internal, then you could pass a new flag asking for all the zero-length symbols, leaving the rest EDC unaffected. Maybe the setting for this flag can be detected based on architecture, OS, debug format, etc.)
Created attachment 201127 [details] Add getSymbolsAtAddress method and record zero size symbols occuring at the same address I have tried to implement that. I have aimed at option 2 and have a field which specifies whether repeat zero size symbols should be included. However since I do not use ExecutableSymbolicsReaderFactory directly but come in via Symbols.getSymbolReader(IPath) working out how to pass the flag is more complex. While we could use the cpu type to determine whether to include the repeat symbols or not this would not reflect causation correctly. Whether the repeat zero size symbols should be included is a function of what the user wants from the API rather than a function of information contained in the executable. At present the boolean field correctly controls behaviour and if set to false the old behavior is used and the same number of symbols are returned (according to the tests).
> At present the boolean field correctly controls behaviour and if set to false > the old behavior is used and the same number of symbols are returned (according > to the tests). What's the difference in the number of symbols when the flag is set? (Maybe try the binaries under org.eclipse.cdt.edc.debug.tests.) > symbols or not this would not reflect causation correctly. Whether the repeat > zero size symbols should be included is a function of what the user wants from > the API rather than a function of information contained in the executable. Yes, you're right -- it's really up to the client whether the zero-size symbols are needed. I'm wondering, are you working on a custom debugger service which will use #getSymbolsAtAddress(), or will this be in EDC proper? I.e. what are the use cases for this? If there really are a much larger number of zero-size symbols than not, I'm thinking out loud here about other directions, if we want to reduce the memory impact of just keeping your new behavior enabled by default for all readers: -- the Elf class from which ElfExecutableSymbolicsReader works does contain all the symbols. Can you instantiate and use that directly for your use case? Or would your use case affect core EDC services functionality? -- or, do the symbols that you need follow a pattern, where most of the zero-size symbols can still be ignored? -- or, would it make sense to fetch the full address -> symbol list map only on demand if a client actually calls the getSymbol_s_AtAddress method? I.e. initially fetch only the one-to-one map rather than the one-to-N map, and refetch if the other API is used? (This would complicate things since the Elf object is only passed to the constructor.) --- As for the style of the patch itself: -- good call switching to TreeMap over the manual sorting and binary searching. Not sure why I never noticed IAddress implemented Comparable<>! -- please use ArrayList, not LinkedList. LinkedList only makes sense if you're planning to remove elements from the list. Otherwise ArrayList is faster and uses less memory, assuming you construct with initialCapacity==1. -- please refactor the logic, so the BaseExecutableSymbolicsReader handles the data structures (adding symbols to the map of lists, searching for them, gathering them, culling them for #getSymbolAtAddress), and leaving the ELF/PE specific symbol stuff to those concrete classes. As it is, essentially the same code is duplicated twice, which will be a maintenance problem. -- no new unit tests? Really? :) Even for a prototype, I'd expect you to add tests to prove that your use cases are satisfied and that the #getSymbolAtAddress API works the same way whether or not zero-size symbols have been parsed.
Created attachment 201381 [details] Add getSymbolsAtAddress method and record zero size symbols occuring at the same address (In reply to comment #4) > > At present the boolean field correctly controls behaviour and if set to false > > the old behavior is used and the same number of symbols are returned (according > > to the tests). > > What's the difference in the number of symbols when the flag is set? (Maybe > try the binaries under org.eclipse.cdt.edc.debug.tests.) I tested four different situations. The current behaviour, allowing repeat symbols but excluding those beginnign with C$debug_ (gcce appears to have a bug where it produces lots of these) or which are symbols representing files or sections, excluding which represent files or sections, and including all symbols. The results are as follows: Symbol source : no repeat zero/no C$debug_/no non ED/repeat zero BlackFlagMinGW.exe : 2520/2520/2520/2520 BlackFlag_gcce.sym : 2372/2387/2908/2941 BlackFlag_linuxgcc.exe : 429/443/443/469 BlackFlag_rvct.sym : 626/718/718/750 HelloWorld_rvct_2_2.exe.sym : 151/182/182/194 HelloWorld_rvct_4_0.exe.sym : 227/261/261/332 QtConsole_gcce_343.sym : 509/524/997/1008 > > symbols or not this would not reflect causation correctly. Whether the repeat > > zero size symbols should be included is a function of what the user wants from > > the API rather than a function of information contained in the executable. > > Yes, you're right -- it's really up to the client whether the zero-size symbols > are needed. I'm wondering, are you working on a custom debugger service which > will use #getSymbolsAtAddress(), or will this be in EDC proper? I.e. what are > the use cases for this? I am working on a custom plugin/product so no not in EDC proper (though testing reveals that possibly getSymbolsAtAddress() should be used in EDC proper as well and possibly getSymbolAtAddress() deprecated). > If there really are a much larger number of zero-size symbols than not, I'm > thinking out loud here about other directions, if we want to reduce the memory > impact of just keeping your new behavior enabled by default for all readers: Does doubling count as a much larger number? > -- the Elf class from which ElfExecutableSymbolicsReader works does contain all > the symbols. Can you instantiate and use that directly for your use case? Or > would your use case affect core EDC services functionality? Potentially though that would involve duplicating the functionality in readSymbols. I would much rather find a way of making it just work. > -- or, do the symbols that you need follow a pattern, where most of the > zero-size symbols can still be ignored? They don't particularly follow a pattern and since the symbols are specified by the user (both in their linker scripts and (eventually) when using the plugin) no pattern can be guaranteed. However the pattern of symbols erroneously produced by gcce can be matched against and excluded if necessary. > -- or, would it make sense to fetch the full address -> symbol list map only on > demand if a client actually calls the getSymbol_s_AtAddress method? I.e. > initially fetch only the one-to-one map rather than the one-to-N map, and > refetch if the other API is used? (This would complicate things since the Elf > object is only passed to the constructor.) That would be rather messy and it looks like getSymbolsAtAddress should be used all the time as even in the current case there can be more than one symbol at an address (see test patch) > --- > > As for the style of the patch itself: > -- please use ArrayList, not LinkedList. LinkedList only makes sense if you're > planning to remove elements from the list. Otherwise ArrayList is faster and > uses less memory, assuming you construct with initialCapacity==1. Yes sorry. > -- please refactor the logic, so the BaseExecutableSymbolicsReader handles the > data structures (adding symbols to the map of lists, searching for them, > gathering them, culling them for #getSymbolAtAddress), and leaving the ELF/PE > specific symbol stuff to those concrete classes. As it is, essentially the > same code is duplicated twice, which will be a maintenance problem. Done, though some functionality has to be almost duplicated in PEExecutableSymbolicsReader until another bug is fixed so that BaseExecutableSymbolicsReader functionality will work. > -- no new unit tests? Really? :) Even for a prototype, I'd expect you to add > tests to prove that your use cases are satisfied and that the > #getSymbolAtAddress API works the same way whether or not zero-size symbols > have been parsed. I have added some tests on existing behavior in terms of symbol collection on both getSymbolAtAddress and getSymbolsAtAddress. They both fail. getSymbolAtAddress fails because one of my tests catches a case where there is a zero length and a non-zero length symbol at the same address and the zero length symbol occurs first and so is the only one returned. getSymbolsAtAddress fails because a different address is given for a symbol in readelf to that given by EDC expected 0x00008015 and got 0x00008014.
Created attachment 201382 [details] (failing) Tests on getSymbolAtAddress and getSymbolsAtAddress
Created attachment 201383 [details] Refactor out duplicate functionality from ArmElfExecutableSymbolicsReader
Thanks, looks better. The comment trail is getting kind of large so I'll abbreviate -- hope I don't miss anything big. -- Ok, the extra symbols (C$debug...) only seems to affect GCC builds, and those symbols seem to be used for source lookup when the .sym file is not available. This wouldn't affect performance of big executables for us (RVCT doesn't do this). I'd be okay if you excluded those and commented why they're excluded. If we need them later, we'd know where to look. -- I see that EDC only currently uses this API for resolving function names in a stack frame where no .sym file is available, or as stage 1 of detecting whether a symbol is ARM/Thumb. Indeed, it will *reread the ELF file directly* in most cases to find the $a or $t symbol, outside IExecutableSymbolicsReader (!). Your patch would be the first step for us to avoid that horrible doubling of work and memory. > > -- please refactor the logic, so the BaseExecutableSymbolicsReader handles the > > data structures (adding symbols to the map of lists, searching for them, > > Done, though some functionality has to be almost duplicated in > PEExecutableSymbolicsReader until another bug is fixed so that > BaseExecutableSymbolicsReader functionality will work. Ok. > I have added some tests on existing behavior in terms of symbol collection on > both getSymbolAtAddress and getSymbolsAtAddress. They both fail. > getSymbolAtAddress fails because one of my tests catches a case where there is > a zero length and a non-zero length symbol at the same address and the zero > length symbol occurs first and so is the only one returned. Yes, we expect to see this case of multiple symbols at one address. I think if your patch simply added any non-zero-size symbols before the zero-size symbols in the list part of TreeMap<IAddress,List<ISymbol>>, that would preserve the behavior for the old API (via BaseExecutableSymbolicsReader#getSymbolAtAddress). It would make it easier to resolve the ARM/Thumb logic above as well. > getSymbolsAtAddress > fails because a different address is given for a symbol in readelf to that > given by EDC expected 0x00008015 and got 0x00008014. It seems that for whatever reason (@Warren), ARMSymbol clears the low bit of the address in its constructor and wants clients to use #isThumbMode() instead. I think there were a variety of reasons we had for canonicalizing the addresses this way. I'd say, edit the test so it matches 8014 for now.
Created attachment 201406 [details] Add getSymbolsAtAddress method and record zero size symbols occuring at the same address (In reply to comment #8) More abbreviation. > -- Ok, the extra symbols (C$debug...) only seems to affect GCC builds, and > those symbols seem to be used for source lookup when the .sym file is not > available. This wouldn't affect performance of big executables for us (RVCT > doesn't do this). I'd be okay if you excluded those and commented why they're > excluded. If we need them later, we'd know where to look. If it isn't necessary to exclude them I would rather not exclude them at all. > > I have added some tests on existing behavior in terms of symbol collection on > > both getSymbolAtAddress and getSymbolsAtAddress. They both fail. > > getSymbolAtAddress fails because one of my tests catches a case where there is > > a zero length and a non-zero length symbol at the same address and the zero > > length symbol occurs first and so is the only one returned. > > Yes, we expect to see this case of multiple symbols at one address. I think if > your patch simply added any non-zero-size symbols before the zero-size symbols > in the list part of TreeMap<IAddress,List<ISymbol>>, that would preserve the > behavior for the old API (via > BaseExecutableSymbolicsReader#getSymbolAtAddress). It would make it easier to > resolve the ARM/Thumb logic above as well. I have put the logic in the getSymbolAtAddress method rather than in the adding of the symbol to the list as adding to different ends of the list depending on whether or not the symbol is zero size is more likely to result in odd behavior. > > getSymbolsAtAddress > > fails because a different address is given for a symbol in readelf to that > > given by EDC expected 0x00008015 and got 0x00008014. > > It seems that for whatever reason (@Warren), ARMSymbol clears the low bit of > the address in its constructor and wants clients to use #isThumbMode() instead. > I think there were a variety of reasons we had for canonicalizing the > addresses this way. I'd say, edit the test so it matches 8014 for now. Edited the test and the patch so that the modified addressed used by the ARMSymbol is used to index the ARMSymbol so we can look up using 8014 and get the symbol back.
Created attachment 201407 [details] Tests on getSymbolAtAddress and getSymbolsAtAddress
> > excluded. If we need them later, we'd know where to look. > > If it isn't necessary to exclude them I would rather not exclude them at all. Agreed. > BaseExecutableSymbolicsReader#getSymbolAtAddress). It would make it easier to > > resolve the ARM/Thumb logic above as well. > > I have put the logic in the getSymbolAtAddress method rather than in the adding > of the symbol to the list as adding to different ends of the list depending on > whether or not the symbol is zero size is more likely to result in odd > behavior. Ok, makes sense. > > I think there were a variety of reasons we had for canonicalizing the > > addresses this way. I'd say, edit the test so it matches 8014 for now. > Edited the test and the patch so that the modified addressed used by the > ARMSymbol is used to index the ARMSymbol so we can look up using 8014 and get > the symbol back. Thanks. Looks good! Kirk?
Fixed in: http://git.eclipse.org/c/cdt/org.eclipse.cdt.edc.git/commit/?id=f7c4790c134d0cdc30cdbe8d2a016bf27cd25528