Shared library SIGSEGV on dlopen / static init when calling @plt function

My app dlopens a library that runs static initialization code. The other libraries do the same and load fine before it, but this one crashes during static initialization when it calls a function from another library. The backtrace looks something like this:

0x12311 <-- bad address
_static_initialization_0 <-- function call
....
dlopen

Now, the function call in the disassembly looks like

call _Z6MyFuncRA37_Kc@plt

However, this call ends up jumping to the invalid address 0x12311, i.e. the PLT entry resolves to the wrong address.

The likely cause is that the library in question is effectively third-party: it comes as a prebuilt binary even though it depends on our other libraries. Last week we did a big optimization pass and changed a lot of headers. The function MyFunc, whose PLT entry is wrong, lives in another one of our libraries, which was heavily changed by that optimization work.

How is this possible? Specifically:

  • What is the mechanism that causes the PLT mismatch?
  • Is there a way to fix it without touching the precompiled library? This is optional, since I could get a rebuilt version, but I'm still curious why it crashes.
  • Also, the same app works fine when compiled with -O2 optimization, which I find strange (the binary library is the same in both cases).

    PS: Ubuntu 12.04 x86_64, but the app is i386.

    UPDATE: The suggestion in the comments (since deleted for some reason) to check LD_DEBUG was a good one. With LD_DEBUG=bindings I see this in the crashing version of the app:

     10272:  /media/EXT/work/build32/bin/libMyLib.so: error: 
        symbol lookup error: undefined symbol: omp_set_num_threads (fatal)
    

    It then stops binding libMyLib.so symbols, whereas the non-failing version keeps binding the remaining symbols. But I don't understand why it then continues execution and tries to load the parent library. The dependency chain is as follows:

    libA -> libB -> libMyLib
    

    libMyLib fails (as the LD_DEBUG output above indicates), so the loader skips it, skips libB completely (!), and continues with binding libA's symbols. The non-failing version fully binds libMyLib's symbols, then libB's, and then libA's.
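When the LD_DEBUG log is long, the fatal lookup failures are easy to miss. The following is a minimal Python sketch for pulling them out of a saved log. The function name `find_fatal_lookups` is made up for illustration, and the regular expression only assumes the single-line message format that appears in the LD_DEBUG=all excerpt of this post; a message wrapped across two lines would need to be rejoined first.

```python
import re

# Matches single-line messages of the form seen in the LD_DEBUG=all log:
#   <pid>: <library>: error: symbol lookup error: undefined symbol: <name> (fatal)
FATAL_LOOKUP = re.compile(
    r"^\s*(?P<pid>\d+):\s+(?P<lib>\S+): error: symbol lookup error: "
    r"undefined symbol: (?P<symbol>\S+) \(fatal\)"
)


def find_fatal_lookups(log_text):
    """Return (library, symbol) pairs for every fatal symbol lookup error."""
    results = []
    for line in log_text.splitlines():
        match = FATAL_LOOKUP.match(line)
        if match:
            results.append((match.group("lib"), match.group("symbol")))
    return results


if __name__ == "__main__":
    # Two lines copied from the log excerpt in this post.
    sample = (
        "19225: symbol=omp_set_num_threads; lookup in "
        "file=/usr/lib/i386-linux-gnu/libXdmcp.so.6 [0]\n"
        "19225: /media/EXT/Work/libC.so: error: symbol lookup error: "
        "undefined symbol: omp_set_num_threads (fatal)\n"
    )
    for lib, symbol in find_fatal_lookups(sample):
        print(f"{lib}: missing {symbol}")
```

Listing the failing library next to the missing symbol shows which dependency never finished relocating, and therefore whose init code should not be trusted if it runs later.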

    Frankly, to me it looks like a bug in the dynamic loader (ld.so).

    As for why the optimized version works, I suppose the omp_ function is not actually needed there and gets thrown out by linker optimization, so there is nothing left to fail to resolve at runtime.

    Here's what I see in the LD_DEBUG=all log after the omp_ symbol is not found for libC:

    19225: symbol=omp_set_num_threads; lookup in file=/usr/lib/i386-linux-gnu/libXdmcp.so.6 [0]
    19225: /media/EXT/Work/libC.so: error: symbol lookup error: undefined symbol: omp_set_num_threads (fatal)
    19225:
    19225: file=/media/EXT/libA.so [0]; destroying link map
    19225:
    19225: file=/media/EXT/libA.so [0]; dynamically loaded by /media/EXT/libX.so [0]
    19225: file=/media/EXT/libA.so [0]; generating link map
    19225: dynamic: 0xf2fdb764 base: 0xf2f81000 size: 0x00064a28
    19225: entry: 0xf2f8ffd0 phdr: 0xf2f81034 phnum: 7
    19225:
    19225: checking for version `GCC_3.0' in file /lib/i386-linux-gnu/libgcc_s.so.1 [0] required by file /media/EXT/libA.so [0]
    ... few more checking
    19225: object=/media/EXT/libA.so [0]
    19225: scope 0: bin/mainapp /lib/i386-linux-gnu/libpthread.so.0 /media/EXT/libX.so ...
    19225: scope 1:...
    19225:
    19225:
    19225: relocation processing: /media/EXT/libA.so
    19225: symbol=_ZTVN10__cxxabiv117__class_type_infoE; lookup in file=bin/mainapp [0]
    19225: symbol=_ZTVN10__cxxabiv117__class_type_infoE; lookup in file=/lib/i386-linux-gnu/libpthread.so.0 [0]
    19225: symbol=_ZTVN10__cxxabiv117__class_type_infoE; lookup in file=/media/EXT/libX.so [0]
    19225: binding file /media/EXT/libA.so [0] to /media/EXT/libX.so [0]: normal symbol `_ZTVN10__cxxabiv117__class_type_infoE'
    
    ... here it continues to bind libA symbols, and after finishing that
    
    19225:
    19225:
    19225: calling init: /media/EXT/libC.so
    19225:
    

    it calls init for the non-initialized libC.so module.

    (Just to mention: libX.so is the base module that calls dlopen and also contains basic functions used by all the other libs.)

    After destroying the link map for libA, the log shows it being generated again. I just don't understand whether the loader continues loading libA or starts from scratch, this time without bothering about libB/libC. Either way, it ignores libB/libC until init is called for libC.


    omp_set_num_threads is part of the OpenMP runtime, which GCC provides as libgomp.

    You probably should pass the -fopenmp flag to gcc at both compile and link time (even if you are just dlopen-ing a library that uses OpenMP).

    Maybe the original library provider forgot that.

    (Enabling OpenMP changes how the compiler handles the code, and at link time -fopenmp also pulls in the OpenMP runtime.)
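To confirm that a missing runtime dependency like this is the problem, one option is to load the suspect library with immediate binding and look at the loader's error message. This is a minimal Python sketch using ctypes, not something from the original setup: the helper name `try_load_now` is hypothetical, the path is the one from the LD_DEBUG output above, and the Python interpreter has to match the library's architecture (a 32-bit Python for an i386 library). Note that a successful load also runs the library's static initializers.

```python
import ctypes
import os


def try_load_now(path):
    """Try to load `path` with immediate binding.

    Returns None on success, or the loader's error message on failure.
    """
    try:
        # RTLD_NOW asks the dynamic linker to resolve all symbols at load
        # time, so an undefined symbol is reported here instead of at the
        # first call through the PLT.
        ctypes.CDLL(path, mode=os.RTLD_NOW)
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Path taken from the LD_DEBUG output; substitute the library under test.
    error = try_load_now("/media/EXT/work/build32/bin/libMyLib.so")
    print(error if error else "all symbols resolved")
```

If the message names omp_set_num_threads as undefined, the library was most likely linked without -fopenmp and needs either a rebuild or the OpenMP runtime loaded into the process beforehand.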
