Nios® II Processor Reference Guide

ID 683836
Date 8/28/2023
Public

7.9.6.3. Procedure Linkage Table

Function calls in a position-dependent executable may use the call and jmpi instructions, which address the contents of a 256-MB segment. They may also use the %lo, %hi, and %hiadj operators to take the address of a function. If the function is in another shared object, the link editor creates a callable stub in the executable called a PLT entry. The PLT entry loads the address of the called function from the PLT GOT (a region at the start of the GOT) and transfers control to it.

The PLT GOT entry needs a relocation referring to the final symbol, of type R_NIOS2_JUMP_SLOT. The dynamic linker may immediately resolve it, or may leave it unmodified for lazy binding. The link editor fills in an initial value pointing to the lazy binding stubs at the start of the PLT section.

Each PLT entry appears as shown in the example below.

PLT Entry

.PLTn:
   orhi   r15, r0, %hiadj(plt_got_slot_address)
   ldw    r15, %lo(plt_got_slot_address)(r15)
   jmp    r15
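
The %hiadj and %lo operators in the listing above split a 32-bit address so that orhi, followed by the sign-extended 16-bit displacement of ldw, reconstructs it. The following is a minimal Python sketch of that arithmetic, assuming the usual Nios II assembler definitions (%lo(x) = x & 0xFFFF and %hiadj(x) = ((x >> 16) & 0xFFFF) + ((x >> 15) & 1)). The function names and the sample slot address are hypothetical and are not part of the ABI.

```python
MASK32 = 0xFFFFFFFF

def lo(addr):
    """%lo: the low 16 bits of a 32-bit value."""
    return addr & 0xFFFF

def hiadj(addr):
    """%hiadj: the high 16 bits, plus 1 when bit 15 is set.

    The adjustment compensates for the sign extension applied to %lo.
    The result is truncated to 16 bits, as it would be in an immediate field.
    """
    return (((addr >> 16) & 0xFFFF) + ((addr >> 15) & 1)) & 0xFFFF

def sext16(value):
    """Sign-extend a 16-bit immediate, as ldw and addi do."""
    return value - 0x10000 if value & 0x8000 else value

def plt_slot_address(addr):
    """Model `orhi r15, r0, %hiadj(a)` and the effective address of `ldw r15, %lo(a)(r15)`."""
    r15 = (hiadj(addr) << 16) & MASK32          # orhi with r0 as the source register
    return (r15 + sext16(lo(addr))) & MASK32    # base register plus signed displacement

# Hypothetical PLT GOT slot address whose bit 15 is set
slot = 0x00049ABC
assert hiadj(slot) == 0x0005
assert plt_slot_address(slot) == slot
```

Because bit 15 of the sample address is set, %hiadj yields 5 rather than 4, and the negative displacement brings the sum back to the intended address.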

The example below shows the PLT entry when the PLT GOT is close enough to the small data area for its slot to be loaded with a gp-relative offset.

PLT Entry Near Small Data Area

.PLTn:
   ldw    r15, %gprel(plt_got_slot_address)(gp)
   jmp    r15

Initial PLT Entry

res_0:
   br .PLTresolve
   ...
.PLTresolve:
   orhi    r14, r0, %hiadj(res_0)
   addi    r14, r14, %lo(res_0)
   sub     r15, r15, r14
   orhi    r13, r0, %hiadj(_GLOBAL_OFFSET_TABLE_)
   ldw     r14, %lo(_GLOBAL_OFFSET_TABLE_+4)(r13)
   ldw     r13, %lo(_GLOBAL_OFFSET_TABLE_+8)(r13)
   jmp     r13

In front of the initial PLT entry is a series of branches to the start of the initial entry (.PLTresolve). There is one branch for each PLT entry, labelled res_0 through res_N. The last several branches may be replaced by nop instructions to improve performance. The link editor arranges for the Nth PLT entry to point to the Nth branch; res_N - res_0 is four times the index into the .rela.plt section for the corresponding R_NIOS2_JUMP_SLOT relocation.
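
The relationship between a branch label and its relocation index can be sketched in Python. This is an illustrative model only: it assumes that r15 still holds the initial PLT GOT slot value (the address of the Nth branch) when .PLTresolve executes sub r15, r15, r14, and the address chosen for res_0 is made up.

```python
INSTRUCTION_SIZE = 4  # each Nios II instruction, including each branch, occupies 4 bytes

def branch_address(res_0, index):
    """Address of the branch for the PLT entry with the given index."""
    return res_0 + INSTRUCTION_SIZE * index

def r15_after_sub(res_0, initial_slot_value):
    """Model `sub r15, r15, r14` with r14 = res_0 and r15 = the initial slot value."""
    return (initial_slot_value - res_0) & 0xFFFFFFFF

def rela_plt_index(r15):
    """Recover the .rela.plt index from the 'index times four' value in r15."""
    return r15 // INSTRUCTION_SIZE

res_0 = 0x00010400                      # hypothetical address of the first branch
slot_value = branch_address(res_0, 3)   # initial value of the PLT GOT slot for PLT entry 3
r15 = r15_after_sub(res_0, slot_value)
assert r15 == 12
assert rela_plt_index(r15) == 3
```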

The dynamic linker initializes GOT[1] to a unique identifier for each library and GOT[2] to the address of the runtime resolver routine. In order for the two loads in .PLTresolve to share the same %hiadj, _GLOBAL_OFFSET_TABLE_ must be aligned to a 16-byte boundary.
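
One way to see why the alignment matters is to model the two loads with a single shared %hiadj value. The Python sketch below assumes the usual Nios II definitions of %hiadj and %lo; the GOT addresses are invented for illustration. With a 16-byte aligned table, the addresses of GOT[1] and GOT[2] cannot straddle a point where the low 16 bits reach 0x8000, so one %hiadj serves both loads. With weaker alignment the second load can compute the wrong address.

```python
MASK32 = 0xFFFFFFFF

def lo(addr):
    return addr & 0xFFFF

def hiadj(addr):
    return (((addr >> 16) & 0xFFFF) + ((addr >> 15) & 1)) & 0xFFFF

def sext16(value):
    return value - 0x10000 if value & 0x8000 else value

def resolve_load_addresses(got):
    """Effective addresses of the two ldw instructions in .PLTresolve.

    Both loads use r13 = %hiadj(got) << 16 as the base register and
    %lo(got + 4) or %lo(got + 8) as the signed displacement.
    """
    base = (hiadj(got) << 16) & MASK32
    return tuple((base + sext16(lo(got + offset))) & MASK32 for offset in (4, 8))

aligned = 0x00017FF0      # hypothetical GOT address, 16-byte aligned
misaligned = 0x00017FF8   # hypothetical GOT address, only 8-byte aligned

# Aligned: both loads reach GOT[1] and GOT[2].
assert resolve_load_addresses(aligned) == (aligned + 4, aligned + 8)

# Misaligned: GOT + 8 is 0x00018000, whose %hiadj differs from that of the GOT base,
# so the second load misses GOT[2].
assert resolve_load_addresses(misaligned) != (misaligned + 4, misaligned + 8)
```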

The runtime resolver receives the original function arguments in r4 through r7, the shared library identifier from GOT[1] in r14, and the relocation index times four in r15. The resolver updates the corresponding PLT GOT entry so that the PLT entry transfers control directly to the target in the future, and then transfers control to the target.
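
The lazy binding sequence described above can be illustrated with a toy Python model. It is a sketch under stated assumptions, not the dynamic linker's implementation: the class LazyPlt, its attributes, and the sample symbols are all hypothetical, and Python callables stand in for code addresses.

```python
class LazyPlt:
    """Toy model of lazy binding through a PLT GOT (illustrative only)."""

    def __init__(self, rela_plt, symbols, library_id):
        self.rela_plt = rela_plt        # symbol name per R_NIOS2_JUMP_SLOT relocation, by index
        self.symbols = symbols          # symbol name -> callable, standing in for loaded code
        self.library_id = library_id    # value the dynamic linker would place in GOT[1]
        self.resolver_calls = 0
        # Each PLT GOT slot initially routes to the lazy binding stub for its index.
        self.plt_got = [self._make_stub(i) for i in range(len(rela_plt))]

    def _make_stub(self, index):
        # Models branch res_<index> plus .PLTresolve: r15 = index * 4, r14 = GOT[1].
        def stub(*args):
            return self._resolve(self.library_id, 4 * index, *args)
        return stub

    def _resolve(self, r14, r15, *args):
        # Models the runtime resolver. r14 (the library identifier) is received
        # but not otherwise used in this toy model.
        self.resolver_calls += 1
        index = r15 // 4
        target = self.symbols[self.rela_plt[index]]
        self.plt_got[index] = target    # patch the slot so later calls bypass the resolver
        return target(*args)            # then transfer control to the target

    def call(self, index, *args):
        # Models .PLTn: load the PLT GOT slot and jump to it.
        return self.plt_got[index](*args)


plt = LazyPlt(
    rela_plt=["add", "mul"],
    symbols={"add": lambda a, b: a + b, "mul": lambda a, b: a * b},
    library_id=1,
)
first = plt.call(1, 6, 7)     # first call goes through the resolver
second = plt.call(1, 6, 7)    # second call goes straight to the target
assert first == second == 42
assert plt.resolver_calls == 1
```

Only the first call through a given slot reaches the resolver; after the slot is patched, the PLT entry transfers control directly to the target.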

In shared objects, the call and jmpi instructions cannot be used because the library load address is not known at link time. Calls to functions outside the current shared object must pass through the GOT. The program loads function addresses using %call, and the link editor may arrange for such entries to be lazily bound. Because PLT entries are only used for lazy binding, shared object PLTs are smaller, as shown below.

Shared Object PLT

.PLTn:
   orhi    r15, r0, %hiadj(index * 4)  
   addi    r15, r15, %lo(index * 4)
   br      .PLTresolve

Initial PLT Entry

.PLTresolve:
   nextpc    r14
   orhi      r13, r0, %hiadj(_GLOBAL_OFFSET_TABLE_)
   add       r13, r13, r14
   ldw       r14, %lo(_GLOBAL_OFFSET_TABLE_+4)(r13)
   ldw       r13, %lo(_GLOBAL_OFFSET_TABLE_+8)(r13)
   jmp       r13

If the initial PLT entry is out of branch range, the resolver sequence can be inlined into the PLT entry, because it is only one instruction longer than a long branch, as shown below.

Initial PLT Entry Out of Range

.PLTn:
   orhi      r15, r0, %hiadj(index * 4)
   addi      r15, r15, %lo(index * 4)
   nextpc    r14
   orhi      r13, r0, %hiadj(_GLOBAL_OFFSET_TABLE_)
   add       r13, r13, r14
   ldw       r14, %lo(_GLOBAL_OFFSET_TABLE_+4)(r13)
   ldw       r13, %lo(_GLOBAL_OFFSET_TABLE_+8)(r13)
   jmp       r13