System calls : difference between sys

What is the difference between SYS_exit, sys_exit() and exit()?

What I understand :

  • The Linux kernel provides system calls, which are listed in man 2 syscalls.
  • glibc provides wrapper functions for those syscalls, with names mostly similar to the syscalls themselves.
  • My question: man 2 syscalls makes no mention of SYS_exit or sys_exit(), for example. What are they?

    Note : The syscall exit here is only an example. My question really is : What are SYS_xxx and sys_xxx()?


    I'll use exit() as in your example although this applies to all system calls.

    The functions of the form sys_exit() are the actual entry points to the kernel routine that implements the function you think of as exit(). These symbols are not even available to user-mode programmers. That is, unless you are hacking the kernel, you cannot link to these functions because their symbols are not available outside the kernel. If I wrote libmsw.a which had a file scope function like

    /* file scope ("static"): visible only inside libmsw's own source file */
    static int msw_func(void) { return 0; }
    

    defined in it, you would have no success trying to link to it because it is not exported in the libmsw symbol table; that is:

    cc your_program.c libmsw.a
    

    would yield an error like:

    ld: cannot resolve symbol msw_func
    

    because it isn't exported; the same applies for sys_exit() as contained in the kernel.

    For a user program to reach kernel routines, it has to use the syscall(2) interface to switch from user mode to kernel mode. When that mode switch (sometimes called a trap) occurs, a small integer is used to look up the proper kernel routine in a kernel table that maps integers to kernel functions. An entry in the table has the form

    {SYS_exit, sys_exit},
    

    where SYS_exit is a preprocessor macro which is

    #define SYS_exit (1)
    

    and has been 1 on i386 since before you were born because there hasn't been a reason to change it (other architectures use their own numbering; on x86-64 it is 60). Because the system call numbers are small consecutive integers, the lookup is a simple array index.
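
    The lookup described above can be sketched in user space as an array of function pointers indexed by call number. This is only an illustration of the idea, not the kernel's actual code: every demo_* name and DEMO_NR_* value below is hypothetical, and the handlers just return made-up values.

```c
#include <stddef.h>

/* Hypothetical call numbers for this sketch; real ones come from the kernel headers. */
enum { DEMO_NR_restart = 0, DEMO_NR_exit = 1, DEMO_NR_fork = 2, DEMO_NR_MAX = 3 };

typedef long (*demo_handler_t)(long arg);

/* Stand-ins for kernel routines such as sys_exit(); they only return marker values. */
static long demo_sys_restart(long arg) { (void)arg; return 0; }
static long demo_sys_exit(long arg)    { return 100 + arg; }
static long demo_sys_fork(long arg)    { (void)arg; return 200; }

/* The table: the index is the call number, the value is the routine to run. */
static const demo_handler_t demo_call_table[DEMO_NR_MAX] = {
    [DEMO_NR_restart] = demo_sys_restart,
    [DEMO_NR_exit]    = demo_sys_exit,
    [DEMO_NR_fork]    = demo_sys_fork,
};

/* Dispatch by number; unknown numbers yield -1 here
   (the real kernel reports -ENOSYS in that case). */
long demo_dispatch(long nr, long arg)
{
    if (nr < 0 || nr >= DEMO_NR_MAX || demo_call_table[nr] == NULL)
        return -1;
    return demo_call_table[nr](arg);
}
```

    Calling demo_dispatch(DEMO_NR_exit, 3) from your own main() runs demo_sys_exit(3); an out-of-range number is rejected before the table is touched.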

    As you note in your question, the proper way for a regular user-mode program to reach sys_exit is through the thin wrapper in glibc (or a similar core library). sys_exit itself only matters if you are writing kernel code. SYS_exit is visible in user space, but you would only pass it to syscall(2) yourself in the rare case where no wrapper exists for the call you need.
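
    To make the wrapper-versus-number distinction concrete, here is a small sketch assuming Linux with glibc. It uses getpid rather than exit so the process survives to compare results; raw_getpid and wrapped_getpid are names invented for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE       /* exposes the syscall() declaration under strict -std modes */
#endif
#include <sys/syscall.h>  /* SYS_getpid */
#include <sys/types.h>    /* pid_t */
#include <unistd.h>       /* syscall(), getpid() */

/* Ask the kernel for the process ID through the generic syscall(2) interface,
   passing the system call number directly. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* The same request through the glibc wrapper function. */
long wrapped_getpid(void)
{
    return (long)getpid();
}
```

    Both functions should return the same process ID; the wrapper is simply the more convenient and portable way to get there.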


    This is now addressed in man 2 syscalls itself:

    Roughly speaking, the code belonging to the system call with number __NR_xxx defined in /usr/include/asm/unistd.h can be found in the Linux kernel source in the routine sys_xxx() . (The dispatch table for i386 can be found in /usr/src/linux/arch/i386/kernel/entry.S .) There are many exceptions, however, mostly because older system calls were superseded by newer ones, and this has been treated somewhat unsystematically. On platforms with proprietary operating-system emulation, such as parisc, sparc, sparc64, and alpha, there are many additional system calls; mips64 also contains a full set of 32-bit system calls.

    At least now /usr/include/asm/unistd.h is a preprocessor hack that includes one of the following, depending on the target ABI:

  • /usr/include/asm/unistd_32.h
  • /usr/include/asm/unistd_x32.h
  • /usr/include/asm/unistd_64.h

    The C function exit() is declared in stdlib.h. Think of it as a high-level, event-driven interface that lets you register callbacks with atexit():

    /* Call all functions registered with `atexit' and `on_exit',
       in the reverse of the order in which they were registered,
       perform stdio cleanup, and terminate program execution with STATUS.  */
    
    extern void exit (int __status) __THROW __attribute__ ((__noreturn__));
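
    The callback behaviour in that comment can be observed with a sketch like the following, assuming a POSIX system. A child process registers a handler and then ends either with exit() or with _exit(), which terminates immediately without running atexit handlers. The demo_* names are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int demo_pipe_fd = -1;  /* write end of the pipe, used by the handler */

/* atexit handler: report that it ran by writing one byte to the pipe. */
static void demo_handler(void)
{
    if (write(demo_pipe_fd, "x", 1) != 1) { /* nothing useful to do on failure */ }
}

/* Fork a child that registers demo_handler, then terminates with exit()
   (use_libc_exit != 0) or with _exit() (use_libc_exit == 0).
   Returns 1 if the handler ran, 0 if it did not, -1 on error. */
int demo_handler_ran(int use_libc_exit)
{
    int fds[2];
    char c;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;
    fflush(NULL);              /* avoid duplicating buffered output in the child */
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {            /* child */
        close(fds[0]);
        demo_pipe_fd = fds[1];
        if (atexit(demo_handler) != 0)
            _exit(2);
        if (use_libc_exit)
            exit(0);           /* runs atexit handlers, then terminates */
        _exit(0);              /* terminates without running them */
    }
    close(fds[1]);             /* parent keeps only the read end */
    n = read(fds[0], &c, 1);   /* 0 means EOF: the child wrote nothing */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (int)n;
}
```

    With exit() the parent reads the handler's byte; with _exit() it sees end-of-file instead, which is the practical difference between the library function and going straight to the kernel.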
    

    So essentially the kernel exposes its system call numbers to user space as preprocessor macros named __NR_xxx. glibc's sys/syscall.h defines a matching SYS_xxx macro for each of them, so SYS_exit is just another name for the number __NR_exit. sys_exit() is the kernel-internal function that this number dispatches to; no macro creates it, and user programs cannot call it. The exit() function is part of the standard C library (declared in stdlib.h) and has been ported to other operating systems that lack the Linux kernel ABI entirely (there may be no __NR_xxx numbers) and potentially have no sys_* functions either (an implementation could, for instance, issue the trap instruction directly in assembly).
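
    As a quick check of how the two macro families relate, the following sketch (assuming Linux with glibc headers) compares them at compile time. The demo_* function names are invented here, and the numeric value depends on the architecture, so no particular number is assumed.

```c
#include <sys/syscall.h>  /* defines SYS_xxx and pulls in __NR_xxx from <asm/unistd.h> */

/* Fails to compile if the two macros ever disagree. */
_Static_assert(SYS_exit == __NR_exit, "SYS_exit should alias __NR_exit");

/* Return the number behind each macro so a caller can print or compare them. */
long demo_sys_exit_number(void) { return SYS_exit; }
long demo_nr_exit_number(void)  { return __NR_exit; }
```

    Printing either value from your own main() shows the system call number for exit on the machine you compiled for.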
