Why are these constructs (using ++) undefined behavior in C?

#include <stdio.h>

int main(void)
{
   int i = 0;
   i = i++ + ++i;
   printf("%dn", i); // 3

   i = 1;
   i = (i++);
   printf("%dn", i); // 2 Should be 1, no ?

   volatile int u = 0;
   u = u++ + ++u;
   printf("%dn", u); // 1

   u = 1;
   u = (u++);
   printf("%dn", u); // 2 Should also be one, no ?

   register int v = 0;
   v = v++ + ++v;
   printf("%dn", v); // 3 (Should be the same as u ?)

   int w = 0;
   printf("%d %d %dn", w++, ++w, w); // shouldn't this print 0 2 2

   int x[2] = { 5, 8 }, y = 0;
   x[y] = y ++;
   printf("%d %dn", x[0], x[1]); // shouldn't this print 0 8? or 5 0?
}

C has the concept of undefined behavior, ie some language constructs are syntactically valid but you can't predict the behavior when the code is run.

As far as I know, the standard doesn't explicitly say why the concept of undefined behavior exists. In my mind, it's simply because the language designers wanted there to be some leeway in the semantics, instead of ie requiring that all implementations handle integer overflow in the exact same way, which would very likely impose serious performance costs, they just left the behavior undefined so that if you write code that causes integer overflow, anything can happen.

So, with that in mind, why are these "issues"? The language clearly says that certain things lead to undefined behavior. There is no problem, there is no "should" involved. If the undefined behavior changes when one of the involved variables is declared volatile , that doesn't prove or change anything. It is undefined; you cannot reason about the behavior.

Your most interesting-looking example, the one with

u = (u++);

is a text-book example of undefined behavior (see Wikipedia's entry on sequence points).


Just compile and disassemble your line of code, if you are so inclined to know how exactly it is you get what you are getting.

This is what I get on my machine, together with what I think is going on:

$ cat evil.c
void evil(){
  int i = 0;
  i+= i++ + ++i;
}
$ gcc evil.c -c -o evil.bin
$ gdb evil.bin
(gdb) disassemble evil
Dump of assembler code for function evil:
   0x00000000 <+0>:   push   %ebp
   0x00000001 <+1>:   mov    %esp,%ebp
   0x00000003 <+3>:   sub    $0x10,%esp
   0x00000006 <+6>:   movl   $0x0,-0x4(%ebp)  // i = 0   i = 0
   0x0000000d <+13>:  addl   $0x1,-0x4(%ebp)  // i++     i = 1
   0x00000011 <+17>:  mov    -0x4(%ebp),%eax  // j = i   i = 1  j = 1
   0x00000014 <+20>:  add    %eax,%eax        // j += j  i = 1  j = 2
   0x00000016 <+22>:  add    %eax,-0x4(%ebp)  // i += j  i = 3
   0x00000019 <+25>:  addl   $0x1,-0x4(%ebp)  // i++     i = 4
   0x0000001d <+29>:  leave  
   0x0000001e <+30>:  ret
End of assembler dump.

(I... suppose that the 0x00000014 instruction was some kind of compiler optimization?)


I think the relevant parts of the C99 standard are 6.5 Expressions, §2

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

and 6.5.16 Assignment operators, §4:

The order of evaluation of the operands is unspecified. If an attempt is made to modify the result of an assignment operator or to access it after the next sequence point, the behavior is undefined.

链接地址: http://www.djcxy.com/p/12734.html

上一篇: C ++链接如何在实践中工作?

下一篇: 为什么这些构造(使用++)在C中的未定义行为?