Error that is neither syntactic nor semantic?

I had this question on a homework assignment (don't worry, already done):

[Using your favorite imperative language, give an example of each of ...] An error that the compiler can neither catch nor easily generate code to catch (this should be a violation of the language definition, not just a program bug)

From "Programming Language Pragmatics" (3rd ed.) by Michael L. Scott

My answer was to call main from main, passing in the same arguments (in C and Java), inspired by this. But I personally felt that this would just be a semantic error.

To me, this question is asking how to produce an error that is neither syntactic nor semantic, and frankly, I can't think of a situation where an error wouldn't fall into one of those two categories.

Would it be code that is susceptible to exploitation, like buffer overflows (and maybe other exploits I've never heard of)? Some sort of pitfall arising from the structure of the language (I don't know, maybe lazy evaluation or weak type checking)? I'd like a simple example in Java/C++/C, but other examples are welcome.


Undefined behaviour (UB) springs to mind. A statement invoking UB is neither syntactically nor semantically incorrect; rather, the result of the code cannot be predicted and is considered erroneous.

An example of this would be (from the Wikipedia page) an attempt to modify a string-constant:

char * str = "Hello world!";
str[0] = 'h'; // undefined-behaviour here
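For contrast, here is a minimal sketch of the well-defined alternative, written in C++ for illustration. Initializing a char array from the literal copies the characters into storage the program owns, so writing to the copy is fine. The function name lowercase_first is made up for this example and is not from the original answer.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns "hello world!" by editing a private copy
// of the literal instead of the literal itself.
std::string lowercase_first() {
    // The array is initialized with a copy of the literal's characters,
    // so it is ordinary writable storage.
    char str[] = "Hello world!";
    assert(sizeof str == 13);  // 12 characters plus the terminating '\0'
    str[0] = 'h';              // well-defined: modifies the copy, not the literal
    return std::string(str);
}
```

The only difference from the snippet above is the declaration: char str[] makes a copy, while char * str merely points at the literal.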

Not all statements that invoke UB are so easily identified, though. Consider, for example, the possibility of signed integer overflow in this case, if the user enters a number that is too big:

// get number from user
char input[100];
fgets(input, sizeof input, stdin);
int number = strtol(input, NULL, 10);
// print its square: possible integer-overflow if number * number > INT_MAX
printf("%i^2 = %i\n", number, number * number);

Signed integer overflow does not necessarily occur here; whether it does depends on the value entered. That makes it impossible to detect at compile time or link time, since it depends on user input.
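The overflow can still be caught at run time if the check is written out by hand. The following is a minimal sketch in C++, assuming nothing beyond the standard headers; checked_square and square_or_abort are hypothetical names introduced for this illustration. It tests the operand against INT_MAX before multiplying, so the multiplication itself never overflows.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: stores n * n in out and returns true if the result
// fits in an int; returns false (leaving out untouched) if it would overflow.
bool checked_square(int n, int &out) {
    // For n > 0, n * n > INT_MAX exactly when n > INT_MAX / n.
    if (n > 0 && n > INT_MAX / n) return false;
    // For n < 0, dividing by n flips the comparison; INT_MIN is caught too.
    if (n < 0 && n < INT_MAX / n) return false;
    out = n * n;  // safe: the checks above rule out overflow
    return true;
}

// The kind of trap a compiler could emit in place of the bare multiplication.
int square_or_abort(int n) {
    int result = 0;
    bool ok = checked_square(n, result);
    assert(ok && "n * n does not fit in an int");
    (void)ok;  // avoids an unused-variable warning when NDEBUG disables assert
    return result;
}
```

Whether this counts as "easily generated" is a judgment call: the check is only a couple of comparisons, but it has to be paid on every multiplication. GCC and Clang can insert comparable checks themselves when built with -fsanitize=signed-integer-overflow.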


Statements that invoke undefined behavior [1] are semantically as well as syntactically correct, but they make programs behave erratically.

a[i++] = i;   // Syntax (symbolic representation) and semantics (meaning) are both correct, but this invokes UB (in C, and in C++ before C++17).

Another example is using a pointer without initializing it.
Logical errors are also neither semantic nor syntactic.


[1] Undefined behavior: Anything at all can happen; the Standard imposes no requirements. The program may fail to compile, or it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.


Here's an example for C++. Suppose we have a function:

int incsum(int &a, int &b) {
    return ++a + ++b;
}

Then the following code has undefined behavior because it modifies an object twice with no intervening sequence point:

int i = 0;
incsum(i, i);

If the call to incsum is in a different translation unit (TU) from the definition of the function, then it's impossible to catch the error at compile time, because neither piece of code is inherently wrong on its own. It could be detected at link time by a sufficiently intelligent linker.

You can generate as many examples of this kind as you like, where code in one TU has behavior that's conditionally undefined for certain input values passed by another TU. I went for one that's slightly obscure; you could just as easily use an invalid pointer dereference or a signed integer arithmetic overflow.

You can argue about how easy it is to generate code to catch this. I wouldn't say it's very easy, but a compiler could notice that ++a + ++b is invalid if a and b alias the same object, and add the equivalent of assert(&a != &b); at that line. So detection code can be generated by local analysis.
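Written out by hand, that inserted check might look like the sketch below. The name incsum_checked is hypothetical, and the assert is only what such generated code could be equivalent to, not something any particular compiler is known to emit.

```cpp
#include <cassert>

// Same body as incsum, preceded by the aliasing check described above.
int incsum_checked(int &a, int &b) {
    // If a and b name the same object, ++a + ++b would modify it twice
    // with no ordering between the two writes, so stop before that happens.
    assert(&a != &b && "incsum_checked: a and b must not alias");
    return ++a + ++b;
}
```

With distinct arguments the function behaves exactly like incsum. A call such as incsum_checked(i, i) aborts at the assertion instead of reaching the undefined expression (unless NDEBUG is defined, which compiles the check away).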
