What are the rules about using an underscore in a C++ identifier?
It's common in C++ to name member variables with some kind of prefix to denote the fact that they're member variables, rather than local variables or parameters. If you've come from an MFC background, you'll probably use m_foo. I've also seen myFoo occasionally.
C# (or possibly just .NET) seems to recommend using just an underscore, as in _foo. Is this allowed by the C++ standard?
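The rules (which did not change in C++11):
- Reserved in any scope, including for use as implementation macros: identifiers that begin with an underscore followed by an uppercase letter, and identifiers that contain a double underscore.
- Reserved in the global namespace: identifiers that begin with an underscore.
- Everything in the std namespace is reserved. (You are allowed to add template specializations, though.)
From the 2003 C++ Standard:
17.4.3.1.2 Global names [lib.global.names]
Certain sets of names and function signatures are always reserved to the implementation:
- Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.
- Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace. 165)
165) Such names are also reserved in namespace ::std (17.4.3.1).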
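As a rough illustration, the underscore rules can be written as a small classifier: a double underscore anywhere, or a leading underscore followed by an uppercase letter, is reserved for any use; any other leading underscore is reserved only in the global namespace. The names Reservation and classify are hypothetical helpers invented for this sketch, not anything the standard provides, and the std namespace rule is not covered.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper types for illustration only.
enum class Reservation {
    AnyUse,           // reserved everywhere, including as macro names
    GlobalNamespace,  // reserved only as a name in the global namespace
    None              // not reserved by the underscore rules
};

Reservation classify(const std::string& name) {
    // A double underscore anywhere in the name is reserved for any use.
    if (name.find("__") != std::string::npos) {
        return Reservation::AnyUse;
    }
    if (!name.empty() && name[0] == '_') {
        // Leading underscore followed by an uppercase letter: any use.
        if (name.size() > 1 &&
            std::isupper(static_cast<unsigned char>(name[1]))) {
            return Reservation::AnyUse;
        }
        // Any other leading underscore: global namespace only.
        return Reservation::GlobalNamespace;
    }
    return Reservation::None;
}
```

With this sketch, classify("_foo") yields Reservation::GlobalNamespace, which is why _foo is a problem as a global name but not as a class member or local variable.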
Because C++ is based on the C standard (1.1/2, C++03) and C99 is a normative reference (1.2/1, C++03) these also apply, from the 1999 C Standard:
7.1.3 Reserved identifiers
Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.
No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.
If the program removes (with #undef) any macro definition of an identifier in the first group listed above, the behavior is undefined.
154) The list of reserved identifiers with external linkage includes errno, math_errhandling, setjmp, and va_end.
Other restrictions might apply. For example, the POSIX standard reserves a lot of identifiers that are likely to show up in normal code:
- Names beginning with a capital E followed by a digit or uppercase letter
- Names beginning with is or to followed by a lowercase letter
- Names beginning with LC_ followed by an uppercase letter
- Names of existing mathematics functions suffixed with f or l
- Names beginning with SIG followed by an uppercase letter
- Names beginning with SIG_ followed by an uppercase letter
- Names beginning with str, mem, or wcs followed by a lowercase letter
- Names beginning with PRI or SCN followed by any lowercase letter or X
- Names ending with _t
While using these names for your own purposes right now might not cause a problem, they do raise the possibility of conflict with future versions of that standard.
Personally I just don't start identifiers with underscores. New addition to my rule: don't use double underscores anywhere, which is easy as I rarely use underscores.
After doing research on this article I no longer end my identifiers with _t, as this is reserved by the POSIX standard.
The rule about any identifier ending with _t surprised me a lot. I think it comes from the POSIX standard (not sure yet); I am still looking for clarification and official chapter and verse. The list above is from the GNU libtool manual, listing reserved names.
CesarB provided the following link to the POSIX 2004 reserved symbols and notes 'that many other reserved prefixes and suffixes ... can be found there'. The POSIX 2008 reserved symbols are defined here. The restrictions are somewhat more nuanced than those above.
The rules to avoid collision of names are both in the C++ standard (see Stroustrup book) and mentioned by C++ gurus (Sutter, etc.).
Personal rule
Because I did not want to deal with cases, and wanted a simple rule, I have designed a personal one that is both simple and correct:
When naming a symbol, you will avoid collision with compiler/OS/standard libraries if you:
- never start a symbol with an underscore
- never name a symbol with two consecutive underscores inside
Of course, putting your code in a unique namespace helps to avoid collision, too (but won't protect against evil macros).
Some examples
(I use macros because they are the most code-polluting of C/C++ symbols, but it could be anything from a variable name to a class name.)
#define _WRONG
#define __WRONG_AGAIN
#define RIGHT_
#define WRONG__WRONG
#define RIGHT_RIGHT
#define RIGHT_x_RIGHT
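The examples above suggest that the personal rule boils down to two checks: no leading underscore, and no two consecutive underscores anywhere. A minimal sketch of that reading follows; follows_personal_rule is a hypothetical name made up for this illustration.

```cpp
#include <string>

// Hypothetical helper: returns true when `name` satisfies the personal
// rule as read from the examples (no leading underscore, no "__").
bool follows_personal_rule(const std::string& name) {
    const bool leading_underscore = !name.empty() && name[0] == '_';
    const bool double_underscore = name.find("__") != std::string::npos;
    return !leading_underscore && !double_underscore;
}
```

This is deliberately stricter than the standard: it rejects a name like _foo even inside a class or function, where the standard would allow it, in exchange for not having to think about scope or letter case.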
Extracts from C++0x draft
From the n3242.pdf file (I expect the final standard text to be similar):
17.6.3.3.2 Global names [global.names]
Certain sets of names and function signatures are always reserved to the implementation:
— Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
— Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
But also:
17.6.3.3.5 User-defined literal suffixes [usrlit.suffix]
Literal suffix identifiers that do not start with an underscore are reserved for future standardization.
This last clause is confusing, unless you consider that a name starting with one underscore and followed by a lowercase letter would be OK if not defined in the global namespace...
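To make the literal-suffix clause concrete, here is a small sketch of a user-defined literal whose suffix starts with an underscore followed by a lowercase letter. The namespace units and the suffix _km are hypothetical names chosen for the example, and it assumes a C++11 or later compiler.

```cpp
namespace units {

// The suffix starts with an underscore, so it does not fall into the set
// "reserved for future standardization"; a suffix without the underscore
// (for example, km) would.
constexpr long double operator""_km(long double kilometres) {
    return kilometres * 1000.0L;  // convert kilometres to metres
}

// Compile-time check that the literal behaves as intended.
static_assert(1.5_km == 1500.0L, "1.5 km should be 1500 m");

}  // namespace units
```

Declaring the operator inside a namespace rather than at global scope sidesteps the question of whether the suffix counts as a global name beginning with an underscore.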
From MSDN:
Use of two sequential underscore characters ( __ ) at the beginning of an identifier, or a single leading underscore followed by a capital letter, is reserved for C++ implementations in all scopes. You should avoid using one leading underscore followed by a lowercase letter for names with file scope because of possible conflicts with current or future reserved identifiers.
This means that you can use a single underscore as a member variable prefix, as long as it's followed by a lower-case letter.
This is apparently taken from section 17.4.3.1.2 of the C++ standard, but I can't find an original source for the full standard online.
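A short sketch of what this conclusion looks like in practice. Account and its members are hypothetical names used only for illustration; the commented-out lines show the spellings that would be reserved.

```cpp
// Hypothetical class showing a single-underscore member prefix.
class Account {
public:
    explicit Account(int balance) : _balance(balance) {}

    int balance() const { return _balance; }
    void deposit(int amount) { _balance += amount; }

private:
    int _balance;  // class scope, underscore + lowercase letter: allowed

    // int _Balance;   // reserved: underscore followed by an uppercase letter
    // int bal__ance;  // reserved: contains a double underscore
};
```

The same name _balance declared at global (file) scope would fall into the reserved set, so the prefix is only safe for members, locals, and names inside your own namespaces.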
See also this question.