Declaring variables inside loops, good practice or bad practice?

Question #1: Is declaring a variable inside a loop a good practice or bad practice?

I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.

Example:

for(int counter = 0; counter <= 10; counter++)
{
   string someString = "testing";

   cout << someString;
}

Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?


This is excellent practice.

By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.

This way:

  • If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)

  • The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake called elsewhere

  • Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.

  • In short, you are right to do it.

    Note however that the variable is not supposed to retain its value between each loop. In such case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, which sole purpose is to declare variables which must retain their value from one loop to another. This typically includes the loop counter itself.

    {
        int i, retainValue;
        for (i=0; i<N; i++)
        {
           int tmpValue;
           /* tmpValue is uninitialized */
           /* retainValue still has its previous value from previous loop */
    
           /* Do some stuff here */
        }
        /* Here, retainValue is still valid; tmpValue no longer */
    }
    

    For question #2: The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable which scope has ended).

    With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (ie variables) to worry about when reading other parts of the code.

    This is true even outside of an if(){...} loop. Typically, instead of :

        int result;
        (...)
        result = f1();
        if (result) then { (...) }
        (...)
        result = f2();
        if (result) then { (...) }
    

    it's safer to write :

        (...)
        {
            int const result = f1();
            if (result) then { (...) }
        }
        (...)
        {
            int const result = f2();
            if (result) then { (...) }
        }
    

    The difference may seem minor, especially on such a small example. But on a larger code base, it will help : now there is no risk to transport some result value from f1() to f2() block. Each result is strictly limited to its own scope, making its role more accurate. From a reviewer perspective, it's much nicer, since he has less long range state variables to worry about and track.

    Even the compiler will help better : assuming that, in the future, after some erroneous change of code, result is not properly initialized with f2() . The second version will simply refuse to work, stating a clear error message at compile time (way better than run time). The first version will not spot anything, the result of f1() will simply be tested a second time, being confused for the result of f2() .

    Complementary information

    The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.

    In response to comment on allocation: The above rule is true in C, but might not be for some C++ classes.

    For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.

    However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.


    Generally, it's a very good practice to keep it very close.

    In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.

    In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.

    Suppose you wanted to avoid those redundant creations/allocations, you would write it as:

    for (int counter = 0; counter <= 10; counter++) {
       // compiler can pull this out
       const char testing[] = "testing";
       cout << testing;
    }
    

    or you can pull the constant out:

    const std::string testing = "testing";
    for (int counter = 0; counter <= 10; counter++) {
       cout << testing;
    }
    

    Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?

    It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string ). In the case of the std::string , that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:

    for (int counter = 0; counter <= 10; counter++) {
       string testing = "testing";
       cout << testing;
    }
    

    would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).

    Doing this:

    string testing;
    for (int counter = 0; counter <= 10; counter++) {
       testing = "testing";
       cout << testing;
    }
    

    would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.

    So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.


    For C++ it depends on what you are doing. OK, it is stupid code but imagine

    class myTimeEatingClass
    {
     public:
     //constructor
          myTimeEatingClass()
          {
              sleep(2000);
              ms_usedTime+=2;
          }
          ~myTimeEatingClass()
          {
              sleep(3000);
              ms_usedTime+=3;
          }
          const unsigned int getTime() const
          {
              return  ms_usedTime;
          }
          static unsigned int ms_usedTime;
    };
    
    myTimeEatingClass::ms_CreationTime=0; 
    myFunc()
    {
        for (int counter = 0; counter <= 10; counter++) {
    
            myTimeEatingClass timeEater();
            //do something
        }
        cout << "Creating class took "<< timeEater.getTime() <<"seconds at all<<endl;
    
    }
    myOtherFunc()
    {
        myTimeEatingClass timeEater();
        for (int counter = 0; counter <= 10; counter++) {
            //do something
        }
        cout << "Creating class took "<< timeEater.getTime() <<"seconds at all<<endl;
    
    }
    

    You will wait 55 seconds until you get the output of myFunc. Just because each loop contructor and destructor together need 5 seconds to finish.

    You will need 5 seconds until you get the output of myOtherFunc.

    Of course, this is a crazy example.

    But it illustrates that it might become a performance issue when each loop the same construction is done when the constructor and / or destructor needs some time.

    链接地址: http://www.djcxy.com/p/5060.html

    上一篇: 为什么命名空间在JavaScript中被认为是不好的做法?

    下一篇: 在循环内声明变量,好的做法还是不好的做法?