Why are move semantics necessary to elide temporary copies?
So my understanding of move semantics is that they allow you to override functions for use with temporary values (rvalues) and avoid potentially expensive copies (by moving the state from an unnamed temporary into your named lvalue).
My question is why do we need special semantics for this? Why couldn't a C++98 compiler elide these copies, since it's the compiler that determines whether a given expression is an lvalue or an rvalue? As an example:
#include <string>

void func(const std::string& s) {
    // Do something with s
}

int main() {
    func(std::string("abc") + std::string("def"));
}
Even without C++11's move semantics, the compiler should still be able to determine that the expression passed to func() is an rvalue, and thus the copy from a temporary object is unnecessary. So why have the distinction at all? It seems like this application of move semantics is essentially a variant of copy elision or other similar compiler optimizations.
As another example, why bother having code like the following?
#include <string>

void func(const std::string& s) {
    // Do something with lvalue string
}

void func(std::string&& s) {
    // Do something with rvalue string
}

int main() {
    std::string s("abc");
    // Presumably calls func(const std::string&) overload
    func(s);
    // Presumably calls func(std::string&&) overload
    func(std::string("abc") + std::string("def"));
}
It seems like the const std::string& overload could handle both cases: lvalues as usual, and rvalues as a const reference (since temporary expressions are sort of const by definition). Since the compiler knows when an expression is an lvalue or an rvalue, it could decide whether to elide the copy in the case of an rvalue.
Basically, why are move semantics considered special and not just a compiler optimization that could have been performed by pre-C++11 compilers?
Move semantics do not exactly elide temporary copies.
The same number of temporaries exists. The difference is that where the copy constructor would otherwise be called, the move constructor is called instead, and it is allowed to cannibalize the original rather than make an independent copy. This may sometimes be vastly more efficient.
The C++ formal object model is not at all modified by move semantics. Objects still have a well-defined lifetime, starting at some particular address, and ending when they are destroyed there. They never "move" during their lifetime. When they are "moved from", what is really happening is that the guts are scooped out of an object that is scheduled to die soon and placed efficiently in a new object. It may look like they moved, but formally they didn't, as that would totally break C++.
Being moved from is not death. Move is required to leave objects in a "valid state" in which they are still alive, and the destructor will always be called later.
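To make this concrete, here is a minimal sketch using a hypothetical Tracer type (not part of any library) that counts which special member functions run. It assumes only that moving from a std::string leaves it in a valid state that can be assigned to again.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical tracer type: counts which special member functions run.
struct Tracer {
    static int copies;
    static int moves;
    static int destroyed;

    std::string payload;

    explicit Tracer(std::string p) : payload(std::move(p)) {}
    Tracer(const Tracer& other) : payload(other.payload) { ++copies; }
    Tracer(Tracer&& other) noexcept : payload(std::move(other.payload)) { ++moves; }
    ~Tracer() { ++destroyed; }
};

int Tracer::copies = 0;
int Tracer::moves = 0;
int Tracer::destroyed = 0;

// Returns true if one move happened, no copy happened,
// and both objects (including the moved-from one) were destroyed.
bool moved_from_object_is_still_destroyed() {
    Tracer::copies = Tracer::moves = Tracer::destroyed = 0;
    {
        Tracer a(std::string(64, 'x'));
        Tracer b(std::move(a));   // move constructor, not copy constructor
        // 'a' is still alive here, in a valid but unspecified state.
        a.payload = "reused";     // assigning to a moved-from string is fine
        assert(b.payload.size() == 64);
    }                             // destructors run for both 'b' and 'a'
    return Tracer::copies == 0 && Tracer::moves == 1 && Tracer::destroyed == 2;
}
```

Two objects exist for the whole scope and two destructors run; the only thing that changed is which constructor built the second object.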
Eliding copies is a totally different thing, where in some chain of temporary objects, some of the intermediates are skipped. Compilers are not required to elide copies in C++11 and C++14, but they are permitted to do so even when it violates the "as-if" rule that usually guides optimization. That is, even if the copy constructor has side effects, the compiler at high optimization settings may still skip some of the temporaries.
By contrast, "guaranteed copy elision" is a new C++17 feature, which means that the standard requires copy elision to take place in certain cases.
Move semantics and copy elision give two different approaches to enabling greater efficiency in these "chain of temporaries" scenarios. In move semantics, all the temporaries still exist, but instead of calling the copy constructor, we get to call a (hopefully) less expensive constructor, the move constructor. In copy elision, we get to skip some of the objects altogether.
Basically, why are move semantics considered special and not just a compiler optimization that could have been performed by pre-C++11 compilers?
Move semantics are not a "compiler optimization". They are a new part of the type system. Move semantics happen even when you compile with -O0 on gcc and clang: they cause different functions to be called, because the fact that an object is about to die is now "annotated" in the type of the reference. This allows "application-level optimizations", but that is different from what the optimizer does.
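As a small illustration of the "different functions get called" point, the sketch below uses a hypothetical overload pair named which. The choice between the two is made by overload resolution from the value category of the argument, so it is the same at any optimization level.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair; each one reports which overload was selected.
const char* which(const std::string&) { return "lvalue overload"; }
const char* which(std::string&&)      { return "rvalue overload"; }

// Returns true if each call picked the overload matching its value category.
bool overloads_follow_value_category() {
    std::string s("abc");

    bool lvalue_ok = std::string(which(s)) == "lvalue overload";

    // A temporary binds to the && overload.
    bool temp_ok =
        std::string(which(std::string("abc") + std::string("def"))) == "rvalue overload";

    // std::move only casts 's' to an rvalue; nothing is moved by the cast itself.
    bool moved_ok = std::string(which(std::move(s))) == "rvalue overload";

    return lvalue_ok && temp_ok && moved_ok;
}
```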
Maybe you can think of it as a safety-net. Sure, in an ideal world the optimizer would always eliminate every unnecessary copy. Sometimes, though, constructing a temporary is complex, involves dynamic allocations, and the compiler doesn't see through it all. In many such cases, you will be saved by move semantics, which might allow you to avoid making a dynamic allocation at all. That in turn may lead to generated code that is then easier for the optimizer to analyze.
Guaranteed copy elision is sort of a way of formalizing some of this "common sense" about temporaries, so that more code not only works the way you expect when it gets optimized, but is required to work the way you expect when it gets compiled, and does not call a copy constructor when you think there shouldn't really be a copy. So you can, e.g., return non-copyable, non-movable types by value from a factory function. The compiler figures out that no copy happens much earlier in the process, before it even gets to the optimizer. This is really the next iteration of this series of improvements.
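A minimal sketch of the factory case, assuming a compiler in C++17 mode or later (it is expected to be rejected under C++11 or C++14). Pinned and make_pinned are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Since C++17, the returned prvalue initializes the caller's object directly,
// so no copy or move constructor needs to exist.
Pinned make_pinned(int v) { return Pinned(v); }

int pinned_value() {
    Pinned p = make_pinned(42);  // constructed in place; nothing is copied or moved
    return p.value;
}
```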
Copy elision and move semantics are not exactly the same. With copy elision, the object is not copied at all; it stays in place. With a move, "something" still gets copied. The copy is not really eliminated. But that "something" is a pale shadow of what a full-blown copy has to haul.
A simple example:
#include <vector>

class Bar {
    std::vector<int> foo;

public:
    Bar(const std::vector<int> &bar) : foo(bar)
    {
    }
};

std::vector<int> foo();

int main()
{
    Bar bar=foo();
}
Good luck trying to get your compiler to eliminate the copy, here.
Now, add this constructor:
Bar(std::vector<int> &&bar) : foo(std::move(bar))
{
}
And now, the object in main() gets constructed using a move operation. The full copy has not actually been eliminated, but the move operation is just some line noise.
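One way to observe the difference, as a rough sketch: compare the address of the vector's buffer before and after construction. The data() accessor and make_values() are additions for illustration only; make_values() is a hypothetical stand-in for the foo() function that the example above declares but does not define.

```cpp
#include <cassert>
#include <utility>
#include <vector>

class Bar {
    std::vector<int> foo;

public:
    Bar(const std::vector<int>& bar) : foo(bar) {}
    Bar(std::vector<int>&& bar) : foo(std::move(bar)) {}

    // Added for illustration: exposes the address of the internal buffer.
    const int* data() const { return foo.data(); }
};

// Hypothetical stand-in for the undefined foo() in the example.
std::vector<int> make_values() { return std::vector<int>(1000, 7); }

bool rvalue_constructor_reuses_buffer() {
    std::vector<int> source = make_values();
    const int* original = source.data();

    Bar copied(source);            // const& overload: allocates and copies 1000 ints
    bool copy_has_own_buffer = copied.data() != original;

    Bar moved(std::move(source));  // && overload: takes over the existing buffer
    bool move_reused_buffer = moved.data() == original;

    return copy_has_own_buffer && move_reused_buffer;
}
```

The moved-into Bar still had to be constructed, but the "copy" amounted to handing over a few pointers instead of duplicating 1000 elements.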
On the other hand:
Bar foo();

int main()
{
    Bar bar=foo();
}
That's going to get full copy elision here. Nothing gets copied.
In conclusion: move semantics does not actually elide, or eliminate a copy. It just makes the resulting copy "less".
You have a fundamental misunderstanding of how certain things in C++ work:
Even without C++11's move semantics, the compiler should still be able to determine that the expression passed to func() is an rvalue, and thus the copy from a temporary object is unnecessary.
That code does not provoke any copying, even in C++98. A const& is a reference, not a value. And because it's const, it is perfectly capable of referencing a temporary. As such, a function taking a const string& never gets a copy of the parameter.
That code will create a temporary and pass a reference to that temporary to func. No copying happens, at all.
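std::string cannot report how often it is copied, so the sketch below uses a hypothetical Counted type with a counting copy constructor to check the claim. It is valid C++98-style code apart from the use of std::move in the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type that counts how many times it is copied.
struct Counted {
    static int copies;
    std::string text;
    explicit Counted(std::string t) : text(std::move(t)) {}
    Counted(const Counted& other) : text(other.text) { ++copies; }
};
int Counted::copies = 0;

std::size_t by_const_ref(const Counted& c) { return c.text.size(); }
std::size_t by_value(Counted c) { return c.text.size(); }

// Passing a temporary or a named object by const& makes no copies.
int copies_made_by_const_ref_call() {
    Counted::copies = 0;
    by_const_ref(Counted("abcdef"));  // temporary bound directly to the reference
    Counted named("abcdef");
    by_const_ref(named);              // lvalue bound directly to the reference
    return Counted::copies;
}

// For contrast: passing a named object by value does copy it.
int copies_made_by_value_call() {
    Counted::copies = 0;
    Counted named("abcdef");
    by_value(named);
    return Counted::copies;
}
```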
As another example, why bother having code like the following?
Nobody does. A function should only take a parameter by rvalue reference if that function will move from it. If a function is only going to observe the value without modifying it, it takes it by const&, just like in C++98.
Most important of all:
So my understanding of move semantics is that they allow you to override functions for use with temporary values (rvalues) and avoid potentially expensive copies (by moving the state from an unnamed temporary into your named lvalue).
Your understanding is wrong.
Moving is not solely about temporary values; if it were, we wouldn't have std::move, which allows us to move from lvalues. Moving is about transferring ownership of data from one object to another. While that frequently does happen with temporaries, it can also happen with lvalues:
std::unique_ptr<T> p = ...
std::unique_ptr<T> other_p = std::move(p);
assert(p == nullptr); //Will always be true.
This code creates a unique_ptr, then moves the contents of that pointer into another unique_ptr object. It is not dealing with temporaries; it is transferring ownership of the internal pointer to another object.
This is not something a compiler could deduce that you wanted to do. You have to be explicit that you want to perform such a move on an lvalue (which is why std::move is there).
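A complete version of the same idea, as a sketch: the int payload and the consume function are hypothetical choices made only so the example compiles on its own.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical sink: takes ownership of the pointer by value.
int consume(std::unique_ptr<int> owned) { return *owned; }

bool ownership_moves_from_lvalue() {
    std::unique_ptr<int> p(new int(5));
    int* raw = p.get();

    // Explicit request to move from the lvalue 'p'.
    std::unique_ptr<int> other_p = std::move(p);
    bool transferred = (p == nullptr) && (other_p.get() == raw);

    // Passing to a by-value parameter also requires an explicit std::move;
    // consume(other_p) would not compile because unique_ptr is not copyable.
    int v = consume(std::move(other_p));

    return transferred && v == 5 && other_p == nullptr;
}
```

No temporary is involved in either transfer; both sources are named objects that the programmer explicitly chose to give up.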