Why does the VS2008 std::string.erase() move its buffer?
I want to read a file line by line and capture one particular line of input. For maximum performance I could do this in a low-level way by reading the entire file in and iterating over its contents using pointers, but this code is not performance critical, so I would rather use a more readable and typesafe standard-library-style implementation.
So what I have is this:
std::string line;
line.reserve(1024);
std::ifstream file(filePath);
while(file)
{
std::getline(file, line);
if(line.substr(0, 8) == "Whatever")
{
// Do something ...
}
}
While this isn't performance-critical code, I've called line.reserve(1024) before the parsing operation to preclude multiple reallocations of the string as larger lines are read in.
Inside std::getline the string is erased before the characters from each line are added to it. I stepped through this code to satisfy myself that the memory wasn't being reallocated on each iteration, and what I found fried my brain.
Deep inside string::erase, rather than just resetting its size variable to zero, it calls memmove_s with pointer values that would overwrite the used part of the buffer with the unused part immediately following it. However, memmove_s is called with a count argument of zero, i.e. it requests a move of zero bytes.
Questions:
Why would I want the overhead of a library function call in the middle of my lovely loop, especially one that is being called to do nothing at all?
I haven't picked it apart myself yet but under what circumstances would this call not actually do nothing but would in fact start moving chunks of buffer around?
And why is it doing this at all?
Bonus question: What's the C++ standard library tag?
This is a known issue that I reported a year ago. To take advantage of the fix, you'll have to upgrade to a future version of the compiler.
Connect Bug: "std::string::erase is stupidly slow when erasing to the end, which impacts std::string::resize"
The standard doesn't say anything about the complexity of any std::string functions, except swap.
std::string::clear() is defined in terms of std::string::erase(), and std::string::erase() does have to move all of the characters after the block which was erased. So why shouldn't it call a standard function to do so? If you've got some profiler output which proves that this is a bottleneck, then perhaps you can complain about it, but otherwise, frankly, I can't see it making a difference. (The logic necessary to avoid the call could end up costing more than the call.)
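To make the zero-byte move concrete, here is a minimal sketch of how an erase over a flat buffer could be written. The Buffer struct and this erase function are hypothetical and written only for illustration; this is not the VS2008 implementation. The point is that the number of bytes moved is the length of the tail after the erased block, which is zero when erasing to the end and non-zero otherwise.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity buffer, for illustration only.
struct Buffer {
    char data[64];
    std::size_t size;
};

// Erase `count` characters starting at `pos`.
// Returns how many trailing characters had to be moved down.
std::size_t erase(Buffer& b, std::size_t pos, std::size_t count) {
    if (pos > b.size) pos = b.size;                  // clamp the start
    if (count > b.size - pos) count = b.size - pos;  // clamp the length
    std::size_t tail = b.size - (pos + count);       // characters after the erased block
    // When erasing to the end, tail == 0 and this call moves nothing.
    std::memmove(b.data + pos, b.data + pos + count, tail);
    b.size -= count;
    return tail;
}
```

Erasing all of "Whatever" from position 0 yields a tail of zero, which matches the call observed in the debugger. Erasing only the first four characters yields a tail of four, and "ever" is shifted to the front.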
Also, you're not checking the results of the call to getline before using them. Your loop should be something like:
while ( std::getline( file, line ) ) {
// ...
}
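To see why the order of test and read matters, the following sketch counts loop iterations for both loop shapes on the same input. The function names are made up for this illustration, and an in-memory stream stands in for the file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Loop shape from the question: the stream is tested before the read,
// so the body also runs once for the read that fails at end of input.
int count_iterations_test_first(std::istream& in) {
    std::string line;
    int n = 0;
    while (in) {
        std::getline(in, line);
        ++n;  // reached even when getline failed
    }
    return n;
}

// Suggested loop shape: the body runs only after a successful read.
int count_iterations_read_first(std::istream& in) {
    std::string line;
    int n = 0;
    while (std::getline(in, line)) {
        ++n;
    }
    return n;
}
```

For a two-line input such as "one\ntwo\n", the first version runs its body three times and the second runs it twice.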
And if you're so worried about performance, creating a substring (a new std::string) just in order to do a comparison is far more expensive than a call to memmove_s. What's wrong with something like:
static std::string const target( "Whatever" );
if ( line.size() >= target.size()
&& std::equal( target.begin(), target.end(), line.begin() ) ) {
// ...
}
I'd consider this the most idiomatic way of determining whether a string starts with a specific value.
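As an alternative to std::equal, the same check can be wrapped in a small helper built on std::string::compare, which also avoids creating a temporary substring. The helper name starts_with is an assumption made for this sketch and is not taken from the code above.

```cpp
#include <string>

// Hypothetical helper: true if `text` begins with `prefix`.
// Compares in place; no temporary std::string is created for the prefix region.
bool starts_with(const std::string& text, const std::string& prefix) {
    return text.size() >= prefix.size()
        && text.compare(0, prefix.size(), prefix) == 0;
}
```

Inside the read loop this would be used as if (starts_with(line, target)) { ... }. The size check comes first so that lines shorter than the prefix are rejected without comparing any characters.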
(I might add that from experience, the reserve here doesn't buy you much either. After you've read a couple of lines in the file, your string isn't going to grow much anyway, so there'll be very few reallocations after the first couple of lines. Another case of premature optimization?)
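One rough way to check this claim on a given implementation is to watch the string's capacity while reading. The sketch below assumes that a change in capacity() between lines indicates a reallocation. The function name is hypothetical, and the standard does not promise that getline keeps the capacity, although common implementations do.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts how often the reused line buffer changed capacity while reading.
// A capacity change between two lines is taken as a sign of reallocation.
std::size_t count_capacity_changes(std::istream& in) {
    std::string line;  // deliberately no reserve() call
    std::size_t last = line.capacity();
    std::size_t changes = 0;
    while (std::getline(in, line)) {
        if (line.capacity() != last) {
            last = line.capacity();
            ++changes;
        }
    }
    return changes;
}
```

With input whose lines all have a similar length, the count typically stays very small compared with the number of lines read.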
In this case, I think the idea you mention of reading the entire file and iterating over the result may actually give code that is about as simple. You're simply changing "read line, check for prefix, process" to "read file, scan for prefix, process":
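std::size_t const not_found = std::string::npos;
std::ostringstream buffer;
buffer << file.rdbuf(); // read the whole file into memory
std::string const data = buffer.str();
char const target[] = "\nWhatever"; // the leading newline anchors the match to the start of a line
std::size_t const len = sizeof(target)-1;
for (std::size_t pos=0; not_found!=(pos=data.find(target, pos)); pos+=len)
{
// process relevant line starting at data[pos+1]
// (a match on the very first line has no preceding '\n' and is not found)
}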
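A self-contained variant of the same idea is sketched below. The function lines_with_prefix and its parameters are hypothetical. It takes any std::istream so it can be tried on an in-memory stream, and it prepends a newline to the data so that a match on the first line is found as well.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every line of `in` that begins with `prefix`.
// Assumes `prefix` itself contains no newline.
std::vector<std::string> lines_with_prefix(std::istream& in,
                                           const std::string& prefix) {
    std::ostringstream buffer;
    buffer << in.rdbuf();  // read the whole stream into memory
    // Prepend '\n' so the first line is preceded by a newline like all others.
    const std::string data = "\n" + buffer.str();
    const std::string target = "\n" + prefix;

    std::vector<std::string> found;
    for (std::size_t pos = data.find(target);
         pos != std::string::npos;
         pos = data.find(target, pos + target.size())) {
        const std::size_t begin = pos + 1;             // skip the newline
        const std::size_t end = data.find('\n', begin);
        found.push_back(end == std::string::npos
                            ? data.substr(begin)
                            : data.substr(begin, end - begin));
    }
    return found;
}
```

Copying the matched line out is only for demonstration. If the line can be processed in place, the position and length are enough.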