Is there any scenario where the Rope data structure is more efficient than a string builder
Related to this question, based on a comment of user Eric Lippert.
Is there any scenario where the Rope data structure is more efficient than a string builder? It is some people's opinion that rope data structures are almost never better in terms of speed than the native string or string builder operations in typical cases, so I am curious to see realistic scenarios where indeed ropes are better.
The documentation for the SGI C++ implementation goes into some detail on the big O behaviours verses the constant factors which is instructive.
Their documentation assumes very long strings being involved, the examples posited for reference talk about 10 MB strings. Very few programs will be written which deal with such things and, for many classes of problems with such requirements reworking them to be stream based rather than requiring the full string to be available where possible will lead to significantly superior results. As such ropes are for non streaming manipulation of multi megabyte character sequences when you are able to appropriately treat the rope as sections (themselves ropes) rather than just a sequence of characters.
Significant Pros:
Significant Cons:
This leads to a few 'obvious' uses (the first mentioned explicitly by SGI).
There are cases where domain specific behaviour in the string can be coupled with relatively simple augmentations to the Rope implementation to allow:
As you can see from the examples listed, all fall well into the 'niche' category. Further, several may well have superior alternatives if you are willing/able to rewrite the algorithm as a stream processing operation instead.
the short answer to this question is yes, and that requires little explanation. Of course there's situations where the Rope data structure is more efficient than a string builder. they work differently, so they are more suited for different purposes.
(From a C# perspective)
The rope data structure as a binary tree is better in certain situations. When you're looking at extremely large string values (think 100+ MB of xml coming in from SQL), the rope data structure could keep the entire process off the large object heap, where the string object hits it when it passes 85000 bytes.
If you're looking at strings of 5-1000 characters, it probably doesn't improve the performance enough to be worth it. this is another case of a data structure that is designed for the 5% of people that have an extreme situation.
The 10th ICFP Programming Contest relied, basically, on people using the rope data structure for efficient solving. That was the big trick to get a VM that ran in reasonable time.
Rope is excellent if there are lots of prefixing (apparently the word "prepending" is made up by IT folks and isn't a proper word!) and potentially better for insertions; StringBuilders use continuous memory, so only work efficiently for appending.
Therefore, StringBuilder is great for building strings by appending fragments - a very normal use-case. As developers need to do this a lot, StringBuilders are a very mainstream technology.
Ropes are great for edit buffers, eg the data-structure behind, say, an enterprise-strength TextArea. So (a relaxation of Ropes, eg a linked list of lines rather than a binary tree) is very common in the UI controls world, but that's not often exposed to the developers and users of those controls.
You need really really big amounts of data and churn to make the rope pay-off - processors are very good at stream operations, and if you have the RAM then simply realloc for prefixing does work acceptably for normal use-cases. That competition mentioned at the top was the only time I've seen it needed.
链接地址: http://www.djcxy.com/p/6776.html上一篇: 什么是较少知道但有用的数据结构?