Important data structures in search
I'm interested in teaching myself different data structures, something I currently know very little about. My plan is to implement a few key structures so I understand how they work. I'm looking for suggestions on important data structures to start with.
I'm primarily interested in data structures that are relevant to search applications (eg Google / Lucene) and the general trade-off between delayed computation and precomputation. I'm also interested in distributed data structures -- data structures that can scale across hundreds / thousands of servers -- and probabilistic data structures -- data structures that help finding an approximate answer but do not need to always be correct.
Wikipedia has a list of data structures. I am currently considering:
Are there better choices?
Finally, is there any (major) problem with implementing these structures in a language like F#?
Very ambitious. I voted your question up just for its scope.
MIT has an on-line algorithms and data structures course. The companion book is a classic. I'm not sure if it addresses the distributed and probabilistic features, but they'll give you an excellent grounding in the fundamentals.
I'd add red-black tree, hash tables, patricia trie, and skip lists to your agenda.
Good luck.
If you are going to tackle this sort of thing with a functional language you should have a look at Purely Functional Data Structures by Chris Okasaki. Basic lesson is: the data structures you are familiar with for imperative programming may not be the best choices for functional programming. I expect there's a lot of similar material to be Googled for.
For search, algorithms are more important than data structures. When searching a large search space you often have to have sophisticated methods for pruning the search space.
You might look at classic search algorithms such as alpha-beta, A*, AO*.
Then look at something like iteratively deepening search.
In search algorithms, things like stacks and linked lists (which are really a form of a stack) and trees are more important than hash tables, B-trees etc. Of course, you will undoubtedly have hash tables in there, but it won't be the heart of the algorithm.
Here's some more important search algorithsm:
As far as specific data structures for search goes, you really don't need any. Basically, you just need your regular tool kit of data structures - trees, hashes, lists.
链接地址: http://www.djcxy.com/p/39870.html上一篇: 计算机学习中的结构?
下一篇: 搜索中的重要数据结构