Software to find the most frequent uncommon words in a file

I remember once visiting a website that presented a book/novel summary in a really interesting way. It showed a list of the words that were repeated most often in the book and that were also unique/uncommon. In other words, it showed the highest-frequency words, but not common words like "the", "I", "you", etc. It also showed a phrase if that phrase was repeated often. For example, Treasure Island would probably have words and phrases like: pirates, storm, fought, treasure, pieces of eight, island, Long John Silver, Jim, omen, etc.

This was the most interesting way to quickly get a good feel for whether I would like reading a book or not. I cannot find that website anymore, so I thought of looking for software that would do the job. I have a couple of books in PDF and DOC format that I would like to have analyzed. Does anyone know of a good tool out there that can do this?

Of course, I could probably code this myself, but it sure would be nice not to reinvent the wheel. So my question is: do you know of any such software?
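
To make the idea concrete, here is a rough Python sketch of the kind of analysis described above: count words while skipping common ones, and separately count repeated phrases. It is only an illustration, not how that website worked. It assumes the book has already been converted to plain text (reading PDF/DOC is not covered), and `STOP_WORDS`, `tokenize`, `top_words`, and `top_phrases` are names made up for this example. The stop-word list is deliberately tiny.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real tool would need a far larger one).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "at", "by",
    "for", "with", "as", "i", "you", "he", "she", "it", "we", "they", "me",
    "my", "his", "her", "is", "was", "were", "be", "had", "have", "not",
    "that", "this",
}


def tokenize(text):
    """Lowercase the text and return its words (internal apostrophes are kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def top_words(text, n=10):
    """Return the n most frequent words that are not stop words."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return Counter(words).most_common(n)


def top_phrases(text, size=3, n=10, min_count=2):
    """Return the most frequent phrases of `size` words seen at least min_count times.

    Phrases that start or end with a stop word are skipped, so
    "pieces of eight" can survive while "of the island" does not.
    """
    tokens = tokenize(text)
    grams = zip(*(tokens[i:] for i in range(size)))
    counts = Counter(
        " ".join(g)
        for g in grams
        if g[0] not in STOP_WORDS and g[-1] not in STOP_WORDS
    )
    repeated = Counter({p: c for p, c in counts.items() if c >= min_count})
    return repeated.most_common(n)


if __name__ == "__main__":
    sample = (
        "The pirates found the treasure. Pieces of eight! Pieces of eight! "
        "The pirates fought on the island for the treasure."
    )
    print(top_words(sample, 4))
    print(top_phrases(sample, size=3))
```

Raw counts like these favour words that are frequent in any English text, which is why the stop-word list matters; proper names such as "Long John Silver" would show up through `top_phrases` with `size=3` once they repeat often enough.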

Thanks,
Mugen
(bookworm)


Try Wordle.net. It creates tag clouds and renders them in different ways.

Answer attribution: Dave
