Storing structured data in Lucene
I have seen many references to using Lucene or Solr as a NoSQL data store rather than just as an indexing engine, for example "NoSQL (MongoDB) vs Lucene (or Solr) as your database" and http://searchhub.org/2010/04/29/for-the-guardian-solr-is-the-new-database/
However, because Lucene only provides a "flat" document structure, where each field holds scalar values (and may be multi-valued), I can't fully understand how people map complex structured data into Lucene for indexing and storage. For example:
{
    "firstName": "Joe",
    "lastName": "Smith",
    "addresses": [
        {
            "type": "home",
            "line1": "1 Main Street",
            "city": "New York"
        },
        {
            "type": "office",
            "line1": "P.O. Box 1234",
            "zip": "10000"
        }
    ]
}
Things can obviously get more complex. For example, what if the object has two collections, addresses and phone numbers? What if an address itself contains a collection?
I can think of two ways to map this to a Lucene document:
1. Create a stored but not indexed field that holds a JSON/BSON version of the object, and then create other fields that are indexed but not stored for searching.
2. Find a smart way to fit the object into Lucene's way of storing data: for example, use dot notation to flatten the fields, use multi-valued fields to hold the values of each collection, and then somehow recreate the object on the way back.
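To make the first option concrete, here is a minimal sketch that models a Lucene document as a plain list of field descriptors instead of calling the Lucene API. The `Field` tuple, the `_source` field name and the `to_fields` helper are hypothetical names chosen for this illustration; in real Lucene the stored and indexed flags would be expressed through the field types you add to the document.

```python
import json
from typing import List, NamedTuple


class Field(NamedTuple):
    """Hypothetical stand-in for a Lucene field: name, value and two flags."""
    name: str
    value: str
    stored: bool
    indexed: bool


def to_fields(person: dict) -> List[Field]:
    """Option 1: one stored-only JSON blob plus indexed-only search fields."""
    # Keep the whole object verbatim so it can be returned without rebuilding it.
    fields = [Field("_source", json.dumps(person), stored=True, indexed=False)]
    # Top-level string values become ordinary searchable fields.
    for key, value in person.items():
        if isinstance(value, str):
            fields.append(Field(key, value, stored=False, indexed=True))
    # Each address property goes under a dotted name; repeating a name
    # is what makes the field multi-valued.
    for address in person.get("addresses", []):
        for key, value in address.items():
            fields.append(
                Field(f"addresses.{key}", value, stored=False, indexed=True)
            )
    return fields


person = {
    "firstName": "Joe",
    "lastName": "Smith",
    "addresses": [
        {"type": "home", "line1": "1 Main Street", "city": "New York"},
        {"type": "office", "line1": "P.O. Box 1234", "zip": "10000"},
    ],
}

fields = to_fields(person)
# Reading the object back only needs the stored blob.
restored = json.loads(fields[0].value)
```

With this layout the search fields never have to be reversible, since the original object always comes back from the stored blob.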
I wonder whether people have dealt with similar problems before, and what solutions you have used.
Take a look at my "Stupid Lucene Tricks: Hierarchies" for one approach.
It depends on the usage. If you only need the values for display, you can serialize each complex value (each address) as a JSON string and store the strings in a multi-valued field. If you need to search on them, you can use the following structure:
    "addresses_type": [
    "home",
    "office"
    ],
    "addresses_line1": [
    "1 Main Street",
    "P.O. Box 1234"
    ],
    "addresses_city": [
    "New York",
    ""
    ],
    "addresses_zip": [
    "",
    "10000"
    ]
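The structure above can be produced and reversed mechanically. The following is only a sketch under a few assumptions: `flatten` and `rebuild` are hypothetical helper names, the collection is a list of flat objects with string values, and an empty string is reserved as the "missing" placeholder, so a genuinely empty value would not survive the round trip.

```python
def flatten(person: dict, collection: str = "addresses") -> dict:
    """Turn a list of sub-objects into parallel multi-valued fields."""
    items = person.get(collection, [])
    # Collect every property name, keeping first-seen order.
    keys = []
    for item in items:
        for key in item:
            if key not in keys:
                keys.append(key)
    flat = {k: v for k, v in person.items() if k != collection}
    for key in keys:
        # "" pads missing properties so position i always refers to item i.
        flat[f"{collection}_{key}"] = [item.get(key, "") for item in items]
    return flat


def rebuild(flat: dict, collection: str = "addresses") -> dict:
    """Reverse flatten(): zip the parallel fields back into sub-objects."""
    prefix = collection + "_"
    person = {k: v for k, v in flat.items() if not k.startswith(prefix)}
    columns = {
        k[len(prefix):]: v for k, v in flat.items() if k.startswith(prefix)
    }
    count = max((len(values) for values in columns.values()), default=0)
    person[collection] = [
        # Drop the "" placeholders that flatten() inserted.
        {key: values[i] for key, values in columns.items() if values[i] != ""}
        for i in range(count)
    ]
    return person


person = {
    "firstName": "Joe",
    "lastName": "Smith",
    "addresses": [
        {"type": "home", "line1": "1 Main Street", "city": "New York"},
        {"type": "office", "line1": "P.O. Box 1234", "zip": "10000"},
    ],
}

flat = flatten(person)
round_trip = rebuild(flat)
```

One caveat with parallel fields: the index matches at the document level, so a query for type "office" and city "New York" would still match this document even though those values belong to different addresses.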
