Search API on Datastore with frequently changing data

Datastore entities have the following fields:

  • id
  • created
  • user_id
  • not_unique_id
  • name
  • description
  • number

    I want to be able to perform a full-text search on name and description. To do this, we need to create Google Search API documents from the Datastore entities.

    However, the Datastore data has the following attributes:

  • Data for a user is removed 12 hours after it is added.
  • On user demand, the data for the user is removed and new data is added.
  • The not_unique_id is a third-party id that should be used to combine related entities in the Search API, because we only want one result per not_unique_id in a search.
  • When a user submits new data, it will be roughly 1000 entities at a time.
  • The biggest issue I have is preventing Search API documents from referencing Datastore entities that no longer exist (that is, the not_unique_id no longer appears in the Datastore).

    I would like to see some concepts, guidelines, ideas and tips so I can verify I'm on the right track. Thanks!

    Solution in progress:

    Below is the routine to keep the Search API in sync with the Datastore. Create, Update and Delete are executed on user request. Read is executed on app request. The cron job cleans up after Delete to keep the Search API in sync with the Datastore.

    Datastore Entities

    id = user_id

    ancestor = not_unique_id

    | ancestor | id | created | name | description | number |
    | 19385020 | 1  | 1234567 | Foo  | Qwerty      | 63     |
    | 19385020 | 2  | 1234567 | Foo  | Qwerty2     | 12     |
    | 19385020 | 3  | 1234567 | Foo  | Qwerty      | 74     |
    

    Search API Documents

    | not_unique_id | name | description |
    | 19385020      | Foo  | Qwerty      |
    

    Create

  • If the ancestor+id combination already exists, go to Update.
  • User data is inserted into the Datastore.
  • Search documents are created based on the ancestor (not_unique_id). The name and description of the document are the most common name and description that appear in the entity group.
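The "most common name" rule in the last Create step could be sketched as below. This is only an illustration in plain Python: the dicts stand in for Datastore entities, and `build_document_fields` is a hypothetical helper, not part of any Google API. Using the not_unique_id as the document id is an assumption; it would mean a later put for the same id replaces the existing document rather than creating a duplicate.

```python
from collections import Counter


def build_document_fields(not_unique_id, entities):
    """Pick the most common name and description within one entity group.

    `entities` is a list of dicts with "name" and "description" keys,
    standing in for the Datastore entities that share one ancestor.
    """
    if not entities:
        return None  # nothing left in the group, so no document should exist
    name = Counter(e["name"] for e in entities).most_common(1)[0][0]
    description = Counter(e["description"] for e in entities).most_common(1)[0][0]
    # Assumption: the document id is the not_unique_id, one document per group.
    return {"doc_id": str(not_unique_id), "name": name, "description": description}


# Sample data taken from the entity table above.
group = [
    {"name": "Foo", "description": "Qwerty"},
    {"name": "Foo", "description": "Qwerty2"},
    {"name": "Foo", "description": "Qwerty"},
]
doc = build_document_fields(19385020, group)
print(doc)
```

With the sample rows from the table, this yields one document for 19385020 with name Foo and description Qwerty.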

    Read

  • Full-text Search API Query on name/description to obtain the not_unique_id.
  • Query Datastore for entities with ancestor == not_unique_id and number > 0.
  • TODO: what if no entities exist anymore for one or more of the found not_unique_ids? I am expecting a certain number of results for pagination.
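One possible way to handle the TODO is to keep pulling search hits until the page is full, skipping ids that no longer have live entities and remembering them so their documents can be deleted. The sketch below is plain Python under stated assumptions: `search_ids` and `has_live_entities` are hypothetical callables standing in for the Search API query and the Datastore ancestor query, and a real implementation would likely use the Search API's cursors instead of a numeric offset.

```python
def fetch_page(search_ids, has_live_entities, page_size, start=0, batch=None):
    """Collect up to `page_size` ids that still have live entities.

    search_ids(offset, limit) -> list of not_unique_ids from the search index
    has_live_entities(nid)    -> True if an entity with number > 0 still exists
    Returns (live_ids, stale_ids, next_offset).
    """
    batch = batch or page_size
    live, stale, offset = [], [], start
    while len(live) < page_size:
        hits = search_ids(offset, batch)
        if not hits:
            break  # index exhausted before the page was filled
        for i, nid in enumerate(hits):
            if has_live_entities(nid):
                live.append(nid)
            else:
                stale.append(nid)  # candidate for document deletion
            if len(live) == page_size:
                # Resume right after the last hit that was consumed.
                return live, stale, offset + i + 1
        offset += len(hits)
    return live, stale, offset


# In-memory stand-ins for the search index and the Datastore (made-up ids).
index = [101, 102, 103, 104, 105]
live_set = {101, 103, 104, 105}


def search(offset, limit):
    return index[offset:offset + limit]


def is_live(nid):
    return nid in live_set


page1 = fetch_page(search, is_live, page_size=2)
page2 = fetch_page(search, is_live, page_size=2, start=page1[2])
print(page1)
print(page2)
```

The stale ids returned alongside each page could be passed to a cleanup task, so the index repairs itself as it is read instead of waiting for the cron job.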

    Update

  • Update the Datastore entity.

    Delete

  • Set number of the Datastore entity to 0.

    Cron

    Fetch all entities where number == 0 or created is more than 12 hours ago. Remove the document if its last descendant entity is about to be removed. Remove the entity.
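The cron step could look roughly like the sketch below. It uses in-memory dicts in place of the Datastore and the search index, and `cleanup` is a hypothetical function name. The document is dropped before its last entities are removed, so that a failure halfway through leaves entities without a document rather than a document without entities.

```python
import time

TTL_SECONDS = 12 * 60 * 60  # data lives for 12 hours


def cleanup(entities, documents, now=None):
    """Remove expired or soft-deleted entities and any orphaned documents.

    entities:  dict mapping (not_unique_id, user_id) -> {"created": ts, "number": n}
    documents: dict mapping not_unique_id -> document fields
    Both dicts are modified in place; returns the ids of removed documents.
    """
    now = time.time() if now is None else now
    doomed = {
        key for key, e in entities.items()
        if e["number"] == 0 or now - e["created"] > TTL_SECONDS
    }
    # Groups that keep at least one entity after this run.
    surviving = {nid for (nid, uid) in entities if (nid, uid) not in doomed}
    orphaned = {nid for (nid, _) in doomed} - surviving

    removed_docs = []
    for nid in sorted(orphaned):
        # Delete the document first, then the entities it was built from.
        if documents.pop(nid, None) is not None:
            removed_docs.append(nid)
    for key in doomed:
        del entities[key]
    return removed_docs


now = 100000
entities = {
    (19385020, 1): {"created": now - 13 * 3600, "number": 63},  # older than 12 hours
    (19385020, 2): {"created": now - 3600, "number": 0},        # soft-deleted
    (555, 1): {"created": now - 3600, "number": 5},             # still live
    (555, 2): {"created": now - 3600, "number": 0},             # soft-deleted
}
documents = {19385020: {"name": "Foo"}, 555: {"name": "Bar"}}
removed = cleanup(entities, documents, now=now)
print(removed, sorted(entities), sorted(documents))
```

In this made-up data, group 19385020 loses both entities and therefore its document, while group 555 keeps one live entity and its document stays.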
