How to lay out the GAE datastore

Introduction

I am new to GAE and wrote a little app that unfortunately hits the daily quota for datastore reads very quickly, although there is not much data in the datastore.
This question is about the data layout and the possible use of indexes (I currently have no idea how to use them).

What the app should do

  • The app should keep track of scores in a card game (Tichu, for those of you interested^^). A game consists of several rounds and is finished as soon as one team reaches 1000 points.
  • The app should display statistical information about played games.

First layout of the app

    My first layout approach was using the following entities:

    class Player(db.Model):
        Name = db.StringProperty(required = True)
    
    class Game(db.Model):
        Players = db.ListProperty(db.Key)
        Start = db.DateTimeProperty(auto_now_add = True, required = True)
        End = db.DateTimeProperty()
    
    class Round(db.Model):
        Game = db.Reference(Game, required = True)
        RoundNumber = db.IntegerProperty(required = True)
        PointsTeamA = db.IntegerProperty(required = True)
        PointsTeamB = db.IntegerProperty(required = True)
        FinishedFirst = db.ReferenceProperty(Player, required = True)
        TichuCalls = db.ListProperty(db.Key)
    

    As you can see above, the entities are normalized (at least I hope they are). However, with this approach, simple calculations like

  • Which player won the most games

    (which could look something like this)

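    # Untested snippet, just to give an idea of what I am doing here.
    # Note: it tallies first-place finishes per round, keyed by player key.
    wins = dict.fromkeys((p.key() for p in Player.all()), 0)
    for r in Round.all():
        # r.FinishedFirst dereferences the Player, which costs a read per round
        wins[r.FinishedFirst.key()] += 1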
    

    but also other statistics like

  • Which player finished first most often
  • Which player has the highest win rate
  • etc.

    produce a very large number of datastore read operations. On a page that displays only a limited amount of statistics, the daily quota was reached after just a couple of refreshes, with only 60 rounds and a handful of games. Using memcache did not solve the problem either.
    This led to my second approach:

Second layout of the app

    class Player(db.Model):
        Name = db.StringProperty(required = True)
    
    class Game(db.Model):
        Players = db.ListProperty(db.Key)
        Start = db.DateTimeProperty(auto_now_add = True, required = True)
        End = db.DateTimeProperty()
        Rounds = db.BlobProperty()
    
        def GetRounds(self):
            if self.Rounds:
                return pickle.loads(self.Rounds)
            else:
                return []
    
        def AddRound(self, R):
            Rounds = self.GetRounds()
            Rounds.append(R)
            self.Rounds = pickle.dumps(Rounds, -1)
    
    class Round(object):
        def __init__(self, Game, RoundNumber, PointsTeamA, PointsTeamB, FinishedFirst, TichuCalls):
            self.Game = Game
            self.RoundNumber = RoundNumber
            self.PointsTeamA = PointsTeamA
            self.PointsTeamB = PointsTeamB
            self.FinishedFirst = FinishedFirst
            self.TichuCalls = TichuCalls
    

    Now every Game stores a list of Rounds, which are no longer of type db.Model. This reduces the number of datastore reads considerably.
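To illustrate why this cuts reads: once the rounds travel inside the Game entity, a statistic such as "who finished first most often" needs one read per game rather than one per round. Below is a minimal sketch in plain Python with no datastore involved. `FakeGame`, `count_first_finishes`, and the player names are hypothetical stand-ins, and rounds are simplified to dicts so that the pickle round-trip is self-contained.

```python
import pickle
from collections import Counter


class FakeGame(object):
    """Hypothetical stand-in for the Game model; holds rounds as a pickled blob."""

    def __init__(self):
        self.rounds_blob = None

    def get_rounds(self):
        # Mirrors GetRounds: an empty blob means no rounds yet.
        return pickle.loads(self.rounds_blob) if self.rounds_blob else []

    def add_round(self, round_data):
        # Mirrors AddRound: unpickle, append, pickle again.
        rounds = self.get_rounds()
        rounds.append(round_data)
        self.rounds_blob = pickle.dumps(rounds, -1)


def count_first_finishes(games):
    """Tally first-place finishes; touches each game once, not each round."""
    tally = Counter()
    for game in games:
        for r in game.get_rounds():
            tally[r["finished_first"]] += 1
    return tally


# Made-up sample data, for illustration only.
game = FakeGame()
game.add_round({"round_number": 1, "finished_first": "alice"})
game.add_round({"round_number": 2, "finished_first": "bob"})
game.add_round({"round_number": 3, "finished_first": "alice"})

tally = count_first_finishes([game])
print(tally.most_common(1))
```

The trade-off is that every added round rewrites the whole blob, and the rounds can no longer be queried individually.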

Questions

  • How would you set up the data model? (Does it make sense to use a BlobProperty to store objects that are not of type db.Model?)
  • What could an index for this model look like? (Please elaborate on that, since I have a very limited understanding of indexes.)
  • With an increasing number of elements in the datastore, the daily read quota will eventually be reached with the second approach as well. How would you take this into account when designing the model?

    Short answer: get used to not normalizing your data. This is sort of the beauty of NoSQL databases. I would add either a list property or a bunch of integer properties (whichever makes more sense for your application) to the Player model to track each player's finishes. Like this:

    class Player(db.Model):
        Name = db.StringProperty(required = True)
        FinishedFirst = db.IntegerProperty(default=0)
        FinishedSecond = db.IntegerProperty(default=0)
        ...
    

    OR

    class Player(db.Model):
        Name = db.StringProperty(required = True)
        Finishes = db.ListProperty(int)  # A list of 1s, 2s, 3s, etc., one per finish
    

    The point is that both of these save you from querying (and spending more resources) and then working out programmatically how many times the player has finished first.

    When you have data that you know you are going to use a lot, think about storing redundant properties in the main model so that it is always at your fingertips without having to re-query.
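As a rough sketch of that idea, the counter on the player can be bumped at the moment a round is recorded, so that the statistics page only has to read Player entities. The code below is plain Python rather than datastore code. `PlayerStats` and `record_round` are hypothetical names, and in the real app the update would be a `put()` on the Player entity, ideally inside a transaction.

```python
class PlayerStats(object):
    """Hypothetical stand-in for a Player entity with redundant counters."""

    def __init__(self, name):
        self.name = name
        self.finished_first = 0
        self.rounds_played = 0


def record_round(players, finished_first_name):
    """Update the redundant counters at the time a round is saved."""
    for player in players.values():
        player.rounds_played += 1
    players[finished_first_name].finished_first += 1


# Made-up sample data, for illustration only.
players = {name: PlayerStats(name) for name in ("alice", "bob", "carol", "dave")}
record_round(players, "alice")
record_round(players, "carol")
record_round(players, "alice")

# Reading a statistic is now a lookup on the player, not a scan of all rounds.
best = max(players.values(), key=lambda p: p.finished_first)
print("%s finished first %d times" % (best.name, best.finished_first))
```

The cost moves from read time to write time: each recorded round now also writes to the affected players, which is usually the cheaper side when pages are viewed far more often than rounds are entered.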

    Also, take a look at the NDB API https://developers.google.com/appengine/docs/python/ndb/properties You can take advantage of JsonProperty for your game rounds.
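A JsonProperty holds a value that the `json` module can serialize, so the rounds would be stored as a list of plain dicts rather than pickled objects. Here is a small sketch of that shape using only the standard library. The field names are assumptions carried over from the Round class in the question, the scores are made up, and players are referenced by name purely for illustration.

```python
import json

# Each round is a plain dict, so the whole list is JSON-serializable.
rounds = [
    {"round_number": 1, "points_team_a": 70, "points_team_b": 30,
     "finished_first": "alice", "tichu_calls": ["alice"]},
    {"round_number": 2, "points_team_a": 200, "points_team_b": 0,
     "finished_first": "bob", "tichu_calls": []},
]

# Round-trip through JSON, roughly what happens on write and read.
stored = json.dumps(rounds)
restored = json.loads(stored)

total_a = sum(r["points_team_a"] for r in restored)
total_b = sum(r["points_team_b"] for r in restored)
print("Team A: %d, Team B: %d" % (total_a, total_b))
```

Unlike a pickle blob, the stored value stays readable outside Python and does not break when the Round class is renamed or moved.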

    Bottom line: normalizing is old-school RDB stuff.
