Database design rules to follow for a programmer

We are working on a mapping application that uses Google Maps API to display points on a map. All points are currently fetched from a MySQL database (holding some 5M + records). Currently all entities are stored in separate tables with attributes representing individual properties.

This presents following problems:

  • Every time there's a new property we have to make changes in the database, application code and the front-end. This is all fine but some properties have to be added for all entities so that's when it becomes a nightmare to go through 50+ different tables and add new properties.

  • There's no way to find all entities which share any given property eg no way to find all schools/colleges or universities that have a geography dept (without querying schools,uni's and colleges separately).

  • Removing a property is equally painful.

  • No standards for defining properties in individual tables. Same property can exist with different name or data type in another table.

  • No way to link or group points based on their properties (somehow related to point 2).

  • We are thinking to redesign the whole database but without DBA's help and lack of professional DB design experience we are really struggling.

    Another problem we're facing with the new design is that there are lot of shared attributes/properties between entities.

    For example:

    An entity called " university " has 100+ attributes. Other entities (eg hospitals,banks,etc) share quite a few attributes with universities for example atm machines, parking, cafeteria etc etc.

    We dont really want to have properties in separate table [and then linking them back to entities w/ foreign keys] as it will require us adding/removing manually. Also generalizing properties will results in groups containing 50+ attributes. Not all records (ie entities) require those properties.

    So with keeping that in mind here's what we are thinking about the new design:

  • Have separate tables for each entity containing some basic info eg id,name,etc etc.

  • Have 2 tables attribute type and attribute to store properties information.

  • Link each entity (or a table if you like) to attribute using a many-to-many relation.

  • Store addresses in different table called addresses link entities via foreign keys.

  • We think this will allow us to be more flexible when adding, removing or querying on attributes.

    This design, however, will result in increased number of joins when fetching data egto display all " attributes " for a given university we might have a query with 20+ joins to fetch all related attributes in a single row.

    We desperately need to know some opinions or possible flaws in this design approach.

    Thanks for your time.


    In trying to generalize your question without more specific examples, it's hard to truly critique your approach. If you'd like some more in depth analysis, try whipping up an ER diagram.

    If your data model is changing so much that you're constantly adding/removing properties and many of these properties overlap, you might be better off using EAV.

    Otherwise, if you want to maintain a relational approach but are finding a lot of overlap with properties, you can analyze the entities and look for abstractions that link to them.

    Ex) My Db has Puppies, Kittens, and Walruses all with a hasFur and furColor attribute. Remove those attributes from the 3 tables and create a FurryAnimal table that links to each of those 3.

    Of course, the simplest answer is to not touch the data model. Instead, create Views on the underlying tables that you can use to address (5), (4) and (2)


    1 cannot be an issue. There is one place where your objects are defined. Everything else is generated/derived from that. Just refactor your code until this is the case.

    2 is solved by having a metamodel, where you describe which properties are where. This is probably needed for 1 too.

    You might want to totally avoid the problem by programming this in Smalltalk with Seaside on a Gemstone object oriented database. Then you can just have objects with collections and don't need so many joins.

    链接地址: http://www.djcxy.com/p/70620.html

    上一篇: 基于帖子类型的数据库结构

    下一篇: 数据库设计规则要遵循的程序员