What's the more idiomatic option in Datomic land for this schema?
I have a question regarding what's a more idiomatic schema for Datomic.
Let's say we have the entities User, Post and Topic. Post can belong to Topic, User and other Post (replies). Now, should I,
a) Create a :posts attribute, that is just a list of Posts, and inject that into every entity that requires reference to a number of Posts?
or
b) Establish more explicit relationships, such that a Post has a :post/author attribute that is a ref to a User, and perhaps a :post/belongs-to attribute that can refer to either a Topic or another Post?
Observations: If I do b, I seem to get more semantic relationships. I can, for example, do (:post/_author user-entity), which is more descriptive of the nature of their relationship than (:posts user-entity) (since, what does it mean that a User has :posts? Are those the User's favorited Posts, authored Posts, or what?)
Another side effect of b is that I can create a new Post without mutating any other entity. If I do a, I need to create the Post and also insert it into the :posts attribute of User, requiring two operations instead of one.
However, I have a feeling that a might be the more idiomatic way of doing it. It seems, for example, that it would be easier to see how the list in the :posts attribute has changed over time, should I want to do that, if User references :posts rather than having Post reference User through the :post/author attribute.
What would be preferable, and why?
Your option (b) is essentially the idiomatic and only way to go in Datomic.
A Datomic schema only codifies what values an attribute can take within the entity-attribute-value (EAV) structure of a datom.
See http://docs.datomic.com/schema.html - a key proposition taken from the docs is:
Each Datomic database has a schema that describes the set of attributes that can be associated with entities. A schema only defines the characteristics of the attributes themselves. It does not define which attributes can be associated with which entities.
Entities themselves are highly abstract (internally they are just numbers); all the interesting properties of entities are codified as attribute-value assertions. Entities are not typed! You create the semantics of an entity by the attributes you assert for it, such as :user/firstname, :post/title, :post/content, :topic/description etc. This is why you really want to namespace the attributes.
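To make the "entities are not typed" point concrete, here is a toy in-memory model in Python. It is not the Datomic API, only a sketch: datoms are plain (entity, attribute, value) tuples, the entity ids and values are made-up sample data, and the helper attribute_namespaces is hypothetical. The "kind" of an entity is inferred purely from the namespaces of the attributes asserted on it.

```python
# Toy model of EAV datoms; this is NOT the Datomic API, only an illustration.
# Entity ids (1, 2, 3) and the attribute values are made-up sample data.
datoms = [
    (1, ":user/firstname", "Ada"),
    (2, ":post/title", "Hello"),
    (2, ":post/content", "First post"),
    (3, ":topic/description", "General chat"),
]


def attribute_namespaces(datoms, entity):
    """Return the set of attribute namespaces asserted for an entity."""
    return {attr[1:].split("/")[0] for e, attr, _ in datoms if e == entity}


# Nothing declares entity 1 to be a "user"; it only carries :user/... attributes.
user_kind = attribute_namespaces(datoms, 1)
post_kind = attribute_namespaces(datoms, 2)
```

Nothing stops the same entity from carrying attributes from several namespaces, which is exactly the flexibility (and the reason for namespacing) described above.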
A special case of this is the attribute type :db.type/ref, where the value "V" in EAV is itself another entity. This is what creates semantic associations between entities. You give each attribute a "name" (as a :db/ident) to capture what the E<->E connection actually means. So you could have an attribute of type :db.type/ref with the :db/ident ":post/author".
Note that all :db.type/ref attributes are inherently bidirectional, so if Eu is an entity representing a user and Ep is an entity representing a post, then the following are equivalent both in datom creation and query:
[Ep :post/author Eu]
[Eu :post/_author Ep]
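As a rough illustration of why one ref attribute serves both directions, here is a toy Python model (again not the Datomic API; the entity ids and the helpers forward and reverse are made up for this sketch). A single stored (E, A, V) tuple can be read from the post to its author, or from the user back to the posts, which is the role the underscore form :post/_author plays.

```python
# Toy model, not the Datomic API: one stored datom answers both directions.
# Entity ids are made-up sample data (10-12 are posts, 1-2 are users).
datoms = [
    (10, ":post/author", 1),  # post 10 was written by user 1
    (11, ":post/author", 1),
    (12, ":post/author", 2),
]


def forward(datoms, entity, attr):
    """Values V where (entity, attr, V) is asserted, e.g. a post's author."""
    return [v for e, a, v in datoms if e == entity and a == attr]


def reverse(datoms, entity, attr):
    """Entities E where (E, attr, entity) is asserted, like :post/_author."""
    return [e for e, a, v in datoms if v == entity and a == attr]


author_of_10 = forward(datoms, 10, ":post/author")
posts_by_1 = reverse(datoms, 1, ":post/author")
```

No separate :posts collection on the user is needed; the reverse read is derived from the same facts.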
All entity relationships being just more attribute assertions is really flexible. If you later want to add the concept of favorite posts, it is just another attribute of type :db.type/ref. Create it with a :db/ident such as ":user/favorites" and assert connections between preexisting users and posts (which have different user entities as authors).
[aUser :user/favorites somePost]
There is no notion of list-valued attributes (a :db.cardinality/many attribute holds an unordered set of values, not an ordered list), so what you suggest in (a) is not properly expressible in Datomic. You would use a query to aggregate the posts. Post deletion would be modeled by a retraction of the entity itself. Such a retracted post will remain visible in the database history.
This does create a challenge of how to specify an order for lists of entities. You either need to use a "natural" ordering such as the date of a post (captured either in the datomic transaction or as an explicit attribute of the post) or use explicit attribute-value based ordering such as via a :post/up-votes numeric attribute.
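The aggregation, ordering and retraction points can be sketched with a toy append-only log in Python. This is not the Datomic API: the (entity, attribute, value, tx, added) tuple layout, the transaction numbers and the helpers current and posts_by_votes are assumptions made for illustration only.

```python
# Toy append-only log, not the Datomic API.
# Each fact is (entity, attribute, value, tx, added); all values are sample data.
log = [
    (10, ":post/author", 1, 100, True),
    (10, ":post/up-votes", 5, 100, True),
    (11, ":post/author", 1, 101, True),
    (11, ":post/up-votes", 9, 101, True),
    (10, ":post/author", 1, 102, False),   # tx 102 retracts post 10
    (10, ":post/up-votes", 5, 102, False),
]


def current(log):
    """Replay assertions and retractions; return the facts still asserted."""
    live = set()
    for e, a, v, _tx, added in log:
        if added:
            live.add((e, a, v))
        else:
            live.discard((e, a, v))
    return live


def posts_by_votes(facts, user):
    """Aggregate a user's posts by query, ordered by :post/up-votes descending."""
    authored = {e for e, a, v in facts if a == ":post/author" and v == user}
    votes = {e: v for e, a, v in facts if a == ":post/up-votes"}
    return sorted(authored, key=lambda e: votes.get(e, 0), reverse=True)


before_retraction = posts_by_votes(current(log[:4]), 1)
after_retraction = posts_by_votes(current(log), 1)
# The full log still mentions the retracted post, much like a history view.
ever_authored = sorted({e for e, a, _v, _tx, _added in log if a == ":post/author"})
```

The order comes from an explicit attribute value rather than from the position in a list, and the retracted post disappears from the current view while staying visible in the log.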
If you need semantic grouping of entities, where "sub-entities" are only meaningful and only exist as part of something bigger (e.g. the line item entities in an order), then see Datomic components.
I think it depends mostly on your access patterns. If every time you access an entity that can embed its related posts you do need those posts, it makes sense to embed them ( a ). If most of the time you access them separately, then separating them may be better ( b ).
Or you could do both (c), by considering the separate Post entity to be the canonical one, and the copies embedded in the various entities to be cached versions. This way you need a script/batch job that updates the embedded posts every time the canonical version is updated. This makes all reads easier since the information is always present, but writes become more complicated since you need to keep the copies synchronised yourself. Also, this pattern is usable only if you can accept some inconsistency between the canonical version and the embedded ones, and the resynchronisation delay isn't critical for you.
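A minimal Python sketch of option (c), assuming two plain dictionaries as hypothetical stores (canonical_posts and users) and made-up helpers update_post and resync. It has nothing Datomic-specific in it; it only shows the stale window between a write and the next resynchronisation.

```python
# Hypothetical stores: canonical posts plus cached copies embedded in each user.
canonical_posts = {10: {"title": "Hello", "author": 1}}
users = {1: {"name": "Ada", "posts": {10: {"title": "Hello", "author": 1}}}}


def update_post(post_id, **changes):
    """Write path: only the canonical copy changes; embedded copies go stale."""
    canonical_posts[post_id].update(changes)


def resync(users, canonical_posts):
    """Batch step: overwrite every embedded copy with the canonical version."""
    for user in users.values():
        for post_id in user["posts"]:
            user["posts"][post_id] = dict(canonical_posts[post_id])


update_post(10, title="Hello, world")
stale_title = users[1]["posts"][10]["title"]   # read before the batch runs
resync(users, canonical_posts)
fresh_title = users[1]["posts"][10]["title"]   # read after the batch runs
```

Reads of the embedded copy are cheap, but between update_post and resync they return the old title, which is the inconsistency you have to be willing to accept.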
Note: this advice is not specific to Datomic; these are techniques borrowed from the NoSQL world, and I am by no means a specialist.