I was recently playing around with a database for a hobby project, and wanted to design a database schema for it.
Lots of looking about, and stumbled upon
(TAO) — The Object Association.
There is actually a lot in there, but the main moment was the
Lets go from a relational model to a graph model!
… and turns out a lot has changed in the last 7~ years since I’ve last touched MySQL. It
now supports JSON data types, which made this journey so much
This is it here, just 2 tables for everything 🎉
# The nodes or entities in the system
CREATE TABLE object
id char(16) NOT NULL default (lower(hex(random_bytes(8)))),
otype int NOT NULL,
created timestamp default current_timestamp,
updated timestamp default current_timestamp on update current_timestamp,
PRIMARY KEY (id)
) ENGINE = InnoDB;
# and the relationship between them
CREATE TABLE assoc
id1 char(16) NOT NULL,
atype int NOT NULL,
id2 char(16) NOT NULL,
time timestamp default current_timestamp,
PRIMARY KEY (id1, atype, id2)
) ENGINE = InnoDB;
I picked an
int here to store the type, as it could be an index into an application tier enum. Cheap to store, and
But what about joins?
Joins happen at the application level, not the database level. Typically you ask the database to join say the users
and the comments tables and return a subset of data, this is not what is happening here. Far easier to apply access or
privacy policies in application tier (as that is where that context lives), than in the queries.
It just makes caching so much easier as you can just cache a whole row. No need to juggle cache keys on a coarse
selection of fields.
💡 It’s easier to cache a whole row, than to cache a selection of fields.
Some patterns of working with this data model, because does take some getting used too.
Typically a query is something like;
JOIN <table> ON <condition>, the graph model this is now written as
JOIN <table> WHERE <condition>.
Get any object by it’s id
SELECT id, type, data FROM object WHERE id = ?
Validate that a relationship exists
SELECT true FROM assoc WHERE type = ? AND id1 = ? AND id2 = ?
# eg: SELECT true FROM assoc WHERE type = 'AUTHORED' AND ...
# see if a user has a authored a post
Get a connection of nodes
Something like getting all blog posts by user,
id2 here being the object ID (remember join at the application level).
SELECT id2 FROM assoc WHERE type = ? AND id1 = ?
…and that’s it. You only really need to get a node, check for the existence of nodes, and get a collection of
nodes. The rest is built up abstractions in the application tier.