Persistent Objects Metadata

26 Jan 2009

If I were to reach for a handy analogy, I would say that metadata is to persistent objects what interfaces are to object-oriented programming. More or less. Persistent Objects Metadata lets you spread common functionality (as described by an interface) across different schemas (classes). There is a variety of scenarios where metadata is useful, hence I tend to treat this technique as a design pattern. (It is a different one from Metadata Mapping by Martin Fowler.) Areas in which I have successfully used this pattern are: searching across several incompatible table schemas, logging access to various objects, and building a permission system for arbitrary objects. In short, the technique puts a layer of common functionality on top of a set of incompatible types.

Before we begin, there are prerequisites. There is a huge chasm between the object-oriented and relational worlds, so a bridge (an ORM) is needed to map objects to records and back again. It is important to stress that Persistent Objects Metadata builds relations between objects without using standard relational integrity checks, therefore you can no longer rely on the database to restrict or cascade deletes and updates. Here is the core interface for dealing with metadata-attached objects:

interface IDistinctive {

	public function getDistinctiveId();

	public static function instantiate($id);

}
A class must implement this essential interface to use the Persistent Objects Metadata technique.

Example scenario

One practical example I can show is how to implement a logging system that works for incompatible objects. Imagine a set of classes representing persistent objects such as Video, Photo, and Song. Assume that the db schema is utterly incompatible between the three. Also assume that two common operations are required: 1. log plays or views, 2. increase the plays/views count. One additional db table is needed, for storing metadata only.

CREATE TABLE videos (
	video_id SERIAL PRIMARY KEY,
	short_desc TEXT,
	long_desc TEXT,
	file TEXT,
	views_count INT
);
CREATE TABLE photos (
	id SERIAL PRIMARY KEY,
	name TEXT,
	tags TEXT,
	album_id INT,
	views INT
);
CREATE TABLE songs (
	sid SERIAL PRIMARY KEY,
	title TEXT,
	plays_count INT
);
CREATE TABLE access_log (
	distinctive_id INT,
	class_name TEXT,
	accessed_on TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
	ip_addr VARCHAR(15)
);
Three incompatible table schemas and the metadata storage table (access_log).
Not-null constraints and integrity checks omitted for clarity.

Examining the last table might give you an idea of how metadata records are bound to objects: the class name and distinctive identifier are stored in the database (think of the pair as a composite pointer to an object), so you can store and retrieve a persistent object and its metadata independently. This is done via the interface defined previously. See the example implementation for the Video class.

class Video extends SomeFancyORMEntity implements IDistinctive {

	public function getDistinctiveId() {
		return $this->video_id;
	}

	public static function instantiate($id) {
		return Video::findById($id);
	}

}

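And now the logger implementation. All required information is provided by the IDistinctive interface and the handy get_class function. See this example, which assumes a PDO connection in place of whatever query API your ORM offers:

class Logger {

	private $db;

	// Assumption: $db is a PDO connection to the database holding access_log
	public function __construct(PDO $db) {
		$this->db = $db;
	}

	public function getLastAccessTime(IDistinctive $o, $ip) {
		$stmt = $this->db->prepare(
			'SELECT max(accessed_on) AS last_access FROM access_log
			WHERE class_name = ? AND distinctive_id = ? AND ip_addr = ?'
		);
		$stmt->execute(array(get_class($o), $o->getDistinctiveId(), $ip));
		// NULL when this object was never accessed from $ip
		return $stmt->fetchColumn();
	}

	public function log(IDistinctive $o, $ip) {
		$stmt = $this->db->prepare(
			'INSERT INTO access_log (class_name, distinctive_id, ip_addr)
			VALUES (?, ?, ?)'
		);
		$stmt->execute(array(get_class($o), $o->getDistinctiveId(), $ip));
	}

}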
Adding metadata to incompatible objects is now possible thanks to the IDistinctive interface.

By providing class names and unique identifiers via the IDistinctive interface, it is now possible to add metadata to persistent objects of different classes and db schemas. In fact, the metadata can be stored in a different database or on a different server. Or, if you like technology at its extreme, persistent objects could be kept in XML. Not bad, eh?

With a working logger that can check the last access, we can easily detect multiple hits from the same IP and decide whether to increase the views/plays count or not. (Just to mention: it is not the best idea to rely solely on the IP address to detect false hits.)
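The decision itself can be sketched as a tiny function. This is only a minimal sketch under assumptions of mine: the name shouldCountHit, the 30-minute window, and passing times as Unix timestamps (with null meaning "never seen") are all hypothetical choices, not part of the logger shown earlier.

```php
<?php
// Hypothetical helper: decide whether a hit should bump the views/plays counter.
// $lastAccess: Unix timestamp of the previous hit from this IP, or null if none.
// $now:        Unix timestamp of the current hit.
// $window:     seconds during which repeated hits are ignored (assumed: 30 minutes).
function shouldCountHit($lastAccess, $now, $window = 1800) {
	if ($lastAccess === null) {
		return true; // first hit from this IP
	}
	return ($now - $lastAccess) >= $window;
}

$first  = shouldCountHit(null, 1000);  // no earlier hit
$repeat = shouldCountHit(1000, 1600);  // 10 minutes after the previous hit
$later  = shouldCountHit(1000, 2800);  // exactly 30 minutes after the previous hit
```

Note the call order this implies: ask for the last access time first, and only then log the current hit, otherwise the hit you have just logged is always the most recent one.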

More goodies

The log is just an append-only data store, but with a historical record of all views/plays you can easily report, say, the last 10 multimedia files accessed. Just improve the Logger class slightly by adding this method:

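	public function getLastItems($n) {
		// $this->db is assumed to be the Logger's PDO connection
		$stmt = $this->db->prepare(
			'SELECT class_name, distinctive_id FROM access_log
			ORDER BY accessed_on DESC LIMIT ?'
		);
		$stmt->bindValue(1, (int) $n, PDO::PARAM_INT);
		$stmt->execute();
		$items = array();
		foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
			$cn = $row['class_name'];
			$items[] = $cn::instantiate($row['distinctive_id']);
		}
		return $items;
	}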

Disadvantages

The alert reader will immediately spot two problems here. One (minor) is that PHP before 5.3 cannot use a variable as the class name in a static call. The other, much more serious one, is the sequential instantiation of persistent objects, which is terribly slow for large datasets. There are ugly ways to work around these problems. To squash variables into static calls, an eval expression can be employed. It looks really ugly but at least does the job:

eval('$o = ' . $className . "::instantiate($distinctiveId);");
return $o;
Workaround for making a static call on a class named by a variable.
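A less ugly route than eval is PHP's built-in call_user_func, which accepts a class name and method name as an array. The sketch below is an illustration only: the helper name instantiateByName and the stand-in Song class are hypothetical, and the guard that rejects non-IDistinctive class names is my own addition, since class names read back from a table are best not trusted blindly.

```php
<?php
interface IDistinctive {
	public function getDistinctiveId();
	public static function instantiate($id);
}

// Stand-in for a persistent class; a real one would look the record up via the ORM.
class Song implements IDistinctive {
	private $sid;
	public function __construct($sid) { $this->sid = $sid; }
	public function getDistinctiveId() { return $this->sid; }
	public static function instantiate($id) { return new Song($id); }
}

// Hypothetical helper: resolve a (class name, id) pair without eval.
function instantiateByName($className, $distinctiveId) {
	// Refuse anything that is not a known class implementing IDistinctive.
	if (!class_exists($className)
		|| !in_array('IDistinctive', class_implements($className))) {
		throw new InvalidArgumentException("Not an IDistinctive class: $className");
	}
	// Static call on a class whose name is only known at run time.
	return call_user_func(array($className, 'instantiate'), $distinctiveId);
}

$o = instantiateByName('Song', 1829);
```

Because nothing is assembled into a string of code, neither the class name nor the identifier can smuggle extra statements in, which is the main risk of the eval version.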

With a little bit of iterative sorcery, the number of queries can be brought down from one per object to one per distinct class in the result set. This equates to turning this:

SELECT * FROM videos WHERE video_id = 3653;
SELECT * FROM videos WHERE video_id = 824;
SELECT * FROM videos WHERE video_id = 5004;
SELECT * FROM videos WHERE video_id = 19077;
SELECT * FROM songs WHERE sid = 1829;
SELECT * FROM songs WHERE sid = 5141;
SELECT * FROM photos WHERE id = 8923;
SELECT * FROM photos WHERE id = 844;

…into that:

SELECT * FROM videos WHERE video_id IN (3653, 824, 5004, 19077);
SELECT * FROM songs WHERE sid IN (1829, 5141);
SELECT * FROM photos WHERE id IN (8923, 844);

…which is much better. To achieve this your ORM library must support arbitrary conditions, which it more than likely does. Splitting and grouping identifiers by class name is easily done:

$groups = array();
foreach ($results as $record) {
	$cn = $record['class_name'];
	$id = $record['distinctive_id'];
	// PHP creates $groups[$cn] on the first append, so no existence check is needed
	$groups[$cn][] = $id;
}

…and the last thing you need to take care of is restoring the original order in which the records were returned, which I will spare myself implementing here ;-)
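For the curious, here is one way the whole round trip (group, load in batches, restore order) might look. It is a sketch under stated assumptions, not a finished implementation: loadInOriginalOrder and fakeLoader are hypothetical names, each loader stands in for a single "WHERE id IN (...)" query issued through your ORM and is assumed to return an array keyed by identifier, and the closures require PHP 5.3 or newer.

```php
<?php
// Hypothetical helper: turn ordered log rows into objects, one batch per class.
// $rows:    list of array('class_name' => ..., 'distinctive_id' => ...), in display order.
// $loaders: map of class name => callable that takes a list of ids and returns
//           array(id => object); each call stands for one batched query.
function loadInOriginalOrder(array $rows, array $loaders) {
	// 1. Split and group identifiers by class name.
	$groups = array();
	foreach ($rows as $row) {
		$groups[$row['class_name']][] = $row['distinctive_id'];
	}
	// 2. One batched load per class instead of one per object.
	$loaded = array();
	foreach ($groups as $cn => $ids) {
		$loaded[$cn] = call_user_func($loaders[$cn], array_unique($ids));
	}
	// 3. Walk the rows again to restore the original order.
	$items = array();
	foreach ($rows as $row) {
		$cn = $row['class_name'];
		$id = $row['distinctive_id'];
		if (isset($loaded[$cn][$id])) { // skip objects deleted since they were logged
			$items[] = $loaded[$cn][$id];
		}
	}
	return $items;
}

// Fake loader standing in for a batched ORM query; returns labels instead of objects.
function fakeLoader($prefix) {
	return function (array $ids) use ($prefix) {
		$out = array();
		foreach ($ids as $id) {
			$out[$id] = "$prefix#$id";
		}
		return $out;
	};
}

$rows = array(
	array('class_name' => 'Video', 'distinctive_id' => 3653),
	array('class_name' => 'Song',  'distinctive_id' => 1829),
	array('class_name' => 'Video', 'distinctive_id' => 824),
	array('class_name' => 'Photo', 'distinctive_id' => 8923),
);
$items = loadInOriginalOrder($rows, array(
	'Video' => fakeLoader('Video'),
	'Song'  => fakeLoader('Song'),
	'Photo' => fakeLoader('Photo'),
));
```

The isset check in step 3 matters precisely because the metadata table has no foreign keys: a log row may point at an object that no longer exists.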

Final remarks

As you can see, Persistent Objects Metadata is a quick way of delivering common functionality to a variety of incompatible persistent objects. A service class (like the logger from the example above) implements that functionality, relying on metadata storage combined with a link (composite pointer) to the persistent object. All of the applications I mentioned at the beginning (logging, searching, access control) are present in my current project at work, and they show their power as the system grows and both the amount and the variety of data increase. The permission system is particularly interesting, for it maps one persistent object to another (source_id, source_class, target_id, target_class, access_type), and same-class links can represent a hierarchical structure (e.g. user 456 reports to user 234). Metadata introduces a performance overhead, but with a little coding skill and common sense it can be overcome.
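To make the permission tuple a bit more tangible, here is a toy illustration with the table replaced by a plain array. Everything beyond the five column names is an assumption for the sake of the example: the hasAccess function, the User class name, and the access_type values 'edit' and 'reports_to' are hypothetical.

```php
<?php
// In-memory stand-in for a permissions table; each row links two composite pointers.
$permissions = array(
	array('source_class' => 'User', 'source_id' => 456,
	      'target_class' => 'Video', 'target_id' => 3653, 'access_type' => 'edit'),
	// Same-class link: user 456 reports to user 234.
	array('source_class' => 'User', 'source_id' => 456,
	      'target_class' => 'User', 'target_id' => 234, 'access_type' => 'reports_to'),
);

// Hypothetical check: does the source object hold $type on the target object?
function hasAccess(array $permissions, $srcClass, $srcId, $tgtClass, $tgtId, $type) {
	foreach ($permissions as $p) {
		if ($p['source_class'] === $srcClass && $p['source_id'] === $srcId
			&& $p['target_class'] === $tgtClass && $p['target_id'] === $tgtId
			&& $p['access_type'] === $type) {
			return true;
		}
	}
	return false;
}

$canEdit   = hasAccess($permissions, 'User', 456, 'Video', 3653, 'edit');
$reportsTo = hasAccess($permissions, 'User', 456, 'User', 234, 'reports_to');
```

In a real service class the four class/id arguments would come from two IDistinctive objects via get_class and getDistinctiveId, exactly as in the logger.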
