The Agent, The Tool and The Act

Reality bites in different ways, almost always bringing up some edge case. In today's world of mobile apps, there are lots of events, and one needs to track referrers (which traffic source led to the install of an app); which user is logged on, if any; which physical device is used; which version / strain of the app was involved in the event; etc. Things are complicated by the fact that basically any of these can change. The app may be uninstalled and reinstalled, as the same or a different strain / version. The user may log out and log in, on the same or a different device. The naïve 1-...-1 correspondence is often broken.

If one tries to track all of the variables involved, the complexity rises exponentially. If the code has grown in an incremental, evolutionary manner, following "Do The Simplest Thing That Could Possibly Work" (a great idea that should be stuck to), it probably started with 1-...-1 correspondences anyway, and then had to be adapted (for better or worse) to the edge cases that break them.

To untangle the resulting complexity, this article proposes a methodology. It tries to bring some order to these scenarios by defining three abstract data entities and building the domain model around them.

These three types of data are: The Agent, The Tool and The Act. The first two are primitive building blocks; the third is the relationship between them.

The Agent is the entity that holds the agency. It is more colloquially known as the "user", but that word is already used in so many libraries, frameworks etc. that a less overloaded one is used here. The essence is indeed "having the agency", that is, being the sentient being responsible for starting and stopping tasks / events. In normal cases, it represents the "user as we know it", e.g. an account logged on via an identity provider.

The Tool is the entity that The Agent uses to perform actions, and only The Tool (not The Agent) can communicate in an information network. The Tool thus combines all the data necessary to identify the hw / sw / net artifacts used by The Agent. An example of one instance of The Tool is "the mobile app foo, driven by referrer bar, installed on device baz and communicating on channel quux on provider quib". In general, The Tool is immutable: "foo", "bar" etc. are never changed in the data entry representing the tool; instead, if a new combination is needed, a new instance of The Tool is created.
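The immutability of The Tool can be sketched in a few lines of Python. This is only an illustration: the field names (app, referrer, device, channel, provider) are assumptions lifted from the foo / bar / baz example, and a real tool may combine different artifacts.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tool:
    """One immutable combination of hw / sw / net artifacts (hypothetical fields)."""
    app: str       # e.g. "foo"
    referrer: str  # e.g. "bar"
    device: str    # e.g. "baz"
    channel: str   # e.g. "quux"
    provider: str  # e.g. "quib"


tool = Tool(app="foo", referrer="bar", device="baz", channel="quux", provider="quib")

# The same app showing up on another device does not modify the existing
# entry; it yields a new instance of The Tool, and the old one stays as it was.
reinstalled = replace(tool, device="baz2")
```

Because the dataclass is frozen, an attempt to assign to `tool.device` raises an error, so the only way to get a different combination is to create another instance.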

The third piece is The Act. In essence, it is just an [Agent, Tool] pair. Whenever some agent starts to use some tool (a user logs in to the app), an Act should be created for the pair. It is in fact the policies for working with The Act that tame the complexity according to the needs of the project. You may want to create and retain every single act (in which case there may be multiple acts for the same [Agent, Tool] pair, and The Act needs its own key). You may decide that the [Agent, Tool] pair is the composite key, so that there is always at most one act for a given pair. You may go more user-centric and decide there is only one act per user, or device-centric and decide there is only one act per device (yes, you could then just have used a foreign key, but still). Or you may have some more elaborate policy on how to treat acts. The bottom line is that all these "data policy" and "relationship of data" questions more or less boil down to how you treat your acts; the agents as well as the tools will need minimal changes, if any, once you find out what they represent in your case.
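One way to picture these policies is as a choice of key under which at most one act is kept. The following is a minimal sketch under several assumptions: `ActStore`, `POLICIES` and the id fields are hypothetical names, a newer act replaces an older one with the same key (you might prefer to keep the older one), and the device-centric policy is simplified to key on the whole tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List


@dataclass(frozen=True)
class Act:
    act_id: int    # The Act's own key, needed when every act is retained
    agent_id: str  # reference to The Agent
    tool_id: str   # reference to The Tool


# A policy maps an act to the key under which at most one act is kept.
Policy = Callable[[Act], Hashable]

POLICIES: Dict[str, Policy] = {
    "retain_all": lambda act: act.act_id,                    # every act is kept
    "one_per_pair": lambda act: (act.agent_id, act.tool_id),  # composite key
    "one_per_agent": lambda act: act.agent_id,                # user-centric
    "one_per_tool": lambda act: act.tool_id,                  # device-centric (simplified)
}


class ActStore:
    def __init__(self, policy: Policy) -> None:
        self._policy = policy
        self._acts: Dict[Hashable, Act] = {}
        self._next_id = 1

    def record(self, agent_id: str, tool_id: str) -> Act:
        """Called whenever an agent starts to use a tool (e.g. on login)."""
        act = Act(self._next_id, agent_id, tool_id)
        self._next_id += 1
        # Assumption: a newer act replaces an older one stored under the same key.
        self._acts[self._policy(act)] = act
        return act

    def acts(self) -> List[Act]:
        return list(self._acts.values())


# The same three logins, stored under each policy in turn.
logins = [("alice", "phone"), ("alice", "tablet"), ("alice", "phone")]
counts = {}
for name, policy in POLICIES.items():
    store = ActStore(policy)
    for agent_id, tool_id in logins:
        store.record(agent_id, tool_id)
    counts[name] = len(store.acts())
```

Note that `Act` is the only thing whose handling changes between the policies; whatever represents the agents and the tools is left alone.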

Any living app produces lots of event data, and The Tool, The Agent and The Act should be used to build those events. Depending on the type of the event, you may specify who did it (The Agent), what did it (The Tool), both (easy, just use two foreign keys) or, in case you retain multiple acts per agent-tool pair, the historic context (The Act itself). The number of possible combinations is low, though: just a few options to select from for each type of event. This makes the analytics done later on the events easier. When you know you can only have agent-events, tool-events, pair-events or session-events (the ones that refer to The Act itself), there are only finitely many ways to work with them. You keep the possible complexity down and, with such a trivial set of acts, you can always look them up to fill in missing pieces (e.g. we need to communicate with the agent, so we find their last act and send the message to the tool they used; there are other ad-hoc examples like this).
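The four event shapes and the "find the last act" lookup might be sketched as follows. All names here (`Event`, `shape`, `last_tool_of`, the id fields) are hypothetical, and the sketch assumes the acts are held in creation order and that an event carrying an act reference counts as a session-event.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Act:
    act_id: int
    agent_id: str
    tool_id: str


@dataclass(frozen=True)
class Event:
    kind: str
    agent_id: Optional[str] = None  # who did it
    tool_id: Optional[str] = None   # what did it
    act_id: Optional[int] = None    # historic context, if acts are retained

    def shape(self) -> str:
        """Classify the event by which references it carries."""
        if self.act_id is not None:
            return "session-event"
        if self.agent_id is not None and self.tool_id is not None:
            return "pair-event"
        if self.agent_id is not None:
            return "agent-event"
        if self.tool_id is not None:
            return "tool-event"
        raise ValueError("an event must refer to an agent, a tool or an act")


def last_tool_of(agent_id: str, acts: List[Act]) -> Optional[str]:
    """Return the tool of the agent's most recent act, or None if there is none."""
    for act in reversed(acts):  # assumes acts are in creation order
        if act.agent_id == agent_id:
            return act.tool_id
    return None


acts = [Act(1, "alice", "phone"), Act(2, "bob", "laptop"), Act(3, "alice", "tablet")]
events = [
    Event("purchase", agent_id="alice"),
    Event("install", tool_id="phone"),
    Event("login", agent_id="alice", tool_id="tablet"),
    Event("page_view", act_id=3),
]
shapes = [event.shape() for event in events]

# To reach alice, look up the tool she used in her last act.
target = last_tool_of("alice", acts)
```

With only these four shapes to handle, analytics code can branch on `shape()` instead of guessing which combination of identifiers an event happens to carry.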

To recap: to tame the complexity of tracking multiple strains of data in today's shifting world, split your data to represent The Agent (the one having agency), The Tool (used to perform actions and communicate) and The Act (the pairing of the two), and refer only to these in your events.