TL; DR;
Coming up with a (fairly) efficient algorithm to organise Accounts into hierarchies in Apex and update fields in those hierarchies accurately.
🔗 Shortcut to code
The Problem
There are Account fields that need to be propagated down the hierarchy when they change (but not up). Only the changed fields need to be updated in the child Accounts. When a new account is added to a hierarchy it should have all the relevant fields updated according to its new parent. Each child Account also holds a reference to the top Account in its hierarchy.
Solution
If we limit this to dealing with only one Account at the time it’s fairly simple: find all the child accounts of the changed one using a recursive findAllChildren method and update them. But when multiple accounts are updated at the same time there are many more challenges coming our way. The changed accounts may belong to overlapping hierarchies. There may be one field updated at the top level, but another field changed a few levels down. We can even have a reparented Account in the same transaction.
I’ve decided to solve this by building a hierarchy structure in memory and using that to determine which updates to keep and which to throw away. Straight away I knew that I’d have to tread carefully as traversing a tree structure repeatedly was going to be quite expensive.
Part 1 – Building the Hierarchy
This was a fun exercise. I start with a list of Accounts and based on the Id and ParentId organise them into tree structures representing their hierarchy positions. I don’t know if all the accounts are related in a single hierarchy or if they are all completely separate. It must not matter.
A class AccountHierarchyMember
that holds the Account and a List of the same that represent its immediate children is the starting point. Those get converted to a list of AccountHierarchyMember instances and then during a Bubble-sort-like iteration they are joined together into hierarchies.
Compare item 0 to 1, 0 to 2,…, 0 to N, 1 to 2, …, N-1 to N, each time the two items can be joined merge them and go back to start. Items can be merged if one’s parent account is a child account of any of the other’s accounts and vice versa
This sounds simple, but it’s actually quite inefficient unless we make sure we don’t just keep looking at all the accounts in the respective items’ lists of children. That’s a big no-no in the CPU limited Apex transaction.
Do you know this Id?
First of all I need to be able to efficiently tell if an Account can be joined into an existing hierarchy and add it. To help with that there is a member Set<Id> allIds
in the AccountHierarchyMember
class and a method contains(Id newMemberId)
to check against it. Another method called addChild(AccountHierarchyMember newMember)
adds the full allIds
set of the new-comer to the current node. Then it checks if the current node is the new-comer top Account’s parent. If it is then it’s just added to the list. If not, the same addChild
method gets called on each of the children until it finds the right place. This ensures each node in the hierarchy has an accurate allIds
set of its own Id and Ids of all its children and the addChild
method is only called on a hierarchy branch that will eventually be a match (I never go down the tree to a dead end).
public Boolean addChild(AccountHierarchyMember newChildAccountHierarchyMember) {
if (
newChildAccountHierarchyMember.accountRecord.ParentId == null ||
!this.contains(newChildAccountHierarchyMember.accountRecord.ParentId)
) {
return false;
}
this.allIds.addAll(newChildAccountHierarchyMember.allIds);
if (this.isDirectParentTo(newChildAccountHierarchyMember)) {
this.children.add(newChildAccountHierarchyMember);
return true;
} else {
for (AccountHierarchyMember existingChild : this.children) {
if (existingChild.addChild(newChildAccountHierarchyMember)) {
return true;
}
}
}
return false;
}
Sorting the Accounts
This is the algorithm resolving the hierarchies. The input is a List of Accounts converted to List of AccountHierarchyMember.
Integer index = 0;
Integer index2 = 1;
while (hierarchies.size() > index + 1) {
AccountHierarchyMember firstMember = hierarchies[index];
AccountHierarchyMember secondMember = hierarchies[index2];
if (secondMember.addChild(firstMember)) {
hierarchies.remove(index);
index = 0;
index2 = 1;
continue;
} else if (firstMember.addChild(secondMember)) {
hierarchies.remove(index2);
index = 0;
index2 = 1;
continue;
}
if (index2 < hierarchies.size() - 1) {
index2++;
} else {
index++;
index2 = index + 1;
}
}
It seemed a bit wasteful to just be iterating over the Accounts while converting them to the AccountHierarchyMember
type so I decided to actually try and organise them into hierarchies straight away. Each new AccountHierarchyMember
is checked against all the existing hierarchies and if they are a match they are joined. Depending on the ordering of the Accounts some matches could be missed; therefore the original “Bubble-sort” part of the algorithm is still needed to reconcile. It has a lot less left to do though which is important for an algorithm with such bad worst-case complexity.
for (Account a : accounts) {
AccountHierarchyMember newMember = new AccountHierarchyMember(a);
Integer replaceIndex = null;
for (Integer i = 0; i < hierarchies.size(); i++) {
AccountHierarchyMember hierarchy = hierarchies[i];
if (hierarchy.addChild(newMember)) {
replaceIndex = -1;
break;
}
if (newMember.addChild(hierarchy)) {
replaceIndex = i;
break;
}
}
if (replaceIndex == null) {
hierarchies.add(newMember);
} else if (replaceIndex >= 0) {
hierarchies[replaceIndex] = newMember;
}
}
With the test data I had during development (about 50 Accounts in 3 hierarchies) the number of individual loop iterations went from a few hundred down to a few dozen. Especially combined with a bit of ORDER BY
optimisation by ParentId when getting the input Accounts this can be expected to perform reasonably well.
Part 2 – Updating the Accounts
Remember the update of a given field has to propagate to all of the children of the originally updated Account. Since I am handling multiple updated Accounts at the same time it’s possible that the original updated Account is not in the top level of my constructed hierarchies. The contains(Id) method is again useful here, this time looking for the Id rather than the ParentId, making sure I find the starting Account as efficiently as possible. Then a new method updateHierarchy() checks the Account’s (and all its children’s) values and change them if needed. And if that’s the case it marks the current account as “changed” to be updated in the Database* later.
If you are using the Apex Common framework you can share an instance of the UnitOfWork class across your hierarchies to achieve the same
The updateHierarchy method’s parameter is a map of SObjectField to Object. This is very important. It may seem that I could just update the fields I’m interested in directly. But I somehow have to remember which values were changed based on which input Account. It is entirely possible that more than one Account from the same hierarchy has been updated. And the updates from higher up should prevail where they overlap with the same updated fields from lower down the hierarchy*.
At least in my case. The opposite may seem more logical but for my requirement the highest level update in the transaction is meant to be the most decisive one. Considering each field separately though!
To achieve that I use a map that holds the Field updated and the hierarchy level the update started at. To know the level there is a relationship to the parent AccountHierarchyMember and a method getLevel() that uses it to determine how far down a given node finds itself; like so:
public Integer getLevel() {
if (this.parent == null) {
return 1;
} else {
return this.parent.getLevel() + 1;
}
}
For each of the originally provided changed accounts I find its place in the hierarchy and check what level it is at, compare the level against already updated fields, filter out those that are no longer relevant and only then apply the updates.
private Map<Id, Map<SObjectField, Integer>> hierarchyFieldUpdatedAtLevel;
public void updateHierarchy(Id accountId, Map<SObjectField, Object> fieldToNewValue) {
for (AccountHierarchyMember hierarchy : this.hierarchies) {
if (hierarchy.contains(accountId)) {
AccountHierarchyMember requiredHierarchyBranch = hierarchy.get(accountId);
Integer currentUpdateLevel = requiredHierarchyBranch.getLevel();
Map<SObjectField, Object> fieldsToUpdate = filterUpdatedAtHigherLevel(
hierarchy.accountRecord.Id,
currentUpdateLevel,
fieldToNewValue
);
if (!fieldsToUpdate.isEmpty()) {
requiredHierarchyBranch.updateHierarchy(fieldsToUpdate);
}
break;
}
}
}
}
private Map<SObjectField, Object> filterUpdatedAtHigherLevel(
Id topLevelAccountId,
Integer currentUpdateLevel,
Map<SObjectField, Object> fieldToNewValue
) {
Map<SObjectField, Object> stillToUpdateFieldsToNewValue = new Map<SObjectField, Object>();
Map<SObjectField, Integer> hierarchyUpdatedFields = hierarchyFieldUpdatedAtLevel.containsKey(topLevelAccountId)
? hierarchyFieldUpdatedAtLevel.get(topLevelAccountId)
: new Map<SObjectField, Integer>();
for (SObjectField updatedField : fieldToNewValue.keySet()) {
if (!hierarchyUpdatedFields.containsKey(updatedField) || hierarchyUpdatedFields.get(updatedField) >= currentUpdateLevel) {
stillToUpdateFieldsToNewValue.put(updatedField, fieldToNewValue.get(updatedField));
hierarchyUpdatedFields.put(updatedField, currentUpdateLevel);
}
}
hierarchyFieldUpdatedAtLevel.put(topLevelAccountId, hierarchyUpdatedFields);
return stillToUpdateFieldsToNewValue;
}
Update of the Hierarchy is fairly obvious. One minor exception being the Root Id which is meant to be blank at the top level.
public void updateHierarchy(Map<SObjectField, Object> fieldToNewValue) {
updateIfChanged(fieldToNewValue);
for (AccountHierarchyMember child : this.children) {
child.updateHierarchy(fieldToNewValue);
}
}
private void updateIfChanged(Map<SObjectField, Object> fieldToNewValue) {
Boolean isChangedThisTime = false;
for (SObjectField field : fieldToNewValue.keySet()) {
Object oldValue = this.accountRecord.get(field);
Object newValue = fieldToNewValue.get(field);
//Top Level Accounts should never have Root Account Id filled in
if (field == Account.RootAccountId__c && this.isTopLevelAccount) {
newValue = null;
}
if (oldValue != newValue) {
this.accountRecord.put(field, newValue);
isChangedThisTime = true;
}
}
this.isChanged = this.isChanged || isChangedThisTime;
}
Once this all is done I store updated Accounts’ Ids in a Trigger recursion control map and do the DML. This is needed to tell the difference between genuine updates to the observed fields (which should cause the hierarchy update) and those done as part of the hierarchy update.
Part 3 – Initiating Hierarchy update
I like to keep the record evaluation and collecting relevant records away from Trigger Handlers to keep those easier to read. In this case in AccountHierarchyBranchUpdate class which has methods to call from triggers to see if relevant fields have been updated and collect their new values. Besides consulting the recursion control mentioned above it also remembers which field updates were already seen in the transaction in case Account Triggers go round more than once. Very much the same filtering happens as during the updateHierarchy process. Once ready it “schedules” the whole process to run as AsycJob .
Showing more code examples here for that would make this (already long) post far too big. I might write a separate post going into more detail about the challenges (there were several) of wiring this update to triggers correctly.
Complexity
So how “efficient” is the algorithm then? Well.. on paper, not very efficient at all.
It's OK for SOQL Queries (3 – 11) and great for DML (1). However, BigO is
N^2 + n^2*m
where n is the input Account list size, N are all involved accounts including children and m is the number of fields updated.
It’s obvious that optimisations are super important here. Thanks to ordering Accounts by Parent Id (helps with the reconciliation bubble-sort finding matches sooner) and storing the hierarchy branch Account Ids on each node (not having to check all children of all branches all the time) the algorithm never goes even close to the worst case. In my case I am handling 3 fields across 10-20 hierarchies with 100s of Accounts without problems.
Limitations and Considerations
Depending how big the normal Account hierarchy is in the Org the single DML call may be the most CPU expensive part. This solution needs to be extended to split the updates up into separate transactions. I know I had to do that. There is no room to cover that in this post but you can see it in my complete source on GitHub.
The intention here is to handle only several fields. With more and more fields the heap usage and loop count go up and issues arise. So it’s important to be able to control batch size at least. But if you have so many fields that need to be updated through the Account hierarchy like that, maybe consider a separate SObject instead.