Black Magic & Document Merging in Objective-C

When we sat down to build a new feature into our Tap Inspect iOS application, we quickly realized that we had a very complicated problem to solve and would need to think outside of the box to keep the code efficient and manageable. Our final solution ended up taking us to a strange place in the Objective-C language that you have probably never ventured into before (the dark side?).


The Challenge

Tap Inspect is a mobile and web application for home inspectors and property managers. It is a complete business system that allows inspectors to record data, take photos, create a formal PDF report with the results, and transmit it to their client, all from the field. A typical home inspection can take several hours to complete (especially with large properties), so many of our customers work in teams to save time. In the early versions of Tap Inspect, a report could only be modified by a single device at a time, so inspectors working in teams had to pass a single iPad or iPhone back and forth during inspections. This was very cumbersome, and as our business grew the scenario became more common among our customers. The iPad also appeared on the market, and it suddenly became common for a single inspector to switch between multiple devices at an inspection (iPhone for the checklist and photos, iPad for the comments that involve more typing and for reviewing the final report with the customer). The primary goal of our product is to save time during inspections, and this was becoming a more important problem to solve every day.

What we envisioned building was a Google Docs-style collaboration engine that would allow multiple users to modify a single document simultaneously. This would allow a team of inspectors to record data for different parts of the inspection on their own devices while seamlessly combining the data into a single comprehensive report. The concept is simple and straightforward, but the implementation gets complicated really quickly when you start to think about our architecture:

  • Inspection report data is stored in XML files on the device (as opposed to records in a web database)
  • Our servers implement a revision control system of those XML reports
  • The application is designed to work in areas with poor or no internet connection (like the crawlspace of a house), so slow or unreliable internet connections are common


The Approach

3-Way Document Merging In Action

One of the most common ways to handle this problem is an algorithm called the 3-way merge, which combines the information from 3 sources:

  1. A “Baseline” version (in our system, the most recent version of a document from the server before changes were made by device A)
  2. The changes made by device A
  3. The changes made to the same baseline version by all other remote devices

With this in hand, you have enough information to tell what data was added, removed, or rearranged. In our architecture, we decided it made the most sense for the merge to be done by the device rather than the server. It takes a healthy amount of processing power to perform the algorithm, and this allowed us to distribute the computational overhead among the remote devices instead of creating a bottleneck at our servers. It also takes the strain off of the frequently poor internet connection, since less data is transferred overall.
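For a single value, the decision rule behind a 3-way merge can be sketched as follows. This is a minimal illustration rather than the Tap Inspect code: the function name `TIMergeValue` is hypothetical, and the "local wins" policy for true conflicts is an assumption made only to keep the example short.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: 3-way merge of one value.
//   base   = the value in the baseline version
//   local  = the value on this device (device A)
//   remote = the value produced by the other devices
static id TIMergeValue(id base, id local, id remote) {
    // The pointer comparison also covers the case where both sides are nil.
    BOOL localChanged  = !(local == base  || [local isEqual:base]);
    BOOL remoteChanged = !(remote == base || [remote isEqual:base]);

    if (!localChanged)  return remote;  // only the remote side (or nobody) changed
    if (!remoteChanged) return local;   // only the local side changed

    // Both sides changed. If they agree there is nothing to resolve;
    // otherwise the assumed "local wins" policy applies.
    return local;
}
```

A real merge engine also has to decide what to do when both sides change the same value differently, for example by prompting the user or keeping both versions.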


The Problem

Inspection data is organized in a large tree of objects. Different leaves of the tree have different classes and no real common base class. Some of the nodes are objects with a big set of properties, some are primitive types like strings or numbers. The tree can be radically reorganized during the course of an inspection, since we allow users to add/remove/rearrange content to fit the property they are inspecting. Users can also fully customize the content of the reports to begin with, so there is a great deal of variance in reports throughout our system.

If you were to start writing a “merge” function on each of the dozens of different classes involved, you would end up with tens of thousands of lines of code. More importantly, it would look like the same code over and over again with only small differences for typecasting and property names. Debugging and maintaining that much code would be a nightmare. There must be a better way!


The Goal

What I really want is a single block of code that can merge any part of the tree, regardless of the type of the node. I want to be able to pass in the 3 versions of that node, and have the code just figure it out. Ideally, I want to just pass in the top node of the trees and have the code recursively traverse the entire document, merging as it goes. But how can we do this? If the nodes don’t have a common base class (and some of them are primitive types like integers), we can’t use inheritance or categories to inject a common merging function. To accomplish this we would need to write code that can:

  1. Call methods on objects without knowing their class
  2. Set, retrieve, and compare object properties without knowing their type
  3. Recursively merge sub-objects or properties automatically when needed

Fortunately, there are some unusual properties of the Objective-C language we can take advantage of to do just this.


Objective-C Messaging

Method calls are handled very differently in Objective-C than in traditional languages like C++. When you call an ordinary (non-virtual) method in C++, the compiler binds the call to a specific piece of code at compile time, and at run time that code is executed directly. The binding cannot change once compiled, and it requires type checking at compile time.

Objective-C is based on the concept of passing messages to object instances. Instead of binding a method call to a class at compile time, the binding happens at run time. Type checking and method resolution also happen at run time, which has the strange consequence that it’s actually possible to call a method on an object without knowing whether the method exists or what its return type is. Interesting!

This behind-the-scenes process is mostly hidden from the developer by the way the compiler works in Objective-C. Let’s look at an example.

Your Code Is An Illusion

Let’s say you are writing the next great cat-themed rock and roll game (BILLION $ IDEA). You might define a class named “Kitty” like this:
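The original listing is not reproduced here, so the following is a minimal sketch of what such a class could look like. The method name `purr` is an assumption chosen for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for the example; "purr" is an assumed method name.
@interface Kitty : NSObject
- (void)purr;
@end

@implementation Kitty
- (void)purr {
    NSLog(@"Purrrr");
}
@end
```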

If you want to call this method from somewhere else in the application, you would write code that looks like this:
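A call site might look like the sketch below. The `Kitty` class and its `purr` method are hypothetical stand-ins, declared inline so the snippet compiles on its own; `MakeKittyPurr` is likewise an invented name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical Kitty class with a single assumed method, "purr".
@interface Kitty : NSObject
- (void)purr;
@end

@implementation Kitty
- (void)purr { NSLog(@"Purrrr"); }
@end

static void MakeKittyPurr(void) {
    Kitty *kitty = [[Kitty alloc] init];
    [kitty purr];   // ordinary message-send syntax
}
```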

But this is not what really ends up in your application. At compile time, your method call is replaced by the following code that links it into the dynamic messaging system:
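Conceptually, the bracket syntax turns into a call to the runtime function `objc_msgSend`, which takes the receiver and a selector. The sketch below uses a hypothetical `Kitty` class with an assumed `purr` method. Current SDKs require casting `objc_msgSend` to the exact function type before calling it, so the hand-written form is a little noisier than what the compiler emits.

```objc
#import <Foundation/Foundation.h>
#import <objc/message.h>

// Hypothetical Kitty class with a single assumed method, "purr".
@interface Kitty : NSObject
- (void)purr;
@end

@implementation Kitty
- (void)purr { NSLog(@"Purrrr"); }
@end

static void MakeKittyPurrTheLongWay(void) {
    Kitty *kitty = [[Kitty alloc] init];
    // Roughly what [kitty purr] becomes: receiver first, then the selector.
    ((void (*)(id, SEL))objc_msgSend)(kitty, @selector(purr));
}
```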

If this method accepted parameters, like the following:
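As a sketch, a version of the method that takes a parameter could be declared and called like this. The name `purrWithVolume:` and the helper `MakeKittyPurrLoudly` are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical Kitty class; "purrWithVolume:" is an assumed method name.
@interface Kitty : NSObject
- (void)purrWithVolume:(NSInteger)volume;
@end

@implementation Kitty
- (void)purrWithVolume:(NSInteger)volume {
    NSLog(@"Purrrr at volume %ld", (long)volume);
}
@end

static void MakeKittyPurrLoudly(void) {
    Kitty *kitty = [[Kitty alloc] init];
    [kitty purrWithVolume:11];
}
```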

It would be replaced at compile time with this:
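In sketch form, the arguments simply follow the receiver and the selector in the `objc_msgSend` call. The `Kitty` class and `purrWithVolume:` are hypothetical, and the function-pointer cast is what current SDKs require when calling `objc_msgSend` by hand.

```objc
#import <Foundation/Foundation.h>
#import <objc/message.h>

// Hypothetical Kitty class; "purrWithVolume:" is an assumed method name.
@interface Kitty : NSObject
- (void)purrWithVolume:(NSInteger)volume;
@end

@implementation Kitty
- (void)purrWithVolume:(NSInteger)volume {
    NSLog(@"Purrrr at volume %ld", (long)volume);
}
@end

static void MakeKittyPurrLoudlyTheLongWay(void) {
    Kitty *kitty = [[Kitty alloc] init];
    // Roughly what [kitty purrWithVolume:11] becomes.
    ((void (*)(id, SEL, NSInteger))objc_msgSend)(kitty, @selector(purrWithVolume:), 11);
}
```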

What’s going on here?


Selectors & Dispatch Tables

The compiler looks at the method you are trying to call and builds something called a selector from it. A selector is a unique identifier that helps the messaging system match method calls to objects that can receive them. Note that it does NOT actually look at the Kitty class when creating the selector; it just evaluates the text in the line of code that you typed. If you have a typo in your function call:
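A sketch of the situation, using a hypothetical `Kitty` class whose only method is `purr`. The misspelled call is left in a comment so the snippet still compiles; the same question is then asked at run time with `respondsToSelector:`. The helper name `KittyRespondsTo` is invented for the example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical Kitty class with a single assumed method, "purr".
@interface Kitty : NSObject
- (void)purr;
@end

@implementation Kitty
- (void)purr { NSLog(@"Purrrr"); }
@end

static BOOL KittyRespondsTo(NSString *selectorName) {
    Kitty *kitty = [[Kitty alloc] init];
    // [kitty pur];   // typo: flagged by the compiler because no visible
    //                // declaration matches the selector "pur"
    // A selector can be built from any text, whether or not a method exists.
    SEL selector = NSSelectorFromString(selectorName);
    return [kitty respondsToSelector:selector];
}
```

Here `KittyRespondsTo(@"purr")` is true and `KittyRespondsTo(@"pur")` is false, even though both selectors can be created without complaint.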


Xcode will show you a warning (or an error, depending on your settings) along the lines of “No visible @interface for 'Kitty' declares the selector”, followed by the misspelled name.


How does this work?

When you compile your application, Xcode builds something called a Dispatch Table for every class. This dispatch table lists all of the selectors of messages that instances of that class are capable of receiving. The table also includes a pointer to the dispatch table for any superclass in the inheritance chain.
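The runtime exposes this information through a small C API. The sketch below lists the selectors a class implements itself and walks the superclass chain; the helper names `TISelectorNames` and `TIInheritanceDepth` are hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: names of the instance methods a class implements
// itself. Inherited methods live in the superclass's own list.
static NSArray<NSString *> *TISelectorNames(Class cls) {
    NSMutableArray<NSString *> *names = [NSMutableArray array];
    unsigned int count = 0;
    Method *methods = class_copyMethodList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        [names addObject:NSStringFromSelector(method_getName(methods[i]))];
    }
    free(methods);   // the caller owns the copied list
    return names;
}

// Hypothetical helper: number of classes in the chain from cls up to the root.
static NSUInteger TIInheritanceDepth(Class cls) {
    NSUInteger depth = 0;
    for (Class c = cls; c != Nil; c = class_getSuperclass(c)) {
        depth++;
    }
    return depth;
}
```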


Every object instance in Objective-C also has a special property called the “isa”. This property links the instance to the class and dispatch table it belongs to, allowing the run-time engine to find the information it needs to send messages. If you have ever spent time in the Xcode debugger, you have inevitably encountered the isa, probably while looking for something else.
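In code, the supported way to follow that link is the runtime function `object_getClass`, since direct access to the `isa` field is not allowed on modern runtimes. A minimal sketch, with a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: the name of the class an instance belongs to.
static NSString *TIClassNameOf(id object) {
    Class cls = object_getClass(object);   // follows the instance's isa link
    return NSStringFromClass(cls);
}
```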


This is great and all, but how does it help us with the merging algorithm? The key is how Objective-C treats object properties.


Declared Properties FTW

When you create a property for a class you might declare it like this:
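A minimal sketch of such a declaration, using a hypothetical `title` property on a hypothetical `Kitty` class:

```objc
#import <Foundation/Foundation.h>

@interface Kitty : NSObject
@property (nonatomic, strong) NSString *title;   // hypothetical property
@end

@implementation Kitty
@synthesize title;   // written out here; recent compilers synthesize automatically
@end
```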

The key for us is the @synthesize keyword. This does several things behind the scenes when your code is compiled:

  1. It creates getter and setter methods in the class on your behalf. You customize exactly what happens inside of those methods by adding keywords to the @property declaration line in your class (e.g. strong, weak, nonatomic).
  2. It creates a set of metadata that keeps track of parameter/return types and other things to be used during message sending.

Put another way, when you type the line:


The compiler will replace it with the following:
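The original listings are missing, so this sketch assumes the line in question is `@synthesize title;` for a hypothetical `title` property. Written out by hand, the accessors the compiler generates for a strong, nonatomic property look roughly like this:

```objc
#import <Foundation/Foundation.h>

// Hand-written equivalent of a synthesized "title" property (assumed example).
@interface Kitty : NSObject {
    NSString *_title;                // backing instance variable
}
- (NSString *)title;                 // getter
- (void)setTitle:(NSString *)title;  // setter
@end

@implementation Kitty
- (NSString *)title {
    return _title;
}
- (void)setTitle:(NSString *)title {
    _title = title;
}
@end
```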


And most importantly for us, the names of these getter and setter functions are predictable! If you have an instance of an object that you know has a property named “title”, then you can be confident (with some exceptions) that the object will respond to the following method calls:
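For a property named `title`, those two calls can be sketched as follows. The `Kitty` class and the helper `TIRetitle` are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class with one declared property.
@interface Kitty : NSObject
@property (nonatomic, strong) NSString *title;
@end

@implementation Kitty
@end

static NSString *TIRetitle(Kitty *kitty, NSString *newTitle) {
    [kitty setTitle:newTitle];   // setter: "set" + capitalized name + ":"
    return [kitty title];        // getter: the property name itself
}
```

The exceptions are properties declared with custom `getter=` or `setter=` names and read-only properties, which have no setter at all.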


Progress! Now, how could we get and set the values of a property if the property name itself is a variable?  First we would need to construct the appropriate selectors for the getter and setter methods based on the property name.
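One way to build those selectors from a string is sketched below. The helper names are hypothetical, the property name is assumed to be non-empty, and custom `getter=` or `setter=` names are not handled.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: "title" -> selector for "title".
static SEL TIGetterForProperty(NSString *name) {
    return NSSelectorFromString(name);
}

// Hypothetical helper: "title" -> selector for "setTitle:".
static SEL TISetterForProperty(NSString *name) {
    NSString *first = [[name substringToIndex:1] uppercaseString];
    NSString *rest  = [name substringFromIndex:1];
    NSString *setterName = [NSString stringWithFormat:@"set%@%@:", first, rest];
    return NSSelectorFromString(setterName);
}
```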


Next, we could use the magical performSelector method to send the message using the selectors we have constructed:
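A sketch of both directions for object-typed properties follows. The helper names are hypothetical. Under ARC the compiler warns that it cannot see the selector's memory semantics, so the warning is silenced locally on the assumption that ordinary getters and setters are being called.

```objc
#import <Foundation/Foundation.h>

#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Warc-performSelector-leaks"

// Hypothetical helper: set an object-typed property by name.
static void TISetObjectProperty(id target, NSString *name, id value) {
    NSString *setterName = [NSString stringWithFormat:@"set%@%@:",
                            [[name substringToIndex:1] uppercaseString],
                            [name substringFromIndex:1]];
    SEL setter = NSSelectorFromString(setterName);
    if ([target respondsToSelector:setter]) {
        [target performSelector:setter withObject:value];
    }
}

// Hypothetical helper: read an object-typed property by name.
static id TIGetObjectProperty(id target, NSString *name) {
    SEL getter = NSSelectorFromString(name);
    if (![target respondsToSelector:getter]) {
        return nil;
    }
    return [target performSelector:getter];
}

#pragma clang diagnostic pop
```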


Note that the setter in this case will only accept an object as a parameter. Supporting primitive types like int or bool requires a more involved process.
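One possible approach for primitives, shown here as a sketch and not necessarily the one used in Tap Inspect, is `NSInvocation`, which copies argument bytes of any type. The helper name `TISetBoolProperty` is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: call a setter that takes a BOOL argument.
static void TISetBoolProperty(id target, SEL setter, BOOL value) {
    NSMethodSignature *signature = [target methodSignatureForSelector:setter];
    if (signature == nil) {
        return;   // target does not respond to the setter
    }
    NSInvocation *invocation = [NSInvocation invocationWithMethodSignature:signature];
    invocation.selector = setter;
    // Indexes 0 and 1 are the hidden self and _cmd arguments.
    [invocation setArgument:&value atIndex:2];
    [invocation invokeWithTarget:target];
}
```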


Property Attribute Descriptions

With what we have learned so far we can get and set values for the properties of objects without necessarily knowing what the object is. The problem now is, how do we handle also not knowing what the type of the property is?

To solve this, we can take advantage of the metadata that is created along with the dispatch tables when the class is compiled. When the compiler encounters a declared property, it stores something called the property attribute description. This description is a string that defines several pieces of information about the property, including its type. The Objective-C runtime provides the property_getAttributes function to retrieve this information at run time. There are also a number of other functions that allow you to list the properties of a class, retrieve their names, and more. Perfect!
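A sketch of reading those descriptions for every declared property of a class. The `Kitty` class and the helper `TIPropertyAttributes` are hypothetical. For a strong, nonatomic `NSString *title` property the description reads `T@"NSString",&,N,V_title`: the type, the retain semantics, nonatomic, and the backing variable name.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class used as input for the example.
@interface Kitty : NSObject
@property (nonatomic, strong) NSString *title;
@property (nonatomic, assign) BOOL hungry;
@end

@implementation Kitty
@end

// Hypothetical helper: maps each declared property name of a class to its
// attribute description string.
static NSDictionary<NSString *, NSString *> *TIPropertyAttributes(Class cls) {
    NSMutableDictionary<NSString *, NSString *> *result = [NSMutableDictionary dictionary];
    unsigned int count = 0;
    objc_property_t *properties = class_copyPropertyList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        NSString *name  = @(property_getName(properties[i]));
        NSString *attrs = @(property_getAttributes(properties[i]));
        result[name] = attrs;
    }
    free(properties);   // the caller owns the copied list
    return result;
}
```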



Bringing It All Together

To see all of this in action, let’s look at a small piece of the merging code. The following function will evaluate two objects and compare a specific property to see if the value is equal. The class of the objects is unknown, as is the type of the property. The only thing we know for sure is the name of the property when calling this function (and that the objects are of the same class).
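The original function is not reproduced here. The following is a sketch of how such a comparison could be written with the pieces described so far; it is not the Tap Inspect implementation. The `Kitty` class and the name `TIPropertyIsEqual` are hypothetical, and scalar values are compared byte for byte, which is an assumption that suits integers and booleans better than floating-point values or padded structs.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class used as input for the example.
@interface Kitty : NSObject
@property (nonatomic, strong) NSString *title;
@property (nonatomic, assign) NSInteger lives;
@end

@implementation Kitty
@end

// Hypothetical helper: YES when the named property holds equal values on
// both objects. Assumes a and b are instances of the same class.
static BOOL TIPropertyIsEqual(id a, id b, NSString *propertyName) {
    objc_property_t property = class_getProperty([a class], propertyName.UTF8String);
    if (property == NULL) return NO;

    // Use a custom getter name if one was declared (attribute "G").
    char *custom = property_copyAttributeValue(property, "G");
    SEL getter = custom ? sel_registerName(custom) : NSSelectorFromString(propertyName);
    free(custom);

    NSMethodSignature *signature = [a methodSignatureForSelector:getter];
    if (signature == nil) return NO;

    NSInvocation *callA = [NSInvocation invocationWithMethodSignature:signature];
    NSInvocation *callB = [NSInvocation invocationWithMethodSignature:signature];
    callA.selector = getter;
    callB.selector = getter;
    [callA invokeWithTarget:a];
    [callB invokeWithTarget:b];

    // In a type encoding '@' marks an object; anything else is treated as a scalar.
    if (signature.methodReturnType[0] == '@') {
        __unsafe_unretained id valueA = nil;
        __unsafe_unretained id valueB = nil;
        [callA getReturnValue:&valueA];
        [callB getReturnValue:&valueB];
        return valueA == valueB || [valueA isEqual:valueB];
    }

    // Scalars: compare the raw bytes of the two return values.
    NSUInteger length = signature.methodReturnLength;
    NSMutableData *bytesA = [NSMutableData dataWithLength:length];
    NSMutableData *bytesB = [NSMutableData dataWithLength:length];
    [callA getReturnValue:bytesA.mutableBytes];
    [callB getReturnValue:bytesB.mutableBytes];
    return [bytesA isEqualToData:bytesB];
}
```

For example, two `Kitty` instances with the same `title` string and the same `lives` count compare equal on both properties, and changing `lives` on one of them makes only that comparison fail.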


This function will work for any property of any object, regardless of whether the value of the property is a primitive scalar value like a boolean or an object itself (as long as the property value’s class implements the isEqual: part of the NSObject protocol). Try it out!

It’s pretty unlikely you will ever need to dive to this level of the Objective-C messaging stack, but it’s certainly good to know how it works. This technique allowed us to build very compact code for our merging algorithm, and hopefully you will find it useful too.


photo credit: seanmcgrath

