Box Notes, our built-in note-taking app for teams, aims to leverage its strength in real-time collaboration to 10x the productivity of Box users. One challenge with real-time collaboration, though, is that Notes are constantly changing. That makes it hard to tell what has changed since you last visited a Note: you end up scanning the content by eye to spot each change, which is a very manual process. So, to make consuming content easier than ever, today we are excited to bring a new feature to Box Notes: Differences. Differences highlights all content that is new since you last viewed a Note, so you can see every change at a glance. Computing a diff for plaintext is easy, but computing one for a real-time, rich-text environment is an entirely different problem. To bring Differences to life, we had to take a deep dive into the inner workings of our editor.

One of the interesting things Box Notes engineers get to do is work on a real-time collaboration layer. Specifically, we use an algorithm known as Operational Transformation (OT). This layer handles consistency of the Note model data between clients and our servers. It tells us where to insert/delete text, what kind of styles should be applied, who made certain changes, how to resolve conflicts if users’ changes collide, etc. When we started implementing Differences, we realized we could leverage our existing collaboration model and avoid implementing an entirely new layer (such as doing a diff on the DOM); in fact, we needed to hook into our OT model to fully flesh out the feature.

The basic implementation

The Note data model is not too complex. It contains information about the plaintext of the Note, and what styles are associated with various ranges on the plaintext. Getting the basic case of diffs to work is fairly simple. The client receives a snapshot of the Note that it last saw, as well as the current version of the Note. We pulled in an npm module for Google’s diff-match-patch library (DMP) to calculate the plaintext diff between the two versions, which lets the client know where net new content was added. One of the nice things that this library has over other diff libraries is the ability to make diffs more semantic, meaning it will attempt to diff word by word instead of character by character. However, it doesn’t tell the client where style changes (such as bolding, underlining, etc.) may have occurred. OT, on the other hand, can tell the client what has actually changed style-wise between two versions of the Note. Combining the results of DMP and the Note’s OT data model, the client can figure out both where text was newly added, and where styles were changed.
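To make the combination concrete, here is a minimal sketch of the idea, not our production code. It assumes a hypothetical note shape (plaintext plus one style set per character) and uses Python's standard `difflib` as a stand-in for DMP. Where the real client asks the OT model which styles changed, this sketch simply compares styles on text the diff reports as unchanged. The names `diff_notes`, `tokenize`, and `token_offsets` are made up for illustration.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    # Split into alternating runs of whitespace and non-whitespace so the
    # diff works word by word rather than character by character.
    return re.findall(r"\s+|\S+", text)


def token_offsets(tokens):
    # Character offset where each token starts, plus the total length at the end.
    offsets, pos = [], 0
    for token in tokens:
        offsets.append(pos)
        pos += len(token)
    offsets.append(pos)
    return offsets


def diff_notes(old, new):
    """Return character ranges in `new` that were added or restyled.

    Hypothetical note shape: {"text": str, "styles": [frozenset, ...]}
    with one style set per character of text.
    """
    old_tokens, new_tokens = tokenize(old["text"]), tokenize(new["text"])
    old_off, new_off = token_offsets(old_tokens), token_offsets(new_tokens)
    added, restyled = [], []
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, _i2, j1, j2 in matcher.get_opcodes():
        start, end = new_off[j1], new_off[j2]
        if tag in ("insert", "replace"):
            added.append((start, end))
        elif tag == "equal":
            # Text is unchanged here, so only the styles can differ.
            old_start = old_off[i1]
            run_start = None
            for k in range(end - start):
                changed = old["styles"][old_start + k] != new["styles"][start + k]
                if changed and run_start is None:
                    run_start = start + k
                elif not changed and run_start is not None:
                    restyled.append((run_start, start + k))
                    run_start = None
            if run_start is not None:
                restyled.append((run_start, end))
    return {"added": added, "restyled": restyled}


plain, bold = frozenset(), frozenset({"bold"})
old_note = {"text": "Ship the report", "styles": [plain] * 15}
new_note = {
    "text": "Ship the final report today",
    "styles": [bold] * 4 + [plain] * 23,
}
result = diff_notes(old_note, new_note)
```

In this example, `result["added"]` covers the newly typed words and `result["restyled"]` covers the word that became bold, which are the two kinds of change the highlights need to show.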

The diagram below shows roughly how our data is structured and what the diff data structure looks like when it gets calculated.

How caching and collaboration make things challenging

One of the neat features in Box Notes is that users can immediately see and update their content even if it is outdated. For example, if you open a Note in a browser tab, leave that tab out of view for several days, then return to it, you’ll be able to type immediately, even if your client has not yet received the latest content from the server and rolled it in (which can happen if, say, your network is slow). This points to an immediate problem: how can we show you a change if you don’t even have the latest version of the Note to diff against? The first step is obvious: wait for the Note to catch up, then calculate the diff. However, the changes you made before your Note caught up shouldn’t be part of the diff. You already know you made those changes, so it wouldn’t make sense for them to be reflected back to you as a diff.

This posed one of the key problems for Differences: how do we make a diff ignore all the changes you made before your outdated Note caught up with the server? One great thing about OT is that it is easy to rebase a change so it fits on top of new content. Similar to version control programs like Git, OT in Box Notes has a concept of rebasing. A rebase is needed when a client submits a change while its copy of the Note is outdated: the change must first be adjusted to account for all the changes that already exist on the up-to-date Note before it can be applied. Next, we’ll look at an example of how this is done in Box Notes, and how Differences takes advantage of it.
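As a rough illustration of what rebasing means mechanically, here is a toy sketch that handles only text insertions. Real OT also has to deal with deletions, style changes, and tie-breaking between clients, and the `Insert`, `apply`, and `rebase` names below are hypothetical rather than part of the Box Notes codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # character offset in the document the op was written against
    text: str


def apply(doc, op):
    return doc[:op.pos] + op.text + doc[op.pos:]


def rebase(op, applied):
    # If the already-applied op landed at or before our position, our target
    # has moved right by the length of the inserted text.
    if applied.pos <= op.pos:
        return Insert(op.pos + len(applied.text), op.text)
    return op


base = "Hello world"
server_op = Insert(6, "brave ")   # already accepted by the server
local_op = Insert(11, "!")        # typed against the stale base

rebased = rebase(local_op, server_op)
up_to_date = apply(apply(base, server_op), rebased)
```

Applying `local_op` at its original offset would drop the "!" into the middle of the sentence. The rebased version lands at the end, where the user meant it to go.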

Let’s say you have an outdated (cached) Note in a dormant tab on your browser client and you’re missing the latest 500 revisions (r501 – r1000), where revisions are the individual changes users enter into a Note. You saw the Note at some in-between state on your mobile device earlier in the day (r0 – r750). Ideally, when you go back to the Note in your dormant browser tab, you want to see highlights of new or changed content you haven’t seen yet (r751 – r1000); this would be all changes on the Note since you last viewed it on your mobile device earlier in the day.

To throw in some more complexity, let’s say you typed some changes before you even received any missing changes on your client (r501* – r600*). Note: the local changes can just be thought of as the 100 individual local changes made, but in actuality they get merged into a single change when they are sent to the server.

From this point on, we just need to follow the correct order of operations. All the pieces are available and our goal is to create a change that highlights the diffs, but does not highlight any local changes you’ve made before diffs could be calculated.

First off, when we receive the missing changes from the server, we can apply them to the state of the Note that does not contain any local changes (r0 – r500) in order to recreate the server view of the Note (r0 – r1000). Now that the client has the same view of the Note the server has, it can make use of the last seen Note snapshot that was sent earlier. This snapshot (r0 – r750) is the exact Note contents at the time the user last saw the Note. Using both the up-to-date Note and this snapshot, the client can figure out the diffs (r751 – r1000). We also want local changes to go on top of the missing changes once the server acknowledges them, so we have to rebase those changes to become r1001 – r1100. The user is now seeing what should be the most up-to-date, correct version of their Note.
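The order of operations above can be sketched at toy scale. This is a simplified illustration under several assumptions: changes are insert-only `(position, text)` pairs, the local edits have already been merged into a single change, and `difflib` again stands in for DMP. All function and variable names are hypothetical.

```python
from difflib import SequenceMatcher


def apply_all(doc, ops):
    # Each op is (position, inserted_text), applied in order.
    for pos, text in ops:
        doc = doc[:pos] + text + doc[pos:]
    return doc


def rebase_over(op, applied_ops):
    # Shift an insert so it lands correctly after ops that were applied first.
    pos, text = op
    for applied_pos, applied_text in applied_ops:
        if applied_pos <= pos:
            pos += len(applied_text)
    return (pos, text)


def added_ranges(old_text, new_text):
    matcher = SequenceMatcher(a=old_text, b=new_text, autojunk=False)
    return [(j1, j2) for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
            if tag in ("insert", "replace")]


stale_base = "Agenda: budget"                       # cached tab, like r0 to r500
missing_ops = [(14, ", hiring"), (22, ", launch")]  # like r501 to r1000
last_seen = "Agenda: budget, hiring"                # snapshot, like r0 to r750
local_op = (14, "!")                                # typed on the stale tab

# 1. Recreate the server view without the local change.
server_view = apply_all(stale_base, missing_ops)
# 2. Diff the snapshot against the server view to find unseen content.
unseen = added_ranges(last_seen, server_view)
# 3. Rebase the local change so it sits on top of the missing changes.
rebased_local = rebase_over(local_op, missing_ops)
final_view = apply_all(server_view, [rebased_local])
```

Because the diff in step 2 is computed before the local change is layered back on, the text you typed never shows up in `unseen`. Only the content added after the snapshot does.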

I’ll hand-wave the nitty-gritty details of the next part a bit, but the diffs calculated earlier are used to make a change that will highlight the new content you have not seen yet.

And now we can rebase these highlights anytime we need to in order to always highlight the relevant diffs for the user. For example, users can choose to hide their Differences with the highlighter button next to Presence, and when the button is toggled back on to show the Differences, the original change to highlight the text gets rebased to fit on top of the new content.
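To give a feel for what rebasing a highlight involves, here is a small sketch that adjusts one highlighted range after a later text insertion. The splitting behavior for inserts that land inside a highlight is an assumption made for this example (so your own typing does not light up), not a description of the exact production rule, and `rebase_highlight` is a hypothetical name.

```python
def rebase_highlight(highlight, insert_pos, insert_len):
    """Return the highlight ranges that remain after an insert of insert_len chars."""
    start, end = highlight
    if insert_pos <= start:
        # Insert landed before the highlight: slide the whole range right.
        return [(start + insert_len, end + insert_len)]
    if insert_pos < end:
        # Insert landed inside: split so the newly typed text is not highlighted.
        return [(start, insert_pos), (insert_pos + insert_len, end + insert_len)]
    # Insert landed after the highlight: nothing to do.
    return [(start, end)]


doc = "Agenda: budget, launch"
highlight = (14, 22)                      # covers ", launch"
edited = doc[:8] + "Q3 " + doc[8:]        # edit made while highlights were hidden
shifted = rebase_highlight(highlight, 8, 3)
```

After the edit, the shifted range still covers exactly the same ", launch" text, which is what lets the highlights reappear in the right place when they are toggled back on.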

Having this type of collaboration model is powerful. As long as we can figure out how to express any new editor feature in terms of OT, we have a lot of possibilities for continuing to power up real-time collaboration.

Want to see Differences in action? Get some collaborators and try out Box Notes. If you’re interested in solving complex problems like this, we are always looking to further grow our team, so check out Engineering Careers @ Box for more information.

Box