Suppose there is a class AddressBook, which contains the following fields:
public class AddressBook implements FastCopyable {
private int numMembers;
private ImmutableAddress[] birthAddress;
private Address[] currAddress;
private int[] score;
...
}
The integer numMembers is the number of people currently stored in the address book. Each of them
has a birth address, a current address, and a score, all stored in parallel arrays. The Address
class is defined elsewhere, and has getters and setters. The ImmutableAddress class is also defined
elsewhere, and has getters, but no setters. It is immutable, so its contents never change after it is instantiated.Updating the current address of the person in position 5 might be done like this:
addressBook.getCurrAddress(5).setStreet("Main");
Updating a person's birth address street will be slower, because the birth address is immutable, so an entirely new
address must be created:
addressBook.setBirthAddress(5,new ImmutableAddress(123,"Main","Pittsburgh","PA",15213)The
AddressBook class implements FastCopyable, so it must implement a copy()
method that returns a fast copy of itself. In addition, it might have several getters and setters for its variables. The
class and its copy/get/set methods can be implemented several different ways, with different degrees of optimization.
AddressBook.copy() method so that it instantiates a new AddressBook
object, copies numMembers from the original, makes a shallow copy of all 3 arrays, and walks through
the currAddress array in the copy and replaces each Address object with a copy
of that object.
If the Address class implements FastCopyable, then it can make a fast copy,
using the Address.copy() method. Otherwise, it will have to make a deep copy.
Fortunately, there is no need to spend any time making copies of the objects in the birthAddress
array. They are immutable, so a shallow copy of the array is safe.
In this case, the getters and setters can all be implemented in the obvious way. It is only in the optimized cases that anything special needs to be done for them.
AddressBook.copy() method is fast when its fields include only primitives and immutable objects.
It is slower when it contains arrays or collections of primitives and immutable objects. Slower still are arrays or
collections of mutable FastCopyable objects. And the slowest is arrays or collections of mutable objects that aren't
FastCopyable. So the simplest optimization is to minimize the use of the slower things. Make everything either a
primitive or an immutable object. Or an array or collection of primitives and immutable objects. In some cases, it may
seem like the objects in an array need to be mutable, because they have one field that changes frequently. But perhaps
that field can be moved into a separate array of its own, and then the remaining object could be made immutable. That
kind of refactoring can make a large difference in speed, because it also allows optimizations 2 and 3.
AddressBook can be written to only contain primitives and arrays and collections of immutable
objects, then a further optimization can be done. Create a new class AddressBookData, which is
visible only within the package and not visible to the user. Put all of the arrays and data in it, and let AddressBook
have only a single variable, which is a reference to an AddressBookData. In other words, let the AddressBook
object be a handle pointing to the real object, which is of class AddressBookData. In addition, add
an integer reference count variable to AddressBookData, which remembers how many handles have been
created that point to it.
Now, the AddressBook.copy() method can be made much faster. It instantiates a new AddressBook
handle containing the same reference as the original, and it increments the reference counter in the referenced
AddressBookData object. No arrays need to be copied, not even with shallow copies. And the
getters are still very fast for reads. When a setter method of AddressBook is called, it first
checks the reference counter. If it is 1, then the setter simply makes the requested change. If it is greater than
1, then it instantiates a new AddressBookData object, and makes shallow copies of all the arrays
and collections, and initializes its reference counter to 1, while decrementing the original reference counter.
Whenever code inside AddressBook needs to change something in
AddressBookData, it first performs that check to see if the reference counter is 1, and makes
copies if it is not. In many cases, this is easiest to accomplish by having it call one of its own setters, so those
checks only have to be written in the setters.
This is a powerful optimization. It ensures very fast copies, reads, and writes, except for the first write after a copy. But this optimization is only possible if all of the arrays and collections contain only primitives or immutable objects.
AddressBook needs to modify current addresses internally, so it cannot limit itself to
immutable addresses. But suppose its users do not need to modify the current addresses returned by its getters. In that
case, the current address could still be a mutable Address, but the getter method AddressBook.getCurrAddress
could return an object of class ImmutableAddress, which it instantiates and wraps around the
Address object at the time it is called. Then, every Address method other than
the setters will be replicated in ImmutableAddress, and each method in the latter will simply call
the corresponding method in the former. Of course, it will be necessary to look through the code to ensure that none of
the methods change the address as a side effect. But in general, it should be possible to make the ImmutableAddress
object act like an immutable version of Address.
After these changes, optimization 2 can be applied. As before, it checks the reference counter before each write. If
the counter is more than 1, then it makes the required copy, including shallow copies of all the arrays. But there
is one additional step beyond optimization 2. In the new object, for each array or collection of mutables, such as
the AddressBookData.currAddress array in this example, it must take each element of the array
and replace it with a copy of itself. If the object is a FastCopyable, then a fast copy is sufficient. Otherwise, it
will need to be a deep copy.
So, if Address implements FastCopyable, then this approach only slightly
slows down AddressBook. But if Address is large, mutable, and not
implementing FastCopyable, then it will slow it down considerably. Fortunately, this slowdown
only occurs during the first write after a copy, but not on subsequent writes, and not on reads.
AddressBook.getCurrAddress actually needs to modify the returned
object, so it needs to return an Address rather than an ImmutableAddress. In
that case, we can apply optimization 3, but without the immutable wrappers, and with one additional change. Each getter
in AddressBook that returns a mutable object, must do the same check of the reference counter that
is required for the setters.
So, with this optimization, if the user calls a getter that returns an immutable, then the operation happens as
before. If the user calls a getter that returns a mutable, or calls a setter that modifies an array inside AddressBook,
then the getter or setter will first check the reference count, and make the appropriate copies if it is greater
than 1. After making those copies, it will then execute the getter or setter. That same check will also be made
before any mutables are modified by code internal to AddressBook itself.
This optimization means that the first call to any getter or setter of mutables after a copy will result in shallow copies of everything. But after that, all getters and setters will be very fast, until the next time a copy is made.
copy
method somewhat slow, by forcing it to make shallow copies of all arrays and collections, and to make fast copies of all
FastCopyables in those arrays and collections, and to make deep copies of any mutable object that isn't a FastCopyable.
This process can be made faster by using handles and reference counters. It can be made faster still by wrapping returned mutables inside immutables. And it can be made fastest by avoiding mutables entirely.