How deeply have you contemplated the subtleties of cloning?
Cloning is a commonly known and selectively used tool for a variety of purposes. Most people believe it is simple and straightforward, but it is a sensitive issue that few truly understand, far fewer than engage in it. It is deceptively easy to do: merely override the
Object.clone() method, have it return a new object of the appropriate type, and initialize all the member variables with copies of the fields in this. The naïve and the converted C++ coder will then jump in and write something resembling the following code.

    public class House {
        protected int stories;
        protected String address;

        public House clone() {
            House clone = new House();
            clone.stories = stories;
            clone.address = new String(address);
            return clone;
        }
    }

Looks fine and dandy to most. It copies all the members and returns a deep copy of the object. That is just what the contract of
clone says. Now consider a subclass of House:

    public class Duplex extends House {
        protected int sections = 2;

        public Duplex clone() {
            Duplex clone = new Duplex();
            clone.stories = stories;
            clone.address = new String(address);
            clone.sections = sections;
            return clone;
        }
    }

Now we have an object hierarchy that is fully clonable. This is what most new Java programmers and nearly all migrating C++ programmers produce the first time they are asked to make an object clonable. The C++ programmers usually do this as an after-thought, after writing a copy constructor and then being told that the Java paradigm is to use
obj.clone() instead. This looks remarkably like a copy constructor with a different signature, and is the obvious implementation.

Except that it does not work. Think carefully: what are the principles of object-oriented programming? Inheritance, encapsulation, and polymorphism. What does the above method do in the face of polymorphism? The key subtlety of the clone method is that it must return the proper type for the object being cloned, even if that type is not known at compile time. Suppose we make use of polymorphism to hold a reference to a generic House.

    House house = new Duplex();

Now clone it.

    House newHouse = house.clone();

Thanks to polymorphism and dynamic dispatch (the vtable, in C++ terms), the JVM will know at runtime that the object is really a Duplex, invoke Duplex.clone(), and assign the result to the handle of type House because House is a supertype. Looks fine, right?

Now let us imagine another programmer comes along and uses these classes for himself. But he needs something other than the provided classes, so he writes his own.

    public class ApartmentBuilding extends House {
        private int apartments;
        private java.util.Set<String> tenants = new java.util.HashSet<String>();
    }

Notice that this programmer did not implement
clone(), either because he was lazy or did not care. But he makes use of existing methods in the library, some of which clone Houses. What does one get with the following?

    House house = new ApartmentBuilding();
    House house2 = house.clone();

Now it should be that
house2 is an exact copy of house, which is an ApartmentBuilding. Instead, house2 is an instance of House and not any subclass. Imagine the new coder’s surprise when he expects to get back an ApartmentBuilding from the library and instead gets a House. This can break applications in a way that is difficult to debug unless one is aware of the potential pitfalls of clone. Most people read the JavaDoc and trust it. It says:

"Creates and returns a copy of this object. … By convention, the object returned by this method should be independent of this object (which is being cloned)."

In computer science terminology, this is a deep copy. They also expect the type to be correct, as the JavaDoc says,
"The general intent is that, for any object x, the expression: x.clone() != x will be true, and that the expression: x.clone().getClass() == x.getClass() will be true"

In simple language, that means the object returned by clone() will be of the same type as that being cloned.

The prescribed implementation is to call
super.clone() and then modify any fields of the class that need it. One advantage is that this obviates any need to copy fields declared in superclasses; each class handles only its own. But more importantly, it handles the polymorphism case elegantly, because the root implementation of the chain, Object.clone(), is implemented natively as part of the JVM. It allocates enough memory for the actual runtime class being cloned, then does a field-by-field copy that copies all primitives and points all handles to the same objects as in the object being cloned. For any immutable objects, this is functionally equivalent to a deep copy, so only references to mutable objects need to be further cloned in subclass implementations. Note that many library classes are immutable even though their descriptions do not always make it obvious. Integer is immutable, as is Double, but most important of all, the pervasive String is immutable and need not be handled in any subclass clone implementation.

There is a twist to this still undiscussed. The root implementation,
Object.clone(), checks at runtime whether the object being cloned implements the interface Cloneable. If the class does not implement that zero-method marker interface, Object.clone() will throw a CloneNotSupportedException. Any class that wishes to be clonable must implement this interface. However, due to inheritance, any subclass will also be clonable, so once the chain has been started it should not be broken.

This is why the suggested implementation of
clone() begins with getting the clone to populate by invoking super.clone() and casting it to the current class type. Now look back at the code above, with the naïve programmer returning a concrete type from clone(). Should the user of his library subclass House and try to make ApartmentBuilding clonable using the proper technique, his method will fail with a ClassCastException, because ApartmentBuilding’s call to super.clone() will return a House and not an ApartmentBuilding as Object.clone() would have.

In summary, cloning is an activity that should only be entered into with careful thought, reflection, and a thorough understanding of the subtle issues involved. I skipped over the times it is appropriate to return a different type from that being cloned, and how one can break the clonable chain by making use of the throws CloneNotSupportedException escape hatch built into the method contract. I figure anyone making use of those techniques has already contemplated and understood these issues; this is for people new to cloning or the language in general.
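
For readers who want to see the prescribed super.clone() approach in one place, here is a minimal sketch. It reuses the House, Duplex, and ApartmentBuilding names from above; Cottage, CloneDemo, addTenant, and tenantCount are hypothetical additions made only so the example can run and show a result. It is an illustration under those assumptions, not the only reasonable way to write these classes.

```java
import java.util.HashSet;
import java.util.Set;

// Root of the clonable chain: implementing Cloneable keeps Object.clone() from throwing.
class House implements Cloneable {
    protected int stories;
    protected String address;

    @Override
    public House clone() {
        try {
            // Object.clone() creates an instance of the runtime class and copies every field.
            // String is immutable, so sharing the address reference is safe.
            return (House) super.clone();
        } catch (CloneNotSupportedException e) {
            // Not expected here, because House implements Cloneable.
            throw new AssertionError(e);
        }
    }
}

class Duplex extends House {
    protected int sections = 2;

    @Override
    public Duplex clone() {
        // Only primitive and immutable fields, so narrowing the return type is enough.
        return (Duplex) super.clone();
    }
}

class ApartmentBuilding extends House {
    private int apartments;
    private Set<String> tenants = new HashSet<>();

    // Hypothetical helpers, added so the demo can observe the tenants set.
    void addTenant(String name) { tenants.add(name); }
    int tenantCount() { return tenants.size(); }

    @Override
    public ApartmentBuilding clone() {
        ApartmentBuilding copy = (ApartmentBuilding) super.clone();
        // The set is mutable, so the copy gets its own instance.
        copy.tenants = new HashSet<>(tenants);
        return copy;
    }
}

// Hypothetical subclass whose author never wrote clone(); it still clones to its own type.
class Cottage extends House {
}

class CloneDemo {
    public static void main(String[] args) {
        ApartmentBuilding building = new ApartmentBuilding();
        building.addTenant("Alice");

        House house = building;
        House house2 = house.clone();
        System.out.println(house2.getClass() == house.getClass()); // true

        building.addTenant("Bob");
        System.out.println(((ApartmentBuilding) house2).tenantCount()); // 1

        System.out.println(new Cottage().clone().getClass() == Cottage.class); // true
    }
}
```

The Cottage case is the one that matters most for the argument: because every clone() in the chain defers to super.clone(), a subclass that never overrides the method still comes back as its own type. The trade-off is that such a subclass only gets the field-by-field copy, so any mutable fields it adds would be shared with the original until its author writes an override like the one in ApartmentBuilding.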