Monday, October 19, 2009

Item 24: Make defensive copies when needed




< BACKCONTINUE >


Item 24: Make defensive copies when needed


One thing that makes the Java programming language such a pleasure to use is that it is a safe language. This means that in the absence of native methods it is immune to buffer overruns, array overruns, wild pointers, and other memory corruption errors that plague unsafe languages such as C and C++. In a safe language it is possible to write classes and to know with certainty that their invariants will remain true, no matter what happens in any other part of the system. This is not possible in languages that treat all of memory as one giant array.



Even in a safe language, you aren't insulated from other classes without some effort on your part. You must program defensively with the assumption that clients of your class will do their best to destroy its invariants. This may actually be true if someone tries to break the security of your system, but more likely your class will have to cope with unexpected behavior resulting from honest mistakes on the part of the programmer using your API. Either way, it is worth taking the time to write classes that are robust in the face of ill-behaved clients.



While it is impossible for another class to modify an object's internal state without some assistance from the object, it is surprisingly easy to provide such assistance without meaning to do so. For example, consider the following class, which purports to represent an immutable time period:





// Broken "immutable" time period class
public final class Period {
private final Date start;
private final Date end;

/**
* @param start the beginning of the period.
* @param end the end of the period; must not precede start.
* @throws IllegalArgumentException if start is after end.
* @throws NullPointerException if start or end is null.
*/
public Period(Date start, Date end) {
if (start.compareTo(end) > 0)
throw new IllegalArgumentException(start + " after "
+ end);
this.start = start;
this.end = end;
}

public Date start() {
return start;
}
public Date end() {
return end;
}

... // Remainder omitted
}


At first glance, this class may appear to be immutable and to enforce the invariant that the start of a period does not follow its end. It is, however, easy to violate this invariant by exploiting the fact that Date is mutable:





// Attack the internals of a Period instance
Date start = new Date();
Date end = new Date();
Period p = new Period(start, end);
end.setYear(78); // Modifies internals of p!


To protect the internals of a Period instance from this sort of attack, it is essential to make a defensive copy of each mutable parameter to the constructor and to use the copies as components of the Period instance in place of the originals:





// Repaired constructor - makes defensive copies of parameters
public Period(Date start, Date end) {
this.start = new Date(start.getTime());
this.end = new Date(end.getTime());

if (this.start.compareTo(this.end) > 0)
throw new IllegalArgumentException(start +" after "+ end);
}


With the new constructor in place, the previous attack will have no effect on the Period instance. Note that defensive copies are made before checking the validity of the parameters (Item 23), and the validity check is performed on the copies rather than on the originals. While this may seem unnatural, it is necessary. It protects the class against changes to the parameters from another thread during the "window of vulnerability" between the time the parameters are checked and the time they are copied.



Note also that we did not use Date's clone method to make the defensive copies. Because Date is nonfinal, the clone method is not guaranteed to return an object whose class is java.util.Date; it could return an instance of an untrusted subclass specifically designed for malicious mischief. Such a subclass could, for example, record a reference to each instance in a private static list at the time of its creation and allow the attacker access to this list. This would give the attacker free reign over all instances. To prevent this sort of attack, do not use the clone method to make a defensive copy of a parameter whose type is subclassable by untrusted parties.



While the replacement constructor successfully defends against the previous attack, it is still possible to mutate a Period instance because its accessors offer access to its mutable internals:





// Second attack on the internals of a Period instance
Date start = new Date();
Date end = new Date();
Period p = new Period(start, end);
p.end().setYear(78); // Modifies internals of p!


To defend against the second attack, merely modify the accessors to return defensive copies of mutable internal fields:





// Repaired accessors - make defensive copies of internal fields
public Date start() {
return (Date) start.clone();
}

public Date end() {
return (Date) end.clone();
}


With the new constructor and the new accessors in place, Period is truly immutable. No matter how malicious or incompetent a programmer, there is simply no way he can violate the invariant that the start of a period does not follow its end. This is true because there is no way for any class other than Period itself to gain access to either of the mutable fields in a Period instance. These fields are truly encapsulated within the object.



Note that the new accessors, unlike the new constructor, do use the clone method to make defensive copies. This is acceptable (although not required), as we know with certainty that the class of Period's internal Date objects is java.util.Date rather than some potentially untrusted subclass.



Defensive copying of parameters is not just for immutable classes. Anytime you write a method or constructor that enters a client-provided object into an internal data structure, think about whether the client-provided object is potentially mutable. If it is, think about whether your class could tolerate a change in the object after it was entered into the data structure. If the answer is no, you must defensively copy the object and enter the copy into the data structure in place of the original. For example, if you are considering using a client-provided object reference as an element in an internal Set instance or as a key in an internal Map instance, you should be aware that the invariants of the set or map would be destroyed if the object were modified after it were inserted.



The same is true for defensive copying of internal components prior to returning them to clients. Whether or not your class is immutable, you should think twice before returning a reference to an internal component that is mutable. Chances are you should be returning a defensive copy. Also, it is critical to remember that nonzero-length arrays are always mutable. Therefore you should always make a defensive copy of an internal array before returning it to a client. Alternatively, you could return an immutable view of the array to the user. Both of these techniques are shown in Item 12.



Arguably, the real lesson in all of this is that you should, where possible, use immutable objects as components of your objects so that you that don't have to worry about defensive copying (Item 13). In the case of our Period example, it is worth pointing out that experienced programmers often use the primitive long returned by Date.getTime() as an internal time representation rather than using a Date object reference. They do this primarily because Date is mutable.



It is not always appropriate to make a defensive copy of a mutable parameter before integrating it into an object. There are some methods and constructors whose invocation indicates an explicit handoff of the object referenced by a parameter. When invoking such a method, the client promises that it will no longer modify the object directly. A method or constructor that expects to take control of a client-provided mutable object must make this clear in its documentation.



Classes containing methods or constructors whose invocation indicates a transfer of control cannot defend themselves against malicious clients. Such classes are acceptable only when there is mutual trust between the class and its client or when damage to the class's invariants would harm no one but the client. An example of the latter situation is the wrapper class pattern (Item 14). Depending on the nature of the wrapper class, the client could destroy the class's invariants by directly accessing an object after it has been wrapped, but this typically would harm only the client.





< BACKCONTINUE >

No comments:

Post a Comment