Jackrabbit's:
public interface RemoteIterator extends Remote {
long getSize() throws RemoteException;
void skip(long items) throws NoSuchElementException, RemoteException;
Object[] nextObjects() throws IllegalArgumentException, RemoteException;
}
The RemoteIterator interface I've recreated a number of times over the years:
public interface RemoteIterator<T > extends Remote {
public boolean hasMore() throws RemoteException;
public List<T > next( int preferredBatchSize ) throws RemoteException;
}
Let's go through the methods individually:
- long getSize() throws RemoteException
This seems problematic to me. Perhaps in Jackrabbit, it is reasonable to assume that you will always know the size of the collection to be iterated. But in many situations, this is not the case. For example, if the iterator is ultimately coming from a database cursor and the collection is large, we want to avoid having to run two SQL queries or store the entire collection in memory (which may be prohibitively expensive). I can see a derived interface, perhaps named BoundedRemoteIterator or something, but not in this interface. - void skip(long items) throws NoSuchElementException, RemoteException
I love this idea. Wish I'd thought of it myself. A trivial implementation would do whatever next( items ), would do and simply not return the items. A smart implementation can save time and space by not actually generating the objects (since they won't be returned). The only change I would make is to have a return value of the number of items skipped and get rid of the NoSuchElementException. If I ask to skip 50 items and only 29 more exist, I would get a return of 29 and no exception. Admittedly, if the interface included a getSize operation, throwing NoSuchElementException is less strange. But even in that case, it seems too rigid to consider skipping more items than are left an 'exception'. - boolean hasMore() throws RemoteException
At first, you might think this is a wasteful operation. Why make a call to find out if there are more items? This results in 'extra' remote calls instead of simply asking for the next item. However, if we imagine the items themselves as quite large (e.g. multi-megapixel images), we start to see a use for this API. If we were keeping the getSize operation, this might not be necessary, but again, getSize() requires state tracking on the client side that can substantially complicate the client code. - Object[] nextObjects() throws IllegalArgumentException, RemoteException
This is similar to my next operation discussed below, except (1) it throws IllegalArgumentException, (2) returns an array of objects instead of a List of a specific type, and (3) does not take a preferred size.
(1) I'm guessing that the getSize operation is intended to be called and then the caller is expected to know when all items in the iterator are consumed. This creates a problem if the client consumes different parts of the iterator in different parts of the code. Now, they have to pass around 2 or 3 items (remote iterator, number of items consumed so far, and possibly the total size) to do this. This is clunky to say the least.
(2) Jackrabbit may have been specified before 1.5 was in wide use, or may feel the need for backward compatibility, hence the use of an array of Object. The downside to this is the loss of type-safety and loss of an API that can be optimized. See the discussion of next below.
(3) See the discussion of next below for the rationale for a preferred batch size. - List<T> next( int preferredBatchSize ) throws RemoteException
By returning a List<T>, we not only introduce type information/safety, we also allow for further optimization. The List returned could be a simple implementation such as LinkedList or ArrayList. Or, it could be a smart List that collaborates with the RemoteIterator to delay retrieving its contents or retrieve the contents in separate threads. There are lots of possibilities introduced by avoiding raw Java arrays.
Allowing the client to pass a preferred batch size provides many optimization opportunities. For example, if displaying the items in a GUI, they can ask for 'page-sized' numbers of items. By making the preferred size a suggestion, it allows an implementation leeway to ignore requests that are deemed unmanageably small or large for the kind of information being returned.
So, where does this leave us? Our RemoteIterator now looks like this:
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;
/**
* Interface for a remote iterator that returns entries of a particular type.
* @see java.util.Iterator
* @see java.rmi.Remote
*/
public interface RemoteIterator<T > extends Remote {
/**
* Determine if there are more items to be iterated across
* @return true if there are more items to be
* iterated, false otherwise.
* @throws RemoteException if a problem occurs with
* communication or on the remote server.
*/
boolean hasMore() throws RemoteException;
/**
* Skip some number of items.
* @param items the number of items to skip at maximum. If there are fewer
* items than this left, all remaining items are skipped.
* @return the number of itmes actually skipped. This will equal <code >items</code >
* unless there were not that many items left in the iteration.
* @throws RemoteException if a problem occurs with
* communication or on the remote server.
*/
int skip(long items) throws RemoteException;
/**
* Get some number of items
* @param preferredBatchSize a suggested number of items to return. The implementation
* is not required to honor this request if it will prove too difficult.
* @return a List of the next items in the iteration, in iteration order.
* @throws RemoteException if a problem occurs with
* communication or on the remote server.
*/
List<T > next( int preferredBatchSize ) throws RemoteException;
}
I'm still wondering why this isn't part of the JDK.