A post on Christian Güdemann's blog and a followup on Nathan Freeman's made me figure it could be useful to discuss how I deal with the job of working with document collections in the frostillic.us Framework.
First of all: I am not solving the same original problem Christian had. His post discussed strategies for actually opening and processing every document and secondarily building a collection using inadvisable selection formulas for views (namely, @Today). I don't generally do that stuff, so our paths diverge.
The collection code in the Framework is very focused on dancing on top of existing view indexes: using them for collecting documents, for sorting/categorization, and for accessing summary data. In many ways, the main Framework collections can be thought of as a re-implementation of the Domino view data source in XPages, except without the same cheating they do and with some side benefits.
The core essence of a Framework collection - a DominoModelList - is that it stores information about how to access the underlying view, which category filter to use, and which model class to use to create objects. This can be seen in its constructor; it grabs important metadata information about the view and that's about it. It's not until it's required that the list does the dirty work of actually fetching a ViewNavigator (or re-using one in the current request). As much as possible, I aim to use ViewNavigators - which is to say, whenever there's not an FT search involved. Navigators are (for Domino) highly efficient, particularly compared to the shockingly-bad performance characteristics of
The primary consumer of the navigator is the get(int) method (among other things, collections implement List). If you'll kindly ignore some workaround code for a bug I haven't reliably been able to pin on either my code or the OpenNTF API yet, you can see the navigator use in action: unless there's a search in place (in which case it falls back to VEC), the method fetches an active navigator, skips to the appropriate requested entry, and uses getCurrent() to retrieve it. Though the skip/retrieve mix is odd when the average case will be iterating over successive entries, the performance is speedy enough that I haven't felt the need to try to cover both random and sequential access differently.
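To make the lazy-fetch and skip/retrieve pattern concrete, here is a minimal sketch. It is not the Framework's actual code: EntryCursor, ListCursor, and LazyCursorList are hypothetical stand-ins invented for illustration (the cursor plays the role of a navigator), so the example runs without Domino.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a view navigator: a cursor over ordered entries. */
interface EntryCursor<E> {
    void gotoFirst();         // reset the cursor to the first entry
    boolean skip(int count);  // move forward by count; false if that would run past the end
    E getCurrent();           // entry at the current position
    int size();               // total number of entries
}

/** In-memory cursor used here in place of a real view index. */
final class ListCursor<E> implements EntryCursor<E> {
    private final List<E> entries;
    private int position = 0;

    ListCursor(List<E> entries) { this.entries = entries; }

    public void gotoFirst() { position = 0; }

    public boolean skip(int count) {
        if (position + count >= entries.size()) {
            return false;
        }
        position += count;
        return true;
    }

    public E getCurrent() { return entries.get(position); }

    public int size() { return entries.size(); }
}

/** List that stores only how to open its cursor and opens it on first use. */
final class LazyCursorList<E> extends AbstractList<E> {
    private final Supplier<EntryCursor<E>> opener;
    private EntryCursor<E> cursor;  // null until the first access
    int openCount = 0;              // exposed only to show the laziness

    LazyCursorList(Supplier<EntryCursor<E>> opener) { this.opener = opener; }

    private EntryCursor<E> cursor() {
        if (cursor == null) {
            cursor = opener.get();
            openCount++;
        }
        return cursor;
    }

    @Override
    public E get(int index) {
        EntryCursor<E> c = cursor();
        if (index < 0 || index >= c.size()) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        // Same shape for random and sequential access: rewind, skip, read.
        c.gotoFirst();
        if (index > 0) {
            c.skip(index);
        }
        return c.getCurrent();
    }

    @Override
    public int size() { return cursor().size(); }
}
```

Constructing a LazyCursorList does no work beyond remembering the opener; the cursor is opened once, on the first get(int) or size() call, and reused afterwards.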
Since 8.5.0, we have the ability to use click-to-sort columns in the back-end API, and I make use of this to expose sortable columns through TabularDataModel's sorting methods. Once a sorted column is chosen, I pass that along to the underlying view when fetching it. If there's an FT search query specified, I use the surfaced-in-8.5.3 FTSearchSorted method to retain the sorting.
In 9.0.1, IBM added the ability to collapse categories in navigators via the setAutoExpandGuidance method paired with the view's
TabularDataModel's expand/collapse methods, I maintain a Set of the faux category note IDs generated when setEnableNoteIDsForCategories is enabled and pass them to the navigator when appropriate. This allows my code to deal with arbitrarily-collapsed categories without containing any other special code - considering how uselessly buggy my implementation of this was before 9.0.1, I'm quite happy those methods are there.
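The bookkeeping side of this can be sketched in a few lines. This is only an illustration under assumptions: CollapsedCategoryTracker is a hypothetical class name, the category note IDs are assumed to be tracked as ints, and the actual hand-off to the navigator is left out because it needs the Domino API.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical tracker for which category rows are currently collapsed. */
final class CollapsedCategoryTracker {
    // Sorted set so the exported array has a stable order.
    private final Set<Integer> collapsedIds = new TreeSet<>();

    /** Record that the category with this (faux) note ID was collapsed. */
    void collapse(int categoryNoteId) { collapsedIds.add(categoryNoteId); }

    /** Record that the category was expanded again. */
    void expand(int categoryNoteId) { collapsedIds.remove(categoryNoteId); }

    boolean isCollapsed(int categoryNoteId) { return collapsedIds.contains(categoryNoteId); }

    /** Snapshot of the collapsed IDs, in the form a navigator call could consume. */
    int[] toArray() {
        return collapsedIds.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The expand/collapse methods of the data model would only need to update this set; the array snapshot is what gets passed along when a navigator is next fetched.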
The combination of the sorting and categorization capabilities means that my collections are able to support the same xp:viewPanel UI features that standard xp:dominoView data sources do.
Deferred Data Access
The final key concept in the Framework is deferring access to a document until it's actually necessary. Each model object can be constructed in one of three ways: as a new document in a database, as a wrapper around an existing document, or as a wrapper around a view entry. In the view entry case, the model object doesn't touch the underlying document. Instead, it makes a note of the database path and the UNID (if it's not a category) in case it DOES need to access the document later, and then stores the entry's column values in a map. While the model objects don't make a user-side distinction between view entries and documents (you can request any item value whether or not it's in the view), they DO use these cached column values first. So if your code only requests values that are present in the view, the underlying document is never accessed at all. This leads to (comparatively) very efficient data access without making the user of the objects worry about manually accessing the document when a value isn't in the view.
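A minimal sketch of that lookup order, with everything Domino-specific replaced by plain maps: DeferredModel, its fields, and the Supplier-based loader are hypothetical names for illustration, not the Framework's classes. The loader stands in for "open the document by database path and UNID".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical model object backed by cached view columns, with a lazily loaded document. */
final class DeferredModel {
    private final Map<String, Object> columnValues;               // values copied from the view entry
    private final Supplier<Map<String, Object>> documentLoader;   // stand-in for fetching the real document
    private Map<String, Object> document;                         // null until a non-view value is requested
    int documentLoads = 0;                                        // exposed only to show when loading happens

    DeferredModel(Map<String, Object> columnValues, Supplier<Map<String, Object>> documentLoader) {
        this.columnValues = new HashMap<>(columnValues);
        this.documentLoader = documentLoader;
    }

    Object getValue(String name) {
        // Cached column values win, so view-only access never touches the document.
        if (columnValues.containsKey(name)) {
            return columnValues.get(name);
        }
        // Anything else triggers a one-time document load.
        if (document == null) {
            document = documentLoader.get();
            documentLoads++;
        }
        return document.get(name);
    }
}
```

Callers use getValue the same way regardless of where the value lives; the only observable difference is whether the loader ever runs.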
The result of all this is that the Framework collections share all the advantages and pitfalls of the underlying views. Some things are easy and fast (categorization, sorting, multi-entry documents, summary data) while some are still impractical or slow (Rich Text, MIMEBeans, arbitrary queries). But you go to production with the database indexer you have, and so far this method has been serving me well.