Showing posts for tag "c"

How I Learned to Stop Worrying and Love C Structs

Aug 4, 2014, 10:38 PM

Tags: c java

Since a large amount of on-site client time has put my framework work on hold for a bit, I figured I'd continue my dalliance in the world of raw item data.

Specifically, what I've been doing is filling out the collection of structs with Java wrappers, particularly in the "cd" subpackage. For someone like me who has only done bits and pieces of C over the years, this is a fruitful experience, particularly since I'm dealing with just the core concept of the Notes API data structures without the messy business of actually worrying about memory or crashing programs.

If you're not familiar with structs, they are effectively just the data portion of a class. They're like a blueprint of related pieces of data - ints, doubles, other structs, etc. - used both for creating new elements and as a contract for the type of data received from an API. The latter is what I'm dealing with: since the raw item data in DXL is presented as just a series of bytes, the struct documentation is vital in figuring out what to expect in which spot. Here is a representative example of a struct declaration from the Notes API, one which includes bonus concepts:

typedef struct{
	LSIG	Header;		/* Signature and Length */
	WORD 	FileExtLen;		/* Length of file extension */
	DWORD	FileDataSize;	/* Size (in bytes) of the file data */
	DWORD	SegCount;		/* Number of CDFILESEGMENT records expected to follow */
	DWORD	Flags;		/* Flags (currently unused) */
	DWORD	Reserved;		/* Reserved for future use */
	/*	Variable length string follows (not null terminated).
		This string is the file extension for the file. */
} CDFILEHEADER;

So... that's a thing, isn't it? It represents the start of a File Resource (or CSS file, Java class, XPage, etc.). I'll start from the top:

  1. LSIG Header: An "LSIG" is actually another struct, but a smaller one: its job is to let a processor (like my code) identify the following record, as well as providing the total size of the record. When processing the byte stream, my code looks to the first two bytes of each record to determine what to expect for the next block.
  2. WORD FileExtLen: To start with, "WORD" here is just an alias for an "unsigned short" - a 16-bit integer ranging from 0 to 65535 (64K). The name comes from the computer-architecture term "word". This field specifically denotes the number of bytes to expect after the fixed portion of the record for the file extension of the file resource being defined.
    • The "unsigned" above refers to the way the numbers are stored in memory, which is to say a series of binary digits. By default, the highest-value bit is reserved for the "sign" - a 0 for positive and a 1 for negative. Normal "signed" numbers can store positive values only half the size of their unsigned counterparts, because going one past "0111 1111" (+127) wraps around to "1000 0000" (-128). This is why the "half" values - 32K, 2G - are seen as commonly as their "full" counterparts. As to why negative numbers are represented with all those zeros (the scheme is called two's complement), you'll either have to read up independently or just trust me for now.
  3. DWORD FileDataSize: A "DWORD" is double the size of a "WORD" (hence the "D", presumably): it's an unsigned 32-bit number, ranging from 0 to 4,294,967,295 (4G). This number represents the total size of the file attachment, and so also presumably represents the maximum size of a file resource in Domino.
  4. DWORD SegCount: This indicates the number of "segment" records expected to follow. Segment records are similar to this header we're looking at, but contain the actual file data, split across multiple chunks and across multiple items. That "multiple items" bit is due to the overall structure of rich-text items, and is why you'll often see rich text item names (like "Body" or "$FileData") repeated many times within a note.
  5. DWORD Flags: As the comment in the API indicates, this field is currently unused. However, it's representative of something that is used very commonly in other structures: a set of bits that isn't useful as a number, but instead has values set to correspond to traits of the entity in question. This is referred to as a bit field and is an efficient way to store a fixed number of flags. So if you had a four-bit flags field for a text item, the bits may indicate whether it's summary data, a names field, a readers field, and/or an authors field - for example "1100" for a field that's summary and names, but not readers or authors. That is, incidentally, basically how those flags are implemented, albeit with a larger bit field.
    • There is a secondary type of "flag" in Notes: the "$Flags" and "$FlagsExt" fields in design elements. Those flags are less efficient - 8 bits per flag instead of 1 - but are generally more extensible and somewhat conceptually easier.
    • These bit fields are what those oddball bitwise operators are for, by the way.
  6. DWORD Reserved: Many (most?) C API data structures contain at least one block of bits like this that is reserved for future use. These are so that IBM can add features without breaking all existing code: API users are expected to leave anything there intact and to create new structures with all zeros. You can see this in action in a couple places, such as the addition of rudimentary theme support to legacy design elements, which used up a byte out of the previously-reserved block.
  7. Variable data: This is the fun part. Many structures contain "variable" data following the "fixed" portion. Whereas the previous parts are a predictable size - a DWORD will always be four bytes - the parts following the block can be anywhere from 0 bytes to whatever is the max value of their referring entity. Fortunately, the "SIG" at the beginning of the record tells the API user the length ahead of time, so it's not required to read it all in when dealing with an API entity. Still, the code to read this can be complex and bug-prone, particularly when there are multiple variable parts packed together. In this case, though, it's relatively simple: we get the value of "FileExtLen" and read that many bytes into an array and convert that from LMBCS to a respectable Unicode string.
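The two bit-level ideas from the list, flag bits and signed versus unsigned readings of the same bits, can be sketched in a few lines of C. This is only an illustration: the `DEMO_FLAG_*` values are made up to match the four-bit "1100" example above and are not the real Notes API constants, and every `demo_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical four-bit layout matching the "1100" example in the text.
   These are NOT the real Notes API flag values. */
enum {
    DEMO_FLAG_AUTHORS = 1 << 0, /* 0001 */
    DEMO_FLAG_READERS = 1 << 1, /* 0010 */
    DEMO_FLAG_NAMES   = 1 << 2, /* 0100 */
    DEMO_FLAG_SUMMARY = 1 << 3  /* 1000 */
};

/* True when every bit in `mask` is also set in `flags`. */
bool demo_has_flags(uint32_t flags, uint32_t mask) {
    return (flags & mask) == mask;
}

/* Reinterpret the same 8 bits as a signed (two's complement) value.
   memcpy copies the bit pattern without any value conversion. */
int8_t demo_as_signed(uint8_t bits) {
    int8_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Usage: a summary + names field, and the wrap-around at the sign bit. */
void demo_bits_self_check(void) {
    uint32_t flags = DEMO_FLAG_SUMMARY | DEMO_FLAG_NAMES; /* binary 1100 */
    assert(demo_has_flags(flags, DEMO_FLAG_SUMMARY | DEMO_FLAG_NAMES));
    assert(!demo_has_flags(flags, DEMO_FLAG_READERS));
    assert(demo_as_signed(0x7F) == 127);  /* 0111 1111 */
    assert(demo_as_signed(0x80) == -128); /* 1000 0000 */
}
```

The bitwise AND and OR here are the "oddball" operators in question: OR combines flags into one number, and AND with a mask tests for them.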

Dealing with a stream of structs is... awkward at first, particularly when you're used to Java amenities like being able to just pour out and read back in serialized objects without even thinking twice about it. After a while, though, it gets easier, and you get an appreciation for a lower-level type of programming. And while you still may not be happy about Domino's 32K summary-data limit, you at least get an understanding of why it's there.
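To give a flavor of what reading one of these records involves, here is a minimal sketch in C that pulls the fixed fields of the file header out of a byte buffer and then locates the variable-length extension. It rests on a few assumptions that are mine rather than anything stated above: the bytes are little-endian, the LSIG is a 2-byte signature followed by a 4-byte length, and the fields are packed with no padding, which puts the fixed portion at 24 bytes. The `Demo`/`demo_` names are hypothetical and the extension is left as raw LMBCS bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fixed size: LSIG (2 + 4 bytes), one WORD, four DWORDs. */
#define DEMO_FILEHEADER_FIXED_SIZE 24u

typedef struct {
    uint16_t signature;       /* LSIG: record type */
    uint32_t record_length;   /* LSIG: total record size, variable data included */
    uint16_t file_ext_len;
    uint32_t file_data_size;
    uint32_t seg_count;
    uint32_t flags;
    uint32_t reserved;
    const uint8_t *file_ext;  /* points into the input; not NUL-terminated */
} DemoFileHeader;

/* Read little-endian integers one byte at a time (no alignment concerns). */
uint16_t demo_read_u16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t demo_read_u32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the lengths disagree. */
int demo_parse_file_header(const uint8_t *buf, size_t len, DemoFileHeader *out) {
    if (buf == NULL || out == NULL || len < DEMO_FILEHEADER_FIXED_SIZE) return -1;
    out->signature      = demo_read_u16(buf);
    out->record_length  = demo_read_u32(buf + 2);
    out->file_ext_len   = demo_read_u16(buf + 6);
    out->file_data_size = demo_read_u32(buf + 8);
    out->seg_count      = demo_read_u32(buf + 12);
    out->flags          = demo_read_u32(buf + 16);
    out->reserved       = demo_read_u32(buf + 20);
    /* The variable part must fit in both the buffer and the declared record. */
    if ((size_t)out->file_ext_len > len - DEMO_FILEHEADER_FIXED_SIZE) return -1;
    if (out->record_length < DEMO_FILEHEADER_FIXED_SIZE + out->file_ext_len) return -1;
    out->file_ext = buf + DEMO_FILEHEADER_FIXED_SIZE;
    return 0;
}

/* Usage: a hand-built 27-byte record with a 3-byte extension. */
void demo_file_header_self_check(void) {
    const uint8_t bytes[] = {
        0x61, 0x00,             /* signature (arbitrary value for the demo) */
        0x1B, 0x00, 0x00, 0x00, /* record length: 24 fixed + 3 variable = 27 */
        0x03, 0x00,             /* FileExtLen = 3 */
        0x00, 0x01, 0x00, 0x00, /* FileDataSize = 256 */
        0x01, 0x00, 0x00, 0x00, /* SegCount = 1 */
        0x00, 0x00, 0x00, 0x00, /* Flags */
        0x00, 0x00, 0x00, 0x00, /* Reserved */
        'c', 's', 's'           /* file extension bytes */
    };
    DemoFileHeader h;
    assert(demo_parse_file_header(bytes, sizeof bytes, &h) == 0);
    assert(h.file_ext_len == 3 && h.file_data_size == 256 && h.seg_count == 1);
    assert(memcmp(h.file_ext, "css", 3) == 0);
}
```

Reading field by field rather than casting the buffer to a struct pointer sidesteps compiler padding and host byte order, at the cost of a little verbosity.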

So if you're in a position to dive into this sort of thing once in a while, I recommend you do so. Though the value of what I'm writing specifically is mixed - I doubt the world needs, say, a Java wrapper for a CD record reflecting a DECS field association - the benefit to my brain is immense. Dealing with web and Java programming can cause you to become very disconnected from the fundamentals of programming, and something like this can bring you back to solid ground. Give it a shot!

The DSAPI Login Filter I Wrote Years Ago

May 19, 2014, 3:02 PM

Tags: dsapi c

In one of the conversations I had at the meetup yesterday, I was reminded of the DSAPI filter I use on my server for authentication, and remembered I'd yet to properly blog about it.

Years ago, I had a problem: I was setting up an XPages-based forum site for my WoW guild, and I wanted sticky logins. Since my guildies are actual humans and not corporate drones subject to the whims of an IT department, I couldn't expect them to put up with having to log in every browser session, nor did I want to deal with SSO tokens expiring seemingly randomly during normal use. My first swing at the problem was to grossly extend the length of Domino's web sessions and tweak the auth cookie on the server to make it persist between browser launches, but that still broke whenever I restarted the server or HTTP task.

What I wanted instead was a way to have the user authenticate in a way that was separate from the normal Domino login routine, but would still grant them normal rights as a Domino user, reader fields and all. Because my sense of work:reward ratios is terribly flawed, I wrote a DSAPI filter in C. Now, I hadn't written a line of C since a course or two in college, and I hadn't the foggiest notion of how the Domino C API works (for the record, I consider myself as now having exactly the foggiest notion), and the result is a mess. However, it contains the kernel of some interesting concepts. So here's the file, warts, unnecessary comments, probable memory leaks or buffer overflows, and all:


The gist of the way my login works is that, when you log in to the forum app, a bit of code (in an SSJS library, because I was young and foolish) finds the appropriate ShortName, does a basic XOR semi-encryption on it, BASE64s the result, and stores it in a cookie. The job of this DSAPI filter, then, is to look for the presence of the cookie, de-BASE64 it, re-XOR it back to shape, and pass the username back to Domino.
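The cookie-decoding half of that round trip can be sketched roughly as follows. This is not the code from the filter itself: the repeating XOR key, the standard padded Base64 alphabet, and all the `demo_` names are assumptions made for illustration, and the DSAPI plumbing that would hand the name back to Domino is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map one Base64 character to its 6-bit value, or -1 if invalid (assumes ASCII). */
int demo_b64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode padded Base64. Returns the byte count, or -1 on bad input or a full buffer. */
long demo_b64_decode(const char *in, uint8_t *out, size_t out_cap) {
    size_t in_len = strlen(in);
    size_t written = 0;
    if (in_len % 4 != 0) return -1;
    for (size_t i = 0; i < in_len; i += 4) {
        int v[4];
        int pad = 0;
        for (int j = 0; j < 4; j++) {
            char c = in[i + j];
            if (c == '=') {
                /* Padding may only fill the last one or two slots of the final group. */
                if (i + 4 != in_len || j < 2) return -1;
                v[j] = 0;
                pad++;
            } else {
                if (pad > 0) return -1; /* data after padding */
                v[j] = demo_b64_value(c);
                if (v[j] < 0) return -1;
            }
        }
        uint32_t triple = ((uint32_t)v[0] << 18) | ((uint32_t)v[1] << 12) |
                          ((uint32_t)v[2] << 6) | (uint32_t)v[3];
        size_t n = 3 - (size_t)pad;
        if (written + n > out_cap) return -1;
        out[written++] = (uint8_t)(triple >> 16);
        if (n > 1) out[written++] = (uint8_t)(triple >> 8);
        if (n > 2) out[written++] = (uint8_t)triple;
    }
    return (long)written;
}

/* XOR each byte against a repeating key; applying it twice restores the input. */
void demo_xor(uint8_t *data, size_t len, const uint8_t *key, size_t key_len) {
    for (size_t i = 0; i < len; i++) data[i] ^= key[i % key_len];
}

/* Cookie value -> NUL-terminated user name. Returns 0 on success, -1 otherwise. */
int demo_cookie_to_name(const char *cookie, const uint8_t *key, size_t key_len,
                        char *name, size_t name_cap) {
    if (key_len == 0 || name_cap == 0) return -1;
    long n = demo_b64_decode(cookie, (uint8_t *)name, name_cap - 1);
    if (n < 0) return -1;
    demo_xor((uint8_t *)name, (size_t)n, key, key_len);
    name[n] = '\0';
    return 0;
}

/* Usage: "jkirk" XORed with the one-byte key 'K', then Base64-encoded. */
void demo_cookie_self_check(void) {
    const uint8_t key[] = { 'K' };
    char name[32];
    assert(demo_cookie_to_name("ISAiOSA=", key, sizeof key, name, sizeof name) == 0);
    assert(strcmp(name, "jkirk") == 0);
}
```

Every length is checked against the output buffer before writing, which is exactly the kind of bookkeeping that is easy to get wrong inside a server process.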

Now, the thing that's interesting to me is that, because you've hooked directly into Domino's authentication stack, the server trusts the filter's result implicitly. It doesn't have to check the password and - interestingly - the user doesn't actually have to exist in any known Directory. So you can make up any old thing:

CN=James T. Kirk/OU=Starfleet/O=UFP

Though I do not, in fact, maintain a Directory for the Federation, Domino is perfectly happy to take this name and run with it, generating a full-fledged names list:

Names List: CN=James T. Kirk/OU=Starfleet/O=UFP, *, */OU=Starfleet/O=UFP, */O=UFP

It gets better: you can use any of these names, globs included, in Directory groups or DB-level ACLs and they're included transparently. So I set up a couple groups in the main Directory containing either the full username or "*/OU=Starfleet/O=UFP" and also granted "*/O=UFP" Editor access and an Admin role in the DB itself. Lo and behold:

Names List: CN=James T. Kirk/OU=Starfleet/O=UFP, *, */OU=Starfleet/O=UFP, */O=UFP, Starfleet Captains, Guys Who Aren't As Good As Picard, [Admin]
Access Level: Editor

Now we're somewhere interesting! Though I didn't use it for this purpose (all my users actually exist in a normal Directory), you could presumably use this to do per-app user pools while still maintaining the benefits of Domino-level authentication (reader fields, ACLs, user tracking). With a lot of coordination, you could write your C filter to look for appropriate views in the database matching the incoming HTTP request and only pass through the username when the cookie credentials match a document in the requested app. Really, only the requirements of high performance and the relentless difficulty of writing non-server-crashing C stand in between you and doing some really clever things with web authentication.