
Schema Administration

In general, schema in Meerkat DSA follows the X.500 directory schema administration model defined in the X.500 series of specifications.

Use of Schema

The X.500 standards do not clearly require that all attribute types used within a subschema be defined within the subschema subentry. In fact, the LDAP specifications explicitly do not require this. As such, certain subschema operational attributes that describe schema objects that are universal in nature (e.g. the attribute type with object identifier 2.5.4.3 is universally commonName and MUST have the same exact meaning everywhere) are purely informative. Meerkat DSA does not check that, say, an attribute type is defined in the relevant subschema before permitting it to be used in an entry.

Meerkat DSA does, however, maintain an internal index of recognized schema objects, and it does check that entry creations and modifications only make use of schema objects that are recognized within this index. This internal index is populated with schema objects as described in the following sections.

note

If you are looking at the code of Meerkat DSA, the "index" discussed above is the aggregation of the attributeTypes, objectClasses, nameForms, contextTypes, and other schema-related properties of the context object, often named ctx.

Pre-Installed Schema

Meerkat DSA comes with schema pre-installed. At a minimum, this pre-installed schema includes:

  • All schema in the X.500 specifications.

It is highly recommended that you do not edit or re-define any of the pre-installed schema: some of it is critical to directory operation. Meerkat DSA may fail to work at all if some pre-installed schema elements are altered.

If you believe there is a mistake pertaining to the implementation of pre-installed schema, please report it as a bug instead so it can be fixed properly. Only alter the pre-installed schema if it is advised as a workaround for a known issue.

Viewing Schema

In X.500 directories, schema are stored and served from subschema subentries. The schema that these subentries serve applies only to the subentry's administrative area. (Note that the subtreeSpecification is ignored for subschema subentries. There may only be one subschema subentry, and it applies for the whole administrative area. The subtree specification must not have any minimums, maximums, chops, or refinements at all.) Within a subschema subentry, the elements of the schema are served in attributes.

Even though these attributes are supposed to be stored in subschema subentries, some of these subschema elements describe attribute types, context types, name forms, matching rules, and other constructs that are universal, because they have a single, universally unique object identifier that identifies them. Meerkat DSA stores these schema elements independently of the subschema subentries, but displays them in every single subschema subentry. For example, if you define an object class in Meerkat DSA, that object class will appear in every single subschema subentry, regardless of which subschema subentry you added the objectClasses value to. As stated above, this behavior applies to all schema constructs that are identified by an object identifier; such schema elements will be referred to as "universal schema elements" throughout this documentation.

Schema elements that are not global in nature do apply to specific subentries. These include DIT structure rules, DIT content rules, DIT context use rules, matching rule uses, and attribute friendships.

Root DSE Schema

The LDAP specifications require there to be a subschema subentry that applies to the Root DSE. This is useful for informing clients about what attribute types exist in the root DSE. In Meerkat DSA, the root DSE may not be edited, which makes a hard-coded subschema subentry an easy and obvious solution. In Meerkat DSA, there is a hard-coded subschema subentry having the distinguished name cn=subschema. This subschema subentry only "exists" if it is queried directly.

Editing Schema

There is yet another categorical dichotomy in schema elements: those that are purely data, and those having a functional component. Some schema elements can be represented purely as data, whereas others require some code to function. For instance, to implement a matching rule, somebody has to actually write code to perform the matching. Those subschema elements that require code are:

  • Attribute types
  • Context types*
  • Matching rules
  • LDAP syntaxes

In attribute types, functions must be defined for decoding a Basic Encoding Rules-encoded ASN.1 element representing the value into a usable data type and for encoding it back.

In context types, a function (a "context matcher") must be defined for matching a Basic Encoding Rules-encoded ASN.1 element representing a context value with a similarly-encoded context assertion value. Another function may need to be defined to produce a default value for the context type.

note

You can define a context type in the database via the ContextDescription table, or by adding values to the contextTypes operational attribute of a subschema that resides within your DSA (not a shadow or RHOB subentry, however). A context type created in this manner will have a general-purpose matcher that compares two ASN.1 values byte-for-byte (this differs from comparing the whole element byte-for-byte because it takes into consideration that the element may be constructed and deconstructs it before the byte-for-byte comparison).

This means that you might not need to define a new context type in the init script if its syntax is an INTEGER, OBJECT IDENTIFIER, ENUMERATED, NULL, or any other primitive type that has exactly one byte representation when using the Basic Encoding Rules. (BOOLEAN qualifies only if every writer encodes TRUE as the same octet, because the Basic Encoding Rules permit any non-zero octet for TRUE.) All bets are off when the syntax is a constructed type or a type that could be constructed.

You may also define a DEFAULT-VALUE for that context type in the database. This will be the raw bytes of a Basic Encoding Rules-encoded ASN.1 value.

Still, it is better to define a context type in the init script, because you can define more complex and strict comparators.
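The difference between the general-purpose matcher and a custom one can be sketched as follows. This is only an illustration under stated assumptions: bytesEqual and languageCodesMatch are hypothetical helpers operating on raw content octets, not Meerkat DSA's actual implementation, and the case-insensitive rule is an invented example of a "more complex" comparator.

```javascript
// Byte-for-byte comparison of two values' content octets.
// (Illustrative stand-in for a general-purpose matcher.)
function bytesEqual (a, b) {
    if (a.length !== b.length) {
        return false;
    }
    for (let i = 0; i < a.length; i++) {
        if (a[i] !== b[i]) {
            return false;
        }
    }
    return true;
}

// Hypothetical custom comparator: treats ASCII language codes
// case-insensitively, which a byte-for-byte comparison cannot do.
function languageCodesMatch (a, b) {
    const toLower = (bytes) => String.fromCharCode(...bytes).toLowerCase();
    return toLower(a) === toLower(b);
}

const lowerEn = Uint8Array.from([ 0x65, 0x6E ]); // "en"
const upperEn = Uint8Array.from([ 0x45, 0x4E ]); // "EN"

const genericResult = bytesEqual(lowerEn, upperEn);        // The bytes differ.
const customResult = languageCodesMatch(lowerEn, upperEn); // The codes agree.
```

Here the generic comparison rejects "EN" as a match for "en", while the custom comparator accepts it.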

In matching rules, a function (a "matcher") must be defined for evaluating whether a Basic Encoding Rules-encoded ASN.1 element representing an attribute value matches against a Basic Encoding Rules-encoded ASN.1 element representing the matching rule's assertion syntax. This function is slightly different depending on whether an equality matching rule, ordering matching rule, or substrings matching rule is being defined.

In LDAP syntaxes, functions must be defined for converting to LDAPStrings from ASN.1 elements and vice versa, so that attribute values in LDAP can be translated into a form that Meerkat DSA can utilize. Instead of handling LDAP attribute values directly, Meerkat DSA translates them to the equivalent ASN.1 elements that would have been in an equivalent Directory Access Protocol (DAP) request. When Meerkat DSA is done processing the request, any attribute values in the response are translated back into their LDAP equivalents using these functions.

For schema elements that require code, you'll have to add them via the init script, which is detailed below.

Editing Data-Only Schema

Schema elements that are purely data, such as name forms, can be defined by simply adding them to subschema subentries, as the X.500 specifications prescribe. Alternatively, they can be added to the database directly, where each kind of schema element has its own table. Note that if you insert new schema elements into the database directly, you may need to restart Meerkat DSA for them to appear.

Using the Directory Access Protocol (DAP) to define new schema elements should be preferred to directly inserting data into the database. One downside of using the DAP to modify schema elements is that you cannot delete universal schema elements (attribute types, object classes, etc.) once they are defined. This is intentional: its purpose is to prevent administrators from redefining schema elements such that a given object identifier now ambiguously refers to multiple different versions of a schema element.

Note that schema objects can only be added via subentries that are "internal" to your DSA. If you try to add schema objects to subentries of DSE type rhob or shadow, they will not be recognized by your DSA. If this were not the case, DSAs with which you have an outstanding operational binding could overwrite your DSA's internal conception of these schema elements. It is also worth noting that new schema objects are recognized by your DSA on a first-come-first-served basis. If you add, say, an attribute type with object identifier 2.5.4.3 to a subentry, then somebody else comes along to add the same attribute type to another subentry, the first definition of that attribute type will persist throughout the whole DSA.
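The first-come-first-served behavior can be pictured with a small sketch. It assumes the schema index behaves like a JavaScript Map keyed by dot-delimited object identifier; defineOnce and the shape of the definitions are hypothetical and only model the rule described above.

```javascript
// Registers a definition only if the object identifier is not yet known.
// Returns whichever definition is in force afterwards.
function defineOnce (index, oid, definition) {
    if (index.has(oid)) {
        return index.get(oid); // An earlier definition wins.
    }
    index.set(oid, definition);
    return definition;
}

const attributeTypeIndex = new Map();

// The first subentry to define 2.5.4.3 determines its meaning...
const firstDefinition = defineOnce(attributeTypeIndex, "2.5.4.3", {
    name: "commonName",
    definedBy: "subentry A",
});

// ...and a later, conflicting definition is not adopted.
const effectiveDefinition = defineOnce(attributeTypeIndex, "2.5.4.3", {
    name: "somethingElse",
    definedBy: "subentry B",
});
```

After both calls, effectiveDefinition is the same object as firstDefinition.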

The benefit of using the database directly is merely that it is simpler and faster.

Finally, you can also use the init script to define schema elements when Meerkat DSA starts up. However, this is not recommended, because these schema elements will only exist while they are defined in the init script. It is strongly advised that data-only schema elements be persisted to the database.

Custom Attribute Types

Below is an example for implementing a custom attribute type:

import {
    AttributeUsage_userApplications,
} from "@wildboar/x500/src/lib/modules/InformationFramework/AttributeUsage.ta";
import { ObjectIdentifier, FALSE } from "asn1-ts";
import { DER, _encodeObjectIdentifier } from "asn1-ts/dist/node/functional";

export async function init (ctx) {
    ctx.attributeTypes.set("2.5.4.3", {
        id: new ObjectIdentifier([ 2, 5, 4, 3 ]),
        name: ["commonName"],
        description: "A general-purpose name",
        equalityMatchingRule: new ObjectIdentifier([ 2, 5, 13, 2 ]),
        // orderingMatchingRule: new ObjectIdentifier(),
        // substringsMatchingRule: new ObjectIdentifier(),
        singleValued: FALSE, // FALSE === false. It's just an alias defined in the asn1-ts library.
        collective: FALSE, // FALSE === false. It's just an alias defined in the asn1-ts library.
        dummy: FALSE, // FALSE === false. It's just an alias defined in the asn1-ts library.
        noUserModification: FALSE, // FALSE === false. It's just an alias defined in the asn1-ts library.
        usage: AttributeUsage_userApplications,
        ldapSyntax: new ObjectIdentifier([ 1, 3, 6, 1, 4, 1, 1466, 115, 121, 1, 15 ]), // Directory string syntax.
        ldapNames: ["cn", "commonName"],
        ldapDescription: "A general purpose name.",
        compatibleMatchingRules: new Set(),
        syntax: "UnboundedDirectoryString",
        // Keep reading for how to implement an attribute type
        // driver: {
        //     readValues
        //     addValue
        //     removeValue
        //     removeAttribute
        //     countValues
        //     isPresent
        //     hasValue
        //     getEntry
        // },
    });
}

export default init;

To be understood, attributes must be added to the context object's attributeTypes map. The key should be--at minimum--the dot-delimited object identifier. You should also map the same value to the ldap names as well; doing so will mean that Meerkat DSA will be able to recognize the attribute by its LDAP name when it receives an LDAP request.
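As a rough sketch of the double registration described above, the helper below stores one attribute info object under both its object identifier and each of its LDAP names. The registerAttributeType function is hypothetical, a plain Map stands in for ctx.attributeTypes, and the oid string field is an assumption made to keep the sketch free of dependencies.

```javascript
// Stores the same info object under the dot-delimited object identifier
// and under every LDAP name, so either key resolves to the attribute type.
function registerAttributeType (attributeTypes, info) {
    attributeTypes.set(info.oid, info);
    for (const ldapName of info.ldapNames ?? []) {
        attributeTypes.set(ldapName, info);
    }
}

const attributeTypes = new Map(); // Stand-in for ctx.attributeTypes.

registerAttributeType(attributeTypes, {
    oid: "2.5.4.3", // Hypothetical field holding the dot-delimited form.
    name: [ "commonName" ],
    ldapNames: [ "cn", "commonName" ],
});

const byOid = attributeTypes.get("2.5.4.3");
const byLdapName = attributeTypes.get("cn");
```

Because both keys point at the same object, a lookup by LDAP name and a lookup by object identifier cannot drift apart.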

The format of the attribute above is pretty much what you'd expect from a reading of ITU Recommendation X.501's definition of the ATTRIBUTE information object class.

The driver field of the attribute info object is for implementing custom functions for interacting with stored attribute values so that attribute values of selected types can be stored, searched, etc. in alternative ways. For instance, if you wanted to store attribute values of this type in their own separate table in the database, you could define a driver that reads and writes from this table instead of the AttributeValue table.

Note that attribute drivers should be used sparingly. It is common for users to request all attribute types when reading or searching an entry. When this happens, every attribute type's readValues() driver will be called. If this function contains a database query, it will slightly increase the time it takes to read an entry. Attribute drivers should only be used where there is some utility in storing values of a given attribute in their own table, rather than in the AttributeValue table.

Custom Object Classes

Object classes are pure data, and as such, there are three ways they can be added to Meerkat DSA:

  • Direct database insertion into the ObjectClassDescription table,
  • Directory Access Protocol (DAP) addition to a subschema subentry via the objectClasses attribute, or
  • Adding the object class to the objectClasses index of the context object in an init script. (This is not advised, because the object class will only continue to be defined so long as its definition continues to exist in the init script.)

Custom Name Forms

Name forms are pure data, and as such, there are three ways they can be added to Meerkat DSA:

  • Direct database insertion into the NameForm table,
  • Directory Access Protocol (DAP) addition to a subschema subentry via the nameForms attribute, or
  • Adding the name form to the nameForms index of the context object in an init script. (This is not advised, because the name form will only continue to be defined so long as its definition continues to exist in the init script.)

Custom Context Types

Below is an example for implementing a custom context type:

import {
    ObjectIdentifier,
    FALSE,
} from "asn1-ts";
import {
    DER,
    _encodePrintableString,
} from "asn1-ts/dist/node/functional";

export async function init (ctx) {
    ctx.contextTypes.set("2.5.31.0", {
        id: new ObjectIdentifier([ 2, 5, 31, 0 ]),
        name: ["languageContext"],
        description: "ISO 639-2 language code",
        obsolete: FALSE, // FALSE === false. It's just an alias defined in the asn1-ts library.
        syntax: "LanguageContextSyntax ::= PrintableString(SIZE (2..3)) -- ISO 639-2 codes only",
        // assertionSyntax: ""; // An assertion syntax, if different from the value syntax.
        defaultValue: () => _encodePrintableString("en", DER), // Defines "en" as the default value.
        absentMatch: FALSE, // FALSE === false. It's just an alias defined in the asn1-ts library.
        matcher: (assertion, value) => {
            return (assertion.printableString === value.printableString); // if "en" === "en", it's a match!
        },
        validator: (value) => {
            const len = value.printableString.length;
            if ((len < 2) || (len > 3)) {
                throw new Error();
            }
        },
    });
}

export default init;

Custom LDAP Syntaxes

Below is an example for implementing a custom LDAP syntax:

import {
    ObjectIdentifier,
    FALSE,
} from "asn1-ts";
import {
    DER,
    _encodePrintableString,
} from "asn1-ts/dist/node/functional";

// countryString SYNTAX-NAME ::= {
//     LDAP-DESC "Country String"
//     DIRECTORY SYNTAX CountryName
//     ID id-lsx-countryString }

export async function init (ctx) {
    ctx.ldapSyntaxes.set("1.3.6.1.4.1.1466.115.121.1.11", {
        id: new ObjectIdentifier([ 1, 3, 6, 1, 4, 1, 1466, 115, 121, 1, 11 ]),
        description: "Country String",
        decoder: (bytes) => {
            const str = Buffer.from(bytes).toString("utf-8");
            return _encodePrintableString(str, DER);
        },
        encoder: (value) => {
            return Buffer.from(value.printableString, "utf-8");
        },
    });
}

export default init;

In a custom LDAP syntax, the decoder is a function that takes an LDAPString (which is defined in IETF RFC 4511 as an OCTET STRING, and which is represented in Meerkat DSA as the native JavaScript data type Uint8Array) and produces an ASN.1 element (of type ASN1Element from the asn1-ts NPM package) that represents the encoded X.500 equivalent of that LDAP value. The encoder function does the opposite: it takes an ASN1Element and produces an LDAPString.

Both the encoder and decoder are optional. If you do not define a decoder, you will not be able to write LDAP values of that syntax (because Meerkat DSA will not be able to translate them to ASN.1 elements that it can work with). If you do not define an encoder, you will not be able to read LDAP values of that syntax (because Meerkat DSA will not know how to convert those values into LDAP values).
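A minimal sketch of such a decoder and encoder pair is shown below for the LDAP Boolean syntax, whose string forms are "TRUE" and "FALSE". To stay free of dependencies, a plain object with a boolean property stands in for an ASN1Element; that stand-in and the booleanSyntax name are assumptions, and a real init script would build elements with asn1-ts instead.

```javascript
// Hypothetical LDAP syntax info for Boolean values.
const booleanSyntax = {
    description: "Boolean",
    // LDAPString (Uint8Array) -> stand-in for an ASN.1 element.
    decoder: (bytes) => {
        const str = String.fromCharCode(...bytes);
        if (str === "TRUE") {
            return { boolean: true };
        }
        if (str === "FALSE") {
            return { boolean: false };
        }
        throw new Error(`Invalid LDAP Boolean: ${str}`);
    },
    // Stand-in for an ASN.1 element -> LDAPString (Uint8Array).
    encoder: (value) => {
        const str = value.boolean ? "TRUE" : "FALSE";
        return Uint8Array.from(str, (char) => char.charCodeAt(0));
    },
};

// Round trip: what a client writes is what a client reads back.
const written = Uint8Array.from([ 0x54, 0x52, 0x55, 0x45 ]); // "TRUE"
const decoded = booleanSyntax.decoder(written);
const readBack = booleanSyntax.encoder(decoded);
```

A useful property to aim for is that encoding a decoded value reproduces the original LDAP string, as the round trip here does.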

Note that, for an LDAP syntax to actually be used, an attribute type must identify with the LDAP syntax via the ldapSyntax property of the custom attribute type. If an attribute type has no associated LDAP syntax, it will simply be invisible to LDAP. (Note that "invisible" does not mean "ignored.")

It is also highly recommended that you add an entry to the Context object's ldapSyntaxToASN1Syntax map. This is simply a map of the object identifier as a string in dot-delimited notation to the ASN.1 syntax, as described in ITU Recommendation X.501 (2016 edition), Section 15.7.3. The ASN.1 does not have to be a complete ASN.1 module, and assumes the existence of all X.500 schema without importing them.

Custom Matching Rules

Below is an example for implementing a custom matching rule:

import {
    ObjectIdentifier,
    FALSE,
} from "asn1-ts";

// caseExactMatch MATCHING-RULE ::= {
//     SYNTAX UnboundedDirectoryString
//     LDAP-SYNTAX directoryString.&id
//     LDAP-NAME {"caseExactMatch"}
//     ID id-mr-caseExactMatch }

export async function init (ctx) {
    ctx.equalityMatchingRules.set("2.5.13.5", {
        id: new ObjectIdentifier([ 2, 5, 13, 5 ]),
        name: ["caseExactMatch"],
        description: "Matches two strings case-sensitively.",
        obsolete: FALSE,
        syntax: "UnboundedDirectoryString",
        ldapAssertionSyntax: new ObjectIdentifier([ 1, 3, 6, 1, 4, 1, 1466, 115, 121, 1, 15 ]), // Directory string syntax.
        matcher: (assertion, value) => { // EqualityMatcher
            return (assertion.utf8String === value.utf8String);
        },
    });
}

export default init;

For equality matching rules, ordering matching rules, and substring matching rules, the example above applies, with one exception: the matcher will either be of type EqualityMatcher, OrderingMatcher, or SubstringsMatcher, and the matching rule will be indexed in the equalityMatchingRules, orderingMatchingRules, or substringsMatchingRules properties, respectively.

An EqualityMatcher is a function that takes an assertion as an ASN1Element, a value as an ASN1Element, and compares the two, returning true if they match according to the semantics of the matching rule, and false if they do not.

An OrderingMatcher is a function that takes an assertion as an ASN1Element and a value as an ASN1Element and compares the two, returning an integer that follows the same convention as the comparison function passed to JavaScript's Array.prototype.sort():

  • A value greater than zero means "arrange the value before the assertion."
  • A value less than zero means "arrange the assertion before the value."
  • A value of zero means that the assertion and value are equal with respect to ordering.
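The return-value convention in the list above can be sketched with an integer ordering rule. Plain objects with an integer property stand in for ASN1Element here, and integerOrderingMatcher is a hypothetical name; only the sign convention is taken from the description above.

```javascript
// Hypothetical OrderingMatcher for INTEGER values.
function integerOrderingMatcher (assertion, value) {
    if (assertion.integer > value.integer) {
        return 1;  // Arrange the value before the assertion.
    }
    if (assertion.integer < value.integer) {
        return -1; // Arrange the assertion before the value.
    }
    return 0;      // Equal with respect to ordering.
}

// Because the convention matches Array.prototype.sort(), the matcher can
// be handed to sort() directly, which yields ascending order.
const sortedIntegers = [ { integer: 3 }, { integer: 1 }, { integer: 2 } ]
    .sort(integerOrderingMatcher)
    .map((element) => element.integer);
```

Explicit comparisons are used instead of subtraction so the same shape also works for value types that cannot be subtracted.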

A SubstringsMatcher is a function that takes an assertion as an ASN1Element, a value as an ASN1Element, and a SubstringSelection, which is an enumerated type defined in the @wildboar/x500 library that indicates whether the substring to be matched is initial, final, or any. In a pinch, these values may be used instead of the enumerated type:

  • "any" = 0
  • "initial" = 1
  • "final" = 2

The SubstringsMatcher determines if the asserted substring appears within the value at the selected location (the start, end, or anywhere), and returns a boolean value of true if the substring appears within the string where it is sought and false if it does not.
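A case-sensitive substrings matcher along these lines might look like the sketch below. The numeric constants are the fallback values listed above; plain objects with a utf8String property stand in for ASN1Element, and caseExactSubstringsMatcher is a hypothetical name.

```javascript
// Fallback numeric values for the substring selection, as listed above.
const SELECT_ANY = 0;
const SELECT_INITIAL = 1;
const SELECT_FINAL = 2;

// Hypothetical SubstringsMatcher: reports whether the asserted substring
// appears in the value at the selected location.
function caseExactSubstringsMatcher (assertion, value, selection) {
    const substring = assertion.utf8String;
    const whole = value.utf8String;
    switch (selection) {
        case SELECT_INITIAL:
            return whole.startsWith(substring);
        case SELECT_FINAL:
            return whole.endsWith(substring);
        case SELECT_ANY:
            return whole.includes(substring);
        default:
            return false; // Unrecognized selection: treat as no match.
    }
}

const storedValue = { utf8String: "Meerkat DSA" };
const startsWithMeer = caseExactSubstringsMatcher({ utf8String: "Meer" }, storedValue, SELECT_INITIAL);
const endsWithMeer = caseExactSubstringsMatcher({ utf8String: "Meer" }, storedValue, SELECT_FINAL);
```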

Object Identifiers

Particularly in LDAP, it is desirable (if not expected by some implementations) to represent object identifiers by their human-friendly names. Whenever a schema element, such as an attribute type or name form, is added to Meerkat DSA, it is recommended that the object identifier get added to the Context object's objectIdentifierToName and nameToObjectIdentifier maps. In each case, the object identifier is represented in dot-delimited form.

You DO have to manually map object identifiers to names and vice versa for schema objects you define in the init script. You DO NOT have to do this for schema objects you define in the database or via the subschema operational attributes.

Note that future versions of Meerkat DSA may expect object identifier names used in the nameToObjectIdentifier map to be normalized to lowercase or uppercase, but this is currently not the case.
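Keeping the two maps in step is easiest with a small helper, sketched below. It assumes both properties behave like JavaScript Maps from string to string; registerObjectIdentifierName is a hypothetical helper, and fakeCtx is a stand-in for the real context object.

```javascript
// Records a name for an object identifier in both directions.
function registerObjectIdentifierName (ctx, oid, name) {
    ctx.objectIdentifierToName.set(oid, name);
    ctx.nameToObjectIdentifier.set(name, oid);
}

// Stand-in for the real context object passed to init().
const fakeCtx = {
    objectIdentifierToName: new Map(),
    nameToObjectIdentifier: new Map(),
};

registerObjectIdentifierName(fakeCtx, "2.5.4.3", "commonName");

const nameForOid = fakeCtx.objectIdentifierToName.get("2.5.4.3");
const oidForName = fakeCtx.nameToObjectIdentifier.get("commonName");
```

Names are stored exactly as given here, since no case normalization is currently expected.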

Guidance

Prefer objects, not structured attributes

If an attribute type that you're defining is complicated enough to warrant a SEQUENCE or SET, you should consider breaking all of its components into individual single-valued attributes, and creating an object class that represents that type instead. You can use children within compound entries to represent these structs instead. The benefit of doing this is easier extensibility, and a greater likelihood that you can define your attribute type in terms of one of the subset of pre-defined LDAP syntaxes instead of having to define a new LDAP syntax.

The only cases where a structured attribute is warranted are (1) when, for some reason, it would be burdensome to name these child entries, or (2) when all fields of the structured type are required, few, and fixed in number.

Instead of defining an attribute type like so:

AccountInfo ::= SEQUENCE {
    name       UTF8String,
    balance    INTEGER,
    ...
}

account ATTRIBUTE ::= {
    WITH SYNTAX    AccountInfo
    ID             { 1 2 3 4 }
}

...define separate single-valued attributes and an object class for them:

accountName ATTRIBUTE ::= {
    WITH SYNTAX     UTF8String
    SINGLE VALUE    TRUE
    ID              { 1 2 3 4 1 }
}

accountBalance ATTRIBUTE ::= {
    WITH SYNTAX     INTEGER
    SINGLE VALUE    TRUE
    ID              { 1 2 3 4 2 }
}

account OBJECT-CLASS ::= {
    SUBCLASS OF     {top}
    MUST CONTAIN    {accountName | accountBalance}
    ID              { 1 2 3 4 3 }
}

You may want to make the above object class auxiliary if you expect it to be merely an aspect of some other object, such as a person or organization.

Use MAY CONTAIN in auxiliary object classes

Auxiliary object classes can be useful for extending the schema of entries, but they can also function as a sort of "tag" if they are defined without required attributes.

Let's say you define an auxiliary object class called married that has an attribute spouseDN that points to the entry's marital partner. You would not want to make spouseDN a required attribute of married, even though it might theoretically make sense that somebody that is married has a spouse.

By making spouseDN an optional attribute of the married auxiliary object class, you can represent that an entry is married even if you do not know the name of the spouse.

You can think of this like NULL in relational databases: you should only make a field NOT NULL if you can never think of a situation in which it would be acceptable for that field to be absent (such as a lack of knowledge). Most fields are not like this.

Also be aware that, if you define an attribute as required in an object class and users of the directory system do not know the value for that attribute for that entry, they may attempt to bypass such a restriction by putting in, say, a "nullish" string like "N/A", "UNKNOWN", "NULL", "", or using 0 for a required attribute with INTEGER syntax. The result is that you could wind up with garbage data in your directory because you required of users what they could not provide.

Do not embed entries within entries

Use compound entries, or attributes with a distinguished name syntax to point to other entries. You should not define attributes with a syntax like this:

biologicalChildren ATTRIBUTE ::= {
    WITH SYNTAX    SET OF Attribute
    ID             id-at-biologicalChildren
}

Matchers should tolerate symmetrical syntax, if possible

Even if a matching rule specifies an assertion syntax that differs from the attribute syntax, the matching rule should attempt to tolerate an assertion having the value's syntax.
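One way to picture this tolerance is the sketch below. It assumes a hypothetical matching rule whose value syntax is a structure with a name field and whose assertion syntax is the bare name; plain JavaScript values stand in for ASN1Element, and accountNameMatcher is an invented example rather than anything shipped with Meerkat DSA.

```javascript
// Hypothetical matcher: the assertion is normally a bare name, but an
// assertion shaped like the stored value is tolerated as well.
function accountNameMatcher (assertion, value) {
    const assertedName = (typeof assertion === "string")
        ? assertion       // Assertion syntax: just the name.
        : assertion.name; // Value syntax supplied as the assertion.
    return assertedName === value.name;
}

const storedAccount = { name: "savings", balance: 100 };

// Both forms of the assertion select the same stored value.
const matchesBareName = accountNameMatcher("savings", storedAccount);
const matchesFullValue = accountNameMatcher({ name: "savings", balance: 0 }, storedAccount);
```

A client that asserts a full value it has just read back will then still get a match.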