Mapistore v2

Introduction

This document addresses the different design flaws of mapistore v1 and details proposals to work around these problems while keeping the mapistore semantics correct.
This is a draft that will evolve over the coming weeks and will serve as the basis for the new mapistore development documentation.

In this document, we'll refer to:
  • v1 as the initial and existing mapistore implementation.
  • v2 as the new work in progress mapistore implementation.

openchange.ldb naming and scope

In v1, the dispatcher database was named openchange.ldb. It kept a general top-level overview of the mailbox, including some attributes. This required a dual server-side implementation:
  • retrieve openchange.ldb container's attributes from openchange.ldb
  • retrieve mapistore attributes from mapistore

mapistore.ldb attributes and backends

Historically, storing attributes in openchange.ldb was a convenient choice while developing the initial mapistore implementation. Now we have backends that DO things. Relying exclusively on these mapistore attributes however makes the whole server implementation more complicated.

At the same time, a storage backend may be unable to provide a correct and complete 1-1 mapping between MAPI properties and backend-specific items.

The solution is to have a mixed-loop implementation:
  • Server queries mapistore for a top-folder attribute
  • If the backend can store the attribute, then return it
  • If the backend can't store the attribute itself, use an API that will store this data in mapistore.ldb
  • Return the data back to the server
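
The loop above can be sketched as follows. This is only an illustration: the Backend and Mapistore classes, their method names and the ldb_fallback dictionary (standing in for mapistore.ldb) are assumptions made for this example and are not part of the OpenChange API.

```python
# Hypothetical sketch: none of these classes exist in OpenChange.

class Backend:
    """A storage backend that can natively store only some attributes."""

    def __init__(self, supported):
        self.supported = set(supported)
        self.values = {}

    def can_store(self, name):
        return name in self.supported

    def get(self, name):
        return self.values.get(name)

    def set(self, name, value):
        self.values[name] = value


class Mapistore:
    """Routes attribute access to the backend when it can store the
    attribute, and to a fallback store otherwise."""

    def __init__(self, backend):
        self.backend = backend
        self.ldb_fallback = {}  # stand-in for mapistore.ldb

    def set_attribute(self, name, value):
        if self.backend.can_store(name):
            self.backend.set(name, value)
        else:
            self.ldb_fallback[name] = value

    def get_attribute(self, name):
        if self.backend.can_store(name):
            return self.backend.get(name)
        return self.ldb_fallback.get(name)


store = Mapistore(Backend(supported={"PR_DISPLAY_NAME"}))
store.set_attribute("PR_DISPLAY_NAME", "Inbox")
store.set_attribute("PR_COMMENT", "kept in the fallback store")
```

In this sketch the server only ever talks to Mapistore and never needs to know which of the two stores answered.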

V2 turns openchange.ldb into mapistore.ldb, removes automatic attributes from openchange.ldb and delegates storage control to mapistore.

Another option is to have a virtual mstoredb:// backend that abstracts LDB calls through the mapistore API. This way, we would no longer have to handle any dual ldb/backend implementation and could focus on improving mapistore hooks.

mstoredb:// mapistore backend

One of the v1 mapistore principles is to provide a mapistore context per mailbox root folder. The design, however, excluded from this process mailbox containers such as IPM_SUBTREE (where Inbox, Outbox, Calendar etc. live).

v2 introduces a mstoredb backend which abstracts LDB calls in mapistore, providing a consistent and unique way to access data. However, using a mstoredb:// mapistore backend for IPM_SUBTREE semantically implies that the underlying folders will be within the mstoredb context, being themselves mstoredb folders or messages.

One possibility, already experienced with v1 (through a semantic bug), is to pass the mapistore context down to the backend layer. In that case, backends would be able to instantiate new contexts and have subfolders stored in different backends. Such a design may, however, have side effects not yet identified. One way to limit the potential impact would be to restrict this feature to the mstoredb context. mstoredb would in that case have mapistore abilities and be a virtual storage.

New Concept: Rather than passing the main mapistore context down to the backend as a reference, we will encapsulate the mapistore context within a container structure. Using this semantic, we can split mapistore calls intended to be available to backends from those that are not. Furthermore, this may provide in the long term a container where we can store additional data as needed. We can even control the pointers we want to pass down to each backend and add more references for mstoredb while keeping the data structure opaque to backends.
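
As a rough illustration of this container idea, the sketch below wraps a context object and exposes only the calls a given backend is allowed to make. MapistoreContext, BackendContainer, allow_new_contexts and the example URI are all hypothetical names chosen for this example. Python can only hide the wrapped context by convention, whereas a C implementation would use a truly opaque structure.

```python
class MapistoreContext:
    """Stand-in for the main mapistore context (hypothetical)."""

    def __init__(self):
        self.contexts = []

    def add_context(self, uri):
        self.contexts.append(uri)
        return len(self.contexts)  # context id

    def shutdown(self):
        # Internal call that backends should never reach.
        self.contexts.clear()


class BackendContainer:
    """Wrapper handed to backends instead of the raw context."""

    def __init__(self, mstore_ctx, allow_new_contexts=False):
        # Double underscores trigger name mangling, which discourages
        # (but does not strictly prevent) direct access from backends.
        self.__ctx = mstore_ctx
        self.__allow = allow_new_contexts

    def create_context(self, uri):
        if not self.__allow:
            raise PermissionError("this backend may not create contexts")
        return self.__ctx.add_context(uri)


ctx = MapistoreContext()
mstoredb_view = BackendContainer(ctx, allow_new_contexts=True)
regular_view = BackendContainer(ctx)

# Only the mstoredb view can create a new context (illustrative URI).
context_id = mstoredb_view.create_context("fsocpf://inbox")
```

The same context is shared by both views, but each backend only sees the subset of calls its container exposes.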

In any case, mstoredb needs to be a virtual storage backend with the ability to create new mapistore contexts.

We could even extend this concept to the mailbox record itself. The mailbox would be a mstoredb mapistore container.

MAPIStore URI and backend operations

  • v1 passes folder and message identifiers to the backends, requiring them to handle the FID/MID to mapistore URI conversion.
  • v2 will abstract FID/MID and only supply a mapistore URI to backend operations. This requires some semantic changes in the API:
    • backends need a function able to generate a mapistore URI on demand
    • mapistore needs to maintain the indexing database

MAPIStore URI and authentication/credentials

v1 only passes credentials to storage backends through the mapistore URI attribute. This is a quick and easy way to pass information between interfaces, but this single mechanism would become a limitation for v2 in the mid term. The idea is to keep authentication credentials separate and only put an account identifier (such as the username) in the URI, for two reasons:
  1. Passwords change. The user might choose to change it, or may be forced to. That would mean having to change the URI everywhere, which seems like a lot of work to track.
  2. In some cases, a fixed password might not be appropriate. For example, we could have ssh keys, or X.509 certificates, or Kerberos tickets, or OTP, or a password plus an authentication realm. If we had something a bit more flexible, we could deal with each of these cases.

We can also use this mechanism to handle rule-based access, for example: if there is no VPN running, run these commands before trying to connect.
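
A minimal sketch of this separation is shown below, assuming a URI of the form scheme://account@host/path. The CREDENTIALS table, its fields and the example accounts are invented for illustration only. The point is that the URI carries only the account identifier, so a password change or a different authentication mechanism never touches the stored URI.

```python
from urllib.parse import urlsplit

# Hypothetical credential store, keyed by account identifier.
CREDENTIALS = {
    "jdoe": {"mechanism": "password", "secret": "s3cret"},
    "asmith": {"mechanism": "kerberos", "ticket_cache": "/tmp/krb5cc_1000"},
}


def account_from_uri(uri):
    """Extract the account identifier from the URI's user part."""
    return urlsplit(uri).username


def credentials_for(uri):
    """Look up credentials separately from the URI itself."""
    return CREDENTIALS[account_from_uri(uri)]


uri = "imap://jdoe@mail.example.com/INBOX"
before = credentials_for(uri)["secret"]

# The password changes, but the URI stored everywhere else does not.
CREDENTIALS["jdoe"]["secret"] = "rotated"
after = credentials_for(uri)["secret"]
```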

Convenient APIs

In v1, it was the backend's responsibility to maintain the list of opened folders and messages and to associate a private data pointer with each. Such functions are available within the fsocpf backend. It seems worthwhile to provide this as a mapistore API available to all backends for convenience.

Atomic operations vs Superset of operations

In MAPI, a complex operation is generally the result of multiple atomic operations. While providing a limited set of atomic operations helps in building complex tasks, it can be slow. Furthermore, some backends may be able to run these complex operations in a single step with good performance.

V1 relied only on atomic operations. V2 will extend the concept and provide an optional set of function pointers that backends can use to override the default mapistore behavior.
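
The override mechanism could look roughly like the sketch below, where a "move message" operation falls back to atomic copy and delete unless the backend supplies its own one-step version. The operation names (copy_message, delete_message, move_message) and the classes are assumptions for this example. In C this would be an optional function pointer left NULL by backends that do not override it.

```python
class DefaultOps:
    """Atomic operations every backend provides (hypothetical names)."""

    def __init__(self):
        self.calls = []  # records what was executed, for illustration

    def copy_message(self, mid, dest):
        self.calls.append(("copy", mid, dest))

    def delete_message(self, mid):
        self.calls.append(("delete", mid))


def move_message(backend, mid, dest):
    """Use the backend's one-step move if it provides one; otherwise
    compose the move from the atomic copy and delete operations."""
    override = getattr(backend, "move_message", None)
    if override is not None:
        return override(mid, dest)
    backend.copy_message(mid, dest)
    backend.delete_message(mid)


class FastBackend(DefaultOps):
    """Backend able to move a message in a single step."""

    def move_message(self, mid, dest):
        self.calls.append(("move", mid, dest))


slow, fast = DefaultOps(), FastBackend()
move_message(slow, 0x10, "Trash")
move_message(fast, 0x10, "Trash")
```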

MAPIStore provisioning and Python bindings

In v1, the OpenChange dispatcher database was created through Python code only. This helped in designing a preliminary store very quickly and populating it with ease. However, we have now reached a development stage where we have more than one storage backend. Maintaining backends' mapistore URI registration at the Python level is inconvenient, error-prone and has many limitations. v2 provides C-based provisioning code that actively uses backend hooks to communicate and retrieve configuration information.

Python bindings are written over the C-based implementation to keep the existing Python approach and provisioning script layout.

One of the v1 flaws with regard to provisioning is the hardcoded defaults set to fsocpf. We need something much more generic where we can easily control (get/set) mapistore backends without hacking into LDB. Furthermore, we need to abstract the way mapistore URIs are set and delegate this control to the backend.

In v2, backends are responsible for generating mapistore URIs.

However, we have two different cases to handle:
  1. System/Special folders that Outlook requires at startup (Inbox, Outbox, Drafts, Calendar, Tasks, Notes, etc.)
  2. Custom top-level folders that can be created by Outlook as needed (e.g. RSS feeds for Outlook 2007) or by the user.

FID/MID allocation

Backends can't register new FIDs or MIDs (folders or messages) in v1. This approach is semantically interesting for keeping data consistent and keeping backends as simple as possible. However, it is a huge limitation when the backend is also used for other purposes or services. For example, an IMAP backend may receive emails independently from Outlook (e.g. maildrop). Synchronizing backend data into mapistore could be done through a notification task, but we can't assume backends will have such a feature.

V2 addresses this problem and lets backends (optionally) use the mapistore API to register new folders or messages. Consistency is still maintained by mapistore, while backends get more freedom.

Another aspect v2 is fixing is the fid/mid allocation mechanism. In v1, we maintained a GlobalCount attribute in the server's record. We incremented this GlobalCount as needed and maintained an indexing database to retrieve freed and available fid/mid.

V2 still uses GlobalCount as an FID/MID indexing attribute, but delegates management to top folders. Each of these folders has a range of FIDs and MIDs allocated specifically for it. The folder is responsible for managing IDs within its hierarchy and for redistributing existing FIDs/MIDs within its range through the indexing database. When a folder runs low on available FIDs, it can register a new allocation range, which increments the GlobalCount.

Such a mechanism ensures we have consistent ranges of IDs for a given folder and avoids potential locking on a single record by delegating this management to sub-records.
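
The range-based allocation described above can be sketched as follows. The Server and TopFolder classes, the RANGE_SIZE value and the simple free list are assumptions made for this example. The real implementation would persist ranges and freed IDs in the indexing database, and this sketch reserves a new range only once the current one is exhausted rather than when it runs low.

```python
class Server:
    """Holds the GlobalCount counter (stand-in for the server record)."""

    def __init__(self, start=1):
        self.global_count = start

    def reserve_range(self, size):
        """Hand out [first, last] and advance GlobalCount past it."""
        first = self.global_count
        self.global_count += size
        return first, first + size - 1


class TopFolder:
    RANGE_SIZE = 4  # deliberately tiny so the example exhausts a range

    def __init__(self, server):
        self.server = server
        self.free = []     # IDs released by deleted folders/messages
        self.next_id = 0
        self.last_id = -1  # empty range until the first reservation

    def allocate(self):
        if self.free:                    # reuse a freed ID first
            return self.free.pop()
        if self.next_id > self.last_id:  # range exhausted: get a new one
            self.next_id, self.last_id = self.server.reserve_range(self.RANGE_SIZE)
        fmid = self.next_id
        self.next_id += 1
        return fmid

    def release(self, fmid):
        self.free.append(fmid)


server = Server()
inbox, calendar = TopFolder(server), TopFolder(server)
inbox_ids = [inbox.allocate() for _ in range(3)]
calendar_ids = [calendar.allocate() for _ in range(2)]
inbox.release(2)
reused = inbox.allocate()
more = [inbox.allocate() for _ in range(2)]
```

Each folder touches the shared GlobalCount only when it reserves a range, which is what keeps contention on the server record low.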

Allocation context

There are three different cases where new fid/mid can be queried:
  • From MAPI client through OpenChange server
  • From mapistore provisioning scripts or applications
  • Directly through the backend to synchronize backend's data
Furthermore, we can query two different kinds of fid/mid:
  • fid/mid for a folder/message outside of mapistore. For example when we create a new root folder (whose content will be in mapistore)
  • fid/mid for a folder/message within a mapistore context

When backends manipulate data, they do so within a mapistore context, which makes the mapistore URI a good candidate for identifying the allocation context.
When a client manipulates data, it has no knowledge of the mapistore URI or context and only deals with FID/MID. In this case, the root folder to which the subfolder or message belongs is a good candidate.

libmapiproxy modifications

v1 more or less scattered mapistore components across the libmapiproxy and libmapistore libraries. v2 consolidates mapistore by renaming, moving and refactoring existing APIs into libmapistore.

libmapiproxy/openchangedb.c and libmapiproxy/openchangedb_property.c

  • Scope:
    • Only used by emsmdb server
  • Objective:
    • Remove openchangedb API and replace it with a consistent set of functions in libmapistore (mapistoredb)
  • libmapiproxy/openchangedb.c functions:
    • openchangedb_get_SystemFolderID
    • openchangedb_get_PublicFolderID
    • openchangedb_get_distinguishedName
    • openchangedb_get_MailboxGuid
    • openchangedb_get_MailboxReplica
    • openchangedb_get_PublicFolderReplica
    • openchangedb_get_mapistoreURI
    • openchangedb_get_receiveFolder
    • openchangedb_get_folder_count
    • openchangedb_lookup_folder_property
    • openchangedb_get_folder_special_property
    • openchangedb_get_folder_property_data
    • openchangedb_get_new_folderID
    • openchangedb_get_folder_property
    • openchangedb_get_table_property
    • openchangedb_get_fid_by_name
    • openchangedb_set_ReceiveFolder
  • libmapiproxy/openchangedb_property.c functions (auto-generated mparse):
    • openchangedb_property_get_attribute

mailbox.py Python package

  • Scope:
    • Provision and populate openchange.ldb database
  • Objective:
    • Replace this API with a consistent C API + python bindings
  • Functions:
    • setup included in mapistoredb_provision()
    • add_rootDSE included in mapistoredb_provision()
    • add_server included in mapistoredb_provision()
    • add_root_public_folder
    • add_sub_public_folder
    • add_one_public_folder
    • add_mapistore_pf_dir
    • add_public_folders root container ldb_add included in mapistoredb_provision()
    • lookup_server
    • lookup_mailbox_user
    • user_exists
    • get_message_attribute
    • get_message_replicaID
    • get_message_GlobalCount is now mapistoredb_get_GlobalCount
    • set_message_GlobalCount is now mapistoredb_set_GlobalCount
    • add_mailbox_user included in mapistoredb_new_mailbox()
    • add_storage_dir
    • add_folder_property
    • add_mailbox_root_folder
    • add_mailbox_special_folder
    • set_receive_folder
    • gen_mailbox_folder_fid

MAPIStore Indexing database

mapistore_indexing provides an API for managing (add/del/search) records within an indexing database. An indexing database links a MID/FID to a mapistore URI. Such databases are user-specific (one indexing database per user).

V2 will improve the mapistore_indexing API and remove any previous code specific to fsocpf. Indexing databases serve two purposes:
  • find the mapistore URI matching a particular FID/MID
  • find the hierarchy of folders from a given fid/mid
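
Both purposes can be illustrated with a small in-memory index. The IndexingDB class, its record layout (URI plus parent folder ID) and the example URIs are assumptions for this sketch and do not reflect the actual on-disk format or the mapistore_indexing function signatures.

```python
class IndexingDB:
    """Per-user index (hypothetical layout): id -> (uri, parent folder id)."""

    def __init__(self):
        self.records = {}

    def add(self, fmid, uri, parent_fid=None):
        self.records[fmid] = (uri, parent_fid)

    def delete(self, fmid):
        del self.records[fmid]

    def uri_for(self, fmid):
        """Purpose 1: find the mapistore URI matching a FID/MID."""
        return self.records[fmid][0]

    def folder_path(self, fmid):
        """Purpose 2: list the ancestor folder IDs, root first."""
        path = []
        parent = self.records[fmid][1]
        while parent is not None:
            path.append(parent)
            parent = self.records[parent][1]
        path.reverse()
        return path


index = IndexingDB()
index.add(0x1, "mstoredb://mailbox")                  # root container
index.add(0x2, "fsocpf://inbox", parent_fid=0x1)      # folder
index.add(0x3, "fsocpf://inbox/msg", parent_fid=0x2)  # message
```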

Current mapistore_indexing API

Context specific functions:
  • mapistore_indexing_search
  • mapistore_indexing_add
  • mapistore_indexing_del
  • mapistore_indexing_add_ref_count
  • mapistore_indexing_del_ref_count
Search functions:
  • mapistore_indexing_search_existing_fmid
  • mapistore_indexing_get_folder_list
Add functions:
  • mapistore_indexing_record_add_fmid
  • mapistore_indexing_record_add_fid
  • mapistore_indexing_record_add_mid
Delete functions:
  • mapistore_indexing_record_del_fmid
  • mapistore_indexing_record_del_fid
  • mapistore_indexing_record_del_mid
Wrapper functions (mapistore_interface):
  • mapistore_add_context_indexing

V2 new mapistore_indexing API

Current code analysis shows that mapistore_add_context_indexing (a public function from mapistore_interface) is only used in emsmdbp_object.c after calling mapistore_add_context. This means we can remove mapistore indexing from the public API and maintain its add/del operations privately.

Furthermore, I have identified that after a small refactoring, the mapistore indexing ref_count and context search functions could be defined as static and should never be used outside the mapistore_indexing API scope.

MAPIStore v2 current implementation

Create system/special folder scenario

We identify system/special folders with an ID. This is a fixed list similar to what we have already implemented in oxcstor.c.
We also split the system/special folders into 2 categories:
  • folders that are containers and which are of the mstoredb kind.
  • folders that are actually heavily used such as Inbox, Calendar, Outbox etc and which can be of any backend type.

Mailbox case

This is a particular case. The mailbox container for the user is created in mapistoredb_provision (and is the root of the LDB tree for the user's mailbox):
  • Theoretically, we should be able to add sub-folders (system and special ones) directly through mapistore and the mstoredb backend
  • The only issue is how to handle custom mstoredb properties (that can't be mapped into MAPI properties and passed to mapistore functions)

Container/SystemFolder case (mstoredb://)

  1. Search for the system folder index in a static/fixed list
    1. Retrieve default attributes about the folder (folder name, SystemIndex)
    2. The folder name and mapistore URI can be overwritten
  2. Generate a FID for this record: server's object GlobalCount increment
  3. Retrieve a generated URI from the intended mapistore backend
  4. Add the skeleton record to database
  5. Instantiate a context to the backend through mapistore and add properties
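
Steps 1 to 4 above can be sketched as follows. Everything here is illustrative: the SYSTEM_FOLDERS table does not reproduce the real list from oxcstor.c, generate_uri is an assumed backend hook name, a plain dictionary stands in for the LDB database, and an itertools counter stands in for the server's GlobalCount. Step 5 (instantiating a context and adding properties) is left out.

```python
import itertools

# Illustrative static list only: index -> default folder name.
SYSTEM_FOLDERS = {1: "IPM Subtree", 2: "Deferred Actions"}


class MstoredbBackend:
    """Hypothetical backend exposing a URI generation hook."""

    def generate_uri(self, name):
        return "mstoredb://" + name.lower().replace(" ", "_")


def create_system_folder(db, counter, backend, index, name=None, uri=None):
    default_name = SYSTEM_FOLDERS[index]     # step 1: look up defaults
    name = name or default_name              # step 1.2: optional override
    fid = next(counter)                      # step 2: GlobalCount increment
    uri = uri or backend.generate_uri(name)  # step 3: backend generates URI
    db[fid] = {                              # step 4: add skeleton record
        "name": name,
        "SystemIndex": index,
        "mapistore_uri": uri,
    }
    return fid


db = {}
counter = itertools.count(1)
fid = create_system_folder(db, counter, MstoredbBackend(), 1)
custom = create_system_folder(db, counter, MstoredbBackend(), 2, name="Rules")
```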

SpecialFolder case

When we create a specialfolder, we are within a mstoredb context:
  1. call mstoredb mkdir, which results in a new LDB record being added
  2. set the mapistore_uri for this specialfolder (different backend)
  3. create a context on this specialfolder within the desired backend
  4. add properties

Provisioning

mapistore v2 currently implements:
  • initialization of a mapistoredb context
  • provision top-level schema and data structure for mapistore.ldb
  • create the root mailbox container for a user mailbox (with sanity / existing checks)
  • accessors (getters/setters) for mapistoredb context parameters
  • getters/setters for the server's GlobalCount attribute
  • pass an opaque context down to the backend's level which encapsulates a reference to the mapistore context

irclog_ACL_mapistore_discussion.txt - Discussion about mapistore and ACLs when moving to contexts using a username parameter (5.3 ) Julien Kerihuel, 10/29/2010 11:51 am

irclog_ACL_delegation_PF.txt - Discussion about ACLS, delegation and Public Folders in mapistore (9.8 ) Julien Kerihuel, 10/29/2010 01:38 pm
