How Do I Use It?

Integration into MarkLogic Data Hub 5

Version 1.3.1 is the final feature release of Smart Mastering in the Smart Mastering Core repository. As of Data Hub 5.0.0, Smart Mastering is fully integrated into MarkLogic Data Hub as a built-in capability, and the recommended way to use it is to configure a mastering step in Data Hub. Existing users should migrate their Smart Mastering configuration to MarkLogic Data Hub (see Import Your Smart Mastering Core Projects for instructions). Integrating Smart Mastering into Data Hub offers several benefits, including:

  • Built-in support for orchestrating matching and merging across documents.
  • A QuickStart UI for configuring matching and merging.

MarkLogic will continue to invest in Smart Mastering as a built-in capability of Data Hub.

Deprecated Approach with the Smart Mastering Community Project

To use Smart Mastering in your project, start by adding it to your project as shown in the minimal-project example.

Define your match and merge options. These will control how matches are identified and how properties are merged into a new document.

From this point, you have a couple of choices about how to run mastering.

Mastering Your Content

Mastering with Data Hub Framework 4

If you’re using the MarkLogic Data Hub Framework, you can call the match and merge REST extension. See the smart-mastering-core example mastering task.

The key part is in the runMastering tasks of build.gradle, which call the REST extension.

Mastering with a Trigger

Smart Mastering can be set up so that whenever a document is inserted into the database, a trigger gets called to look for matches and merge as appropriate. See match-and-merge-trigger.xqy for the trigger code, Using Triggers to Spawn Actions in the Application Developer’s Guide to learn about how triggers work, and the ml-gradle wiki for an example of how to deploy triggers with ml-gradle.

Custom Approach

You can also access the Smart Mastering functionality by directly calling the REST API extensions or XQuery libraries.
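As a rough illustration of calling a REST API extension directly, the sketch below only builds the request URL and does not send it. It relies on two general MarkLogic REST API conventions: resource extensions are served under /v1/resources/{name}, and extension-specific parameters carry an rs: prefix. The extension name, parameter names, host, and port used in the example are assumptions for illustration; check the REST extension documentation for the actual names.

```python
from urllib.parse import urlencode


def resource_extension_url(host, port, extension, params):
    """Build the URL for a MarkLogic REST API resource extension call.

    Extension-specific parameters are sent with an "rs:" prefix.
    """
    prefixed = {f"rs:{name}": value for name, value in params.items()}
    # Keep ":" and "/" unescaped so the URL stays readable.
    query = urlencode(prefixed, safe=":/")
    return f"http://{host}:{port}/v1/resources/{extension}?{query}"


# Hypothetical values: the extension name, parameter names, and port below
# are illustrative assumptions, not confirmed by this page.
url = resource_extension_url(
    "localhost",
    8800,
    "sm-match-and-merge",
    {"uri": "/a.json", "options": "my-options"},
)
print(url)
```

The resulting URL could then be passed to any HTTP client, along with credentials for the app server.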

To see what the results of the match functions look like, see Match Results.

Collections

Smart Mastering expects to find documents in particular collections. See the list of collections in constants.xqy. Some of the notable collections:

  • $CONTENT-COLL: insert into this collection any documents that should be available for merging. When two documents get merged, they will be removed from this collection and replaced with the new merged document. Your application should limit searches to this collection.
  • $MERGED-COLL: merged documents will be created in this collection. When a merge is rolled back, the merged document will be removed from this collection and from the $CONTENT-COLL.
  • $ARCHIVED-COLL: original documents that have been merged will be moved into this collection.
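
The collection movements described above can be pictured with a small in-memory model. This is only a toy sketch, not Smart Mastering code: the ToyDatabase class, its methods, and the collection name strings are hypothetical stand-ins for the constants in constants.xqy. The rollback step models only what is stated above (the merged document leaves $MERGED-COLL and $CONTENT-COLL); what happens to the original documents on rollback is not covered here.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the collection constants in constants.xqy.
CONTENT = "CONTENT-COLL"
MERGED = "MERGED-COLL"
ARCHIVED = "ARCHIVED-COLL"


@dataclass
class ToyDatabase:
    # Maps each document URI to the set of collections it belongs to.
    collections: dict = field(default_factory=dict)

    def insert(self, uri):
        # New documents become available for merging.
        self.collections[uri] = {CONTENT}

    def merge(self, uri_a, uri_b, merged_uri):
        # Originals leave the content collection and are archived.
        for uri in (uri_a, uri_b):
            self.collections[uri].discard(CONTENT)
            self.collections[uri].add(ARCHIVED)
        # The merged document replaces them in the content collection.
        self.collections[merged_uri] = {CONTENT, MERGED}

    def rollback(self, merged_uri):
        # The merged document leaves both the merged and content collections.
        self.collections[merged_uri] -= {CONTENT, MERGED}

    def search(self):
        # Applications should limit searches to the content collection.
        return sorted(u for u, c in self.collections.items() if CONTENT in c)


db = ToyDatabase()
db.insert("/a.json")
db.insert("/b.json")
db.merge("/a.json", "/b.json", "/merged.json")
print(db.search())
```

After the merge, a search limited to the content collection sees only the merged document, which is why applications are advised to restrict searches to $CONTENT-COLL.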