Getting Started Tutorial 2.x
Harmonizing the Order Data

Now that we have modeled the Order entity, we can use the Data Hub Framework’s code scaffolding to generate boilerplate code for harmonizing our data.

Click on the Flows tab in the top navigation bar.

  1. Click on the + icon next to Harmonize Flows
  2. Type Harmonize Orders into the Harmonize Flow Name field
  3. Click the CREATE button

Note that this time we used the default option, Create Structure from Entity Definition. This means that the Data Hub Framework will create boilerplate code based on our Entity model. The code will pre-populate the fields we need to add.

  1. Click on the Harmonize Orders flow.
  2. Click on the Collector tab.


Collector Plugin

Because each Order can consist of multiple rows, which are then turned into multiple documents in MarkLogic, we cannot do a 1:1 mapping like we did for Products. This means we cannot simply return a list of URIs. Instead, we need to return a unique list of all the values in the relational id column.
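
To make the idea concrete, here is a minimal sketch in plain JavaScript. It is not the MarkLogic API, and the rows and their field names (id, sku) are made up for illustration. The point is only that several rows share one order id, so the collector has to hand back each id once.

```javascript
// Hypothetical rows from the relational export: one row per order line.
// The field names (id, sku) are assumptions for this sketch.
const rows = [
  { id: 1, sku: "A100" },
  { id: 1, sku: "B200" },
  { id: 2, sku: "A100" },
  { id: 3, sku: "C300" },
  { id: 3, sku: "D400" },
];

// Collect each order id once, preserving first-seen order.
function uniqueOrderIds(rows) {
  return [...new Set(rows.map((row) => row.id))];
}

const orderIds = uniqueOrderIds(rows);
console.log(orderIds); // [ 1, 2, 3 ]
```

Five rows collapse to three ids, and each id later drives one run of the content plugin.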

We will use the jsearch library to run our query.

This code simply returns all unique values in the id field. The one tricky bit is the slice call:

.slice(0, Number.MAX_SAFE_INTEGER)

By default, jsearch paginates results. The slice call tells it to return all results, from index 0 up to a very large number.
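
The effect of the slice call can be illustrated with a small stand-in written in plain JavaScript. The valuesQuery helper and its default page size of 10 are assumptions made for this sketch, not the jsearch implementation. Only the chained slice(start, end) call shape mirrors the snippet above.

```javascript
// Hypothetical stand-in for a paginated values query (NOT the jsearch API).
// DEFAULT_PAGE_SIZE is an assumption chosen for this sketch.
const DEFAULT_PAGE_SIZE = 10;

function valuesQuery(allValues) {
  let start = 0;
  let end = DEFAULT_PAGE_SIZE;
  return {
    // Change the window of results to return; chainable.
    slice(newStart, newEnd) {
      start = newStart;
      end = newEnd;
      return this;
    },
    // Return the values inside the current window.
    result() {
      return allValues.slice(start, end);
    },
  };
}

// 25 made-up order ids: 1, 2, ..., 25.
const ids = Array.from({ length: 25 }, (_, i) => i + 1);

// Without a slice call, only the first page comes back.
const firstPage = valuesQuery(ids).result();

// With a huge upper bound, the window covers every value.
const everything = valuesQuery(ids).slice(0, Number.MAX_SAFE_INTEGER).result();

console.log(firstPage.length);  // 10
console.log(everything.length); // 25
```

A collector that forgot the slice would, in this sketch, harmonize only the first page of ids and silently skip the rest.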

The final collector.sjs code:

  1. Make the code change.
  2. Click the SAVE button.
  3. Click on the Content tab.


Content Plugin

For the Order entity the id is the id from the original relational system. Instead of a 1:1 mapping of source documents, we must find all source documents that match the given id.

After we get all of the matching documents, we must build up an array of the Products while also summing the total price.

Once again we will use the jsearch library to run our query.

Note how we query all Order documents containing the matching id. We use the map function to extract the original content (stored in the instance part of the envelope). The orders variable will contain an array of the original JSON objects.

You can also see how we iterate over the orders to sum up the price and add pointers to the Product entities into the products array.
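
The shape of that logic can be sketched in plain JavaScript, outside MarkLogic. This is an illustration under stated assumptions rather than the plugin itself: the in-memory envelopes array stands in for the jsearch query, and the field names (sku, price, quantity), the buildOrder helper, and the form of the product pointer are all hypothetical.

```javascript
// Hypothetical staged documents: each envelope wraps one original row.
// Field names (id, sku, price, quantity) are assumptions for this sketch.
const envelopes = [
  { envelope: { instance: { id: 1, sku: "A100", price: 10.5, quantity: 2 } } },
  { envelope: { instance: { id: 1, sku: "B200", price: 2.25, quantity: 4 } } },
  { envelope: { instance: { id: 2, sku: "A100", price: 10.5, quantity: 1 } } },
];

function buildOrder(id, docs) {
  // Unwrap the original content, then keep only the rows for this order id.
  const orders = docs
    .map((doc) => doc.envelope.instance)
    .filter((instance) => instance.id === id);

  const products = [];
  let price = 0;

  for (const order of orders) {
    // Store a pointer to the Product entity rather than a copy of it.
    products.push({ type: "Product", ref: order.sku });
    // Assumes each row carries a unit price and a quantity.
    price += order.price * order.quantity;
  }

  return { id, price, products };
}

const order1 = buildOrder(1, envelopes);
console.log(order1.price);           // 30
console.log(order1.products.length); // 2
```

Keeping only pointers in the products array means the harmonized Order stays small and the Product data lives in one place.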

The final content plugin looks like:

  1. Change the code.
  2. Click SAVE.

Now click on the Flow Info tab.

Let’s run the flow. Click the RUN HARMONIZE button to start it.

Check out the Harmonized Orders

As we did after running the other flows, you might want to verify that the job finished.

  1. Click on the Jobs tab.
  2. Make sure the job finished.

You may also want to explore your harmonized data.

  1. Click on the Browse tab.
  2. Change Database to FINAL.
  3. Click the Search button.
  4. Click on the Order facet to filter the results.

You should see harmonized documents in the search results.

Click on a result to see the raw data.

Up Next

Serve the Data Out of MarkLogic