Getting Started Tutorial 2.x
1 - Download and Install MarkLogic

2 - Download the QuickStart War

  • Create a folder for this hub project and cd into it.
mkdir data-hub
cd data-hub
  • Download the latest QuickStart 2.x war file (quick-start-2*.war) from the releases page and place it in the folder you just created. Make sure you get the latest 2.x version, not a 1.x version!

3 - Download the Sample Data

  • Create a folder to hold your input data
mkdir input

Your data-hub directory should now contain the QuickStart war file (quick-start-2*.war) and the input folder.
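As a quick sanity check of the layout described above, the following sketch verifies that a project folder holds both the input folder and a 2.x QuickStart war file. The function name check_hub_layout is hypothetical and is not part of the Data Hub Framework; it only tests for the names used in this tutorial.

```shell
# Hypothetical helper (not part of the Data Hub Framework).
# Usage: check_hub_layout [project-dir]   (defaults to the current directory)
check_hub_layout() {
  dir="${1:-.}"
  rc=0

  # The input folder created in step 3 must exist.
  if [ ! -d "$dir/input" ]; then
    echo "missing folder: $dir/input"
    rc=1
  fi

  # Expand the glob into the positional parameters. If nothing matches,
  # the pattern stays literal (or empty) and the -e test below fails.
  set -- "$dir"/quick-start-2*.war
  if [ ! -e "$1" ]; then
    echo "missing file: $dir/quick-start-2*.war"
    rc=1
  fi

  return "$rc"
}
```

For example, running check_hub_layout data-hub from the parent folder prints one line per missing item and returns a non-zero status if anything is absent.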

4 - Run the QuickStart

  1. Open a terminal window in the data-hub directory
  2. Run the War
java -jar quick-start-*.war

If you need to run on a different port, add the --server.port option:

java -jar quick-start-*.war --server.port=9000
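The two launch commands above can be folded into one small wrapper. This is only a sketch: the function name start_quickstart and the DRY_RUN switch are hypothetical conveniences introduced here, and the only real options used are the java -jar invocation and the --server.port option shown above.

```shell
# Hypothetical wrapper around the launch commands from this tutorial.
# Usage: start_quickstart [port]
# Set DRY_RUN=1 (an assumption of this sketch) to print the command
# instead of running it.
start_quickstart() {
  port="${1:-}"

  # Build the command; the glob is expanded against the current directory.
  set -- java -jar quick-start-*.war

  # Append the port option only when a port was given.
  if [ -n "$port" ]; then
    set -- "$@" "--server.port=$port"
  fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Run it from the data-hub directory, for example start_quickstart 9000, or DRY_RUN=1 start_quickstart 9000 to preview the command first.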


5 - Log in to the Hub

After opening the QuickStart Application you must step through a wizard to properly configure the Hub.

  1. Browse to the directory where your hub will live. If you ran the QuickStart war file from the correct directory, the folder should already be correct. Click Next.

  2. Initialize your Data Hub Project Directory. Click INITIALIZE.

  3. You have now initialized your Data Hub Framework project. Your project folder now contains many new files and directories. If you are curious, you can read about the files in a Data Hub project. Click Next.

  4. Choose the Local Environment. Click Next.

  5. Log in to the Hub with your MarkLogic credentials.

  6. Install the Hub into MarkLogic. Click Install. You will then see a progress screen while the Data Hub is being installed.

Congratulations! The Data Hub Framework is installed and ready to use. You are taken to the Dashboard page, where you can see the document counts of all four hub databases. You can also clear the databases one by one or all at once.

The four databases are:

  • Staging: holds incoming data
  • Final: holds harmonized data
  • Job: holds data about the jobs you run
  • Trace: holds debugging data about each document that has been harmonized


