MarkLogic Community Recipes

Data Hub-4 Input Flow

Use a MarkLogic Data Hub Input Flow to transform documents while loading.

Run Data Hub-4 Harmonize Flow

Call the DHF Harmonize flow using the EvaluateCollector processor.

Run Data Hub-4 Flows (Input and Harmonize)

Orchestrate the DHF Input and Harmonize flows in a single NiFi template.

Read Files from Directory, Write to MarkLogic

This example watches a directory for files, imports them into MarkLogic, then deletes them. The MarkLogic URI is /files/ followed by the filename.
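
The URI rule described above can be sketched in plain Python, outside NiFi. This is only an illustration of the naming convention: the function name `build_uri` and the sample path are hypothetical, and in the actual flow the filename would come from the FlowFile rather than from a function argument.

```python
import posixpath


def build_uri(file_path: str, prefix: str = "/files/") -> str:
    """Return the document URI: the prefix followed by the bare filename."""
    # Keep only the last path component so directory names stay out of the URI.
    filename = posixpath.basename(file_path)
    return prefix + filename


# Hypothetical input path, used only to show the shape of the result.
print(build_uri("/data/incoming/report.json"))
```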

Extract values from JSON data

This example introduces the EvaluateJsonPath processor and demonstrates how to extract an ID value from JSON data to use in constructing the URI.
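
As a rough analogue of what the extraction step does, the sketch below parses a JSON document in plain Python and builds a URI from its ID. The field name `id`, the URI pattern, and the function name `uri_from_json` are assumptions made for illustration, not details taken from the recipe.

```python
import json


def uri_from_json(payload: str, id_field: str = "id") -> str:
    """Build a document URI from the ID found in a JSON payload."""
    doc = json.loads(payload)
    # A missing ID raises KeyError, which makes the bad record easy to spot.
    return f"/{doc[id_field]}.json"


# Hypothetical record, used only to show the shape of the result.
print(uri_from_json('{"id": 42, "name": "example"}'))
```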

Extract values from XML data

This example introduces the EvaluateXPath processor to extract an ID value from XML data to use in constructing the URI. The XPath result is stored in a FlowFile attribute, which is later used in InvokeHTTP to construct the document URI.
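
The same idea can be sketched in plain Python with the standard library, which supports a limited subset of XPath. The element name `id`, the URI pattern, and the function name `uri_from_xml` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def uri_from_xml(payload: str, id_path: str = "./id") -> str:
    """Build a document URI from the text of the element matched by id_path."""
    root = ET.fromstring(payload)
    value = root.findtext(id_path)
    if value is None:
        raise ValueError(f"no element matches {id_path!r}")
    return f"/{value.strip()}.xml"


# Hypothetical document, used only to show the shape of the result.
print(uri_from_xml("<person><id>abc</id><name>Ann</name></person>"))
```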

Convert Data To UTF-8

This example demonstrates the ConvertCharacterSet processor and shows how to convert data from another character set to UTF-8.
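
Conceptually, the conversion is a decode from the source character set followed by an encode to UTF-8. The plain-Python sketch below shows that step in isolation; the function name `to_utf8` and the Latin-1 sample are assumptions for illustration.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from source_encoding to UTF-8."""
    # Decoding fails loudly if the bytes are not valid in the source encoding.
    text = data.decode(source_encoding)
    return text.encode("utf-8")


# "café" encoded as Latin-1 uses one byte for the accented character.
latin1_bytes = b"caf\xe9"
print(to_utf8(latin1_bytes, "latin-1"))
```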

Handling Multiple Types of Content

This example demonstrates how to handle multiple content types.

Ingest Line-Delimited JSON

This example demonstrates the SplitText processor and shows how to ingest line-delimited JSON. Like the previous JSON example, we will construct the URI from an ID property in the JSON.
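
The split-then-name pattern can be sketched in plain Python: treat each non-empty line as one JSON document and derive its URI from an ID property. The field name `id`, the URI pattern, and the function name `split_json_lines` are assumptions for illustration.

```python
import json
from typing import Any, Dict, List, Tuple


def split_json_lines(text: str, id_field: str = "id") -> List[Tuple[str, Dict[str, Any]]]:
    """Return one (uri, record) pair per non-empty line of line-delimited JSON."""
    docs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # Skip blank lines between records.
        record = json.loads(line)
        docs.append((f"/{record[id_field]}.json", record))
    return docs


# Hypothetical input with two records and a blank line between them.
for uri, record in split_json_lines('{"id": 1}\n\n{"id": 2}\n'):
    print(uri, record)
```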

Split XML Files Into Multiple Documents

This example introduces the SplitXml processor to split an aggregate XML file into multiple documents.
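
To show what splitting an aggregate file means, the sketch below turns each direct child of the root element into its own document, using plain Python. It assumes a split depth of one and uses a hypothetical function name, `split_xml`; it is not the processor's implementation.

```python
import xml.etree.ElementTree as ET
from typing import List


def split_xml(aggregate: str) -> List[str]:
    """Serialize each direct child of the root as a separate XML document."""
    root = ET.fromstring(aggregate)
    documents = []
    for child in root:
        # Drop whitespace that followed the element inside the aggregate file.
        child.tail = None
        documents.append(ET.tostring(child, encoding="unicode"))
    return documents


# Hypothetical aggregate file with two records.
for doc in split_xml("<items>\n  <item>a</item>\n  <item>b</item>\n</items>"):
    print(doc)
```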

Loading Documents From Compressed Files

This example demonstrates the UnpackContent processor and shows how to load content from one or more compressed files.
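
Unpacking amounts to turning one archive into many named payloads. The plain-Python sketch below does this for a ZIP archive held in memory; the ZIP format, the helper `zip_bytes`, and the function name `unpack_zip` are assumptions for illustration, and the recipe itself may cover other formats.

```python
import io
import zipfile
from typing import Dict


def zip_bytes(files: Dict[str, bytes]) -> bytes:
    """Build a small in-memory ZIP archive (helper for the demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def unpack_zip(archive_bytes: bytes) -> Dict[str, bytes]:
    """Return a mapping of entry name to content for every file in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()  # Directory entries carry no content.
        }


print(unpack_zip(zip_bytes({"a.xml": b"<a/>", "b.json": b"{}"})))
```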

Generate Documents from CSV Files

This example introduces the EvaluateXPath processor to extract an ID value from XML data to use in constructing the URI.

Load PDF as Binary and Extracted Metadata as JSON

This example shows how to use the ExtractMediaMetadata processor to extract the properties from a PDF file and AttributesToJSON to convert the FlowFile attributes.

Call a Web Service

This example introduces the GenerateFlowFile processor and demonstrates how to consume JSON data from a paged web service.

Augment XML content with data from a Web Service

This example introduces XHTML fragment ingestion.

Modify NiFi Attributes with Custom Scripting

This example introduces the ExecuteScript processor and demonstrates how to add an attribute with a Groovy script.

Get Files by FTP

This example uses the GetFTP processor to get a single file from an anonymous FTP server.

Extract Text from PDFs and Office Documents

This example uses the ExtractTextProcessor, which is not included with NiFi but was developed by Hortonworks.

Get Data from a Relational Database

This example demonstrates ingesting data from a relational database.

Create View, use GenerateTableFetch

Executes any query against a database. Does not support paging: the entire result set is returned as a single Avro result that must then be split.

Count Rows, Construct Paged SQL SELECTs

Designed for paging. Executes a SELECT COUNT(*), then generates SQL queries to page over the rows of a table in chunks, but does not execute them.
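
The paging idea above can be sketched in plain Python: given the row count that a SELECT COUNT(*) would return, generate one SELECT per page without executing any of them. The function name `paged_selects`, the table and column names, and the LIMIT/OFFSET syntax are assumptions for illustration; paging syntax varies by database, and this is not the SQL that NiFi itself generates.

```python
from typing import List


def paged_selects(table: str, row_count: int, page_size: int, order_by: str) -> List[str]:
    """Generate (but do not run) one SELECT statement per page of rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    queries = []
    # A stable ORDER BY keeps pages from overlapping or skipping rows.
    for offset in range(0, row_count, page_size):
        queries.append(
            f"SELECT * FROM {table} ORDER BY {order_by} "
            f"LIMIT {page_size} OFFSET {offset}"
        )
    return queries


# Hypothetical table with 25 rows, fetched 10 rows at a time.
for query in paged_selects("orders", row_count=25, page_size=10, order_by="id"):
    print(query)
```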

Use ExecuteSQLToColumnMaps

Polls a custom query for additional rows by storing the last value of an increasing column and querying for rows beyond it.

Use ExecuteSQLToColumnMaps

This example explores a MarkLogic Community alternative to the built-in SQL processors.

Column Maps with Child Query

This example explores using a child query to get column maps.

Load From an MLCP Archive

This example explores loading content and metadata from an MLCP archive.

Load Triples

This example loads triples from any of the seven supported RDF triple formats.

Transform JSON

This example demonstrates how to transform JSON with the built-in JoltTransformJSON processor.

Load Data from SharePoint

This example demonstrates how to load data from a SharePoint server.

Invoke HTML Tidy on HTML Content

This example demonstrates how to use HTML Tidy to generate XHTML from HTML.

Decrypt Input CSVs

This example demonstrates how to convert and load CSV content.

Read Parquet Files

Export MarkLogic Database Content to the File System

This example uses the QueryMarkLogic processor to query a MarkLogic database, then writes the documents to the file system with the PutFile processor.

Read MarkLogic XML, Write to CSV

This example uses the MarkLogic QueryBatchProcessor processor to read XML from a MarkLogic database, then writes certain element values to CSV.

Kafka Integration

This example demonstrates how to integrate Kafka, NiFi, and MarkLogic.

Error Handling in NiFi Flows

Many of the example flows presented in this cookbook have auto-terminated relationships that represent error conditions, such as "failure", "unmatched", etc. Here we demonstrate a few patterns for handling those errors.

Error Handling in PutMarkLogic

Here we discuss error handling in the PutMarkLogic processor.
