Controller Service
MarkLogicDatabaseClientService
Provides a MarkLogic DatabaseClient instance for use by other processors.
Properties
- Host
- The host with the REST server for which a DatabaseClient instance needs to be created
- Port
- The port on which the REST server is hosted
- Load Balancer
- Is the host specified a load balancer?
- Security Context Type
- The type of the Security Context that needs to be used for authentication. The options are:
- DIGEST
- BASIC
- CERTIFICATE
- Username
- The user with read, write, or admin privileges - Required for Basic and Digest authentication
- Password
- The password for the user - Required for Basic and Digest authentication
- Database
- The database to access. By default, the configured database for the REST server would be accessed.
- SSL Context Service
- The SSL Context Service used to provide KeyStore and TrustManager information for secure connections.
- Client Authentication
- Client authentication policy when connecting via a secure connection. This property is only used when an SSL Context has been defined and enabled.
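The dependency between Security Context Type and the credential properties can be illustrated with a small validation step. This is a sketch only, not the controller service's actual code: the dictionary keys mirror the property names above, and the function name `validate_client_config` is hypothetical.

```python
# Hypothetical sketch: check the credential properties that the text above
# marks as required for Basic and Digest authentication.

CREDENTIAL_TYPES = {"DIGEST", "BASIC"}
KNOWN_TYPES = CREDENTIAL_TYPES | {"CERTIFICATE"}


def validate_client_config(config: dict) -> list:
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    context_type = str(config.get("Security Context Type", "")).upper()
    if context_type not in KNOWN_TYPES:
        problems.append(f"Unknown Security Context Type: {context_type!r}")
    elif context_type in CREDENTIAL_TYPES:
        # Username and Password are required for BASIC and DIGEST only.
        for name in ("Username", "Password"):
            if not config.get(name):
                problems.append(
                    f"{name} is required for {context_type} authentication"
                )
    return problems
```

Calling `validate_client_config({"Security Context Type": "BASIC"})` under these assumptions reports both missing credentials, while a CERTIFICATE configuration passes without them.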
MarkLogic Processors
ApplyTransformMarkLogic Processor
Creates FlowFiles from batches of documents, matching the given criteria, transformed from a MarkLogic server using the MarkLogic Data Movement SDK (DMSDK).
This processor accepts an input FlowFile, which can be used in the Query property with the NiFi Expression Language.
Relationships
success
FlowFiles are generated for each document URI read out of MarkLogic.
failure
If a query fails, a FlowFile goes to the failure relationship. If an input is provided to the processor, the input FlowFile is penalized and passed. Otherwise, a new FlowFile is generated and passed.
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Batch Size
- The number of documents per batch - sets the batch size on the Batcher.
- Thread Count
- The number of threads - sets the thread count on the Batcher.
- Query
- The query criteria for retrieving documents, corresponding to the selected Query Type. Expression Language Enabled: FlowFile Scope
- Query Type
- The type of query contained in the Query property. Available query types:
- Collection Query: Comma-separated list of collections to query from a MarkLogic server.
- Combined Query (JSON): Combine a string or structured query with dynamic query options (allows JSON-serialized cts queries). See documentation for more details.
- Combined Query (XML): Combine a string or structured query with dynamic query options (allows XML-serialized cts queries). See documentation for more details.
- String Query: A Google-style query string to search documents and metadata. See documentation for more details.
- Structured Query (JSON): A simple way to construct queries as a JSON structure, allowing you to manipulate complex queries. See documentation for more details.
- Structured Query (XML): A simple way to construct queries as an XML structure, allowing you to manipulate complex queries. See documentation for more details.
- Apply Result Type
- Whether to REPLACE each document with the result of the transform, or run the transform with each document as input but IGNORE the result. Default: Replace. Available result types:
- Replace: Overwrites documents with the value returned by the transform, just like REST write transforms. This is the default behavior.
- Ignore: Runs the transform on each document but ignores the value returned, because the transform itself performs any necessary database modifications or other processing. For example, a transform might call out to an external REST service or write multiple additional documents.
- State Index
- Definition of the index which will be used to keep state to restrict future calls. Currently only supports xs:dateTime indexes.
Example State Index Values By Type
- Element Index: xhtml:title
- Dynamic Property: ns:xhtml => http://www.w3.org/1999/xhtml
- JSON Property: title
- Path Index: /xhtml:html/xhtml:head/xhtml:title
- Dynamic Property: ns:xhtml => http://www.w3.org/1999/xhtml
- State Index Type
- Type of index to determine state for next set of documents.
- Element Index: Index on an element. (Namespaces can be defined with dynamic properties prefixed with ‘ns:’.)
- JSON Property Index: Index on a JSON property.
- Path Index: Index on a path. (Namespaces can be defined with dynamic properties prefixed with ‘ns:’.)
- Server Transform
- The name of the REST server transform to apply to every document as it’s written.
- trans:<custom-transform-parameter>
- A dynamic parameter with the prefix of trans: that will be passed to the transform. Expression Language Enabled: Variable Scope
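As an illustration of how the trans: dynamic properties relate to the Server Transform, the sketch below collects every property whose name starts with trans: into a parameter map. It is a hedged example rather than the processor's implementation: the function name `transform_parameters` and the sample property values are hypothetical.

```python
# Hypothetical sketch: gather dynamic properties named "trans:<name>" into
# a {name: value} map that could accompany a server transform.

TRANSFORM_PREFIX = "trans:"


def transform_parameters(properties: dict) -> dict:
    """Return the transform parameters found among processor properties."""
    return {
        key[len(TRANSFORM_PREFIX):]: value
        for key, value in properties.items()
        # Skip unrelated properties and a bare "trans:" with no name.
        if key.startswith(TRANSFORM_PREFIX) and len(key) > len(TRANSFORM_PREFIX)
    }


# Example usage with made-up values.
example = {
    "Server Transform": "enrich",   # hypothetical transform name
    "Batch Size": "100",
    "trans:source": "nifi",         # hypothetical transform parameter
}
params = transform_parameters(example)
```

Only the prefixed entries end up in `params`; regular processor properties such as Batch Size are left out.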
DeleteMarkLogic Processor
Creates FlowFiles from batches of documents, matching the given criteria, deleted from a MarkLogic server using the MarkLogic Data Movement SDK (DMSDK).
This processor accepts an input FlowFile, which can be used in the Query property with the NiFi Expression Language.
Relationships
success
FlowFiles are generated for each document URI read out of MarkLogic.
failure
If a query fails, a FlowFile goes to the failure relationship. If an input is provided to the processor, the input FlowFile is penalized and passed. Otherwise, a new FlowFile is generated and passed.
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Batch Size
- The number of documents per batch - sets the batch size on the Batcher.
- Thread Count
- The number of threads - sets the thread count on the Batcher.
- Query
- The query criteria for retrieving documents, corresponding to the selected Query Type. Expression Language Enabled: FlowFile Scope
- Query Type
- The type of query contained in the Query property. Available query types:
- Collection Query: Comma-separated list of collections to query from a MarkLogic server.
- Combined Query (JSON): Combine a string or structured query with dynamic query options (allows JSON-serialized cts queries). See documentation for more details.
- Combined Query (XML): Combine a string or structured query with dynamic query options (allows XML-serialized cts queries). See documentation for more details.
- String Query: A Google-style query string to search documents and metadata. See documentation for more details.
- Structured Query (JSON): A simple way to construct queries as a JSON structure, allowing you to manipulate complex queries. See documentation for more details.
- Structured Query (XML): A simple way to construct queries as an XML structure, allowing you to manipulate complex queries. See documentation for more details.
- State Index
- Definition of the index which will be used to keep state to restrict future calls. Currently only supports xs:dateTime indexes.
Example State Index Values By Type
- Element Index: xhtml:title
- Dynamic Property: ns:xhtml => http://www.w3.org/1999/xhtml
- JSON Property: title
- Path Index: /xhtml:html/xhtml:head/xhtml:title
- Dynamic Property: ns:xhtml => http://www.w3.org/1999/xhtml
- State Index Type
- Type of index to determine state for next set of documents.
- Element Index: Index on an element. (Namespaces can be defined with dynamic properties prefixed with ‘ns:’.)
- JSON Property Index: Index on a JSON property.
- Path Index: Index on a path. (Namespaces can be defined with dynamic properties prefixed with ‘ns:’.)
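The relationship between a State Index value and the ns: dynamic properties can be sketched as a check that every namespace prefix used in the index definition has a binding. This is an illustrative Python sketch under stated assumptions, not the processor's logic: the helper names `namespace_bindings` and `unbound_prefixes` are hypothetical, and the prefix pattern is a simplification of XML name rules.

```python
import re

NS_PREFIX = "ns:"
# Simplified assumption: a namespace prefix is a name directly followed by
# a colon, such as "xhtml" in "xhtml:title".
PREFIX_PATTERN = re.compile(r"([A-Za-z_][\w.-]*):")


def namespace_bindings(properties: dict) -> dict:
    """Map each prefix declared as "ns:<prefix>" to its namespace URI."""
    return {
        key[len(NS_PREFIX):]: value
        for key, value in properties.items()
        if key.startswith(NS_PREFIX)
    }


def unbound_prefixes(state_index: str, properties: dict) -> list:
    """List prefixes used in the State Index that have no ns: binding."""
    used = set(PREFIX_PATTERN.findall(state_index))
    return sorted(used - set(namespace_bindings(properties)))
```

With the Path Index example above and `ns:xhtml` bound to `http://www.w3.org/1999/xhtml`, `unbound_prefixes` returns an empty list; without the dynamic property it reports `xhtml`.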
ExecuteScriptMarkLogic Processor
Executes server-side code in MarkLogic, either in JavaScript or XQuery. Code can be given in a Script Body property or can be invoked as a path to a module installed on the server.
Relationships
results
FlowFiles are generated for each script result.
first result
A FlowFile is generated for the first script result.
last result
A FlowFile is generated for the last script result.
original
The input FlowFile is passed to this relationship.
failure
The input FlowFile is passed to this relationship if a failure occurs.
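One possible reading of how script results map onto the results, first result, and last result relationships is sketched below, including the effect of the Skip First Result property described under Properties. The function `route_results` is hypothetical, the original and failure relationships are not modeled, and the handling of a single skipped result is an assumption rather than documented behavior.

```python
# Hypothetical sketch of result routing for a list of script results.

def route_results(results: list, skip_first: bool = False) -> dict:
    """Return the results that would go to each result relationship."""
    routed = {"results": [], "first result": [], "last result": []}
    if not results:
        return routed
    # The first result always goes to the "first result" relationship.
    routed["first result"] = [results[0]]
    # With skip_first, the first result is withheld from the other two.
    remaining = results[1:] if skip_first else list(results)
    routed["results"] = remaining
    if remaining:
        routed["last result"] = [remaining[-1]]
    return routed
```

Under this reading, a script that returns a header value followed by data rows could set Skip First Result so that only the rows reach the results relationship.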
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Execution Type
- What will be executed: ad-hoc XQuery or JavaScript, or a path to a module on the server:
- XQuery: Execute XQuery supplied in the Script Body property.
- JavaScript: Execute JavaScript supplied in the Script Body property.
- Module Path: Execute the module specified in the Module Path property.
- Script Body
- Body of script to execute. Only one of Module Path or Script Body may be used. Expression Language Enabled: FlowFile Scope
- Module Path
- Path of module to execute. Only one of Module Path or Script Body may be used. Expression Language Enabled: FlowFile Scope
- Results Destination
- Where each result will be written in the FlowFile. If Attribute, the result will be written to the marklogic.result attribute:
- Content: Write the MarkLogic result to the FlowFile content.
- Attribute: Write the MarkLogic result to the marklogic.result attribute.
- Attributes from JSON Properties: Parse a MarkLogic JSON result into attributes with the same names as the top-level JSON properties, where the values are simple types, not objects or arrays.
- Skip First Result
- If true, the first result is not sent to the results relationship or the last result relationship, but is sent to the first result relationship.
- Content Variable
- The name of the external variable through which the incoming FlowFile content is passed to the script (optional). Expression Language Enabled: FlowFile Scope
- <custom-external-variable>
- A dynamic parameter that will be passed as an external variable. Expression Language Enabled: Variable Scope
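The Attributes from JSON Properties option can be pictured with a short sketch: parse the JSON result and keep only top-level properties whose values are simple types. This is an approximation for illustration, not the processor's code. The function name `attributes_from_json` is hypothetical, and treating null values as skipped and rendering booleans as `true`/`false` are assumptions.

```python
import json

# Simple types kept as attributes; objects, arrays, and null are skipped
# (skipping null is an assumption of this sketch).
SIMPLE_TYPES = (str, int, float, bool)


def attributes_from_json(result_text: str) -> dict:
    """Turn top-level simple JSON properties into string attributes."""
    parsed = json.loads(result_text)
    if not isinstance(parsed, dict):
        return {}
    attributes = {}
    for name, value in parsed.items():
        if isinstance(value, SIMPLE_TYPES):
            # FlowFile attributes are strings; keep JSON-style booleans.
            if isinstance(value, bool):
                attributes[name] = json.dumps(value)
            else:
                attributes[name] = str(value)
    return attributes
```

A result such as `{"title": "Report", "pages": 12, "tags": ["a"]}` would yield attributes for `title` and `pages` only, since `tags` is an array.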
ExtensionCallMarkLogic Processor
Allows MarkLogic REST extensions to be called.
Relationships
success
FlowFiles are generated for each document URI read out of MarkLogic.
failure
If a REST call fails, a FlowFile goes to the failure relationship.
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Extension Name
- Name of MarkLogic REST extension.
- Requires Input
- Whether a FlowFile is required to run.
- Payload Source
- Whether a payload body is passed and if so, from the FlowFile content or the Payload property.
- None: No payload is passed to the request body.
- FlowFile Content: The FlowFile content is passed as a payload to the request body.
- Payload Property: The Payload property is passed as a payload to the request body.
- Payload Format
- Format of request body payload.
- XML
- JSON
- TEXT
- BINARY
- UNKNOWN
- Payload
- Payload for the request body if “Payload Property” is the selected Payload Source. Expression Language Enabled: FlowFile Scope
- Method Type
- HTTP method to call the REST extension with.
- GET
- DELETE
- POST
- PUT
- param:<custom-extension-parameter>
- A dynamic parameter with the prefix of param: that will be passed to the REST extension. Expression Language Enabled: FlowFile Scope
- separator:param:<custom-extension-parameter>
- A dynamic parameter with the prefix of separator: that specifies the delimiter used to split the value of the matching param: property. For example, multiple uri parameters can be set with param:uri => uri1.json,uri2.json and separator:param:uri => , (a comma). Expression Language Enabled: FlowFile Scope
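The interaction between param: and separator:param: properties can be sketched as follows. This is a minimal illustration of the splitting rule described above, not the processor's implementation: the function name `extension_parameters` is hypothetical, and representing every parameter as a list of values is a choice made for this sketch.

```python
# Hypothetical sketch: resolve "param:<name>" properties into lists of
# values, splitting on the delimiter given by "separator:param:<name>".

PARAM_PREFIX = "param:"
SEPARATOR_PREFIX = "separator:"


def extension_parameters(properties: dict) -> dict:
    """Return {name: [values]} for every param: property."""
    params = {}
    for key, value in properties.items():
        if not key.startswith(PARAM_PREFIX):
            continue  # skips separator: entries and regular properties
        name = key[len(PARAM_PREFIX):]
        separator = properties.get(SEPARATOR_PREFIX + key)
        # Without a separator, the value is passed through as one item.
        params[name] = value.split(separator) if separator else [value]
    return params


example = {
    "param:uri": "uri1.json,uri2.json",
    "separator:param:uri": ",",
}
resolved = extension_parameters(example)
```

Here `resolved` holds two values for `uri`, matching the example in the property description.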
PutMarkLogic Processor
Writes batches of FlowFiles as documents to a MarkLogic server using the MarkLogic Data Movement SDK (DMSDK).
Relationships
success
FlowFiles that have been written successfully to MarkLogic are passed to this relationship.
batch_success
A FlowFile is created and written to this relationship for each batch. The FlowFile has an attribute of URIs, which is a comma-separated list of URIs successfully written in a batch. This can assist with post-batch processing.
failure
FlowFiles that have failed to be written to MarkLogic are passed to this relationship.
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Batch Size
- The number of documents per batch - sets the batch size on the Batcher.
- Thread Count
- The number of threads - sets the thread count on the Batcher.
- Collections
- Comma-delimited sequence of collections to add to each document. Expression Language Enabled: FlowFile Scope
- Format
- Format for each document; if not specified, MarkLogic will determine the format based on the URI.
- Job ID
- ID for the WriteBatcher job.
- Job Name
- Name for the WriteBatcher job.
- MIME Type
- MIME type for each document; if not specified, MarkLogic will determine the MIME type based on the URI.
- Permissions
- Comma-delimited sequence of permissions - role1, capability1, role2, capability2 - to add to each document
- Temporal Collection
- The temporal collection to use for a temporal document insert.
- Server Transform
- The name of the REST server transform to apply to every document as it’s written.
- URI Attribute Name
- The name of the FlowFile attribute whose value will be used as the URI.
- URI Prefix
- The prefix to prepend to each URI.
- URI Suffix
- The suffix to append to each URI.
- trans:<custom-transform-parameter>
- A dynamic parameter with the prefix of trans: that will be passed to the transform. Expression Language Enabled: Variable Scope
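The Permissions format (role1, capability1, role2, capability2) alternates role names and capabilities. The sketch below shows one way to read such a string into role-to-capability groups. It is illustrative only: the function name `parse_permissions` is hypothetical, and rejecting an odd number of tokens is an assumption about how malformed input might be handled.

```python
# Hypothetical sketch: parse "role1,capability1,role2,capability2" into
# a mapping of role -> list of capabilities.

def parse_permissions(permissions: str) -> dict:
    """Group capabilities by role, preserving the order they were given."""
    tokens = [t.strip() for t in permissions.split(",") if t.strip()]
    if len(tokens) % 2 != 0:
        # Assumption: an unpaired role or capability is treated as an error.
        raise ValueError("Permissions must be role,capability pairs")
    grouped = {}
    # Even positions are roles, odd positions are their capabilities.
    for role, capability in zip(tokens[0::2], tokens[1::2]):
        capabilities = grouped.setdefault(role, [])
        if capability not in capabilities:
            capabilities.append(capability)
    return grouped
```

A role may appear more than once in the sequence to receive several capabilities, for example `role1, capability1, role1, capability2`.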
PutMarkLogicRecord Processor
Breaks down FlowFiles into batches of Records and inserts JSON documents to a MarkLogic server using the MarkLogic Data Movement SDK (DMSDK).
Relationships
success
FlowFiles that have been written successfully to MarkLogic are passed to this relationship.
batch_success
A FlowFile is created and written to this relationship for each batch. The FlowFile has an attribute of URIs, which is a comma-separated list of URIs successfully written in a batch. This can assist with post-batch processing.
failure
FlowFiles that have failed to be written to MarkLogic are passed to this relationship.
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Batch Size
- The number of documents per batch - sets the batch size on the Batcher.
- Thread Count
- The number of threads - sets the thread count on the Batcher.
- Record Reader
- The Record Reader to use for incoming FlowFiles.
- Record Writer
- The Record Writer to use for creating new FlowFiles.
- Collections
- Comma-delimited sequence of collections to add to each document. Expression Language Enabled: FlowFile Scope
- Format
- Format for each document; if not specified, MarkLogic will determine the format based on the URI.
- Job ID
- ID for the WriteBatcher job.
- Job Name
- Name for the WriteBatcher job.
- MIME Type
- MIME type for each document; if not specified, MarkLogic will determine the MIME type based on the URI.
- Permissions
- Comma-delimited sequence of permissions - role1, capability1, role2, capability2 - to add to each document
- Temporal Collection
- The temporal collection to use for a temporal document insert.
- Server Transform
- The name of the REST server transform to apply to every document as it’s written.
- URI Field Name
- The name of the record field whose value will be used as the URI. If not specified, a UUID will be generated.
- URI Prefix
- The prefix to prepend to each URI.
- URI Suffix
- The suffix to append to each URI.
- trans:<custom-transform-parameter>
- A dynamic parameter with the prefix of trans: that will be passed to the transform. Expression Language Enabled: Variable Scope
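The way URI Field Name, URI Prefix, and URI Suffix combine into a document URI can be sketched as below. This is a hedged illustration, not the processor's code: the function name `build_uri` and the sample record are hypothetical, and falling back to a UUID when the named field is absent from a record is an assumption (the text above only states that a UUID is generated when the field name is not specified).

```python
import uuid


def build_uri(record: dict, uri_field_name=None, prefix: str = "", suffix: str = "") -> str:
    """Build a document URI from a record field, or from a generated UUID."""
    value = record.get(uri_field_name) if uri_field_name else None
    # Assumption: a missing field value also falls back to a UUID.
    base = str(value) if value is not None else str(uuid.uuid4())
    return f"{prefix}{base}{suffix}"


# Example usage with a made-up record.
uri = build_uri({"id": 42}, uri_field_name="id", prefix="/orders/", suffix=".json")
```

In this example `uri` is `/orders/42.json`; omitting `uri_field_name` would place a random UUID between the prefix and suffix instead.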
QueryMarkLogic Processor
Creates FlowFiles from batches of documents, matching the given criteria, retrieved from a MarkLogic server using the MarkLogic Data Movement SDK (DMSDK).
This processor accepts an input FlowFile, which can be used in the Query property with the NiFi Expression Language.
Relationships
success
FlowFiles are generated for each document URI read out of MarkLogic.
failure
If a query fails, a FlowFile goes to the failure relationship. If an input is provided to the QueryMarkLogic processor, the input FlowFile is penalized and passed. Otherwise, a new FlowFile is generated and passed.
Properties
- DatabaseClient Service
- The DatabaseClient Controller Service that provides the MarkLogic connection.
- Batch Size
- The number of documents per batch - sets the batch size on the Batcher.
- Thread Count
- The number of threads - sets the thread count on the Batcher.
- Consistent Snapshot
- Boolean indicating whether the matching documents are retrieved from a consistent snapshot.
- Query
- The query criteria for retrieving documents, corresponding to the selected Query Type. Expression Language Enabled: FlowFile Scope
- Query Type
- The type of query contained in the Query property. Available query types:
- Collection Query: Comma-separated list of collections to query from a MarkLogic server.
- Combined Query (JSON): Combine a string or structured query with dynamic query options (allows JSON-serialized cts queries). See documentation for more details.
- Combined Query (XML): Combine a string or structured query with dynamic query options (allows XML-serialized cts queries). See documentation for more details.
- String Query: A Google-style query string to search documents and metadata. See documentation for more details.
- Structured Query (JSON): A simple way to construct queries as a JSON structure, allowing you to manipulate complex queries. See documentation for more details.
- Structured Query (XML): A simple way to construct queries as an XML structure, allowing you to manipulate complex queries. See documentation for more details.
- Return Type
- The type of data that is returned. Default: Documents. Available return types:
- URIs Only: Passes FlowFiles with only the filename attribute, set to the matching document URI.
- Documents: Adds the document to the FlowFile content.
- Documents + Metadata: Adds the document to the FlowFile content, and adds metadata with the meta: prefix and properties with the property: prefix to the FlowFile attributes.
- Metadata: Adds metadata with the meta: prefix and properties with the property: prefix to the FlowFile attributes.
- State Index
- Definition of the index which will be used to keep state to restrict future calls. Currently only supports xs:dateTime indexes.
Example State Index Values By Type
- Element Index: xhtml:title
- Dynamic Property: ns:xhtml => http://www.w3.org/1999/xhtml
- JSON Property: title
- Path Index: /xhtml:html/xhtml:head/xhtml:title
- Dynamic Property: ns:xhtml => http://www.w3.org/1999/xhtml
- State Index Type
- Type of index to determine state for next set of documents.
- Element Index: Index on an element. (Namespaces can be defined with dynamic properties prefixed with ‘ns:’.)
- JSON Property Index: Index on a JSON property.
- Path Index: Index on a path. (Namespaces can be defined with dynamic properties prefixed with ‘ns:’.)
- Collections
- DEPRECATED: use Query Type Collection Query with Query instead. Comma-separated list of collections to query from a MarkLogic server.
- Server Transform
- The name of the REST server transform to apply to every document as it’s read.
- trans:<custom-transform-parameter>
- A dynamic parameter with the prefix of trans: that will be passed to the transform. Expression Language Enabled: Variable Scope
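As a rough illustration of what a value for the Query property might look like with the Combined Query (JSON) query type, the sketch below assembles a JSON string that wraps a structured query and optional query options. The JSON shape is assumed to follow the MarkLogic REST search conventions and should be checked against the MarkLogic documentation; the function name `combined_query` and the sample collection and term are hypothetical.

```python
import json


def combined_query(collection: str, term: str, options=None) -> str:
    """Build a combined query string (assumed shape, see the note above)."""
    structured = {
        "queries": [
            {
                # Both sub-queries must match.
                "and-query": {
                    "queries": [
                        {"collection-query": {"uri": [collection]}},
                        {"term-query": {"text": [term]}},
                    ]
                }
            }
        ]
    }
    search = {"query": structured}
    if options:
        # Dynamic query options are included only when provided.
        search["options"] = options
    return json.dumps({"search": search})


# Example usage with made-up values.
query_text = combined_query("reports", "dog")
```

The resulting string could be pasted into the Query property, or produced upstream and referenced through the NiFi Expression Language from a FlowFile attribute.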