2-2 Load PDF as Binary and Extracted Metadata as JSON

This example uses the ExtractMediaMetadata processor to extract properties from a PDF file and the AttributesToJSON processor to convert the FlowFile attributes, including the extracted PDF properties, to a JSON document. It is also the first example in which a single relationship, here the "success" relationship of GetFile, is connected to more than one processor, creating two sub-flows: one loads the PDF as binary and the other loads its metadata as JSON.

  • Download Template
  • Processors:
    • GetFile – reads files from a watched directory
      • Properties
        • Input Directory: /some/path
    • UpdateAttribute (after GetFile)
      • Properties
        • marklogic.uri: /pdfs/${filename} (custom property)
    • ExtractMediaMetadata
      • Properties
        • (all defaults)
    • AttributesToJSON
      • Properties
        • Destination: flowfile-content
    • UpdateAttribute (after AttributesToJSON)
      • Properties
        • marklogic.uri: /pdfs/${filename}.json (custom property)
    • PutMarkLogic
      • Properties
        • DatabaseClient Service: MarkLogicClientService – Localhost / Documents
        • URI attribute name: marklogic.uri
      • Settings
        • Automatically Terminate Relationships: failure and success
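
The effect of the two sub-flows can be sketched outside NiFi in a few lines of Python. This is only an illustration of the data each branch sends to PutMarkLogic, not how NiFi implements it. The function name build_documents and the metadata key used in the usage example are hypothetical, and only the filename attribute is modeled; a real FlowFile carries more attributes, all of which AttributesToJSON would include by default.

```python
import json


def build_documents(filename: str, pdf_bytes: bytes, extracted: dict) -> dict:
    """Return {marklogic.uri: content} for both sub-flows (illustrative only)."""
    # Attributes on the FlowFile after GetFile; only "filename" is modeled here.
    attributes = {"filename": filename}

    # Sub-flow 1: UpdateAttribute sets marklogic.uri to /pdfs/${filename};
    # the FlowFile content stays the unmodified PDF binary.
    binary_uri = f"/pdfs/{filename}"

    # Sub-flow 2: ExtractMediaMetadata adds the PDF properties as attributes,
    # then AttributesToJSON (Destination: flowfile-content) replaces the
    # content with a JSON object built from those attributes.
    json_attributes = {**attributes, **extracted}
    json_content = json.dumps(json_attributes, sort_keys=True).encode("utf-8")

    # UpdateAttribute after AttributesToJSON sets /pdfs/${filename}.json.
    json_uri = f"/pdfs/{filename}.json"

    return {binary_uri: pdf_bytes, json_uri: json_content}


if __name__ == "__main__":
    # "example.title" is a made-up key standing in for an extracted property.
    docs = build_documents("report.pdf", b"%PDF-1.4", {"example.title": "Report"})
    for uri in sorted(docs):
        print(uri)
```

Because ${filename} already includes the file extension, the two documents for report.pdf end up at /pdfs/report.pdf and /pdfs/report.pdf.json, so the binary and its metadata sit side by side in the database.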