Navigation
- Introduction
- Getting Started
- Step-by-Step Guide
- NiFi Features
- Cookbook Recipes
- Community Recipes
- Run Data Hub-4 Input Flow
- Run Data Hub-4 Harmonize Flow
- Run Data Hub-4 Flows (Input and Harmonize)
- Read Files from Directory, Write to MarkLogic
- Extract values from JSON data
- Extract values from XML data
- Convert Data To UTF-8
- Handling Multiple Types of Content
- Ingest Line-Delimited JSON
- Split XML Files Into Multiple Documents
- Loading Documents From Compressed Files
- Generate Documents from CSV Files
- Load PDF as Binary and Extracted Metadata as JSON
- Call a Web Service
- Augment XML content with data from a Web Service
- Modify NiFi Attributes with Custom Scripting
- Get Files by FTP
- Extract Text from PDFs and Office Documents
- Get Data from a Relational Database
- Create View, use GenerateTableFetch
- Count Rows, Construct Paged SQL SELECTs
- Execute a SQL Query for a Nested Array
- Use ExecuteSQLToColumnMaps
- Column Maps with Child Query
- Loading Content and Metadata from an mlcp Archive File
- Load Triples
- Transform JSON
- Load Data from SharePoint
- Invoke HTML Tidy on HTML Content
- Decrypt Input CSVs
- Read Parquet Files
- Export MarkLogic Database Content to the File System
- Read MarkLogic XML, Write to CSV
- Kafka Integration
- Error Handling in NiFi Flows
- Error Handling in PutMarkLogic
- Error Resolutions
- Performance Considerations
- FAQs
Read Parquet Files
The easiest way to process Parquet files in NiFi is to read them with Python's pandas library (which uses the pyarrow or fastparquet engine for Parquet) in a script invoked by an ExecuteStreamCommand processor, which streams the incoming FlowFile content to the script and replaces it with the script's output.
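
Below is a minimal sketch of such a script, assuming pandas and pyarrow are installed on the NiFi host and that the FlowFile content is a single Parquet file. The script name (parquet_to_json.py) and the choice of newline-delimited JSON output are illustrative, not part of any NiFi or MarkLogic API.

```python
# parquet_to_json.py - intended to be invoked by NiFi's ExecuteStreamCommand processor.
# ExecuteStreamCommand pipes the incoming FlowFile (a Parquet file) to stdin and
# replaces the FlowFile content with whatever the script writes to stdout.
import io
import sys

import pandas as pd  # requires pandas plus pyarrow (or fastparquet) on the NiFi host


def main():
    # Parquet readers need a seekable source, so buffer stdin in memory first.
    raw = sys.stdin.buffer.read()
    df = pd.read_parquet(io.BytesIO(raw))

    # Emit newline-delimited JSON, one record per row, so downstream processors
    # (for example PutMarkLogic) can work with the rows as JSON.
    df.to_json(sys.stdout, orient="records", lines=True)


if __name__ == "__main__":
    main()
```

With this approach, the ExecuteStreamCommand processor would typically be configured with Command Path set to the Python interpreter (for example python3) and Command Arguments pointing at the script; the exact paths depend on your NiFi host.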