Cloudera and StreamNative are open-sourcing an integration between Apache NiFi and Apache Pulsar. Together, NiFi and Pulsar enable companies to create a cloud-native, scalable, real-time streaming data platform that can ingest, transform, and analyze massive amounts of data.
StreamNative was founded by the original creators of Apache Pulsar, and the team is excited to contribute the integration to the open source community. The Cloudera team includes some of the original developers of Apache NiFi and will make the connector available inside the Cloudera platform.
According to the companies, combining these technologies inside your data platform offers significant synergies. NiFi covers your dataflow management needs, including prioritization, back pressure, and edge intelligence.
You can use NiFi’s extensive suite of connectors to automate the flow of data into your streaming platform while performing ETL processing along the way. After the data has been transformed, it can be routed directly to Pulsar’s durable stream storage for long-term retention via these new NiFi processors designed for Apache Pulsar.
Once the data has been stored inside Pulsar, it is readily available to popular stream processing engines such as Flink or Spark for more complex stream processing and analytics use cases. In short, NiFi’s extensive suite of connectors makes it easy to get data into your streaming platform, and Pulsar’s integration with Flink and Spark makes it easy to get real-time insights out.
With this update, you will be able to consume messages from and produce messages to Pulsar topics at scale using simple configuration settings within Apache NiFi. Cloudera makes these processors available out of the box in CDF for Data Hub 7.2.14 and newer.