Striim; an opportunity for the NonStop user community to better integrate and utilize the Apache Kafka capabilities!
How many of us have been checking the Striim blog of late? Noticed just how many posts there have been referencing Kafka? When it comes to the NonStop community, how many of us even know what Kafka is, let alone track all the attention it has been attracting? If you haven’t been tracking Kafka then perhaps it is time to take a look. After all, it’s another Apache project and, as we know, there has been much coming from Apache that has found its way onto NonStop systems – think Apache web services, including SOAP.
So what is the Apache Kafka project all about? First up, take a look at how Apache describes this project: “Kafka™ is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.” Wikipedia describes Kafka as “an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a ‘massively scalable pub/sub message queue architected as a distributed transaction log,’ making it highly valuable for enterprise infrastructures to process streaming data.”
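For readers new to the idea, the phrase "pub/sub message queue architected as a transaction log" can be pictured with a toy example. The sketch below is not Kafka code and uses no Kafka library; the class name TopicLog and its methods are made up purely for illustration. It only shows the core idea: messages are appended to a log, and each group of readers keeps its own position (offset) in that log, so several consumers can read the same data independently.

```python
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """Toy, single-process stand-in for one topic (hypothetical, not a Kafka API)."""
    records: list = field(default_factory=list)   # append-only list of messages
    offsets: dict = field(default_factory=dict)   # reader group -> next offset to read

    def publish(self, message: str) -> int:
        """Append a message to the log and return the offset it was stored at."""
        self.records.append(message)
        return len(self.records) - 1

    def poll(self, group: str, max_records: int = 10) -> list:
        """Return messages the group has not read yet and advance its offset."""
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch


log = TopicLog()
log.publish("debit account 42")
log.publish("credit account 7")

# Two independent reader groups each see the full log from the beginning.
analytics_batch = log.poll("analytics")
audit_batch = log.poll("audit")
```

Because reading never removes anything from the log, the "analytics" and "audit" groups both receive the two messages. A real Kafka cluster adds partitioning, replication and durable storage on top of this basic shape.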
OK, so let’s not get too hung up on the references to scalability and fault tolerance – does anyone do this better than NonStop? However, the reference to customers using Kafka to build real-time data pipelines and streaming apps is significant. In his post of August 28, 2017, Striim CTO Steve Wilkes lists a couple of things you need to be aware of, as a NonStop user, when the discussion about Apache Kafka comes up. In this post, Making the most of Apache Kafka – Make Kafka easy, Wilkes suggests that you need to determine:
- How do you get data into Kafka from your enterprise sources, including databases, logs, devices?
- How do you deliver data from Kafka to targets like Hadoop, Databases, Cloud storage?
- How do you process and prepare data?
- How do you perform analytics?
- How do you ensure all the pieces work together and are enterprise grade?
- How do you do all this in a fast and productive way without involving an army of developers?
Turning our attention to the NonStop community, here’s the deal. Let’s start with the very first bullet item – how do you get data into Kafka from your enterprise sources, including databases, logs, devices? You shouldn’t be surprised to read that, yet again, it’s all about exploiting change data capture (CDC) – something Striim can readily perform on NonStop! After all, there is more than enough expertise within the Striim organization, folks knowledgeable and experienced in doing exactly this. Remember where a number of the Striim team members came from? Yes, you guessed it – the former GoldenGate team.
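To make the CDC idea a little more concrete, here is a minimal sketch of what turning one captured database change into a Kafka-style message might look like. Everything in it is an assumption made for illustration: the ChangeEvent fields, the "cdc." topic prefix and the JSON layout are hypothetical and are not Striim's format or any NonStop audit format. No Kafka client is used; the function simply builds the topic, key and value that a producer would then send.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    """One captured change to a row (hypothetical shape, for illustration only)."""
    table: str
    operation: str             # "INSERT", "UPDATE" or "DELETE"
    primary_key: str
    before: Optional[dict]     # row image before the change, if any
    after: Optional[dict]      # row image after the change, if any


def to_kafka_record(event: ChangeEvent) -> tuple:
    """Build (topic, key_bytes, value_bytes) for a change event."""
    topic = "cdc." + event.table.lower()       # assumed naming: one topic per table
    key = event.primary_key.encode("utf-8")    # same row always maps to the same key
    value = json.dumps(
        {"op": event.operation, "before": event.before, "after": event.after},
        sort_keys=True,
    ).encode("utf-8")
    return topic, key, value


event = ChangeEvent("ACCOUNTS", "UPDATE", "42", {"balance": 100}, {"balance": 75})
topic, key, value = to_kafka_record(event)
```

Keying each message by the row's primary key is a common choice because Kafka keeps messages with the same key in order within a partition, so changes to one row arrive in the sequence they were made.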
But there is more to consider here. How do you perform analytics and where? The story line is now becoming a lot clearer and if you have been attuned to the latest news coming out of HPE aimed at the NonStop community you will appreciate how it is all about hybrid IT and it is all about virtualization. And this includes NonStop! Striim doesn’t expect many NonStop users will contemplate running the processor-intensive analytics on NonStop – particularly in real time – but rather will exploit offerings like NSADI to offload onto Linux systems where the analytics can be performed in a more cost-effective manner. That’s right, where you have configured Striim! And yet, back to the databases and log / audit files, you will still find elements of Striim running on NonStop, pulling relevant information from completed transactions.
The Apache Kafka project is a very important project and one that will likely have an impact on solutions running on NonStop. Moving mission-critical information between NonStop and Kafka, and then back again, will be something many NonStop users will be looking to exploit. As Wilkes explains it, “Striim not only integrates Kafka as a source and target … Striim ships with Kafka built in. You can optionally start a Kafka cluster when you spin up a Striim cluster, and easily switch between our high speed in-memory messaging and Kafka using a keyword (or a toggle in our UI). Kafka can become transparent and its capabilities harnessed without having to code to a bunch of APIs.”
Furthermore, notes Wilkes in a follow-on post, “When you are considering how to get data into Kafka, you need to determine how to collect source data in a streaming fashion, and how to ‘massage’ and transform that data into the required format on Kafka. Neither of these steps should require any coding, yet should be flexible enough to cover a wide range of use cases.” And yes, “The Striim platform ingests real-time streaming data from a variety of sources out-of-the box, including databases, files, message queues and devices. All of these are wrapped in a simple easy to use construct – a data source – that is configured through a set of properties.” Sound familiar? Well, it should if you are a NonStop user, but if you have any questions or would like to know more about how Striim can help you here, then feel free to call Katherine Rincon directly or simply send her an email at any time.