Striim: When it’s time to move data in real time, you need a robust and resilient data pipeline
We all know about pipelines, right? There are oil and gas pipelines just as there are sales pipelines. Whenever we think of pipelines we think of movement. The very goal of any pipeline is to support continuous movement. What then do we think of data pipelines?
This is the question that Mariana Park, Striim’s Growth Marketer, addresses in her June 21st, 2021 post to the Striim blog, What is a Data Pipeline (and 7 Must-Have Features of Modern Data Pipelines). Now, for the NonStop community, any discussion about pipelines touches on a popular theme, one we regularly discuss. The prospect of NonStop silo-ing data is anathema to all who deploy NonStop. After all, NonStop creates data. That’s just what NonStop systems do as they process transactions. The data flow is nonstop! And it’s all in real time.
What is a Data Pipeline? According to Striim’s Park:
A data pipeline is a sequence of actions that moves data from a source to a destination. A pipeline may involve filtering, cleaning, aggregating, enriching, and even analyzing data-in-motion.
Data pipelines move and unify data from an ever-increasing number of disparate sources and formats so that it’s suitable for analytics and business intelligence.
Data can be moved via either batch processing or stream processing … stream processing enables the real-time movement of data. Stream processing continuously collects data from sources like change streams from a database or events from messaging systems and sensors.
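To make Park’s description a little more concrete, here is a minimal sketch in Python of a stream-style pipeline that cleans, filters, enriches, and aggregates records one at a time as they arrive. It is an illustration only and not Striim’s API: the event fields, the REGION_BY_ACCOUNT lookup, and every function name are assumptions made up for this example.

```python
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]

# Hypothetical source events; the field names are illustrative only.
SAMPLE_EVENTS = [
    {"account": " A1 ", "amount": "25.00"},
    {"account": "B2", "amount": "not-a-number"},  # malformed, will be filtered out
    {"account": "A1", "amount": "10.50"},
]

# Hypothetical reference data used for the enrichment step.
REGION_BY_ACCOUNT = {"A1": "EMEA", "B2": "APAC"}


def clean(events: Iterable[Event]) -> Iterator[Event]:
    """Normalize fields and drop records whose amount cannot be parsed."""
    for event in events:
        try:
            amount = float(event["amount"])
        except ValueError:
            continue  # filtering: skip malformed records
        yield {"account": event["account"].strip(), "amount": amount}


def enrich(events: Iterable[Event]) -> Iterator[Event]:
    """Attach a region to each record from the reference lookup."""
    for event in events:
        region = REGION_BY_ACCOUNT.get(event["account"], "UNKNOWN")
        yield {**event, "region": region}


def running_totals(events: Iterable[Event]) -> Iterator[Event]:
    """Aggregate while the data is in motion: keep a running total per account."""
    totals: Dict[str, float] = {}
    for event in events:
        account = event["account"]
        totals[account] = totals.get(account, 0.0) + event["amount"]
        yield {**event, "running_total": totals[account]}


def pipeline(source: Iterable[Event]) -> Iterator[Event]:
    """Chain the stages; each record flows through as soon as it is read."""
    return running_totals(enrich(clean(source)))


if __name__ == "__main__":
    for record in pipeline(SAMPLE_EVENTS):
        print(record)
```

Because each stage is a generator, a record moves from source to destination without waiting for the rest of the input, which is the essential difference between stream and batch processing in the passage above. A batch version would collect the full input first and then run the same stages over it.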
Real Time Data Processing and Analytics? These two activities have become tightly coupled: increasingly, the type of real time data processing we associate with NonStop depends on the outcomes of analytics processes. No surprise, then, that Park highlights how:
Modern data pipelines should load, transform, and analyze data in near real time, so that businesses can quickly find and act on insights. To start with, data must be ingested without delay from sources including databases, IoT devices, messaging systems, and log files. For databases, log-based Change Data Capture (CDC) is the gold standard for producing a stream of real-time data.
Real-time data pipelines provide decision makers with more current data. And businesses, like fleet management and logistics firms, can’t afford any lag in data processing. They need to know in real time if drivers are driving recklessly or if vehicles are in hazardous conditions to prevent accidents and breakdowns.
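For readers less familiar with log-based Change Data Capture, the idea can be sketched in a few lines of Python: every insert, update, and delete recorded in the source database’s log becomes an event, and a downstream copy is kept current by replaying those events in order. This is a simplified illustration under stated assumptions, not how Striim or any particular database implements CDC. The event layout ("op", "key", "row"), the fleet-style sample data, and the function name are all hypothetical.

```python
from typing import Any, Dict, Iterable

ChangeEvent = Dict[str, Any]

# Hypothetical change events, in the order they were written to the source log.
CHANGE_LOG = [
    {"op": "insert", "key": 1, "row": {"driver": "D-17", "speed_kmh": 82}},
    {"op": "insert", "key": 2, "row": {"driver": "D-42", "speed_kmh": 64}},
    {"op": "update", "key": 1, "row": {"driver": "D-17", "speed_kmh": 131}},
    {"op": "delete", "key": 2},
]


def apply_changes(
    replica: Dict[int, Dict[str, Any]], change_log: Iterable[ChangeEvent]
) -> Dict[int, Dict[str, Any]]:
    """Replay change events, in order, against an in-memory replica table."""
    for change in change_log:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            replica[key] = change["row"]  # the latest row image wins
        elif op == "delete":
            replica.pop(key, None)  # tolerate a delete for a key we never saw
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return replica


if __name__ == "__main__":
    print(apply_changes({}, CHANGE_LOG))
```

The point of reading the log rather than polling the tables is that every change is captured in order, including deletes, and the source database is not burdened with repeated queries.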
When considering the modern infrastructures proving popular with IT managers today, it’s hard to ignore cloud. HPE likes to talk about the cloud experience even as virtual NonStop is heading to the clouds, but the reality is that without access to cloud service providers, much of what we would like to achieve with data, both creating and analyzing it, may be beyond economic practicality for most enterprises. The way Park sees it:
Modern data pipelines rely on the cloud to enable users to automatically scale compute and storage resources up or down. While traditional pipelines aren’t designed to handle multiple workloads in parallel, modern data pipelines feature an architecture in which compute resources are distributed across independent clusters.
Clusters can grow in number and size quickly and infinitely while maintaining access to the shared dataset. Data processing time is easier to predict as new resources can be added instantly to support spikes in data volume.
When it comes time to summarize the advantages that come with deploying a data pipeline capable of moving data in real time, Park returns to the topic of modernization. This has been the focus of many NonStop presentations and webinars of late, given the steps HPE has taken to modernize NonStop, so it comes as no surprise that any discussion around modernization should include the implementation of data pipelines as supported by Striim:
Data pipelines are the backbone of digital systems. Pipelines move, transform, and store data and enable organizations to harness critical insights. But data pipelines need to be modernized to keep up with the growing complexity and size of datasets. And while the modernization process takes time and effort, efficient and modern data pipelines will allow teams to make better and faster decisions and gain a competitive edge.
Read the full blog article at https://www.striim.com/what-is-a-data-pipeline-and-must-have-features-of-modern-data-pipelines/. You can find all of the Striim blog posts at https://www.striim.com/blog/
Should you have any questions about the Striim intelligent data pipeline and the value it provides in supporting real time movement of data, then please don’t hesitate to reach out to us, the Striim team. We would be only too happy to hear from you, anytime and all the time.
Ferhat Hatay, Ph.D.
Sr. Director of Partnerships and Alliances, Striim, Inc.