NonStop Insider


Integrating Cloud Computing Services with NonStop

Canam Software and TIC Software

DanDan


Cloud computing is playing an increasingly key role in organizational IT strategies. Amazon, Microsoft, IBM, Google and others continue to improve their offerings, making a compelling case for using cloud-based resources. As providers continue to add more and more services, new opportunities are presenting themselves. In this article, we will look at some of the products offered by Amazon Web Services (AWS) and how they can be used to introduce big data analytics capabilities for NonStop applications.

To begin from a common starting point, let’s define cloud computing and its advantages.

What is Cloud Computing?

Amazon defines Cloud Computing as “the on-demand delivery of compute power, database storage, applications, and other IT resources through a cloud services platform via the internet with pay-as-you-go pricing.” A quick Google search will yield many definitions for Cloud Computing, but they are essentially the same, with the key concepts being:

There are three main Cloud Computing service models, each representing a different level of control. They are: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS).

Infrastructure as a Service involves renting individual components of computing resources hosted by someone else, such as storage, network endpoints, and virtual machines with a selection of operating systems. One is responsible for assembling and maintaining these components as well as deploying and maintaining one’s software. Examples of this include Microsoft Azure, Amazon Web Services, Google Cloud Platform, and IBM Cloud.

Platform as a Service typically involves renting access to pre-configured virtual machines hosted by someone else. That is, one provides the software that runs in the cloud, and the provider supplies the operating system and virtual hardware to host that software. Examples of this include Heroku, OpenShift, and any Docker-based applications.

Software as a Service involves renting access to a piece of software that someone else hosts on the internet. One pays for access to the service, and the provider takes care of setting up, running and maintaining that service. Examples of this include Salesforce and Dropbox.

Advantages of Cloud Computing

Advantages of Cloud Computing include:

Cloud Computing and Big Data Analytics

Big data analytics is the examination of large volumes of data to detect patterns, trends, customer preferences and other information that can help organizations find ways to increase revenue, provide better customer service, improve operational efficiency and more. Typically, this type of analysis is not done against transactional databases, which are designed for high-performance, transaction-based processing. Instead, data is moved to storage areas designed specifically for querying and analysis. Data lakes, data warehouses and data marts are examples of such storage areas.

A data lake is a storage area that holds large volumes of data in its original format; no processing or formatting is done before the data is loaded. The thinking behind storing data this way is that one never knows how the data will be used in the future, so storing it in raw form keeps all possibilities open. While a data lake stores data in its raw, unstructured form, a data warehouse stores data in a structured format and is modeled for high-performance querying and reporting. Data marts are subsets of a data warehouse geared to a specific functional area. Data lakes and data warehouses/data marts are sometimes considered mutually exclusive approaches to big data analytics; however, this doesn’t have to be the case. A data lake is an excellent way to source data for multiple data warehouses and data marts, so that together they can meet both immediate and future analytic requirements.
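To make the distinction concrete, here is a minimal Python sketch of the same data in all three forms. The record layout, the field names, and the `to_warehouse_row` and `sales_mart` helpers are hypothetical and exist only for illustration; a real implementation would use actual storage services rather than in-memory lists.

```python
import json

# Raw transaction events exactly as they might arrive (hypothetical layout).
# A data lake keeps these lines untouched, even when their fields differ.
raw_lake = [
    '{"txn_id": "T1", "amount": "25.00", "dept": "sales", "ts": "2019-08-01T10:15:00"}',
    '{"txn_id": "T2", "amount": "40.50", "dept": "support", "ts": "2019-08-01T10:16:30", "channel": "web"}',
    '{"txn_id": "T3", "amount": "12.25", "dept": "sales", "ts": "2019-08-02T09:00:00"}',
]


def to_warehouse_row(line):
    """Parse one raw record into the fixed, typed shape a warehouse table expects."""
    rec = json.loads(line)
    return {
        "txn_id": rec["txn_id"],
        "amount": float(rec["amount"]),  # text converted to a number
        "dept": rec["dept"],
        "txn_date": rec["ts"][:10],      # keep only the calendar date
    }


def sales_mart(rows):
    """A data mart: the subset of warehouse rows for one functional area."""
    return [row for row in rows if row["dept"] == "sales"]


# The lake is the single source; the warehouse and mart are derived from it.
warehouse = [to_warehouse_row(line) for line in raw_lake]
mart = sales_mart(warehouse)
total_sales = sum(row["amount"] for row in mart)

print(len(raw_lake), len(warehouse), len(mart), total_sales)
```

Note that the extra `channel` field in the second raw record is dropped by the warehouse schema but remains available in the lake, which is the "keeps all possibilities open" argument in practice.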

Costs for storage, processing power, software, etc. can make implementing a data analytics solution an expensive undertaking. This, together with the need to scale as data volumes grow, makes cloud computing a great option for big data analytics.

AWS and NonStop

Transactional data captured by NonStop applications can be a key source for business analytics; however, the NonStop platform may not be ideal for storing and analyzing the information over a long period of time. Developing and hosting an in-house data analytics solution can be a challenging, and expensive, option. An attractive alternative is AWS, which provides all the cloud-based services needed to easily extend a NonStop application with a scalable, flexible, and cost-effective infrastructure. But how do you get the information from the NonStop to AWS, which AWS services do you use, and how do you use them?

The diagram below shows just one possible approach for integrating data from the NonStop platform with AWS.

[Diagram: one possible approach for integrating NonStop data with AWS]

The AWS services in the above diagram can be split into three categories: collection, storage and analysis.

Collection Services

AWS Direct Connect can be used to establish a dedicated network connection between the NonStop environment and AWS, over which NonStop application data is sent.

Storage Services

AWS provides several storage services. The example above uses Amazon Simple Storage Service (S3) to hold data in its raw form, thus providing an excellent data lake implementation. A concern with data lake implementations is that they can often turn into “data swamps.” This term describes a situation where the stored data cannot be easily queried or used; it can occur when data is simply stored in a data lake without any information about its context (date, source, identifiers, etc.). AWS’ data lake solution addresses this by storing data in packages and tagging each package with metadata. One can define the metadata needed to keep one’s packages organized. Amazon Elasticsearch Service and DynamoDB are used for storing and retrieving these packages. Redshift is a data warehouse service where data can be stored for sophisticated querying and analysis. Data can be loaded from the data lake into one or more data warehouses. Lambda is AWS’ serverless function environment. It can be used to develop event-driven code that receives data from the NonStop, loads it to S3 and stores the metadata in DynamoDB.
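As a rough illustration of the Lambda step, the Python sketch below stores one incoming batch of records as a package in S3 and writes a matching metadata item to DynamoDB. It is a sketch under stated assumptions, not the AWS data lake solution itself: the event shape (`records`, `source`), the bucket name `nonstop-data-lake`, the table name `nonstop-package-metadata`, the key layout and the metadata fields are all hypothetical. The only AWS calls used are boto3's `put_object` (S3 client) and `put_item` (DynamoDB table resource).

```python
import hashlib
import json
from datetime import datetime, timezone


def build_package(payload, source, received_at):
    """Derive the S3 key, body and metadata item for one batch of records."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    package_id = hashlib.sha256(body).hexdigest()[:16]
    # Hypothetical key layout: raw/<source>/<yyyy>/<mm>/<dd>/<package id>.json
    key = f"raw/{source}/{received_at:%Y/%m/%d}/{package_id}.json"
    metadata = {
        "package_id": package_id,
        "source": source,
        "received_at": received_at.isoformat(),
        "s3_key": key,
        "record_count": len(payload),
    }
    return key, body, metadata


def store_package(payload, source, s3_client, metadata_table, bucket, received_at=None):
    """Write the raw package to S3, then record its context in DynamoDB."""
    received_at = received_at or datetime.now(timezone.utc)
    key, body, metadata = build_package(payload, source, received_at)
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    metadata_table.put_item(Item=metadata)
    return metadata


def lambda_handler(event, context):
    """Lambda entry point; assumes the event carries a list under 'records'."""
    import boto3  # imported here so the helpers above can be tested without AWS

    s3_client = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("nonstop-package-metadata")  # hypothetical name
    return store_package(
        event["records"],
        event.get("source", "nonstop"),
        s3_client,
        table,
        bucket="nonstop-data-lake",  # hypothetical name
    )
```

Passing the S3 client and DynamoDB table into `store_package` keeps the storage logic testable with simple stand-in objects. Tagging every package with its source, arrival time and location is what keeps the lake from turning into a data swamp.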

Analyze Services  

AWS provides many analysis services. In the above example, Amazon QuickSight, a business intelligence service for building dashboards and visualizations, is used.

Summary

Integrating AWS with NonStop can provide a scalable, flexible and cost-effective platform for big data analytics.  In our next article we’ll discuss the steps involved in more detail.

 

For further information, please feel free to contact CanAm and/or TIC Software:

John Russell

Canam Software

110 Matheson Blvd.

Ste. 110

Mississauga ON

L5R 4GT

CANADA

289.719.0800

russell@canamsoftware.com

www.canamsoftware.com (Canam Company Site)

 

Ryan Ly

TIC Software

60 Cuttermill Rd.

Ste. 412

Great Neck, NY 11021

516.466.7990

ryan_ly@ticsoftware.com

www.ticsoftware.com (TIC Company Site)

http://blog.ticsoftware.com (TIC Talk Blog)