Data Fabric in Action - How to enhance a Stock Trading App with ML and Data Virtualization
How do you query data across MongoDB, DB2, and flat files with a single SQL statement? See how a data fabric simplifies data access for machine learning.
#1 about 2 minutes
What is a data fabric architecture?
A data fabric is an emerging architecture that integrates disparate data sources across hybrid multi-cloud environments to address distributed data challenges.
#2 about 3 minutes
Common challenges in developing machine learning applications
Developers face significant hurdles in finding, understanding, integrating, and ensuring the quality of data before they can select and deploy ML models.
#3 about 4 minutes
Exploring the components of the IBM Data Fabric
The platform architecture is built on four pillars (collect, organize, analyze, and infuse) and includes key automated services like a data catalog, data virtualization, and privacy controls.
#4 about 3 minutes
Understanding roles and responsibilities in the AI lifecycle
A successful AI project involves collaboration between distinct roles like data engineers, data stewards, data scientists, and developers, each with specific tasks.
#5 about 2 minutes
The platform architecture of IBM Cloud Pak for Data
IBM Cloud Pak for Data is built on Red Hat OpenShift, a Kubernetes-based platform that provides scalability, automated deployments, and a control plane for integrated services.
#6 about 5 minutes
Use case: Enhancing a stock trading app with ML
To reduce customer churn, a stock trading application is enhanced with an ML model to predict churn risk and data virtualization to simplify access to diverse data sources.
#7 about 5 minutes
Demo: Discovering data with the knowledge catalog
The platform's central catalog allows developers to search for data assets, view previews, and understand data context through automatically assigned data classes and business terms.
#8 about 6 minutes
Demo: Building an ML model with the AutoAI experiment
The AutoAI experiment automates model selection by testing multiple algorithms and hyperparameters, presenting a leaderboard to help developers choose and deploy the best model as a REST API.
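The leaderboard idea can be illustrated with a minimal sketch. This is not AutoAI's implementation or API: the toy data, the threshold rule, and every name below (`TRAIN`, `HOLDOUT`, `run_experiment`) are hypothetical. It only shows the general pattern of scoring several candidate configurations on held-out data and ranking them.

```python
# Hypothetical toy data: (trades in the last 30 days, churned? 1 = yes, 0 = no).
TRAIN = [(0, 1), (1, 1), (2, 1), (8, 0), (10, 0), (15, 0)]
HOLDOUT = [(1, 1), (3, 1), (9, 0), (12, 0)]


def make_threshold_model(threshold):
    """Candidate rule: predict churn when trading activity is below a threshold."""
    return lambda trades: 1 if trades < threshold else 0


def make_majority_model(train):
    """Baseline: always predict the most common label in the training data."""
    positives = sum(label for _, label in train)
    majority = 1 if positives * 2 >= len(train) else 0
    return lambda trades: majority


def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def run_experiment():
    """Score every candidate on the holdout set and rank them, best first."""
    candidates = {"majority_baseline": make_majority_model(TRAIN)}
    # The threshold plays the role of a hyperparameter being searched.
    for threshold in (2, 5, 11):
        candidates[f"threshold<{threshold}"] = make_threshold_model(threshold)
    leaderboard = [(name, accuracy(model, HOLDOUT)) for name, model in candidates.items()]
    leaderboard.sort(key=lambda entry: entry[1], reverse=True)
    return leaderboard


if __name__ == "__main__":
    for rank, (name, score) in enumerate(run_experiment(), start=1):
        print(rank, name, score)
```

The top entry of such a ranking is the candidate a developer would then inspect and deploy. A real experiment would also compare different algorithm families and use cross-validation rather than a single small holdout set.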
#9 about 8 minutes
Demo: Setting up real-time data virtualization
Data virtualization allows connecting to heterogeneous sources like MongoDB and relational databases, presenting them as standard SQL tables that can be visually joined and queried in real time.
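To make the "one SQL statement across several sources" idea concrete, here is a small sketch that uses Python's built-in `sqlite3` as a stand-in for the virtualization layer. It is an assumption-laden illustration, not the IBM service: a real virtualization layer queries the sources in place, whereas this sketch copies hypothetical sample records into an in-memory database. All table names, column names, and sample values are invented.

```python
import csv
import io
import sqlite3

# Hypothetical stand-ins for three heterogeneous sources.
mongo_customers = [  # document-store style records with a nested field
    {"_id": 1, "name": "Ada", "profile": {"segment": "retail"}},
    {"_id": 2, "name": "Grace", "profile": {"segment": "premium"}},
]
db2_accounts = [(1, 12000.0), (2, 87000.0)]  # relational rows: (customer_id, balance)
trades_csv = "customer_id,trades_last_30d\n1,0\n2,14\n"  # flat file contents


def build_virtual_layer():
    """Expose every source as an ordinary SQL table (copied here, for simplicity)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, segment TEXT)")
    conn.execute("CREATE TABLE accounts (customer_id INTEGER, balance REAL)")
    conn.execute("CREATE TABLE activity (customer_id INTEGER, trades_last_30d INTEGER)")
    # Flatten the nested documents into rows.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(d["_id"], d["name"], d["profile"]["segment"]) for d in mongo_customers],
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", db2_accounts)
    rows = csv.DictReader(io.StringIO(trades_csv))
    conn.executemany(
        "INSERT INTO activity VALUES (?, ?)",
        [(int(r["customer_id"]), int(r["trades_last_30d"])) for r in rows],
    )
    return conn


# A single SQL statement joining all three "virtual" tables.
QUERY = """
SELECT c.name, c.segment, a.balance, t.trades_last_30d
FROM customers AS c
JOIN accounts AS a ON a.customer_id = c.id
JOIN activity AS t ON t.customer_id = c.id
ORDER BY c.id
"""


def query_all_sources():
    conn = build_virtual_layer()
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in query_all_sources():
        print(row)
```

The consumer of the joined result, for example a feature-preparation step for the churn model, only sees SQL tables and does not need to know which backend each column came from.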
#10 about 6 minutes
Q&A: Sharing assets and training data limits
The discussion covers how users can publish their own refined data assets to the catalog, considerations for training data size, and API connectivity for native mobile applications.