What is relational learning and why does it matter?

Up to 90% of the time in a data science project goes to manual feature engineering. Relational learning automates this critical step, letting you build better models faster.

#1 about 4 minutes

The challenge of using relational data in machine learning

Machine learning models require fixed-size feature vectors, which is straightforward for images or text but problematic for relational data with one-to-many relationships.
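To make the one-to-many problem concrete, here is a minimal sketch in plain Python. The tables, column names, and the choice of aggregations are hypothetical and only serve as illustration: each customer has a different number of transactions, so the child rows must be collapsed into a vector of fixed length before a model can consume them.

```python
from statistics import mean

# Hypothetical population table: one row per customer, including the target.
customers = [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 1},
]

# Hypothetical peripheral table: a variable number of rows per customer.
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 1, "amount": 5.0},
    {"customer_id": 2, "amount": 80.0},
]


def flatten(customer_id, rows):
    """Collapse a variable number of child rows into a fixed-size vector."""
    amounts = [r["amount"] for r in rows if r["customer_id"] == customer_id]
    # Three hand-picked aggregations: COUNT, SUM, AVG.
    return [len(amounts), sum(amounts), mean(amounts) if amounts else 0.0]


# Every customer now has exactly three features, regardless of row count.
feature_matrix = [flatten(c["customer_id"], transactions) for c in customers]
```

Choosing which aggregations to compute, and over which rows, is exactly the manual decision that the following chapters aim to automate.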

#2 about 5 minutes

Why manual feature engineering is a major bottleneck

Manually creating features through aggregation is a slow, iterative process that requires deep domain expertise and can take weeks of a data scientist's time.

#3 about 1 minute

Introducing relational learning to automate feature creation

Relational learning automates predictive analytics by using a two-step approach where an algorithm first learns features before they are passed to a prediction model.

#4 about 4 minutes

Understanding the brute-force propositionalization approach

Propositionalization automates feature creation by applying a large bag of aggregations to every column, but this method is inefficient and generates many irrelevant features.
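The brute-force idea can be sketched in a few lines of plain Python. This is an illustrative toy, not the implementation used by any particular library; the function name `propositionalize`, the aggregation set, and the sample data are all assumptions made for the example.

```python
from itertools import product
from statistics import mean

# A fixed "bag" of aggregations applied blindly to every column.
AGGREGATIONS = {"count": len, "sum": sum, "avg": mean, "min": min, "max": max}


def propositionalize(parent_ids, child_rows, key, columns):
    """Apply every aggregation to every column for every parent row."""
    grouped = {pid: [] for pid in parent_ids}
    for row in child_rows:
        if row[key] in grouped:
            grouped[row[key]].append(row)

    features = {}
    for pid, rows in grouped.items():
        feats = {}
        for col, (name, fn) in product(columns, AGGREGATIONS.items()):
            values = [r[col] for r in rows]
            # Parents without child rows get a missing value.
            feats[f"{name}({col})"] = fn(values) if values else None
        features[pid] = feats
    return features


# Hypothetical child table with two numerical columns.
orders = [
    {"customer_id": 1, "amount": 20.0, "quantity": 2},
    {"customer_id": 1, "amount": 40.0, "quantity": 1},
    {"customer_id": 2, "amount": 15.0, "quantity": 3},
]

features = propositionalize([1, 2, 3], orders, "customer_id", ["amount", "quantity"])
```

Two columns and five aggregations already yield ten features per customer. The count grows with every additional column, aggregation, and condition, and the target is never consulted, which is why many of the generated features end up irrelevant.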

#5 about 1 minute

Using supervised learning to find the best features

Advanced feature learning algorithms use supervised learning and statistical optimization to intelligently search for the most impactful features, avoiding a brute-force approach.

#6 about 4 minutes

How the MultiRel algorithm builds complex features

The MultiRel algorithm iteratively builds complex features with hierarchical conditions by optimizing a loss function, enabling it to discover subtle and powerful patterns.
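The following toy sketch illustrates the general idea of choosing a feature's condition by optimizing a loss function. It is not the MultiRel algorithm itself: it searches a single threshold for one aggregation, whereas the summary above describes iteratively built, hierarchical conditions. The function names, the squared-error loss, and the sample data are assumptions made for the example.

```python
def sse_after_linear_fit(xs, ys):
    """Sum of squared errors of the least-squares line y ~ a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Constant feature: the best fit is the mean of the target.
        return sum((y - my) ** 2 for y in ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def learn_feature(targets, child_rows, key, value_col, cond_col):
    """Find t so that SUM(value_col) WHERE cond_col <= t minimizes the loss."""
    ids = list(targets)
    ys = [targets[i] for i in ids]
    best = None
    for t in sorted({r[cond_col] for r in child_rows}):
        xs = [
            sum(r[value_col] for r in child_rows if r[key] == i and r[cond_col] <= t)
            for i in ids
        ]
        loss = sse_after_linear_fit(xs, ys)
        if best is None or loss < best[0]:
            best = (loss, t)
    return best


# Hypothetical data: the target equals spending within the last 30 days.
purchases = [
    {"customer_id": 1, "amount": 10, "days_ago": 5},
    {"customer_id": 1, "amount": 100, "days_ago": 90},
    {"customer_id": 2, "amount": 50, "days_ago": 10},
    {"customer_id": 2, "amount": 5, "days_ago": 200},
    {"customer_id": 3, "amount": 30, "days_ago": 30},
    {"customer_id": 3, "amount": 70, "days_ago": 60},
    {"customer_id": 4, "amount": 20, "days_ago": 20},
    {"customer_id": 4, "amount": 40, "days_ago": 45},
]
targets = {1: 10, 2: 50, 3: 30, 4: 20}

best_loss, best_threshold = learn_feature(
    targets, purchases, "customer_id", "amount", "days_ago"
)
```

Because the target guides the search, the sketch recovers the 30-day window on its own instead of generating a feature for every possible window and pruning afterwards.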

#7 about 1 minute

Implementing pipelines with the getML Python API

The getML framework offers a high-performance C++ core with a simple Python API for building end-to-end pipelines that combine feature learners and predictors.

#8 about 2 minutes

Getting started with automated feature engineering tools

Practitioners can start with open-source tools like Featuretools or use highly optimized implementations like the upcoming FastProp algorithm for significant performance gains.

#9 about 5 minutes

Q&A on deep learning and alternative feature methods

The Q&A session clarifies why deep neural networks still require fixed-size inputs and how supervised feature learning is more efficient than brute-force generation followed by pruning.