2.3. Part 3: Anchor Problems

Throughout this book, we will return to a small set of problems and explore them at different stages of the data science lifecycle. These recurring examples connect concepts across chapters, so it is worth understanding them well before moving on.

2.3.1. Problem 1: Spotify Track Popularity Prediction

The goal of this problem is to design a system that can predict the popularity a song might achieve based on its characteristics.

Think about the different factors that may or may not influence a song’s popularity. Could the name of the song matter? What about the artist, genre, or audio features? Feel free to propose creative and intuitive ideas.
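
To make the task concrete, here is a minimal sketch of the simplest predictor we could try: guess a song's popularity from the average popularity of its genre. The toy data, the 0 to 100 popularity scale, and the function names `fit_genre_baseline` and `predict` are all assumptions made for illustration, not part of any real dataset or API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (genre, popularity on an assumed 0-100 scale).
tracks = [
    ("pop", 80), ("pop", 70),
    ("rock", 50), ("rock", 60),
    ("jazz", 30),
]

def fit_genre_baseline(rows):
    """Compute the mean popularity per genre plus an overall fallback mean."""
    by_genre = defaultdict(list)
    for genre, popularity in rows:
        by_genre[genre].append(popularity)
    genre_means = {genre: mean(values) for genre, values in by_genre.items()}
    overall_mean = mean([popularity for _, popularity in rows])
    return genre_means, overall_mean

def predict(model, genre):
    """Predict the genre's mean, or the overall mean for an unseen genre."""
    genre_means, overall_mean = model
    return genre_means.get(genre, overall_mean)

model = fit_genre_baseline(tracks)
print(predict(model, "pop"))    # mean of the two pop tracks
print(predict(model, "metal"))  # unseen genre, falls back to the overall mean
```

A baseline like this uses only one factor. Each additional idea you brainstorm (artist, song name, audio features) is a candidate input that a better model might exploit.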

2.3.2. Problem 2: Netflix Movie Recommender

As the name suggests, this problem focuses on building a recommendation system. The objective is to recommend movies that are similar to those a user already likes.

Consider what factors might influence why people enjoy similar movies. Is it the director, the actors, the genre, or the style of storytelling? Explore different possibilities and assumptions.
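
One way to make "similar" precise is to describe each movie as a vector of features and compare vectors with cosine similarity. The sketch below assumes hypothetical movies described only by genre flags. The titles, the feature layout, and the `recommend` helper are illustrative assumptions, and a real recommender would draw on far richer signals.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a movie with no features is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical catalog; features are [action, comedy, drama, sci-fi] flags.
movies = {
    "Movie A": [1, 0, 0, 1],
    "Movie B": [1, 0, 1, 1],
    "Movie C": [0, 1, 1, 0],
    "Movie D": [0, 1, 0, 0],
}

def recommend(liked_title, catalog, k=1):
    """Return the k titles most similar to a movie the user already likes."""
    liked = catalog[liked_title]
    scored = [
        (cosine_similarity(liked, features), title)
        for title, features in catalog.items()
        if title != liked_title
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]

print(recommend("Movie A", movies, k=1))
```

Here the choice of features does all the work: swapping genre flags for director, cast, or storytelling style changes which movies count as similar.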

2.3.3. Problem 3: Credit Card Fraud Detection

In this problem, we will build a real-world application of the kind widely deployed by banks and financial institutions. The task is to determine whether a transaction is fraudulent or legitimate.

Think about the signals and patterns that might indicate fraud, such as unusual spending behavior, transaction location, or timing.
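
As an illustration of one such signal, the sketch below flags a transaction whose amount is far from a cardholder's usual spending, measured as a z-score against their past amounts. The function name `is_unusual_amount`, the threshold of 3 standard deviations, and the sample history are assumptions chosen for illustration. A real system would combine many signals rather than rely on a single rule.

```python
from statistics import mean, stdev

def is_unusual_amount(history, amount, threshold=3.0):
    """Flag an amount more than `threshold` standard deviations from the mean.

    `history` is a list of the cardholder's past transaction amounts.
    The default threshold is an assumption, not an industry standard.
    """
    if len(history) < 2:
        return False  # too little history to estimate typical spending
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past amount was identical
    z_score = abs(amount - mu) / sigma
    return z_score > threshold

# Hypothetical past amounts for one cardholder.
past_amounts = [20.0, 25.0, 22.0, 30.0, 18.0]

print(is_unusual_amount(past_amounts, 500.0))  # far above typical spending
print(is_unusual_amount(past_amounts, 24.0))   # within the usual range
```

The same pattern extends to other signals mentioned above: replace the amount with distance from the usual location or with the hour of day, and compare against the cardholder's history.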


For all three problems, revisit the framework described in the Novel Framework section. Apply it to each problem to deepen your understanding and to generate novel, insightful ideas.