Agile for Data Science

4 min read · Dec 8, 2017

This post collects ideas and work from others on Agile Data Science, combined with my thoughts on how to apply the essence of agile methodology to data science work. Agile, as applied to software and process, is an extension of the scientific method: it emphasizes a structured approach based on hypothesis, observation and learning.

Three concepts capture the essence of agile: feature definition, user feedback and iteration. Let’s apply these concepts to data science work:

Hypothesis formulation is the equivalent of feature-driven development in software
Well-designed experimentation, with iteration based on new information or feature selection, is similar to test-driven development and implementation
Retrospectives and peer review are highly valuable to data science work
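
One way to picture the first two points is to write a hypothesis down as an executable check before running the experiment, much as a test is written before the code in test-driven development. The following is a minimal sketch, not a prescribed method: the `Hypothesis` class, the `evaluate` function, the `min_lift` threshold and the scores are all hypothetical names and numbers made up for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class Hypothesis:
    """A hypothesis stated up front, like a test written before the code."""
    statement: str
    min_lift: float  # smallest improvement over the baseline that counts as support


def evaluate(hypothesis: Hypothesis,
             baseline_scores: Sequence[float],
             candidate_scores: Sequence[float]) -> bool:
    """Return True if the candidate's mean score beats the baseline's by at least min_lift."""
    lift = mean(candidate_scores) - mean(baseline_scores)
    return lift >= hypothesis.min_lift


# Made-up scores from one iteration (for example, accuracy per validation fold).
h = Hypothesis("Adding a recency feature improves accuracy", min_lift=0.02)
baseline = [0.71, 0.73, 0.72]
candidate = [0.75, 0.76, 0.74]

supported = evaluate(h, baseline, candidate)
print(f"{h.statement}: {'supported' if supported else 'not supported'}")
```

Each iteration then ends with a clear outcome to bring to a retrospective or peer review: the hypothesis was supported, or it was not and the next one needs to be formulated. A real project would likely replace the simple mean comparison with a proper statistical test.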

Why does it matter? In his post on this topic, John Akred nicely captures the value of agile to data science:

By using agile data science methods, we help data teams do fast and directed work, and manage the inherent uncertainty of data science and application development.

In a post titled Agile Data Science, Wacław Kuśnierczyk writes that Agile Data Science means focusing on efficiency, creating minimum viable products (MVPs) based on research, and preferring simple models over elaborate ones.

Manifestos are a good way to spark discussion and encourage good practices. For instance, here are the four values of the Manifesto for Agile Software Development, which also offer value for data science:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

Russell Jurney has written a book on Agile Data Science and created a seven-point manifesto. Check it out and let me know what you think.

Two things in Jurney's manifesto caught my eye. The first is “finding a critical path”, which the manifesto describes as follows:

Analytics product development is the search for and pursuit of a moving goal. Once a goal is determined, for instance a prediction to be made, then we must find the critical path to its implementation and, if it proves valuable, to its improvement.

Written by Babar M Bhatti