You Too Should Be Data-proficient
As a PM, one of the most important skills that I have learned in the past couple of years is the ability to collect, query and analyze data. No, really - data fascinates me almost more than any other part of the product. Data can inform future decisions and either validate or invalidate your hypotheses about the direction of your work.
Before joining Microsoft, I always thought that working with data was something that only data scientists and analysts did - a PM sets out the path for the product, the data science team provides the numbers and insights, and then engineering drives the implementation. What I’ve noticed, however, is that data scientists are much scarcer than I expected, and I had more features to ship that needed data analysis than there were data team members available to do it. The solution? Learn how to work with data myself - and I am here to tell you that you should learn the basics (and way beyond the basics) too. After all, I am a very vocal proponent of data democratization within teams.
Here are some of the skills that I found to be useful:
- Querying data. The data store does not matter as much, so focus on the theory as much as you would on the practice. Being able to think through a query and formalize what data you want to get out of a store helps tremendously in being able to dig through existing numbers.
- Looking for the right things. Telemetry comes in different shapes and sizes, and oftentimes contains a lot of irrelevant information. Being able to clearly articulate which data is impactful and which is noise helps identify the true state of affairs when it comes to a product or a feature.
- Finding anomalies. Not all data is always accurate. It’s key that when looking at various metrics one spends enough time to understand if there are any outliers or trends that look off. Learn more about anomaly detection.
- Understanding that correlation is not causation. Just because several things seem to be changing as if they are connected does not mean that one causes the other. Read more on Wikipedia.
- Learning the tools, algorithms and libraries. There is so much out there already implemented that will help you get things done. For example, recently I needed to run the Apriori algorithm over a data set. I could implement it myself, but I ended up using a pre-cooked library. Knowing the theory and the needs behind it helped me determine what I needed to look for to get the outcome I wanted from the data.
- Identifying vanity metrics. Some things tell you nothing about the real performance of a feature or a product. It’s important to not get sidetracked by those - the rose-colored glasses will eventually wear off. Read more on what constitutes a vanity metric.
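To make the anomaly-finding point a bit more concrete, here is a minimal Python sketch of one common heuristic, the interquartile range (IQR) rule, which flags values that sit far outside the middle half of the data. The daily user counts are made up for illustration, `find_outliers` is a hypothetical helper name, and the `k=1.5` multiplier is a conventional default rather than a recommendation for any particular data set.

```python
from statistics import quantiles


def find_outliers(values, k=1.5):
    """Return (index, value) pairs that fall outside the IQR fences."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Made-up daily active user counts; one day looks suspicious.
daily_users = [1020, 1045, 998, 1010, 1033, 4200, 1027, 1008, 1015, 1041]

for day, count in find_outliers(daily_users):
    print(f"Day {day}: {count} looks off - worth a closer look")
```

A flagged value is only a prompt to dig deeper. It could be a telemetry bug, a bot, or a real spike from a launch, and the numbers alone will not tell you which.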
I got myself into a habit where every day I spend some time analyzing the data and learning more about the available tools that can make me more data-driven. This makes a lot of feature planning decisions really easy - I combine customer interviews with information on past and projected performance, and then there is very little to argue about the potential impact. It does not matter that I am not a data scientist by trade - being able to work with data became a key component of my PM responsibilities, and I am loving it.
Whether you are an engineer, PM, designer or researcher - I would encourage you to carve out some time to learn about the data tools and stores that your team works with. Once you get an idea of how you can extract insights, you will be able to operate at a much higher velocity. Being better versed in data does not mean that you don’t need a data science team - they are still a major factor in your product’s success, as they operate at a much higher scale. It does, however, let you quickly understand how your users are interacting with a product and what you can do better to address their needs.
For additional reading, I recommend checking out “Your Team Doesn’t Need A Data Scientist For Simple Analytics” (link no longer active).