Great take on code reviews!
And a brilliant example of a mental model of inertia.
Thanks for linking to Perspectiveship, Anton! I appreciate it!
Thank you Michal! :)
Love the breakdown of all the data! You've shown something similar to what's described in the "Accelerate" book: there doesn't need to be a trade-off between speed and quality, as long as your PR turnaround is quick.
Great article explaining how to balance velocity with quality.
I have always believed that each team has a style that suits it. There is a balance of trust and culture. A good PR review is one where people focus on giving meaningful feedback. With AI, engineers are empowered to gather the first round of feedback and apply the changes.
If anyone has some unique examples to share of high velocity teams, let's connect!
Curious to know, Anton, what’s your take on AI code reviews? Do you consider them in the “no code review” category? In my company, as in many organizations, AI-driven automated code reviews are now becoming the norm, and we’re able to move much faster.
It's a tricky one. I definitely believe there is a lot of potential; the hard part is finding the optimal way to keep humans in the loop.
I really like Hamming's approach: using multiple AI PR tools, an internal tool to triage the comments, and summarizing it all for an LLM to fix. He talked a bit about it in last week's article (planning a follow-up!):
https://zaidesanton.substack.com/p/what-a-10x-team-looks-like
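For a rough feel of the shape, something like this is what I imagine (a minimal sketch; the comment schema, triage heuristic, and prompt format are all my guesses, not Hamming's actual tooling):

```python
from dataclasses import dataclass

# Entirely illustrative: none of this is Hamming's real implementation,
# just a guess at what "multiple tools -> triage -> one fix-it prompt"
# could look like.

@dataclass
class ReviewComment:
    tool: str       # which AI reviewer produced it
    file: str
    line: int
    text: str
    severity: str   # e.g. "nit", "warning", "bug"

def triage(comments: list[ReviewComment]) -> list[ReviewComment]:
    """Drop nits and de-duplicate findings that several tools raised
    on the same line, preferring the higher-severity one."""
    seen: set[tuple[str, int]] = set()
    kept: list[ReviewComment] = []
    # Sort so "bug"-severity comments win the de-duplication.
    for c in sorted(comments, key=lambda c: c.severity != "bug"):
        if c.severity == "nit" or (c.file, c.line) in seen:
            continue
        seen.add((c.file, c.line))
        kept.append(c)
    return kept

def summarize_for_fixer(comments: list[ReviewComment]) -> str:
    """Collapse the triaged comments into a single prompt for an LLM to fix."""
    findings = "\n".join(
        f"- {c.file}:{c.line} [{c.severity}] {c.text} (via {c.tool})"
        for c in comments
    )
    return f"Please address the following review findings:\n{findings}"
```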
Interesting experiment!
I appreciate the part about the quality of reviews. I always recommend "How to Do Code Reviews Like a Human" by Michael Lynch (https://mtlynch.io/human-code-reviews-1/) to my engineers, because I believe it's important to start doing good code reviews as soon as possible, but not ones so in-depth that the main developer can't learn anything from them, or that end up significantly impacting the team's velocity.
I absolutely love the part about code review karma: the better reviews you give, the better ones you receive.
That's something I've anecdotally observed in teams, but I've always been reluctant to draw too many conclusions from sparse data points.
Interesting analysis! There's something I don't get, though: you said "Teams that didn’t perform code reviews had a x1.9 times higher output (~59 expert hours/developer), but x2.4 times more bugs! (8.9/developer)."
However, you count bugs by looking at PRs, not at JIRA. Meaning, you are looking at bugs solved, not at bugs created. Then, you are saying that in a company where they have higher output (i.e. more features), they also have more bugs solved. Well, that doesn't mean they are creating more bugs! They are just doing more of everything.
I think you should normalize the bugs solved, taking the higher output into account. So for example, if due to lack of code reviews, I manage to get 1.9x more code in, then assuming I work on features and bugs 50%-50% (a big assumption!), I also solve 1.9x more bugs. So, if you see 2.4x bugs, that means due to lack of code reviews, the bugs increased from 1.9x to 2.4x, and not from 1x to 2.4x. That's a 1.3x increase.
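To make the arithmetic concrete, here's the back-of-the-envelope version (the 50/50 split is my assumption, as noted):

```python
# Back-of-the-envelope normalization of the bug numbers from the article.
output_ratio = 1.9  # no-review teams shipped ~1.9x more output
bug_ratio = 2.4     # ...and solved ~2.4x more bugs

# If bugs simply scaled with output, 1.9x more bugs would be expected.
# Only the excess beyond that is the quality cost of skipping reviews.
normalized = bug_ratio / output_ratio
print(f"~{normalized - 1:.0%} more bugs per unit of output")  # -> ~26%
```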
In reality, I don't work on features and bugs 50%-50%. It really depends on the culture. In some companies, you have a dedicated time for bug fixing (e.g. 1 week every month). But assuming all bugs are being resolved is always wrong. You can assume all major bugs are resolved, but other than that, it really depends. So maybe the correct way to analyze this would be to only look at PRs for major bugs. WDYT?
Regarding the increase - I agree, it's relative. I wanted to show that the number of bugs increased more than the output did. As I mentioned at the end, it's a relative 25% increase in bugs.
Not sure I got the point about looking only at PRs for major bugs :)
Ah, I see, I didn't realize that 25% was after the normalization. Thanks! Great analysis!
I agree with this.
Fascinating analysis! The correlation between review speed and productivity is something every team needs to see.