Is my Feature Done Yet? - The Story of our Definition of Done

Posted on 05/16/2018 by Noha Khater

As developers, we can all relate to that sense of accomplishment when a feature is finally done. You’ve spent a lot of time planning the implementation, ensuring that all cases and any imaginable scenarios are handled, and that the feature functions as intended. You’ve checked your code into the shared code repository and now you’re done! Or are you?

»Is my feature done?« is actually a vague question with no definitive answer. To answer it, each team needs to define its own Definition of Done (DoD). The Definition of Done is a checklist of activities or conditions that must be completed before a software product or a development task is considered done. Examples of these activities are: writing code, code comments, unit testing, integration testing, release notes, design documents, etc. Following the DoD ensures that features are truly done, not only in terms of functionality, but also in terms of quality.

Creating our DoD

The first version of our DoD was created almost a year ago. By that time, we already had an established, shared understanding of which criteria needed to be met for a development task to be done. Nevertheless, these criteria were never explicitly stated or widely communicated. This made it difficult to establish accountability and to ensure that the agreed-upon process was actually followed on a team-wide scale. The team therefore transformed these criteria into a set of well-formulated statements that would eventually comprise our own DoD document. The main criteria in our DoD are as follows:

  • The code compiles without errors or warnings
  • The code does not produce errors or warnings in the IDE
  • The code is checked into the version control system
  • The code adheres to our Coding Guidelines
  • The build is successful and all previously existing automated tests have passed
  • The code does not generate new Teamscale findings
  • Already existing Teamscale findings in changed code are addressed
  • The code is verifiably tested (either through manual or automated tests)
  • There are no non-trivial test gaps
  • The code has been peer-reviewed and all review comments addressed
  • The Teamscale product documentation (User Guide, changelog, etc.) has been updated as needed

The list includes criteria that are present in almost any DoD, such as checking code into the repository, having a project that builds without errors, and writing tests. However, there are also criteria that depend on the underlying infrastructure, the tools used, and the team structure. At CQSE, we use Teamscale for developing Teamscale, and accordingly, Teamscale is an integral part of our DoD. We use it to monitor the status of newly introduced and already existing findings for each feature, and to verify that a feature has been well tested and that there are no non-trivial test gaps. The screenshot below shows the issue view for a feature that is currently under development. In Teamscale's issue view you can see the Test Gap Treemap for the issue's changed files. In this case, there is a test gap of 85.71%, which indicates that the feature might not have been properly tested yet and that not enough tests have been written to mark the code as verifiably tested according to the DoD.

Issue view for an ongoing feature

Enforcing our DoD

Despite having a well-defined DoD document, it became evident over time that the DoD was not strictly followed for every single feature. Our DoD was merely a document on our shared Google Drive that could easily be overlooked and forgotten. So it became quite apparent that the DoD needed to be integrated into our workflow.

We embedded the DoD checklist into the merge requests in GitLab, making the criteria always visible and part of the team's natural workflow. Before submitting a feature for review, the developer must ensure that all criteria are met. The list is then double-checked by the reviewer, whose task is to check off every item on the list before merging the feature. That way, the DoD can no longer be ignored.

DoD embedded in the GitLab merge request
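A checklist like ours can be embedded in every merge request automatically by committing a description template to the repository: GitLab picks up Markdown files under `.gitlab/merge_request_templates/` and offers them when a merge request is created. The sketch below is a hypothetical, abbreviated template (not our exact one) that mirrors the DoD criteria listed above:

```markdown
<!-- .gitlab/merge_request_templates/Definition_of_Done.md -->
## Summary

(Describe what this merge request changes and why.)

## Definition of Done

- [ ] Code compiles without errors or warnings
- [ ] Code is checked into version control and the build passes
- [ ] Code adheres to our Coding Guidelines
- [ ] No new Teamscale findings; existing findings in changed code addressed
- [ ] Code is verifiably tested; no non-trivial test gaps
- [ ] Peer review completed and all review comments addressed
- [ ] Product documentation (User Guide, changelog, etc.) updated as needed
```

Each `- [ ]` renders as a checkbox that the developer and reviewer can tick directly in the merge request, so an unmet criterion is immediately visible to everyone watching the request.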

Lessons Learned

The DoD is not static; it continuously evolves over time. It is crucial for teams to regularly revisit the DoD, review it, and make any necessary amendments. In the first retrospective meeting after embedding the DoD in our workflow, the development team collected feedback on how the DoD influenced productivity (whether positively or negatively), which criteria were especially helpful, and which criteria were considered impediments. These were the main conclusions and lessons learned:

Accountability and a common understanding of the criteria were established: The DoD communicates clearly and explicitly which activities and quality criteria are expected to be met for every feature. As a result, the developers now share a common understanding of the DoD checklist, and individual accountability has been established. There is no longer any confusion about the team members' roles and responsibilities.

Developers proactively reduced test gaps: Developers reported that, due to the testing-related criteria and the DoD's emphasis on not creating new test gaps, they became more aware of the existing test gaps in their assigned features and put more effort into writing quality tests and ensuring that their features are properly tested. In the screenshot below, you can see a comparison between the test gaps for Release 4.0 (before enforcing the DoD) and Release 4.1 (after enforcing the DoD). The test gap percentage has indeed decreased from 37.58% to 27.58% after the DoD was enforced and integrated into the workflow. While we cannot verify that this decrease in test gaps is a direct consequence of the DoD, these observations nevertheless support the developers' reports.

Comparing test gaps between releases 4.0 and 4.1

Teamscale areas of improvement were identified: As previously mentioned, we use Teamscale during development, as it allows us to monitor and control our code quality, detect bugs, and identify which additional functionality or features would be useful for Teamscale itself. Integrating Teamscale into our DoD has helped greatly in compiling a list of Teamscale impediments that hinder fulfilling the DoD criteria, as well as in brainstorming which areas of Teamscale could be improved to support the team's delivery. One example: the developers reported that it was difficult to ensure that there were no non-trivial test gaps in JavaScript code, because we do not yet measure test coverage for all its areas. Accordingly, the team planned steps to address this problem. Thus, the DoD has proven quite useful for continuously improving our processes, infrastructure, tools, and product.

If you do not have a DoD yet, I urge you to create one and apply it. I hope that sharing our experience with our DoD, and with using Teamscale to implement it, benefits your team or provides you with ideas applicable to your own context. And as always, if you have any ideas or suggestions you would like to share, let me know in the comments below!
