Code review

From AtoM wiki
Revision as of 15:29, 21 May 2020 by Dan (talk | contribs) (Getting started)

Main Page > Development > Development/Code review

Code review is essentially 2 or more developers discussing changes to the underlying source code of the application - which could be for the implementation of a bug fix or new feature, a refactor of some working code to optimize it, and so forth. Think of it as peer review for developers.

Every new feature and bugfix to AtoM is expected to go through code review before inclusion. This applies both to developers at Artefactual and to outside contributors.

Why code review?

Code review has a number of advantages, all of which help the AtoM project in the long term. Some of these include:

  • Code quality: Getting more than one set of eyes on the project can ensure that no (or less) bugs slip through, that security vulnerabilities are found before they are a problem, and that an easier or more elegant solution is not overlooked.
  • Code consistency: This is about 2 things: 1) consistent code is easier to maintain over time, and 2) it's easier for someone new to the project to learn. As AtoM grows, it becomes difficult for one person to have eyes on all of the corners of the application's code base. Code review can help ensure that consistent methods are used throughout the application, and deviations caught in advance.
  • Awareness: Again, because AtoM is a large application. As changes are introduced, it's helpful to get more eyes on what's changing, and make sure that more of the developers responsible for maintaining AtoM are familiar with how the application is evolving with each change.
  • Knowledge sharing: For both our team, and our community contributors, there is always the opportunity to learn from our colleagues. Getting a second opinion can be a wonderful opportunity to learn something new!

Seealso

Many of the bullet points above have been adapted from this post about code review. There are a lot of other compelling arguments out there for why code review is important as well - here's a few:

Getting started

So you have something to contribute to the AtoM project. Great! Thanks for sharing with the community!

Here's an outline of the code submission and review process:

  1. Fork the Artefactual project on GitHub, and commit your changes to a branch.
  2. Open a pull request.
  3. Back and forth discussion with developers on the branch.
  4. Make any changes suggested by reviewers.
  5. Repeat 3 and 4 as necessary.
  6. Clean up commit history, if necessary.
  7. Your branch will be merged!

Artefactual uses GitHub's pull request feature for code review. Every change being submitted to an Artefactual project should be submitted as a pull request to the appropriate repository. A branch being submitted for code review should contain commits covering a related section of code. Try not to bundle unrelated changes together in one branch; it makes review harder.

If you're not familiar with forking repositories and creating branches in GitHub, consult their guide.

Seealso

When to submit code for review?

Your code doesn't have to be ready yet to submit for code review! You should submit a branch for code review as soon as you want to get feedback on it. Sometimes, that means submitting a feature-complete branch, but sometimes that means submitting an early work-in-progress in order to get feedback on direction. Don't be shy about submitting early.

Tip

If you are submitting a draft pull request that's a work in progress, just to get some early feedback, we recommend prefacing your pull request title with "WIP:", so others can easily tell that it is intended as a work in progress and not the final form.

Opening the pull request

GitHub has an excellent guide on using the pull request feature.

Seealso

Discussion

Discussion on pull requests is usually a back and forth process. Don't feel like you have to make every change the reviewer suggests; the pull request is a great place to have in-depth conversation on the issue.

Do make use of GitHub's line comment feature!

An example of line commenting during code review

By viewing the "Files changed", you can leave a comment on any line of the diff. This is a great way to scope discussion to a particular part of a diff. Any discussion you have in a specific part of the diff will also be automatically hidden once a change is pushed that addresses it, which helps keep the pull request page clear and readable.

When leaving line comments, make sure you leave them on the diff - not on the page for an individual commit. Any comments left on the commit will disappear from the pull request page as soon as that commit is no longer in the branch, even if a newer revision of that commit is.

Anyone can participate in code review discussions. Feel free to jump in if you have something to contribute on another pull request, even if you're not the one who opened it.

What do we look for?

It's hard to sum up all the considerations that might come into play during code review, but the reasons why we do it (listed above) help to indicate what we look for as well. Is the code the best that it could be? Are there any security vulnerabilities? Any bugs? Is it consistent with how we implement other similar elements in AtoM, and with AtoM's overall design? Is it going to be difficult to maintain?

Additionally, we want to make sure that code conforms to our coding standard guidelines.

Seealso

Kevin London has some great suggestions for what he looks for listed in this post on his blog - many of which are transferable to the AtoM project:

Cleaning up commit history

Once code review is finished or nearly finished, and no further development is planned on the branch, the branch's commit history should be cleaned up. You can alter the commit history of a branch using git's powerful interactive rebase feature. The following few criteria help outline what makes for a clean commit history:

Commits should be specific and atomic

Any single commit should cover a specific change. That change might span multiple files, but it shouldn't introduce multiple distinct functional changes unless it's impossible to separate those changes from each other. It's okay (and normal!) for a branch that makes a lot of changes to consist of many small commits, each of which makes individual small changes. This makes it easier to revert individual commits from a branch, and also makes it easier to isolate the commit that caused a particular bug with git-blame.

In development, it's common to introduce a change with a commit and then introduce multiple other commits with small fixes for that change. When getting ready for merging, those commits should all be squashed together.

Every commit should work

No individual commit should put the software in a broken state. It's fine for code to go unused (because the feature it's been introduced for isn't there yet), but no new functional issues should be introduced by a given commit. This ensures that reverting any individual commit from the branch is safe, and that git-bisect stays reliable for tracking down bugs.

Commit summaries should be concise

The commit summary (the first line of a commit message) should be short and clear. Usually, the first line of the commit message should be no more than 50 characters. Since the summary is just a summary of the change, it should be readable at a glance when looking through git history. The first line should be followed by a blank line, with following lines no more than 72 characters. This Github blog post summarizes why this commit summary format is important.

A good commit summary makes it clear a) what part of the code is changing, and b) what the change is. For example:

Clear commit summary:

     normalize: fall back to default rule for unidentified files

Unclear commit summaries:

     Fixed some normalization bugs
     Bugfixes

The unclear messages make it hard to tell at a glance what changed, and that makes browsing the commit history harder.

Commit messages should be as detailed as they need to be (and no more)

The commit message is the rest of the commit past the first line. If a commit makes a small and obvious change, it's fine to not even have a commit message past the summary.

The commit message is your place to clarify the justification for a change. While there's no need to rehash anything that code comments already say, if there's more detail that helps a reader understand why a change was made, be as verbose as you need to! Remember: future-you (or another developer) will read this when going through the commit history to understand why a change was made. Make their life easier.