Google Testing Blog

Don't DRY Your Code Prematurely

Tuesday, May 28, 2024

This is another post in our Code Health series. A version of this post originally appeared in Google bathrooms worldwide as a Google Testing on the Toilet episode. You can download a printer-friendly version to display in your office.

By Dan Maksimovich

Many of us have been told the virtues of “Don’t Repeat Yourself” or DRY. Pause and consider: Is the duplication truly redundant or will the functionality need to evolve independently over time? Applying DRY principles too rigidly leads to premature abstractions that make future changes more complex than necessary.

Consider carefully if code is truly redundant or just superficially similar. While functions or classes may look the same, they may also serve different contexts and business requirements that evolve differently over time. Think about how the functions’ purpose holds with time, not just about making the code shorter. When designing abstractions, do not prematurely couple behaviors that may evolve separately in the longer term.

When does introducing an abstraction harm our code? Let’s consider the following code:

# Premature DRY abstraction assuming # uniform rules, limiting entity-

# specific changes.

class DeadlineSetter:

def __init__(self, entity_type):

self.entity_type = entity_type

def set_deadline(self, deadline):

if deadline <= datetime.now():

raise ValueError(

“Date must be in the future”)

task = DeadlineSetter(“task”)

task.set_deadline(

datetime(2024, 3, 12))

payment = DeadlineSetter(“payment”)

payment.set_deadline(

datetime(2024, 3, 18))

# Repetitive but allows for clear,

# entity-specific logic and future

# changes.

def set_task_deadline(task_deadline):

if task_deadline <= datetime.now():

raise ValueError(

“Date must be in the future”)

def set_payment_deadline( payment_deadline):

if payment_deadline <= datetime.now():

raise ValueError(

“Date must be in the future”)

set_task_deadline(

datetime(2024, 3, 12))

set_payment_deadline(

datetime(2024, 3, 18))

The approach on the right seems to violate the DRY principle since the ValueError checks are coincidentally the same. However, tasks and payments represent distinct concepts with potentially diverging logic. If payment date later required a new validation, you could easily add it to the right-hand code; adding it to the left-hand code is much more invasive.

When in doubt, keep behaviors separate until enough common patterns emerge over time that justify the coupling. On a small scale, managing duplication can be simpler than resolving a premature abstraction’s complexity. In early stages of development, tolerate a little duplication and wait to abstract.

Future requirements are often unpredictable. Think about the “You Aren’t Gonna Need It” or YAGNI principle. Either the duplication will prove to be a nonissue, or with time, it will clearly indicate the need for a well-considered abstraction.

No comments

Avoid the Long Parameter List

Monday, May 20, 2024

By Gene Volovich

Have you seen code like this?

void transform(String fileIn, String fileOut, String separatorIn, String separatorOut);

This seems simple enough, but it can be difficult to remember the parameter ordering. It gets worse if you add more parameters (e.g., to specify the encoding, or to email the resulting file):

void transform(String fileIn, String fileOut, String separatorIn, String separatorOut,

String encoding, String mailTo, String mailSubject, String mailTemplate);

To make the change, will you add another (overloaded) transform method? Or add more parameters to the existing method, and update every single call to transform? Neither seems satisfactory.

One solution is to encapsulate groups of the parameters into meaningful objects. The CsvFile class used here is a “value object” — simply a holder for the data.

class CsvFile {

CsvFile(String filename, String separator, String encoding) { ... }

String filename() { return filename; }

String separator() { return separator; }

String encoding() { return encoding; }

} // ... and do the same for the EmailMessage class

void transform(CsvFile src, CsvFile target, EmailMessage resultMsg) { ... }

How to define a value object varies by language. For example, in Java, you can use a record class, which is available in Java 16+ (for older versions of Java, you can use AutoValue to generate code for the value object); in Kotlin, you can use a data class; in C++, you can use an option struct.

Using a value object this way may still result in a long parameter list when instantiating it. Solutions for this vary by language. For example, in Python, you can use keyword arguments and default parameter values to shorten the parameter list; in Java, one option is to use the Builder pattern, which lets you call a separate function to set each field, and allows you to skip setting fields that have default values.

CsvFile src = CsvFile.builder().setFilename("a.txt").setSeparator(":").build();

CsvFile target = CsvFile.builder().setFilename("b.txt").setEncoding(UTF_8).build();

EmailMessage msg =

EmailMessage.builder().setMailTo(rcpt).setMailTemplate("template").build();

transform(src, target, msg);

Always try to group data that belongs together and break up long, complicated parameter lists. The result will be code that is easier to read and maintain, and harder to make mistakes with.

No comments

Test Failures Should Be Actionable

Monday, May 06, 2024

This article was adapted from a Google Testing on the Toilet (TotT) episode. You can download a printer-friendly version of this TotT episode and post it in your office.

By Titus Winters

There are a lot of rules and best practices around unit testing. There are many posts on this blog; there is deeper material in the Software Engineering at Google book; there is specific guidance for every major language; there is guidance on test frameworks, test naming, and dozens of other test-related topics. Isn’t this excessive?

Good unit tests contain several important properties, but you could focus on a key principle: Test failures should be actionable.

When a test fails, you should be able to begin investigation with nothing more than the test’s name and its failure messages—no need to add more information and rerun the test.

Effective use of unit test frameworks and assertion libraries (JUnit, Truth, pytest, GoogleTest, etc.) serves two important purposes. Firstly, the more precisely we express the invariants we are testing, the more informative and less brittle our tests will be. Secondly, when those invariants don’t hold and the tests fail, the failure info should be immediately actionable. This meshes well with Site Reliability Engineering guidance on alerting.

Consider this example of a C++ unit test of a function returning an absl::Status (an Abseil type that returns either an “OK” status or one of a number of different error codes):

EXPECT_TRUE(LoadMetadata().ok());

EXPECT_OK(LoadMetadata());

Sample failure output

load_metadata_test.cc:42: Failure

Value of: LoadMetadata().ok()

Expected: true

Actual: false

load_metadata_test.cc:42: Failure

Value of: LoadMetadata()

Expected: is OK

Actual: NOT_FOUND: /path/to/metadata.bin

If the test on the left fails, you have to investigate why the test failed; the test on the right immediately gives you all the available detail, in this case because of a more precise GoogleTest matcher.

Here are some other posts on this blog that emphasize making test failures actionable:

Writing Descriptive Test Names - If our tests are narrow and sufficiently descriptive, the test name itself may give us enough information to start debugging.
Keep Tests Focused - If we test multiple scenarios in a single test, it’s hard to identify exactly what went wrong.
Prefer Narrow Assertions in Unit Tests - If we have overly wide assertions (such as depending on every field of a complex output proto), the test may fail for many unimportant reasons. False positives are the opposite of actionable.
Keep Cause and Effect Clear - Refrain from using large global test data structures shared across multiple unit tests, allowing for clear identification of each test’s setup.

No comments

isBooleanTooLongAndComplex

Thursday, April 25, 2024

By Yiming Sun

You may have come across some complex, hard-to-read Boolean expressions in your codebase and wished they were easier to understand. For example, let's say we want to decide whether a pizza is fantastic:

// Decide whether this pizza is fantastic.

if ((!pepperoniService.empty() || sausages.size() > 0)

&& (useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO)) && hasCheese()) {

...

}

A first step toward improving this is to extract the condition into a well-named variable:

boolean isPizzaFantastic =

(!pepperoniService.empty() || sausages.size() > 0)

&& (useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO)) && hasCheese();

if (isPizzaFantastic) {

...

}

However, the Boolean expression is still too complex. It's potentially confusing to calculate the value of isPizzaFantastic from a given set of inputs. You might need to grab a pen and paper, or start a server locally and set breakpoints.

Instead, try to group the details into intermediate Booleans that provide meaningful abstractions. Each Boolean below represents a single well-defined quality, and you no longer need to mix && and || within an expression. Without changing the business logic, you’ve made it easier to see how the Booleans relate to each other:

boolean hasGoodMeat = !pepperoniService.empty() || sausages.size() > 0;

boolean hasGoodVeggies = useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO);

boolean isPizzaFantastic = hasGoodMeat && hasGoodVeggies && hasCheese();

Another option is to hide the logic in a separate method. This also offers the possibility of early returns using guard clauses, further reducing the need to keep track of intermediate states:

boolean isPizzaFantastic() {

if (!hasCheese()) {

return false;

}

if (pepperoniService.empty() && sausages.size() == 0) {

return false;

}

return useOnionFlag.get() || hasMushroom(ENOKI, PORTOBELLO);
}

No comments

Testing Blog

Don't DRY Your Code Prematurely

Avoid the Long Parameter List

Test Failures Should Be Actionable

isBooleanTooLongAndComplex

Labels

Archive

Feed