Introduction

I think most programming taught in schools and courses concentrates on creating working code. Beginning programmers are left with the impression that creating code that runs and gives the correct result is the hard part; that this is somehow the end goal of software development.

In fact, writing code that runs and give the correct result is usually trivially easy, especially once you have some experience. The real challenge involves writing code that programmers can work with; potentially programmers other than you, and potentially for years to come. Far too many systems go into production simply because they work, but nobody has thought about the cost of keeping it working as the requirements change over time.

What makes a system difficult to understand and maintain?

Of all the considerations when designing or coding a system, coupling is probably the most important to keep in mind. Conversely, tight coupling is the number one issue that makes systems hard to understand and maintain.

Coupling is the big scary thing that you need to deal with as you type every line of code.

What is coupling?

From Wikipedia:

…coupling is the degree of interdependence between software modules; a measure of how closely connected two routines or modules are; the strength of the relationships between modules.

The problem with coupling is that it makes it very difficult to do two things:

Understand a System
When a system has a great deal of interdependence between elements, then it becomes more necessary to understand how all of those elements work in order to understand any one element.
Modify, Debug and Extend a System
When there is tight coupling between elements, then changes to any one element start to have “ripple effect” through the whole system. The first change alters the way some other element behaves, and correcting that impacts another element and so on. Eventually, making any change to the system becomes a test of nerves.

For this discussion, we’re going to put aside all the technical definitions of coupling that you can read about on Wikipedia because… you can read about them on Wikipedia. Instead, we’ll try to focus on practical things that cause coupling, and aspects of writing good code that reduce coupling.

Coupling and “Clean Code”

You’ve probably heard the term “Clean Code”, and maybe you have an idea about what it means. If you want to know a lot more, then check out Bob Martin’s Clean Coders, and maybe watch some of his videos. They are really quite good.

A lot of the clean coding concepts involve the SOLID principles. Uncle Bob talks about these a lot.

I’m going to go out on a limb, and I’m going to say the main purpose of most of the techniques that we generally consider to be “clean code”, are aimed a limiting coupling. This includes the SOLID principles, DRY (Don’t Repeat Yourself), and an awful lot of what Uncle Bob presents in his videos. He doesn’t explicitly present it as such, but when you think about what “Clean Coding” achieves, you’ll see that loose coupling is a big outcome from it.

“Hiding”, “Revealing” and “Knowing”

In many cases, the best way to think about coupling is in terms of “hiding”, “revealing” and “knowing”. You’ll see these terms used to explain coupling over and over again in the rest of this article. Let’s take a look at what they mean for coupling:

Hiding and Revealing
This is the fundamental mechanism of loose coupling. Elements of your system should hide as much as they can about themselves so that other elements cannot abuse them. An element should reveal only as much about itself as it absolutely has to in order to fulfil its role.
Knowing
This is about how much code in one element uses its knowledge of the inner workings of some other element to do its work. As soon as you use that knowledge, you’ve instantly created a dependency to the insides of that other element.

“Coupling” tends to be an abstract term, and a little tougher to think about in practical terms. But if you think in terms of hiding as much as possible in every design decision, revealing as little as you can get away with, and then try not to write code that depends on knowing about the insides of other things in the system, then it’s easier to write better code.

Types of Coupling

Let’s take a practical look at the types of coupling that you’re likely to create in your code…

Code Cohabitation

The easiest way to create coupling is to put too much code in one place. The so-called “God Classes” are an example of this; putting virtually all of your functionality into a single class. You also see this frequently within methods. Whenever I see a method that’s 200 lines long, or 100 lines or even 30 lines long I know that there’s too much going on inside.

This is coupling because the various functions that the method does are interleaved with each other, sharing variables and space on the screen. It’s hard, sometimes even impossible to figure out which code belongs to what, sometimes it’s shared by multiple functions.

What’s the problem with this?

Scope of Variables

It is much harder to keep track of variable scope within a method than it is between methods. If a variable is declared in one method, it’s not going to be available in another method unless it’s explicitly passed as a parameter. Within that method, though, any variable that’s declared outside a control block - most likely a loop or conditional branch - is global within that method. So, if you have a 200 line method, and you declare some variable near the top of that method, then you have to understand every use of that variable throughout the entire method - especially if the variable is mutable - in order to understand how it’s used at line 190 of the method.

Understanding the Code

This is the main problem with long methods. For the most part, you need to understand the entire method in order to understand part of the method. Let’s say that you’re interested in some loop part-way through the method. Where did the data that’s being iterated over come from? Was it mutated in any way before this loop? Was it filtered somewhere? What happens to the data in this loop after the loop is finished? How will changes to this loop affect anything that happens with this data after the loop? What about that control variable? Where did it come from? Can you change it?

When you are maintaining monolithic blocks of code there’s generally just one self-defence method that you can adopt: Leave as small a footprint on the code as you can while achieving your maintenance goal. There’s very little latitude to make bigger changes short of refactoring the entire method into smaller methods - which is hideously risky unless you totally understand every nuance of the method to start with. Every change you make carries the risk of some side effect that you didn’t anticipate…because the code is coupled so tightly simply by cohabiting in the same method.

Using the Single Responsibility Principle

Most of these code cohabitation issues can be avoided by using the Single Responsibility Principle: The idea that each method should be responsible for only one single thing.

A lot of programmers seem to give up on the Single Responsibility Principle (SRP) because it seems impossible. Surely, at some point some method has to be responsible for more than one thing, or else no application could ever do more than one thing, right?

The Single Responsibility Principle refers to direct responsibility. Delegated responsibilities don’t count.

If you have a method that needs to do three things and combine the results, just delegate those three things to separate methods and do the combination work in your primary method. If you follow this process, you’ll end up with methods that just contain flow logic and, because that’s all they do, it becomes very clear what that flow logic is.

Just remember that combining results from delegated methods is a responsibility, and control flow is also a responsibility. So, if you are doing both in one method then you’re technically breaking SRP. Of course if either one is trivial, it’s probably not going to be an issue.

When you’ve delegated the responsibilities to separate methods, the coupling is broken. Code in one method can be changed without worrying that code in another method will be broken. Of course, if the methods have other coupling, such as mutating class fields, you’ll still need to worry about that.

Too Many Fields and Instance Variables

Within a class, fields (or instance variables, if you like) are globally accessible. Generally speaking, you shouldn’t use fields to simply provide a global variable.

When should you use fields?

Fields are best used to represent the “state” of the object. This state persists, in memory, outside the context of running a method in the class. That’s really what they are for. That’s what you should use them for.

I have seen cases where fields are introduced to limit the number of parameters that are passed between methods in a class. Sometimes it can seem to make the class “cleaner” because you’re not passing around a bunch of parameters from private function to private function. However, that technique carries a lot of baggage.

Within the scope of a class, a field is essentially a global variable. Instantly this creates an extreme amount of coupling between the methods in your class. Any method can access a field and use it any way that it wants, effectively altering the meaning of the field since the meaning is defined by the way that it is used. The only way to know what’s going on with a field is to understand every place that the field is used.

Global variables are bad enough, but fields are persistent global variables. When you have a local variable, you declare, create and initialize it, and you know its status. If you pass it to another method, you know its status in both places. However, a field is declared, created and (usually) initialized when the object is instantiated - and that’s the only time you can take its status for granted. Any public method that accesses the field or calls private methods that access the field could be called at any time, and could change the field. You don’t have any control over this.

Non-Private Methods

Non-private methods are the number one way that classes in Java reveal things about themselves to other classes. Every single non-private method that you add to a class ties you down just a little more because it exposes your class to potential coupling.

Getters and Setters

Everybody knows that you’re supposed to make all fields private, and to provide getters and setters for those fields so that other classes can have access to the data. This is a strategy against coupling - those other classes know that they can get the data, but they don’t have access to the implementation of that data inside the class. These leaves you free to change the internal structure of the class without worrying about breaking other classes.

Really though, you need to think of getters and setters in terms of services. What is the service that each getter or setter provides? What are other classes going to do with the data they get from a getter? Do they actually need a setter? What do you want to reveal about your class to other classes through the getters and setters?

The point to this is that getters and setters should be designed thoughtfully, not just tossed into a class because you have fields.

An Example: Names

Let’s say we have a class that has some name fields: firstName and lastName, and they’re just Strings. Now we need to provide access to that data. Do we just add a getFirstName() and getLastName() method to the class? What are the calling methods going to do with the data from getters? It’s OK to ask this. We’re providing a service through these getters, and we have a right to set boundaries around it. We could just provide getName() and have that method put firstName and lastName together with a space between them. Maybe that’s the limit of how far we’re willing to go.

Make no mistake, providing getFirstName() and getLastName() is going to tie us down somehow. What happens if we decide to add a middleName field to our class? Are we going to add a getMiddleName() method to our class? Then do we need to chase down all the calling classes and modify them? What if we add a field for suffix, like “Jr”, “Sr”, “II”? How do we cope with that?

Central to all these questions is the idea of what class has the responsibility of putting the name in a format for use elsewhere? You could make a very strong argument that it’s our class with the data that has that responsibility. Perhaps we need to provide a getName() method as described above. Maybe we need to have a getFullName() that uses all the parts. Perhaps a getCommaName() that produces something like “Smith, Fred” would be needed.

What happens if we decide to create a special Name class, and use it instead of our String fields? Now we can handle all the funky edge cases, all the exceptions and rare cases like a multi-word last names (like “Van Der Kamp”), and proper casing of names like “McDonald” or “O’Brian” and those kinds of things. And we can use this class anywhere that we have a name. Now you really don’t want all these calling methods using their own rule-base for formatting names.

With something like names, they don’t change very often. Do we need to have setters for these fields? Probably not.

Now let’s be clear, all of this stuff is really hard if you’ve implemented your class with getFirstName() and getLastName(). Really hard. But it’s fairly easy if you just implemented getName() and let your class figure out how to deliver it.

Coupling Through Getter Types

The other aspect of getters that shouldn’t just be written without thought is the return type. I’d be willing to bet that 99% of the time, programmers just return the same type as the field in their getters. But is this always the right thing?

Let’s say you’ve got a field in your class age, and it’s just an int. Should you create a getter, getAge() that returns int? What are the units? Years? OK, let’s say it’s years. What if you decide that you need to track closer than integer years, so you’ll make it a double? Now you’re stuck.

Maybe you should make getAge() return Number. Now the calling programs can use it as they want. You’ll still return an actual int, but it will be returned as Number.

Maybe you should make getAge() return Duration. There’s no ambiguity in this. Duration has all the unit stuff baked in. You can still treat the data internally as int, but you can convert this to the number of years in a Duration value.

The important thing is that it may make sense to treat age as an int internally inside your class. Your code inside the class knows all about it and can deal with appropriately as an int. If it causes grief in your code, it’s going to be contained inside your class, and it won’t be an application-wide problem. But if you let that int leak out through the getter, then it becomes an application-wide problem. And should you decide to fix it by making it a Duration then that’s going to have a ripple effect through your application.

Once again, treat your getters like a service, not a data-access mechanism. Treat them like a service from the start.

Other Methods

I’ve seen people express the opinion that every class should implement an Interface, and while I think that’s extreme, I do think that programmers tend to under-utilize interfaces. Having an Interface formalizes the contract about what a specific class is going to expose to the outside world, and it unlocks the client code from using a specific implementation of that contract.

Let’s say that you have some class, ClassA, that does a number of related functions, and that you have some client class that invokes a few of ClassA's methods to do some of those things. The client code doesn’t care about the other functions that your ClassA does, just the ones that it calls. Also, your client code passes ClassA around to a number of methods as a parameter.

Now, let’s image that something about the module that ClassA belongs to changes. ClassA will remain, but, going forward, is only going to be used in special circumstances that don’t include your client code. You need to use ClassB instead. That means that you’re going to have to change your client code, changing the type of every parameter where ClassA was passed from ClassA to ClassB.

This is all because of coupling.

Imagine that you had an Interface for the few functions of ClassA that your client code uses, let’s call it ClientInterface. Now you instantiate ClassA as ClientInterface and then pass it around as an instance of ClientInterface. When ClassB is created, it also implements ClientInterface and the only code you have to change is:

ClientInterface workerObject = new ClientA();

to:

ClientInterface workerObject = new ClientB();

This works even better if there’s some method in that service module that returns ClientInterface for you without your client code even knowing what implementation of ClientInterface is being returned.

The key thing here is, as usual, all about hiding, revealing and knowing. Using an Interface lets your service classes hide as much as possible, reveal only the required methods, and prevents the client code from knowing anything more about the service than it absolutely needs to.

Accessing the Implementation Instead of the Interface

When I first started to program in Java, we used ArrayList all over the place. It was easy to use, and we passed and returned values as ArrayList throughout a lot of our code. Then Java 8 came along and we got Streams! Streams were awesome, but…

Stream.toList() returns List - which is an Interface - and not ArrayList. Now, we started using Streams and returning List instead of ArrayList and then passing the results around between methods. But all those methods expected ArrayList!. There was nothing in any of this code that used any of the methods of ArrayList that weren’t in the List interface, but we now had to do a major refactoring to change all of our parameters and return values to List.

That was the big learning moment about coupling. We’d update a service method to use Streams and return a List. Then we’d get red squiggly lines all over our project. We’d chase those down, and then we’d get a whole new set of red squigglies as the methods that called those methods had problems. And so, on, and so on…

This was a newbie mistake, but illustrates another level of coupling. None of the calling methods had any need to know that the returned Lists were ArrayList, but we chose to reveal that in our return values and then use that knowledge in our calling routines. Then, when we changed the inner workings of those methods to something that didn’t involve ArrayList we had that ripple effect throughout the whole code base. And somewhere in there, there probably was only one or two lines of code that actually tried to call an ArrayList method that isn’t in List.

This goes even further. Since all objects in Java exist in a hierarchy it’s often possible to return something further up the hierarchy than the actual object passed back by a method. Just so long as that parent object has whatever methods the code invoking it requires. For instance, if the only method that you want the invoking code to be able to call on a returned object is toString(), then you could just return Object. Perhaps more realistically, you could return Number instead of int, or double.

This happens a lot with Builder methods in JavaFX. Often, a Builder is returning a component that’s going to have nothing more done to it than to put it into a layout. In that case you don’t want to expose particular methods of the specific component classes. Like the setOnAction() method in Button. So you don’t return Button, you return Node which is pretty much the top level class for all the screen components. Sometimes you want the layout code to be able to set the maximum width of the return component, in which case you’d return Region which is a bit lower down than Node and is the top level class for all the layout container classes.

The big takeaway here is that you should give some careful thought to the values that return from your methods, and not just return whatever type is actually created inside the method code. Ask yourself about how much the calling routines should be able to know about the data that’s being returned and return an appropriate type.

Mutable Fields and Variables

I spent years and years working in various dialects of Pick BASIC. It’s pretty old-school. Structured with GOSUB but no formal “method” or “function” declarations with passed parameters. Variables were all untyped and without any scoping. The implications of all of this are that every variable is global, even if in a subroutine, and you can put just about anything in any variable at any time without any restrictions.

You could have things like this:

   X = "25"
   .
   .
   X = (X : "17") + 1
   .
   .
   X = "GEORGE"

And it would work just fine. BTW: X = (X:"17") + 1 would return 2518 (the : was the concatenation operator and X would be treated as a String for that).

It wasn’t unusual to have program files with 3K to 5K lines of code and dozens of subroutines, and any variable used anywhere was available anywhere else. The result was that any attempt at maintenance was a test of nerve. It was very hard to know if any particular variable had the value that you expected it at any given time because literally any of the code could modify it. And some variables were updated dozens of times in dozens of different places. It was also very hard to know what would happen if you changed a variable because, once again, it could be used in dozens of places.

It’s all due, of course, to coupling. Any code that uses a global variable is tightly coupled to every other piece of code that uses the same global variable. Actually, it’s coupled to all the other code in the file because you have to go searching to find all the uses - any code anywhere could use anything!

Much of this would be a non-issue if variables could be made non-mutable. Non-mutable variables can’t be changed, so you don’t have to pour over hundreds of lines of code looking for it on the left side of an equals sign.

Kotlin takes this to a new place in the Java world. Variables and fields in Kotlin need to be declared as mutable or immutable using the var or val keywords, with the convention being to prefer immutable whenever possible. In the Kotlin code I’ve written, about 90%+ of my variables are immutable.

In Java, we have final and you should use it whenever you can. Prefer local variables over instance variables whenever possible too. And don’t make variables static unless absolutely necessary.

Of course, final isn’t used anywhere as often as it should be. It’s rare to find local variables declared as final in most code. Programmers only think about final variables when they use one in a lambda, and they get that compiler warning about “variables in a lambda must be final or effectively final”. Even then, the usual response it to look for a wrapper class like Atomic to put the value in to get around the final requirement.

Duplicate Code

Most programmers don’t think about duplicate code as coupling, but it really is. Really, what could be more “interdependent” about two blocks of code than if they contain the same code?

I’ve seen people argue that you can overdo DRY (“Don’t Repeat Yourself”). The idea being that if you’ve pulled the duplicate code out into a separate method and call it instead of having duplicated code, then any time you touch that method you have to worry about all the places that might call it.

This is nonsense. Not that you have to worry about all the places that call it, but the idea that if it’s duplicated then you don’t.

If you touch a piece of code that’s duplicated, then you have to ask yourself, “Do I need to make this same change in all the duplications?” The tricky part is that with duplicate code, a number of things might happen:

You Cannot Find, or You’re Not Aware of the Duplicates
If you have a method, then it’s super easy with modern IDE’s to find all the places that the method is called. It’s hard NOT to be aware of how a method is used. But with duplicate code you don’t have that facility, so you have to go looking. And why would you even look?
It’s Hard to Understand the Impact on Duplicate Code
Very often, duplicate code isn’t 100% identical in every place that you encounter it. There’s always a different value checked, or a calculation is slightly different, or something like that. When you touch code that’s been duplicated you have to figure out whether you’re touching the stuff that really is duplicated. When you pull it out to a separate method, you generally parametrize the differences, so that the calls to the method hold the differences, not the duplicated code. Now, when you make a change, are you making a change to the caller, or are you making a change to the method?
You Mess Something Up
There is probably no more effective source of trivial bugs than duplicated code and copy/paste. If you have to do something 5 times over, each just a tiny bit different, then that’s 5 times as many chances to make a mistake.

One other important point about DRY: Employing DRY generally forces you to think about the structure of what you’re building. It makes you understand that you have shared functionality in your system and requires that you identify it and organize it. DRY removes some amount of coupling, but it also brings it out into the open and puts structure around it that makes it easier to manage.

Feature Envy

This really comes down to putting code in the right place. One antipattern that you see all the time is something like this:

   private int sampleMethod(ClassA classA, int somethingElse) {
     return (classA.getValue1() * classA.getValue2()) + (somethingElse * classA.calculateSomething());
   }

What we have here is a block of code that repeatedly calls methods in another class to derive a result based almost exclusively on those method calls. In other words, this code really has very little to do with the class in which it resides.

This is called “Feature Envy”. In this example, sampleMethod() represents a feature in the class in which it appears but really shouldn’t belong to it. It should be a feature of ClassA.

A far better way to write this:

   private int betterMethod(ClassA classA, int somethingElse) {
      return classA.sampleMethod(somethingElse);
   }

And ClassA.sampleMethod() would contain the calculation from the first code snippet.

From a coupling perspective, we’ve now exposed only one non-private method of ClassA, instead of 3. So that’s a win. But almost certainly, the meaning of that calculation is something that relates to ClassA, not whatever class it was originally found in.

And it’s clear that if something about ClassA changes, maybe the meaning of whatever is returned from one or both of the two getters, for instance, then the meaning of the result of that calculation changes. If it’s contained within ClassA, however, then that change can be handled inside ClassA without needing to worry about a ripple effect.

Breaking The Law of Demeter

This is a slam dunk because the “Law of Demeter” is, by definition, a rule about loose coupling.

The most common example of the Law of Demeter is when you have stacked object composition. For instance, you have a field in a class which is another class, and that class in turn has a field which is another class and so on. The result is that you can get situations like this:

   something = classA.getField1().getField6().getField3().getField2();

Now our code here has to know not only that classA has getField1(), but it has to know that whatever type getField1() returns has a getField6() method. And then, of course, that that class has a getField3() method and so on. But if the object returned by classA.getField1() isn’t just a utility class like ArrayList or Map, is there any good reason for our code to know that classA has whatever is returned by classA.getField1() in it?

And, of course, if anything changes in any of those 4 classes, this might stop working.

From the perspective of coupling, it’s far better to have a single call:

something = classA.getSomething();

Where:

public classA;
   .
   .
   public SomeClass getSomething(){
     return field1.getSomething();
   }
}

and:

public classXYZ;
   .
   .
   public SomeClass getSomething() {
     return field6.getSomething();
   }
}

Of course, it needn’t be called getSomething() all the way down the chain.

Not Using Packages Properly

You should understand that packages in Java aren’t just a way to organize your classes and break up the name-space. Packages are all about hiding, revealing and knowing - the elements of reducing coupling.

Classes that share the same package inherently share an increased coupling to one another. This should represent a high level of “cohesion” in your system, where elements that share a package are highly related to each other. Cohesion is a good thing, and packages that have high cohesion have more latitude for coupling between classes without causing issues.

This is reflected in the access modifiers. package-private is the default modifier for Java elements, and it’s what you get if you don’t specify anything at all. An element with package-private means that it is exposed to other classes in the same package. If your package has high cohesion, this can often be acceptable.

On the other hand, if you’ve only got one package in your application, and all of your classes are in that one package, then your cohesion is zero and there’s no difference between package-private and public. Organization of your packages so that they have high cohesion allows you to say, “It’s okay if this bunch of classes has access to this method, even though I don’t want any other, unrelated classes calling it”.

Conclusion

Over the decades that I’ve been programming, I’ve read, and tried to understand, literally millions of lines of code written by other people. Sometimes the “other people” was me from a year earlier. It’s a core programming skill, and I think I’m pretty good at it, better than a lot of programmers I’ve worked with.

When I open up a project and take a first look at it, I can tell in a few moments how experienced the programmers that wrote it were, and I can tell how easy or difficult it will be to maintain. How do I know? I look for all the stuff that’s written about in this article. If I see a class with a zillion fields with two letter names, none of them final, one or two giant methods, and everything declared public then I know it’s going to be rough going. And it’s going to be rough going because I know that all the other coupling rules are going to be broken, over and over and over again.

Coupling is the number one challenge when understanding and maintaining a system. It’s a challenge for understanding because it means that in order to understand one tiny part of a system you need to understand large amounts of that system. It’s a challenge for maintaining a system because you cannot touch one tiny part of the system without possibly impacting large amounts of that system.

The nice thing about coupling is that you don’t have to think about it in an abstract way to avoid it. Just think about every element of your system hiding its secrets from every other part of the system and keeping its nose out of how those other parts work.

Most of all, you can make huge steps towards reducing coupling by following DRY, SRP and avoiding feature envy. And do it from the start. Don’t treat it as a step to take later - because you probably won’t get a chance. And honestly, it doesn’t take any longer to write good, loosely coupled code, than it does to write bad, tightly coupled code.