Why OOP Is a Bad Fit for Custom Software
Everything in programming is a trade-off. Combining imperative code with (DO)RY is the best one for custom software: it results in more total code, but more maintainable code.
Ever notice that custom OOP projects tend towards a flaming pile of spaghetti crap?
Have you ever seen anti-patterns like the following:
- Changing a line of code to fix screen A blows up screen B, even though the two screens have no relation to each other.
- Many wrappers: a Service is wrapped by a Provider, which is wrapped by a Performer, which is wrapped by a ...
- It is hard to track down the code that performs a certain operation.
- Playing whack-a-mole, where each bug fix just yields a new bug.
Ever ask yourself why OOP has design patterns? I would argue that OOP assumes upfront design before writing any code. In particular, OOP shines when every important thing is known at the outset. Take a Java List or Map as an example. They have remained virtually the same since the rollout of Java 1.2, when the collections API was added, replacing older classes like Vector and Dictionary.
A List or Map is a simple beast - each is just an organized collection of data. A list organizes items by index, a map by key. Once you have basic operations like add, change, iterate, and delete, what more do you really need? This is why Java has really only added conveniences like Map.computeIfAbsent, ConcurrentHashMap, and so on. Nothing huge, just some nice things that people were already doing anyway with their own convenience functions and/or classes.
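As a small illustration of that kind of convenience, the sketch below groups words by first letter twice: once with the hand-written get-or-create idiom, and once with Map.computeIfAbsent. The class name GroupingExample and the sample words are made up for this illustration; only the java.util calls are real API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class, not taken from any real code base.
final class GroupingExample {

    // The older idiom that many teams wrapped in their own helper function.
    static Map<Character, List<String>> groupByHand(List<String> words) {
        Map<Character, List<String>> byInitial = new HashMap<>();
        for (String word : words) {
            char initial = word.charAt(0);
            List<String> bucket = byInitial.get(initial);
            if (bucket == null) {
                bucket = new ArrayList<>();
                byInitial.put(initial, bucket);
            }
            bucket.add(word);
        }
        return byInitial;
    }

    // The same logic using Map.computeIfAbsent (available since Java 8).
    static Map<Character, List<String>> groupWithComputeIfAbsent(List<String> words) {
        Map<Character, List<String>> byInitial = new HashMap<>();
        for (String word : words) {
            byInitial.computeIfAbsent(word.charAt(0), k -> new ArrayList<>()).add(word);
        }
        return byInitial;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("list", "map", "lambda");
        // Both approaches build the same map.
        System.out.println(groupByHand(words).equals(groupWithComputeIfAbsent(words)));
    }
}
```

The interface did not change shape; the library simply absorbed a helper people were already writing.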
But custom software paid for by a customer who only knows what they want today is something altogether different. You literally don't know from one month to the next what feature the customer will ask for, or what bug they will report. Remember that OOP design pattern for random structural changes on a dime? Neither do I.
Why Imperative Is Better
OOP intrinsically means some kind of entanglement: once you pick a design pattern for a set of classes, and write a bunch of code for it, you can't easily change to some other design pattern. You can use different patterns for different sets of code, composing them as needed into a larger system. But each part is kind of locked into a chosen pattern, and it is a significant hassle to change the pattern later. It's like the coding equivalent of vendor lock-in.
Unfortunately, a set of code doesn't necessarily shout out "Hey, this is the strategy pattern." You have to examine a set of code to reverse engineer its pattern or ask someone. Have you worked on a team that stated up front what patterns were being used for different parts of the system? I don't recall getting very much of this in my career. Really, in a lot of cases, there simply isn't any real conscious choice of design patterns, just replication of whatever the devs saw before elsewhere, often without any real contextual information of why.
This entanglement easily leads to hard-to-deal-with code if someone doesn't fully grok whatever pattern(s) are present. More often than not, using OOP for custom who-knows-what-the-customer-wants-next-week software is setting the system up for failure. Not failure as in it doesn't work, but failure as in it is virtually guaranteed to become very hard to maintain.
Using a simple imperative pattern is much better, and you can do it even in a primarily OOP language like Java. In Java, just use static methods, where each class is either a data structure or a collection of static methods that operate on data structures. By passing data structures as arguments and returning new data structures, the code effectively works on the data from the outside, which makes it simpler to understand and tends towards less entanglement.
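A minimal sketch of that style might look like the following. Invoice, Invoices, and the tax rule are hypothetical names and logic invented for this example: one class is a plain data holder, and the other holds only static functions that take a data structure and return a new one.

```java
// Hypothetical data structure: an immutable holder with no behavior.
final class Invoice {
    final String customer;
    final long subtotalCents;
    final long taxCents;

    Invoice(String customer, long subtotalCents, long taxCents) {
        this.customer = customer;
        this.subtotalCents = subtotalCents;
        this.taxCents = taxCents;
    }
}

// Hypothetical function class: static methods only, no state.
final class Invoices {
    private Invoices() {} // not meant to be instantiated

    // Returns a new Invoice with tax computed; the argument is left untouched.
    // Integer division truncates any fraction of a cent.
    static Invoice withTax(Invoice invoice, int taxBasisPoints) {
        long tax = invoice.subtotalCents * taxBasisPoints / 10_000;
        return new Invoice(invoice.customer, invoice.subtotalCents, tax);
    }

    static long totalCents(Invoice invoice) {
        return invoice.subtotalCents + invoice.taxCents;
    }

    public static void main(String[] args) {
        Invoice draft = new Invoice("Acme", 20_000, 0);
        Invoice taxed = withTax(draft, 825); // 8.25% expressed in basis points
        System.out.println(totalCents(draft) + " -> " + totalCents(taxed));
    }
}
```

Because the functions never reach into hidden object state, everything a function depends on is visible in its argument list.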
You could organize packages like this:
- Top-level packages represent functional areas (e.g., configuration, database access, REST API, validations, etc.)
- Sub-packages for data structures and functions that operate on them
- Some sub-packages can represent a design pattern like model, view, and controller
For example, it might be organized like this, where app is the top-level dir checked out of the repo:
- app/db/util: Some utility functions to make DB access easier
- app/db/dto: Database transfer objects that represent data as stored/retrieved in the DB
- app/db/dao: Database access objects that store/retrieve dtos
- app/rest/util: Some utility methods to make REST a bit easier
- app/rest/view: Objects that represent the data as sent/received over HTTP
- app/rest/translate: Translate app/db/dto to/from app/rest/view
- app/rest/model: Make app/db/dao calls to store/retrieve data, uses app/rest/{view, translate}
- app/rest/controller: Define endpoints and methods, use app/rest/model to do the work
- app/html: SSR HTML generation
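To make the layout a little more concrete, here is a rough sketch of the translate step between a dto and a view. CustomerDto, CustomerView, CustomerTranslate, and their fields are all hypothetical. In a real project each class would sit in its own file under the package named in its comment; they share one file here only to keep the sketch short.

```java
// app/db/dto: data as stored in the DB (hypothetical columns).
final class CustomerDto {
    final long id;
    final String firstName;
    final String lastName;

    CustomerDto(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// app/rest/view: data as sent/received over HTTP (hypothetical shape,
// with the id exposed as a string).
final class CustomerView {
    final String id;
    final String firstName;
    final String lastName;

    CustomerView(String id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// app/rest/translate: static functions that convert between the two,
// working on both data structures from the outside.
final class CustomerTranslate {
    private CustomerTranslate() {}

    static CustomerView toView(CustomerDto dto) {
        return new CustomerView(Long.toString(dto.id), dto.firstName, dto.lastName);
    }

    // Throws NumberFormatException if the view's id is not a valid long.
    static CustomerDto toDto(CustomerView view) {
        return new CustomerDto(Long.parseLong(view.id), view.firstName, view.lastName);
    }

    public static void main(String[] args) {
        CustomerView view = toView(new CustomerDto(42L, "Ada", "Lovelace"));
        System.out.println(view.id + " " + view.firstName + " " + view.lastName);
    }
}
```

Neither data class knows the other exists; only the translate functions depend on both, so a change to the HTTP shape touches the view and the translator but not the DB side.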
You'll notice I mention MVC above, which is an OOP pattern. However, this pattern can be simplified as a set of directories with one responsibility per directory, which can still be an imperative way of writing code. It can still be operating on the data objects from the outside. Just because we don't want to use OOP doesn't mean we can't apply some of what we've learned from it over the years in an imperative way.
The above looks like a monolithic design. It can be a hybrid if you want:
- Make app/{service} dirs, which in turn contain DB, REST, and HTML as shown above.
- Each service can be its own application.
- Services can be grouped into a smaller number of deployments: you don't have to deploy each service in its own container.
The Other Most Common Mistake
One of the most important things to consider is (DO)RY (do repeat yourself) versus (DONT)RY (don't repeat yourself, usually written DRY). The overuse of (DONT)RY is often a very big pain point in OOP. Like Lists and Maps, (DONT)RY works best in a limited area of code, such as reusing some common code across all Map implementations. Essentially, it is just another variation of what I said earlier about knowing the design in advance - (DONT)RY can be quite useful when you know the considerations up front, but it is just another factor in making spaghetti code when you don't.
(DO)RY is far more useful when you have a changes-by-the-week application: the duplication isolates changes. For example, say you have a customer address and a business address. They seem kind of the same thing, with only minor differences:
- Businesses have 3 lines, for doors and stops and other cupboard-under-the-stairs things individuals don't need.
- Businesses can have multiple addresses, so they need a type (physical, mailing, billing).
It sounds like you could use the same code for both. But over time, random requests are made for random changes, and some changes need to apply to only one or the other address type. (DONT)RY causes these increasing differences to get harder and harder to manage, which is exactly the bad form of entanglement I keep seeing. (DO)RY means keeping a separate copy of the code for each, and making a change in both copies when it needs to apply to both.
The improvement stems from the fact that when a particular change must be implemented quite differently in each copy because the two code bases have diverged, there is no tangled mess problem. Instead, it is just more effort to make the change twice in different ways, without causing either code base to become any harder to read or modify.
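The address example could be sketched roughly as follows. All class names, field names, and validation rules here are assumptions made for illustration, including the choice of two lines for a customer address. The point is that the business validation starts as a copy of the customer validation and is then free to diverge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical data structures; field names are illustrative only.
final class CustomerAddress {
    final String line1, line2, city, postalCode;

    CustomerAddress(String line1, String line2, String city, String postalCode) {
        this.line1 = line1;
        this.line2 = line2;
        this.city = city;
        this.postalCode = postalCode;
    }
}

final class BusinessAddress {
    final String type; // "physical", "mailing", or "billing"
    final String line1, line2, line3, city, postalCode;

    BusinessAddress(String type, String line1, String line2, String line3,
                    String city, String postalCode) {
        this.type = type;
        this.line1 = line1;
        this.line2 = line2;
        this.line3 = line3;
        this.city = city;
        this.postalCode = postalCode;
    }
}

final class CustomerAddressValidation {
    static List<String> validate(CustomerAddress a) {
        List<String> errors = new ArrayList<>();
        if (isBlank(a.line1)) errors.add("line1 is required");
        if (isBlank(a.city)) errors.add("city is required");
        if (isBlank(a.postalCode)) errors.add("postalCode is required");
        return errors;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }
}

// Deliberately a copy of the customer rules rather than a shared base:
// a business-only change never touches the customer code.
final class BusinessAddressValidation {
    private static final List<String> TYPES = Arrays.asList("physical", "mailing", "billing");

    static List<String> validate(BusinessAddress a) {
        List<String> errors = new ArrayList<>();
        if (isBlank(a.line1)) errors.add("line1 is required");
        if (isBlank(a.city)) errors.add("city is required");
        if (isBlank(a.postalCode)) errors.add("postalCode is required");
        // Business-only rule with no counterpart in the customer copy.
        if (!TYPES.contains(a.type)) errors.add("type must be physical, mailing, or billing");
        return errors;
    }

    private static boolean isBlank(String s) { // copied, not shared
        return s == null || s.trim().isEmpty();
    }
}

final class AddressDemo {
    public static void main(String[] args) {
        BusinessAddress office =
            new BusinessAddress("warehouse", "2 Dock Rd", null, null, "Springfield", "12345");
        System.out.println(BusinessAddressValidation.validate(office));
    }
}
```

A change that applies to both address types has to be made in both validation classes, which is the extra effort the trade-off accepts in exchange for isolation.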
In some cases, a data type that has its own logic for persistence and retrieval/display might also be contained inside another data type for another use case. When contained, there is no reason to believe in the face of random changes that it will necessarily always require the same validations, persistence, and display logic as when it is used as a top-level object. As such, all the logic for the contained object should be a copy of the top-level code, so the two use cases can be as different as they need to be.
Conclusion
Imperative programming combined with (DO)RY encourages making separate silos for each data type - separate queries, separate DB reads/writes, separate REST endpoints, and separate HTML generation. This separation expresses an important truth about your data that I alluded to earlier: every top-level data type is completely unrelated to any other data type; it is a thing unto itself. Any correlation or similarity between two separate data types should be viewed as both accidental and temporary: they just so happen to be similar at the moment, and there is no reason to believe their similarity will continue in the face of random, unknowable future changes.
Separating all your top-level data types with their own imperative code and using (DO)RY - copying code as necessary to maintain the separation - is the key to managing code that has to be dynamic in response to frequent unknowable future changes. The resulting code will be larger as a result of copying logic, but more maintainable.
In other words, everything in programming is a trade-off, and combining imperative code with (DO)RY is the best one for custom software: it results in more total code, but more maintainable code.
Published at DZone with permission of Greg Hall. See the original article here.