Practical PHP Refactoring: Convert Procedural Design to Objects
Even in languages that offer no constructs other than classes, nothing can force a programmer to write object-oriented code. In many cases, just wrapping a series of functions into classes does not result in an object-oriented design.
The Convert Procedural Design to Objects refactoring has great benefits, but it operates at a very large scale (potentially the whole application).
What does object-oriented mean?
In 2011, there is no reason to write procedural code anymore in a web application:
- All libraries and frameworks worth including are object-oriented, and so is part of PHP's own core (SPL, most importantly PDO, and even DateTime).
- All other successful languages in the web space are either object-oriented, functional, or both.
- Software design literature is based on objects and their patterns.
However, using class and extends keywords does not suffice to produce an object-oriented design; entire books are written on this topic.
This refactoring tries to solve a common case of procedural design shoehorned into an object model:
- classes containing behavior, and depending on many other ones.
- dumb classes only being a container for data, or worse primitive types with no methods at all.
It is common in procedural design to segregate responsibilities in this procedure/record pattern. However, high-level methods can be added to these dumb classes to encapsulate some of the data they contain, and to simplify the procedural classes that use them. It is just a starting point towards "object-orientation", but an often overlooked one.
The Tell Don't Ask principle summarizes what we would like to do in very few words:
Procedural code gets information then makes decisions. Object-oriented code tells objects to do things. -- Alec Sharp
Instead of an infinite series of calls from a procedure to getters and setters, we want to pass messages even to the lower level objects.
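As a minimal sketch of the difference, consider the same withdrawal written in both styles. The account classes and the withdrawProcedurally() function are hypothetical names chosen for illustration; they are not part of this article's invoice example.

```php
<?php
// Hypothetical example: these classes are illustrations only.

// "Ask" style: the caller pulls state out, decides, and pushes state back in.
class AskedAccount
{
    private $balance;

    public function __construct($balance) { $this->balance = $balance; }
    public function getBalance() { return $this->balance; }
    public function setBalance($balance) { $this->balance = $balance; }
}

function withdrawProcedurally(AskedAccount $account, $amount)
{
    if ($account->getBalance() < $amount) {
        throw new RuntimeException('Insufficient funds');
    }
    $account->setBalance($account->getBalance() - $amount);
}

// "Tell" style: the caller sends one message and the object makes the decision.
class ToldAccount
{
    private $balance;

    public function __construct($balance) { $this->balance = $balance; }

    public function withdraw($amount)
    {
        if ($this->balance < $amount) {
            throw new RuntimeException('Insufficient funds');
        }
        $this->balance -= $amount;
    }

    // Read-only accessor, kept here only so the result can be observed.
    public function balance() { return $this->balance; }
}
```

In the second version the rule about insufficient funds lives next to the data it protects, so no caller can bypass it by calling a setter directly.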
Steps
A preliminary step is to turn primitive data structures into data objects that wrap them and provide getters. If you see variables like arrays or strings passed around in the code to refactor, this step is necessary to provide a class that can accommodate potential new methods.
- Inline the procedural code into a single class. This step lets us extract code along different lines from the original ones in the rest of the refactoring: for example, procedural code is often divided into temporal steps, while objects may segregate different parts of the available data instead.
- Extract methods on the procedural class. See the next steps for hints on what to extract.
- Methods that take one of the dumb objects as an argument can be moved onto the object itself, dropping it as a parameter while keeping the remaining ones. Move Method should free the original giant class from any unrelated responsibilities.
The goal is to remove as much logic as possible from the procedural class, going in the opposite direction with respect to the original design; Fowler notes that in some cases the procedural class disappears entirely.
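The last step can be sketched as follows, under the assumption of a hypothetical rectangle domain (DumbRectangle, AreaCalculator and Rectangle are invented names, unrelated to the invoice example). The dumb object disappears from the parameter list, while the remaining $factor parameter is kept.

```php
<?php
// Hypothetical classes, used only to illustrate Move Method.

// Before: a data container plus a procedural class that interrogates it.
class DumbRectangle
{
    private $width;
    private $height;

    public function __construct($width, $height)
    {
        $this->width = $width;
        $this->height = $height;
    }

    public function getWidth() { return $this->width; }
    public function getHeight() { return $this->height; }
}

class AreaCalculator
{
    // The dumb object is an argument, next to an ordinary parameter.
    public function scaledArea(DumbRectangle $rectangle, $factor)
    {
        return $rectangle->getWidth() * $factor * $rectangle->getHeight() * $factor;
    }
}

// After: the method lives on the object; only $factor remains a parameter.
class Rectangle
{
    private $width;
    private $height;

    public function __construct($width, $height)
    {
        $this->width = $width;
        $this->height = $height;
    }

    public function scaledArea($factor)
    {
        return $this->width * $factor * $this->height * $factor;
    }
}
```

Once every method of AreaCalculator has been moved this way, the class is empty and can be deleted, which is the case Fowler describes.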
Example
One of my usual examples is invoice calculation: the computation of fields like total price and due taxes from a series of input data.
In this procedural design, we have one invoice and a bunch of rows modelled with Primitive Obsession (as arrays).
<?php
class ConvertProceduralDesignToObjects extends PHPUnit_Framework_TestCase
{
    public function testPricesAreSummedAfterAPercentageBasedTaxIsApplied()
    {
        $invoice = new Invoice(array(
            array(1000, 4),
            array(1000, 20),
            array(2000, 20),
        ));
        $this->assertEquals(4640, $invoice->total());
    }
}

class Invoice
{
    private $rows;

    public function __construct($rows)
    {
        $this->rows = $rows;
    }

    public function total()
    {
        $total = 0;
        foreach ($this->rows as $row) {
            $rowTotal = $row[0] + $row[0] * $row[1] / 100;
            $total += $rowTotal;
        }
        return $total;
    }
}
We introduce the Row class, but the design is now worse: it adds a bunch of lines of code (the new class) without the new entity giving us something in return. The Row object has no responsibilities, and we just have to write getters and sometimes setters. At least we're writing down parts of our model for documentation (giving names to the net price and tax rate numbers), but we aren't sure this model is the most versatile one.
<?php
class ConvertProceduralDesignToObjects extends PHPUnit_Framework_TestCase
{
    public function testPricesAreSummedAfterAPercentageBasedTaxIsApplied()
    {
        $invoice = new Invoice(array(
            new Row(1000, 4),
            new Row(1000, 20),
            new Row(2000, 20),
        ));
        $this->assertEquals(4640, $invoice->total());
    }
}

class Invoice
{
    private $rows;

    public function __construct($rows)
    {
        $this->rows = $rows;
    }

    public function total()
    {
        $total = 0;
        foreach ($this->rows as $row) {
            $rowTotal = $row->getNetPrice() + $row->getTaxRate() * $row->getNetPrice() / 100;
            $total += $rowTotal;
        }
        return $total;
    }
}

class Row
{
    // Declared explicitly so the properties are not created dynamically.
    private $netPrice;
    private $taxRate;

    public function __construct($netPrice, $taxRate)
    {
        $this->netPrice = $netPrice;
        $this->taxRate = $taxRate;
    }

    public function getNetPrice()
    {
        return $this->netPrice;
    }

    public function getTaxRate()
    {
        return $this->taxRate;
    }
}
For the scope of this small example all business logic is already in a single class, thus we don't have to inline anything. Let's extract a first method instead:
class Invoice
{
    private $rows;

    public function __construct($rows)
    {
        $this->rows = $rows;
    }

    public function total()
    {
        $total = 0;
        foreach ($this->rows as $row) {
            $total += $this->rowTotal($row);
        }
        return $total;
    }

    public function rowTotal($row)
    {
        return $row->getNetPrice() + $row->getTaxRate() * $row->getNetPrice() / 100;
    }
}
That was a small enough step. In a real situation, the extracted code may be 100 lines long, so we would want to test that the extraction has been successful before doing anything else.
Since the test still passes, we can go on. The extracted method takes a Row object as an argument, so it can be moved onto Row now that its logic has been clearly isolated. In general, when moving such a method:
- $this->field references should become additional parameters of the method before moving it.
- Other parameters should just remain formal parameters.
- Calls to $this->anotherMethod() are more difficult to treat: you can either move anotherMethod() into the Row class too, or extract an interface containing anotherMethod() and pass $this.
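The three rules above can be sketched on a variation of the invoice. Everything beyond Invoice and Row is an assumption made for illustration: the $discountPercentage field, the RoundingPolicy interface and its roundAmount() method do not appear in this article's example.

```php
<?php
// Hypothetical additions: $discountPercentage, RoundingPolicy, roundAmount().

// Extracted interface: stands in for a call to $this->anotherMethod().
interface RoundingPolicy
{
    public function roundAmount($amount);
}

class Row
{
    private $netPrice;
    private $taxRate;

    public function __construct($netPrice, $taxRate)
    {
        $this->netPrice = $netPrice;
        $this->taxRate = $taxRate;
    }

    // $discountPercentage used to be $this->discountPercentage on Invoice and
    // became a parameter before the move; $rounding replaces $this->roundAmount().
    public function total($discountPercentage, RoundingPolicy $rounding)
    {
        $gross = $this->netPrice + $this->taxRate * $this->netPrice / 100;
        return $rounding->roundAmount($gross - $gross * $discountPercentage / 100);
    }
}

class Invoice implements RoundingPolicy
{
    private $rows;
    private $discountPercentage;

    public function __construct(array $rows, $discountPercentage)
    {
        $this->rows = $rows;
        $this->discountPercentage = $discountPercentage;
    }

    public function roundAmount($amount)
    {
        return (int) round($amount);
    }

    public function total()
    {
        $total = 0;
        foreach ($this->rows as $row) {
            // The field travels as an argument; the invoice passes itself as the policy.
            $total += $row->total($this->discountPercentage, $this);
        }
        return $total;
    }
}
```

Passing $this through a narrow interface keeps Row unaware of the Invoice class, at the cost of one more type to maintain; moving roundAmount() into Row would be the simpler choice if no other class needed it.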
While moving the code, we change the references to $row to $this, and check that the method scope is public. We also rename the method to total() instead of rowTotal().
class Invoice
{
    private $rows;

    public function __construct($rows)
    {
        $this->rows = $rows;
    }

    public function total()
    {
        $total = 0;
        foreach ($this->rows as $row) {
            $total += $row->total();
        }
        return $total;
    }
}

class Row
{
    // Declared explicitly so the properties are not created dynamically.
    private $netPrice;
    private $taxRate;

    public function __construct($netPrice, $taxRate)
    {
        $this->netPrice = $netPrice;
        $this->taxRate = $taxRate;
    }

    public function getNetPrice()
    {
        return $this->netPrice;
    }

    public function getTaxRate()
    {
        return $this->taxRate;
    }

    public function total()
    {
        return $this->getNetPrice() + $this->getTaxRate() * $this->getNetPrice() / 100;
    }
}
Finally, we inline the getters, since they're not used from outside the Row class. They will be introduced again in the future in case there is a real need for them: as a rule of thumb we avoid exposing any state from Row that is not necessary.
class Row
{
    // Declared explicitly so the properties are not created dynamically.
    private $netPrice;
    private $taxRate;

    public function __construct($netPrice, $taxRate)
    {
        $this->netPrice = $netPrice;
        $this->taxRate = $taxRate;
    }

    public function total()
    {
        return $this->netPrice + $this->taxRate * $this->netPrice / 100;
    }
}
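To check the final design without PHPUnit, the two classes can be exercised from a plain script. This is only a sketch: it repeats the Invoice and Row classes from above, with the private property declarations added as an assumption, and spells out the arithmetic behind the 4640 expected by the test.

```php
<?php
class Row
{
    private $netPrice;
    private $taxRate;

    public function __construct($netPrice, $taxRate)
    {
        $this->netPrice = $netPrice;
        $this->taxRate = $taxRate;
    }

    public function total()
    {
        return $this->netPrice + $this->taxRate * $this->netPrice / 100;
    }
}

class Invoice
{
    private $rows;

    public function __construct($rows)
    {
        $this->rows = $rows;
    }

    public function total()
    {
        $total = 0;
        foreach ($this->rows as $row) {
            $total += $row->total();
        }
        return $total;
    }
}

$invoice = new Invoice(array(
    new Row(1000, 4),   // 1000 + 4% of 1000  = 1040
    new Row(1000, 20),  // 1000 + 20% of 1000 = 1200
    new Row(2000, 20),  // 2000 + 20% of 2000 = 2400
));

echo $invoice->total(), "\n"; // 1040 + 1200 + 2400 = 4640
```

Invoice no longer knows how a row is taxed: it only sums whatever each Row reports as its total.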