Understanding Classes in Java (Part 2)

The history of Java classes continues, this time with a focus on objects, their creation, their purpose, and how they're treated in memory.

Justin Albano

Sep. 05, 17 · Tutorial

In the previous article, we gained an intuitive understanding of classes and their role in solving real problems through software. Although this concept is essential in object-oriented programming, classes are only specifications and are not executed when a program is run. In short, a class is only the blueprint, not the house: It is the house that provides the shelter, not the blueprint. In this article, we will explore objects, the actionable realization of classes in software, as well as provide a bit of a back-story to the original introduction of classes and the object-oriented paradigm into the world of software engineering. These two topics will allow us to finally move into the world of classes and objects in Java.

Before we reach that point, though, we must answer a simple question: What is an object?

What Is an Object?

The astute reader will recognize that the programming paradigm we have been discussing is called object-oriented, rather than class-oriented; so what is an object? While there are many terse technical definitions, it is much more fruitful to understand objects at an intuitive level, and from there, devise a sound definition. In order to understand objects, we must look to an analogy in the real world.

Thus far, we have created a class, which amounts to a specification about a thing or entity, but we have not created anything to do actual work. In much the same way, when an architect creates a blueprint for a house, the blueprint only represents an abstraction of a real house or a house-to-be. In order to create the house, we must take the blueprint and combine lumber, concrete, glass, and a host of other building materials to construct the house. In programming terms, this construction is called instantiation.

Each house that we instantiate has the same state (an address, number of rooms, square footage, etc.) and the same behavior (open the front door, open the garage door, etc.), but the actual values of the state are different. For example, if we use the blueprint to construct two houses of identical design on Roosevelt Place, one may be at 60 Roosevelt Place and the other at 62 Roosevelt Place. Note that while both have an address, they simply have different instantaneous values for that address. If you check the value of the address state of the first house, you would obtain 60 Roosevelt Place, while the address state of the second house would be 62 Roosevelt Place. Viewed through our previous notation, we would see something that resembles the following:

instantiate: House houseOne
instantiate: House houseTwo

houseOne.getAddress() results in "60 Roosevelt Place"
houseTwo.getAddress() results in "62 Roosevelt Place"

Returning to our Vehicle class, we can instantiate two separate vehicles, but the instantaneous RPMs or gear ratios of the engine and transmission, respectively, may not be the same. For example:

instantiate: Vehicle hotRod
instantiate: Vehicle economyCar

hotRod.accelerate(80)

hotRod.getCurrentSpeed() results in 80
economyCar.getCurrentSpeed() results in 0

A common analogy for the relationship between objects and classes is that a class is a cookie cutter and objects are the cookies: The cookie cutter is used to stamp out individual cookies. We can tell the nature of each cookie by the cookie cutter that was used to create it, but each cookie may have its own varying attributes: One may have 10 chocolate chips, while another only has 7; one may have an internal temperature of 180 degrees Fahrenheit while another may only have an internal temperature of 165 degrees Fahrenheit; etc.

One detail that we have not discussed thus far is how a new object is instantiated. For that process, we have a special type of behavior called a constructor.

Construction

Since classes may have objects created from them, we have to have some behavior that dictates how this process occurs. For example, when we create a new Vehicle, what action is taken to do so? Or when we build a new house, what behavior is executed during this process? Even more pressing, if we construct a new house, how do we know what the address value will be? In order to configure our newly created house, we must have a constructor behavior that gets executed when the object is instantiated and initializes the values of the newly created object. For example, suppose we have the following class specification for a house:

class House:
    state:
        private String address;
    behavior:
        public String getAddress():
            return this.address

Using our class specification, we have no way of setting the address of the house when it is instantiated. For that, we would need to provide a constructor behavior that gets called when a new house is instantiated and sets the initial value of the address. For a constructor, we will use the same notation as other behaviors, except with two adjustments: (1) No return type is specified and (2) the behavior name matches the name of the class. The first adjustment is introduced because the constructor cannot return any value: It is simply called during instantiation of the object. The second adjustment is instituted because it differentiates the behavior from other general behaviors. It also allows us to instantiate objects using the following notation:

House myHouse is new House()

Essentially, we are executing a behavior that matches the class name but is not associated with an object yet (i.e. we are not executing the behavior using the form someObject.someBehavior()), because the goal of the constructor is to create the object. Just as with any other behavior, we can define parameters that can be accepted by the constructor. Using this notation, we can define a constructor for our House class.

class House:
    state:
        private String address;
    behavior:
        public House(String address):
            set this.address to address
        public String getAddress():
            return this.address

Now we are able to instantiate our House objects with different initial values:

House houseOne is new House("60 Roosevelt Place")
House houseTwo is new House("62 Roosevelt Place")

houseOne.getAddress() results in "60 Roosevelt Place"
houseTwo.getAddress() results in "62 Roosevelt Place"
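
The House notation above maps fairly directly onto Java. The following is a minimal sketch rather than a definitive implementation: the HouseDemo class is a hypothetical name introduced only to hold the main method, and marking the address field final is an added assumption, since the notation above never changes the address after construction.

```java
// Hypothetical demo class that holds the entry point for this sketch.
class HouseDemo {
    public static void main(String[] args) {
        // Each use of "new" runs the constructor and produces a separate object.
        House houseOne = new House("60 Roosevelt Place");
        House houseTwo = new House("62 Roosevelt Place");

        System.out.println(houseOne.getAddress()); // prints 60 Roosevelt Place
        System.out.println(houseTwo.getAddress()); // prints 62 Roosevelt Place
    }
}

final class House {

    // State: every House object holds its own value for this field.
    // Assumption: final, because the address is only set during construction.
    private final String address;

    // Constructor: no return type, and the name matches the class name.
    House(String address) {
        this.address = address;
    }

    // Behavior: returns the address of the object it is called on.
    String getAddress() {
        return this.address;
    }
}
```

Both objects come from the same class, yet each reports the address it was constructed with.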

This initialization logic can be extended to create objects of other classes in our constructor. For example, we could write a constructor for our Vehicle class that resembles the following:

public Vehicle(String manufacturerName, String modelName, Year productionYear, Number wheelCircumference):
    set this.manufacturerName to manufacturerName
    set this.modelName to modelName
    set this.productionYear to productionYear
    set this.wheelCircumference to wheelCircumference
    set this.engine to new Engine()
    set this.transmission to new Transmission()
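
As a rough Java sketch of the Vehicle constructor above: the Year and Number types of the notation are represented as int here, which is an assumption, and the getter names and the VehicleConstructionDemo class are hypothetical additions for illustration. The initial rpms of 0 and gear ratio of 1 are likewise assumed starting values.

```java
// Hypothetical demo class that holds the entry point for this sketch.
class VehicleConstructionDemo {
    public static void main(String[] args) {
        Vehicle first = new Vehicle("Ford", "F150", 2017, 113);
        Vehicle second = new Vehicle("Ford", "F150", 2017, 113);

        // Each constructor call created its own Engine object.
        System.out.println(first.getEngine() != second.getEngine()); // prints true
    }
}

final class Engine {
    private int rpms = 0; // assumed initial value
    int getRpms() { return rpms; }
}

final class Transmission {
    private int gearRatio = 1; // assumed initial value (a 1:1 ratio)
    int getGearRatio() { return gearRatio; }
}

final class Vehicle {
    private final String manufacturerName;
    private final String modelName;
    private final int productionYear;      // "Year" in the notation, int here
    private final int wheelCircumference;  // "Number" in the notation, int here
    private final Engine engine;
    private final Transmission transmission;

    Vehicle(String manufacturerName, String modelName,
            int productionYear, int wheelCircumference) {
        this.manufacturerName = manufacturerName;
        this.modelName = modelName;
        this.productionYear = productionYear;
        this.wheelCircumference = wheelCircumference;
        // The constructor instantiates the objects that this Vehicle refers to.
        this.engine = new Engine();
        this.transmission = new Transmission();
    }

    String getManufacturerName() { return manufacturerName; }
    String getModelName() { return modelName; }
    int getProductionYear() { return productionYear; }
    int getWheelCircumference() { return wheelCircumference; }
    Engine getEngine() { return engine; }
    Transmission getTransmission() { return transmission; }
}
```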

Although we now have an intuitive understanding of an object, and its differences with a class, we still lack enough information to provide a rich definition. For that definition, we need to introduce the detail and nuance of a real computer.

Objects in Memory

While the abstract concept of a class specification aids in our modeling of concepts and things in the real world, all computing systems must operate within the confines of processors, memory, context switches, paging, temporal limitations, and other pragmatic mechanisms. In the context of objects, this means mapping our objects into the memory of our computer system. More precisely, that means storing the state and behavior of our objects in memory and being able to access these portions of memory in a consistent manner.

Before we move into the physical location of our objects in memory, we first need to compute how much memory is required to store an object. To do this, we must return to our discussion about primitive types. Every object, no matter how complex, can eventually be decomposed into a collection of primitive types. For example, if we look at a single Vehicle object, as illustrated in the figure below, we see that eventually, we are able to decompose the Vehicle object into primitive types (stylized in blue).

Although our Vehicle object is composed of 4 primitive types and 2 references to complex (non-primitive) types, we can decompose the referenced types into primitive types. To calculate the size of a single Vehicle object, we must first assign sizes to the primitive types:

Character: a single alphanumeric character is 1 byte in length (following the ASCII standard).
String: a string is simply a sequence of characters (of 1 byte each) with an additional null character appended to the tail of the characters to denote the end of the string; therefore, a String has a variable length that is equal to the number of characters in the string, plus one (for the null character). For example, the string hello is composed of 5 characters (h, e, l, l, and o), plus the null character (written as \0), and is therefore 6 bytes long. Note that Java does not use this style to store strings, but we will use this scheme for demonstration purposes. For more on Java strings in memory, see this post.
Number: we will define a number as being 2 bytes long.
Reference: a reference to another object must be large enough to hold the address of the other object in memory (so that we can find the referenced object); for the purposes of this article, we will assume an address in memory can be stored in 2 bytes, and therefore, a reference will be 2 bytes in length.

Using these lengths, we can compute the number of bytes needed to store a Vehicle object in memory. For demonstration, let's assume that the manufacturerName for our Vehicle object is Ford, the modelName is F150, the productionYear is 2017 (written as 0x07e1 in hexadecimal), the wheelCircumference is 113 (written as 0x0071 in hex), the initial rpms of the Engine is 0 (written as 0x0000 in hex), and the initial gearRatio of the Transmission is 1 (a 1:1 gear ratio, written as 0x0001 in hex). Using these values, we can compute the size of an Engine object to be 2 bytes, the size of a Transmission object to be 2 bytes, and the size of a Vehicle object to be 18 bytes:

5 bytes for the manufacturerName
5 bytes for the modelName
2 bytes for the productionYear
2 bytes for the reference to the Engine object
2 bytes for the reference to the Transmission object
2 bytes for the wheelCircumference
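
The arithmetic above can be checked with a short Java sketch. It follows the simplified sizing scheme of this article (1 byte per character plus a null terminator, 2-byte numbers, 2-byte references), not the way Java actually lays out objects. The ObjectSizeDemo class and its method names are hypothetical, and the string sizing assumes ASCII-only text.

```java
class ObjectSizeDemo {
    // Sizes assumed by this article's simplified scheme (not Java's real layout).
    static final int NUMBER_BYTES = 2;
    static final int REFERENCE_BYTES = 2;

    // One byte per character plus one byte for the terminating null character.
    // Assumes the string contains only ASCII characters.
    static int stringBytes(String value) {
        return value.length() + 1;
    }

    // Sums the sizes of the six state entries of a Vehicle object.
    static int vehicleBytes(String manufacturerName, String modelName) {
        return stringBytes(manufacturerName)
                + stringBytes(modelName)
                + NUMBER_BYTES      // productionYear
                + REFERENCE_BYTES   // reference to the Engine object
                + REFERENCE_BYTES   // reference to the Transmission object
                + NUMBER_BYTES;     // wheelCircumference
    }

    public static void main(String[] args) {
        System.out.println(stringBytes("hello"));          // prints 6
        System.out.println(vehicleBytes("Ford", "F150"));  // prints 18

        // The hexadecimal forms of the two numeric values, padded to 2 bytes.
        System.out.println(String.format("0x%04x", 2017)); // prints 0x07e1
        System.out.println(String.format("0x%04x", 113));  // prints 0x0071
    }
}
```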

This memory scheme is illustrated in the figure below, where each block represents 1 byte, with blue representing the Vehicle object, purple representing the Engine object, and orange representing the Transmission object. (The importance of a class specification can be vividly seen when trying to find the state of an object in memory: Without knowing the types of the state entries, we would be unable to decipher which bytes represented what state in our objects, leaving us with an apparently random assortment of bytes in memory.)

It is important to note that it is not a requirement that the Vehicle, Engine, and Transmission objects be contiguously ordered in memory. In fact, they may be very far from one another in memory; this prohibits us from assuming that the object directly after the Vehicle is the Engine object and the object directly after the Engine object is the Transmission object. It is also important to note that there are two major divisions of memory in a computer: The stack and the heap. In the case of Java, all objects created in an application are placed on the heap (see Understanding Memory Management). Therefore, all objects that are illustrated in this article are assumed to be on the heap.

Using our previous calculation, we now know how much memory each one of our objects consumes, but we have to address the behavior of our objects as well. Since behavior is executable, it must exist in memory and be callable. Since the state of one object may vary from the state of any other, we have to maintain separate chunks of memory for the state of each one of our objects, but the same is not true for behavior.

In the case of behavior, each of the definitions is identical for all objects of that class, except that the definition is executing its logic on different objects. For example, if we call the getCurrentSpeed() behavior of two different Vehicle objects, the logic of the behavior is identical, save that it is being executed against two different objects. Knowing this, we can simply parameterize the behavior based on the object against which it is being executed.

For example, instead of having a getCurrentSpeed() behavior definition existing in memory for each of the Vehicle objects we have in memory, we can create one definition in memory and simply add another parameter: The object that it is being executed against. Therefore, our getCurrentSpeed() behavior can be converted (by the compiler for the programming language we are using) from its original definition of...

public Number getCurrentSpeed():
    return this.transmission.gearRatio
        multiplied by this.engine.rpms
        multiplied by this.wheelCircumference

...to the following definition:

public Number getCurrentSpeed(Vehicle vehicle):
    return vehicle.transmission.gearRatio
        multiplied by vehicle.engine.rpms
        multiplied by vehicle.wheelCircumference

In essence, we simply replace the implicit this reference with a reference to an actual Vehicle object (this concept is reiterated in the following section). For example, a call to myVehicle.getCurrentSpeed() for some myVehicle object of type Vehicle can be converted (by the compiler) to getCurrentSpeed(myVehicle). If we made a call to getCurrentSpeed() on another object of type Vehicle named someOtherVehicle, the call would likewise be getCurrentSpeed(someOtherVehicle). Notice that we can reuse the same behavior definition by simply supplying the object on which it is acting as a parameter.

This is also true for behaviors that have existing parameters: The compiler simply adds the object as another parameter to the existing parameters. Note that all of this translation and introduction of additional parameters is done behind the scenes. As developers, we simply write our behavior definitions as normal and the compiler performs these optimizations without our involvement. Using this optimization, we are able to reuse the behavior definitions for any number of objects in memory.
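
To make the translation concrete, the sketch below writes both forms by hand in Java: an ordinary instance method that uses the implicit this, and a static method that takes the object as an explicit first parameter. This only illustrates the idea and is not what a Java compiler literally emits. The Vehicle here is simplified (gearRatio and rpms are stored directly on the Vehicle rather than on separate Transmission and Engine objects), and setRpms and ExplicitReceiverDemo are hypothetical names.

```java
// Hypothetical demo class that holds the entry point for this sketch.
class ExplicitReceiverDemo {
    public static void main(String[] args) {
        Vehicle hotRod = new Vehicle(113);
        Vehicle economyCar = new Vehicle(113);

        hotRod.setRpms(10);               // implicit receiver: this is hotRod
        Vehicle.setRpms(economyCar, 20);  // explicit receiver passed as a parameter

        System.out.println(hotRod.getCurrentSpeed());            // prints 1130
        System.out.println(Vehicle.getCurrentSpeed(hotRod));     // prints 1130
        System.out.println(Vehicle.getCurrentSpeed(economyCar)); // prints 2260
    }
}

final class Vehicle {
    // Simplification: state is flattened onto Vehicle for brevity.
    private final int gearRatio = 1;
    private int rpms = 0;
    private final int wheelCircumference;

    Vehicle(int wheelCircumference) {
        this.wheelCircumference = wheelCircumference;
    }

    // Original form: the object being acted on is the implicit "this".
    int getCurrentSpeed() {
        return this.gearRatio * this.rpms * this.wheelCircumference;
    }

    // Translated form: one shared definition, with the object as a parameter.
    static int getCurrentSpeed(Vehicle vehicle) {
        return vehicle.gearRatio * vehicle.rpms * vehicle.wheelCircumference;
    }

    // A behavior with an existing parameter...
    void setRpms(int rpms) {
        this.rpms = rpms;
    }

    // ...gains the object as one additional parameter when translated.
    static void setRpms(Vehicle vehicle, int rpms) {
        vehicle.rpms = rpms;
    }
}
```

The JVM specification describes a comparable convention: when an instance method is invoked, local variable 0 holds the reference to the object on which the method was invoked.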

For example, if we instantiated two objects of type Vehicle and added them to memory (along with their Engine and Transmission objects, respectively), we would have 6 total objects in memory: 2 Vehicle objects, 2 Engine objects, and 2 Transmission objects. By reusing our Vehicle behavior definitions, we only need one chunk of memory for the definitions, even though there are two Vehicle objects in memory, as illustrated in the figure below:

It is important to note that the Java Virtual Machine (JVM) standard does not require objects to be structured in a specific way in memory (see Section 2.7 of the JVM Standard); therefore, the above illustrations describe a general scheme used by many languages and many compilers. In actuality, there are other bytes required for a Java object residing in memory in order to perform garbage collection and other overhead tasks, but the general (conceptual) techniques described in this section are still used in Java.

With these techniques understood, we can, at last, provide a concise definition of an object:

An object is a runtime instance of a class

By runtime, we mean that an object is an instance of a class existing in memory during the execution of a program. While a class is a simple specification that describes an entity or thing, an object is an instance of that specification existing in memory.

With a sound understanding of both classes and objects, we can explore the history of these concepts, which will shed some light on the context and origination of object-oriented programming.

Where Did Classes and Objects Come From?

The concept of a class is a natural extension of a former leading programming paradigm: Procedural Programming. Prior to the introduction of languages such as Java, C++, Python, and the other ubiquitous object-oriented languages, procedural languages (such as Dennis Ritchie's C Programming Language) were the de facto standard for software. These languages had two major concepts: (1) Aggregates of different data variables named structures and (2) functions.

Simply put, a structure is an entity composed of arbitrary primitive data members or other structures. For example, if we wanted to define a mailing address in C, we could create the following data structure:

struct Address {
    char addressee[128];
    int houseNumber;
    char streetName[128];
    char townName[128];
    char state[3];
    int zipCode;
};

A function, on the other hand, performs some action on either primitive data or a structure. For example, if we wanted to calculate the distance between two addresses, we could create a function that defines that behavior:

double distanceToInMiles(struct Address from, struct Address to) {
    // Perform calculation and return result in miles (placeholder value shown here)
    return 0.0;
}

If we then wanted to format the address into a standard US format, we could create another function to perform this action:

void printUsFormattedAddress(struct Address address) {
    // Print the address in US standard format
}

A pattern starts to present itself as more functions are created: The first argument of each function accepts an Address structure and performs some action on this provided structure using the additional arguments (if provided). In essence, there is almost an affinity between these functions and the first argument. For example, if we put these function declarations near the structure declaration, we see that they have a natural attraction:

struct Address {
    char addressee[128];
    int houseNumber;
    char streetName[128];
    char townName[128];
    char state[3];
    int zipCode;
};

double distanceToInMiles(struct Address from, struct Address to);
void printUsFormattedAddress(struct Address address);

This is where the change in paradigms originates: Instead of creating structures and functions that accept those structures, what if we created structures that had functions associated with them? For example, instead of creating standalone functions, what if we could declare functions that were associated with our structures (note that this is not valid C):

struct Address {
    char addressee[128];
    int houseNumber;
    char streetName[128];
    char townName[128];
    char state[3];
    int zipCode;

    double distanceToInMiles(struct Address to);
    void printUsFormattedAddress();
};

Following this thought process, we could then call our functions qualified with instantiated structures:

struct Address startingAddress;
struct Address destinationAddress;

// Obtain the distance between the two addresses
double distanceInMiles = startingAddress.distanceToInMiles(destinationAddress);

// Print the US formatted address
startingAddress.printUsFormattedAddress();

Our hypothetical model can be bridged with actual C code by having the above snippet compiled into the following equivalent code:

struct Address startingAddress;
struct Address destinationAddress;

// Obtain the distance between the two addresses
double distanceInMiles = distanceToInMiles(startingAddress, destinationAddress);

// Print the US formatted address
printUsFormattedAddress(startingAddress);

In the early days of object-oriented programming, there were compilers, such as Cfront, that simply converted (or translated) object-oriented C++ code into procedural C. Additionally, some languages, such as Python, make this translation explicit: In order to declare a behavior associated with a class, Python requires that the first parameter in the behavior declaration be a reference to the current object. For example, a simple class is declared as follows:

class Address(object):

    def distance_to_in_miles(self, to):
        # Calculate some distance and return it (placeholder value shown here)
        return 0.0


starting_address = Address()
destination_address = Address()

distance_in_miles = starting_address.distance_to_in_miles(destination_address)

When the distance_to_in_miles method is called, the Python interpreter automatically binds the starting_address object to self and destination_address to to. In essence, Python performs the same translation from object-oriented to procedural code that early C++ compilers such as Cfront performed, but instead requires the developer to explicitly declare the first parameter of a behavior to be a reference to the object being operated on (read as "this" object or "self"). Note that the first parameter is not required to be named self (any valid Python parameter name will suffice), but this name is a de facto standard for Python.
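
Java keeps the receiver implicit in method declarations, but an unbound method reference exposes the same idea: the object being operated on becomes the first argument of the resulting function. The sketch below is illustrative only. The mileMarker field and the one-dimensional distance calculation are hypothetical stand-ins for a real address and a real distance computation, and MethodReferenceDemo is a hypothetical name.

```java
import java.util.function.BiFunction;

// Hypothetical demo class that holds the entry point for this sketch.
class MethodReferenceDemo {
    public static void main(String[] args) {
        Address startingAddress = new Address(12.0);
        Address destinationAddress = new Address(40.5);

        // Object-oriented form: the receiver appears before the dot.
        System.out.println(
                startingAddress.distanceToInMiles(destinationAddress)); // prints 28.5

        // Function form: the receiver is supplied as the first argument.
        BiFunction<Address, Address, Double> distanceToInMiles =
                Address::distanceToInMiles;
        System.out.println(
                distanceToInMiles.apply(startingAddress, destinationAddress)); // prints 28.5
    }
}

final class Address {
    // Hypothetical stand-in for a location: a position along a single road.
    private final double mileMarker;

    Address(double mileMarker) {
        this.mileMarker = mileMarker;
    }

    // Placeholder calculation: the distance between two positions on one road.
    double distanceToInMiles(Address to) {
        return Math.abs(this.mileMarker - to.mileMarker);
    }
}
```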

Conclusion

In this article, we explored the need for objects in software engineering and the details of how they work, as well as the rationale behind their introduction. With our accumulated knowledge of both classes and objects, we are now ready to explore how these concepts are applied to the Java Programming Language. In the next article in this series, we will deep-dive into the details of how to create classes in Java and how these classes allow us to create executable code using the Java Virtual Machine (JVM).
