- Step 0: Total transparency; no reuse potential
- Step 1: Forward declarations
- Step 2: Header files
- Step 3: Perfect information hiding
A colleague recently told me that C was akin to camping. It can be fun once in a while, but you don’t want to spend your life camping, do you?
While I tend to agree with that statement, it should be noted that if you go camping for a few days, you might learn a thing or two about life, tying knots, preparing a meal, or surviving in the wilderness with minimal gear. You know, things that could prove useful later on, once you’re back from camping.
The same goes for C. Playing with it a lot recently, I was amazed to see how the language helps you clarify some basic concepts that you can more or less avoid understanding indefinitely if you stick to higher-level languages. C actually forces you to understand, or you get nowhere.
Now C is not labeled as an object-oriented language. But does that prevent one from using the OOP paradigm in C?
This article is going to take you through a series of steps leading to the introduction of the dot notation and true information hiding in C. This is done for illustrative purposes only, not for efficiency. Still, the exercise is highly useful, because you can’t implement good syntactic sugar without gaining a few bits of knowledge about what you are coating.
Step 0: Total transparency; no reuse potential
I am going to evolve an IntStack class through all the steps. Here’s the class at the start, along with an example usage.
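The listing below is a minimal sketch of such a starting point, not a definitive implementation. IntStack, struct int_node, struct int_stack and the top member are the names used throughout this article; the IntNode typedef, the node members (value, next), the push_stack and pop_stack helpers, and the client function standing in for main are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* A node of the underlying linked list (member names are assumptions). */
typedef struct int_node {
    int value;
    struct int_node *next;
} IntNode;

/* The stack itself: a pointer to the head of the list. */
typedef struct int_stack {
    IntNode *top;
} IntStack;

/* Hypothetical helper: put a value on top of the stack. */
void push_stack(IntStack *stack, int value)
{
    IntNode *node = malloc(sizeof *node);
    if (node == NULL)
        abort();                 /* out of memory: give up in this sketch */
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

/* Hypothetical helper: remove the top value and return it. */
int pop_stack(IntStack *stack)
{
    IntNode *node = stack->top;
    int value;

    assert(node != NULL);        /* popping an empty stack is a bug */
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}

/* The client. In a real program this would be main; here it returns the
   sum of the stacked values so the result is easy to check. */
int client(void)
{
    IntStack stack = { NULL };   /* the client initialises the struct itself */
    int sum = 0;

    push_stack(&stack, 1);
    push_stack(&stack, 2);
    push_stack(&stack, 3);

    while (stack.top != NULL) {  /* the client inspects the top member */
        sum += stack.top->value; /* ...and reads the list node directly */
        pop_stack(&stack);
    }
    return sum;
}
```

Called from main, client() returns 6 (1 + 2 + 3).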
What do we notice here? Well, everything is defined before it’s used. We are actually forced into that order here, because declarations and definitions are kept together (more on that shortly).
This presentation style means the code is laid out from the bottom up, or from lower levels of abstraction to higher levels. This is not ideal from a communication perspective. (Think about how a newspaper presents information: the headline comes first, the details later.)
More importantly, everything is exposed to the client (the main function, in this case): the fact that the stack is implemented with a linked list, the list nodes’ composition, and the composition of the IntStack data structure (a pointer to the head of the underlying linked list). The client has complete knowledge of the internal representation of the stack, and it uses that knowledge, thereby creating a tight coupling between itself and the IntStack type. In this scenario, any change to IntStack’s implementation (including its internal representation) will affect its clients.
Quick note before we move on: I almost always typedef my structs, but you should be warned that this practice is controversial.
Step 1: Separate declarations and definitions, using forward declarations
The least we can do at this stage is to separate declarations from definitions.
A declaration introduces an identifier, enabling the compiler to accept future usages of that identifier. A definition, on the other hand, implements the identifier, meaning:
- it causes storage to be reserved if the identifier refers to an object;
- it specifies the function body if the identifier refers to a function.
It should be noted that a definition always contains a declaration, but the converse is false. A declaration given without a definition is called a forward declaration. Let’s take a look at the revised code, after the introduction of forward declarations.
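As a sketch of what that can look like: the prototypes go first, the client follows, and the function bodies come last. The node members (value, next), the push_stack and pop_stack helpers, and the client function standing in for main are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Declarations ---- */

typedef struct int_node {
    int value;
    struct int_node *next;
} IntNode;

typedef struct int_stack {
    IntNode *top;
} IntStack;

/* Forward declarations: prototypes only, the bodies come later. */
void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);

/* ---- Client: it can now come first, top-down ---- */

int client(void)
{
    IntStack stack = { NULL };
    int sum = 0;

    push_stack(&stack, 1);       /* accepted thanks to the prototype above */
    push_stack(&stack, 2);
    push_stack(&stack, 3);

    while (stack.top != NULL) {
        sum += stack.top->value;
        pop_stack(&stack);
    }
    return sum;
}

/* ---- Definitions ---- */

void push_stack(IntStack *stack, int value)
{
    IntNode *node = malloc(sizeof *node);
    if (node == NULL)
        abort();                 /* out of memory: give up in this sketch */
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    IntNode *node = stack->top;
    int value;

    assert(node != NULL);        /* popping an empty stack is a bug */
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}
```

The behaviour is unchanged; only the reading order is different.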
While forward declarations make our code easier to read, it might seem that they fail to bring us closer to our goal of introducing dot notation and information hiding. But au contraire, they are essential in separating interface and implementation. As you shall see, most of the remaining steps rely on forward declarations.
Step 2: Introducing header files
Now that we have forward declarations, we can use a clean mechanism allowing us to share the IntStack type across any number of source files. Any source file that needs an integer stack and the related functions can simply specify a preprocessor directive to include the int_stack.h header file.
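Here is one possible layout, sketched as a single listing with one section per file so that it stays self-contained. Only int_stack.h and int_stack.c are file names used in this article; the client section, the INT_STACK_H guard macro, the node members and the push_stack and pop_stack helpers are assumptions. In a real project, each .c file would start with #include "int_stack.h" rather than share one listing.

```c
/* ===== int_stack.h ===== */
#ifndef INT_STACK_H              /* include guard (macro name is an assumption) */
#define INT_STACK_H

typedef struct int_node {
    int value;
    struct int_node *next;
} IntNode;

typedef struct int_stack {
    IntNode *top;
} IntStack;

void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);

#endif /* INT_STACK_H */

/* ===== client source file (would #include "int_stack.h") ===== */
#include <stddef.h>

int client(void)
{
    IntStack stack = { NULL };
    int sum = 0;

    push_stack(&stack, 1);
    push_stack(&stack, 2);
    push_stack(&stack, 3);

    while (stack.top != NULL) {
        sum += stack.top->value;
        pop_stack(&stack);
    }
    return sum;
}

/* ===== int_stack.c (would #include "int_stack.h") ===== */
#include <assert.h>
#include <stdlib.h>

void push_stack(IntStack *stack, int value)
{
    IntNode *node = malloc(sizeof *node);
    if (node == NULL)
        abort();                 /* out of memory: give up in this sketch */
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    IntNode *node = stack->top;
    int value;

    assert(node != NULL);        /* popping an empty stack is a bug */
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}
```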
Step 3: On our way to true information hiding
Now we are armed to eliminate the coupling between IntStack’s clients and its implementation. But how should we proceed?
The first logical step is to rewrite the client in a way that it does not need to access the stack’s members. Let’s start by creating a peek function.
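A minimal sketch of that change, with one section per file in a single listing. peek_stack is the function named in this article, but its exact signature is my assumption, as are the node members, the push_stack and pop_stack helpers, and the client function standing in for main.

```c
/* ===== int_stack.h ===== */
#ifndef INT_STACK_H
#define INT_STACK_H

typedef struct int_node {
    int value;
    struct int_node *next;
} IntNode;

typedef struct int_stack {
    IntNode *top;
} IntStack;

void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);
int peek_stack(const IntStack *stack);   /* new: read the top without popping */

#endif /* INT_STACK_H */

/* ===== client ===== */
#include <stddef.h>

int client(void)
{
    IntStack stack = { NULL };
    int sum = 0;

    push_stack(&stack, 1);
    push_stack(&stack, 2);
    push_stack(&stack, 3);

    while (stack.top != NULL) {
        sum += peek_stack(&stack);       /* replaces stack.top->value */
        pop_stack(&stack);
    }
    return sum;
}

/* ===== int_stack.c ===== */
#include <assert.h>
#include <stdlib.h>

int peek_stack(const IntStack *stack)
{
    assert(stack->top != NULL);          /* peeking at an empty stack is a bug */
    return stack->top->value;
}

void push_stack(IntStack *stack, int value)
{
    IntNode *node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    IntNode *node = stack->top;
    int value;

    assert(node != NULL);
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}
```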
By introducing peek_stack, do you notice how the client is now decoupled from struct int_node’s definition? That’s a great thing, because we can now hide it. We simply move the definition from the header file to int_stack.c. Now not only can the client avoid accessing struct int_node’s members, but it is also prevented from doing so. See what happens if you try to replace the invocation of peek_stack with the function’s implementation:
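A sketch of that experiment, again with one section per file in a single listing, so that the client only sees what the header would give it. PEEK_INSIDE_NODE is a hypothetical macro I use to keep the offending line switched off by default; the node members and the push_stack and pop_stack helpers are assumptions too.

```c
/* ===== int_stack.h ===== */
#ifndef INT_STACK_H
#define INT_STACK_H

struct int_node;                 /* declaration only: the type is incomplete */

typedef struct int_stack {
    struct int_node *top;
} IntStack;

void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);
int peek_stack(const IntStack *stack);

#endif /* INT_STACK_H */

/* ===== client ===== */
#include <stddef.h>

int client(void)
{
    IntStack stack = { NULL };
    int sum = 0;

    push_stack(&stack, 1);
    push_stack(&stack, 2);
    push_stack(&stack, 3);

    while (stack.top != NULL) {
#ifdef PEEK_INSIDE_NODE
        sum += stack.top->value; /* rejected: struct int_node is incomplete here */
#else
        sum += peek_stack(&stack);
#endif
        pop_stack(&stack);
    }
    return sum;
}

/* ===== int_stack.c ===== */
#include <assert.h>
#include <stdlib.h>

struct int_node {                /* the definition now lives here, out of sight */
    int value;
    struct int_node *next;
};

int peek_stack(const IntStack *stack)
{
    assert(stack->top != NULL);
    return stack->top->value;
}

void push_stack(IntStack *stack, int value)
{
    struct int_node *node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    struct int_node *node = stack->top;
    int value;

    assert(node != NULL);
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}
```

Compiled as is, the listing builds and client() returns 6. Compiled with -DPEEK_INSIDE_NODE, the marked line should be rejected with an error about an incomplete type (the exact wording depends on the compiler).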
We were actually able to cause information hiding to be enforced by the compiler. Great!
Similarly, we can introduce an is_empty_stack function:
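A possible version is sketched below. is_empty_stack is the name used in this article; its bool return type and signature are my assumptions, as are the node members and the push_stack and pop_stack helpers.

```c
/* ===== int_stack.h ===== */
#ifndef INT_STACK_H
#define INT_STACK_H

#include <stdbool.h>

struct int_node;                 /* still hidden from clients */

typedef struct int_stack {
    struct int_node *top;
} IntStack;

void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);
int peek_stack(const IntStack *stack);
bool is_empty_stack(const IntStack *stack);   /* new */

#endif /* INT_STACK_H */

/* ===== client ===== */
#include <stddef.h>

int client(void)
{
    IntStack stack = { NULL };   /* the one place that still relies on top */
    int sum = 0;

    push_stack(&stack, 1);
    push_stack(&stack, 2);
    push_stack(&stack, 3);

    while (!is_empty_stack(&stack)) {    /* replaces stack.top != NULL */
        sum += peek_stack(&stack);
        pop_stack(&stack);
    }
    return sum;
}

/* ===== int_stack.c ===== */
#include <assert.h>
#include <stdlib.h>

struct int_node {
    int value;
    struct int_node *next;
};

bool is_empty_stack(const IntStack *stack)
{
    return stack->top == NULL;
}

int peek_stack(const IntStack *stack)
{
    assert(!is_empty_stack(stack));
    return stack->top->value;
}

void push_stack(IntStack *stack, int value)
{
    struct int_node *node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    struct int_node *node = stack->top;
    int value;

    assert(node != NULL);
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}
```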
Note that the client still knows about struct int_stack’s definition (it is placed in int_stack.h). What if we also wanted to hide that definition?
All we’d need to do is provide a way for the client to avoid referring to IntStack’s top member. And we can achieve that by providing a constructor function.
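A minimal sketch of such a constructor follows. The name new_stack is hypothetical, as are the node members and the push_stack and pop_stack helpers. Note that in this version the client releases the stack with free once it has emptied it.

```c
/* ===== int_stack.h ===== */
#ifndef INT_STACK_H
#define INT_STACK_H

#include <stdbool.h>

typedef struct int_stack IntStack;   /* declaration only: no members visible */

IntStack *new_stack(void);           /* constructor (hypothetical name) */
void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);
int peek_stack(const IntStack *stack);
bool is_empty_stack(const IntStack *stack);

#endif /* INT_STACK_H */

/* ===== client ===== */
#include <stdlib.h>

int client(void)
{
    IntStack *stack = new_stack();   /* no knowledge of top required */
    int sum = 0;

    push_stack(stack, 1);
    push_stack(stack, 2);
    push_stack(stack, 3);

    while (!is_empty_stack(stack)) {
        sum += peek_stack(stack);
        pop_stack(stack);
    }
    free(stack);                     /* the client frees the (now empty) stack */
    return sum;
}

/* ===== int_stack.c ===== */
#include <assert.h>

struct int_node {
    int value;
    struct int_node *next;
};

struct int_stack {                   /* the definition moved here */
    struct int_node *top;
};

IntStack *new_stack(void)
{
    IntStack *stack = malloc(sizeof *stack);
    if (stack == NULL)
        abort();
    stack->top = NULL;
    return stack;
}

bool is_empty_stack(const IntStack *stack)
{
    return stack->top == NULL;
}

int peek_stack(const IntStack *stack)
{
    assert(!is_empty_stack(stack));
    return stack->top->value;
}

void push_stack(IntStack *stack, int value)
{
    struct int_node *node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    struct int_node *node = stack->top;
    int value;

    assert(node != NULL);
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}
```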
Notice what also happened here? We separated struct int_stack’s declaration from its definition, and moved the latter to the implementation file, thereby making it inaccessible to IntStack’s clients.
Now can you guess what happens if the client tries to access IntStack’s top member?
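To try it, here is a compact sketch with only the pieces needed for the experiment. new_stack is a hypothetical constructor name, and TOUCH_TOP_MEMBER is a hypothetical macro that keeps the offending line switched off by default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* ===== int_stack.h ===== */
typedef struct int_stack IntStack;   /* declaration only */

IntStack *new_stack(void);
bool is_empty_stack(const IntStack *stack);

/* ===== client ===== */
bool client(void)
{
    IntStack *stack = new_stack();
    bool empty;

#ifdef TOUCH_TOP_MEMBER
    empty = (stack->top == NULL);    /* IntStack is an incomplete type here */
#else
    empty = is_empty_stack(stack);   /* the only sanctioned way to ask */
#endif
    free(stack);
    return empty;
}

/* ===== int_stack.c ===== */
struct int_node {
    int value;
    struct int_node *next;
};

struct int_stack {
    struct int_node *top;
};

IntStack *new_stack(void)
{
    IntStack *stack = malloc(sizeof *stack);
    if (stack == NULL)
        abort();
    stack->top = NULL;
    return stack;
}

bool is_empty_stack(const IntStack *stack)
{
    return stack->top == NULL;
}
```

Build it with -DTOUCH_TOP_MEMBER to see the compiler’s reaction to the marked line.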
That’s right, another compiler error. Perfect!
At this point, it would appear that we have achieved perfect information hiding. On the surface, the client knows nothing about IntStack’s implementation.
But is that really so? Is the client totally immune to implementation changes?
Not quite yet. We have delegated object construction to the implementation file, but the client still needs to free the memory allocated to it. OK, but how does that go against information hiding? After all, at this point, all the client knows about the type is its declaration.
Well, consider the following. Right now, the constructor only allocates memory for one instance of the IntStack type. But what if there were an implementation change down the road that caused the constructor to allocate more memory, perhaps because there is a need to maintain a larger state internally?
Then all the clients would have to know about the change, or there would be memory leaks. So the clients are still partly coupled to the implementation. Fortunately, we can fix that by delegating the destruction of the object to the implementation file, just like we did for the construction.
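A sketch of such a destructor is shown below. The names new_stack and delete_stack are hypothetical, as are the node members and the push_stack and pop_stack helpers. Here the destructor also releases any nodes still on the stack, so the client no longer has to empty it first.

```c
/* ===== int_stack.h ===== */
#ifndef INT_STACK_H
#define INT_STACK_H

#include <stdbool.h>

typedef struct int_stack IntStack;

IntStack *new_stack(void);
void delete_stack(IntStack *stack);  /* destructor (hypothetical name) */
void push_stack(IntStack *stack, int value);
int pop_stack(IntStack *stack);
int peek_stack(const IntStack *stack);
bool is_empty_stack(const IntStack *stack);

#endif /* INT_STACK_H */

/* ===== client ===== */
int client(void)
{
    IntStack *stack = new_stack();
    int seen;

    push_stack(stack, 1);
    push_stack(stack, 2);
    push_stack(stack, 3);
    pop_stack(stack);                /* discards 3 */
    seen = peek_stack(stack);        /* 2 is now on top */

    delete_stack(stack);             /* no free() and no cleanup loop here */
    return seen;
}

/* ===== int_stack.c ===== */
#include <assert.h>
#include <stdlib.h>

struct int_node {
    int value;
    struct int_node *next;
};

struct int_stack {
    struct int_node *top;
};

IntStack *new_stack(void)
{
    IntStack *stack = malloc(sizeof *stack);
    if (stack == NULL)
        abort();
    stack->top = NULL;
    return stack;
}

bool is_empty_stack(const IntStack *stack)
{
    return stack->top == NULL;
}

int peek_stack(const IntStack *stack)
{
    assert(!is_empty_stack(stack));
    return stack->top->value;
}

void push_stack(IntStack *stack, int value)
{
    struct int_node *node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->value = value;
    node->next = stack->top;
    stack->top = node;
}

int pop_stack(IntStack *stack)
{
    struct int_node *node = stack->top;
    int value;

    assert(node != NULL);
    value = node->value;
    stack->top = node->next;
    free(node);
    return value;
}

void delete_stack(IntStack *stack)
{
    while (!is_empty_stack(stack))   /* release whatever is still stacked */
        pop_stack(stack);
    free(stack);                     /* then the stack object itself */
}
```

If the constructor ever allocates more than one block, only delete_stack has to change.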
At this point, the clients are only able to use the IntStack type through its interface. No implementation details are leaking. We have therefore achieved perfect information hiding.
This is great, but we can’t say it’s sweet yet.
In the second part of this piece, we’ll introduce some syntactic sugar known as the dot notation (even though our dot notation will actually use arrows). Specifically, we will bind our pseudo-methods to our objects (the IntStack instances), like so:
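What follows is only a rough sketch of the calling style, under my own assumptions: the pseudo-methods are stored as function pointers in the struct, and the names new_stack, delete_stack, push, pop and is_empty are all hypothetical. The struct layout is visible to the client in this sketch, so it makes no attempt to preserve the information hiding achieved above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct int_stack IntStack;

struct int_node {
    int value;
    struct int_node *next;
};

/* The object carries pointers to its own pseudo-methods. */
struct int_stack {
    void (*push)(IntStack *self, int value);
    int (*pop)(IntStack *self);
    bool (*is_empty)(const IntStack *self);
    struct int_node *top;
};

static void push(IntStack *self, int value)
{
    struct int_node *node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->value = value;
    node->next = self->top;
    self->top = node;
}

static int pop(IntStack *self)
{
    struct int_node *node = self->top;
    int value;

    assert(node != NULL);
    value = node->value;
    self->top = node->next;
    free(node);
    return value;
}

static bool is_empty(const IntStack *self)
{
    return self->top == NULL;
}

IntStack *new_stack(void)
{
    IntStack *stack = malloc(sizeof *stack);
    if (stack == NULL)
        abort();
    stack->push = push;              /* bind the pseudo-methods to the object */
    stack->pop = pop;
    stack->is_empty = is_empty;
    stack->top = NULL;
    return stack;
}

void delete_stack(IntStack *stack)
{
    while (!stack->is_empty(stack))
        stack->pop(stack);
    free(stack);
}

int client(void)
{
    IntStack *stack = new_stack();
    int sum = 0;

    stack->push(stack, 1);           /* arrow notation, object passed as self */
    stack->push(stack, 2);
    while (!stack->is_empty(stack))
        sum += stack->pop(stack);

    delete_stack(stack);
    return sum;
}
```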
That way, we’ll be able to rid all the function names of their “_stack” suffixes.