Saturday, January 3, 2015

Liskov Substitution Principle (LSP)


The fourth article in the SOLID Principles series describes the Liskov Substitution Principle (LSP). The LSP specifies that functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.


The Principle

The Liskov Substitution Principle (LSP) can be worded in various ways. The original wording was described by Barbara Liskov as, "If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behaviour of P is unchanged when o1 is substituted for o2 then S is a subtype of T". Robert Cecil Martin's simpler version is, "Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it". For languages such as C#, this can be changed to "Code that uses a base class must be able to substitute a subclass without knowing it".

Case and Implementations 

The LSP applies to inheritance hierarchies. It specifies that you should design your classes so that client dependencies can be substituted with subclasses without the client knowing about the change. All subclasses must, therefore, operate in the same manner as their base classes. The specific functionality of the subclass may be different but must conform to the expected behaviour of the base class. To be a true behavioural subtype, the subclass must not only implement the base class' methods and properties but also conform to its implied behaviour. This requires compliance with several rules.
The first rule is that there should be contravariance between parameters of the base class' methods and the matching parameters in subclasses. This means that the parameters in subclasses must either be the same types as those in the base class or must be less restrictive. Similarly, there must be covariance between method return values in the base class and its subclasses. This specifies that the subclass' return types must be the same as, or more restrictive than, the base class' return types.
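
C# overrides must repeat the base method's parameter types exactly, so both directions of variance are easiest to see with delegates. The following is a minimal sketch rather than a definitive pattern; the VarianceDemo, Describe and Use names are hypothetical and do not appear elsewhere in this article.

```csharp
using System;

public static class VarianceDemo
{
    // Accepts any object (a less restrictive parameter)
    // and returns a string (a more restrictive result).
    public static string Describe(object value)
    {
        return "value: " + value;
    }

    // The client is written against the "base" signature:
    // it passes a string and expects an object back.
    public static object Use(Func<string, object> formatter)
    {
        return formatter("abc");
    }

    public static object Run()
    {
        // Func<in T, out TResult> is contravariant in T and covariant in TResult,
        // so a Func<object, string> can stand in for a Func<string, object>.
        Func<object, string> describe = Describe;
        return Use(describe);
    }
}
```

The client in Use never learns that it was handed a more general function, which is the substitution the rule is meant to guarantee.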
The next rule concerns preconditions and postconditions. A precondition of a class is a rule that must be in place before an action can be taken. For example, before calling a method that reads from a database you may need to satisfy the precondition that the database connection is open. 

Postconditions describe the state of objects after a process is completed. For example, it may be assumed that the database connection is closed after executing a SQL statement. The LSP states that the preconditions of a base class must not be strengthened by a subclass and that postconditions cannot be weakened in subclasses.
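
A strengthened precondition can be sketched as follows. The Account and LimitedAccount classes are hypothetical examples invented for this illustration, and the limit of 100 is an arbitrary assumption.

```csharp
using System;

public class Account
{
    public decimal Balance { get; protected set; }

    public Account(decimal openingBalance)
    {
        Balance = openingBalance;
    }

    // Precondition: amount is positive and no greater than Balance.
    // Postcondition: Balance has been reduced by exactly amount.
    public virtual void Withdraw(decimal amount)
    {
        if (amount <= 0 || amount > Balance)
            throw new ArgumentOutOfRangeException("amount");
        Balance -= amount;
    }
}

// Violates the LSP: the precondition is strengthened, because calls that
// the base class accepts (amounts over 100) are now rejected.
public class LimitedAccount : Account
{
    public LimitedAccount(decimal openingBalance) : base(openingBalance) { }

    public override void Withdraw(decimal amount)
    {
        if (amount > 100)
            throw new ArgumentOutOfRangeException("amount");
        base.Withdraw(amount);
    }
}
```

A client that holds an Account reference and withdraws 200 from a balance of 500 succeeds with the base class but fails when handed a LimitedAccount, even though it satisfied every precondition it knew about.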

Next the LSP considers invariants. An invariant describes a condition of a process that is true before the process begins and remains true afterwards. For example, a class may include a method that reads text from a file. If the method handles the opening and closing of the file, an invariant may be that the file is not open before the call or afterwards. To comply with the LSP, the invariants of a base class must not be changed by a subclass.

The next rule is the history constraint. By their nature, subclasses include all of the methods and properties of their superclasses. They may also add further members. The history constraint says that new or modified members should not modify the state of an object in a manner that would not be permitted by the base class. For example, if the base class represents an object with a fixed size, the subclass should not permit this size to be modified.
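
The fixed-size example might look like the sketch below. FixedBuffer and GrowableBuffer are hypothetical names chosen for illustration only.

```csharp
public class FixedBuffer
{
    // The base class sets Size once, in the constructor, and offers
    // no public member that changes it afterwards.
    public int Size { get; protected set; }

    public FixedBuffer(int size)
    {
        Size = size;
    }
}

// Violates the history constraint: the new Grow method lets callers change
// state that clients of FixedBuffer are entitled to treat as constant.
public class GrowableBuffer : FixedBuffer
{
    public GrowableBuffer(int size) : base(size) { }

    public void Grow(int extra)
    {
        Size += extra;
    }
}
```

Code that reads Size once through a FixedBuffer reference and caches it can now hold a stale value, which could never happen with the base class alone.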

The final LSP rule specifies that a subclass should not throw exceptions that are not thrown by the base class unless they are subtypes of exceptions that may be thrown by the base class.
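
As a rough illustration of the exception rule, the sketch below uses hypothetical DataReader, StrictReader and ReaderClient classes. It assumes the base class documents IOException as the only exception its Read method may throw.

```csharp
using System;
using System.IO;

public class DataReader
{
    // Assumed contract: may throw IOException.
    public virtual string Read(string path)
    {
        if (string.IsNullOrEmpty(path))
            throw new IOException("No path supplied.");
        return "data from " + path;
    }
}

// Complies with the rule: FileNotFoundException derives from IOException,
// so existing catch (IOException) blocks still handle it.
public class StrictReader : DataReader
{
    public override string Read(string path)
    {
        if (path == "missing.txt")
            throw new FileNotFoundException("Not found.", path);
        return base.Read(path);
    }
}

public static class ReaderClient
{
    // Written against DataReader only; it knows nothing about subclasses.
    public static string SafeRead(DataReader reader, string path)
    {
        try
        {
            return reader.Read(path);
        }
        catch (IOException)
        {
            return "unavailable";
        }
    }
}
```

Had StrictReader thrown an unrelated type such as InvalidOperationException, the exception would have escaped SafeRead and surprised the client.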
Most of the above rules describe behaviour rather than method signatures, so they cannot be fully enforced by the compiler or by object-oriented programming languages. Instead, you must carefully consider the design of class hierarchies and of types that may be subclassed in the future. Failing to do so risks the creation of subclasses that break rules and create bugs in types that are dependent upon them.

One common indication of non-compliance with the LSP is when a client class checks the type of its dependencies. This may be by reading a property of an object that artificially describes its type or by using reflection to obtain the type. Often a switch statement will be used to perform a different action according to the type of the dependency. This additional complexity also violates the Open / Closed Principle (OCP), as the client class will need to be modified as further subclasses are introduced.
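
The type-checking smell and its polymorphic alternative can be sketched side by side. The Shape hierarchy, the ShapeKind enumeration and the AreaCalculator class are all hypothetical and exist only for this illustration.

```csharp
using System;

public enum ShapeKind { Square, Circle }

public abstract class Shape
{
    // A property that artificially describes the type (the smell).
    public abstract ShapeKind Kind { get; }

    // The polymorphic alternative.
    public abstract double Area();
}

public class Square : Shape
{
    public double Side { get; set; }
    public override ShapeKind Kind { get { return ShapeKind.Square; } }
    public override double Area() { return Side * Side; }
}

public class Circle : Shape
{
    public double Radius { get; set; }
    public override ShapeKind Kind { get { return ShapeKind.Circle; } }
    public override double Area() { return Math.PI * Radius * Radius; }
}

public static class AreaCalculator
{
    // Smell: this method must be edited whenever a new Shape subclass appears.
    public static double AreaBySwitch(Shape shape)
    {
        switch (shape.Kind)
        {
            case ShapeKind.Square:
                Square square = (Square)shape;
                return square.Side * square.Side;
            case ShapeKind.Circle:
                Circle circle = (Circle)shape;
                return Math.PI * circle.Radius * circle.Radius;
            default:
                throw new NotSupportedException();
        }
    }

    // Preferred: relies only on the base class contract.
    public static double AreaByContract(Shape shape)
    {
        return shape.Area();
    }
}
```

Both methods return the same results today, but only AreaByContract keeps working unchanged when a third shape is added.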

To demonstrate the application of the LSP, we can consider code that violates it and explain how the classes can be refactored to comply with the principle. The following code shows the outline of several classes:

using System;
using System.Collections.ObjectModel;

public class Project
{
    public Collection<ProjectFile> ProjectFiles { get; set; }

    public void LoadAllFiles()
    {
        foreach (ProjectFile file in ProjectFiles)
        {
            file.LoadFileData();
        }
    }

    public void SaveAllFiles()
    {
        foreach (ProjectFile file in ProjectFiles)
        {
            // Type check needed only because ReadOnlyFile cannot be saved
            if (file as ReadOnlyFile == null)
                file.SaveFileData();
        }
    }
}

public class ProjectFile
{
    public string FilePath { get; set; }
    public byte[] FileData { get; set; }

    public void LoadFileData()
    {
        // Retrieve FileData from disk
    }

    public virtual void SaveFileData()
    {
        // Write FileData to disk
    }
}

public class ReadOnlyFile : ProjectFile
{
    public override void SaveFileData()
    {
        throw new InvalidOperationException();
    }
}

The first class represents a project that contains a number of project files. It includes two methods: one loads the file data for every project file and the other saves all of the files to disk. The second class describes a project file. It has a property for the file path and a byte array that holds the file data once loaded. Two methods allow the file data to be loaded or saved.

The third class may have been added to the solution later than the other two, perhaps when a new requirement was created that some project files would be read-only. The ReadOnlyFile class inherits its functionality from ProjectFile. However, as read-only files cannot be saved, the SaveFileData method has been overridden so that an invalid operation exception is thrown.
The ReadOnlyFile class violates the LSP in several ways. Although all of the members of the base class are implemented, clients cannot substitute ReadOnlyFile objects for ProjectFile objects. This is clear in the SaveFileData method, which introduces an exception that cannot be thrown by the base class. Next, a postcondition of the SaveFileData method in the base class is that the file has been updated on disk. This is not the case with the subclass. The final problem can be seen in the SaveAllFiles method of the Project class. Here the programmer has added an if statement to ensure that the file is not read-only before attempting to save it. This violates the LSP and the OCP as the Project class must be modified to allow new ProjectFile subclasses to be detected.

Refactored Code

There are various ways in which the code can be refactored to comply with the LSP. One is shown below. Here the Project class has been modified to include two collections instead of one. One collection contains all of the files in the project and one holds references to writeable files only. The LoadAllFiles method loads data into all of the files in the AllFiles collection. As the files in the WriteableFiles collection will be a subset of the same references, the data will be visible via these also. The SaveAllFiles method has been replaced with a method that saves only the writeable files.
The ProjectFile class now contains only one method, which loads the file data. This method is required for both writeable and read-only files. The new WriteableFile class extends ProjectFile, adding a method that saves the file data. This reversal of the hierarchy means that the code now complies with the LSP.

The refactored code is as follows:


using System.Collections.ObjectModel;

public class Project
{
    public Collection<ProjectFile> AllFiles { get; set; }
    public Collection<WriteableFile> WriteableFiles { get; set; }

    public void LoadAllFiles()
    {
        foreach (ProjectFile file in AllFiles)
        {
            file.LoadFileData();
        }
    }

    public void SaveAllWriteableFiles()
    {
        foreach (WriteableFile file in WriteableFiles)
        {
            file.SaveFileData();
        }
    }
}

public class ProjectFile
{
    public string FilePath { get; set; }
    public byte[] FileData { get; set; }

    public void LoadFileData()
    {
        // Retrieve FileData from disk
    }
}

public class WriteableFile : ProjectFile
{
    public void SaveFileData()
    {
        // Write FileData to disk
    }
}
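
To show how the two collections relate, here is a compact, self-contained sketch of the refactored classes in use. The IsLoaded and IsSaved flags and the ProjectDemo class are hypothetical stand-ins for real disk access and are not part of the design above; the file paths are made up.

```csharp
using System.Collections.ObjectModel;

public class ProjectFile
{
    public string FilePath { get; set; }
    public bool IsLoaded { get; private set; }   // stand-in for reading from disk

    public void LoadFileData() { IsLoaded = true; }
}

public class WriteableFile : ProjectFile
{
    public bool IsSaved { get; private set; }    // stand-in for writing to disk

    public void SaveFileData() { IsSaved = true; }
}

public class Project
{
    public Collection<ProjectFile> AllFiles { get; set; }
    public Collection<WriteableFile> WriteableFiles { get; set; }

    public void LoadAllFiles()
    {
        foreach (ProjectFile file in AllFiles) file.LoadFileData();
    }

    public void SaveAllWriteableFiles()
    {
        foreach (WriteableFile file in WriteableFiles) file.SaveFileData();
    }
}

public static class ProjectDemo
{
    public static Project Build()
    {
        ProjectFile readOnly = new ProjectFile { FilePath = "readme.txt" };
        WriteableFile editable = new WriteableFile { FilePath = "notes.txt" };

        return new Project
        {
            // The same WriteableFile reference appears in both collections.
            AllFiles = new Collection<ProjectFile> { readOnly, editable },
            WriteableFiles = new Collection<WriteableFile> { editable }
        };
    }
}
```

After calling LoadAllFiles and SaveAllWriteableFiles on the result of Build, every file has been loaded and only the writeable file has been saved, with no type checks in the Project class.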

Thursday, January 1, 2015

The Open/Closed Principle


The third article in the SOLID Principles series describes the Open / Closed Principle (OCP). The OCP states that all classes and similar units of source code should be open for extension but closed for modification.

The Principle

The Open / Closed Principle (OCP) states that classes should be open for extension but closed for modification. "Open for extension" means that you should design your classes so that new functionality can be added as new requirements are generated. "Closed for modification" means that once you have developed a class you should never modify it, except to correct bugs.

The two parts of the principle appear to be contradictory. However, if you correctly structure your classes and their dependencies you can add functionality without editing existing source code. Generally you achieve this by referring to abstractions for dependencies, such as interfaces or abstract classes, rather than using concrete classes. Such interfaces can be fixed once developed so the classes that depend upon them can rely upon unchanging abstractions. Functionality can be added by creating new classes that implement the interfaces.
Applying the OCP to your projects limits the need to change source code once it has been written, tested and debugged. This reduces the risk of introducing new bugs to existing code, leading to more robust software. Another side effect of the use of interfaces for dependencies is reduced coupling and increased flexibility.

Example Code

To demonstrate the application of the OCP, we can consider some C# code that violates it and explain how the classes can be refactored to comply with the principle:

using System;

public class Logger
{
    public void Log(string message, LogType logType)
    {
        switch (logType)
        {
            case LogType.Console:
                Console.WriteLine(message);
                break;
            case LogType.Printer:
                // Code to send message to printer
                break;
        }
    }
}

public enum LogType
{
    Console,
    Printer
}

The above sample code is a basic module for logging messages. The Logger class has a single method that accepts a message to be logged and the type of logging to perform. The switch statement changes the action according to whether the program is outputting messages to the console or to the default printer.

If you wished to add a third type of logging, perhaps sending the logged messages to a message queue or storing them in a database, you could not do so without modifying the existing code. Firstly, you would need to add new LogType constants for the new methods of logging messages. Secondly you would need to extend the switch statement to check for the new LogTypes and output or store messages accordingly. This violates the OCP.

Refactored Code

We can easily refactor the logging code to achieve compliance with the OCP. Firstly we need to remove the LogType enumeration, as this restricts the types of logging that can be included. Instead of passing the type to the Logger, we will create a new class for each type of message logger that we require. In the final code we will have two such classes, named "ConsoleLogger" and "PrinterLogger". Additional logging types could be added later without changing any existing code.
The Logger class still performs all logging but uses one of the message logger classes described above to output each message. So that the classes are not tightly coupled, each message logger type implements the IMessageLogger interface. The Logger class is never aware of the type of logging being used, as its dependency is provided as an IMessageLogger instance using constructor injection.

The refactored code is as follows:

using System;

public class Logger
{
    private readonly IMessageLogger _messageLogger;

    public Logger(IMessageLogger messageLogger)
    {
        _messageLogger = messageLogger;
    }

    public void Log(string message)
    {
        _messageLogger.Log(message);
    }
}

public interface IMessageLogger
{
    void Log(string message);
}

public class ConsoleLogger : IMessageLogger
{
    public void Log(string message)
    {
        Console.WriteLine(message);
    }
}

public class PrinterLogger : IMessageLogger
{
    public void Log(string message)
    {
        // Code to send message to printer
    }
}
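
To illustrate the "open for extension" half of the principle, the sketch below adds a third logging type without touching Logger or IMessageLogger, which are repeated here only so that the listing is self-contained. QueueLogger is a hypothetical class that uses an in-memory queue as a stand-in for a real message queue.

```csharp
using System.Collections.Generic;

public interface IMessageLogger
{
    void Log(string message);
}

public class Logger
{
    private readonly IMessageLogger _messageLogger;

    public Logger(IMessageLogger messageLogger)
    {
        _messageLogger = messageLogger;
    }

    public void Log(string message)
    {
        _messageLogger.Log(message);
    }
}

// New logging type: added as a new class, with no changes to existing code.
public class QueueLogger : IMessageLogger
{
    private readonly Queue<string> _messages = new Queue<string>();

    public int Count
    {
        get { return _messages.Count; }
    }

    public void Log(string message)
    {
        _messages.Enqueue(message);
    }

    public string Dequeue()
    {
        return _messages.Dequeue();
    }
}
```

Passing a QueueLogger to the Logger constructor is the only change a caller needs to make; the Logger class itself is never edited.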