18 Packages and program design

When a program starts to get large, it needs to be broken up into smaller pieces to make it easier to understand, to distribute the work, to make fixes quicker, to isolate changes, and so on.

But how to do this? Typical introductory programming courses teach programming with a large main procedure, and perhaps a couple of packages.

So how do you progress from this initial stage to a more mature view of system development? One way is to show a simple mapping (or transformation) of a program to a functionally equivalent program with a number of packages.

The following attempts to show how an existing Ada program may be broken up into smaller pieces.

Consider the following program...

Figure 1. Sample program.

Here the code makes use of different types, and has numerous procedures. We can imagine, for example, that the procedure Make_Booking calls Search_For_Empty_Booking, which in turn calls the function After to check if the requested booking is after another booking.

When we look at the code there does not seem to be much cohesion - declarations of types, constants and subprograms are in their own little sections. This is one way to split a program up into pieces, but it is not a very good way.

Another way to split it up is to note that some things "belong" together. For example, the type Date is closely related to the function After and the procedure Display. What we can do is group these routines more closely together.

(Imagine that other programs are being developed, and that they also need these routines. It would be silly to have programmers recode them. Reusing the code would be cheaper, but it is not easy in this situation. If someone copies the code by "cut and paste", then any bug fix made to the date routines will only take effect in one program. It would be better to make these routines available in one place for all programs.)

The first thing we should think about is grouping these items together...

Figure 2. Like types and procedures grouped together.

This does not change how the program runs, only how it is laid out.

You can see that the bookings require the definition of a date to be available; in fact, the date definitions have to precede the definition of the booking. The date definitions do not require any previous declarations.

From this we can see that it is possible to take these two sections out and place them into separate files. However, we have to be careful to maintain the bookings section's visibility of the date declarations. Likewise, the main program has to be able to "see" all of the booking declarations and all of the date declarations.

The bookings section only has to be able to "see" the date declarations.

The date section doesn't have to be able to "see" any other declarations.

We are now in a position to place them into separate packages...

Figure 4. Date package

Note that instead of writing out the subprograms in full, we just write them as specifications. The full description of them (with all the code) is placed in the package body.
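As a rough illustration of this split, a date package might look something like the sketch below. The package name Dates, the record components and the parameter names are assumptions made for this example; only the date type and the subprograms After and Display are named in the text. With GNAT, the specification and the body would normally be placed in separate files.

```ada
-- dates.ads : the specification. Clients see only this part.
package Dates is

   -- Hypothetical representation of a date.
   type Date is record
      Day   : Positive range 1 .. 31;
      Month : Positive range 1 .. 12;
      Year  : Positive;
   end record;

   -- True when Left falls strictly later than Right.
   function After (Left, Right : Date) return Boolean;

   -- Writes the date to standard output.
   procedure Display (Item : in Date);

end Dates;

-- dates.adb : the body. All of the code lives here.
with Ada.Text_IO;

package body Dates is

   function After (Left, Right : Date) return Boolean is
   begin
      -- Compare the most significant field that differs.
      if Left.Year /= Right.Year then
         return Left.Year > Right.Year;
      elsif Left.Month /= Right.Month then
         return Left.Month > Right.Month;
      else
         return Left.Day > Right.Day;
      end if;
   end After;

   procedure Display (Item : in Date) is
   begin
      Ada.Text_IO.Put_Line
        (Integer'Image (Item.Day) & " /" &
         Integer'Image (Item.Month) & " /" &
         Integer'Image (Item.Year));
   end Display;

end Dates;
```

The body can be rewritten freely (for example, to compare dates in a different way) without any client needing to change, so long as the specification stays the same.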

Figure 5. Bookings package

The main procedure can now be...

When this program is compiled and run, it will execute exactly the same as the very first example. All that has happened is that we have moved code into different places, so as to make it easier to write and maintain. These are software engineering, or program construction, concerns, not issues relating to how the program runs.

However in separating items out, we have had to be very aware of what declarations are dependent on what other declarations. If we had dates declared after bookings, bookings would not compile. This is called the dependency relationship and it is a very useful piece of information that tells us a lot about how a program is constructed.

For example, if we found out that a routine in the date package was incorrect, we would need to look at all the routines that had something to do with dates to see if they were affected and should also be fixed. In the first example, this may not have been easy. With the dependency relationships explicitly described, it is very easy to search through large programs and find what may be affected by a bug and what isn't.

Also when a maintenance programmer has to change a program, they always have to be aware of the ripple effect - the possibility that a change in one part of a program may have consequences in other parts that are not anticipated. Dependency relationships (which are explicitly stated in the with clause) help the programmer understand how a large program is pieced together.
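The dependency relationships discussed here could be written down as with clauses roughly as follows. This is a sketch only: the package names Dates and Bookings, the record components and the procedure name Main are assumed for illustration. With GNAT, each unit would normally go in its own file.

```ada
-- Dates depends on no other package, so it has no with clause.
package Dates is
   type Date is record
      Day   : Positive range 1 .. 31;
      Month : Positive range 1 .. 12;
      Year  : Positive;
   end record;
end Dates;

-- Bookings needs to "see" Date, and says so explicitly.
with Dates;
package Bookings is
   type Booking is record
      On   : Dates.Date;
      Room : Positive;   -- hypothetical component
   end record;
end Bookings;

-- The main procedure names every package it depends on.
with Ada.Text_IO;
with Bookings;
with Dates;
procedure Main is
   Today : constant Dates.Date := (Day => 1, Month => 7, Year => 1999);
   First : constant Bookings.Booking := (On => Today, Room => 12);
begin
   Ada.Text_IO.Put_Line ("Room booked:" & Positive'Image (First.Room));
end Main;
```

Reading only the with clauses tells a maintainer that a change to Dates may ripple into Bookings and Main, while a change to Bookings cannot affect Dates.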


A "Layered" view of the program

When we think about the dependency relationships in a system, we can see that some packages don't have any dependencies. Others depend on one or two other packages, and some depend on many. In general we can depict this in a layered diagram...

Here the procedure main is in the highest layer, and it depends on services provided by lower layer packages. An item at a lower layer, however, never depends on higher layer items. Designing systems in layers makes the process of abstraction, and the notion of providing services, a fairly natural one.

Interestingly, by turning the diagram upside down, you get the same program structure as in Figure 2.

Look at putting the pieces into package specs...

When a function such as

is called from, say, procedure

the procedure doesn't really care how the function is written - what internal variables it has, or the order of if statements etc, so long as it produces the correct result. All it really cares about is giving two dates, and getting a boolean result. It only really depends on the specification of the function, and not the body of it.

Ada enforces this distinction when we put things into a package, by allowing only subprogram specifications in the package spec, and the full subprogram in the package body.

Look at putting pieces into a package body

When we examine our main program, we may find that subprograms such as

are never called from any routine other than the other booking-related subprograms. For this reason there is not much point in making them publicly available. We can place them solely in the package body for bookings, without any adverse impact on the system at all.
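A minimal sketch of this idea follows, using a deliberately simplified bookings package that is independent of dates. The slot table, Slot_Number, No_Free_Slot and the parameter profile of Make_Booking are assumptions made for this example; only the names Make_Booking and Search_For_Empty_Booking come from the text.

```ada
-- bookings.ads : only Make_Booking is offered to clients.
package Bookings is

   Max_Slots : constant := 10;
   subtype Slot_Number is Positive range 1 .. Max_Slots;

   No_Free_Slot : exception;

   -- Reserves the first empty slot and reports which one it was.
   procedure Make_Booking (Slot : out Slot_Number);

end Bookings;

-- bookings.adb
package body Bookings is

   type Slot_State is (Empty, Taken);
   Slots : array (Slot_Number) of Slot_State := (others => Empty);

   -- Declared only in the body, so no client can call it.
   -- Returns 0 when every slot is taken.
   function Search_For_Empty_Booking return Natural is
   begin
      for I in Slots'Range loop
         if Slots (I) = Empty then
            return I;
         end if;
      end loop;
      return 0;
   end Search_For_Empty_Booking;

   procedure Make_Booking (Slot : out Slot_Number) is
      Found : constant Natural := Search_For_Empty_Booking;
   begin
      if Found = 0 then
         raise No_Free_Slot;
      end if;
      Slots (Found) := Taken;
      Slot := Found;
   end Make_Booking;

end Bookings;
```

A client that tries to call Bookings.Search_For_Empty_Booking will be rejected by the compiler, because the name does not appear in the specification.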


Designing programs with Direct_IO/Sequential_IO

Many students designing programs with Direct_IO, or indeed any generic package, have trouble figuring out where the instantiation should be placed.

Generally a program can be structured using child packages; the definition of an item is placed in one package, and child packages are created to contain the I/O facilities for it.

Consider, for example...

If we want a package to perform I/O on this type, we can instantiate Direct_IO as a child package...
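One way this might look is sketched below. The components of Blah, the child name Raw_IO, the demonstration procedure Raw_Demo and the file name are all assumptions made for this example.

```ada
-- blahs.ads : the definition of the item.
package Blahs is
   type Blah is record
      Key   : Positive;   -- hypothetical components
      Value : Integer;
   end record;
end Blahs;

-- blahs-raw_io.ads : a child package that is an instance of the generic.
with Ada.Direct_IO;
package Blahs.Raw_IO is new Ada.Direct_IO (Blahs.Blah);

-- raw_demo.adb : writes one record, then reads it back.
with Ada.Text_IO;
with Blahs.Raw_IO;
procedure Raw_Demo is
   use Blahs.Raw_IO;
   F    : File_Type;
   Item : Blahs.Blah;
begin
   Create (F, Inout_File, "blahs_raw.dat");
   Write (F, (Key => 1, Value => 42));
   Read (F, Item, From => 1);
   Ada.Text_IO.Put_Line ("Value read back:" & Integer'Image (Item.Value));
   Delete (F);   -- closes the file and removes it
end Raw_Demo;
```

Because the instance is a library unit in its own right, any client that needs raw record I/O can simply name it in a with clause.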

However, Direct_IO (and Sequential_IO) are extremely low level packages. They offer very little in terms of the functionality you would really like to have, such as the ability to search for an item, or even to delete an item.

If you want to write and retrieve binary data to a file, consider the Direct_IO generic as simply a building block, used to create more sophisticated services.

(We can make an analogy between Direct_IO and arrays. Both are very low level concepts, and are generally used to construct higher level concepts such as hash tables.)

In the example below, we create a higher level file abstraction, that supports searching and deletion of items in the file.

At this point, we have a file type that can be used in further child packages to build several different I/O facilities. The package has been instantiated in the private section for two reasons.

1. It prevents packages outside the Blahs.IO hierarchy from accessing the low level routines in Blahs_Direct_IO, which are of no concern to them. We can force them to use the high level routines we will provide.

2. It allows child packages to "see" the Blahs_Direct_IO package, and therefore to call on these routines.
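Both points can be seen in the following sketch. It is not the original listing: the components of Blah, the operations Create, Append and Close, the child package Blahs.IO.Lookup with its procedure Find, and the demonstration procedure are all assumptions made for illustration. Only the names Blahs.IO and Blahs_Direct_IO come from the text.

```ada
package Blahs is
   type Blah is record
      Key   : Positive;   -- hypothetical components
      Value : Integer;
   end record;
end Blahs;

-- Clients see only File_Type and the high level operations.
with Ada.Direct_IO;
package Blahs.IO is
   type File_Type is limited private;
   procedure Create (File : in out File_Type; Name : in String);
   procedure Append (File : in out File_Type; Item : in Blah);
   procedure Close  (File : in out File_Type);
private
   -- Hidden from clients, visible to the body and to child packages.
   package Blahs_Direct_IO is new Ada.Direct_IO (Blah);
   type File_Type is limited record
      Raw : Blahs_Direct_IO.File_Type;
   end record;
end Blahs.IO;

package body Blahs.IO is
   use type Blahs_Direct_IO.Count;

   procedure Create (File : in out File_Type; Name : in String) is
   begin
      Blahs_Direct_IO.Create (File.Raw, Blahs_Direct_IO.Inout_File, Name);
   end Create;

   procedure Append (File : in out File_Type; Item : in Blah) is
   begin
      Blahs_Direct_IO.Write
        (File.Raw, Item, To => Blahs_Direct_IO.Size (File.Raw) + 1);
   end Append;

   procedure Close (File : in out File_Type) is
   begin
      Blahs_Direct_IO.Close (File.Raw);
   end Close;
end Blahs.IO;

-- A child package: its body may use Blahs_Direct_IO directly.
package Blahs.IO.Lookup is
   procedure Find (File  : in File_Type;
                   Key   : in Positive;
                   Item  : out Blah;
                   Found : out Boolean);
end Blahs.IO.Lookup;

package body Blahs.IO.Lookup is
   procedure Find (File  : in File_Type;
                   Key   : in Positive;
                   Item  : out Blah;
                   Found : out Boolean) is
      Candidate : Blah;
   begin
      Found := False;
      -- Simplifying assumption: a linear scan of every record.
      for Index in 1 .. Blahs_Direct_IO.Size (File.Raw) loop
         Blahs_Direct_IO.Read (File.Raw, Candidate, From => Index);
         if Candidate.Key = Key then
            Item  := Candidate;
            Found := True;
            return;
         end if;
      end loop;
   end Find;
end Blahs.IO.Lookup;

with Ada.Text_IO;
with Blahs.IO.Lookup;
procedure Lookup_Demo is
   F     : Blahs.IO.File_Type;
   Item  : Blahs.Blah;
   Found : Boolean;
begin
   Blahs.IO.Create (F, "blahs.dat");
   Blahs.IO.Append (F, (Key => 7, Value => 70));
   Blahs.IO.Append (F, (Key => 9, Value => 90));
   Blahs.IO.Lookup.Find (F, 9, Item, Found);
   if Found then
      Ada.Text_IO.Put_Line ("Value:" & Integer'Image (Item.Value));
   end if;
   Blahs.IO.Close (F);
end Lookup_Demo;
```

Lookup_Demo cannot mention Blahs_Direct_IO at all, whereas the body of Blahs.IO.Lookup can, because a child's body has visibility of its parent's private part.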

The package body would look like...

For example you may want to produce a package with facilities for reading, deleting, searching etc, while another package could be used to consolidate a file.

The package body for these packages can be roughly sketched out as follows. Note that it makes a simplifying assumption as to how it searches for a key value.

It is informative to note that if we examine the non-private interfaces in this package hierarchy, we will see no references to the generic Direct_IO at all.

To use these routines you may have code as follows...

These child packages can also be written as generic packages, so that this structure can be replicated for different file types.


Private packages

When we develop systems we find that new issues appear as we tackle larger and larger programs. Initially, splitting a program into subprograms makes development easier. We can place variables that are only used by one subprogram inside it - hiding them from outside view and interference.

As the number of subprograms grows, however, we find a need to split them up into another level of grouping - the package. This gives us the opportunity to hide entire subprograms inside package bodies - subprograms that are not meant for other routines to use.

After the number of packages starts to grow, we turn to child packages to create subsystems, in which a family of logically related types and functionality is packaged together. Each subsystem provides services to all clients by advertising them in its package specifications. Once again, however, we find that at this high level of abstraction some of the services that are offered should only be available to those within the subsystem.

Private packages allow you to structure your program with local packages, without offering their services to anyone who wants them. Banks are, of course, more secure if the internal procedures they use to deliver services to customers are not made available to those same customers. We definitely want the same level of security for our subsystems!

How do I know when to use private packages?

Armies during wartime generally run on a "need to know" basis. You only tell people what they need to know. Similarly, as you progressed to each new level in the diagram above, you dealt with the issue of what to hide on a "need to know" basis. If a client doesn't need to know the details of a service, then they are hidden from view. This split is based on the difference between what service is offered and how it is implemented. You simply need to apply the same knowledge, only at a higher level.

This may come about from a hierarchical object decomposition, where the services identified in a high level analysis provide the subsystem level interfaces. Further elaboration of the design results in objects which may only be needed to implement the subsystem services already offered.

A case study

A temporal assertions package (used for making assertions such as "this event must happen within 5 seconds of that event", or "this event must never occur before that event") has been developed at RMIT. Several concepts emerged from the analysis of the requirements.

As well as these types/concepts discovered at the analysis stage, other types were found at the design stage, such as data structures to hold the events and predicates that would be declared.

The packages developed for this were...

Assertion contains most of the code of the program. Package Tri_State, although not involved in any other part of the system, was felt to be a useful sort of package, and was therefore not declared private.

Package Assertion.Events_Table maintains the data structure for storing the events that clients have declared. As an alternative it could have been included in the package body of Assertion in a number of different ways.

One technique would be to dump all of the code, data structures and variables in the package body. This would not be satisfactory as it would make the package body more cluttered.

Another technique would be to place a package inside the package body...

Although this fixes the problem of clutter, it still makes the package body longer than it needs to be (increasing compilation times) and harder to develop (it is harder to make calls on the nested package). The structure of the program is also harder to understand (the program structure is easiest to see when we can easily see the packages that make it up).

For this reason the package was made a private child package of Assertion.

A package because

We need to hide away the low level details of how the table is implemented

A child package because

It needs to see the declarations of type Event in the spec of Assertion

Its name clearly links it into the Assertion subsystem of a program

A Private child package because

No other part of a program, apart from the Assertion subsystem, needs to know the internal details of how events are stored away.
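The arrangement described above can be sketched as follows. This is not the RMIT code: the representation of Event, the operations Declare_Event, Events_Declared, Add and Size, and the fixed-capacity table are all assumptions made for illustration. Only the names Assertion, Assertion.Events_Table and Event come from the text.

```ada
-- assertion.ads : the subsystem's public interface.
package Assertion is
   type Event is private;

   -- Hypothetical client operations.
   procedure Declare_Event (Item : out Event);
   function Events_Declared return Natural;
private
   type Event is new Positive;   -- assumed representation
end Assertion;

-- assertion-events_table.ads : a private child.
-- Only units inside the Assertion subsystem may name it in a with clause.
private package Assertion.Events_Table is
   procedure Add (Item : in Event);
   function Size return Natural;
end Assertion.Events_Table;

-- assertion-events_table.adb : the storage details stay hidden here.
package body Assertion.Events_Table is
   Capacity : constant := 100;   -- a full table is not handled in this sketch
   Store    : array (1 .. Capacity) of Event;
   Used     : Natural := 0;

   procedure Add (Item : in Event) is
   begin
      Used := Used + 1;
      Store (Used) := Item;
   end Add;

   function Size return Natural is
   begin
      return Used;
   end Size;
end Assertion.Events_Table;

-- assertion.adb : the parent body uses the private child.
with Assertion.Events_Table;
package body Assertion is

   procedure Declare_Event (Item : out Event) is
   begin
      Item := Event (Events_Table.Size + 1);
      Events_Table.Add (Item);
   end Declare_Event;

   function Events_Declared return Natural is
   begin
      return Events_Table.Size;
   end Events_Declared;

end Assertion;
```

A unit outside the Assertion hierarchy that writes "with Assertion.Events_Table;" is rejected at compile time, so the table can later be replaced by a different data structure without affecting any client.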