Approaches to Software Engineering -- Introduction [part 1]

The evolution of electronic computers began in the 1940s. Early efforts in the field of computing were focused on designing the hardware, as that was the challenge, and hardware was where most technical difficulties existed. In the early computing systems, there was essentially no operating system; the programs were fed with paper tapes or by switches. With the evolution of second-generation machines in the 1950s, early concepts of operating systems evolved and single-user operating systems came into existence. High-level languages, particularly FORTRAN and COBOL, along with their compilers were developed. There was a gradual trend toward isolating the user from the machine internals, so the user could concentrate on solving the problem at hand rather than getting bogged down in the machine details.

With the coming of the multiprogramming operating systems in the early 1960s, the usability and efficiency of the computing machines took a big leap. Prices of hardware also decreased, and awareness of computers increased substantially since their early days. With the availability of cheaper and more powerful machines, higher-level languages, and more user-friendly operating systems, the applications of computers grew rapidly. In addition, the nature of software engineering evolved from simple programming exercises to developing software systems, which were much larger in scope, and required great effort by many people. The techniques for writing simple programs could not be scaled up for developing software systems, and the computing world found itself in the midst of a "software crisis." Two conferences, sponsored by the NATO Science Committee, were held in Europe in the 1960s to discuss the growing software crisis and the need to focus on software development. The term software engineering was coined at these meetings.

The use of computers is growing very rapidly. Computer systems are now used in such diverse areas as business applications, scientific work, video games, air traffic control, aircraft control, missile control, hospital management, airline reservations, and medical diagnostic equipment. There is probably no discipline that does not use computer systems now; even artists, linguists, and filmmakers use them.

With this increased use of computers, the need for software is increasing dramatically. Furthermore, the complexity of these systems is also increasing; imagine the complexity of the software for aircraft control or a telephone network monitoring system. In fact, the complexity of applications and software systems has grown much faster than our ability to deal with it. Consequently, many years after the software crisis was first declared, we find that it has not yet ended. Software engineering is the discipline whose goal is to deal with this problem.

In this section we first define our problem domain and discuss the major reasons for the "software problem." Then we discuss the major problems that software engineering faces, followed by the basic approach that software engineering takes to handle the software crisis. In the rest of the guide we discuss in more detail the various aspects of the software engineering approach.

1. The Software Problem

Let us first discuss what we mean by software. Software is not merely a collection of computer programs. There is a distinction between a program and a programming systems product. A program is generally complete in itself and is generally used only by the author of the program. There is usually little documentation or other aids to help other people use the program. Because the author is the user, the presence of "bugs" is not a major concern; if the program crashes, the author will fix the program and start using it again. These programs are not designed with such issues as portability, reliability, and usability in mind.

A programming systems product, on the other hand, is used largely by people other than the developers of the system. The users may be from different backgrounds, so a proper user interface is provided. There is sufficient documentation to help these diverse users use the system. Programs are thoroughly tested before operational use, because users do not have the luxury of fixing bugs that may be detected. And because the product may be used in a variety of environments, perhaps on a variety of hardware platforms, portability is a key issue.

Clearly, a program to solve a problem and a programming systems product to solve the same problem are two entirely different things. Obviously, much more effort and resources are required for a programming systems product. Brooks estimates that as a rule of thumb, a programming systems product costs approximately ten times as much as a corresponding program. The software industry is largely interested in developing programming systems products, and most commercial software systems or packages fall in this category. The programming systems product is also sometimes called production or industrial-quality software.

IEEE defines software as the collection of computer programs, procedures, rules, and associated documentation and data. This definition clearly states that software is not just programs, but includes all the associated documentation and data. This implies that the discipline dealing with the development of software should not deal only with developing programs, but with developing all the things that constitute software. Overall, we can say that software engineering is largely concerned with the development of industrial-quality software, where software is as defined earlier. In the rest of the guide software means industrial-quality software.


FIG. 1. Hardware-software cost trends.

1.1 Software Is Expensive

Over the past decades, with the advancement of technology, the cost of hardware has consistently decreased. For example, the cost per bit of memory decreased more than 50-fold in two decades. The situation with processors is similar; virtually every year newer and faster processors are introduced that provide many times the compute power of earlier mainframe computer systems at a cost that is a fraction of those mainframe systems. On the other hand, the cost of software is increasing. As a result, the HW/SW cost ratio for a computer system has shown a reversal from the early years, as is shown in FIG. 1. The main reason for the high cost of software is that software development is still labor-intensive. To get an absolute idea of the costs involved, let us consider the current state of practice in the industry. Delivered lines of code (DLOC) is by far the most commonly used measure of software size in the industry. (We will discuss the issue of software size later in the guide.) As the main cost of producing software is in the manpower employed, the cost of developing software is generally measured in terms of person-months of effort spent in development. And productivity is frequently measured in the industry in terms of DLOC per person-month.

The current productivity in the software industry for writing fresh code ranges from 300 to 1000 DLOC per person-month. That is, for developing software, the average productivity per person, per month, over the entire development cycle is about 300 to 1000 DLOC. And software companies charge the client for whom they are developing the software upwards of $100,000 per person-year or more than $8,000 per person-month (which comes to about $50 per hour). With the current productivity figures of the industry, this translates into a cost per line of code of approximately $8 to $25. In other words, each line of delivered code costs between $8 and $25 at current costs and productivity levels! And even moderately sized projects easily end up with software of 50,000 LOC. (For projects like the software for the space shuttle, the size is millions of lines of code.) With this productivity, such a software project will cost between $0.4 million and $1.25 million! Given the current compute power of machines, such software can easily be used on a workstation or a small network with a server. This implies that software that can cost more than a million dollars can run on hardware that costs at most tens of thousands of dollars, clearly showing that the cost of hardware on which such an application can run is a fraction of the cost of the application software! This example clearly shows that not only is software very expensive, it indeed forms the major component of the cost of the total automated system, with the hardware forming a very small component.
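
The cost arithmetic in this subsection can be reproduced with a short sketch. The function names and the rounded rate of $8,000 per person-month are illustrative assumptions; the productivity range and the 50,000-line project size are the figures quoted in the text.

```python
# Illustrative sketch: the names and the rounded monthly rate are assumptions;
# the productivity range (300 to 1000 DLOC per person-month) is from the text.

def cost_per_line(rate_per_person_month: float, dloc_per_person_month: float) -> float:
    """Dollar cost of one delivered line of code."""
    return rate_per_person_month / dloc_per_person_month


def project_cost(size_dloc: int, dollars_per_line: float) -> float:
    """Dollar cost of a project of the given size."""
    return size_dloc * dollars_per_line


RATE = 8_000  # dollars per person-month, rounded down from $100,000 / 12

best = cost_per_line(RATE, 1000)   # high productivity gives the cheapest line
worst = cost_per_line(RATE, 300)   # low productivity gives the costliest line
print(f"cost per line: ${best:.2f} to ${worst:.2f}")

# A 50,000-line project priced at the rounded $8 and $25 per line.
low = project_cost(50_000, 8)
high = project_cost(50_000, 25)
print(f"project cost: ${low:,.0f} to ${high:,.0f}")
```

Keeping the rate and the productivity as separate inputs makes it easy to see that the cost per line is driven almost entirely by productivity, since the charge per person-month is roughly fixed.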

1.2 Late, Costly, and Unreliable

There are many reported instances of software projects that are behind schedule and have heavy cost overruns. The software industry has gained a reputation of not being able to deliver on time and within budget. Consider the example of the U.S. Air Force command-and-control software project. The initial estimate given by the winning contractor was $400,000. Subsequently, the cost was renegotiated to $700,000, to $2,500,000, and finally to $3,200,000. The final project completion cost was almost 10 times the original estimate! Take another example. A Fortune 500 consumer products company planned to get an information system developed in nine months at a cost of $250,000. Two years later, after spending $2.5 million, the job was still not done, and it was estimated that another $3.6 million would be needed. The project was scrapped (evidently, the extra cost of $3.6 million was not worth the returns!). Many such disaster examples have been reported. One survey states that of the 600 firms surveyed, more than 35% reported having some computer-related development project that they categorized as a runaway. And a runaway is not a project that is somewhat late or somewhat over budget; it is one where the budget and schedule are out of control. The problem has become so severe that it has spawned an industry of its own; there are consultancy companies that advise how to rein in such projects, and one such company had more than $30 million in revenues from more than 20 clients.

Similarly, a large number of instances have been quoted regarding the unreliability of software; the software does not do what it is supposed to do or does something it is not supposed to do. In one defense survey, it was reported that more than 70% of all the equipment failures were due to software! And this is in systems that are loaded with electrical, hydraulic, and mechanical systems. This just indicates that all other engineering disciplines have advanced far more than software engineering, and a system comprising the products of various engineering disciplines finds that software is the weakest component. Failure of an early Apollo flight was also attributed to software. Similarly, failure of a test firing of a missile in Asia was attributed to software problems. Many banks have lost millions of dollars due to inaccuracies and other problems in their software. Overall, the software industry has gained a reputation of not delivering software within schedule and budget and of producing software systems of poor quality.

There are numerous instances of projects that reinforce this view. In fact, a whole column in Software Engineering Notes is dedicated to such instances. It is clear that cost and schedule overruns and the problem of reliability are major contributors to the software crisis.

A note about the cause of unreliability in software: Software failures are different from failures of, say, mechanical or electrical systems. Products of these other engineering disciplines fail because of the change in physical or electrical properties of the system caused by aging. A software product, on the other hand, never wears out due to age. In software, failures occur due to bugs or errors that get introduced during the design and development process. Hence, even though software may fail after operating correctly for some time, the bug that causes that failure was there from the start! It only got executed at the time of the failure. This is quite different from other systems, where if a system fails, it generally means that sometime before the failure the system developed some problem (due to aging) that did not exist earlier.

1.3 Problem of Change and Rework

Once the software is delivered and deployed, it enters the maintenance phase. All systems need maintenance, but for other systems it is largely due to problems that are introduced due to aging. Why is maintenance needed for software, when software does not age? Software needs to be maintained not because some of its components wear out and need to be replaced, but because there are often some residual errors remaining in the system that must be removed as they are discovered.

It is commonly believed that the state of the art today is such that almost all software that is developed has residual errors, or bugs, in it. Many of these surface only after the system has been in operation, sometimes for a long time. These errors, once discovered, need to be removed, leading to the software getting changed. This is sometimes called corrective maintenance.

Even without bugs, software frequently undergoes change. The main reason is that software often must be upgraded and enhanced to include more features and provide more services. This also requires modification of the software. It has been argued that once a software system is deployed, the environment in which it operates changes. Hence, the needs that initiated the software development also change to reflect the needs of the new environment, and the software must adapt to the needs of the changed environment. The changed software then changes the environment, which in turn requires further change. This phenomenon is sometimes called the law of software evolution. Maintenance due to this phenomenon is sometimes called adaptive maintenance.

Though maintenance is not considered a part of software development, it is an extremely important activity in the life of a software product. If we consider the total life of software, the cost of maintenance generally exceeds the cost of developing the software! The maintenance-to-development-cost ratio has been variously suggested as 80:20, 70:30, or 60:40. FIG. 1 also shows how the maintenance costs are increasing.
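
To see what these ratios imply, the following sketch computes the maintenance cost that corresponds to a given development cost. The function name and the $1 million development figure are assumptions chosen only for illustration.

```python
# Illustrative sketch: the function name and the development cost are assumptions.

def maintenance_cost(development_cost: float, maint_share: int, dev_share: int) -> float:
    """Maintenance cost implied by a maintenance-to-development ratio."""
    return development_cost * maint_share / dev_share


DEVELOPMENT_COST = 1_000_000  # hypothetical development cost in dollars

for maint_share, dev_share in [(80, 20), (70, 30), (60, 40)]:
    cost = maintenance_cost(DEVELOPMENT_COST, maint_share, dev_share)
    print(f"{maint_share}:{dev_share} -> maintenance ${cost:,.0f}")
```

Even at the most conservative ratio, the maintenance bill for this hypothetical system is one and a half times what it cost to build.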

Maintenance work is based on existing software, as compared to development work that creates new software. Consequently, maintenance revolves around understanding existing software, and maintainers spend most of their time trying to understand the software they have to modify. Understanding the software involves understanding not only the code but also the related documents. During the modification of the software, the effects of the change have to be clearly understood by the maintainer because introducing undesired side effects in the system during modification is easy. To test whether those aspects of the system that are not supposed to be modified are operating as they were before modification, regression testing is done. Regression testing involves executing old test cases to test that no new errors have been introduced.
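
A minimal sketch of regression testing follows. The function `shipping_cost`, its `express` option, and the test cases are all hypothetical, invented only to show old test cases being rerun unchanged against modified code.

```python
# Hypothetical example: shipping_cost and its test cases are invented for illustration.

def shipping_cost(weight_kg: float, express: bool = False) -> float:
    """Modified version: the `express` option was added during maintenance."""
    base = 5.0 + 2.0 * weight_kg
    return base * 1.5 if express else base


# Old test cases, written before the modification: (arguments, expected result).
REGRESSION_CASES = [
    ((0.0,), 5.0),
    ((1.0,), 7.0),
    ((10.0,), 25.0),
]


def run_regression(func, cases):
    """Rerun old test cases and return the ones whose results changed."""
    failures = []
    for args, expected in cases:
        actual = func(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


failures = run_regression(shipping_cost, REGRESSION_CASES)
print("regression failures:", failures)
```

An empty list of failures indicates that the behavior covered by the old test cases was preserved; the new `express` behavior still needs its own, new test cases.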

Thus, maintenance involves understanding the existing software (code and related documents), understanding the effects of change, making the changes (to both the code and the documents), testing the new parts (changes), and retesting the old parts that were not changed. Because the needs of the maintainers are often not kept in mind during development, few support documents are produced during development to help the maintainer. The complexity of the maintenance task, coupled with the neglect of maintenance concerns during development, makes maintenance the most costly activity in the life of a software product.

Maintenance is one form of change or software rework that typically is done after the software development is completed and the software has been deployed. However, there are other forms of changes that lead to rework during the software development itself.

One of the biggest problems in software development, particularly for large and complex systems, is that what is desired from the software (i.e., the requirements) is not understood. To completely specify the requirements, all the functionality, interfaces, and constraints have to be specified before software development has commenced! In other words, for specifying the requirements, the clients and the developers have to visualize what the software behavior should be once it is developed. This is very hard to do, particularly for large and complex systems. So, what generally happens is that the requirements are "frozen" when it is believed that they are generally in good shape, and then the development proceeds. However, as time goes by and the understanding of the system improves, the clients frequently discover additional requirements they had not specified earlier. This leads to requirements getting changed when the development may have proceeded to the coding, or even testing, stage! This change leads to rework; the requirements, the design, the code all have to be changed to accommodate the new or changed requirements.

Just uncovering requirements that were not understood earlier is not the only reason for this change and rework. Software development of large and complex systems can take a few years. And with the passage of time, the needs of the clients change. After all, the current needs, which initiate the software product, are a reflection of current times. As times change, so do the needs. And, obviously, the clients want the system deployed to satisfy their most current needs. This change of needs while the development is going on also leads to rework.

In fact, changing requirements and associated rework are a major problem of the software industry. It is estimated that rework costs are 30 to 40% of the development cost. In other words, of the total development effort, rework due to various changes consumes about 30 to 40% of the effort! No wonder change and rework are a major contributor to the software crisis. However, unlike the issues discussed earlier, the problem of rework and change is not just a reflection of the state of software development, as changes are frequently initiated by clients as their needs change. Nevertheless, change is a reality that has to be dealt with properly.

2. Software Engineering Problem

It is clear that the current state of software leaves much to be desired. A primary reason for this is that approaches to software development are frequently ad hoc and programming-centered. The ad hoc or programming-centered approach (which considers developing software essentially as a programming exercise) may work for small projects, but for the problem domain that we are interested in (i.e., large industrial-quality software), these approaches generally do not work. If we have to control this software crisis, some methodical approach is needed for software development. This is where software engineering comes in. Software engineering is defined as:

Software engineering is the systematic approach to the development, operation, maintenance, and retirement of software.

Another definition from the economic and human perspective is given by Boehm by combining the dictionary's definition of engineering with its definition of software. His definition states:

Software Engineering is the application of science and mathematics by which the capabilities of computer equipment are made useful to man via computer programs, procedures, and associated documentation.

The use of the terms systematic approach or mathematics and science for the development of software means that software engineering provides methodologies for developing software as close to the scientific method as possible. That is, these methodologies are repeatable, and if the methodology is applied by different people, similar software will be produced. In essence, the goal of software engineering is to take software development closer to science and away from being an art. Note also that the focus of software engineering is not developing software per se, but methods for developing software. That is, the focus is on developing methods that can be used by various software projects.

The phrase useful to man emphasizes the needs of the user and the software's interface with the user. This definition implies that user needs should be given due importance in the development of software, and the final program should give importance to the user interface. With this definition of software engineering, let us now discuss a few fundamental problems that software engineering faces.

2.1 The Problem of Scale

A fundamental problem of software engineering is the problem of scale; development of a very large system requires a very different set of methods compared to developing a small system. In other words, the methods that are used for developing small systems generally do not scale up to large systems. An example will illustrate this point. Consider the problem of counting people in a room versus taking a census of a country. Both are essentially counting problems. But the methods used for counting people in a room (probably just go row-wise or column-wise) will just not work when taking a census. A different set of methods will have to be used for conducting a census, and the census problem will require considerably more management, organization, and validation, in addition to counting.


FIG. 2. The problem of scale.

Similarly, methods that one can use to develop programs of a few hundred lines cannot be expected to work when software of a few hundred thousand lines needs to be developed. A different set of methods has to be used for developing large software. Any large project involves the use of technology and project management.

For software projects, by technology we mean the methods, procedures, and tools that are used. In small projects, informal methods for development and management can be used. However, for large projects, both have to be much more formal, as shown in FIG. 2.

As shown in the figure, when dealing with a small software project, the technology requirement is low (all you need to know is how to program and a bit of testing) and the project management requirement is also low (who needs formal management for developing a 100-line program?). However, when the scale changes to large systems, to solve such problems properly, it is essential that we move in both directions: the methods used for development need to be more formal, and the project management for the development project also needs to be more formal. For example, if we leave 50 bright programmers together (who know how to develop small programs well) without formal management and development procedures and ask them to develop an on-line inventory control system for an automotive manufacturer, it is highly unlikely that they will produce anything of use. To successfully execute the project, a proper method of development has to be used and the project has to be tightly managed to make sure that methods are indeed being followed and that cost, schedule, and quality are under control.

Though there is no universally accepted definition of what is a "small" project and what is a "large" project, one can use the definitions used in the COCOMO cost model (to be discussed later in the guide) to get an idea of scale. According to this model, a project is small if its size in thousands of delivered lines of code (KDLOC) is 2 KDLOC, intermediate if the size is 8 KDLOC, medium if the size is 32 KDLOC, and large if the size is 128 KDLOC (or larger).
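
The reference sizes above can be turned into a rough classifier, as in the sketch below. The text gives only reference points, not boundaries between classes, so the nearest-point rule on a logarithmic scale used here is an assumption made for illustration and is not part of the COCOMO model itself.

```python
import math

# Reference sizes in KDLOC quoted in the text for the COCOMO model.
REFERENCE_SIZES = {"small": 2, "intermediate": 8, "medium": 32, "large": 128}


def size_class(kdloc: float) -> str:
    """Return the label whose reference size is closest on a log scale.

    Assumption: the nearest-point rule is an illustrative choice, since the
    text gives reference sizes only and no class boundaries.
    """
    if kdloc <= 0:
        raise ValueError("size must be positive")
    return min(
        REFERENCE_SIZES,
        key=lambda name: abs(math.log2(kdloc) - math.log2(REFERENCE_SIZES[name])),
    )


for size in (1, 8, 50, 300):
    print(size, "KDLOC ->", size_class(size))
```

A logarithmic scale is used because the reference sizes grow by a factor of four at each step, so distances between classes are even only in log terms.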

2.2 Cost, Schedule, and Quality

An engineering discipline, almost by definition, is driven by practical parameters of cost, schedule, and quality. A solution that takes enormous resources and many years may not be acceptable. Similarly, a poor-quality solution, even at low cost, may not be of much use. Like all engineering disciplines, software engineering is driven by the three major factors: cost, schedule, and quality. In some contexts, cost and quality are considered the primary independent factors, as schedule can be modeled as cost or considered as an independent variable whose value is more or less fixed for a given cost. We will also consider these as the primary driving factors.

We have already seen that the current state of affairs is that producing software is very expensive. Clearly, a practical and consistent goal of software engineering is to come up with methods of producing software more cheaply. Cost is a consistent driving force in software engineering.

The cost of developing a system is the cost of the resources used for the system, which, in the case of software, are the manpower, hardware, software, and other support resources. Generally, the manpower component is predominant, as software development is largely labor-intensive and the cost of the computing systems (the other major cost component) is now quite low. Hence, the cost of a software project is measured in terms of person-months, i.e., the cost is considered to be the total number of person-months spent in the project. To convert this to a dollar amount, it is multiplied by the dollar cost per person-month. In defining this unit cost for a person-month, the other costs (called overheads) are included. In this manner, by using person-months for specifying cost, the entire cost can be modeled.
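
As a small worked example of this conversion, the sketch below folds overheads into the unit cost of a person-month and multiplies by the effort. The salary, overhead, and effort figures are assumptions chosen only to make the arithmetic concrete.

```python
# Illustrative sketch: the salary, overhead, and effort figures are assumptions.

def loaded_rate(salary_per_month: float, overhead_per_month: float) -> float:
    """Unit cost of one person-month with overheads folded in."""
    return salary_per_month + overhead_per_month


def dollar_cost(person_months: float, rate: float) -> float:
    """Dollar cost = effort in person-months times the cost per person-month."""
    return person_months * rate


rate = loaded_rate(salary_per_month=5_000, overhead_per_month=3_000)
total = dollar_cost(person_months=60, rate=rate)
print(f"rate: ${rate:,.0f} per person-month, project cost: ${total:,.0f}")
```

Because hardware and other support costs are absorbed into the rate, effort in person-months becomes the only quantity that has to be estimated and tracked.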

Schedule is an important factor in many projects. Business trends are dictating that the time to market of a product should be reduced; that is, the cycle time from concept to delivery should be small. Any business with such a requirement will also require that the cycle time for building the software needed by the business be small. Similarly, there are examples, particularly in the financial sector, where the window of opportunity is small. Hence, any software needed to exploit this window of opportunity will have to be built within a small cycle time. Due to these types of applications, where a reduced cycle time is highly desirable even if the costs become higher, there is a growing interest in rapid application development (RAD).

FIG. 3. Software quality factors. Product Operation: Correctness, Reliability, Efficiency, Integrity, Usability. Product Revision: Maintainability, Flexibility, Testability. Product Transition: Portability, Reusability, Interoperability.

One of the major factors driving any production discipline is quality. In the current times, quality is the main "mantra," and business strategies are designed around quality. Clearly, developing methods that can produce high-quality software is another fundamental goal of software engineering. However, while cost is generally well understood, the concept of quality in the context of software needs further discussion. We can view quality of a software product as having three dimensions:

• Product Operation

• Product Transition

• Product Revision

The first dimension, product operation, deals with quality factors such as correctness, reliability, and efficiency. Product transition deals with quality factors like portability and interoperability. Product revision is concerned with those aspects related to modification of programs, including factors such as maintainability and testability.

These three dimensions and the different factors for each are shown in FIG. 3. Correctness is the extent to which a program satisfies its specifications. Reliability is the property that defines how well the software meets its requirements. Efficiency is a factor in all issues relating to the execution of software; it includes such considerations as response time, memory requirement, and throughput. Usability, or the effort required to learn and operate the software properly, is an important property that emphasizes the human aspect of the system. Maintainability is the effort required to locate and fix errors in operating programs. Testability is the effort required to test to ensure that the system or a module performs its intended function. Flexibility is the effort required to modify an operational program (perhaps to enhance its functionality). Portability is the effort required to transfer the software from one hardware configuration to another. Reusability is the extent to which parts of the software can be reused in other related applications. Interoperability is the effort required to couple the system with other systems.

There are two important consequences of having multiple dimensions to quality. First, software quality cannot be reduced to a single number (or a single parameter). And second, the concept of quality is project-specific. For some ultra-sensitive project, reliability may be of utmost importance but not usability, while in some commercial package for playing games on a PC, usability may be of utmost importance and not reliability. Hence, for each software development project, a project-specific quality objective must be specified before the development starts, and the goal of the development process should be to satisfy that quality objective.

Despite the fact that there are many factors, reliability is generally accepted to be the main quality criterion. As unreliability of software arises from the presence of defects in the software, one measure of quality is the number of defects in the software per unit size (generally taken to be thousands of lines of code, or KLOC). With this as the major quality criterion, the quality objective of software engineering becomes to reduce the number of defects per KLOC as much as possible. Due to this, defect tracking is another essential activity that must be done in a software project, in addition to tracking cost. Current best practices in software engineering have been able to reduce the defect density to less than 1 defect per KLOC. It should be pointed out that before this definition of quality is used, what a defect is has to be clearly defined. A defect could be some problem in the software that causes the software to crash, a problem that causes an output to be improperly aligned, one that misspells some word, and so on. The exact definition of what is considered a defect will clearly depend on the project or the standards of the organization developing the project (typically it is the latter).
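
Defect density as described here is a simple ratio, sketched below. The function name and the sample figures (40 defects in 50,000 lines) are assumptions for illustration; what counts as a defect is left to the project, as the text notes.

```python
# Illustrative sketch: the function name and sample figures are assumptions.

def defect_density(defects: int, size_loc: int) -> float:
    """Defects per thousand lines of code (KLOC)."""
    if size_loc <= 0:
        raise ValueError("size must be positive")
    return defects / (size_loc / 1000)


density = defect_density(defects=40, size_loc=50_000)
print(f"{density:.2f} defects per KLOC")
```

Normalizing by size is what makes the figure comparable across projects; a raw defect count would penalize larger systems regardless of their quality.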

2.3 The Problem of Consistency

Though high quality, low cost (or high productivity), and small cycle time are the primary objectives of any project, for an organization there is another goal: consistency. An organization involved in software development does not just want low cost and high quality for a project, but it wants these consistently. In other words, a software development organization would like to produce consistent quality with consistent productivity. Consistency of performance is an important factor for any organization; it allows an organization to predict the outcome of a project with reasonable accuracy, and to improve its processes to produce higher-quality products and to improve its productivity.

Achieving consistency is an important problem that software engineering has to tackle. As can be imagined, this requirement of consistency will force some standardized procedures to be followed for developing software.
