What should we be looking at?


By Chris Douce

Through the discussion list I asked the PPIG community what issues we should be studying. I also took the opportunity to put this question to Ruven Brooks, a long-standing contributor to the psychology of programming community. Many thanks are extended to Ruven Brooks and Carl Hardeman for their considered replies.

Ruven Brooks

The phrase 'programming in the large' was introduced in 1975 by Frank DeRemer and Hans Kron to emphasize the difference in tasks and activities between software development done by one or two people and software development done by larger groups.

One widely circulated model of effort allocation in large scale development claims that as little as 20% of the effort may be devoted to writing code that will be part of the delivered product. Since most psychology of programming work has focused on the coding phase, a useful guide for research directions may be to look at psychological aspects of those activities that take up the remaining 80% of the resources.

One way to view all of these other activities is that programming is a problem solving process that begins with a problem to be solved. At the beginning, this problem is not the problem of writing a program, but rather is some need from outside the programming domain: 'give me an easier way to manage my schedule than a paper calendar.'

As with all problems, not just 'ill-defined' ones, a problem elaboration activity takes place. This is referred to as requirements gathering or requirements analysis. As the features of the problem emerge, design solutions at various levels are proposed for them. Eventually, the design becomes refined to a detailed enough level to permit coding in an existing programming language.

What parts of this process are well supported by research and which ones have been neglected?

The upstream problem elaboration process has been an area of active research; topics such as task analysis and user modeling have a large research literature. The design process has been the subject of substantial study, with concepts such as 'opportunistic design' being used to explain design behavior.

In software, there is a large body of work on design and specification notations, although little of the work is focused on behavioral issues. Where the light of investigation seems dimmer is at the boundary between specification and coding.

How much specification is effective? Can too much specification actually reduce coding performance? What specification notations work best? Are more formal ones really better in terms of the coding product produced? Is it useful to present the same information in more than one format? How much role does general domain knowledge play in interpreting specifications? Is the design affected by the order in which a specification is presented? These are all questions which could benefit from more investigation.

Although the body of work on coding is large (though probably not large enough), nearly all of it has been focused on general purpose programming languages, particularly those taught in academic environments to novices.

In commercial software production, though, there is a great deal of work done in scripting languages, particularly those associated with an application. For example, there are install script languages, build scripting languages (MAKE) and test scripting languages. In particular, for those selling packaged software, install scripts are critical, since if a product will not install, or worse still, messes up already installed products, the customer is far more likely to make a support line call or return the product than if there are bugs in the code once the product is installed.

In addition to working with different objects than general purpose languages, these scripting languages often have very different syntax and control flow. It may well turn out to be the case that what is known about general purpose languages applies to these languages as well, but the question is still entirely open.

In software development environments which produce software to be sold externally, testing is a major use of software development resources. The number of testers may equal or exceed the number of programmers, and even in methodologies such as Extreme Programming which focus on small programming teams, programmers spend significant amounts of time on test code.

Among the questions to be answered are: What factors affect a person's choice of what tests are to be performed? How good are programmers at testing their own code? Are they better at testing other people's code than their own? What are the individual differences in testing? How do you train novices to do testing? The research areas mentioned earlier also interact with testing; there are test script languages, and 'black box' testing starts with the specifications, so script language and specification research questions have their testing aspects.

Alas, my perception is that most software companies are currently far more interested in transferring work to lower wage countries than they are in understanding software development and providing better tools; nevertheless, I suspect that research in the areas outlined is more likely to be seen as relevant and, at least, a starting point for interaction, than the past areas of psychology of programming.

As well as being a seasoned industry professional, Ruven published his first psychology of programming paper in the International Journal of Man-Machine Studies back in 1977. He is widely known for his work on 'program beacons', which continues to inspire and direct empirical research exploring strategies of program comprehension.

Carl Hardeman

It is clear to all but management and professional project managers and most developers that the problems with getting software correct stem from:

  1. Inherent complexity. Stop here if you do not understand that the number of business cases and test cases explodes exponentially (like a salesman's route optimization problem) and therefore testing can never be more than sampling. Design must be simplified and abstracted just like outlining a chapter in one's book in Ms Thistlebaum's English Literature class.
  2. Constant change. Design for change.
  3. Extreme requirements for performance, reliability, volume, etc.
  4. Maintenance of conceptual integrity (from Fred Brooks of Brooks' Law fame).

So the questions I suggest PPIG address are:

  1. Do developers fail to recognize complexity or shun it based on their own confidence?
  2. Why do developers detest engineering, particularly measurement and statistical process control and how can we overcome that? This is a sine qua non for getting software right.
  3. What definitive objective statements can be made as to when a software product is ready for production use?
  4. Why do we fail to recognize a big ball of mud, for which the only solution is redesign? (See 'Big Ball of Mud'.)
  5. Why do we fail to understand that the business model changes of the past few years (to a realtime state-event process control model from an after-the-fact damage control model) require a rearchitecting of software rather than an extensive patch-up?
  6. Why are there massive projects and failures? One could argue there are no massive projects, only large scale integration of smaller projects, none of which should be so large that intellectual control over it is lost. For instance, the US Treasury Department income tax system was a failure. Could they have written a small system which handled only the simple 1040EZ forms, which would have had the large majority of taxpayers up and running on a new system in a short time? Then develop separate systems to handle the more complex cases.
    Yes, they could have, but they worked on a failed monolithic model.

The problem with software quality is the failure to recognize basic complexity and handle it with engineering methods. It seems most cannot see that, like the Snuffleupagus or the Emperor's new clothes.

I once knew of a system with 100+ variables which could affect the outcome. Assuming they are two-valued (and many were multi-valued), that's 2 to the 100th power, minus 1, cases. That's more cases than stars in the observable universe. And that ignores further complexity such as the variables having to occur in specific sequences. That makes DNA look easy. Yet I imagine the failed new US IRS system had at least that complex a situation.
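The arithmetic behind this claim can be checked directly. The following is a minimal sketch, not part of the original reply: it counts every combination of 100 two-valued variables (2 to the 100th power) and estimates how long exhaustive testing would take. The function names and the testing rate of one billion cases per second are assumptions chosen purely for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def case_count(num_variables: int, values_per_variable: int = 2) -> int:
    """Number of distinct combinations of independent variables."""
    return values_per_variable ** num_variables


def years_to_test(cases: int, tests_per_second: float) -> float:
    """Years needed to run every case once at a fixed testing rate."""
    return cases / (tests_per_second * SECONDS_PER_YEAR)


total = case_count(100)
print(total)  # 1267650600228229401496703205376, roughly 1.27e30

# Assumed rate: one billion test cases per second (hypothetical).
print(f"{years_to_test(total, 1e9):.2e} years")
```

Even at that assumed rate, exhaustive testing would take on the order of tens of trillions of years, and this is before multi-valued variables or ordering constraints are considered.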

That should help with the understanding that

  1. you cannot test when you have a combinatorial explosion of cases - you can only sample
  2. therefore you can only use statistics (e.g. standard error of the estimate and Markov chain analysis) to make statements about being ready for production, and
  3. these problems cannot be overpowered with human intellect alone and require engineering e.g. Cleanroom Software Engineering using sequence enumeration for specification and statistical sampling for verification.
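The idea of sampling rather than exhaustively testing can be sketched as follows. This is an illustrative example only, not the Cleanroom method itself: it draws test cases uniformly at random, runs them against a hypothetical toy system, and reports the observed failure rate together with the standard error of that proportion. The toy system, the uniform sampling, and all names are assumptions; a real statistical testing effort would sample from a model of expected usage instead.

```python
import math
import random

NUM_VARIABLES = 100  # mirrors the "100+ variables" example in the text


def sample_case(rng):
    """Draw one case uniformly at random from the 2**100 possibilities."""
    return [rng.random() < 0.5 for _ in range(NUM_VARIABLES)]


def toy_system_passes(case):
    """Hypothetical system under test: fails only when the first
    five inputs are all True (a true failure rate of 1/32)."""
    return not all(case[:5])


def estimate_failure_rate(passes, num_samples, seed=0):
    """Return (observed failure rate, standard error of that proportion)."""
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(num_samples) if not passes(sample_case(rng))
    )
    rate = failures / num_samples
    std_error = math.sqrt(rate * (1 - rate) / num_samples)
    return rate, std_error


rate, std_error = estimate_failure_rate(toy_system_passes, num_samples=2000)
print(f"estimated failure rate: {rate:.4f} +/- {std_error:.4f}")
```

The estimate says nothing certain about any individual untested case; it only supports a statistical statement about the population of cases, which is the point being made in the list above.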

Having rambled through all that: how can PPIG contribute to the improvement in the quality of software? Clearly coding is a minor issue. I am positive your members can frame the proper questions once they appreciate the correct problem.

As a Master Gardener, one grows to understand gardening is about the soil as the plants will generally take care of themselves once planted in a nice prepared bed of soil.

Carl Wayne Hardeman has 40+ years of experience as a developer and 25+ years as adjunct faculty in Computer Science.