Wednesday, October 17, 2012

Productivity in Electronics Design

One has heard it innumerable times, especially from management types. "We reward productivity", or "We want greater productivity", with the common variant of "result orientation" thrown in for good measure.

That's all very nice, and it sounds really nice as well, but we engineering types don't get excited when we hear miscellaneous morphemes of the sort.

No, not at all. We are far below all that. We need specifics. What is productivity? - and one means a real definition - a mathematical one. Without such a definition of a quantity, it follows that one can't really think about measuring it, much less increasing it and then rewarding it.

No, certainly not - without a meaningful definition of productivity, the whole management concept of increasing and rewarding it axiomatically reduces to a game of dangling carrots in front of donkeys.

After not much thought one can come up with some kind of definition like "amount of work done per unit effort". One can then immediately proceed to blow a neat little hole in that definition - what is "effort"? OK then, how about work per unit time? That's nice again, but only for something like manufacturing widgets. How could one measure creativity that way? What if one didn't get any ideas for a month and then solved three problems in three minutes? One recognizes manufacturing as a linear process, so any slope-like definition of productivity - number of widgets made in a day, to use a tired old example - works.

But creating something is a nonlinear process, and measuring the slope of a nonlinear thing is an exercise in non-existence: there is no single slope to measure. It is a fallacious application of mathematics at best, and at worst, a fairly certain method to kill the motivation levels of an engineering organization.
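The point about slopes can be made concrete with a small sketch. The numbers below are illustrative only: ten widgets a day is an invented figure, and the design log simply encodes the "nothing for a month, then three problems" scenario from above at one-day granularity.

```python
def rate(per_day, start, end):
    """Average output per day over the half-open window [start, end)."""
    return sum(per_day[start:end]) / (end - start)

# Manufacturing: a steady ten widgets every day (illustrative number).
widgets = [10] * 31
# Design: nothing for thirty days, then three problems solved on day 31.
design = [0] * 30 + [3]

for name, log in (("widgets", widgets), ("design", design)):
    # Same three windows for both: first 30 days, last day, whole period.
    print(name, rate(log, 0, 30), rate(log, 30, 31), rate(log, 0, 31))
```

For the widget line every window gives the same rate, so "the slope" is a meaningful number. For the design log the answer depends entirely on which window one happens to pick.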

Having thus destroyed that pedestal of management - productivity - at least as understood by the not-so-rigorous, we now proceed to figure out measurements that do matter, and then go about trying to really optimize and improve them.

Say we are designing an ASIC. We'll concentrate on ASICs for this piece, but the concept can be extended to any kind of electronics design, or for that matter, any kind of design.

One parameter of significance is the total time taken from concept to completion. Reduce this, and rather predictably, thou shalt win the gratitude of management. Another good parameter to measure is the team size for the project. Reduce this and thou shalt also win the gratitude of management.

This is easier said than done, however. Astute readers will already have recognized the problem - that "reduction" is a difference operation. That means that one needs to have two time lines, or two team sizes. The problem of course is that nobody does the exact project twice - in time or space. So one of the numbers required for taking the difference is necessarily a guess, or an estimate - just short of being imaginary. No, not imaginary as in mathematics, imaginary as in fictional.

And yet, as any experienced engineer will tell you, a guess is better than no data at all; and many estimates of time lines and team sizes are surprisingly accurate, notwithstanding the empirical methods used to arrive at them.

So what if one took an estimate for time or manpower, and met the budget for a change? Better yet, what if one beat it by a significant amount? Now that would be a good achievement, wouldn't it? Whether or not the management was impressed with it or was even told about it, one's team would certainly be grateful. Now that's the kind of gratitude that could genuinely lead to better quality, or perhaps a lower attrition rate.

One hopes that we agree that attempting to reduce time and team size meaningfully would be a good thing. So then, what is a meaningful reduction in time or team size? One posits that anything less than a 30% difference is lost in the noise. That is, given the inherent variability of the design and creation process, saving anything less than 30% on time or team size could be attributed to any number of other hard-to-measure factors - good management, good team integration, better engineering practices or pixie dust.

Stated alternately, if you went to your management and claimed a 10% reduction in time for a given project, they'd probably send you on your way with a pat on your back and nothing else, because it couldn't be proved to the exclusion of other factors that this 10% was won because you did something differently. In this, one could actually sympathize with the management - the signal to noise ratio for this kind of measurement is abysmal.

This is an important insight - that "productivity" as it is defined today is exactly what one suspects it is - a game of dangling carrots before donkeys, not only because its definition is vague, but also because even when it is defined with some semblance of rigor, the noise in the measurement obfuscates the signal to the point of uselessness. Keep this in mind the next time you unsuccessfully ask your management for a 10% raise for a 10% increase in productivity. The former measurement has nearly infinite SNR and the latter, nearly zero.

But there is a way. If the reduction in time or team size is greater than a third or so, then the improvement rises above the noise floor to establish itself firmly as signal - that is, a true increase in the reciprocal measure - productivity or some analogous concept.
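A minimal sketch of the arithmetic, assuming "productivity" is taken as the reciprocal of time spent on a fixed amount of work. The 30% threshold is the rule of thumb posited above, not a measured constant, and the schedules in weeks are hypothetical.

```python
NOISE_FLOOR = 0.30  # the article's rule of thumb, not a measured constant

def reduction(estimate, actual):
    """Fractional saving against the estimate (time or head count)."""
    return (estimate - actual) / estimate

def productivity_gain(r):
    """Gain in the reciprocal measure: the same work in (1 - r) of the time."""
    return 1.0 / (1.0 - r) - 1.0

def is_signal(r, floor=NOISE_FLOOR):
    """True when the saving is large enough to stand out from the noise."""
    return r >= floor

for estimate, actual in ((30, 27), (30, 20)):  # hypothetical schedules in weeks
    r = reduction(estimate, actual)
    print(f"saved {r:.0%}, gain {productivity_gain(r):.0%}, signal: {is_signal(r)}")
```

Under this definition a saving of one third in time corresponds to a 50% gain in the reciprocal measure, while a 10% saving stays well below the assumed floor.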

Improving productivity in ASIC design is somewhat like flying an aircraft. Any of a thousand things can cause a crash, and innocent errors can string together to spell disaster. Quite like that, there are a thousand factors conspiring to kill productivity. And like the operation of an aircraft, you must tightly pin down the entire development chain.

In the case of an ASIC, you must plan out in some detail the various steps - concept, architecture, modeling, RTL, verification, performance, synthesis, timing, place and route, signal integrity, post-route verification, emulation, bring-up, debug, failure analysis, QA and so on. You must plan out resources - people, server time, tool licenses etc. You must develop cross-functional relationships, however strange that sounds. Let any one of these steps go wrong, and your chip is toast. It is in the milieu of such pitfalls that we would like to decrease time or team size by a third or more.

It has been seen time and time again that execution time increases dramatically when the number of iterations for a given task increases. Iterations increase because of specification changes, because of not anticipating issues and problems, or because of execution errors. There isn't much that can be done about the latter two, since it truly takes experience and skill to do one's job right. There is, in short, no process that can replace experience and skill when those specific qualities are required.

Specification changes are another matter. Sadly, even a well-planned design is not immune to this curse - which, more than anything else, causes teams to repeat whatever has already been done. Not surprisingly, this is a factor in creating boredom, and a desire to not work now, but rather wait until the specification has "solidified", as it were - and time lost thus is like the rent of empty hotel rooms - lost forever.

One has to accept that specification changes are more or less unavoidable, especially in a design phase that lasts several months. During this time, new data often come to light about desirable features, what the competition is doing or what customers prefer, and these necessitate changes in the specs. Unsurprisingly, most teams are loath to respond positively to such changes.

This stems from a well justified desire in teams to not iterate. However, design teams choose to avoid iteration by going at it sequentially, as if every future step depended on past ones. Perhaps this is because we are humans - we expect temporal sequences in effects to map bijectively from temporal sequences in causes. This need not be so - indeed the very insight that such causality is an assumption is one of the most dramatic harbingers of future success.

Breaking temporal sequences requires doing things in parallel, as it is known colloquially. "Parallelism in the process" is not a new concept - it is widely used in manufacturing, for instance to dramatically speed up an assembly line. The new thing is trying to apply it to something that is inherently the antithesis of an assembly line - design and development.

Parallelism means not only thinking about and planning out several of the steps in advance, but actually executing them ahead of time too. Thus, the smart engineer will first do those parts of the RTL that can be used simultaneously to create a software emulator of the chip, so that some members of the team, for instance those who do back-end work, can spend time writing bring-up scripts. Sometimes such connections are non-intuitive, so some thought is necessary to determine connections between steps, and do as many in parallel as possible.
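The effect of breaking the sequence can be sketched with a toy dependency graph. Everything here is hypothetical: the task names, the durations in weeks and the dependencies are invented for illustration, and the sketch assumes there are enough people to run independent tasks at the same time.

```python
from functools import lru_cache

# Hypothetical durations in weeks; not data from any real project.
DURATION = {
    "spec": 2, "rtl_first": 3, "rtl_rest": 5,
    "emulator": 4, "bringup_scripts": 3, "verification": 6,
}
# Which tasks must finish before each task can start.
DEPENDS_ON = {
    "spec": [],
    "rtl_first": ["spec"],
    "rtl_rest": ["rtl_first"],
    "emulator": ["rtl_first"],        # needs only the early part of the RTL
    "bringup_scripts": ["emulator"],
    "verification": ["rtl_rest"],
}

def sequential_time():
    """Every task waits for the previous one, whether it needs it or not."""
    return sum(DURATION.values())

@lru_cache(maxsize=None)
def finish_time(task):
    """Earliest finish when a task starts as soon as its real dependencies end."""
    start = max((finish_time(dep) for dep in DEPENDS_ON[task]), default=0)
    return start + DURATION[task]

def parallel_time():
    """Length of the longest dependency chain (the critical path)."""
    return max(finish_time(task) for task in DURATION)

print(sequential_time(), parallel_time())
```

With these made-up numbers the emulator and bring-up scripts no longer sit at the end of the schedule, so the overall duration is set by the RTL and verification chain alone.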

Once again, the sharp amongst one's readers will realize that much of this is easier said than done - for instance, there is always the perturbing thought that a change in specification will now impact multiple teams, compounding all the evils of iteration.

This seems like a huge stumbling block, until one comes to the understanding that if one could identify and automate away those processes or sub-processes that need iteration, then one could parallelize away to one's heart's content.

Thus, in the specific case above, a register generation tool that takes in a specification and simultaneously generates RTL and all that would be needed for verification, emulation, bring-up scripts, standards compliance and audits or whatever-else-not would mean that an architecture team, an RTL team, a verification team and a bring-up team could work in parallel, with all attendant benefits.
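A minimal sketch of what such a register generation tool might look like, assuming a single in-memory specification as the source of truth. The `Register` fields, the register names and the output formats are all hypothetical, chosen only to show one specification feeding several teams at once.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Register:
    name: str    # register name, e.g. "CTRL"
    offset: int  # byte offset from the block base address
    reset: int   # value expected after reset

# Hypothetical specification: the single source every team reads from.
SPEC = [
    Register("CTRL", 0x00, 0x00000001),
    Register("STATUS", 0x04, 0x00000000),
]

def gen_rtl_params(spec):
    """Address and reset constants for the RTL team (Verilog localparams)."""
    lines = []
    for r in spec:
        lines.append(f"localparam [31:0] {r.name}_ADDR  = 32'h{r.offset:08X};")
        lines.append(f"localparam [31:0] {r.name}_RESET = 32'h{r.reset:08X};")
    return "\n".join(lines)

def gen_c_header(spec):
    """Offset macros for the software and emulation teams."""
    return "\n".join(f"#define {r.name}_OFFSET 0x{r.offset:02X}u" for r in spec)

def gen_bringup_checks(spec):
    """(offset, expected value) pairs a bring-up script could read back after reset."""
    return [(r.offset, r.reset) for r in spec]

print(gen_rtl_params(SPEC))
print(gen_c_header(SPEC))
print(gen_bringup_checks(SPEC))
```

When the specification changes, only `SPEC` is edited and every output is regenerated, so the cost of an iteration is one rerun of the tool rather than a round of hand edits in each team.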

This brings us to the real point of today's article: To successfully improve the vague notion of productivity, one needs to parallelize the process - and to parallelize successfully, one needs a proper toolset for automating as many tasks as possible. Specifically, one needs to create tools that allow simultaneous development of a multitude of sub-tasks with nearly zero penalty for iteration on any of them.

So is such a toolset possible - even if only conceptually? One asserts that indeed it is, and hopes that this rather titillating thought will suffice until future missives.
