The target audience for this article is digital system designers or students who are unfamiliar with basic elements of design for test.

The two essential parameters of an effective test with high fault coverage are controllability and observability. Scan test is a means of increasing both in a sequential digital IC design. To understand scan test, let’s do a brief thought experiment. Picture a chip design with a memory deeply embedded within the structure. In order to remove the memory from the IC and put it out on the circuit board, you would need to increase the pin count of the package. Now take that idea to an absurd extreme: instead of a memory, assume that you are removing all the flip-flops from the design and putting them on the circuit board, and that you have a package with an effectively infinite pin count. With the flip-flops removed, all that’s left in the chip is combinatorial logic. That’s what a scan test is, except that you access the flip-flops not in parallel through I/O pins but serially, by adding a 2:1 mux in front of each flop. The mux select is a shared scan-enable signal, and the second mux input is driven by the output of the previous flop in the chain (or by a scan-in pin for the first flop), so that asserting the select configures the flops into one or more shift registers.
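To make the muxed flip-flop concrete, here is a minimal behavioral sketch in Python. The names `ScanFlop` and `clock_chain` are hypothetical and chosen only for illustration; a real design would express this in an HDL, and the model ignores timing entirely.

```python
class ScanFlop:
    """Behavioral model of a D flip-flop with a 2:1 mux on its data input."""

    def __init__(self):
        self.q = 0

    def next_value(self, d, scan_in, scan_enable):
        # The 2:1 mux: scan_enable selects the scan path over the functional D input.
        return scan_in if scan_enable else d


def clock_chain(flops, d_values, scan_in, scan_enable):
    """Apply one clock edge to every flop; return the scan-out bit seen before the edge."""
    scan_out = flops[-1].q
    # Each flop's scan input is the previous flop's Q; the first flop uses the scan-in pin.
    scan_inputs = [scan_in] + [f.q for f in flops[:-1]]
    # Compute every next value from the pre-edge state, then update all flops together.
    next_values = [f.next_value(d, si, scan_enable)
                   for f, d, si in zip(flops, d_values, scan_inputs)]
    for f, v in zip(flops, next_values):
        f.q = v
    return scan_out


# Shift four bits into a four-flop chain with scan_enable asserted.
chain = [ScanFlop() for _ in range(4)]
for bit in [1, 0, 1, 1]:
    clock_chain(chain, [0, 0, 0, 0], bit, scan_enable=1)
state = [f.q for f in chain]  # the first bit shifted in ends up in the last flop
```

With `scan_enable=0` the same `clock_chain` call loads each flop from its functional `d` value instead, which is the functional mode of the design.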

The scan test design has two modes: a functional mode, and a test mode in which the flip-flops are accessed via the shift register configuration. A comprehensive test can be created by an Automatic Test Pattern Generation (ATPG) program. ATPG sees the flip-flops as I/O buffers, which is in effect the absurd part of the preceding thought experiment. Since all the program sees is combinatorial logic, each test pattern is independent of every other test pattern; there is no history. The ATPG program targets a fault, sets the inputs of the combinatorial logic so that the faulty node is driven to the value opposite the fault and that value propagates to an “output” (usually a flip-flop input, but an output buffer will also do), checks to see what other faults are also detected by the same pattern, updates the master fault list, and repeats that procedure until the programmed limit on coverage or number of patterns is reached. The resulting patterns constitute the scan test.
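The loop described above can be sketched on a toy circuit. This is only an illustration under stated assumptions: the circuit `y = (a AND b) OR c`, the node names, and the single stuck-at fault model are all chosen for the example, and an exhaustive search over input patterns stands in for the far more efficient search a real ATPG program uses.

```python
from itertools import product

NODES = ["a", "b", "c", "n1", "y"]  # hypothetical node names for y = (a AND b) OR c


def simulate(pattern, fault=None):
    """Evaluate the circuit; fault is (node, stuck_value) or None for the good circuit."""
    def force(node, value):
        # A stuck-at fault overrides whatever value the node would normally take.
        return fault[1] if fault and fault[0] == node else value

    a = force("a", pattern[0])
    b = force("b", pattern[1])
    c = force("c", pattern[2])
    n1 = force("n1", a & b)
    return force("y", n1 | c)


def detects(pattern, fault):
    """A pattern detects a fault when the faulty output differs from the good output."""
    return simulate(pattern) != simulate(pattern, fault)


def generate_patterns():
    """Target a fault, find a pattern, drop every fault that pattern detects, repeat."""
    remaining = [(node, value) for node in NODES for value in (0, 1)]
    patterns = []
    while remaining:
        target = remaining[0]
        # Exhaustive search: only practical because the example has three inputs.
        pattern = next((p for p in product((0, 1), repeat=3) if detects(p, target)), None)
        if pattern is None:
            remaining.pop(0)  # untestable fault: remove it and move on
            continue
        patterns.append(pattern)
        # Fault simulation with fault dropping: update the master fault list.
        remaining = [f for f in remaining if not detects(pattern, f)]
    return patterns


patterns = generate_patterns()
```

Because every pattern is applied to purely combinatorial logic, the patterns can be applied in any order, which is the "no history" property mentioned above.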

A small caveat for anyone debugging a failed scan test: even though the designer is accustomed to thinking of the scan chain in sequence from scan input to scan output, the results as they appear to the tester are in the exact reverse order. The first bit the tester sees is from the last flop in the chain, and the last bit the tester sees is from the first flop in the chain. A data log of a failure will usually be organized the same way, from the tester’s point of view.
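The reversal is easy to see in a small sketch. The helper names `unload_order` and `failing_flop` are hypothetical, and flop 0 is assumed to be the flop nearest the scan input.

```python
def unload_order(captured):
    """Return the captured values in the order the tester strobes them at scan-out."""
    chain = list(captured)
    seen_by_tester = []
    for _ in range(len(chain)):
        seen_by_tester.append(chain[-1])  # scan-out is the last flop's output
        chain = [0] + chain[:-1]          # one shift clock moves every bit one flop along
    return seen_by_tester


def failing_flop(chain_length, tester_bit_index):
    """Map a failing bit position in the tester's data log back to a flop position."""
    return chain_length - 1 - tester_bit_index


order = unload_order(["f0", "f1", "f2", "f3"])
```

Here `order` lists the last flop first, so a failure on the first strobed bit of a four-flop chain points at flop 3, not flop 0.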

The scan test operation on the tester is as follows: put the chip into test mode, load the scan chain, put the chip into functional mode, observe the primary outputs (to cover any logic between flip-flop and output buffer), set the primary inputs (to cover any logic between input buffer and flip-flop), execute one clock cycle, put the chip into test mode, observe the scan chain output, and then alternate between clocking the scan chain and observing its output to record the results from the flip-flops. The process then repeats, with the additional point that the first and last steps overlap one another; in other words, the new scan pattern is being loaded into the scan chain while the results from the previous pattern are being shifted out. Notice also that the first bit of interest is in the last flop in the chain and ready to be observed as soon as the chip goes from functional mode to test mode, so observation (a.k.a. strobing) must be done before the first shift clock.
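The sequence above, including the overlap of unload and load, can be sketched as a small simulation. Everything here is an assumption made for illustration: a single three-flop chain, a made-up `capture` function standing in for the combinatorial logic between flops, and no primary inputs or outputs.

```python
def capture(state):
    """Hypothetical combinatorial logic: next flop values from current flop values."""
    q0, q1, q2 = state
    return [q1 ^ q2, q0 & q2, q0 | q1]


def shift(state, scan_in):
    """One shift clock in test mode: every bit moves one flop toward scan-out."""
    return [scan_in] + state[:-1]


def run_scan_test(patterns):
    """Apply each pattern (flop values, scan-in side first); return the captured responses."""
    n = 3
    state = [0] * n
    responses = []
    # Initial load in test mode: the bit meant for the last flop goes in first.
    for bit in reversed(patterns[0]):
        state = shift(state, bit)
    for i in range(len(patterns)):
        state = capture(state)  # functional mode: exactly one clock cycle
        # Back in test mode: unload this result while loading the next pattern.
        next_load = patterns[i + 1] if i + 1 < len(patterns) else [0] * n
        observed = []
        for bit in reversed(next_load):
            observed.append(state[-1])  # strobe scan-out BEFORE the shift clock
            state = shift(state, bit)
        responses.append(list(reversed(observed)))  # back to flop order
    return responses


responses = run_scan_test([[1, 0, 1], [0, 1, 1]])
```

Moving the strobe after the shift clock in this sketch would lose the last flop's captured value, which is the point made in the final sentence above.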

That’s the ideal situation. However, a little thought will reveal a number of issues. One of the first that comes to mind is bidirectional pins. The shifting process might cause the bidirectional outputs to enable and disable chaotically. This is not a problem if scan test mode also disables the bidirectional outputs.

Another issue is the number of clock cycles required to load the scan chain. As designs get larger and larger, this time spent on the tester may become prohibitive. The first-order solution is to use multiple scan chains loading in parallel. This works well if all scan chains are of approximately equal length. In a design with multiple clock domains, balancing scan chain length will likely require multiple clocks on the same scan chain. Lock-up latches should be inserted between clock domains to solve any clock skew issues. Whether you have multiple scan chains or one, you can arbitrarily load any set of values into any scan chain, and directly monitor all results.

Another feature to take advantage of is that the sequence of the flip-flops in the scan chain, and even which flip-flop is in which scan chain, are irrelevant. Scan chains can be re-sequenced and re-partitioned after placement to conserve routing resources. It is not even necessary to re-run the ATPG program; just reformat the output to accommodate the re-sequencing and re-partitioning.
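Reformatting after re-sequencing works because a pattern is really a set of values per flip-flop, not per chain position. A minimal sketch, assuming a hypothetical pattern representation (a dictionary from flop name to load value) and a hypothetical `format_pattern` helper:

```python
def format_pattern(values, chains):
    """Turn per-flop load values into one scan-in bit stream per chain.

    values: flop name -> value to load.
    chains: list of chains, each a list of flop names with the scan-in side first.
    All chains shift together for as many cycles as the longest chain needs.
    """
    longest = max(len(chain) for chain in chains)
    streams = []
    for chain in chains:
        # The bit destined for the last flop in the chain must be shifted in first.
        bits = [values[name] for name in reversed(chain)]
        # A shorter chain gets leading padding, which falls off its far end.
        padding = [0] * (longest - len(chain))
        streams.append(padding + bits)
    return streams


values = {"f0": 1, "f1": 0, "f2": 1, "f3": 1, "f4": 0}
before = [["f0", "f1", "f2"], ["f3", "f4"]]
after = [["f2", "f0"], ["f4", "f1", "f3"]]  # re-sequenced and re-partitioned

streams_before = format_pattern(values, before)
streams_after = format_pattern(values, after)
```

The same `values` dictionary produces both sets of streams; only the chain description changes. The number of shift cycles per pattern is set by the longest chain, which is why balanced chain lengths matter.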

As designs pass a million gates, even this technique of multiple scan chains might not be sufficient. You run out of tester time, I/O, or both. A relatively recent innovation is scan compression. In short, this involves a very large number of short scan chains loaded through a decoder driven by a small number of inputs, and monitored through a small number of outputs, usually with some form of signature generation. The attributes you lose are the ability to arbitrarily load any value into any scan chain, and perhaps the ability to re-sequence and re-partition without re-running ATPG. Fortunately, random patterns will cover at least eighty percent of all faults, and any remaining faults can be targeted with what controllability and observability you have. To overcome the lack of direct access, you need a more complex ATPG program.
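The loss of arbitrary loading can be illustrated with a toy model. This is a sketch under loud assumptions, not a description of any commercial compression scheme: the decoder is assumed to be a simple XOR network (`spread`), the output side a simple XOR compactor (`compact`), and signature generation is left out.

```python
from itertools import product


def spread(inputs, num_chains):
    """Hypothetical decoder: each chain receives the XOR of a non-empty subset of inputs."""
    bits = []
    for chain in range(num_chains):
        mask = chain % (2 ** len(inputs) - 1) + 1  # non-zero subset selector
        value = 0
        for i, pin in enumerate(inputs):
            if (mask >> i) & 1:
                value ^= pin
        bits.append(value)
    return bits


def reachable_slices(num_inputs, num_chains):
    """All combinations of chain scan-in bits that one shift cycle can produce."""
    return {tuple(spread(p, num_chains)) for p in product((0, 1), repeat=num_inputs)}


def compact(chain_outputs, num_outputs):
    """Hypothetical XOR compactor: chain i feeds output pin i % num_outputs."""
    outputs = [0] * num_outputs
    for i, bit in enumerate(chain_outputs):
        outputs[i % num_outputs] ^= bit
    return outputs


slices = reachable_slices(num_inputs=2, num_chains=4)
```

With two input pins driving four chains in this model, only 4 of the 16 possible bit combinations can be presented to the chains in any one shift cycle, so the pattern generator has to work within that restriction.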
