A clock system for a huge grid of small clock regions is presented.
There is an oscillator in each clock region, which drives the local
clock of a processing element (PE). The oscillators are kept
synchronized by exploiting the phase of their neighbors. In an infinite
mesh, the clock skew would be zero, but in a network of limited size
there will be fringe effects. In a mesh of 25×25 oscillators, the
skew between neighboring regions stays within 3.3 ps. By slightly
adjusting the free-running frequency of the oscillators, this skew can
be reduced to 1.2 ps. The mesh may contain millions of clock regions.
Because there is no central clock, both power consumption and clock
frequency can be improved compared to a conventional clock distribution
network. A PE of 150×150 µm² running at 6.7 GHz with 93 master-slave
flip-flops is used as an example. The PE-internal clock skew is less
than 2.3 ps, and the power consumption of the clock system is 807 µW
per PE. This corresponds to an effective gate and wire capacitance of
509 aF, or 7.3 gate capacitances.
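
The neighbor-based synchronization and its fringe effects can be illustrated with a toy phase model. The sketch below is not the circuit described here: it assumes Kuramoto-style coupling with a hypothetical phase lag `lag`, so that an oscillator's locked frequency depends on how many neighbors it has, which is one simple way to produce edge effects in a finite mesh. The names `neighbor_lists` and `max_neighbor_skew`, the coupling strength, and the time step are all illustrative assumptions, and the model works in radians, so it does not reproduce the picosecond figures quoted above.

```python
import math


def neighbor_lists(n):
    """Return, for each cell of an open-boundary n x n mesh, its 4-neighbors."""
    nbrs = []
    for r in range(n):
        for c in range(n):
            cell = []
            if r > 0:
                cell.append((r - 1) * n + c)
            if r < n - 1:
                cell.append((r + 1) * n + c)
            if c > 0:
                cell.append(r * n + c - 1)
            if c < n - 1:
                cell.append(r * n + c + 1)
            nbrs.append(cell)
    return nbrs


def max_neighbor_skew(n, lag, compensate, coupling=1.0, dt=0.05, steps=4000):
    """Largest steady-state phase difference (radians) between neighbors.

    Toy model (an assumption, not the paper's circuit):
        d(theta_i)/dt = omega_i + coupling * sum_j sin(theta_j - theta_i - lag)
    With compensate=True, each free-running offset omega_i is tuned to
    cancel the lag-induced frequency pull for that cell's neighbor count.
    """
    nbrs = neighbor_lists(n)
    size = n * n
    if compensate:
        omega = [coupling * len(cell) * math.sin(lag) for cell in nbrs]
    else:
        omega = [0.0] * size
    # Small deterministic initial phase perturbation.
    theta = [0.1 * math.sin(1.7 * i) for i in range(size)]
    for _ in range(steps):
        rate = [
            omega[i]
            + coupling * sum(math.sin(theta[j] - theta[i] - lag) for j in nbrs[i])
            for i in range(size)
        ]
        # Forward-Euler integration step.
        theta = [t + dt * d for t, d in zip(theta, rate)]
    worst = 0.0
    for i in range(size):
        for j in nbrs[i]:
            d = theta[j] - theta[i]
            # Wrap the difference into (-pi, pi] before taking its magnitude.
            worst = max(worst, abs(math.atan2(math.sin(d), math.cos(d))))
    return worst
```

Under these assumptions, `max_neighbor_skew(n, lag=0.1, compensate=False)` settles at a nonzero neighbor skew caused by the mesh boundary, while `compensate=True` drives the skew toward zero. The real system only reduces the skew rather than eliminating it; the exact cancellation is an artifact of the idealized model.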
Power supply noise is reduced by scheduling the local oscillators with
gradually shifted phases along one of the grid’s axes. In this way, surge currents, which
generally have their peaks at the clock edges, are distributed evenly
over a full clock cycle.
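
The effect of spreading clock edges over a cycle can be sketched with simple bookkeeping. The example below assumes that every column of PEs draws a rectangular current pulse of unit height starting at its clock edge, and that the column phases are staggered uniformly so that they span exactly one clock period. Both the pulse shape and the uniform stagger are illustrative assumptions, and `supply_current` is a hypothetical helper name.

```python
def supply_current(n_cols, bins, pulse_bins, staggered):
    """Total supply current per time bin over one clock period (arbitrary units).

    Each column contributes a unit-height pulse lasting `pulse_bins` bins,
    starting at that column's clock edge.
    """
    total = [0.0] * bins
    for col in range(n_cols):
        # Assumed uniform stagger: the offsets of all columns together
        # span one full clock period. Without stagger, all edges align.
        offset = (col * bins) // n_cols if staggered else 0
        for k in range(pulse_bins):
            total[(offset + k) % bins] += 1.0
    return total


aligned = supply_current(n_cols=100, bins=100, pulse_bins=5, staggered=False)
spread = supply_current(n_cols=100, bins=100, pulse_bins=5, staggered=True)
print(max(aligned), max(spread))  # prints "100.0 5.0"
```

The total charge per cycle is the same in both cases; only the peak changes. In this idealized setting the staggered current is flat over the whole cycle, whereas with aligned edges all columns draw current in the same few bins.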