Domain Specific Architectures (DSAs)
Certain applications have sufficient demand to justify the ultimate optimization technique - designing hardware targeted at that one application. Examples include:
Cryptography accelerators
Compression engines
Blockchain hashers
Neural networks
In reality, many DSAs target fast-moving application areas, so a trade-off is made between fixed-purpose hardware and some level of programmability. However, these devices are still not suited to fully general-purpose use in the way a CPU is. Devices in this category include:
Network switch processors. This is a big area with a lot of interesting architecture, but in most cases they are not suitable for accelerating workloads that are not closely coupled to network routing or filtering.
Intel’s Exascale Dataflow engine (hard to tell how domain specific this is yet)
Vector processors such as the NEC SX-Aurora.
A particularly big growth area for DSAs is neural networks:
Google TPU - Detailed description in Hennessy and Patterson.
Mythic - An unusual hardware approach that performs inference in the analog domain.
Many other NN accelerators are under development or already on the market; see the Wikipedia article on AI accelerators.
Another big area is image processing DSAs, which have been around for a long time. Recent progress has been towards making them more programmable and flexible.
Most mobile SoCs have some level of programmable image processor.
Programming DSAs
The wide variety of DSA architectures means that each device is typically programmed through a library rather than a general-purpose toolchain. One example is TensorFlow for Google’s TPU.
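The library-based approach can be illustrated with a minimal sketch. It is not the API of TensorFlow or of any real device: the `Backend` and `Library` classes and the "accelerator" backend are hypothetical names invented for illustration, and the accelerator simply reuses the plain Python `matmul` as a stand-in for hardware. The point it tries to show is that user code calls a library operation, and the library decides whether the DSA supports it or whether it must fall back to the CPU.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain Python matrix multiply, used here as a stand-in for any device."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def transpose(a: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


class Backend:
    """Hypothetical backend: a device name plus the operations it implements."""

    def __init__(self, name: str, ops: Dict[str, Callable[..., Matrix]]) -> None:
        self.name = name
        self.ops = ops


class Library:
    """Hypothetical library front end that hides device selection from the user."""

    def __init__(self, backends: List[Backend]) -> None:
        self.backends = backends  # priority order, accelerator first

    def run(self, op: str, *args: Matrix) -> Tuple[str, Matrix]:
        # Use the first backend that implements the requested operation.
        for backend in self.backends:
            if op in backend.ops:
                return backend.name, backend.ops[op](*args)
        raise NotImplementedError(f"no backend implements {op!r}")


# Assumption: the accelerator supports only matmul; the CPU supports everything.
accelerator = Backend("accelerator", {"matmul": matmul})
cpu = Backend("cpu", {"matmul": matmul, "transpose": transpose})
lib = Library([accelerator, cpu])

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(lib.run("matmul", a, b))   # handled by the accelerator backend
print(lib.run("transpose", a))   # not supported there, so it falls back to the CPU
```

In this sketch the matrix multiply is routed to the accelerator while the transpose falls back to the CPU, which mirrors how a fixed set of supported operations limits what a DSA can accelerate.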
Some DSAs have more general programming approaches, such as Halide for Google’s Pixel Visual Core.
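Halide's central idea is to separate the algorithm (what each output pixel is) from the schedule (the order in which pixels are computed). The sketch below imitates that idea in plain Python and does not use the Halide API: `blur_x`, `schedule_row_major` and `schedule_tiled` are hypothetical helpers written for illustration. It shows one algorithm evaluated under two different loop orders that produce the same image.

```python
from typing import Callable, List

Image = List[List[int]]
PixelFn = Callable[[int, int], int]


def blur_x(src: Image) -> PixelFn:
    """Algorithm: 3-tap horizontal box blur with clamped edges, defined per pixel."""
    width = len(src[0])

    def pixel(x: int, y: int) -> int:
        left = src[y][max(x - 1, 0)]
        right = src[y][min(x + 1, width - 1)]
        return (left + src[y][x] + right) // 3

    return pixel


def schedule_row_major(f: PixelFn, width: int, height: int) -> Image:
    """Schedule 1: visit pixels row by row."""
    return [[f(x, y) for x in range(width)] for y in range(height)]


def schedule_tiled(f: PixelFn, width: int, height: int, tile: int = 2) -> Image:
    """Schedule 2: visit pixels tile by tile, which can improve locality on real hardware."""
    out = [[0] * width for _ in range(height)]
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            for y in range(ty, min(ty + tile, height)):
                for x in range(tx, min(tx + tile, width)):
                    out[y][x] = f(x, y)
    return out


src = [[0, 3, 6, 9],
       [9, 6, 3, 0],
       [3, 3, 3, 3]]
algorithm = blur_x(src)
by_rows = schedule_row_major(algorithm, 4, 3)
by_tiles = schedule_tiled(algorithm, 4, 3)
print(by_rows == by_tiles)
```

Because the schedule changes only the traversal order, a compiler for a programmable image processor is free to choose whichever order suits the hardware without altering the result.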