Fast polynomial multiplication and decomposition

(a) Fix integers and , and let . (Elements of are polynomials in , where the coefficients are taken modulo , and .) For a sense of scale, you should imagine that , and .

We want to perform the following two operations on elements of .

We want to be able to perform a sequence of multiplications and decompositions as quickly as possible. How?

The fastest algorithm we have currently does an FFT at each step; it would be great if we could avoid this.

(b) Now suppose is a power of , so .

Can you come up with a fast algorithm for multiplying elements of ?

Here we’re looking for an algorithm that will run fast in practice on real hardware. So: