A coarse-grained morphable datapath unit (mDPU) is proposed. The mDPU implements its multiplier in such a way that the component adders can be reused when the multiplier is not needed. Pipelining further improves the design by balancing the datapath in the temporal sense. Together, these two features result in a design that uses silicon area and time efficiently. The mDPU enables a judicious set of coarse-grained instructions, which we show can implement typical signal processing functions. A radix-2, 64-point FFT has been implemented in 90 nm technology using the proposed mDPUs; performance and energy results from the physical design phase are reported and compared with a comparable state-of-the-art design from the research community. We report a 4x improvement in performance and a 2.5x improvement in power-performance product.
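The adder-reuse idea can be illustrated with a small behavioral sketch. This is not the mDPU's actual microarchitecture, which the abstract does not specify; it is a minimal software model assuming a shift-and-add multiplier built from 16-bit adders. The names `add16`, `mdpu_multiply`, and `mdpu_parallel_add`, and the 8-bit operand width, are hypothetical and chosen only for illustration.

```python
# Behavioral sketch only: a software model of "morphing" between a
# multiplier and independent adders. All names and widths are assumptions.

def add16(a, b):
    """One hypothetical 16-bit component adder (wraps on overflow)."""
    return (a + b) & 0xFFFF


def mdpu_multiply(a, b, width=8):
    """Multiply mode: shift-and-add over the bits of b.

    Each set bit of b contributes one shifted copy of a, accumulated
    through the component adder. With 8-bit operands the product
    always fits in 16 bits.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    acc = 0
    for i in range(width):
        if (b >> i) & 1:
            acc = add16(acc, a << i)
    return acc


def mdpu_parallel_add(pairs):
    """Adder mode: the same adders operate independently on operand pairs."""
    return [add16(x, y) for x, y in pairs]
```

In this model, a unit that is not needed as a multiplier can instead service several independent additions per step, which is the kind of reuse the abstract describes in hardware terms.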