1. Serial multiplier
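A serial multiplier computes the product of two N-bit binary numbers x and y with the simple shift-and-add method: starting from the least significant bit of y, each bit that is 1 adds the correspondingly shifted copy of x to a running sum. The whole computation needs only a shifter and one adder.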
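module multi_CX(clk, x, y, result);
    input clk;
    input [7:0] x, y;
    output [15:0] result;
    reg [15:0] result;

    // FSM states: s0 = load operands, s1 = shift and add, s2 = output result
    parameter s0 = 0, s1 = 1, s2 = 2;

    reg [2:0] count = 0;   // index of the bit of y being processed
    reg [1:0] state = 0;
    reg [15:0] P, T;       // P: running sum, T: shifted copy of x
    reg [7:0] y_reg;       // remaining bits of y, least significant first

    always @(posedge clk) begin
        case (state)
            s0: begin
                count <= 0;
                P <= 0;
                y_reg <= y;
                T <= {{8{1'b0}}, x};
                state <= s1;
            end
            s1: begin
                // Process the current bit first and test the counter afterwards,
                // so that all eight bits of y (including bit 7) are accumulated.
                if (y_reg[0] == 1'b1)
                    P <= P + T;
                y_reg <= y_reg >> 1;
                T <= T << 1;
                count <= count + 1;
                if (count == 3'b111)
                    state <= s2;
            end
            s2: begin
                result <= P;
                state <= s0;
            end
            default: state <= s0;   // recover from an unused state encoding
        endcase
    end
endmodule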
The serial multiplier processes one bit of y per clock cycle, so an 8-bit multiplication takes eight shift-and-add cycles, plus one cycle to load the operands and one to register the result. The serial multiplier is therefore slow and has high latency. Its advantage is that it occupies the fewest resources of all multiplier types, and it is widely used in low-speed signal processing.
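The shift-and-add arithmetic can be checked separately from the state machine. The following is a minimal simulation-only sketch, not part of the design above: the module name shift_add_check and the function shift_add_mul are hypothetical, and the function simply mirrors the P, T and y datapath in a loop. It compares the result against the built-in * operator for every pair of 8-bit operands, including values of y whose top bit is set, which is the last bit processed and the easiest one to drop through an off-by-one in the bit counter.

```verilog
// Simulation-only sketch (hypothetical names): shift-and-add model
// checked against the built-in * operator for all 8-bit operand pairs.
module shift_add_check;

    // Multiply two 8-bit values by shift-and-add, one bit of y per iteration.
    function [15:0] shift_add_mul(input [7:0] x, input [7:0] y);
        reg [15:0] P, T;
        integer i;
        begin
            P = 16'd0;            // running sum
            T = {8'd0, x};        // x, shifted left once per iteration
            for (i = 0; i < 8; i = i + 1) begin
                if (y[i])
                    P = P + T;    // add x * 2^i when bit i of y is 1
                T = T << 1;
            end
            shift_add_mul = P;
        end
    endfunction

    integer a, b, errors;

    initial begin
        errors = 0;
        for (a = 0; a < 256; a = a + 1)
            for (b = 0; b < 256; b = b + 1)
                if (shift_add_mul(a[7:0], b[7:0]) !== a * b)
                    errors = errors + 1;
        $display("mismatches: %0d", errors);
        $finish;
    end
endmodule
```

Changing the loop bound from 8 to 7 makes the mismatch count non-zero, because every product with y >= 128 then loses its top partial product.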
2. Pipeline multiplier
A fast multiplier usually adopts a bit-parallel iterative array structure, in which all N bits of each operand are submitted to the multiplier in parallel. On an FPGA, however, carry propagation is faster than addition, so this array structure is not optimal. Instead, we can use a multi-stage pipeline that adds adjacent partial products in pairs, then adds those sums in pairs, and so on until the final product is output. This forms a binary tree, so an N-bit multiplier needs log2(N) levels of adders.
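For the 4-bit case, the pairing can be written out explicitly. In this sketch, a and b stand for the two operands and b_i for bit i of b; the labels stored0 to stored3, add01 and add23 are the register names used in the module that follows.

```latex
a \cdot b \;=\; \sum_{i=0}^{3} b_i \, a \, 2^{i}
\;=\; \underbrace{\bigl(\underbrace{b_0 a}_{\mathrm{stored0}} + \underbrace{2\, b_1 a}_{\mathrm{stored1}}\bigr)}_{\mathrm{add01}}
\;+\; \underbrace{\bigl(\underbrace{4\, b_2 a}_{\mathrm{stored2}} + \underbrace{8\, b_3 a}_{\mathrm{stored3}}\bigr)}_{\mathrm{add23}}
```

For example, with a = 13 and b = 11 (binary 1011), the partial products are 13, 26, 0 and 104, so add01 = 39, add23 = 104, and the product is 39 + 104 = 143.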
module multi_4bits_pipelining(mul_a, mul_b, clk, rst_n, mul_out);
    input [3:0] mul_a, mul_b;
    input clk;
    input rst_n;
    output [7:0] mul_out;

    reg [7:0] mul_out;
    // Stage 1: partial products (mul_a shifted by 0..3, gated by each bit of mul_b)
    reg [7:0] stored0;
    reg [7:0] stored1;
    reg [7:0] stored2;
    reg [7:0] stored3;
    // Stage 2: sums of adjacent partial products
    reg [7:0] add01;
    reg [7:0] add23;

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            mul_out <= 0;
            stored0 <= 0;
            stored1 <= 0;
            stored2 <= 0;
            stored3 <= 0;
            add01 <= 0;
            add23 <= 0;
        end
        else begin
            stored0 <= mul_b[0] ? {4'b0, mul_a} : 8'b0;
            stored1 <= mul_b[1] ? {3'b0, mul_a, 1'b0} : 8'b0;
            stored2 <= mul_b[2] ? {2'b0, mul_a, 2'b0} : 8'b0;
            stored3 <= mul_b[3] ? {1'b0, mul_a, 3'b0} : 8'b0;
            add01 <= stored1 + stored0;
            add23 <= stored3 + stored2;
            // Stage 3: final sum
            mul_out <= add01 + add23;
        end
    end
endmodule
As can be seen from the figure, the pipelined multiplier is much faster than the serial multiplier: it accepts a new pair of operands on every clock cycle and delivers each product three cycles later. It is widely used in signal processing that does not run at high speed. For the multiplication of high-speed signals, it is generally necessary to use the hard-core DSP units embedded in the FPGA chip.
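The same binary-tree structure extends to wider operands. The following is a rough sketch of an 8-bit variant, under assumptions of my own rather than taken from the design above: the names mul8_tree_pipe, pp, s1, s2 and ref_q are hypothetical, the reset is omitted for brevity, and the partial products are held in small register arrays. Eight partial products are reduced by log2(8) = 3 levels of adders, so a product appears four clock cycles after its operands (one cycle for the partial products, three for the adder tree). A small self-checking testbench is included so the block can be simulated on its own.

```verilog
// Hypothetical 8-bit variant: 8 partial products, 3-level adder tree.
module mul8_tree_pipe(input clk, input [7:0] a, input [7:0] b,
                      output reg [15:0] p);
    reg [15:0] pp [0:7];   // stage 1: partial products
    reg [15:0] s1 [0:3];   // stage 2: first adder level (4 sums)
    reg [15:0] s2 [0:1];   // stage 3: second adder level (2 sums)
    integer i;

    always @(posedge clk) begin
        for (i = 0; i < 8; i = i + 1)
            pp[i] <= b[i] ? ({8'd0, a} << i) : 16'd0;
        for (i = 0; i < 4; i = i + 1)
            s1[i] <= pp[2*i] + pp[2*i+1];
        for (i = 0; i < 2; i = i + 1)
            s2[i] <= s1[2*i] + s1[2*i+1];
        p <= s2[0] + s2[1];   // stage 4: third adder level
    end
endmodule

// Self-checking testbench: one new operand pair per clock cycle.
module mul8_tree_pipe_tb;
    reg clk = 0;
    reg [7:0] a = 0, b = 0;
    wire [15:0] p;
    reg [15:0] ref_q [0:3];   // a*b delayed by the 4-cycle pipeline latency
    integer n, k, errors;

    mul8_tree_pipe dut(.clk(clk), .a(a), .b(b), .p(p));

    always #5 clk = ~clk;

    always @(posedge clk) begin
        ref_q[0] <= a * b;
        for (k = 1; k < 4; k = k + 1)
            ref_q[k] <= ref_q[k-1];
    end

    initial begin
        errors = 0;
        // 65536 operand pairs, plus 4 extra cycles to drain the pipeline
        for (n = 0; n < 65536 + 4; n = n + 1) begin
            @(negedge clk);
            if (n >= 4 && p !== ref_q[3])
                errors = errors + 1;
            a = n[7:0];
            b = n[15:8];
        end
        $display("mismatches: %0d", errors);
        $finish;
    end
endmodule
```

The testbench changes the operands on every falling clock edge without waiting for earlier results, which illustrates the throughput advantage of pipelining: latency grows with the depth of the tree, but one product still completes per clock cycle.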