Rounding down the absolute value of a signed fixed-point number in Verilog


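Background

Hello, I am working on building an R2MDC-FFT engine in Verilog.

Currently, the engine output has rounding errors (it narrowly fails some of the provided test cases), and I suspect this is due to the way I round the multiplication results in the Butterfly Unit.

In more detail...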

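  • Signed fixed-point numbers are in Q<8.8> format, using 2s complement.

    • 1 bit for the sign
    • 7 bits for the integer part
    • 8 bits for the fractional part
  • A Q<8.8> input * Q<8.8> input gives a Q<16.16> value, which my Butterfly Unit outputs as a Q<8.8> result.

    • Therefore, we need to convert from Q<16.16> to Q<8.8>, which is where rounding comes into play.

    • The required rounding is to round down the absolute value of the signed fixed-point number.

  • The conversion of the Q<16.16> value to the Q<8.8> output can be done in two steps...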

Step 1 - Convert Q<16.16> to Q<24.8> (i.e. discard the extra fractional bits). This is where rounding takes effect.

Referring to the following resource...

I note the following two points

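  1. Performing an arithmetic right shift (i.e. truncation) on a positive signed fixed-point number will always cause the absolute value of the result to be rounded down.

    • Therefore, I can simply truncate positive signed fixed-point numbers to achieve the rounding I want.
  2. Performing an arithmetic right shift (i.e. truncation) on a negative signed fixed-point number will always cause the absolute value of the result to be rounded up.

    • Therefore, adding 1'b1 to the truncated negative signed fixed-point number will "reverse" this effect, as it causes the absolute value to decrease. This achieves the rounding I want.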
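The two observations above can be checked in simulation with a small sketch. The module name `tb_shift_demo` and the chosen operand are illustrative assumptions (the magnitude is taken from the worked example in the reproducible testbench); it only shows what `>>>` does to one positive and one negative Q<16.16> value.

```verilog
// Hypothetical demo testbench, not part of the original design.
module tb_shift_demo;
    reg signed [31:0] pos, neg;
    reg signed [31:0] pos_shifted, neg_shifted;

    initial begin
        pos =  32'sd4725786;  // +72.10977172851562 in Q<16.16>
        neg = -32'sd4725786;  // -72.10977172851562 in Q<16.16>

        // >>> on a signed operand is an arithmetic shift (sign bits are shifted in)
        pos_shifted = pos >>> 8;
        neg_shifted = neg >>> 8;

        $display("pos >>> 8 = %0d", pos_shifted);  // 18460  -> +72.109375   (absolute value shrinks)
        $display("neg >>> 8 = %0d", neg_shifted);  // -18461 -> -72.11328125 (absolute value grows)
        $finish;
    end
endmodule
```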
Step 2 - Convert Q<24.8> to Q<8.8> (i.e. discard the extra integer bits). We only need to take the lower 16 bits of the Q<24.8> value; no rounding is needed.

Here is the (relevant) implementation in the Butterfly Unit...

    module bf (
        input  signed [15:0] A,
        input  signed [15:0] B,
        output reg signed [15:0] C
    );
        localparam NUM_FRACTIONAL_BITS = 8;

        // 1. Perform sign extension of inputs
        wire signed [31:0] A_extended;
        wire signed [31:0] B_extended;
        assign A_extended = {{16{A[15]}}, A};
        assign B_extended = {{16{B[15]}}, B};

        // 2. Perform multiplication, and then collect the results from the LSB
        wire signed [63:0] mult_result_extended;
        wire signed [31:0] mult_result;
        assign mult_result_extended = A_extended * B_extended;
        assign mult_result = mult_result_extended[31:0];

        // 3. Convert Q<16.16> to Q<8.8>
        reg signed [31:0] mult_result_shifted;
        always @(*) begin
            mult_result_shifted = mult_result >>> NUM_FRACTIONAL_BITS; // Q<16.16> to Q<24.8> truncation

            // Check signedness of result (i.e. Check MSB of 2s complement representation)
            if (mult_result[31]) begin
                // Result is negative, we add 1 to round down its absolute value
                C = mult_result_shifted + 2'sb01; // Round, then convert Q<24.8> to Q<8.8>
                // Edgecase of overflow is checked and handled (code not shown)
            end
            else begin
                // Result is positive, truncation has same effect as rounding down its absolute value
                C = mult_result_shifted; // Round, then convert Q<24.8> to Q<8.8>
            end
        end
    endmodule

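Help needed

The above implementation does not work, as I am still observing rounding errors.

I would like to verify whether the rounding logic above is correct. If not, what can I do to correct it?

Any help would be greatly appreciated, thank you.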

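Things I have tried

I have looked at other posts that may be relevant, such as

https://stackoverflow.com/questions/73630956/truncated-signed-fixed-point-conversion-from-q2-28-to-q2-14-in-verilog

However, they all involve rounding to the "nearest" integer, which is not what I need.

I also developed a testbench specifically for the multiplication unit and tried to verify it. The output matches the rounding behavior I want. I suspect this may be because my test inputs are too simple, but I don't know what else to do besides feeding in random test vectors and hoping an error shows up.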

Reproducible example

    // Multiplication Unit
    module mult (
        input  signed [15:0] A,
        input  signed [15:0] B,
        output reg signed [15:0] C
    );
        localparam NUM_FRACTIONAL_BITS = 8;

        // 1. Perform sign extension of inputs
        wire signed [31:0] A_extended;
        wire signed [31:0] B_extended;
        assign A_extended = {{16{A[15]}}, A};
        assign B_extended = {{16{B[15]}}, B};

        // 2. Perform multiplication, and then collect the results from the LSB
        wire signed [63:0] mult_result_extended;
        wire signed [31:0] mult_result;
        assign mult_result_extended = A_extended * B_extended;
        assign mult_result = mult_result_extended[31:0];

        // 3. Convert Q<16.16> to Q<8.8>
        reg signed [31:0] mult_result_shifted;
        always @(*) begin
            mult_result_shifted = mult_result >>> NUM_FRACTIONAL_BITS; // Q<16.16> to Q<24.8> truncation

            // Check signedness of result (i.e. Check MSB of 2s complement representation)
            if (mult_result[31]) begin
                // Result is negative, we add 1 to round down its absolute value
                C = mult_result_shifted + 2'sb01; // Round, then convert Q<24.8> to Q<8.8>

                // Check for overflow
                if (mult_result_shifted + 2'sb01 == 32'b0)
                    C = 16'hFFFF;
            end
            else begin
                // Result is positive, truncation has same effect as rounding down its absolute value
                C = mult_result_shifted; // Round, then convert Q<24.8> to Q<8.8>
            end
        end
    endmodule

    // Testbench
    module tb_mult( );
        reg [15:0] tb_A;
        reg [15:0] tb_B;
        wire [15:0] tb_C;

        mult DUT (
            .A(tb_A),
            .B(tb_B),
            .C(tb_C)
        );

        // All inputs/outputs are in Q<8.8> format, using 2s complement
        // Q<8.8> can represent -128 to +127.99609375, with resolution 0.00390625
        initial begin
            tb_A = 16'b1111_1010_0111_1111; // Q<8.8> = -5.50390625, 2s complement decimal value = -1409
            tb_B = 16'b0000_1101_0001_1010; // Q<8.8> = +13.1015625, 2s complement decimal value = 3354
            //tb_C expected to be 16'b1011_0111_1110_0100; Equivalently Q<16.0> = -18460 or Q<8.8> = -72.109375

            /* Explanation
            1. tb_A * tb_B = 32'b1111_1111_1011_0111_1110_0011_1110_0110; let's call this result D.
            2. Note that D is in Q<16.16> format with the fixed-point representation -72.10977172851562.
            3. Naively, we could convert it directly by taking the middle 16 bits.
               i.e. D[23:8] = 16'b1011_0111_1110_0011, which has Q<8.8> = -72.11328125
               - However, we want the final result's absolute value to be rounded down.
                 i.e. we want 16'b1011_0111_1110_0100, which has Q<8.8> = -72.109375
            4. Suppose that D = 32'b1111_1111_1011_0111_1110_0011_0110_0110. (Notice that D[7] is different from the previous scenario)
               - This has fixed point representation -72.11172485351562.
            5. We still want the final result to be 16'b1011_0111_1110_0100 or -72.109375, as we want the final Q<8.8> result's absolute value to ALWAYS be rounded down.
            */
        end
    endmodule
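As an alternative to hand-picked or purely random vectors, a self-checking sweep may expose the failing cases. The sketch below is an assumption-laden illustration: `tb_mult_sweep`, the strides 37 and 41, and the input range are all hypothetical choices, and it expects the `mult` module above to be compiled alongside it. It relies on Verilog's signed integer division truncating toward zero, which matches "round the absolute value down", as an independent reference. Inputs are limited to roughly -8.0 to +8.0 so that the product always fits in Q<8.8>.

```verilog
// Hypothetical self-checking testbench; assumes the mult module from the
// reproducible example is compiled together with this file.
module tb_mult_sweep;
    reg  signed [15:0] tb_A, tb_B;
    wire signed [15:0] tb_C;

    integer a, b, errors;
    reg signed [31:0] product;   // Q<16.16>
    reg signed [31:0] quotient;  // Q<24.8>, rounded toward zero
    reg signed [15:0] expected;  // Q<8.8>

    mult DUT (.A(tb_A), .B(tb_B), .C(tb_C));

    initial begin
        errors = 0;
        // Strided sweep over a sub-range of Q<8.8>; the strides are arbitrary
        for (a = -2048; a <= 2047; a = a + 37) begin
            for (b = -2048; b <= 2047; b = b + 41) begin
                tb_A = a;
                tb_B = b;
                #1;  // let the combinational DUT settle

                product  = tb_A * tb_B;    // signed 32-bit product
                quotient = product / 256;  // signed division truncates toward zero
                expected = quotient[15:0];

                if (tb_C !== expected) begin
                    errors = errors + 1;
                    if (errors <= 10)
                        $display("MISMATCH A=%0d B=%0d got=%0d expected=%0d",
                                 tb_A, tb_B, tb_C, expected);
                end
            end
        end
        $display("Sweep finished with %0d mismatches", errors);
        $finish;
    end
endmodule
```

Because the sweep includes operands such as -2048 (exactly -8.0), it also covers negative products whose discarded fractional bits are all zero, a case that simple test inputs can miss.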


Tags: verilog, rounding, fixed-point

1 Answer

I haven't tried the code, but I suspect there are two problems:

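  1. It is not correct to always adjust after truncating a negative number. If all of the truncated bits are zero, the truncated number is already "perfectly represented" and needs no adjustment. The adjustment should be made only when the truncated bits are nonzero.

  2. Adding 1'b1 is the correct adjustment for addition on the raw unsigned representation of the 2s complement value. However, you are using signed types, so you want the adjustment to be the subtraction truncval - 2'sb01.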
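Both suspicions can be checked in simulation. The following is a minimal sketch, not the asker's or answerer's actual code: `mult_rtz` and `tb_mult_rtz` are hypothetical names, and overflow handling is left out. The module adjusts only when the product is negative and at least one discarded bit is set (the first point), and the testbench prints both `floor + 1` and `floor - 1` for the worked example from the question so the sign of the adjustment (the second point) can be compared against the expected value.

```verilog
// Hypothetical module, modeled on the question's mult; no overflow handling.
module mult_rtz (
    input  signed [15:0] A,
    input  signed [15:0] B,
    output signed [15:0] C
);
    localparam NUM_FRACTIONAL_BITS = 8;

    // Q<8.8> * Q<8.8> -> Q<16.16>; signed operands in a 32-bit context
    wire signed [31:0] product = A * B;

    // Arithmetic shift floors toward negative infinity (Q<16.16> -> Q<24.8>)
    wire signed [31:0] floored = product >>> NUM_FRACTIONAL_BITS;

    // Adjust only if the value is negative AND some discarded bit is nonzero
    wire adjust = product[31] & (|product[NUM_FRACTIONAL_BITS-1:0]);

    wire signed [31:0] rounded = adjust ? (floored + 32'sd1) : floored;

    assign C = rounded[15:0];  // Q<24.8> -> Q<8.8>
endmodule

// Hypothetical testbench using the operands from the question.
module tb_mult_rtz;
    reg  signed [15:0] tb_A, tb_B;
    wire signed [15:0] tb_C;
    reg  signed [31:0] product, floored;
    reg  signed [15:0] plus_one, minus_one;

    mult_rtz DUT (.A(tb_A), .B(tb_B), .C(tb_C));

    initial begin
        tb_A = 16'hFA7F;  // -5.50390625
        tb_B = 16'h0D1A;  // +13.1015625
        #1;
        product   = tb_A * tb_B;
        floored   = product >>> 8;
        plus_one  = floored + 1;
        minus_one = floored - 1;
        $display("floor     = %0d", floored);    // -18461 (-72.11328125)
        $display("floor + 1 = %0d", plus_one);   // -18460 (-72.109375)
        $display("floor - 1 = %0d", minus_one);  // -18462 (-72.1171875)
        $display("DUT       = %0d", tb_C);       // -18460

        // Exactly representable negative product: -1.0 * +1.0
        tb_A = -16'sd256;
        tb_B =  16'sd256;
        #1;
        $display("DUT       = %0d", tb_C);       // -256, no adjustment applied
        $finish;
    end
endmodule
```

In this sketch, `floor + 1` is the value the question lists as expected (16'b1011_0111_1110_0100, i.e. -18460), while `floor - 1` moves further from zero, so it may be worth confirming the direction of the adjustment in your own simulator before switching to a subtraction.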

        
