How do I manually dequantize a layer's output and requantize it for the next layer in PyTorch?

Problem description

I am working on a school project that requires me to quantize every layer of a model by hand. Concretely, I want to implement the following pipeline manually:

quantize activation, combine with quantized weight A - layer A - quantized output - dequantize output - requantize output, combine with quantized weight B - layer B - ...

I know PyTorch already has quantization functions, but they are limited to int8. I want to quantize at bit widths from bit = 16 down to bit = 2 and compare the resulting accuracies.

The problem I am running into is that, after quantization, a layer's output is several orders of magnitude too large (with bit = 16), and I do not know how to dequantize it back. I am using the same minimum and maximum across the activation and the weight to perform the quantization. Here is an example:

Activation = [1,2,3,4]
Weight = [5,6,7,8]
Min and max across activation and weight = 1, 8
Expected, non-quantized output = **260**

Quantize with bit = 16
Quantized activation = [-32768, -23406, -14044, -4681]
Quantized weight = [4681, 14043, 23405, 32767]
Quantized output = -5609635504
Dequantize output with min = 1, max = 8 = **-599178.3750**

This calculation makes sense to me, because the output involves multiplying activation values by weights, so their increased magnitudes multiply as well. If I only perform a single dequantization with the original min and max, a much larger output is to be expected.

How does PyTorch handle dequantization? I tried to locate PyTorch's quantization implementation but could not find it. How do I dequantize the output?

python machine-learning pytorch quantization
1 Answer

I think the formula you are using to compute the dequantized output may be wrong.

import numpy as np

# Original values
activation = np.array([1, 2, 3, 4])
weight = np.array([5, 6, 7, 8])

# Quantization parameters
bit = 16  # Desired bit precision
min_val = min(np.min(activation), np.min(weight))
max_val = max(np.max(activation), np.max(weight))

# Calculate scale factor
scale_factor = (2 ** (bit - 1) - 1) / max(abs(min_val), abs(max_val))

# Quantize activation and weight values
quantized_activation = np.round(activation * scale_factor).astype(np.int16)
quantized_weight = np.round(weight * scale_factor).astype(np.int16)

# Dequantize activation and weight values
dequantized_activation = quantized_activation / scale_factor
dequantized_weight = quantized_weight / scale_factor

# Print values
print("Original activation:", activation)
print("Original weight:", weight)
print("Minimum value:", min_val)
print("Maximum value:", max_val)
print("Scale factor:", scale_factor)
print("Quantized activation:", quantized_activation)
print("Quantized weight:", quantized_weight)
print("Dequantized activation:", dequantized_activation)
print("Dequantized weight:", dequantized_weight)

---------------------------------------------------------

Original activation: [1 2 3 4]
Original weight: [5 6 7 8]
Minimum value: 1
Maximum value: 8
Scale factor: 4095.875
Quantized activation: [ 4096  8192 12288 16384]
Quantized weight: [20479 24575 28671 32767]
Dequantized activation: [1.00003052 2.00006104 3.00009156 4.00012207]
Dequantized weight: [4.99990844 5.99993896 6.99996948 8.        ]

Compute the output:

output = np.sum(dequantized_activation * dequantized_weight)
print("Dequantized output:", output) # 70.00183110125477
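If you want to stay in the integer domain during the multiply-accumulate (as a real quantized layer does) instead of dequantizing the inputs first, the key point is that each factor carries one copy of the scale, so the accumulated product carries the scale twice. A minimal sketch, assuming the same shared symmetric scale as in the code above:

```python
import numpy as np

activation = np.array([1, 2, 3, 4])
weight = np.array([5, 6, 7, 8])

bit = 16
# Shared symmetric scale, as in the code above: 32767 / 8
scale = (2 ** (bit - 1) - 1) / 8.0

qa = np.round(activation * scale).astype(np.int64)
qw = np.round(weight * scale).astype(np.int64)

# Accumulate in the integer domain, the way a quantized layer would
q_out = np.sum(qa * qw)

# Each factor carries one scale, so the product carries scale ** 2:
# dequantize the accumulated output by dividing by both scales.
deq_out = q_out / (scale * scale)
print("Dequantized output:", deq_out)  # ~70, matching the float result
```

If the activation and weight had separate per-tensor scales, you would divide by `scale_a * scale_w` instead. With an affine (zero-point) scheme like the one in your question, cross terms involving the zero points also have to be subtracted before rescaling, which is why a single min/max dequantization of the raw product blows up by orders of magnitude.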