How do I flush denormal numbers to zero on Apple Silicon?


I'm looking for an equivalent of the x86/64 FTZ/DAZ instructions, but for M1/M2/M3. Also, is it safe to assume that "Apple Silicon" equals ARM?

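I'm porting a real-time audio plugin (VST3/CLAP) from x64 Windows to macOS on Apple Silicon hardware. At least on x64, it is important for real-time audio code that the hardware treats denormal (also called subnormal) numbers as zero, because these very small numbers are otherwise handled in software, which causes a real performance hit.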

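Now, since denormals are part of the IEEE floating-point standard, and they are explicitly mentioned here: https://developer.arm.com/documentation/ddi0403/d/Application-Level-Architecture/Application-Level-Programmers--Model/The-optional-Floating-point-extension/Floating-point-data-types-and-arithmetic?lang=en#BEICCFII, I believe there must be an equivalent of Intel's _MM_SET_FLUSH_ZERO_MODE and _MM_SET_DENORMALS_ZERO_MODE macros. Of course, I could be wrong, or the hardware may flush to zero by default (that isn't clear to me from the ARM documentation), in which case I'd like to know that too.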

c++ arm apple-silicon denormal-numbers
1 Answer

Include <fenv.h> and use:

int r = fesetenv(FE_DFL_DISABLE_DENORMS_ENV);
// check r == 0

From man fegetenv:

The fesetenv() function attempts to establish the floating-point environment
represented by the object pointed to by envp.  This object shall have been
set by a call to fegetenv() or feholdexcept(), or be equal to a floating-point
environment macro defined in <fenv.h>.

And from /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/fenv.h (gated behind an __arm64__ ifdef check):

/*  FE_DFL_DISABLE_DENORMS_ENV
 
    A pointer to a fenv_t object with the default floating-point state modified
    to set the FZ (flush to zero) bit in the FPCR.  When using this environment
    denormals encountered by floating-point calculations will be treated as
    zero.  Denormal results of floating-point operations will also be treated
    as zero.  This calculation mode is not IEEE-754 compliant, but it may
    prevent lengthy stalls that occur in code that encounters denormals.  It is
    suggested that you do not use this mode unless you have established that
    denormals are the source of measurable performance problems.
 
    Note that the math library, and other system libraries, are not guaranteed
    to do the right thing if called in this mode.  Edge cases may be incorrect.
    Use at your own risk.                                                     */
extern const fenv_t _FE_DFL_DISABLE_DENORMS_ENV;
#define FE_DFL_DISABLE_DENORMS_ENV &_FE_DFL_DISABLE_DENORMS_ENV
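
Since the man page says fesetenv() also accepts an environment previously saved by fegetenv(), a plugin could limit the flush-to-zero mode to its own processing call and hand the host's floating-point state back afterwards. The following is only a sketch under that assumption: process_block_flushed and its gain loop are hypothetical names for illustration, not part of any SDK, and on platforms where FE_DFL_DISABLE_DENORMS_ENV is not defined the function simply processes the buffer without touching the environment.

```c
#include <fenv.h>
#include <stddef.h>

/* Hypothetical helper: applies a gain to a buffer while denormals are
 * flushed to zero, then restores the caller's floating-point environment.
 * Returns 0 on success, -1 if the environment could not be changed. */
int process_block_flushed(float *buf, size_t n, float gain)
{
#if defined(FE_DFL_DISABLE_DENORMS_ENV)
    fenv_t saved;

    /* Remember the host's environment (rounding mode, flags, FZ bit). */
    if (fegetenv(&saved) != 0)
        return -1;

    /* Default environment plus the FZ bit, as described in fenv.h. */
    if (fesetenv(FE_DFL_DISABLE_DENORMS_ENV) != 0)
        return -1;
#endif

    /* Stand-in for the real DSP work. */
    for (size_t i = 0; i < n; ++i)
        buf[i] *= gain;

#if defined(FE_DFL_DISABLE_DENORMS_ENV)
    /* Give the host back exactly what it had before the call. */
    if (fesetenv(&saved) != 0)
        return -1;
#endif
    return 0;
}
```

Whether saving and restoring on every block is cheap enough for a given audio callback is something to measure; setting the mode once per audio thread may be preferable if the host allows it.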

Test code:

#include <fenv.h>
#include <stdint.h>
#include <stdio.h>

typedef volatile union
{
    float f;
    uint32_t u;
} num_debug_t;

int main(void)
{
    /* 0x00800001 is one ULP above the smallest normal float. */
    num_debug_t n = { .u = 0x00800001 };
    printf("Hex value: 0x%08x\n", n.u);
    printf("Float value: %e\n", n.f);

    /* Default IEEE-754 mode: halving n gives a denormal result, which is kept. */
    printf("===== Normalised =====\n");
    num_debug_t d = { .f = n.f / 2.0f };
    printf("Division result hex: 0x%08x\n", d.u);
    printf("Division result float: %e\n", d.f);

    /* Switch to the environment that sets the FZ bit. */
    int r = fesetenv(FE_DFL_DISABLE_DENORMS_ENV);
    if(r != 0)
    {
        fprintf(stderr, "fesetenv returned %d\n", r);
        return -1;
    }

    /* Same division again: the denormal result is now flushed to zero. */
    printf("===== Denormalised =====\n");
    d.f = n.f / 2.0f;
    printf("Division result hex: 0x%08x\n", d.u);
    printf("Division result float: %e\n", d.f);
    return 0;
}

Output:

Hex value: 0x00800001
Float value: 1.175494e-38
===== Normalised =====
Division result hex: 0x00400000
Division result float: 5.877472e-39
===== Denormalised =====
Division result hex: 0x00000000
Division result float: 0.000000e+00
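
To see why the first division result counts as a denormal, it can help to decode the bit patterns: a single-precision value is denormal when its 8-bit exponent field is zero and its 23-bit fraction is non-zero. The helper below is a hypothetical illustration (is_subnormal_bits is not a library function) that applies this rule to the raw 32-bit patterns printed above.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the IEEE-754 single-precision bit
 * pattern u encodes a denormal (subnormal) value, 0 otherwise. */
int is_subnormal_bits(uint32_t u)
{
    uint32_t exponent = (u >> 23) & 0xFFu;   /* bits 30..23 */
    uint32_t fraction = u & 0x007FFFFFu;     /* bits 22..0  */
    return exponent == 0 && fraction != 0;
}
```

With this rule, the input 0x00800001 is a normal number (exponent field 1), the default-mode result 0x00400000 is a denormal (exponent field 0, non-zero fraction), and the flushed result 0x00000000 is a plain zero.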

Under the hood, this just sets bit 24 (FZ) in the FPCR system register.
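
For completeness, the same bit could in principle be toggled directly rather than through fesetenv(). The sketch below assumes a GCC- or Clang-compatible compiler targeting AArch64 and uses inline assembly with the mrs/msr instructions; all function names here are hypothetical, and the register access is compiled only on that architecture. Going through <fenv.h> remains the documented route on macOS.

```c
#include <stdint.h>

/* FZ (flush to zero) is bit 24 of the FPCR. */
#define FPCR_FZ_BIT (UINT64_C(1) << 24)

/* Pure helper: returns the given FPCR value with FZ set or cleared. */
static inline uint64_t fpcr_with_fz(uint64_t fpcr, int enable)
{
    return enable ? (fpcr | FPCR_FZ_BIT) : (fpcr & ~FPCR_FZ_BIT);
}

#if defined(__aarch64__) || defined(__arm64__)
/* Read the floating-point control register. */
static inline uint64_t read_fpcr(void)
{
    uint64_t value;
    __asm__ volatile("mrs %0, fpcr" : "=r"(value));
    return value;
}

/* Write the floating-point control register. */
static inline void write_fpcr(uint64_t value)
{
    __asm__ volatile("msr fpcr, %0" : : "r"(value));
}

/* Sets or clears FZ and returns the previous FPCR value so that the
 * caller can restore it later with write_fpcr(). */
static inline uint64_t set_flush_to_zero(int enable)
{
    uint64_t old = read_fpcr();
    write_fpcr(fpcr_with_fz(old, enable));
    return old;
}
#endif
```

Unlike fesetenv(FE_DFL_DISABLE_DENORMS_ENV), which installs the default environment with FZ added, this variant changes only the one bit and leaves the other FPCR fields as they were.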
