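I am working on a bare-metal project using an A9MP processor (NXP/Freescale iMX6Q) and am setting up the MMU. The project will use 2 (of 4) cores. Core 0 will read data from a common shareable data area in OCRAM and show it on an LCD display. Core 1 collects data and inserts it into the common area. Reads and writes to the common area are protected with an LDREX/STREX mutex. The common data is set to STRONGLY_ORDERED, no execute (I think this is right). I have a few questions:
I am having some memory issues where code memory gets corrupted. At this point my code memory type is RW. I plan on changing this, but first wanted to see whether I can get answers to the questions above.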
SMP bit and FW bit:
If more than one core is running and the cores share memory, set the SMP bit (ACTLR bit 6) on every participating core. It makes the core take part in the coherency managed by the Snoop Control Unit (SCU), which must also be enabled. Hardware coherency only covers Normal memory that is mapped cacheable and shareable.
The FW bit (ACTLR bit 0) enables broadcasting of cache and TLB maintenance operations; it is not a forwarding bit. It should generally be set together with the SMP bit, so that maintenance operations issued on one core are also applied on the other cores taking part in coherency and do not have to be repeated on each core.
The downside of setting these bits is some extra snoop and broadcast traffic, but the benefits of hardware coherency outweigh this in most multi-core applications.
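As a rough illustration of the two bits discussed above, the following sketch computes the ACTLR value with SMP and FW set. The bit positions (FW = bit 0, SMP = bit 6) are taken from my reading of the Cortex-A9 TRM and should be checked against your revision; the helper names `actlr_enable_coherency` and `actlr_self_check` are made up for this example. The actual register access is only shown in a comment, so the helper can be checked on a host PC.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-A9 ACTLR bit positions (assumed from the TRM, verify for your part). */
#define ACTLR_FW   (1u << 0)  /* broadcast cache/TLB maintenance operations */
#define ACTLR_SMP  (1u << 6)  /* core takes part in SCU-managed coherency   */

/* Return the given ACTLR value with SMP and FW set, other bits preserved. */
static inline uint32_t actlr_enable_coherency(uint32_t actlr)
{
    return actlr | ACTLR_SMP | ACTLR_FW;
}

/* Host-side sanity check of the bit arithmetic (hypothetical helper). */
void actlr_self_check(void)
{
    assert(actlr_enable_coherency(0x00u) == 0x41u);
    assert(actlr_enable_coherency(0x04u) == 0x45u);
}

/* Hypothetical use on the target, run on each core before the caches
 * and MMU are enabled:
 *
 *   uint32_t v;
 *   __asm__ volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(v));
 *   v = actlr_enable_coherency(v);
 *   __asm__ volatile("mcr p15, 0, %0, c1, c0, 1" : : "r"(v));
 *   __asm__ volatile("isb");
 */
```

The SCU itself has to be enabled separately through its own memory-mapped control register; that step is not shown here.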
L1 Dcache prefetch bit:
Enabling L1 Dcache prefetch can improve performance by prefetching data into the cache before it is actually needed, reducing cache misses.
The downside is increased power consumption and potential cache pollution if the prefetched data is not actually used.
For your use case with large memory copies, enabling prefetch could potentially improve performance, but you may need to experiment to determine the actual impact.
Alloc in one way bit:
This bit is intended for memory copies and other streaming operations over large data, where it reduces cache pollution.
It restricts cache line allocation to a single way, so the streamed data only evicts lines in that way and the rest of the cache keeps its contents.
For your 1.5MB frame buffer copies, which are far larger than the L1 data cache, enabling this bit could improve overall performance by keeping the copy from flushing out your working set.
You can enable it only during the copy operations or leave it enabled all the time, depending on your specific workload and performance requirements.
For large copies, using DMA may be more efficient than NEON or CPU copies, as it offloads the work from the CPU and can operate concurrently.
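The option above of enabling "Alloc in one way" only around a copy could look roughly like the sketch below. It assumes the bit is ACTLR bit 8 and that L1 prefetch is ACTLR bit 2, as I read the Cortex-A9 TRM; the helper names are invented for the example, and the register read/write is left as a comment so the bit handling can be checked off-target.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-A9 ACTLR bit positions (assumed from the TRM, verify for your part). */
#define ACTLR_L1_PREFETCH    (1u << 2)  /* L1 Dcache prefetch enable     */
#define ACTLR_ALLOC_ONE_WAY  (1u << 8)  /* allocate in one cache way only */

/* ACTLR value to use while a large copy is running. */
static inline uint32_t actlr_copy_begin(uint32_t actlr)
{
    return actlr | ACTLR_ALLOC_ONE_WAY;
}

/* ACTLR value to restore once the copy has finished. */
static inline uint32_t actlr_copy_end(uint32_t actlr)
{
    return actlr & ~ACTLR_ALLOC_ONE_WAY;
}

/* Host-side sanity check of the bit arithmetic (hypothetical helper). */
void actlr_copy_self_check(void)
{
    uint32_t base = 0x41u | ACTLR_L1_PREFETCH;   /* SMP + FW + prefetch */
    assert(actlr_copy_begin(base) == 0x145u);
    assert(actlr_copy_end(actlr_copy_begin(base)) == base);
}

/* Hypothetical use on the target:
 *
 *   write_actlr(actlr_copy_begin(read_actlr()));   // wrappers around MRC/MCR
 *   memcpy(dst, src, len);
 *   write_actlr(actlr_copy_end(read_actlr()));
 */
```

Whether toggling the bit per copy or leaving it on is faster depends on the workload, so it is worth timing both variants.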
Strongly-ordered memory and access permissions:
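You are partly correct. Strongly-ordered memory is always treated as shareable, so the shareable (S) bit in the descriptor is ignored for this memory type and setting it is redundant.
Access permissions are not implicit, though: the AP bits apply to every memory type, so you still have to set them for RW access, and marking the region no-execute is correct. Also note that ARMv7 leaves it IMPLEMENTATION DEFINED whether LDREX/STREX may be used on Strongly-ordered or Device memory, so check the Cortex-A9 documentation; the usual approach is to keep the mutex in Normal, cacheable, shareable memory with the SMP bit set.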
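To make the attribute bits concrete, here is a sketch of a 1MB section descriptor for a Strongly-ordered region in the ARMv7 short-descriptor format. It assumes SCTLR.TRE = 0 and SCTLR.AFE = 0 and uses domain 0; the function name is made up for the example. The permission bits are set explicitly because they are encoded in the descriptor regardless of memory type, while the S bit is left clear.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7 short-descriptor, first-level section entry fields. */
#define SECT_TYPE       0x2u          /* bits[1:0] = 0b10: section      */
#define SECT_XN         (1u << 4)     /* execute never                  */
#define SECT_AP_RW      (3u << 10)    /* AP[2]=0, AP[1:0]=0b11: full RW */
#define SECT_BASE_MASK  0xFFF00000u   /* 1MB-aligned physical base      */

/* Strongly-ordered is TEX=0b000, C=0, B=0, so no type bits are ORed in.
 * Domain field is left at 0 (assumption: domain 0 is set to "client"). */
static inline uint32_t section_strongly_ordered(uint32_t phys_addr)
{
    return (phys_addr & SECT_BASE_MASK) | SECT_AP_RW | SECT_XN | SECT_TYPE;
}

/* Host-side sanity check (hypothetical helper). The address 0x00900000
 * is used as an example OCRAM base; confirm it in the reference manual. */
void section_so_self_check(void)
{
    assert(section_strongly_ordered(0x00900000u) == 0x00900C12u);
    assert(section_strongly_ordered(0x00912345u) == 0x00900C12u);
}
```

The entry would be stored at index `phys_addr >> 20` of the first-level translation table for a flat (identity) mapping.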
Write-back vs. write-through for DRAM memory:
Using write-back caching for DRAM memory is generally preferred, as it can provide better performance by reducing the number of writes to memory.
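Write-through caching can be useful for buffers that another bus master reads without snooping the caches, for example a frame buffer scanned out by the display controller, because main memory then always holds the latest data. Memory-mapped peripherals should be mapped as Device memory rather than as write-through cacheable.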
Your plan to use write-back for DRAM memory and uncached memory for frame buffers is a reasonable approach.
WBWA memory type:
WBWA stands for "Write-Back, Write-Allocate" and is a valid memory type option.
It means that writes to this memory region will be cached and allocated in the cache on a write miss, and the cache lines will be written back to memory when evicted or explicitly cleaned.
This memory type is commonly used for normal DRAM memory regions, as it provides good performance for both read and write operations.
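For reference, a WBWA mapping of a 1MB DRAM section might be encoded as in the sketch below. It again assumes the short-descriptor format with SCTLR.TRE = 0 (where TEX=0b001, C=1, B=1 selects outer and inner Write-Back, Write-Allocate), SCTLR.AFE = 0, and domain 0; the function name is invented for this example. The S bit is set so the region can take part in hardware coherency between the cores.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7 short-descriptor, first-level section entry fields. */
#define SECT_TYPE       0x2u                    /* bits[1:0] = 0b10     */
#define SECT_B          (1u << 2)               /* bufferable           */
#define SECT_C          (1u << 3)               /* cacheable            */
#define SECT_AP_RW      (3u << 10)              /* full read/write      */
#define SECT_TEX(x)     (((x) & 0x7u) << 12)    /* TEX[2:0]             */
#define SECT_S          (1u << 16)              /* shareable            */
#define SECT_BASE_MASK  0xFFF00000u             /* 1MB-aligned base     */

/* Normal memory, Write-Back Write-Allocate, shareable, read/write,
 * executable (XN clear). Domain field left at 0 (assumption). */
static inline uint32_t section_normal_wbwa(uint32_t phys_addr)
{
    return (phys_addr & SECT_BASE_MASK) | SECT_S | SECT_TEX(1u) |
           SECT_AP_RW | SECT_C | SECT_B | SECT_TYPE;
}

/* Host-side sanity check (hypothetical helper). 0x10000000 is used as an
 * example DRAM base; confirm the real one in the reference manual. */
void section_wbwa_self_check(void)
{
    assert(section_normal_wbwa(0x10000000u) == 0x10011C0Eu);
}
```

Data-only regions would normally also get the XN bit (bit 4) set.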
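Regarding the code memory corruption: changing the code memory to read-only (RO) is a good step, because a stray write will then raise a data abort instead of silently modifying the code. Keep the code region executable, though, and apply no-execute (XN) to the data regions instead; a region marked XN cannot be fetched from. In addition, if the code memory is shared between the cores, you may want to check for cache maintenance or coherency issues.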
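A read-only, executable code section could be encoded along these lines. This is a sketch under the same assumptions as a typical short-descriptor setup (SCTLR.TRE = 0, SCTLR.AFE = 0, domain 0, all code running privileged); the function name is hypothetical. AP[2]=1 with AP[1:0]=0b01 gives privileged read-only access, and XN is left clear so instructions can be fetched.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7 short-descriptor, first-level section entry fields. */
#define SECT_TYPE        0x2u                   /* bits[1:0] = 0b10        */
#define SECT_B           (1u << 2)              /* bufferable              */
#define SECT_C           (1u << 3)              /* cacheable               */
#define SECT_XN          (1u << 4)              /* execute never           */
#define SECT_AP_PRIV_RO  ((1u << 15) | (1u << 10)) /* AP[2]=1, AP[1:0]=01 */
#define SECT_TEX(x)      (((x) & 0x7u) << 12)   /* TEX[2:0]                */
#define SECT_S           (1u << 16)             /* shareable               */
#define SECT_BASE_MASK   0xFFF00000u            /* 1MB-aligned base        */

/* Normal WBWA, shareable, privileged read-only, executable. */
static inline uint32_t section_code_ro(uint32_t phys_addr)
{
    return (phys_addr & SECT_BASE_MASK) | SECT_S | SECT_AP_PRIV_RO |
           SECT_TEX(1u) | SECT_C | SECT_B | SECT_TYPE;
}

/* Host-side sanity check (hypothetical helper, example base address). */
void section_code_self_check(void)
{
    uint32_t d = section_code_ro(0x10000000u);
    assert(d == 0x1001940Eu);
    assert((d & SECT_XN) == 0u);   /* code must stay executable */
}
```

With this mapping, a write into the code section from either core faults, which usually points straight at the code doing the corrupting.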