OpenMP "Segmentation fault" in Fortran [duplicate]


My system runs Debian 12 on an Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz.

I am trying to parallelize some Fortran code with OpenMP. It uses the FFTW library and consists of several modules, compiled with a Makefile like the one shown below:

....
....
LIBS =-lfftw3   

CC = g++
CFLAGS = -O2 
FC = gfortran 
FFLAGS = -O
F90 = gfortran
F90FLAGS = -O3 -pipe -fomit-frame-pointer -fopenmp  
#F90FLAGS = -O3 -pipe -fomit-frame-pointer  
LDFLAGS = -fopenmp  

all: 

main_gradient: $(OBJS) main_gradient.f90
    $(F90) -I$(INCLUDE) $(F90FLAGS) -c [email protected] 
    $(F90) -I$(INCLUDE) $(LDFLAGS) -L/usr/lib:/usr/local/lib  $(OBJS) [email protected] $(LIBS)  -o [email protected] 


.SUFFIXES: $(SUFFIXES) .f90

%.o: %.f90 
    $(F90)  -I$(INCLUDE) $(F90FLAGS) -c $<

%.o: $(DIR_GENERAL)/%.f90
    $(F90)  -I$(INCLUDE) $(F90FLAGS) -c $<

%.o: $(DIR_GEOMETRY)/%.f90
    $(F90)  -I$(INCLUDE) $(F90FLAGS) -c $<

%.o: $(DIR_POLYMERS)/%.f90
    $(F90)  -I$(INCLUDE) $(F90FLAGS) -c $<

%.o: $(DIR_INTERACTIONS)/%.f90
    $(F90)  -I$(INCLUDE) $(F90FLAGS) -c $<

clean:
    rm -f *.o *.mod fort.* dens_* phi* *.exe
....
....

For testing purposes, I have not parallelized any of my code yet.

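Following information I found online, I switched the large arrays (size = 646464*100) to allocatable arrays to get rid of segmentation faults at compile time, but I still get a segmentation fault when running the executable, even with OMP_NUM_THREADS=1. Running it under valgrind gives: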

==463360== Warning: client switching stacks?  SP change: 0x1ffefff958 --> 0x1ff97ff6f0
==463360==          to suppress, use: --max-stackframe=92275304 or greater
==463360== Invalid write of size 8
==463360==    at 0x11C7BA: mix_fields.1 (in 
==463360==  Address 0x1ff97ff6f8 is on thread 1's stack
==463360== 
==463360== Can't extend stack to 0x1ff97fe7a8 during signal delivery for thread 1:
==463360==   no stack segment
==463360== 
==463360== Process terminating with default action of signal 11 (SIGSEGV)
==463360==  Access not within mapped region at address 0x1FF97FE7A8
==463360==    at 0x11C7BA: mix_fields.1 (in 
==463360==  If you believe this happened as a result of a stack
==463360==  overflow in your program's main thread (unlikely but
==463360==  possible), you can try to increase the size of the
==463360==  main thread stack using the --main-stacksize= flag.
==463360==  The main thread stack size used in this run was 8388608.
==463360== 
==463360== HEAP SUMMARY:
==463360==     in use at exit: 71,488,141 bytes in 755 blocks
==463360==   total heap usage: 298,496 allocs, 297,741 frees, 1,390,223,513 bytes allocated
==463360== 
==463360== LEAK SUMMARY:
==463360==    definitely lost: 80,000 bytes in 1 blocks
==463360==    indirectly lost: 0 bytes in 0 blocks
==463360==      possibly lost: 0 bytes in 0 blocks
==463360==    still reachable: 71,408,141 bytes in 754 blocks
==463360==         suppressed: 0 bytes in 0 blocks
==463360== Rerun with --leak-check=full to see details of leaked memory
==463360== 
==463360== Use --track-origins=yes to see where uninitialised values come from
==463360== For lists of detected and suppressed errors, rerun with: -s
==463360== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

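As far as I understand, this is a problem of arrays being allocated on the stack and running out of space. What can I do to get rid of the problem in this case? Does using -fopenmp cause arrays to be allocated differently than when the -fopenmp flag is absent?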

segmentation-fault fortran openmp valgrind
1 Answer

First of all:

CFLAGS = -O2 
F90FLAGS = -O3 -pipe -fomit-frame-pointer -fopenmp  

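You will get better results if you add -g to both of these, since the debug information lets Valgrind report source file names and line numbers. It gets better still if you use -fno-omit-frame-pointer.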

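I am no OpenMP expert, but roughly speaking the OpenMP architecture is to create a pool of threads. Then, whenever you use something like pragma parallel for (!$omp parallel do in Fortran), the work packets corresponding to the loop body are handed to the thread pool to perform.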

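There are two things to consider for stack size. By default the stack size limit is 8M. You can raise the limit for the main thread with the shell limit/ulimit command. The stack size limit for a thread is set when the thread is created, and it can be controlled with the OMP_STACKSIZE environment variable. On top of that, Valgrind imposes its own limit on the main thread. As your output shows, you can increase it with --main-stacksize=, and --max-stackframe= suppresses the stack-switching warning for very large frames.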
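The two limits described above can be inspected and set from the shell before the program is started. The following is a minimal sketch, assuming a POSIX-style shell such as bash; the 512M value is purely illustrative and does not come from the question or the answer.

```shell
# Current soft limit for the main thread's stack, in KiB
# (8192 KiB matches the 8388608 bytes Valgrind reported).
ulimit -S -s

# Stack size for the threads the OpenMP runtime creates.
# 512M is an illustrative value (assumption), not a recommendation.
export OMP_STACKSIZE=512M

# The soft limit can only be raised as far as the hard limit allows,
# so read the hard limit first and lift the soft limit only if permitted.
hard_limit=$(ulimit -H -s)
if [ "$hard_limit" = "unlimited" ]; then
    ulimit -S -s unlimited || echo "could not raise the soft limit" >&2
fi

echo "OMP_STACKSIZE=$OMP_STACKSIZE, stack soft limit: $(ulimit -S -s)"
```

A program started from this same shell inherits both settings. OMP_STACKSIZE applies to the threads created by the OpenMP runtime, while the main thread's stack is governed by the ulimit value.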

You can put a lot on the main thread's stack. Don't put too much on the stacks of the other threads.

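You did not say what your 646464*100 array consists of. If it holds double-precision floating-point values, it will need about 520M of memory. To be safe, I would round that up to 600M.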
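The estimate is simple arithmetic, sketched here in shell. The element count is taken exactly as written in the question, and the 8 bytes per element is an assumption (double precision), since the question does not state the element type.

```shell
# Element count exactly as written in the question (646464*100).
elements=$((646464 * 100))

# Assumption: 8 bytes per element (double-precision reals).
bytes=$((elements * 8))

# Integer division, so the megabyte figure is truncated.
echo "$bytes bytes, about $((bytes / 1000000)) MB"
```

This prints 517171200 bytes, about 517 MB, which is in line with the rough 520M figure.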
