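I'm writing an image processing application with Qt 6.5.3. A producer (the camera) continuously grabs images, and a consumer runs detection on each grabbed image. Because detection can be very slow, I use multiple threads to speed up the whole pipeline. My code boils down to: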
#include <atomic>
#include <QCoreApplication>
#include <QDebug>
#include <QImage>
#include <QThread>
#include <QTimer>
#include <QtConcurrent>

class Producer : public QObject {
    Q_OBJECT
public:
    Producer(QObject *parent = nullptr) : QObject(parent) {}
public slots:
    void produce() {
        constexpr auto count = 1000;
        for (int i = 0; i < count; ++i) {
            QImage img(2448, 2048, QImage::Format_Grayscale8);
            img.fill(0);
            emit imageReady(img);
        }
    }
signals:
    void imageReady(QImage image);
};

class Consumer : public QObject {
    Q_OBJECT
public:
    Consumer(QObject *parent = nullptr) : QObject(parent) {}
    int consumedCount() const { return count_; }
public slots:
    void onImageReady(QImage image) {
        QFuture<void> future = QtConcurrent::run([=] {
            QImage copy = image.copy(); // Make a deep copy first
            QThread::msleep(200);       // Mock detection on the copy
            qDebug() << ++count_;
        });
    }
private:
    std::atomic_int count_ = 0;
};

int main(int argc, char *argv[]) {
    QCoreApplication a(argc, argv);
    Producer producer;
    Consumer consumer;
    QObject::connect(&producer, &Producer::imageReady, &consumer,
                     &Consumer::onImageReady);
    QTimer::singleShot(0, &producer, &Producer::produce);
    return a.exec();
}

#include "main.moc"
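Since the consumer is much slower than the producer, the process can use a lot of memory (~5 GB) at runtime while images are queued for detection. That is expected.
The strange part is that even after all images have been processed, the process's memory usage stays high (~2 GB).
At first I thought it was a memory leak, but valgrind's memcheck says otherwise.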
==123984== Memcheck, a memory error detector
==123984== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==123984== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==123984== Command: ./MultithreadImage
==123984== Parent PID: 99641
==123984==
==123984==
==123984== Process terminating with default action of signal 2 (SIGINT)
==123984== at 0x5CA9BCF: poll (poll.c:29)
==123984== by 0x60091F5: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==123984== by 0x5FB13E2: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7200.4)
==123984== by 0x569B809: QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) (qeventdispatcher_glib.cpp:393)
==123984== by 0x53FCF6A: QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) (qeventloop.cpp:182)
==123984== by 0x53F97CD: QCoreApplication::exec() (qcoreapplication.cpp:1439)
==123984== by 0x10B94F: main (main.cpp:55)
==123984==
==123984== HEAP SUMMARY:
==123984== in use at exit: 187,502 bytes in 332 blocks
==123984== total heap usage: 16,974 allocs, 16,642 frees, 10,028,770,623 bytes allocated
==123984==
==123984== LEAK SUMMARY:
==123984== definitely lost: 0 bytes in 0 blocks
==123984== indirectly lost: 0 bytes in 0 blocks
==123984== possibly lost: 1,648 bytes in 7 blocks
==123984== still reachable: 185,854 bytes in 325 blocks
==123984== of which reachable via heuristic:
==123984== newarray : 328 bytes in 3 blocks
==123984== suppressed: 0 bytes in 0 blocks
==123984== Rerun with --leak-check=full to see details of leaked memory
==123984==
==123984== For lists of detected and suppressed errors, rerun with: -s
==123984== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
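After some debugging, I found that the reference count inside QImage does not seem to reach 0 until the application is about to exit, which is consistent with the memcheck result: the memory is not leaked, it is just held somewhere in my application. On the other hand, if I just comment out the QThread::msleep(200); part, so that the consumer nearly keeps pace with the producer, memory usage is fine. So something must go wrong while the images are queued, but I can't find where in my code.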
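I can reproduce the behavior. However, this is not a memory leak, nor is some structure holding on to the allocations. It is simply the malloc implementation not releasing memory back to the operating system.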
Try this:
#include <malloc.h>
...
public slots:
    void onImageReady(QImage image) {
        QFuture<void> future = QtConcurrent::run([=] {
            QImage copy = image.copy(); // Make a deep copy first
            QThread::msleep(200);       // Mock detection on the copy
            int count = ++count_;
            qDebug() << count;
            if (count == 1000)
                malloc_trim(0);         // Ask glibc to return free heap memory to the OS
        });
    }
This should release the memory.
Your real code does not need to fiddle with this. The memory will be reused.
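To check whether malloc_trim actually gives anything back, one option is to compare the resident set size before and after the call. The following is a minimal sketch, assuming Linux with glibc; residentBytes and trimAndReport are hypothetical helper names, not part of Qt or glibc, and the numbers they print will vary from run to run.

```cpp
#include <malloc.h>   // malloc_trim: glibc extension, not portable
#include <unistd.h>   // sysconf
#include <cassert>
#include <cstdio>

// Hypothetical helper: resident set size of this process in bytes, or -1 on
// failure. Reads the second field of /proc/self/statm (Linux only).
long residentBytes() {
    std::FILE *f = std::fopen("/proc/self/statm", "r");
    if (!f) return -1;
    long totalPages = 0, residentPages = 0;
    const int fields = std::fscanf(f, "%ld %ld", &totalPages, &residentPages);
    std::fclose(f);
    if (fields != 2) return -1;
    return residentPages * sysconf(_SC_PAGESIZE);
}

// Hypothetical helper: trims the heap and returns how many resident bytes
// disappeared. The result may be zero or even negative.
long trimAndReport() {
    const long before = residentBytes();
    const int released = malloc_trim(0);  // 1 if memory was released, else 0
    const long after = residentBytes();
    assert(released == 0 || released == 1);
    std::printf("malloc_trim returned %d, RSS %ld -> %ld bytes\n",
                released, before, after);
    return before - after;
}
```

Calling trimAndReport() in place of the bare malloc_trim(0) in the snippet above would show how much of the remaining footprint was just free heap memory that the allocator had not yet returned.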
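Alternatives include:
- Calling mallopt(M_MMAP_THRESHOLD, (2448*2048-1) & -4096); so that all allocations equal to or larger than the image size are handled by mmap. Not recommended, because it will slow down the application.
- Allocating the image buffer yourself with mmap. Something like this:
using info_type = QPair<void*, qsizetype>;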
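constexpr qsizetype size = 2448 * 2048;
auto info = std::make_unique<info_type>(nullptr, size);
info->first = mmap(
    nullptr, size, PROT_READ | PROT_WRITE,
    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// Real code should check info->first against MAP_FAILED here.
QImageCleanupFunction cleanup = +[](void* vinfo) {
    // Take ownership back so the info block is deleted after unmapping.
    std::unique_ptr<info_type> info{ static_cast<info_type*>(vinfo) };
    munmap(info->first, info->second);
};
// release(), not get(): the cleanup function now owns the info block,
// so the local unique_ptr must not delete it a second time.
QImage img { static_cast<uchar*>(info->first), 2448, 2048,
             2448 /*bytes per line*/, QImage::Format_Grayscale8,
             cleanup, info.release() };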
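For the mallopt alternative, the threshold expression is easier to follow when spelled out. One image is 2448 * 2048 = 5,013,504 bytes, which is exactly 1224 pages of 4096 bytes, so (size - 1) & -4096 rounds down to 1223 pages = 5,009,408 bytes, just below the image size. The sketch below assumes glibc and a 4096-byte page; mmapThresholdFor and routeImagesThroughMmap are hypothetical names used only for illustration. As far as I know, setting M_MMAP_THRESHOLD explicitly also turns off glibc's dynamic adjustment of that threshold.

```cpp
#include <malloc.h>   // mallopt, M_MMAP_THRESHOLD (glibc-specific)
#include <cassert>
#include <cstddef>

// Largest multiple of pageSize that is strictly below `bytes`.
// pageSize must be a power of two; 4096 is an assumption, not queried.
constexpr std::size_t mmapThresholdFor(std::size_t bytes,
                                       std::size_t pageSize = 4096) {
    return (bytes - 1) & ~(pageSize - 1);
}

constexpr std::size_t kImageBytes = 2448u * 2048u;  // one Grayscale8 frame
static_assert(kImageBytes == 5013504, "1224 pages of 4096 bytes");
static_assert(mmapThresholdFor(kImageBytes) == 1223 * 4096,
              "threshold lands one page below the image size");

// Hypothetical helper: asks glibc to serve image-sized allocations via mmap.
// Returns true if mallopt accepted the value.
bool routeImagesThroughMmap() {
    const std::size_t threshold = mmapThresholdFor(kImageBytes);
    assert(threshold < kImageBytes);
    return mallopt(M_MMAP_THRESHOLD, static_cast<int>(threshold)) == 1;
}
```

A call to routeImagesThroughMmap() would go at the start of main, before the first image is allocated.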
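Again, I don't think you have a real problem. Your actual application will presumably keep allocating and freeing images anyway, so the memory gets reused. It only shows up in this test case because you stop allocating long before you finish freeing, since the consumer is so much slower.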