Out of memory in ClickHouse when searching text


I am investigating whether ClickHouse is a good choice for OLAP. To do so, I replicated some queries that run on PostgreSQL using ClickHouse's syntax.

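All the queries I ran were much faster than on Postgres, but the query that performs a text search runs out of memory. The error code and stack trace are below.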

clickhouse_driver.errors.ServerException: Code: 241. DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 524288 bytes), maximum: 9.31 GiB.

The query is:

SELECT COUNT(*)
FROM ObserverNodeOccurrence AS occ
LEFT JOIN ObserverNodeOccurrence_NodeElements AS occ_ne
    ON occ._id = occ_ne.occurrenceId
WHERE occ_ne.snippet LIKE '<img>'

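The query above counts the entries whose snippet column contains an HTML image (<img>). This column holds HTML snippets, so searching its text is very expensive. The short- to mid-term goal is to parse this column and convert it into a set of other columns (e.g. contains_img, contains_script, etc.). For now, however, I would like to be able to run a query like this without running out of memory.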
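The planned conversion of snippet into flag columns could be sketched in Python roughly as follows. This is only an illustration under assumptions: the names TagCollector and snippet_flags are hypothetical, only the two flags mentioned above are computed, and the snippets are assumed to be parseable by the standard library's html.parser.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the names of all start tags seen in one snippet."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase, so <IMG> arrives as "img".
        self.tags.add(tag)


def snippet_flags(snippet):
    """Return hypothetical flag columns for one HTML snippet."""
    collector = TagCollector()
    collector.feed(snippet)
    collector.close()
    return {
        "contains_img": "img" in collector.tags,
        "contains_script": "script" in collector.tags,
    }
```

Computing such flags once, at load time, would let later queries filter on small boolean columns instead of scanning the HTML text.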

My questions are:

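  • How can I successfully run a text search query on such a column without running out of memory?
  • Is there a way to force the query planner to use disk as soon as it runs out of memory? I am using the MergeTree engine. Is there another engine that can split the load between RAM and disk?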

Full stack trace:

    clickhouse_driver.errors.ServerException: Code: 241. DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 524288 bytes), maximum: 9.31 GiB. Stack trace:
    0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x22) [0x781c272]
    1. /usr/bin/clickhouse-server(MemoryTracker::alloc(long)+0x8ba) [0x71bbb4a]
    2. /usr/bin/clickhouse-server(MemoryTracker::alloc(long)+0xc5) [0x71bb355]
    3. /usr/bin/clickhouse-server() [0x67aeb4e]
    4. /usr/bin/clickhouse-server() [0x67af010]
    5. /usr/bin/clickhouse-server() [0x67e5af4]
    6. /usr/bin/clickhouse-server(void DB::Join::joinBlockImpl<(DB::ASTTableJoin::Kind)1, (DB::ASTTableJoin::Strictness)2, DB::Join::MapsTemplate<DB::JoinStuff::WithFlags<DB::RowRefList, false> > >(DB::Block&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, DB::NamesAndTypesList const&, DB::Block const&, DB::Join::MapsTemplate<DB::JoinStuff::WithFlags<DB::RowRefList, false> > const&) const+0xe1c) [0x68020dc]
    7. /usr/bin/clickhouse-server(DB::Join::joinBlock(DB::Block&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, DB::NamesAndTypesList const&) const+0x1a5) [0x67bc415]
    8. /usr/bin/clickhouse-server(DB::ExpressionAction::execute(DB::Block&, bool) const+0xa5d) [0x6d961dd]
    9. /usr/bin/clickhouse-server(DB::ExpressionActions::execute(DB::Block&, bool) const+0x45) [0x6d97545]
    10. /usr/bin/clickhouse-server(DB::ExpressionBlockInputStream::readImpl()+0x48) [0x6c52888]
    11. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6635628]
    12. /usr/bin/clickhouse-server(DB::FilterBlockInputStream::readImpl()+0xd9) [0x6c538b9]
    13. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6635628]
    14. /usr/bin/clickhouse-server(DB::ExpressionBlockInputStream::readImpl()+0x2d) [0x6c5286d]
    15. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6635628]
    16. /usr/bin/clickhouse-server(DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::loop(unsigned long)+0x139) [0x6c7f409]
    17. /usr/bin/clickhouse-server(DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::thread(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long)+0x209) [0x6c7fc79]
    18. /usr/bin/clickhouse-server(ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::*)(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long), DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>*, std::shared_ptr<DB::ThreadGroupStatus>, unsigned long&>(void (DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::*&&)(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long), DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>*&&, std::shared_ptr<DB::ThreadGroupStatus>&&, unsigned long&)::{lambda()#1}::operator()() const+0x7f) [0x6c801cf]
    19. /usr/bin/clickhouse-server(ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)+0x1af) [0x71c778f]
    20. /usr/bin/clickhouse-server() [0xb2ac5bf]
    21. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fc5b50826db]
    22. /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fc5b480988f]

Tags: full-text-search, sql, performance, clickhouse

1 Answer
    Run clickhouse-client in a terminal and raise the limits for the session:

    SET max_bytes_before_external_group_by = 20000000000; -- 20 GB before GROUP BY spills to disk
    SET max_memory_usage = 40000000000; -- 40 GB memory limit per query
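Settings changed with SET in clickhouse-client apply only to that session, while the traceback in the question comes from clickhouse_driver. A minimal sketch of passing the same two settings per query from Python might look like the following; the helper name run_with_settings and the "localhost" host are assumptions for illustration, and the values are the ones given in this answer.

```python
# Per-query settings from the answer, in bytes.
QUERY_SETTINGS = {
    "max_bytes_before_external_group_by": 20_000_000_000,  # 20 GB
    "max_memory_usage": 40_000_000_000,  # 40 GB
}


def run_with_settings(query, host="localhost", settings=QUERY_SETTINGS):
    """Run one query with the given settings (hypothetical helper)."""
    # Imported lazily so this module loads even where the driver is absent.
    from clickhouse_driver import Client

    client = Client(host)  # host is an assumption; use your server address
    # Client.execute accepts a dict of query-level settings.
    return client.execute(query, settings=settings)
```

Note that the stack trace in the question shows the allocation failing inside DB::Join::joinBlock, so max_memory_usage is probably the setting that matters for this query; max_bytes_before_external_group_by only controls when GROUP BY spills to disk.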
