MongoDB is not distributing data evenly across shards

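I have a database containing 800k objects, and I set up about 13 shard servers for fast access to the data. I assigned each object a letter to use for sharding, for example shards: 'a' for the first object, shards: 'b' for the second object, and so on. I assigned the letters evenly across all objects, so 50k objects have shards: 'a', 50k objects have shards: 'b', and so on. I then created a shard key on the shards field of each object, using "hashed" as the key type, hoping to spread the objects as evenly as possible across the 13 shard servers.

I sharded the collection with sh.shardCollection("test.testCollection", { "shards": "hashed" } ), but the data only went to two of the 13 shard servers. The split between those two servers is also uneven: roughly 72% went to one server and 28% to the other. I want the data to be distributed evenly across all 13 shard servers. Can you help me solve this?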
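
One way to reason about the setup described above: a hashed shard key can produce at most as many distinct hash values as the field has distinct values, and documents that share the same shard key value cannot be split across chunks. The sketch below is plain Node.js, not MongoDB code. The `standInHash` function (the first 8 bytes of an MD5 digest), the 16 letters `a` to `p` (inferred from 800k objects at 50k per letter), and the equal 13-way split of the hash space are all assumptions made for illustration, so the printed counts will not match what a real cluster does.

```javascript
const crypto = require("crypto");

// Hypothetical stand-in for a hashed shard key: the first 8 bytes of an
// MD5 digest read as an unsigned 64-bit integer. Not MongoDB's actual hash.
function standInHash(value) {
  const digest = crypto.createHash("md5").update(String(value)).digest();
  return digest.readBigUInt64BE(0);
}

// Assumed key values: 16 letters x 50k documents = 800k documents.
const letters = "abcdefghijklmnop".split("");
const docsPerLetter = 50000;

// Split the 64-bit hash space into `shardCount` equal ranges and count
// how many documents fall into each range.
function distribute(keys, docsPerKey, shardCount) {
  const rangeSize = (1n << 64n) / BigInt(shardCount);
  const counts = new Array(shardCount).fill(0);
  for (const key of keys) {
    let idx = Number(standInHash(key) / rangeSize);
    if (idx >= shardCount) idx = shardCount - 1; // guard the top edge of the space
    counts[idx] += docsPerKey;
  }
  return counts;
}

const counts = distribute(letters, docsPerLetter, 13);
const distinctHashes = new Set(letters.map(standInHash)).size;

console.log("distinct hashed values:", distinctHashes);
console.log("documents per range:", counts);
```

Every count is a multiple of 50k because each letter moves as one block, so with this few distinct values some ranges can end up empty while others receive several letters. The number of distinct values in the shard key field is therefore worth checking before expecting an even spread.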

[
  {
    _id: 'a',
    host: 'a/127.0.0.1:21000,127.0.0.1:21001,127.0.0.1:21002',
    state: 1,
    topologyTime: Timestamp({ t: 1675107083, i: 3 })
  },
  {
    _id: 'b',
    host: 'b/127.0.0.1:22000,127.0.0.1:22001,127.0.0.1:22002',
    state: 1,
    topologyTime: Timestamp({ t: 1675107100, i: 5 })
  },
  {
    _id: 'c',
    host: 'c/127.0.0.1:23000,127.0.0.1:23001,127.0.0.1:23002',
    state: 1,
    draining: true
  },
  {
    _id: 'd',
    host: 'd/127.0.0.1:23010,127.0.0.1:23011,127.0.0.1:23012',
    state: 1,
    topologyTime: Timestamp({ t: 1676821653, i: 5 })
  },
  {
    _id: 'e',
    host: 'e/127.0.0.1:23020,127.0.0.1:23021,127.0.0.1:23022',
    state: 1,
    topologyTime: Timestamp({ t: 1676821663, i: 5 })
  },
  {
    _id: 'f',
    host: 'f/127.0.0.1:23030,127.0.0.1:23031,127.0.0.1:23032',
    state: 1,
    topologyTime: Timestamp({ t: 1676821668, i: 1 })
  },
  {
    _id: 'g',
    host: 'g/127.0.0.1:23040,127.0.0.1:23041,127.0.0.1:23042',
    state: 1,
    topologyTime: Timestamp({ t: 1676821673, i: 5 })
  },
  {
    _id: 'h',
    host: 'h/127.0.0.1:23050,127.0.0.1:23051,127.0.0.1:23052',
    state: 1,
    topologyTime: Timestamp({ t: 1676821678, i: 5 })
  },
  {
    _id: 'j',
    host: 'j/127.0.0.1:23060,127.0.0.1:23061,127.0.0.1:23062',
    state: 1,
    topologyTime: Timestamp({ t: 1676821685, i: 5 })
  },
  {
    _id: 'k',
    host: 'k/127.0.0.1:23070,127.0.0.1:23071,127.0.0.1:23072',
    state: 1,
    topologyTime: Timestamp({ t: 1676821689, i: 5 })
  },
  {
    _id: 'l',
    host: 'l/127.0.0.1:23080,127.0.0.1:23081,127.0.0.1:23082',
    state: 1,
    topologyTime: Timestamp({ t: 1676821694, i: 5 })
  },
  {
    _id: 'm',
    host: 'm/127.0.0.1:23090,127.0.0.1:23091,127.0.0.1:23092',
    state: 1,
    topologyTime: Timestamp({ t: 1676821698, i: 5 })
  },
  {
    _id: 'n',
    host: 'n/127.0.0.1:24000,127.0.0.1:24001,127.0.0.1:24002',
    state: 1,
    topologyTime: Timestamp({ t: 1676821708, i: 4 })
  }
]
Shard a at a/127.0.0.1:21000,127.0.0.1:21001,127.0.0.1:21002
{
  data: '125.57MiB',
  docs: 227420,
  chunks: 1,
  'estimated data per chunk': '125.57MiB',
  'estimated docs per chunk': 227420
}
Shard k at k/127.0.0.1:23070,127.0.0.1:23071,127.0.0.1:23072
{
  data: '326.31MiB',
  docs: 576209,
  chunks: 1,
  'estimated data per chunk': '326.31MiB',
  'estimated docs per chunk': 576209
}
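
As a quick check of the 72% / 28% figure, the shares can be recomputed from the `docs` counts in the per-shard output above. This is a small sketch in plain JavaScript; the variable names are made up for illustration.

```javascript
// Document counts copied from the per-shard output above.
const docsPerShard = { a: 227420, k: 576209 };

// Total number of documents across the shards that hold data.
const total = Object.values(docsPerShard).reduce((sum, n) => sum + n, 0);

// Percentage of documents on each shard, rounded to one decimal place.
const share = {};
for (const [shard, docs] of Object.entries(docsPerShard)) {
  share[shard] = ((docs / total) * 100).toFixed(1) + "%";
}

console.log(total, share);
```

The two shards together hold 803,629 documents, about 28.3% on shard a and 71.7% on shard k, which matches the split described in the question.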

Sample document:

{
  "_id": {
    "$oid": "63dd7324289226c918818c55"
  },
  "Title": "",
  "Product": {
    "web1": {
      "Harry Potter and the Chamber of Secrets: 2/7 (Harry Potter 2)": {
        "Price": 15,
        "Url": "https://www.amazon.com/Harry-Potter-Chamber-Secrets-Book/dp/B017V4IPPO/ref=sr_1_2?crid=GCT8C7Z3Q4SE&keywords=Harry+Potter+and+the+Chamber+of+Secrets&qid=1676836656&sprefix=harry+potter+and+the+chamber+of+secrets%2Caps%2C230&sr=8-2",
        "Time": {
          "$date": {
            "$numberLong": "1676669514749"
          }
        }
      }
    }
  },
  "Category": [
    "Book",
    "Fantasy"
  ],
  "Time": {
    "$date": {
      "$numberLong": "1676669514749"
    }
  },
  "shards": "h"
}

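I want to make sure the data is distributed evenly across my shard servers, and I would like to understand what I need to do to achieve that.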

mongodb mongodb-query pymongo sharding mongodb-shell