Sharding - MongoDB shard key as (ObjectId, ObjectId, RandomKey): unbalanced collections
I am trying to shard a collection of approximately 6M documents. The details of the sharded cluster are as follows:
mongod version 2.6.7, 2 shards, 40% writes, 60% reads.
My database has a collection, events, with around 6M documents. A typical document looks like this:
{
  _id: ObjectId,
  sector_id: ObjectId,
  subsector_id: ObjectId,
  ... many event-specific fields go here ...
  created_at: Date,
  updated_at: Date,
  uid: 16-digit random key
}
Each sector has multiple (1, 2, ... 100) subsectors, and each subsector has multiple events. There are 10,000 such sectors, 30,000 subsectors and 6M events, and the numbers keep growing.
A typical read query includes sector_id and subsector_id. Every write operation includes sector_id, subsector_id, uid (a randomly generated unique key) and the rest of the data.
I have tried or considered the following shard keys; the results are described below:
a. _id: hashed --> does not provide query isolation, because _id is not passed in read queries.
b. sector_id: 1, subsector_id: 1, uid: 1 --> strange distribution: a few sectors with old ObjectIds go to shard 1, a few sectors with mid-age ObjectIds are balanced and equally distributed between both shards, and a few sectors with recent ObjectIds stay on shard 0.
c. subsector_id: hashed --> same results as shard key b.
d. subsector_id: 1, uid: 1 --> same as b.
e. subsector_id: hashed, uid: 1 --> cannot create such an index.
f. uid: 1 --> writes are distributed, but there is no query isolation.
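For reference, the candidates above can be written as shardCollection admin commands. This is only a minimal sketch: the namespace mydb.events and the helper shardCommand are assumptions made for illustration, since the question does not name the database. In the mongo shell, each object would be passed to db.adminCommand(...), and db.events.getShardDistribution() reports how the documents ended up spread across the shards.

```javascript
// Hypothetical namespace; the question does not name the database.
const NAMESPACE = "mydb.events";

// Build the document for the shardCollection admin command.
function shardCommand(key) {
  return { shardCollection: NAMESPACE, key: key };
}

const candidates = {
  a: shardCommand({ _id: "hashed" }),
  b: shardCommand({ sector_id: 1, subsector_id: 1, uid: 1 }),
  c: shardCommand({ subsector_id: "hashed" }),
  d: shardCommand({ subsector_id: 1, uid: 1 }),
  // (e) { subsector_id: "hashed", uid: 1 } is left out: the question reports
  // that this index could not be created on 2.6.7.
  f: shardCommand({ uid: 1 }),
};

// In the mongo shell, connected to a mongos (a collection can be sharded once):
//   db.adminCommand(candidates.b)
//   db.events.getShardDistribution()
console.log(JSON.stringify(candidates.b));
```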
What may be the reason for this uneven distribution? What would be the right shard key, given this data?
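I see this as expected behaviour, Astro. The sector_ids and subsector_ids are of ObjectId type, which contains a timestamp in its first 4 bytes and is therefore monotonic in nature. Ids created around the same time go to the same chunk (and hence the same shard), so the key fails to provide randomness, as you pointed out in point (b).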
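The timestamp prefix can be seen with a few lines of plain JavaScript. This is a sketch under stated assumptions: fakeObjectId builds synthetic ObjectId-like hex strings (one sector per day, a counter in place of the machine and process bytes), and the single split point stands in for a range-based chunk boundary. None of these helpers are MongoDB APIs.

```javascript
// Read the creation time (seconds since the epoch) stored in the
// first 4 bytes, i.e. the first 8 hex characters, of an ObjectId.
function objectIdSeconds(hex) {
  return parseInt(hex.slice(0, 8), 16);
}

// Build a synthetic ObjectId-like hex string for a given time.
// Illustration only: the trailing 8 bytes hold just a counter.
function fakeObjectId(seconds, counter) {
  const ts = seconds.toString(16).padStart(8, "0");
  const rest = counter.toString(16).padStart(16, "0");
  return ts + rest;
}

// Hypothetical data: one sector created per day for three years.
const start = 1356998400; // 2013-01-01T00:00:00Z
const sectorIds = [];
for (let day = 0; day < 1095; day++) {
  sectorIds.push(fakeObjectId(start + day * 86400, day));
}

// A range-based chunk boundary in the middle of the key space.
const splitPoint = sectorIds[Math.floor(sectorIds.length / 2)];
const lowChunk = sectorIds.filter((id) => id < splitPoint);
const highChunk = sectorIds.filter((id) => id >= splitPoint);

// Every id in the low chunk is older than every id in the high chunk.
const newestLow = Math.max(...lowChunk.map(objectIdSeconds));
const oldestHigh = Math.min(...highChunk.map(objectIdSeconds));
console.log(
  new Date(newestLow * 1000).toISOString(),
  "<",
  new Date(oldestHigh * 1000).toISOString()
);
```

Because the key order follows creation time, a range split on such a key separates old sectors from recent ones, which matches the pattern described in point (b).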
The best way to choose a shard key is to pick a key that has business meaning (unlike an ObjectId field) and mix it with a hash suffix to ensure a random mix and an equal distribution. If you have a sector name and a subsector name, please try them out and let me know whether that works.
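To illustrate why a hashed component helps in principle, the sketch below compares splitting monotonic ids by range with splitting them by a hash of the same ids. MD5 from Node's crypto module is used only as a stand-in hash. The synthetic ids, the two-shard split and the helper names are assumptions for illustration, and the code does not reproduce how MongoDB assigns chunks internally.

```javascript
const crypto = require("crypto");

// Stand-in hash: the first 4 bytes of MD5 as an unsigned 32-bit integer.
function hash32(value) {
  return crypto.createHash("md5").update(value).digest().readUInt32BE(0);
}

// 1000 synthetic, monotonically increasing ids (24 hex chars, like ObjectIds).
const ids = [];
for (let i = 0; i < 1000; i++) {
  ids.push((0x54000000 + i * 3600).toString(16) + "0000000000000000");
}

// The newest 200 ids stand in for "recent sectors".
const recent = ids.slice(-200);

// Range split: ids below the median go to shard 0, the rest to shard 1.
const median = ids[500];
const rangeShard = (id) => (id < median ? 0 : 1);

// Hash split: the lower half of the 32-bit hash space goes to shard 0.
const hashShard = (id) => (hash32(id) < 0x80000000 ? 0 : 1);

// Count how many ids of a sample land on shard 0 under a given split.
function countOnShard0(shardOf, sample) {
  return sample.filter((id) => shardOf(id) === 0).length;
}

console.log("recent ids on shard 0 by range:", countOnShard0(rangeShard, recent));
console.log("recent ids on shard 0 by hash: ", countOnShard0(hashShard, recent));
```

Under the range split all recent ids land on one shard, while the hash split sends roughly half of them to each shard regardless of their age.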
You may also want to look at this link on choosing the right shard key:
mongodb shard date on single machine