Kenny Gorman presents on the gotchas with MongoDB and how to scale it appropriately.
Scale = very big, busy, tricky system
Denormalize till it works to get to true scale; split vertically and then horizontally
Scaling today is focused on rapid development
MongoDB is an OPS nightmare
Built-in horizontal scaling and replication; 65% of deployments are in the cloud
MongoDB challenges:
- Coarse lock scope; 2.2 has DB-level lock scope; 2.0 was global
- Visibility
- Schema – still need to do the business stuff
- When bad things happen…it goes down fast
Design for scale:
- Denormalize into different DBs
- Tune your workloads
- NoORM; dump it
- Shard early (Think about your shard key in advance)
- Replicate (How to get availability)
- Load test pre-prod
Other thoughts:
- Embed vs. not
- Indexing the right amount (roughly a 10% drop in performance for every B-tree index added; covered indexes are cool)
- Atomic ops; use them! ($set, $pop, $push)
- Use profiler and explain(); you can mine the profiler collection for optimizing
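Mining the profiler collection could look roughly like the sketch below. It works offline on a list of dicts and assumes entries shaped like system.profile documents, using only the `ns`, `op`, and `millis` fields. The sample entries and the function name `slowest_by_namespace` are hypothetical, made up for illustration.

```python
from collections import defaultdict

def slowest_by_namespace(entries):
    """Group profiler entries by (ns, op) and rank by average millis."""
    totals = defaultdict(lambda: [0, 0])  # (ns, op) -> [count, total millis]
    for e in entries:
        key = (e["ns"], e["op"])
        totals[key][0] += 1
        totals[key][1] += e["millis"]
    report = [
        {"ns": ns, "op": op, "count": count, "avg_millis": total / count}
        for (ns, op), (count, total) in totals.items()
    ]
    # Slowest groups first, so the top of the list is where to look.
    return sorted(report, key=lambda r: r["avg_millis"], reverse=True)

# Hypothetical sample entries; real ones would be read from system.profile.
sample = [
    {"ns": "shop.orders", "op": "query", "millis": 120},
    {"ns": "shop.orders", "op": "query", "millis": 80},
    {"ns": "shop.users", "op": "update", "millis": 5},
]
report = slowest_by_namespace(sample)
```

The top entries of `report` are the candidates to run through explain() first.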
Shard Keys:
- Tradeoffs – uses range-based shard key
- Local vs. Scattered
- Figure this out at design time
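The local vs. scattered tradeoff with a range-based shard key can be illustrated with a toy router. This is only a sketch: the split points, the three-shard layout, and the `route` function are assumptions for illustration, not how mongos is implemented.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical split points: shard 0 holds keys < 100, shard 1 holds
# 100..199, shard 2 holds keys >= 200.
SPLIT_POINTS = [100, 200]

def route(key):
    """Return the index of the shard whose key range contains `key`."""
    return bisect_right(SPLIT_POINTS, key)

# Monotonically increasing key (for example a timestamp-like counter):
# the newest writes all land in the last range, i.e. on one shard.
local_writes = Counter(route(k) for k in range(200, 300))

# Key that jumps around the key space: writes spread over every range.
scattered_writes = Counter(route((i * 37) % 300) for i in range(100))
```

Here `local_writes` piles all 100 writes onto shard 2, while `scattered_writes` touches all three shards. The flip side is that a range query on the scattered key has to visit every shard.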
Architecture:
- Engage all processors
- Replication is election based and has fault zones so you can plan; “shell game” – rebuild slaves due to fragmentation
- Client connections, getLastError
I/O
- Need fast response time from disk when passivating for large DBs
To do the “shell game”, perform the work on a slave and then rotate it back in to take over as the primary
Add an arbiter to replica sets; it prevents the error where the replica set is down and a member cannot vote for itself
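Why the arbiter helps comes down to vote counting: a replica set elects a primary only when a strict majority of voting members can see each other. A minimal sketch of that arithmetic follows; the function name `can_elect_primary` is hypothetical.

```python
def can_elect_primary(voting_members, reachable_members):
    """A primary can be elected only with a strict majority of votes."""
    return reachable_members * 2 > voting_members

# Two data-bearing members, one goes down: 1 of 2 votes is not a majority.
two_member_set = can_elect_primary(voting_members=2, reachable_members=1)

# Same two members plus an arbiter: 2 of 3 votes is a majority.
with_arbiter = can_elect_primary(voting_members=3, reachable_members=2)
```

With two members the survivor cannot elect itself, while the same failure in a set with an arbiter still leaves a majority.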
Random partitioning tips:
- Monitor elections and who is the primary
- Write scripts to kill sessions
- automate or die
- mongostat (like iostat)
- Keep historical performance data; sample serverStatus over time to get a histogram of perf. (or just use MMS)
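Keeping history from serverStatus mostly means turning cumulative counters into rates, which is roughly what mongostat displays. A sketch, assuming two saved snapshots of the `opcounters` section; the snapshot values and the helper name `ops_per_second` are made up for illustration.

```python
def ops_per_second(before, after, interval_seconds):
    """Turn two cumulative opcounters snapshots into per-second rates."""
    return {
        name: (after[name] - before[name]) / interval_seconds
        for name in after
        if name in before
    }

# Hypothetical snapshots taken 60 seconds apart.
t0 = {"insert": 1000, "query": 5000, "update": 200, "delete": 10}
t1 = {"insert": 1600, "query": 8000, "update": 260, "delete": 10}
rates = ops_per_second(t0, t1, 60)
```

Storing one such `rates` dict per interval gives the raw data for a histogram over time.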
Gotchas:
- Logical schema corruption – use docs to manage it across teams especially types
- Locks are per DB, so split DBs if required; a lock percentage of 50% is considered slow
- Not enough I/O perf
- Engage all processors (use test harness and optimize for scaling)
- Visibility (save data over time; monitoring; MMS, etc.)
- Not understanding how MongoDB works; use it when it makes sense
- Don’t believe the FUD
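For the logical schema corruption gotcha, documentation can be backed by a small check that teams run against sample documents. This is a sketch under assumptions: the `USER_SCHEMA` fields and the `type_mismatches` helper are hypothetical and not something from the talk.

```python
# Hypothetical agreed-upon schema for a "users" document.
USER_SCHEMA = {"user_id": int, "email": str, "tags": list}

def type_mismatches(doc, schema):
    """Return the fields that are missing or stored with the wrong type."""
    problems = []
    for field, expected in schema.items():
        if field not in doc:
            problems.append((field, "missing"))
        elif type(doc[field]) is not expected:
            problems.append((field, type(doc[field]).__name__))
    return problems

good = {"user_id": 42, "email": "a@example.com", "tags": []}
bad = {"user_id": "42", "email": "a@example.com"}  # id stored as a string
```

Calling `type_mismatches(bad, USER_SCHEMA)` flags `user_id` as a string and `tags` as missing, which is the kind of type drift that creeps in when several teams write to the same collection.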
@kennygorman; kgorman@objectrocket.com
rocketstat is a good replacement for mongostat
