Page 1
An introduction to MongoDB
Rácz Gábor
ELTE IK, February 10, 2013
Page 2
In Production
http://www.mongodb.org/about/production-deployments/
2009: Initial release
Current version: 2.2.3
Page 3
NoSQL
Key-value
Graph database
Document-oriented
Column family
Huge quantity of data => distributed systems => expensive joins =>
New fields, new demands (graphs) =>
Different data structures:
Simpler or more specific
Page 4
Document store
JavaScript
Page 5
Document store
> db.user.findOne({age: 39})
{
    "_id" : ObjectId("5114e0bd42…"),
    "first" : "John",
    "last" : "Doe",
    "age" : 39,
    "interests" : [
        "Reading",
        "Mountain Biking" ],
    "favorites" : {
        "color" : "Blue",
        "sport" : "Soccer" }
}
Flexible schema
JavaScript
Page 6
CRUD
Create
db.collection.insert( <document> )
db.collection.save( <document> )
db.collection.update( <query>, <update>, { upsert: true } )
Read
db.collection.find( <query>, <projection> )
db.collection.findOne( <query>, <projection> )
Update
db.collection.update( <query>, <update>, <options> )
Delete
db.collection.remove( <query>, <justOne> )
Create
The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.
The field names cannot start with the $ character.
The field names cannot contain the . character.
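The three naming rules above can be checked mechanically. The following is a minimal sketch in plain JavaScript; `validateDocument` is a hypothetical helper (not part of the MongoDB API), and it only inspects top-level field names.

```javascript
// Hypothetical helper, not a MongoDB function: returns a list of
// rule violations for the top-level fields of a document.
function validateDocument(doc) {
  const errors = [];
  // _id may be of any type other than an array.
  if (Array.isArray(doc._id)) {
    errors.push("_id must not be an array");
  }
  for (const name of Object.keys(doc)) {
    if (name.startsWith("$")) {
      errors.push(`field "${name}" starts with $`);
    }
    if (name.includes(".")) {
      errors.push(`field "${name}" contains a dot`);
    }
  }
  return errors;
}

console.log(validateDocument({ _id: 1, first: "John" }));      // []
console.log(validateDocument({ "$bad": 1, "a.b": 2 }).length); // 2
```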
Create with save
If the argument does not contain the _id field or contains an _id field with a value not in the collection, the save() method performs an insert of the document.
Otherwise, the save() method performs an update.
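The insert-or-update decision described above can be sketched without a database. This is an illustration only: a `Map` stands in for the collection, `saveSketch` is a hypothetical name, and a simple counter replaces the ObjectId that MongoDB would generate.

```javascript
// In-memory stand-in for a collection: a Map keyed by _id.
// saveSketch is a hypothetical name, not a MongoDB function.
let nextId = 1; // assumption: a counter instead of a generated ObjectId

function saveSketch(collection, doc) {
  if (doc._id === undefined) {
    // No _id field: generate one and insert.
    const inserted = { _id: nextId++, ...doc };
    collection.set(inserted._id, inserted);
    return "insert";
  }
  // _id present: insert if unknown, otherwise replace the stored document.
  const action = collection.has(doc._id) ? "update" : "insert";
  collection.set(doc._id, { ...doc });
  return action;
}

const users = new Map();
console.log(saveSketch(users, { first: "John" }));           // insert
console.log(saveSketch(users, { _id: 1, first: "Johnny" })); // update
console.log(saveSketch(users, { _id: 42, first: "Jane" }));  // insert
```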
Page 7
CRUD example
> db.user.insert({
    first: "John",
    last: "Doe",
    age: 39
})
> db.user.find()
{
    "_id" : ObjectId("51…"),
    "first" : "John",
    "last" : "Doe",
    "age" : 39
}
> db.user.update(
    {"_id" : ObjectId("51…")},
    {
        $set: {
            age: 40,
            salary: 7000 }
    }
)
> db.user.remove({
    "first": /^J/
})
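To make the effect of the update and remove calls above concrete, here is a small sketch in plain JavaScript. `applySet` and `matchesRegex` are hypothetical helpers that imitate the semantics on in-memory objects; they are not how the server implements these operations.

```javascript
// Hypothetical helpers illustrating the semantics only.
function applySet(doc, update) {
  // $set overwrites the listed fields and adds the missing ones;
  // all other fields are left untouched.
  return { ...doc, ...update.$set };
}

function matchesRegex(doc, field, pattern) {
  // A regex condition matches only string values that satisfy the pattern.
  return typeof doc[field] === "string" && pattern.test(doc[field]);
}

const john = { first: "John", last: "Doe", age: 39 };
const updated = applySet(john, { $set: { age: 40, salary: 7000 } });
console.log(updated); // { first: 'John', last: 'Doe', age: 40, salary: 7000 }

// remove({first: /^J/}) deletes every document whose first name starts with J.
const people = [updated, { first: "Alice", last: "Smith", age: 30 }];
const remaining = people.filter((p) => !matchesRegex(p, "first", /^J/));
console.log(remaining.length); // 1
```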
Page 8
Features
Document-Oriented Storage
Full Index Support
Replication & High Availability
Auto-Sharding
Querying
Fast In-Place Updates
Map/Reduce
Agile
Scalable
Page 9
Memory Mapped Files
"A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource." [1]
mmap()
[1] http://en.wikipedia.org/wiki/Memory-mapped_file
http://docs.mongodb.org/manual/faq/storage/
Page 10
Replica Sets
Redundancy and Failover
Zero downtime for upgrades and maintenance
Master-slave replication
Strong Consistency
Delayed Consistency
Geospatial features
Host1:10000
Host2:10001
Host3:10002
replica1
Client
DC – Data center
Page 11
Sharding
Partition your data
Scale write throughput
Increase capacity
Auto-balancing
Host1:10000
Host2:10010
Host3:20000
shard1
shard2
Host4:30000
configdb
Client
1970 – 2000: Vertical Scalability (scale-up)
Google, ~2000: Horizontal Scalability (scale-out)
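As a rough illustration of "partition your data", the sketch below routes a document to a shard by looking up its shard-key value in a table of ranges. It assumes range-based partitioning on a numeric shard key; the chunk boundaries and the `routeToShard` helper are made up for this example.

```javascript
// Hypothetical chunk table, as the config database might hold it:
// each chunk covers shard-key values in [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard1" },
  { min: 1000, max: 5000, shard: "shard2" },
  { min: 5000, max: Infinity, shard: "shard1" },
];

// Hypothetical router function: finds the chunk that covers the key.
function routeToShard(chunkTable, shardKeyValue) {
  const chunk = chunkTable.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) throw new Error("no chunk covers " + shardKeyValue);
  return chunk.shard;
}

console.log(routeToShard(chunks, 42));   // shard1
console.log(routeToShard(chunks, 1000)); // shard2
console.log(routeToShard(chunks, 9999)); // shard1
```

Auto-balancing would then amount to moving chunks between shards and updating this table, while clients keep calling the same routing function.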
Page 12
Mixed
Host4:10010
Host5:20000
shard1
shardn
Host6:30000
configdb
Client
Host1:10000
Host2:10001
Host3:10002
replica1
Host7:30000
...
Page 13
Map/Reduce
db.collection.mapReduce(
    <mapfunction>,
    <reducefunction>,
    {
        out: <collection>,
        query: <document>,
        sort: <document>,
        limit: <number>,
        finalize: <function>,
        scope: <document>,
        jsMode: <boolean>,
        verbose: <boolean>
    }
)
var mapFunction1 = function() { emit(this.cust_id, this.price); };
var reduceFunction1 = function(keyCustId, valuesPrices)
    { return Array.sum(valuesPrices); };
http://docs.mongodb.org/manual/reference/method/db.collection.mapReduce/#db.collection.mapReduce
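The data flow behind the map and reduce functions above can be imitated in plain JavaScript. This is a sketch under stated assumptions: `mapReduceSketch` is a hypothetical in-memory runner, `emit` is passed as an argument because there is no shell-provided global here, and the sample orders are invented.

```javascript
// Hypothetical in-memory runner, not the MongoDB implementation.
function mapReduceSketch(docs, mapFn, reduceFn) {
  const groups = new Map();
  // emit() collects the values per key, as the map phase does.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // The map function runs once per document, with the document as `this`.
  for (const doc of docs) mapFn.call(doc, emit);
  const result = {};
  for (const [key, values] of groups) {
    // Keys with a single value skip the reduce step.
    result[key] = values.length === 1 ? values[0] : reduceFn(key, values);
  }
  return result;
}

const orders = [
  { cust_id: "A", price: 25 },
  { cust_id: "B", price: 10 },
  { cust_id: "A", price: 15 },
];
const totals = mapReduceSketch(
  orders,
  function (emit) { emit(this.cust_id, this.price); },
  (key, prices) => prices.reduce((a, b) => a + b, 0)
);
console.log(totals); // { A: 40, B: 10 }
```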
Page 14
Other features
Easy to install and use
Detailed documentation
Various APIs
JavaScript, Python, Ruby, Perl, Java, Scala, C#, C++, Haskell, Erlang
Community
Open source
Page 15
Theory of NoSQL: CAP
CAP theorem: satisfying all three at the same time is impossible
Many nodes
Nodes contain replicas of partitions of data
Consistency (C)
all replicas contain the same version of data
Availability (A)
system remains operational on failing nodes
Partition tolerance (P)
multiple entry points
system remains operational on system split
Page 16
Theory of NoSQL: CAP
MongoDB
A: What if a primary node is down?
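One way to reason about this question: a replica set can elect a new primary only while a majority of its voting members can still reach each other. The sketch below states that rule as a function; `canElectPrimary` is a hypothetical helper, and it assumes every member has one vote.

```javascript
// Hypothetical helper: a new primary can be elected only if a strict
// majority of the set's voting members can still reach each other.
function canElectPrimary(totalMembers, reachableMembers) {
  return reachableMembers > totalMembers / 2;
}

console.log(canElectPrimary(3, 2)); // true
console.log(canElectPrimary(3, 1)); // false
console.log(canElectPrimary(4, 2)); // false
```

With three members, losing the primary leaves two reachable nodes, so the set stays writable after an election; losing two nodes leaves it without a primary.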
Page 17
ACID - BASE
Pritchett, D.: BASE: An Acid Alternative (queue.acm.org/detail.cfm?id=1394128)
Atomicity
Consistency
Isolation
Durability
Basically
Available (CP)
Soft-state
Eventually consistent (AP)
ACID
Atomicity. All of the operations in the transaction will complete, or none will.
Consistency. The database will be in a consistent state when the transaction begins and ends.
Isolation. The transaction will behave as if it is the only operation being performed upon the database.
Durability. Upon completion of the transaction, the operation will not be reversed.
BASE
Basically Available: some parts of the system remain available on failure
Soft-state:
the information will expire unless it is refreshed
the system will change state without user intervention due to eventual consistency
Eventual consistency:
asynchronous propagation
consistency window
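Asynchronous propagation and the consistency window can be shown with a toy model. This is a minimal sketch: `Primary` and `Secondary` are hypothetical classes, and an explicit `sync()` call stands in for the replication that a real system performs in the background.

```javascript
// Hypothetical classes for illustration only.
class Primary {
  constructor() {
    this.data = {};
    this.log = [];
  }
  write(key, value) {
    this.data[key] = value;
    this.log.push({ key, value }); // queued for asynchronous propagation
  }
}

class Secondary {
  constructor(primary) {
    this.primary = primary;
    this.data = {};
    this.applied = 0; // number of log entries already applied
  }
  read(key) {
    return this.data[key];
  }
  // Called "later": applies every log entry not yet seen.
  sync() {
    while (this.applied < this.primary.log.length) {
      const { key, value } = this.primary.log[this.applied++];
      this.data[key] = value;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary(primary);
primary.write("age", 40);
console.log(secondary.read("age")); // undefined (inside the consistency window)
secondary.sync();
console.log(secondary.read("age")); // 40
```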
Page 18
Thank you for your attention!