Page 1
MongoDB and the Internet of Things
Chris Biow
Sr. Solutions Architect
chris.biow@mongodb.com
July 28, 2015
Page 2
Power Your IoT App with MongoDB
Page 3
ABOUT SPEAKER
@chris_biow
Search Engines
Post-relational Databases
Page 4
AGENDA
IoT Overview & Use Cases
Architecture & Challenges
Agility & Scalability with MongoDB
Powered by MongoDB Case Study
Page 5
WHAT’S IoT?
It’s a BUZZWORD!
IoT
BIG DATA
CLOUD
TOP BUZZWORDS!
Page 6
Internet 4.0
The Evolution of the Internet
Page 7
What’s in IoT?
Source: http://postscapes.com/what-exactly-is-the-internet-of-things-infographic
Page 11
CONNECTED COW by VITAL HERD
E-pill ingested into stomach
Transmits heart rate, temp, chemical composition
Notifies farmer when abnormality is detected
Health management
94 million cows in the US; billions in savings
Page 12
MyJohnDeere
Page 13
Source: Cisco.
Page 14
Source: GSMA.
Page 15
Source: IDC.
Page 16
TECHNOLOGY STACK
Value Delivery: Business Analytics, User Access & Control
Middleware and Storage: Application Servers, Database Servers
Communication Protocol: MQTT, CoAP, XMPP, AMQP, RESTful
Wireless Transport: Zigbee, Z-Wave, Wi-Fi, GPRS, Bluetooth LE
Hardware Platform: Arduino, Raspberry Pi, Intel Edison
Page 18
CHALLENGES
Value Delivery
Middleware and Storage
  Variable data format
  Enormous data volume
Communication Protocol
Wireless Transport
  Long-lasting, efficient connectivity
Hardware Platform
  Non-standard sensor interfaces
Page 19
CASE STUDY – Airplane Tracking
MH-370
Page 20
Variable Data Structure
Multiple sources
ADS-C, HFDL, ASDI, EUROCONTROL, ACARS
Multiple forms
location: [ 38.2031, -120.4904 ],
speed: 750,
altitude: 29384,
engine:
  fuel_level: 78%,
  temperature: 89,
  EPR: xx,
  N-value: { N1: xxx, N2: xxx, N3: xx }
…
Page 21
SAMPLE DESIGN 1
Modeling all metrics as columns in one relational table
EVENT_ID | PLANE_ID | TIMESTAMP     | LAT     | LONG      | ENGINE TEMP | FUEL LEVEL | … | SPEED
100001   | 3902     | 1437297148810 | 38.2031 | -124.4904 |             |            | … |
100002   | 3902     | 1437297149213 |         |           |             |            | … | 750
Huge table, with lots of space wasted on empty values
Frequent schema changes and data migrations when adding new metrics
Page 22
SAMPLE DESIGN 2
Store variable metrics in an EAV (entity-attribute-value) table
EVENT_ID | PLANE_ID | TIMESTAMP
100001   | 3902     | 1437297148810
EVENT_ID | METRIC_NAME | METRIC_VALUE
100001   | LAT         | 38.2031
100001   | LONG        | -124.4904
100002   | SPEED       | 750
METRIC_VALUE needs to be defined as a TEXT field
Indexing implications for the METRIC_VALUE field
Multiple self-joins are necessary
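The self-join cost of the EAV design can be illustrated with a small sketch using Python's built-in sqlite3 module. The table names `events` and `metrics` are hypothetical (the slide does not name them), and the timestamp for event 100002 is taken from the Sample Design 1 slide. Reading back one latitude/longitude pair already needs one join per metric:

```python
import sqlite3

# In-memory database; table names "events" and "metrics" are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_id INTEGER PRIMARY KEY, plane_id INTEGER, ts INTEGER);
CREATE TABLE metrics (event_id INTEGER, metric_name TEXT, metric_value TEXT);
INSERT INTO events VALUES
  (100001, 3902, 1437297148810),
  (100002, 3902, 1437297149213);
INSERT INTO metrics VALUES
  (100001, 'LAT', '38.2031'),
  (100001, 'LONG', '-124.4904'),
  (100002, 'SPEED', '750');
""")

# One join against the metrics table per metric we want back as a column.
# METRIC_VALUE is TEXT, so every numeric use needs a cast.
rows = conn.execute("""
SELECT e.event_id,
       e.plane_id,
       CAST(lat.metric_value AS REAL) AS lat,
       CAST(lon.metric_value AS REAL) AS lon
FROM events e
JOIN metrics lat ON lat.event_id = e.event_id AND lat.metric_name = 'LAT'
JOIN metrics lon ON lon.event_id = e.event_id AND lon.metric_name = 'LONG'
""").fetchall()

print(rows)
```

Each additional metric in the result adds another join, which is the pain point the slide refers to.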
Page 23
Enormous Data Volume
A single flight, at per-minute intervals:
3 * 60 * 100 = 18K data points/flight
100,000 flights per day:
1.8 billion data points, 1.8 TB per day
21,000 QPS
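The volume figures can be checked with a few lines of arithmetic. The slide does not label the factors, so the reading below (3 hours per flight, 60 samples per hour, 100 metrics per sample) is an assumption, as is the roughly 1 KB per data point implied by 1.8 TB per day:

```python
# Assumed interpretation of the factors in "3 * 60 * 100".
HOURS_PER_FLIGHT = 3
SAMPLES_PER_HOUR = 60        # one sample per minute
METRICS_PER_SAMPLE = 100
FLIGHTS_PER_DAY = 100_000
BYTES_PER_POINT = 1_000      # assumption implied by 1.8 TB/day
SECONDS_PER_DAY = 24 * 60 * 60

points_per_flight = HOURS_PER_FLIGHT * SAMPLES_PER_HOUR * METRICS_PER_SAMPLE
points_per_day = points_per_flight * FLIGHTS_PER_DAY
bytes_per_day = points_per_day * BYTES_PER_POINT
writes_per_second = points_per_day / SECONDS_PER_DAY

print(points_per_flight)         # data points per flight
print(points_per_day)            # data points per day
print(bytes_per_day / 1e12)      # terabytes per day
print(round(writes_per_second))  # average writes per second
```

The average write rate comes out slightly under 21,000 per second, which matches the rounded QPS figure on the slide.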
Page 24
Managing IoT data is hard …
Page 25
LET MongoDB POWER YOUR NEXT IoT SOLUTION
Page 26
Nexus Architecture
Relational: Expressive Query Language, Strong Consistency, Secondary Indexes
NoSQL: Flexibility, Scalability, Performance
Page 27
AGILITY
SCALABILITY
Page 28
AGILITY
Start coding now, without a month-long ER design.
Change the schema as you go, without penalty.
A flexible schema models variable structures with ease.
Page 29
DATA MODEL
location: (-84.2391, 34.1039)
speed: 750
engine:
  fuel_level: 100, temperature: 88.48
1 Variable data structure
2 Sparse Indexes
3 Dynamic Schema
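The idea behind combining variable document shapes with a sparse index can be sketched in plain Python. This is a toy emulation, not MongoDB's implementation: the two sample documents, the `plane_id` value, and the helper names are hypothetical, and the toy treats a missing field and an explicit null the same way.

```python
# Two readings for the same plane with different shapes (hypothetical data).
docs = [
    {"plane_id": "3209", "location": [-84.2391, 34.1039], "speed": 750},
    {"plane_id": "3209", "engine": {"fuel_level": 100, "temperature": 88.48}},
]

def get_path(doc, path):
    """Follow a dotted path such as 'engine.temperature'; None if absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

def build_sparse_index(documents, path):
    """Map each value at `path` to the positions of the documents holding it.

    Documents that lack the field get no index entry at all, which is the
    property that keeps a sparse index small when most documents omit it.
    """
    index = {}
    for position, doc in enumerate(documents):
        value = get_path(doc, path)
        if value is not None:
            index.setdefault(value, []).append(position)
    return index

temperature_index = build_sparse_index(docs, "engine.temperature")
speed_index = build_sparse_index(docs, "speed")
print(temperature_index)
print(speed_index)
```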
Page 30
QUERY EXAMPLE
Find all planes within 20 km of New York
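The query itself did not survive in the slide text. One plausible form, sketched here as a Python dictionary, uses MongoDB's `$near` operator with a GeoJSON point and `$maxDistance` in meters. The field name `location`, the collection name `planes`, and the New York coordinates are assumptions, and the operator requires a `2dsphere` index on the field.

```python
# Approximate city-center coordinates in GeoJSON order (longitude, latitude).
NEW_YORK = (-74.0060, 40.7128)

def planes_near(lon, lat, max_km):
    """Build a $near filter on a GeoJSON 'location' field (assumed name)."""
    return {
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": max_km * 1000,  # $maxDistance is in meters
            }
        }
    }

query = planes_near(*NEW_YORK, max_km=20)
print(query)

# With pymongo this filter could be passed to a find() call, for example
# db.planes.find(query), where "planes" is a hypothetical collection that
# has a 2dsphere index on "location".
```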
Page 31
OPTIMIZE
With document model
"A time series is a sequence of data points, typically consisting of successive measurements made over a time interval. Examples of time series are ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average."
-- Wikipedia
Page 32
BUCKETING OPTIMIZATION of TIME SERIES DATA
{ plane_id: "3209", ts: ISODate("2014-07-03T16:00:00.000Z"), metrics: { engine_fuel: 99 } },
{ plane_id: "3209", ts: ISODate("2014-07-03T16:01:00.000Z"), metrics: { engine_fuel: 98.5 } },
{ plane_id: "3209", ts: ISODate("2014-07-03T16:02:00.000Z"), metrics: { engine_fuel: 98 } }
...
{ plane_id: "3209", ts: ISODate("2014-07-03T16:59:00.000Z"), metrics: { engine_fuel: 69 } }
60:1
{ plane_id: "3209",
  hour: ISODate("2014-07-03T16:00:00.000Z"),
  metrics: {
    engine_fuel: {
      "0": 99,
      "1": 98.5,
      "2": 98,
      ...
      "59": 69
    },
    avg: 81.4
  }
}
• Fewer docs: space savings
• Write performance: fewer index entries
• Queryable, with better analytics support
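The bucketing transformation can be sketched in plain Python: per-minute readings are folded into one document per plane and hour, keyed by minute. The function name `bucket_by_hour` is hypothetical, and the average is computed here over whatever minutes are present, which is only one way the `avg` field might be maintained.

```python
from datetime import datetime, timezone

def bucket_by_hour(readings):
    """Fold per-minute readings into one document per (plane_id, hour)."""
    buckets = {}
    for reading in readings:
        hour = reading["ts"].replace(minute=0, second=0, microsecond=0)
        key = (reading["plane_id"], hour)
        doc = buckets.setdefault(key, {
            "plane_id": reading["plane_id"],
            "hour": hour,
            "metrics": {"engine_fuel": {}},
        })
        # The minute within the hour becomes the sub-document key ("0".."59").
        minute_key = str(reading["ts"].minute)
        doc["metrics"]["engine_fuel"][minute_key] = reading["metrics"]["engine_fuel"]
    for doc in buckets.values():
        values = list(doc["metrics"]["engine_fuel"].values())
        doc["metrics"]["avg"] = sum(values) / len(values)
    return list(buckets.values())

# First three per-minute documents from the slide.
readings = [
    {"plane_id": "3209",
     "ts": datetime(2014, 7, 3, 16, minute, tzinfo=timezone.utc),
     "metrics": {"engine_fuel": fuel}}
    for minute, fuel in [(0, 99), (1, 98.5), (2, 98)]
]

hourly_docs = bucket_by_hour(readings)
print(hourly_docs)
```

In a live system the same shape is usually built incrementally, with one upsert per reading that sets a single minute key, rather than by batch conversion as shown here.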
Page 33
SCALABILITY
Shared-nothing architecture that scales horizontally and linearly
Auto-balancing ensures a balanced cluster
Page 34
SHARDED CLUSTER
(Diagram: sharded cluster with three config servers)
Page 35
CHOOSING A SHARD KEY FOR SENSORS
Cardinality: LARGE
Write distribution: EVEN
Query isolation: ISOLATED
Page 36
CHOOSING A SHARD KEY
Shard key    | Cardinality | Write distribution | Query isolation | Reliability          | Index locality
_id          | Doc level   | One shard          | Scatter/gather  | All users affected   | Good
hash(_id)    | Hash level  | All shards         | Scatter/gather  | All users affected   | Poor
asset_id     | Many docs   | All shards         | Targeted        | Some assets affected | Good
asset_id, ts | Doc level   | All shards         | Targeted        | Some assets affected | Good
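The write-distribution and query-isolation columns can be illustrated with a toy simulation. This is a simplified model under stated assumptions, not MongoDB's chunking or hashing: four shards, fixed split points, an MD5-based stand-in for the hashed key, and a hypothetical workload of 1,000 new events spread over 100 assets.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4

def range_shard(key, split_points):
    """Return the index of the first range whose upper bound exceeds key."""
    for shard, upper_bound in enumerate(split_points):
        if key < upper_bound:
            return shard
    return len(split_points)  # key lies beyond the last split point

def hash_shard(key):
    """Toy hash partitioning; a stand-in, not MongoDB's hashed-index function."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Hypothetical workload: increasing _id values, round-robin over 100 assets.
events = [{"_id": 1000 + i, "asset_id": i % 100} for i in range(1000)]

# Split points come from older data, so every new _id lands past the last one.
by_id = Counter(range_shard(e["_id"], [250, 500, 750]) for e in events)
by_hashed_id = Counter(hash_shard(e["_id"]) for e in events)
by_asset = Counter(range_shard(e["asset_id"], [25, 50, 75]) for e in events)

# Shards that a query for a single asset has to visit under the asset_id key.
shards_for_asset_42 = {
    range_shard(e["asset_id"], [25, 50, 75])
    for e in events if e["asset_id"] == 42
}

print("ranged _id:", dict(by_id))
print("hashed _id:", dict(by_hashed_id))
print("asset_id:  ", dict(by_asset))
print("shards holding asset 42:", shards_for_asset_42)
```

In this model the monotonically increasing `_id` sends every new write to one shard, the hashed key spreads writes but scatters each asset's data, and the asset-based key spreads writes while keeping each asset on a single shard.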
Page 37
Page 42
Production in 16 grams (e.g. under a beer keg)
Page 43
WE CAN HELP
MongoDB Enterprise Advanced
The best way to run MongoDB in your data center
MongoDB Ops Manager
The easiest way to run MongoDB in your data center
Production Support
In production and under control
Development Support
Let’s get you running
Consulting
We solve problems
Training
Get your teams up to speed.