第1页
NoSQL and Postgres
Presented by: | 2014-10-31
© 2014 EnterpriseDB Corporation. All rights reserved.
第2页
The NoSQL Conundrum
Why do developers like NoSQL only solutions?
• Easy integration with Web 2.0 development style
− JSON is mapped all the way from the web page to the database
• Little/no knowledge of SQL required • Who needs data normalization? • “I don’t need no DBA” • The data model is developed as the application matures
– schema-less works well with Agile and Sprints • Get up and running really quickly
© 2014 EnterpriseDB Corporation. All rights reserved.
第3页
The NoSQL Conundrum
Problems and fallacies of NoSQL (only)
• Data silos
− Data standards − Data islands − Data access
• Technology silos
− New database technology − Backup, recovery, replication, tuning, maintenance
• Lack of integration with corporate data standards
• Data models include data access paths
− Customers > Orders or Orders > Customers
© 2014 EnterpriseDB Corporation. All rights reserved.
第4页
The NoSQL Conundrum
Data Standards
• Structures and standards emerge!
• Data has references (products link to catalogs; products have bills of material; components appear in multiple products; storage locations link to ISO country tables)
• When the database has duplicate data entries, the application has to manage updates in multiple places – what happens when there is no ACID transactional model?
© 2014 EnterpriseDB Corporation. All rights reserved.
第5页
The NoSQL Conundrum
Data Islands
• Solutions exist in context
• Applications must be integrated
− Orders flow to the warehouse − Invoices go to AP − Customer data must tie into the CRM
• Data must be managed
“By 2017, 50% of data stored in NoSQL DBMSs will be damaging to the business due to a lack of applied information governance policies and programs.”
Dec. 3, 2013 Research Note “Does Your NoSQL DMBS Result in Information Governance Debt?” by Nick Heudecker and Ted Friedman.
© 2014 EnterpriseDB Corporation. All rights reserved.
第6页
The NoSQL Conundrum
Data models include data access paths
• Document-based models encourage hierarchical data models
• Fast and easy access for the intended use case
− What did this customer order? − When did they order? − How much did they order?
• Really problematic when the use case changes
− Which customers ordered Widget A in January?
© 2014 EnterpriseDB Corporation. All rights reserved.
第7页
How is NoSQL Possible in Postgres?
• PostgreSQL is an Object-Relational Database
− O-R development foundation completed in 1994 − Built upon classes and inheritance − Community use of O-R foundation continues today
• ORDBMS properties
− Highly extensible architecture (also easy) − Extensible data types, operators, functions, casts, aggregates,
indexes, and languages − New features are easily added to the original DB feature set − New features behave and perform as well as the original
feature set
© 2014 EnterpriseDB Corporation. All rights reserved.
第8页
How is NoSQL Possible in Postgres?
ORDBMS extensions…
• PostGIS April 2005
− Geospatial extension − Considered one of the best in the world for geospatial
applications
• HSTORE December 2006
− Key-Value store − Excellent for handling rows with sparse data collections
• JSON/JSONB September 2012
− NoSQL Document data store − Full support for JSON document functions and operators − Embed JSON in your SQL − Easy bi-directional interchange of JSON and relational data
© 2014 EnterpriseDB Corporation. All rights reserved.
第9页
Postgres’ NoSQL Capabilities
N Postgres
S Q L
• HSTORE
− Key-value pair − Simple, fast and easy − Postgres v 8.2 – pre-dates many NoSQL-only solutions − Ideal for flat data structures that are sparsely populated
• JSON
− Hierarchical document model − Introduced in Postgres 9.2, perfected in 9.3
• JSONB
− Binary version of JSON − Faster, more operators and even more robust − Postgres 9.4
© 2014 EnterpriseDB Corporation. All rights reserved.
第10页
Postgres: Key-value Store
• Supported since 2006, the HStore contrib module enables storing key/value pairs within a single column
• Allows you to create a schema-less, ACID compliant data store within Postgres
• Create single HStore column and include, for each row, only those keys which pertain to the record
• Add attributes to a table and query without advance planning
•Combines flexibility with ACID compliance
© 2014 EnterpriseDB Corporation. All rights reserved.
第11页
HSTORE Examples
• Create a table with HSTORE field
CREATE TABLE hstore_data (data HSTORE);
• Insert a record into hstore_data
INSERT INTO hstore_data (data) VALUES (’ "cost"=>"500", "product"=>"iphone", "provider"=>"apple"');
• Select data from hstore_data
SELECT data FROM hstore_data ; -----------------------------------------"cost"=>"500”,"product"=>"iphone”,"provider"=>"Apple" (1 row)
© 2014 EnterpriseDB Corporation. All rights reserved.
第12页
Postgres: Document Store
• JSON is the most popular data-interchange format on the web
• Derived from the ECMAScript Programming Language Standard (European Computer Manufacturers Association).
• Supported by virtually every programming language
• New supporting technologies continue to expand JSON’s utility
− PL/V8 JavaScript extension − Node.js
• Postgres has a native JSON data type (v9.2) and a JSON parser and a variety of JSON functions (v9.3)
• Postgres will have a JSONB data type with binary storage and indexing (coming – v9.4)
© 2014 EnterpriseDB Corporation. All rights reserved.
第13页
JSON Examples
• Creating a table with a JSONB field
CREATE TABLE json_data (data JSONB);
• Simple JSON data element:
{"name": "Apple Phone", "type": "phone", "brand": "ACME", "price": 200, "available": true, "warranty_years": 1}
• Inserting this data element into the table json_data
INSERT INTO json_data (data) VALUES (’ { "name": "Apple Phone", "type": "phone", "brand": "ACME", "price": 200, "available": true, "warranty_years": 1 } ')
© 2014 EnterpriseDB Corporation. All rights reserved.
第14页
JSON Examples
• JSON data element with nesting:
{“full name”: “John Joseph Carl Salinger”, “names”:
[ {"type": "firstname", “value”: ”John”}, {“type”: “middlename”, “value”: “Joseph”}, {“type”: “middlename”, “value”: “Carl”}, {“type”: “lastname”, “value”: “Salinger”} ] }
© 2014 EnterpriseDB Corporation. All rights reserved.
第15页
A simple query for JSON data
SELECT DISTINCT data->>'name' as products
FROM json_data;
products -----------------------------Cable TV Basic Service Package AC3 Case Black Phone Service Basic Plan AC3 Phone AC3 Case Green Phone Service Family Plan AC3 Case Red AC7 Phone
This query does not return JSON data – it returns text values associated with the key ‘name’
© 2014 EnterpriseDB Corporation. All rights reserved.
第16页
A query that returns JSON data
SELECT data FROM json_data; data -----------------------------------------{"name": "Apple Phone", "type": "phone", "brand": "ACME", "price": 200, "available": true, "warranty_years": 1}
This query returns the JSON data in its original format
© 2014 EnterpriseDB Corporation. All rights reserved.
第17页
JSON and ANSI SQL – A Great Fit
• JSON is naturally integrated with ANSI SQL in Postgres
• JSON and HSTORE are elegant and easy to use extensions of the underlying objectrelational model
• JSON and SQL queries use the same language, the same planner, and the same ACID compliant transaction framework
© 2014 EnterpriseDB Corporation. All rights reserved.
第18页
JSON and ANSI SQL Example
ANSI SQL
SELECT DISTINCT product_type,
JSON
data->>'brand' as Brand, data->>'available' as Availability FROM json_data
JOIN products ON (products.product_type=json_data.data->>'name') WHERE json_data.data->>'available'=true;
product_type
| brand | availability
---------------------------+-----------+--------------
AC3 Phone
| ACME | true
No need for programmatic logic to combine SQL and NoSQL in the application – Postgres does it all
© 2014 EnterpriseDB Corporation. All rights reserved.
第19页
Bridging between SQL and JSON
Simple ANSI SQL Table Definition
CREATE TABLE products (id integer, product_name text );
Select query returning standard data set SELECT * FROM products;
id | product_name ----+--------------
1 | iPhone 2 | Samsung 3 | Nokia
Select query returning the same result as a JSON data set SELECT ROW_TO_JSON(products) FROM products;
{"id":1,"product_name":"iPhone"} {"id":2,"product_name":"Samsung"} {"id":3,"product_name":"Nokia”}
© 2014 EnterpriseDB Corporation. All rights reserved.
第20页
JSON Data Types
• 1. Number:
− Signed decimal number that may contain a fractional part and may use exponential notation.
− No distinction between integer and floating-point
• 2. String
− A sequence of zero or more Unicode characters. − Strings are delimited with double-quotation mark − Supports a backslash escaping syntax.
• 3. Boolean
− Either of the values true or false.
JSON is defined per RFC – 7159 For more detail please refer http://tools.ietf.org/html/rfc7159
• 4. Array
− An ordered list of zero or more values, − Each values may be of any type. − Arrays use square bracket notation with elements being comma-separated.
• 5. Object
− An unordered associative array (name/value pairs). − Objects are delimited with curly brackets − Commas to separate each pair − Each pair the colon ':' character separates the key or name from its value. − All keys must be strings and should be distinct from each other within that object.
• 6. null
− An empty value, using the word null
© 2014 EnterpriseDB Corporation. All rights reserved.
第21页
JSON Data Type Example
{
"firstName": "John",
-- String Type
"lastName": "Smith",
-- String Type
"isAlive": true,
-- Boolean Type
"age": 25,
-- Number Type
"height_cm": 167.6,
-- Number Type
"address": {
-- Object Type
"streetAddress": "21 2nd Street”,
"city": "New York”,
"state": "NY”,
"postalCode": "10021-3100”
},
"phoneNumbers": [
// Object Array
{ // Object
"type": "home”,
"number": "212 555-1234”
},
{
"type": "office”,
"number": "646 555-4567”
}
],
"children": [],
"spouse": null
// Null
}
© 2014 EnterpriseDB Corporation. All rights reserved.
第22页
JSON 9.4 – New Operators and Functions
• JSON
− New JSON creation functions (json_build_object, json_build_array) − json_typeof – returns text data type (‘number’, ‘boolean’, …)
• JSONB data type
− Canonical representation
− Whitespace and punctuation dissolved away − Only one value per object key is kept − Last insert wins − Key order determined by length, then bytewise comparison
− Equality, containment and key/element presence tests − New JSONB creation functions − Smaller, faster GIN indexes − jsonb subdocument indexes
− Use “get” operators to construct expression indexes on subdocument:
− CREATE INDEX author_index ON books USING GIN ((jsondata -> 'authors'));
− SELECT * FROM books WHERE jsondata -> 'authors' ? 'Carl Bernstein'
© 2014 EnterpriseDB Corporation. All rights reserved.
第23页
JSON vs BSON
• BSON – stands for ‘Binary JSON’
• BSON != JSONB
− BSON cannot represent an integer or floating-point number with more than 64 bits of precision.
− JSONB can represent arbitrary JSON values.
• Caveat Emptor!
− This limitation will not be obvious during early stages of a project!
© 2014 EnterpriseDB Corporation. All rights reserved.
第24页
JSON, JSONB or HSTORE?
• JSON/JSONB is more versatile than HSTORE • HSTORE provides more structure • JSON or JSONB?
− if you need any of the following, use JSON
Storage of validated json, without processing or indexing it Preservation of white space in json text Preservation of object key order Preservation of duplicate object
keys Maximum input/output speed
• For any other case, use JSONB
© 2014 EnterpriseDB Corporation. All rights reserved.
第25页
JSONB and Node.js - Easy as
• Simple Demo of Node.js to Postgres cnnection
© 2014 EnterpriseDB Corporation. All rights reserved.
第26页
JSON Performance Evaluation
• Goal
− Help our customers understand when to chose Postgres and when to chose a specialty solution
− Help us understand where the NoSQL limits of Postgres are
• Setup
− Compare Postgres 9.4 to Mongo 2.6 − Single instance setup on AWS M3.2XLARGE (32GB)
• Test Focus
− Data ingestion (bulk and individual) − Data retrieval
© 2014 EnterpriseDB Corporation. All rights reserved.
第27页
Performance Evaluation
Generate 50 Million JSON Documents
Load into Postgres 9.4
(COPY)
Load into MongoDB 2.6 (IMPORT)
T1
50 Million individual INSERT commands
50 Million individual INSERT commands
T2
Multiple SELECT statements
© 2014 EnterpriseDB Corporation. All rights reserved.
Multiple SELECT statements
T3
第28页
NoSQL Performance Evaluation
© 2014 EnterpriseDB Corporation. All rights reserved.
Correction to earlier versions: MongoDB console does not allow for INSERT of documents > 4K. This lead to truncation of the MongoDB size by approx. 25% of all records in the benchmark.
第29页
Foreign Data Wrappers – Co-Existence Platform
• FDW implements SQL/MED ("SQL Management of External Data")
• PostgreSQL 9.1 - read-only support
• PostgreSQL 9.3 – read/write support
• FDW
− Makes data on other servers (or services) look like tables in Postgres
− available for databases (MongoDB, MySQL, Oracle, …), files, services (Twitter, …)
− coming soon: HDFS FDW (Q1’15)
• MongoDB FDW: https://github.com/EnterpriseDB
© 2014 EnterpriseDB Corporation. All rights reserved.
第30页
MongoDB FDW Example
CREATE EXTENSION mongo_fdw;
CREATE SERVER mongo_server FOREIGN DATA WRAPPER mongo_fdw OPTIONS (address '172.24.39.129', port '27017');
CREATE USER MAPPING FOR enterprisedb SERVER mongo_server OPTIONS (username 'mongo', password 'mongo');
CREATE FOREIGN TABLE mongo_data( name text, brand text, type text) SERVER mongo_server OPTIONS ( database 'benchmark', collection 'json_tables');
© 2014 EnterpriseDB Corporation. All rights reserved.
第31页
MongoDB FDW Example
SELECT * FROM mongo_data WHERE brand='ACME' limit 10;
name | brand | type -------------+-------+------AC7553 Phone | ACME | phone AC7551 Phone | ACME | phone AC7519 Phone | ACME | phone AC7565 Phone | ACME | phone AC7555 Phone | ACME | phone AC7529 Phone | ACME | phone AC7528 Phone | ACME | phone AC7547 Phone | ACME | phone AC7587 Phone | ACME | phone AC7541 Phone | ACME | phone
(10 rows)
© 2014 EnterpriseDB Corporation. All rights reserved.
第32页
MongoDB FDW Example
INSERT INTO mongo_data(name, brand, type) VALUES('iphone6 phone','Apple Inc','phone');
SELECT* FROM mongo_data WHERE brand='Apple Inc'; _id | name | brand | type
--------------------------+----------------+-----------+------53ea4f59fe5586a15714881d | iphone6 phone | Apple Inc | phone
UPDATE mongo_data SET brand='Apple Product' WHERE brand='Apple Inc’;
SELECT * FROM mongo_data WHERE brand='Apple Product’; _id | name | brand | type
--------------------------+----------------+---------------+------53ea4f59fe5586a15714881d | iphone6 phone | Apple Product | phone
© 2014 EnterpriseDB Corporation. All rights reserved.
第33页
Structured or Unstructured?
• “No SQL Only” or “Not Only SQL”?
• Structures and standards do emerge!
• Data has references (products link to catalogues; products have bills of material; components appear in multiple products; storage locations link to ISO country tables)
• When the database has duplicate data entries, then the application has to manage updates in multiple places – what happens when there is no ACID transactional model?
© 2014 EnterpriseDB Corporation. All rights reserved.
第34页
Say yes to ‘Not only SQL’
SQL + NoSQL =
Not Only SQL
• Postgres overcomes many of the standard objections “It can’t be done with a conventional database system”
• Postgres
− Combines structured data and unstructured data (ANSI SQL and JSON/HSTORE)
− Is faster (for many workloads) than than the leading NoSQLonly solution
− Integrates easily with Web 2.0 application development environments
− Can be deployed on-premise or in the cloud
Do more with Postgres – the Enterprise NoSQL Solution
© 2014 EnterpriseDB Corporation. All rights reserved.
第35页
360° view of the customer
A Customer and Order Example
• The traditional way:
− Landing, Staging, Normalizing
• The NoSQL Way:
− Use the schema-less data model to maintain the original data formats for customers and orders
− Align orders with customers in one large collection of customer order documents
• The Not Only SQL Way:
− Use the schema-less data model to maintain the original data formats for customers and orders
− Link customers and orders with a flexible N:M relationship
© 2014 EnterpriseDB Corporation. All rights reserved.
第36页
360° view of the customer
The traditional approach
CRCMRM
MaMnOaanrOgdaergedmreemernetnt
Billing eCommerce
Marketing Marketing
Marketing
Customer Customer
Custome r
Customer
Order
Order Order Order
Source Systems
© 2014 EnterpriseDB Corporation. All rights reserved.
Landing 36
Customer Order
Staging
Integrated Data Model
第37页
360° view of the customer
The NoSQL Only Approach
CRM
CRM Order ManaOgerdmeernt Management
Billing
Customer Collection Customer Information Customer CustomerCustome
r Customer
Order Information
eCommerce Marketing Marketing
Marketing
Order
Order Order Order
• Document-oriented approach
• Very fast implementation
• Differences in source data can be maintained
• Orders belong to customers (sub documents) or viceversa
• Hard codes the data access model!
• Limited reuse!
© 2014 EnterpriseDB Corporation. All rights reserved.
第38页
360° view of the customer
The Not Only SQL Approach
CRM
CRM
• Two types of documents (Customers and Orders) • Very fast implementation • Differences in source data can be maintained
Order ManaOgerdmeernt
Management
Billing
eCommerce
Customer Information
Custome
Customer r
Customer
Customer
Order Information
N:M
Order
Order
Order
Order
Marketing Marketing
Marketing
Source Systems
© 2014 EnterpriseDB Corporation. All rights reserved.
• No hard coded relationships
• Multiple access methods
• Reuse for multiple use cases
第39页
Data Integration – The Not Only SQL Way
• Use JSON schema-less data where appropriate (customers and orders)
• Leverage SQL structures and relationships where possible (who bought what when)
• Design for reuse – do not encode the initial use case into the data model
• Create highly reusable data models
• Build on well-established technology platforms that integrate with your IT environment and that support your data strategy
© 2014 EnterpriseDB Corporation. All rights reserved.
第40页
Not Only SQL Rules!
• Use structures and relationships where appropriate, be flexible everywhere else
− Use JSON for data with high degrees of variability − Use the flexibility of JSON to bridge multiple formats and data
elements − Use SQL to make relationships explicit – don’t hard code them
• Leverage a single tech platform to avoid tech silos, skills silos and data silos
• Focus on creating value-add, not on setting up yet another infrastructure
• Build on the most advanced open source DBMS’s Capabilities
Do more with Postgres!
© 2014 EnterpriseDB Corporation. All rights reserved.
第41页
Doing More with Postgres…
Open source alternative to commercial RDBMS
•Reduce cost •Leverage in-house talent •Flexible license model
How to do more with Postgres
© 2014 EnterpriseDB Corporation. All rights reserved.
RDBMS platform for new developments
•Proven RDBMS •SQL compliant •Extremely stable
Innovative DBMS Platform
•Not only SQL (SQL + JSON/KVP) •Web 2.0 friendly •Foreign Data Wrappers •PostGIS
第42页
Useful Resources
• Whitepapers @ http://www.enterprisedb.com/nosql-for-enterprise
− PostgreSQL Advances to Meet NoSQL Challenges (business oriented)
− Using the NoSQL Capabilities in Postgres (full of code examples)
• Run the NoSQL benchmark
− https://github.com/EnterpriseDB/pg_nosql_benchmark
• Now Live: PG XDK for Postgres NoSQL App Development
− http://blogs.enterprisedb.com/2014/10/14/now-live-pg-xdk-forpostgres-nosql-app-development/
© 2014 EnterpriseDB Corporation. All rights reserved.
第43页
© 2014 EnterpriseDB Corporation. All rights reserved.