Introduction to MongoDB, Java Edition: Strongly Typed Languages and Flexible Schemas

Published by: mongoconf. Description: MongoDB Beijing 2014 Conference

Published: 1439254175894 (epoch milliseconds). Views: 10198. Keywords: MongoDB, Java, NoSQL, English

Slide 1

Slide 2

Strongly Typed Languages and Flexible Schemas

Slide 3

Agenda

Strongly Typed Languages
Flexible Schema Databases
Change Management
Strategies
Tradeoffs

Slide 4

Strongly Typed Languages

Slide 5

"A programming language that requires a variable to be defined, as well as the type of variable it is"

Do not confuse strongly typed with statically typed languages: the two properties are different.
Definitions of these terms, and of how languages are categorized by them, vary from source to source.
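As a minimal sketch of the distinction, assuming Java as the example language (the class name `TypingDemo` is hypothetical and not from the slides): the compiler rejects a mismatched assignment, which is static typing, and the runtime refuses to reinterpret a value as an unrelated type, which is strong typing.

```java
class TypingDemo {
    // Returns true when the runtime rejects treating a String as an Integer.
    static boolean castIsRejected() {
        Object value = "174";              // static type Object, runtime type String
        try {
            Integer id = (Integer) value;  // compiles, but fails at run time
            return id == null;             // not reached for a String value
        } catch (ClassCastException e) {
            return true;                   // no silent reinterpretation of the value
        }
    }

    public static void main(String[] args) {
        // int id = "174";  // static typing: this line would not compile
        System.out.println(castIsRejected());
    }
}
```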

Slide 6

Flexible Schema Databases

Slide 7

Traditional RDBMS

create table users (id int, firstname text, lastname text);

Table definition

Column structure

Definition

Slide 8

Traditional RDBMS

Table with checks

create table cat_pictures(
    id int not null,
    size int not null,
    picture blob not null,
    user_id int,
    primary key (id),
    foreign key (user_id) references users(id));

Null checks

Foreign and Primary key checks

Definition

Slide 9

Traditional RDBMS

users

cat_pictures

Once you have this structure you can now start building up your application

Slide 10

Is this Flexible?

What happens when we need to change the schema?
Add new fields
Add new relations
Change data types
What happens when we need to scale out our data structure?

Slide 11

Flexible Schema Database

Document

Graph

Key Value

There are a few examples of flexible schema databases:
Document oriented databases
Graph databases
Key Value Stores

Slide 12

Flexible Schema

No mandatory schema definition
No structure restrictions
No schema validation process

No mandatory schema definition:
If a collection does not exist, one will be created
If a database does not exist, one will be created
No structure restrictions:
No forced fields or data types

Slide 13

We start from code

public class CatPicture {

int size;
byte[] blob;

}

public class User {

int id;
String firstname;
String lastname;

CatPicture[] cat_pictures;

}

Definition

Slide 14

Document Structure

{
  _id: 1234,
  firstname: 'Juan',
  lastname: 'Olivo',
  cat_pictures: [ {
      size: 10,
      picture: BinData("0x133334299399299432"),
    }
  ]
}

Rich Data Types

Embedded Documents

Definition
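To make the step from the classes to the document structure concrete, here is a minimal sketch that uses only the JDK and represents a document as a plain `Map`. The helper class `UserDocuments`, the `CatPicture` constructor, and the choice to store `blob` under the key `picture` are assumptions for illustration; a real driver or ODM would do this mapping for you.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CatPicture {
    int size;
    byte[] blob;
    CatPicture(int size, byte[] blob) { this.size = size; this.blob = blob; }
}

class User {
    int id;
    String firstname;
    String lastname;
    CatPicture[] cat_pictures;
}

class UserDocuments {
    // Builds a document-shaped map: the object's id becomes _id and
    // each CatPicture becomes an embedded document inside an array.
    static Map<String, Object> toDocument(User user) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", user.id);
        doc.put("firstname", user.firstname);
        doc.put("lastname", user.lastname);
        List<Map<String, Object>> pictures = new ArrayList<>();
        for (CatPicture p : user.cat_pictures) {
            Map<String, Object> embedded = new LinkedHashMap<>();
            embedded.put("size", p.size);
            embedded.put("picture", p.blob);
            pictures.add(embedded);
        }
        doc.put("cat_pictures", pictures);
        return doc;
    }
}
```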

Slide 15

Flexible Schema Databases

Challenges
Different Versions of Documents
Different Structures of Documents
Different Value Types for Fields in Documents

Slide 16

Different Versions of Documents

The same document changes over time in how it represents data

First Version

{ "_id" : 174, "firstname": "Juan" }

Second Version

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }

Third Version

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" , "cat_pictures":
[{"size": 10, picture: BinData("0x133334299399299432")}]
}

Slide 17

Different Versions of Documents

The same document changes over time in how it represents data

{ "_id" : 174, "firstname": "Juan" }

{ "_id" : 174, "name": { "first": "Juan", "last": "Olivo"} }

Different Structure
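Application code that reads such a collection has to accept both shapes. A minimal sketch, assuming documents arrive as plain `Map` objects; the class `NameReader` is hypothetical and only illustrates the idea.

```java
import java.util.Map;

class NameReader {
    // Reads the first name from either shape:
    // { "firstname": ... } or { "name": { "first": ... } }.
    static String firstName(Map<String, Object> doc) {
        Object nested = doc.get("name");
        if (nested instanceof Map) {
            Object first = ((Map<?, ?>) nested).get("first");
            return first == null ? null : first.toString();
        }
        Object flat = doc.get("firstname");
        return flat == null ? null : flat.toString();
    }
}
```

Keeping this logic in one place means the rest of the code sees a single `String`, whichever structure was stored.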

Slide 18

Different Structures of Documents

Different documents coexisting in the same collection

{ "_id" : 175, "brand": "Ford", "model": "Mustang", "date": ISODate("XXX") }

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }

Within the same collection

Slide 19

Different Data Types for Fields

Different documents coexisting in the same collection

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "bdate": 1224234312}

{ "_id" : 175, "firstname": "Paco", "lastname": "Hernan", "bdate": "2015-06-27"}

{ "_id" : 176, "firstname": "Tomas", "lastname": "Marce", "bdate": ISODate("2015-06-27")}

Same field, different data type
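One way to cope with this in a strongly typed language is to normalize the value when it is read. A minimal sketch using only the JDK; the class `BirthDates` is hypothetical, and reading the numeric form as epoch seconds is an assumption that the slides do not confirm.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;

class BirthDates {
    // Converts the three representations seen above into one LocalDate (UTC).
    // Assumption: numeric values are epoch seconds; adjust if they are milliseconds.
    static LocalDate normalize(Object bdate) {
        if (bdate instanceof Number) {
            long seconds = ((Number) bdate).longValue();
            return Instant.ofEpochSecond(seconds).atZone(ZoneOffset.UTC).toLocalDate();
        }
        if (bdate instanceof String) {
            return LocalDate.parse((String) bdate);   // expects yyyy-MM-dd
        }
        if (bdate instanceof Date) {
            return ((Date) bdate).toInstant().atZone(ZoneOffset.UTC).toLocalDate();
        }
        throw new IllegalArgumentException("Unsupported bdate type: " + bdate);
    }
}
```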

Slide 20

Change Management

Slide 21

Change Management

Slide 22

Strategies

Slide 23

Strategies

Decoupled Architectures
ODMs
Versioning
Data Migrations

Slide 24

Decoupled Architectures

Slide 25

Strongly Coupled

Slide 26

Becomes a mess in your hair…

Slide 27

Coupled Architectures

Database

Application A

Application C

Application B

Let me perform some schema changes!

Slide 28

Decoupled Architecture

Database

Application A

API

Application C

Application B

Slide 29

Decoupled Architectures

Allows the business logic to evolve independently of the data layer
Decouples the underlying storage / persistence option from the business service
Changes are "requested" and not imposed across all applications
Better version control of each request and its mapping
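The points above can be sketched as an interface that applications call instead of touching the database directly. Everything here is hypothetical (`UserApi`, `MapBackedUserApi`, and the in-memory map standing in for the database); it only shows where the storage detail is hidden.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The contract applications depend on; storage details stay behind it.
interface UserApi {
    Optional<String> firstName(int userId);
}

// One possible backing implementation. An in-memory map stands in for the database.
class MapBackedUserApi implements UserApi {
    private final Map<Integer, Map<String, Object>> storage = new HashMap<>();

    void store(int userId, Map<String, Object> doc) {
        storage.put(userId, doc);
    }

    @Override
    public Optional<String> firstName(int userId) {
        Map<String, Object> doc = storage.get(userId);
        if (doc == null) {
            return Optional.empty();
        }
        // The field name is a storage detail that callers never see.
        return Optional.ofNullable((String) doc.get("firstname"));
    }
}
```

If the stored field were later renamed, only the implementation class would change; Applications A, B and C would keep calling `firstName`.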

Slide 30

ODMs

Slide 31

ODM

Reduces the impedance mismatch between code and databases
Facilitates data management
Hides the complexity of operators
Tries to decouple business complexity with "magic" recipes

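Slide 32

Spring Data

POJO-centric model
MongoTemplate or CrudRepository extensions connect the application to the repositories
Uses annotations to override default field names and even data types (data type mapping)

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Field;
import org.springframework.data.mongodb.repository.MongoRepository;

// Repository for User documents, keyed by the Integer id
public interface UserRepository extends MongoRepository<User, Integer> {

}

public class User {

    @Id
    int id;

    // Stored under the key "first_name" rather than "firstname"
    @Field("first_name")
    String firstname;
    String lastname;
    // ... remaining fields
}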

Slide 33

Spring Data Document Structure

{
  "_id": 1,
  "first_name": "first",
  "lastname": "last",
  "catpictures": [
    {
      "size": 10,
      "blob": BinData(0, "Kr3AqmvV1R9TJQ==")
    },
  ]
}

Definition

Slide 34

Spring Data Considerations

Data formats, versions and types still need to be managed
Does not solve issues like type validation out of the box
Can make things more complicated but more "controllable"

@Field("first_name")
String firstname;

Slide 35

Morphia

Data source centric
Discovers all POJOs in a given package
Also uses annotations to perform overrides and deal with object mapping

@Entity("users")
public class User {
    @Id
    int id;
    String firstname;
    String lastname;
    // ... remaining fields
}

// Morphia 1.x API (package org.mongodb.morphia)
Morphia morphia = new Morphia();
// Map every annotated class found in the package
morphia.mapPackage("examples.odms.morphia.pojos");

Datastore datastore = morphia.createDatastore(new MongoClient(), "morphia_example");
// user is a populated User instance
datastore.save(user);

Slide 36

Morphia Document Structure

{
  "_id": 1,
  "className": "examples.odms.morphia.pojos.User",
  "firstname": "first",
  "lastname": "last",
  "catpictures": [
    {
      "size": 10,
      "blob": BinData(0, "Kr3AqmvV1R9TJQ==")
    },
  ]
}

Class Definition

Definition

Slide 37

Morphia Considerations

Enables better control at class loading
Like Spring Data, also supports field overriding (annotations define the field keys)
Better support for object polymorphism

Slide 38

Versioning

Slide 39

Versioning

Versioning of data structures (especially documents) can be very helpful

You must correctly generate the new version number in a multithreaded system
You must return only the current version of each document when there is a query
You must "update" correctly by including all current attributes in addition to newly provided attributes
If the system fails at any point, you must either have a consistent state of the data, or it must be possible on restart to infer the state of the data and clean it up, or otherwise bring it to a consistent state.
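The first requirement can be illustrated inside a single JVM with atomic counters. This is only a sketch under that assumption: the class `VersionCounter` is hypothetical, and when several application processes write to the same database the counter has to live in the database itself rather than in process memory.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class VersionCounter {
    private final ConcurrentHashMap<Integer, AtomicLong> versions = new ConcurrentHashMap<>();

    // Returns the next version for a document id; safe to call from many threads.
    long next(int docId) {
        return versions.computeIfAbsent(docId, id -> new AtomicLong()).incrementAndGet();
    }
}
```

Each document id gets its own counter, so two threads writing the same document can never be handed the same version number.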

Slide 40

Versioning – Option 0

Update the existing document on each write, incrementing a monotonically increasing version number stored inside it

{ "_id" : 174, "v" : 1, "firstname": "Juan" }

{ "_id" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" }

{ "_id" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }

> db.users.update( {"_id":174 } , { "$set" :{ ... }, "$inc": { "v": 1 } } )

Increment field value

Slide 41

Versioning – Option 1

Store a full copy of the document on each write, with a monotonically increasing version number inside

{ "docId" : 174, "v" : 1, "firstname": "Juan" }

{ "docId" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" }

{ "docId" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }

> db.users.insert( {"docId":174 …})
> db.users.find({"docId":174}).sort({"v":-1}).limit(-1);

Always find the latest version
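The same "highest version wins" lookup can be sketched in plain Java over documents held as maps. The class `LatestVersion` is hypothetical, and the sketch assumes `v` is always stored as an `Integer`.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class LatestVersion {
    // Picks the stored copy with the highest "v" for the given docId.
    static Optional<Map<String, Object>> find(List<Map<String, Object>> docs, int docId) {
        return docs.stream()
                .filter(d -> Integer.valueOf(docId).equals(d.get("docId")))
                .max(Comparator.comparingInt(d -> (Integer) d.get("v")));
    }
}
```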

Slide 42

Versioning – Option 2

Store all document versions inside a single document.

> db.users.update( {"_id": 174 } , { "$set" :{ "current.attr1": ... },
"$inc": { "current.v": 1 }, "$addToSet": {"prev": {... }} } )

Current value

{  "_id" : 174, "current" : { "v" :3, "attr1": 184, "attr2" : "A-1" },
    "prev" : [
          {  "v" : 1,  "attr1": 165 },
          {  "v" : 2,  "attr1": 165, "attr2": "A-1" }
    ]
}

Previous values
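An in-memory sketch of the same bookkeeping, with the document held as nested maps. The class `EmbeddedHistory` is hypothetical, and unlike a single database update this code is not atomic; it only shows the order of the steps.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EmbeddedHistory {
    // Copies the old "current" into "prev", then applies the new attributes
    // and increments the version. Assumes "v" is stored as an Integer.
    @SuppressWarnings("unchecked")
    static void write(Map<String, Object> doc, Map<String, Object> changes) {
        Map<String, Object> current = (Map<String, Object>) doc.get("current");
        List<Map<String, Object>> prev = (List<Map<String, Object>>) doc.get("prev");
        int oldVersion = (Integer) current.get("v");
        prev.add(new LinkedHashMap<>(current));   // snapshot before changing
        current.putAll(changes);                  // keep existing attributes, add new ones
        current.put("v", oldVersion + 1);
    }
}
```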

Slide 43

Versioning – Option 3

Keep one collection for the "current" version and another for past versions

> db.users.find( {"_id": 174 })
> db.users_past.find( {"pid": 174 })

{ "pid" : 174, "v" : 1, "firstname": "Juan" }

{ "pid" : 174, "v" : 2, "firstname": "Juan", "lastname": "Olivo" }

{ "_id" : 174, "v" : 3, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }

Previous versions collection

Current collection

Slide 44

Versioning

Slide 45

Migrations

Slide 46

Migrations

Several types of "Migrations":

Add / remove fields
Change field names
Change field data type
Extract document into collection

Slide 47

Add / Remove Fields

For a flexible schema database, this is our bread and butter

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "gender": "M" }

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo", "newfield": "value" }

> db.users.update( {"_id": 174}, {"$set": { "newfield": "value" }, "$unset": {"gender":""} })

Slide 48

Change Field Names

Again, you can do it programmatically

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }

{ "_id" : 174, "first": "Juan", "last": "Olivo" }

> db.users.update( {"_id": 174}, {"$rename": { "firstname": "first", "lastname":"last"} })

Slide 49

Change Field Data Type

Align to a new code change and move from Int to String

{..."bdate": 1435394461522}

{..."bdate": "2015-06-27"}

1) Batch Process

2) Aggregation Framework

3) Change based on usage
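Option 3 is often called lazy migration: a document is converted the first time the application reads it. A minimal sketch with the document held as a `Map`; the class `LazyMigration` is hypothetical, and treating the number as epoch milliseconds in UTC is an assumption based on the sample value above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Map;

class LazyMigration {
    // Called whenever a document is read: converts a numeric bdate in place
    // and reports whether the caller should write the document back.
    // Assumption: numeric values are epoch milliseconds, as in 1435394461522.
    static boolean migrateOnRead(Map<String, Object> doc) {
        Object bdate = doc.get("bdate");
        if (!(bdate instanceof Number)) {
            return false;                      // already a String, or absent
        }
        long millis = ((Number) bdate).longValue();
        String asString = Instant.ofEpochMilli(millis)
                .atZone(ZoneOffset.UTC).toLocalDate().toString();
        doc.put("bdate", asString);
        return true;
    }
}
```

Documents that are never read stay in the old format, so this approach usually needs a later batch pass to finish the job.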

Slide 50

Change Field Data Type

1) Batch Process – bulk api

public void migrateBulk(){
    // "dd" is day of month ("DD" would be day of year)
    DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    ...
    List<UpdateOneModel<Document>> toUpdate =
        new ArrayList<UpdateOneModel<Document>>();
    for (Document doc : coll.find()){
        // Reads bdate as an int32; a value stored as int64 would need getLong
        String dateAsString = df.format( new Date( doc.getInteger("bdate", 0) ));
        Document filter = new Document("_id", doc.getInteger("_id"));
        Document value = new Document("bdate", dateAsString);
        Document update = new Document("$set", value);

        // Queue one $set per document
        toUpdate.add(new UpdateOneModel<Document>(filter, update));
    }
    // Send all queued updates in a single bulk request
    coll.bulkWrite(toUpdate);
}

Slide 51

Change Field Data Type

1) Batch Process – bulk api

public void migrateBulk(){
...
for (Document doc : coll.find()){
...
}
coll.bulkWrite(toUpdate);

Is there any problem with this?

Slide 52

Change Field Data Type

1) Batch Process – bulk api

public void migrateBulk(){
...
//bson type 16 represents int32 data type
Document query = new Document("bdate", new Document("$type", 16));
for (Document doc : coll.find(query)){
...
}
coll.bulkWrite(toUpdate);

More efficient filtering!

Slide 53

Extract Document into Collection

Normalize your schema

{"size": 10, picture: BinData("0x133334299399299432")}

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" }

> db.users.aggregate( [
  {$unwind: "$cat_pictures"},
  {$project: { "_id":0, "uid":"$_id",  "size": "$cat_pictures.size", "picture": "$cat_pictures.picture"}},
  {$out:"cats"}])

{ "_id" : 174, "firstname": "Juan", "lastname": "Olivo" , "cat_pictures":
[{"size": 10, picture: BinData(0, "m/lhLlLmoNiUKQ==")}]
}

{"size": 10, "picture": BinData(0, "m/lhLlLmoNiUKQ==")}
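What the pipeline does can be sketched in plain Java over documents held as maps: one output document per embedded picture, carrying the owner's `_id` as `uid`. The class `ExtractPictures` is hypothetical and stands in for the `$unwind` and `$project` stages only; writing the result to a collection is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExtractPictures {
    // Mirrors $unwind + $project: one output document per embedded picture,
    // carrying the owner's _id as "uid".
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> extract(List<Map<String, Object>> users) {
        List<Map<String, Object>> cats = new ArrayList<>();
        for (Map<String, Object> user : users) {
            Object pictures = user.get("cat_pictures");
            if (!(pictures instanceof List)) {
                continue;                       // users without pictures produce nothing
            }
            for (Map<String, Object> picture : (List<Map<String, Object>>) pictures) {
                Map<String, Object> cat = new LinkedHashMap<>();
                cat.put("uid", user.get("_id"));
                cat.put("size", picture.get("size"));
                cat.put("picture", picture.get("picture"));
                cats.add(cat);
            }
        }
        return cats;
    }
}
```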

Slide 54

Tradeoffs

Slide 55

Tradeoffs

Slide 56

Recap

Slide 57

Recap

Flexible and Dynamic Schemas are a great tool
Use them wisely
Make sure you understand the tradeoffs
Make sure you understand the different strategies and options
Works well with Strongly Typed Languages

Slide 58

Free Education

https://university.mongodb.com/courses/M101J/about

Next Session starting on Aug 04

Slide 59

Obrigado! (Thank you!)

Norberto Leite
Technical Evangelist
http://www.mongodb.com/norberto
norberto@mongodb.com
@nleite
