Saturday, May 20, 2017

Docker Overview

A container is packaged as an entire runtime environment: the service/app plus all the dependencies, libraries, and configuration files needed to run it.
Containers are portable across environments and lightweight, because they share the host OS.



The image above summarizes the difference between a container and a VM. The two can be combined, and Docker can run nested inside a VM.

Several container technologies are available, such as Docker (www.docker.com), Mesos (http://mesos.apache.org/) and Kubernetes (https://kubernetes.io/).


We will pick Docker and give a high-level overview of its functionality here.

Docker


Docker began as an internal project for the dotCloud organization. 

It was developed in-house and then later open sourced in 2013.

Docker enables you to:
Separate your applications from your infrastructure so you can deliver software quickly.
Manage your infrastructure in the same ways you manage your applications.



As we can see, Docker is composed of a server (the Docker daemon) that exposes the Docker functionality via REST APIs. The docker command line client uses these REST APIs to communicate with the daemon.

The main components, as we can see, are images, containers, networks and data volumes.
We can add registries to them.

The following shows the architecture, including the registry:



1)  Docker Images : Templates


An image is a read-only template with instructions for creating a Docker container. The instructions are written in a Dockerfile. Often, an image is based on another image, with some additional customization.
You might create your own images or use those created and published by others in a registry.
When you change the Dockerfile and rebuild the image, only the layers affected by the change are rebuilt.
This is part of what makes images so lightweight, small, and fast, when compared to other virtualization technologies.


Example of a Dockerfile:



It is composed of 3 main parts: the base image, the build steps (including adding our application), and finally the start command of the container.
You should know that a Docker image is layered: each RUN command you specify in the Dockerfile creates a new layer. This allows us to share layers and build upon them, which improves the reusability of images and their layers.
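To make the layering idea concrete, here is a minimal Python sketch of how a build cache could decide which layers to rebuild. It is a simplified model, not Docker's actual implementation: it assumes a layer's identity depends only on the instruction text and on the layer below it, and the Dockerfile instructions are hypothetical.

```python
import hashlib


def layer_ids(instructions):
    """Return one ID per instruction; each ID also depends on the layer below it."""
    ids = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()[:12]
        ids.append(digest)
        parent = digest
    return ids


def layers_to_rebuild(old, new):
    """Instructions in `new` whose layer cannot be reused from the build of `old`."""
    cached = set(layer_ids(old))
    return [ins for ins, lid in zip(new, layer_ids(new)) if lid not in cached]


# Hypothetical Dockerfile instructions, for illustration only.
before = [
    "FROM ubuntu:16.04",
    "RUN apt-get update",
    "RUN apt-get install -y curl",
    "RUN mkdir -p /opt/app",
]
after = [
    "FROM ubuntu:16.04",
    "RUN apt-get update",
    "RUN apt-get install -y curl wget",  # the only edited line
    "RUN mkdir -p /opt/app",
]

rebuilt = layers_to_rebuild(before, after)
print(rebuilt)
```

In this model the edited instruction and everything after it are rebuilt, while the first two layers are reused. That is why it usually pays to put the steps that change most often near the end of the Dockerfile.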


2) Docker Registries: Templates Store
A Docker registry stores Docker images.
Docker Hub and Docker Cloud are public registries that anyone can use, and Docker is configured to look for images on Docker Hub by default.
You can even run your own private registry, for example Docker Trusted Registry (DTR).
You can push and pull images from any Docker registry.

You can create a free account at https://cloud.docker.com/ and use it to store your Docker images.
To use your Docker Cloud account:
docker login : prompts for your username and password
docker push : pushes an image to store it in your account
docker pull : pulls an image to your local machine



You can use docker search keyword to search for any Docker image.
e.g. docker search oracle  ==> searches for Oracle images.



3) Docker Containers: Running instances
A container is a runnable instance of an image.
You can create, run, stop, move, or delete a container using the Docker API or CLI.
You can also:
Connect a container to one or more networks.
Attach storage to it.
Create a new image based on its current state.


4) Docker Network :
By default, Docker provides two network drivers:
Bridge (default) : limited to a single host running Docker Engine.
Overlay network : supports multiple hosts.
You can create your own network:
docker network create -d bridge my_bridge
To list existing networks: docker network ls
To attach a container to a network:  docker run -d --net=my_bridge …..
Optionally you can select the IP as well using --ip=ip_address (or --ip6=…)
To inspect network: docker network inspect my_bridge
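Before passing an address to --ip, it can help to check that it actually belongs to the network's subnet. The following is a small Python sketch of such a check using the standard ipaddress module. The subnet 172.18.0.0/16 is an assumption for illustration; the real value is shown by docker network inspect. As far as I know, Docker only accepts --ip on networks created with an explicit subnet.

```python
import ipaddress


def is_usable_static_ip(candidate, subnet):
    """True if `candidate` lies inside `subnet` and is not its network or broadcast address."""
    network = ipaddress.ip_network(subnet)
    address = ipaddress.ip_address(candidate)
    reserved = (network.network_address, network.broadcast_address)
    return address in network and address not in reserved


# Assumed subnet of the my_bridge network (check with: docker network inspect my_bridge).
my_bridge_subnet = "172.18.0.0/16"

print(is_usable_static_ip("172.18.0.3", my_bridge_subnet))
print(is_usable_static_ip("172.19.0.3", my_bridge_subnet))
```

The sketch does not know which addresses are already taken (the gateway, other containers), so Docker can still reject an address that passes this check.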


5) Docker Volumes :
In addition to the Docker union file system, which composes the Docker layers, additional storage can be mounted, such as a data volume:
Used to persist data, independent of the container’s lifecycle.
Mounted when creating or running the container using -v
Example : $ docker run -d -P --name web -v /webapp training/webapp python app.py
You can also mount an existing host directory using the same -v
Example: $ docker run -d -P --name web -v /src/webapp:/webapp training/webapp python app.py
To list volumes: docker volume ls
Note: Shared storage can be used, but you need to pay attention to concurrent write operations to avoid data corruption.


6) Docker Swarm :
A swarm is a group of machines (physical or virtual) that are running Docker and have been joined into a cluster.
A swarm contains swarm managers and worker nodes.
Swarm managers can use several strategies to run containers:
“emptiest node” : which fills the least utilized machines with containers.
“global” : which ensures that each machine gets exactly one instance of the specified container.
Run docker swarm init to turn the current machine into a swarm manager, then run docker swarm join on the other machines to join them to the cluster.
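The two strategies above can be illustrated with a toy Python model. This is only a sketch of the idea, not how the swarm scheduler is implemented: node names and container counts are hypothetical, and "least utilized" is reduced to "fewest containers".

```python
def place_emptiest_node(load, replicas):
    """Place replicas one at a time on the node that currently runs the fewest containers."""
    load = dict(load)  # work on a copy so the caller's data is not modified
    placement = []
    for _ in range(replicas):
        # sorted() makes ties deterministic: the alphabetically first node wins
        node = min(sorted(load), key=lambda name: load[name])
        load[node] += 1
        placement.append(node)
    return placement


def place_global(load):
    """Place exactly one instance on every node, regardless of current load."""
    return sorted(load)


# Hypothetical cluster: node name -> number of containers already running.
nodes = {"manager": 2, "worker1": 0, "worker2": 1}

print(place_emptiest_node(nodes, 3))
print(place_global(nodes))
```

With three replicas, the "emptiest node" model fills worker1 and worker2 until they catch up with the manager, while "global" ignores the load and puts one instance on each of the three nodes.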


7) Docker Service :
A service only runs one image.
A service is described in a docker-compose.yml file.
The file describes what ports the service should use, how many replicas to run, what resources it may use, etc.



docker stack deploy -c docker-compose.yml myapp
docker stack ps myapp
docker stack rm myapp


Micro-service Example: Java REST App Connecting to Oracle DB



The following are the steps to create this example using the docker command line:

Execution Steps:
// Build the image for our Java application
docker build -t my_java_docker .
// Create a network for our Java and DB containers
docker network create -d bridge my_bridge
// Search for Oracle DB XE images
docker search oracle
// Pull one of the Oracle DB XE images (not official)
docker pull wnameless/oracle-xe-11g
// Run the DB container in the my_bridge network
docker run --net=my_bridge -d -p 49160:22 -p 49161:1521 -e ORACLE_ALLOW_REMOTE=true wnameless/oracle-xe-11g
// List the running containers to get the DB XE container ID
docker ps
// Open a bash shell in the DB container to execute the required DB scripts
docker exec -it "container id from the previous step" bash
// Run SQL*Plus to create our DB objects
sqlplus
// Install the DB objects
...
// Inspect the network to get the IP address of the DB container
docker network inspect my_bridge
// Run the Java container, passing the DB address as an environment variable so the application can connect to the DB
docker run -p 4010:80 --net=my_bridge -e DBAAS_DEFAULT_CONNECT_DESCRIPTOR=172.18.0.3:1521:XE my_java_docker
// Or put the variable in an env file, one VAR=VALUE entry per line, and pass the file instead

docker run -p 4010:80 --net=my_bridge --env-file=./env.txt my_java_docker
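To show what the env file passed with --env-file looks like, here is a small Python sketch that parses the one-entry-per-line VAR=VALUE format and prints the equivalent -e flags. It is a simplified reader written for illustration, not Docker's own parser: it skips blank lines and # comments and ignores lines without an = sign. The file content reuses the variable from the steps above.

```python
def parse_env_file(text):
    """Parse VAR=VALUE lines into a dict, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # this sketch ignores lines that have no '='
        env[name] = value
    return env


def as_docker_flags(env):
    """Turn the parsed variables into the equivalent -e NAME=VALUE arguments."""
    flags = []
    for name, value in env.items():
        flags += ["-e", name + "=" + value]
    return flags


# Assumed content of env.txt, for illustration.
env_txt = """# DB connection used by the Java container
DBAAS_DEFAULT_CONNECT_DESCRIPTOR=172.18.0.3:1521:XE
"""

parsed = parse_env_file(env_txt)
print(parsed)
print(as_docker_flags(parsed))
```

Keeping the variables in a file makes the docker run command shorter and lets you switch environments by switching files.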



That's just an introduction to Docker and container technologies.

For more information, check the Docker documentation, which includes samples and a lot of examples, at:
https://docs.docker.com/



Wednesday, May 18, 2016

Designing Scalable MongoDB Documents versus Relational DB Entities

One of the challenges in designing a MongoDB data model is that our background knowledge of relational databases can affect our ability to design an optimal, scalable data model structure.

In this post we will demonstrate a use case taken from the book Instant MongoDB by Amol Nayak.

The use case is about students enrolled in courses that are taught by lecturers.
The relations can be summarized as follows:

We have a student use case where:
- Students enroll in courses (many to many)
- Each course can belong to many categories (one to many)
- Each course is delivered by many lecturers (one to many)
- Each course has content (one to one)
- Each course content is divided into parts (one to many)
- Each content part is related to assignments (one to many)
- Each student has an assignment submission that is related to an assignment (one to one)











Now, to model this ER diagram as MongoDB documents, we need to do the following:

1) Think of the main documents that we have
A main document is a key player: it is well defined and contains too much information to be simply included in other documents.
We can think of Student, Course and Lecturer.

2) Embed Related Documents
The Student document embeds the student's submissions, while the Course document embeds all other documents related to it, such as category, content and assignments, since they are all part of the course.

3) Add reference to other documents (using id)
The Student document references the student's courses.
The Course document references the lecturers.

4) Add minimal information to the referenced documents
Select the information that does not change frequently and that the application needs most often.
e.g. add the course name to the course reference in the student document: it is needed most of the time, so it saves a query against the course document, and the course name rarely changes.
Also add the lecturer name to the course document: it is needed most of the time and rarely changes, so it saves us from querying the lecturer document to get the name for each course.

5) Revisit the documents 
Check whether we can omit some documents and include them in one of the existing documents.
For example, if you decided to have a separate document for Course Category, at this step you will see that the category has only a name value, so it is better to include it inside the course than to reference it with id+name, which would store more information.
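The steps above can be sketched as three documents, written here as Python dictionaries (which map directly onto MongoDB documents). All field names, ids and values are hypothetical and only illustrate the shape; they are not taken from the book.

```python
# Main document 1: the lecturer holds all lecturer details.
lecturer = {
    "_id": "lec-1",
    "name": "A. Lecturer",
    "bio": "Hypothetical biography text",
    "office": "Room 101",
}

# Main document 2: the course embeds its categories, content parts and assignments,
# and references lecturers by id plus the minimal extra field (the name).
course = {
    "_id": "course-1",
    "name": "Intro to Databases",
    "categories": ["Databases", "Computer Science"],  # embedded, no separate document
    "lecturers": [{"_id": "lec-1", "name": "A. Lecturer"}],
    "content": {
        "parts": [
            {
                "title": "Data Modeling",
                "assignments": [{"_id": "asg-1", "title": "Design a schema"}],
            }
        ]
    },
}

# Main document 3: the student embeds submissions and references courses by id + name.
student = {
    "_id": "stu-1",
    "name": "A. Student",
    "courses": [{"_id": "course-1", "name": "Intro to Databases"}],
    "submissions": [{"assignment_id": "asg-1", "answer": "Hypothetical answer text"}],
}


def enrolled_course_names(student_doc):
    """List a student's course names using only the student document (no extra query)."""
    return [ref["name"] for ref in student_doc["courses"]]


print(enrolled_course_names(student))
```

Because the course name is copied into the reference, listing a student's courses needs only the student document; the course document is read only when the full course details are required.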





As we can see in the previous figure, we have identified 3 main documents with some embedded documents, selected the referenced documents, and finally included the minimal required data in each of the references.
For example, Course category has only static values, so we included it entirely in our Course document and did not define a separate document for it.
The same applies to student submissions, which reference the assignment but include all the required information, so there is no separate document for them.
The other course-related information, including the content parts, assignments, etc., is also included in the course document.

The challenge here is the Lecturer object. It holds a lot of information about the lecturer that does not make sense to put in the course document and repeat for different courses. Instead, we reference the lecturer document and copy only the minimal information that we need to show or use in our application, in this case the lecturer name. The good thing about this information is that it does not change frequently.
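The cost of copying the lecturer name shows up only when the name changes: every course that carries a copy must be updated too. Here is a minimal in-memory Python sketch of that trade-off. Plain dictionaries and lists stand in for the MongoDB collections, and all names and ids are hypothetical.

```python
def rename_lecturer(lecturers, courses, lecturer_id, new_name):
    """Update the lecturer document and every copied name; return how many copies changed."""
    lecturers[lecturer_id]["name"] = new_name
    updated = 0
    for course in courses:
        for ref in course["lecturers"]:
            if ref["_id"] == lecturer_id:
                ref["name"] = new_name
                updated += 1
    return updated


# Stand-ins for the lecturer and course collections.
lecturers = {
    "lec-1": {"_id": "lec-1", "name": "A. Lecturer", "bio": "Hypothetical biography text"},
}
courses = [
    {"_id": "course-1", "name": "Intro to Databases",
     "lecturers": [{"_id": "lec-1", "name": "A. Lecturer"}]},
    {"_id": "course-2", "name": "Advanced Databases",
     "lecturers": [{"_id": "lec-1", "name": "A. Lecturer"}]},
]

touched = rename_lecturer(lecturers, courses, "lec-1", "A. B. Lecturer")
print(touched)
```

One rare rename touches every course that references the lecturer, while every read of a course gets the name for free. That is a good trade only because the name changes rarely and is read often.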

This is how to design documents in MongoDB for scalable applications.