How to deal with persistent storage (e.g. databases) in Docker

How do people deal with persistent storage for their Docker containers? I am currently using this approach: build the image, e.g. for Postgres, and then start the container with

docker run --volumes-from c0dbc34fd631 -d app_name/postgres

IMHO, that has the drawback that I must never (even by accident) delete container "c0dbc34fd631".

Another idea would be to mount host volumes into the container with -v; however, the user ID inside the container does not necessarily match the user ID on the host, and then permissions might be messed up.
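A common workaround for that UID mismatch (not part of the original question; the paths and UIDs below are illustrative assumptions) is either to run the container process as the host user, or to chown the host directory to whatever UID the containerized process actually uses:

```shell
# Option 1 (illustrative): run the container as the host user's uid/gid,
# so files created in the bind mount are owned by the host user.
docker run -d -v /srv/pgdata:/var/lib/postgresql/data \
    -u "$(id -u):$(id -g)" app_name/postgres

# Option 2 (illustrative): keep the image's default user and instead
# chown the host directory to that uid/gid. Check which uid the process
# runs as inside the image before picking a number.
sudo chown -R 999:999 /srv/pgdata
```

Either way, verify the UID inside the image first; hardcoding the wrong one just moves the permission problem around.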

Note: Instead of --volumes-from 'cryptic_id' you can also use --volumes-from my-data-container, where my-data-container is a name you assigned to a data-only container, e.g. docker run --name my-data-container ... (see the accepted answer)


Docker 1.9.0 and above

Use the volume API:

docker volume create --name hello
docker run -d -v hello:/container/path/for/volume container_image my_command

This means that the data-only container pattern must be abandoned in favour of the new volumes.

In fact, the volume API is just a better way to achieve what used to be the data-container pattern.

If you create a container with -v volume_name:/container/fs/path, Docker will automatically create a named volume for you that can:

  • Be listed with docker volume ls
  • Be inspected with docker volume inspect volume_name
  • Be backed up as a normal directory
  • Be backed up, as before, through a --volumes-from connection
  • The new volume API also adds a useful command that lets you identify dangling volumes:

    docker volume ls -f dangling=true
    

    And then remove one by its name:

    docker volume rm <volume name>
    

    As @mpugach underlines in the comments, you can get rid of all the dangling volumes with a nice one-liner:

    docker volume rm $(docker volume ls -f dangling=true -q)
    # or using 1.13.x
    docker volume prune
    

    Docker 1.8.x and below

    The approach that seems to work best for production is to use a data-only container.

    The data-only container is based on a barebones image and actually does nothing except expose a data volume.

    Then you can run any other container to have access to the data-container volumes:

    docker run --volumes-from data-container some-other-container command-to-execute
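For completeness, the data-only container referenced above can be created like this (a minimal sketch; the busybox image and the /data path are just examples consistent with the backup commands below):

```shell
# Create the data-only container: it declares /data as a volume and
# exits immediately. It does not need to be running for
# --volumes-from to work.
docker run -v /data --name data-container busybox true
```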
    
  • Here you can get a good picture of how to arrange the different containers.
  • Here there is a good insight into how volumes work.
  • This blog post has a good description of the so-called container-as-volume pattern, which clarifies the main point of having data-only containers.

    The Docker documentation now has the DEFINITIVE description of the container-as-volume(s) pattern.

    Following is the backup/restore procedure for Docker 1.8.x and below.

    BACKUP:

    sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox tar cvf /backup/backup.tar /data
    
  • --rm: remove the container when it exits
  • --volumes-from DATA: attach to the volumes shared by the DATA container
  • -v $(pwd):/backup: bind-mount the current directory into the container, to write the tar file to
  • busybox: a small, simple image, good for quick maintenance
  • tar cvf /backup/backup.tar /data: creates an uncompressed tar file of all the files in the /data directory

    RESTORE:

    # create a new data container
    $ sudo docker run -v /data --name DATA2 busybox true
    # untar the backup files into the new container's data volume
    $ sudo docker run --rm --volumes-from DATA2 -v $(pwd):/backup busybox tar xvf /backup/backup.tar
    data/
    data/sven.txt
    # compare to the original container
    $ sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox ls /data
    sven.txt
    

    Here is a nice article from the excellent Brian Goff explaining why it is good to use the same image for a container and its data container.


    As of Docker release v1.0, a file or directory on the host machine can be bind-mounted with the following command:

    $ docker run -v /host:/container ...
    

    The above volume can be used as persistent storage on the host running Docker.
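For example (the host path here is an assumption, not from the answer), the Postgres data directory can be kept on the host this way:

```shell
# Bind-mount a host directory as the Postgres data directory.
# /srv/pgdata is an illustrative host path.
docker run -d -v /srv/pgdata:/var/lib/postgresql/data postgres:9.4
# The database files now live on the host and survive removal of
# the container.
```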


    As of docker-compose 1.6, there is now improved support for data volumes in Docker Compose. The following Compose file will create a data volume that persists between restarts (or even removal) of the parent containers:

    Here is the blog announcement: https://blog.docker.com/2016/02/compose-1-6/

    Here's an example compose file:

    version: "2"
    
    services:
      db:
        restart: on-failure:10
        image: postgres:9.4
        volumes:
          - "db-data:/var/lib/postgresql/data"
      web:
        restart: on-failure:10
        build: .
        command: gunicorn mypythonapp.wsgi:application -b :8000 --reload
        volumes:
          - .:/code
        ports:
          - "8000:8000"
        links:
          - db
    
    volumes:
      db-data:
    

    As far as I can understand: this will create a data volume (db-data) which will persist between restarts.

    If you run: docker volume ls you should see your volume listed:

    local               mypthonapp_db-data
    ...
    

    You can get some more details about the data volume:

    docker volume inspect mypthonapp_db-data
    [
      {
        "Name": "mypthonapp_db-data",
        "Driver": "local",
        "Mountpoint": "/mnt/sda1/var/lib/docker/volumes/mypthonapp_db-data/_data"
      }
    ]
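Such a named volume can be backed up the same way as in the pre-1.9 pattern, by mounting it into a throwaway container (a sketch using the volume name from the inspect output above; the backup file name is illustrative):

```shell
# Back up the named Compose volume to a tar file in the current directory.
docker run --rm -v mypthonapp_db-data:/volume -v "$(pwd)":/backup busybox \
    tar cvf /backup/db-data-backup.tar /volume
```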
    

    Some testing:

    # start the containers
    docker-compose up -d
    # .. input some data into the database
    docker-compose run --rm web python manage.py migrate
    docker-compose run --rm web python manage.py createsuperuser
    ...
    # stop and remove the containers:
    docker-compose stop
    docker-compose rm -f
    
    # start it back up again
    docker-compose up -d
    
    # verify the data is still there
    ...
    (it is)
    
    # stop and remove with the -v (volumes) flag:
    
    docker-compose stop
    docker-compose rm -f -v
    
    # up again .. 
    docker-compose up -d
    
    # check the data is still there:
    ...
    (it is). 
    

    Notes:

  • You can also specify various drivers in the volumes block, e.g. you could specify the flocker driver for db-data:

    volumes:
      db-data:
        driver: flocker
    
  • As the integration between Docker Swarm and Docker Compose improves (and possibly Flocker gets integrated into the Docker ecosystem; I heard a rumor that Docker has bought Flocker), I think this approach should become increasingly powerful.
  • Disclaimer: This approach is promising, and I'm using it successfully in a development environment. I would be apprehensive about using it in production just yet!
