S3 for Vitastor

The moment has come: the Vitastor S3 implementation, based on Zenko CloudServer, is released.

Highlights

  • Zenko CloudServer is implemented in Node.js.
  • Object metadata is stored in MongoDB.
  • A modified version of Zenko CloudServer is used for Vitastor. It differs slightly from the original: the build is optimised and unneeded dependencies are stripped out.
  • Object data is stored in Vitastor block volumes, but the volume metadata is stored in the same MongoDB, not in Vitastor etcd.
  • Objects are written to volumes sequentially, one after another. Space is allocated with rounding up to the sector size (4 KB), so each object takes at least 4 KB.
  • An important property of this storage scheme is that small objects aren't chunked into parts in Vitastor EC N+K pools and thus don't require reads from all N disks when downloading.
  • Deleted objects are marked as deleted, but the space is only actually freed by an asynchronously executed "defragmentation" process. Defragmentation runs automatically in the background when a volume reaches a configured share of "garbage" (20% by default), copies the live objects to new volume(s) and then removes the old volume. Defragmentation can be configured in locationConfig.json (see the sketch after this list).
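
For illustration only, defragmentation tuning inside a locationConfig.json storage class entry could look like the sketch below. Both the structure and the option names here (including defrag_garbage_ratio) are hypothetical placeholders, since the complete list of Vitastor backend settings is currently only available in the code (see the Setup Vitastor S3 section); the "_comment" key repeats that warning inside the JSON itself.

{
    "STANDARD": {
        "type": "vitastor",
        "details": {
            "_comment": "structure and option names are hypothetical placeholders - verify against the code",
            "config_path": "/etc/vitastor/vitastor.conf",
            "pool_id": 1,
            "defrag_garbage_ratio": 0.2
        }
    }
}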

Plans for future development

  • User account storage in the DB instead of a static file. Original Zenko uses a separate closed-source "Scality Vault" service for it; that's why we use a static file for now.
  • More detailed documentation.
  • Support for other (and faster) key-value DBMS for object metadata storage.
  • Other performance optimisations, for example, related to the hash function: MD5, used for Amazon compatibility purposes, is relatively slow.
  • Object Lifecycle support. There is a Lifecycle implementation for Zenko called Backbeat, but it's not adapted for Vitastor yet.
  • Quota support. Original Zenko uses a separate "SCUBA" service for quotas, but it's also proprietary and not available publicly.

Installation

In a few words:

  • Install MongoDB and create a user for the S3 metadata DB.
  • Create a Vitastor pool for S3 data.
  • Download and set up the Docker container vitalif/vitastor-zenko.

Setup MongoDB

You can set up MongoDB yourself by following the MongoDB manual.

Or you can follow the instructions below; they describe a simple example of a 3-replica MongoDB setup in Docker (via docker-compose).

  1. On each host, create a file docker-compose.yml with the content listed below. Replace <YOUR_PASSWORD> with your future MongoDB administrator password, and optionally replace 0.0.0.0 with localhost,<server_IP>. It's recommended to either use a private IP or set up TLS afterwards.
version: '3.1'

services:

  mongo:
    container_name: mongo
    image: mongo:7-jammy
    restart: always
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: <YOUR_PASSWORD>
    network_mode: host
    volumes:
      - ./keyfile:/opt/keyfile
      - ./mongo-data/db:/data/db
      - ./mongo-data/configdb:/data/configdb
    entrypoint: /bin/bash -c
    command: [ "chown mongodb /opt/keyfile && chmod 600 /opt/keyfile && . /usr/local/bin/docker-entrypoint.sh mongod --replSet rs0 --keyFile /opt/keyfile --bind_ip 0.0.0.0" ]
  2. Generate a shared cluster key using openssl rand -base64 756 > ./keyfile and copy that keyfile to all hosts.

  3. Start MongoDB on all hosts with docker compose up -d mongo.

  4. Enter Mongo Shell with docker exec -it mongo mongosh -u root -p <YOUR_PASSWORD> localhost/admin and execute the following command (replace IP addresses 10.10.10.{1,2,3} with your host IPs):

rs.initiate({ _id: 'rs0', members: [ { _id: 1, host: '10.10.10.1:27017' }, { _id: 2, host: '10.10.10.2:27017' }, { _id: 3, host: '10.10.10.3:27017' } ] })

  5. Stay in Mongo Shell and create a user for the future S3 database:

db.createUser({ user: 's3', pwd: '<YOUR_S3_PASSWORD>', roles: [ { role: 'readWrite', db: 's3' }, { role: 'dbAdmin', db: 's3' }, { role: 'readWrite', db: 'vitastor' }, { role: 'dbAdmin', db: 'vitastor' } ] })
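
To verify the setup, you can stay in Mongo Shell and run two standard mongosh commands: rs.status() should list all three members with one of them PRIMARY, and db.getUsers() (in the admin database you are connected to) should show the newly created user.

rs.status()   // expect 3 members, one in PRIMARY state
db.getUsers() // should include the 's3' user created above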

Setup Vitastor

Create a pool in Vitastor for S3 object data, for example:

vitastor-cli create-pool --ec 2+1 -n 512 s3-data --used_for_app s3:standard

The --used_for_app option works as fool-proofing and prevents you from accidentally creating a regular block volume in the S3 pool and overwriting S3 data. It also hides inode space statistics from Vitastor etcd.
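
If you prefer replication to erasure coding, the same pool can be created in replicated form instead. A sketch, assuming vitastor-cli's -s flag sets the replica count (pg_size):

vitastor-cli create-pool -s 3 -n 512 s3-data --used_for_app s3:standard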

Retrieve the ID of your pool with vitastor-cli ls-pools s3-data --detail.

Setup Vitastor S3

  1. Add the following lines to docker-compose.yml (instead of network_mode: host, you can use ports: [ "8000:8000", "8002:8002" ]):
  zenko:
    container_name: zenko
    image: vitalif/vitastor-zenko
    restart: always
    security_opt:
      - seccomp:unconfined
    ulimits:
      memlock: -1
    network_mode: host
    volumes:
      - /etc/vitastor:/etc/vitastor
      - /etc/vitastor/s3:/conf
  2. Download Docker image: docker pull vitalif/vitastor-zenko

  3. Extract configuration file examples from the Docker image:

    docker run --rm -it -v /etc/vitastor:/etc/vitastor -v /etc/vitastor/s3:/conf vitalif/vitastor-zenko configure.sh
    
  4. Edit configuration files in /etc/vitastor/s3/:

    • config.json - common settings.
    • authdata.json - user accounts and access keys.
    • locationConfig.json - S3 storage class list with placement settings. Note: it actually contains storage classes (like STANDARD, COLD, etc) instead of "locations" (zones like us-east-1) as in the original Zenko CloudServer.
    • Put your MongoDB connection data into config.json and locationConfig.json (see the sketch after this list).
    • Put your Vitastor pool ID into locationConfig.json.
    • For now, the complete list of Vitastor backend settings is only available in the code.
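
As an illustration, the MongoDB connection block in config.json could look like the sketch below, modelled on upstream Zenko CloudServer conventions. Treat the exact key names (replicaSetHosts, replicaSet, database, authCredentials) as assumptions to check against the examples extracted by configure.sh:

{
    "mongodb": {
        "replicaSetHosts": "10.10.10.1:27017,10.10.10.2:27017,10.10.10.3:27017",
        "replicaSet": "rs0",
        "database": "s3",
        "authCredentials": {
            "username": "s3",
            "password": "<YOUR_S3_PASSWORD>"
        }
    }
}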

Start Zenko

Start the S3 server with:

docker run --restart always --security-opt seccomp:unconfined --ulimit memlock=-1 --network=host \
    -v /etc/vitastor:/etc/vitastor -v /etc/vitastor/s3:/conf --name zenko vitalif/vitastor-zenko
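
To check that the server came up, you can tail the container logs with the standard Docker CLI:

docker logs -f zenko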

If you use default settings, Zenko CloudServer starts on port 8000. The default access key is accessKey1 with a secret key of verySecretKey1.

Now you can access your S3 with, for example, s3cmd:

s3cmd --access_key=accessKey1 --secret_key=verySecretKey1 --host=http://localhost:8000 mb s3://testbucket
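
Once the bucket exists, uploads and listings work the same way (put and ls are standard s3cmd commands; some-file.txt is just a placeholder):

s3cmd --access_key=accessKey1 --secret_key=verySecretKey1 --host=http://localhost:8000 \
    put some-file.txt s3://testbucket/
s3cmd --access_key=accessKey1 --secret_key=verySecretKey1 --host=http://localhost:8000 \
    ls s3://testbucket/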

Or even mount it with GeeseFS:

AWS_ACCESS_KEY_ID=accessKey1 \
    AWS_SECRET_ACCESS_KEY=verySecretKey1 \
    geesefs --endpoint http://localhost:8000 testbucket mountdir
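
To detach the GeeseFS mount afterwards, use the standard FUSE unmount command:

fusermount -u mountdir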

Author & License