Getting Started

CPU 혹은 GPU를 기반으로 추론 엔진을 제공하는 ServerSDK와
Feature-Vector를 고속으로 매칭하는 MatchingSDK를 경험해 볼 수 있습니다.

Prerequisites

H/W Requirements

구분

Description

CPU

Intel 2nd Gen Core CPU or higher (16 core or more)

CPU instruction set

SSE4.2, AVX, AVX2, AVX-512

GPU

A10, A30, T4(단종)

RAM

64G

Hard drive

NVMe SSD or higher, 100G

OS

Ubuntu 20.04

S/W Requirements

구분

Description

Docker

v19.03 or later

Docker-Compose

v1.25.1 or later

NVIDIA Container Toolkit

컨테이너에서 GPU 사용을 위해 필요한 패키지

REST-API Service Port

구분

PORT

(GPU) Inference API

8080

(CPU) Inference API

8081

Matching API

8082

Server Installation

Docker-Compose를 통해 서버를 시작/중지하는 방법을 소개합니다.

Start Server

$ docker-compose up -d

Stop Server

$ docker-compose down

docker-compose.yml (예시)

아래에 열거된 Docker Image는, Cubox SDK 배포 담당자에게 문의하시기 바랍니다.

Inference Server (GPU)

version: "3.8"

services:
 inference-service:
  restart: always
  image: dlite-gpu-server:1.10.0-A10
  shm_size: '1gb'
  ulimits:
   memlock: -1
   stack: 67108864
  command: "tritonserver --model-repository=/models --model-control-mode=none --strict-model-config=1"
  ports:
   - 8000:8000
   - 8001:8001
   - 8002:8002
  networks:
   - gpu_infer_network
  working_dir: /opt/tritonserver/bin
  deploy:
   resources:
    reservations:
     devices:
      - capabilities: [gpu]
        driver: nvidia
        device_ids: ['0']

 api-service:
  restart: always
  image: dlite-gpu-api:1.10.3
  ports:
   - 8080:8080
  networks:
   - gpu_infer_network

networks:
 gpu_infer_network:
  driver: bridge

Inference Server (CPU)

version: "3.8"

services:
 inference-service:
  restart: always
  image: dlite-cpu-server:1.0.3
  shm_size: '1gb'
  ulimits:
   memlock: -1
   stack: 67108864
  command: "--config_path /ovms/config/config_cubox_face_recognition_pipeline_fp16.json --rest_port 9000 --port 9001 --file_system_poll_wait_seconds 0"
  ports:
   - 9000:9000
   - 9001:9001
  networks:
   - cpu_infer_network

 api-service:
  restart: always
  image: dlite-cpu-api:1.9.8
  ports:
   - 8081:8081
  networks:
   - cpu_infer_network

networks:
 cpu_infer_network:
  driver: bridge

Matching Server

version: "3.8"

services:
  standalone:
    image: milvusdb/milvus:v2.2.2
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/Volumes/milvus:/var/lib/milvus
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"

  etcd:
    image: quay.io/coreos/etcd:v3.5.0
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/Volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2022-03-17T06-34-49Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    ports:
      - "9001:9001"
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/Volumes/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  api-service:
    restart: always
    image: dlite-matching-api:0.1.4
    ports:
      - 8082:8082
    depends_on:
      - "standalone"

networks:
  default:
    name: milvus

Hello DLiteServer

서버와 통신하는 Python 클라이언트 코드를 통해, 서버기반의 추론과 매칭 기능을 확인합니다.

Hello GPU-Inference

# START GPU Inference Server

$ git clone (TBD)
$ python app.py
-----------------
# 결과확인

Hello CPU-Inference

# START CPU Inference Server

$ git clone (TBD)
$ python app.py
-----------------
# 결과확인

Hello Matching

# START Matching Server

$ git clone (TBD)
$ python app.py
-----------------
# 결과확인