Xavier Universal Model Interface (XUMI)

Purpose and Background

The Xavier Universal Model Interface (XUMI) is designed to standardize and streamline the process of integrating AI models into the enterprise MLOps platform. It addresses several challenges in the current model management process:

  1. Inconsistent model integration approaches across different model types and frameworks
  2. Lack of standardized validation and testing procedures for new models
  3. Complex deployment requirements and environment configuration
  4. Difficulty tracking model versions and dependencies
  5. Inefficient collaboration between ML engineers and platform engineers

The XUMI provides a unified framework for packaging, validating, and executing AI models using containerization, standardized manifests, and automated validation tools. This enables:

  • Easier onboarding of new models into the platform
  • Consistent model execution across development and production environments
  • Automated validation and testing procedures
  • Simplified model sharing and collaboration
  • Standardized interfaces for integration with existing workflows

Business/Functional Requirements

Core XUMI Framework Requirements

  • Standardized Model Packaging

    • Container-based model encapsulation using Docker
    • Consistent directory structure and entry points
    • Standardized manifest format for model metadata and execution parameters
  • Model Integration and Execution

    • Uniform interface for model execution regardless of underlying framework
    • Standardized parameter specification and validation
    • Consistent output format and structure
  • Validation and Testing

    • Automated security scanning of model code and dependencies
    • Template-based testing for different model types
    • Performance benchmarking and reporting
    • Code quality assessment
  • Development Tools

    • Python SDK for local model development and testing
    • CLI utilities for common tasks
    • Template generation for new model projects

User Stories

  • As an ML Engineer, I want to:

    • Download the XUMI SDK from a Git repository to develop models locally
    • Adapt existing model code to be compatible with the XUMI framework
    • Validate my model code with security and quality scans before submission
    • Run integration tests to ensure my model is correctly configured
    • Execute benchmark testing and receive performance reports
    • Submit validated models for ingestion through the Workbench
  • As a Workbench Engineer, I want to:

    • Review submitted models using standardized validation procedures
    • Execute template tests for different model types
    • Benchmark models against established baselines
    • Validate models against security and quality standards
    • Deploy approved models to the platform

Technical Requirements

Manifest Structure

  • JSON/YAML Format
    • Schema validation for structural integrity
    • Version control for backward compatibility
  • Model Metadata
    • Name, version, and description
    • Author and organization information
    • Creation and modification dates
    • License and usage restrictions
    • Model domain and applicable categories
  • Execution Configuration
    • Entry point specification (CLI command, Python module, etc.)
    • Runtime requirements (CPU, GPU, memory)
    • Environment variables and configuration settings
  • Parameter Specifications
    • Input parameter definitions (name, type, description, constraints)
    • Output parameter definitions (format, location, naming patterns)
    • File handling specifications (glob patterns, directory structures)
    • Remote asset storage support
  • Workflow Specification
    • Define any number of commands, each of which can be executed by name
    • Specify input and output parameters for each command

Container Requirements

  • Base Image Standards
    • Approved base images for different model types
    • Security-hardened configurations
    • Minimal required dependencies
  • Directory Structure
    • Standard locations for model code, artifacts, and configuration
    • Input/output directory conventions
    • Persistent storage mapping
  • Runtime Behavior
    • Container lifecycle management
    • Resource allocation and limits
    • Networking and security configurations

SDK and CLI Requirements

  • Python SDK Features
    • Model manifest creation and validation
    • Local testing and execution utilities
    • Container management functions
    • Authentication and API integration
  • CLI Capabilities
    • Authentication with platform services
    • Local model development environment setup
    • Container building and testing
    • Model validation and scanning
    • Benchmark execution
    • Model submission to platform

Non-Functional Requirements

  • Performance
    • Minimal overhead for model execution
    • Efficient container startup and shutdown
    • Optimized resource utilization
    • Support for batch processing and parallel execution
  • Security
    • Container isolation and resource constraints
    • Code scanning for vulnerabilities and malicious code
    • Secure parameter passing and credential management
    • Access control and authentication
  • Reliability
    • Error handling and reporting
    • Execution monitoring and logging
    • Container health checks and recovery mechanisms
    • Versioning and rollback capabilities
  • Usability
    • Intuitive interfaces for model development
    • Clear documentation and examples
    • Helpful error messages and debugging tools
    • Progressive disclosure of complexity
  • Scalability
    • Support for models of varying sizes and complexities
    • Horizontal scaling for concurrent model execution
    • Resource optimization for different execution environments
    • Efficient handling of large datasets and outputs

Detailed Design

Functional Specifications

Model Domain Types

The SPI framework supports various model domains, each with specific input/output requirements and validation procedures:

| Domain | Description | Input Types | Output Types | Benchmark Tools |
|---|---|---|---|---|
| Text | Text generation, classification, summarization | Text, JSON | Text, JSON | olmes |
| Image | Image generation, classification, segmentation, detection | Text, Image, JSON | Image, JSON | custom metrics |
| Video | Video generation, editing, enhancement | Text, Image, Video, JSON | Video, JSON | vBench |
| Audio | Audio generation, transcription, enhancement | Text, Audio, JSON | Audio, Text, JSON | custom metrics |
| Multimodal | Combined processing across domains | Any combination | Any combination | domain-specific |

Each model domain extends the base SPI manifest schema with its own domain-specific fields:

Text Domain Extensions

model:
  domain: "text"
  domain_specific:
    max_context_length: 4096
    supports_streaming: true
    language_codes: ["en", "es", "fr"]
    text_features:
      - translation
      - summarization
      - classification

Image Domain Extensions

model:
  domain: "image"
  domain_specific:
    max_resolution: "1024x1024"
    color_modes: ["RGB", "RGBA"]
    supported_formats: ["png", "jpg", "webp"]
    image_features:
      - generation
      - inpainting
      - outpainting
      - style_transfer

Video Domain Extensions

model:
  domain: "video"
  domain_specific:
    max_resolution: "1920x1080"
    max_fps: 30
    max_duration_seconds: 60
    max_bitrate: 20000
    color_space: ["BT.709", "BT.2020"]
    supported_formats: ["mp4", "webm"]
    video_features:
      - generation
      - frame_interpolation
      - super_resolution
      - style_transfer

Audio Domain Extensions

model:
  domain: "audio"
  domain_specific:
    max_sample_rate: 48000
    max_duration_seconds: 300
    max_bitrate: 320
    supported_formats: ["wav", "mp3", "ogg"]
    audio_features:
      - speech_synthesis
      - music_generation
      - audio_enhancement
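
The domain extensions above lend themselves to a mechanical check before submission. The following is a minimal sketch, assuming a hypothetical helper `check_domain_extension` and key sets copied from the four examples above. The actual SPI schema may define more or different keys, and the multimodal domain is not covered because no example is given for it.

```python
# Hypothetical allow-lists, copied from the example extensions above.
DOMAIN_KEYS = {
    "text": {"max_context_length", "supports_streaming", "language_codes",
             "text_features"},
    "image": {"max_resolution", "color_modes", "supported_formats",
              "image_features"},
    "video": {"max_resolution", "max_fps", "max_duration_seconds", "max_bitrate",
              "color_space", "supported_formats", "video_features"},
    "audio": {"max_sample_rate", "max_duration_seconds", "max_bitrate",
              "supported_formats", "audio_features"},
}


def check_domain_extension(model: dict) -> list:
    """Return a list of problems found in model['domain_specific']."""
    domain = model.get("domain")
    if domain not in DOMAIN_KEYS:
        return [f"unknown domain: {domain!r}"]
    extension = model.get("domain_specific", {})
    unknown = sorted(set(extension) - DOMAIN_KEYS[domain])
    return [f"unexpected key for {domain} domain: {key}" for key in unknown]


# An audio manifest that mistakenly carries a video-only key.
audio_model = {
    "domain": "audio",
    "domain_specific": {"max_sample_rate": 48000, "max_fps": 30},
}
problems = check_domain_extension(audio_model)
print(problems)
```

A check of this kind catches keys copied between domains by mistake, such as `max_fps` in an audio manifest.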

Input and Output Parameter Types

The SPI supports various parameter types with domain-specific extensions:

Basic Types

  • string: Text values, e.g. “string”
  • integer: Whole numbers, e.g. 123456
  • float: Single-precision (32-bit) floating-point numbers, e.g. 10.2
  • decimal: 128-bit decimal numbers, e.g. 456.2345
  • boolean: true/false values
  • array: Ordered list of values
  • map: Key-value mapping
  • file: File reference (with type-specific extensions)
  • enum: One of a fixed set of allowed values

Basic Type Specifications

File

type: file
mime_types:
  - image/jpeg
  - image/png
  - image/webp
file_pattern: 'file_*.{jpg,jpeg,png,webp}'
mount_path: inputs/images
index_padding: 3  # optional, used only for output parameters

mime_types: Restricts accepted files to specific MIME types (per the IANA Media Types registry). In the example, only JPEG, PNG, or WebP images are accepted; files of any other type are skipped with a warning.

file_pattern: Defines a glob pattern for file names, including extensions. Its effect depends on the parameter direction:

  • Input parameters: The full glob pattern is applied. In the example, only files whose names start with file_ and end in .jpg, .jpeg, .png, or .webp are accepted.
  • Output parameters: The pattern determines the names of the resulting files. The * (star) is replaced by a counter that starts from 0 and is zero-padded according to the index_padding property. In the example, the output files are named file_000, file_001, file_002, and so on, each with the same extension as its input.

mount_path: Directory that holds the input or output files, relative to the model root folder. In the example, if the model base_work_dir is /app, input files are read from the /app/inputs/images/ directory.

index_padding: Zero-pads file indices to the given number of digits (3 gives 000, 001, and so on).
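
The file_pattern rules can be illustrated with a short sketch. Python's `fnmatch`, `glob`, and `pathlib` modules do not expand brace sets such as `{jpg,png}`, so the pattern is expanded first. The helper names `expand_braces`, `matches_input`, and `output_name` are hypothetical and not part of the SDK, and nested braces are not handled.

```python
import fnmatch
import re


def expand_braces(pattern: str) -> list:
    """Expand {a,b,c} groups into separate patterns (no nesting support)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def matches_input(filename: str, file_pattern: str) -> bool:
    """True if filename matches any brace-expanded variant of file_pattern."""
    return any(fnmatch.fnmatchcase(filename, p) for p in expand_braces(file_pattern))


def output_name(file_pattern: str, index: int, index_padding: int, extension: str) -> str:
    """Replace '*' with a zero-padded counter and append the given extension."""
    stem = file_pattern.split(".", 1)[0]
    return stem.replace("*", str(index).zfill(index_padding)) + "." + extension


pattern = "file_*.{jpg,jpeg,png,webp}"
print(expand_braces(pattern))
print(matches_input("file_a.png", pattern), matches_input("photo.png", pattern))
print([output_name(pattern, i, 3, "png") for i in range(3)])
```

Expanding braces up front keeps the matching step on the standard library. A dedicated glob library would be an alternative if nested groups are needed.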

Video Input Specifications

Video inputs require detailed specifications to ensure proper handling:

inputs:
  - name: "reference_video"
    type: "file"
    description: "Reference video for style transfer"
    required: true
    mime_types: ["video/mp4", "video/webm"]
    file_pattern: "*.{mp4,webm}"
    mount_path: "/app/inputs/videos"
    domain_specific:
      video:
        min_resolution: "480x360"
        max_resolution: "1920x1080"
        min_fps: 24
        max_fps: 60
        min_duration_seconds: 1
        max_duration_seconds: 60
        color_space: "RGB"
        codec_requirements: ["h264", "vp9"]

Image Input Specifications

inputs:
  - name: "reference_image"
    type: "file"
    description: "Reference image for styling"
    required: true
    mime_types: ["image/jpeg", "image/png", "image/webp"]
    file_pattern: "*.{jpg,jpeg,png,webp}"
    mount_path: "/app/inputs/images"
    domain_specific:
      image:
        min_resolution: "256x256"
        max_resolution: "2048x2048"
        color_modes: ["RGB", "RGBA"]
        aspect_ratios: ["1:1", "16:9", "4:3"]

Audio Input Specifications

inputs:
  - name: "reference_audio"
    type: "file"
    description: "Reference audio for style transfer"
    required: true
    mime_types: ["audio/wav", "audio/mpeg", "audio/ogg"]
    file_pattern: "*.{wav,mp3,ogg}"
    mount_path: "/app/inputs/audio"
    domain_specific:
      audio:
        min_sample_rate: 16000
        max_sample_rate: 48000
        min_duration_seconds: 1
        max_duration_seconds: 300
        min_bitrate: 128
        max_bitrate: 320
        channels: [1, 2]

Complex Parameter Configurations

For more complex inputs like video generation parameters:

inputs:
  - name: "video_config"
    type: "object"
    description: "Video generation configuration"
    required: true
    properties:
      resolution:
        type: "string"
        enum: ["480p", "720p", "1080p"]
        default: "720p"
      frame_rate:
        type: "integer"
        default: 30
        constraints:
          min_value: 24
          max_value: 60
      duration_seconds:
        type: "float"
        default: 5.0
        constraints:
          min_value: 1.0
          max_value: 60.0
      motion_strength:
        type: "float"
        default: 0.5
        constraints:
          min_value: 0.1
          max_value: 1.0
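
To show how such an object specification might be applied to user-supplied values, here is a minimal sketch. It assumes a hypothetical helper `resolve_object` that fills in defaults and then checks the enum and min_value/max_value constraints; type checking is left out, and the real SDK validator may behave differently.

```python
def resolve_object(properties: dict, supplied: dict) -> dict:
    """Apply defaults, then check enum and min/max constraints per property."""
    unknown = set(supplied) - set(properties)
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    resolved = {}
    for name, spec in properties.items():
        if name in supplied:
            value = supplied[name]
        elif "default" in spec:
            value = spec["default"]
        else:
            continue  # no value and no default: leave the property unset
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name}: {value!r} not in {spec['enum']}")
        limits = spec.get("constraints", {})
        if "min_value" in limits and value < limits["min_value"]:
            raise ValueError(f"{name}: {value} is below {limits['min_value']}")
        if "max_value" in limits and value > limits["max_value"]:
            raise ValueError(f"{name}: {value} is above {limits['max_value']}")
        resolved[name] = value
    return resolved


# Property specs transcribed from the video_config example above.
VIDEO_CONFIG = {
    "resolution": {"type": "string", "enum": ["480p", "720p", "1080p"],
                   "default": "720p"},
    "frame_rate": {"type": "integer", "default": 30,
                   "constraints": {"min_value": 24, "max_value": 60}},
    "duration_seconds": {"type": "float", "default": 5.0,
                         "constraints": {"min_value": 1.0, "max_value": 60.0}},
    "motion_strength": {"type": "float", "default": 0.5,
                        "constraints": {"min_value": 0.1, "max_value": 1.0}},
}

print(resolve_object(VIDEO_CONFIG, {"frame_rate": 48}))
```

Only `frame_rate` is supplied in the usage line, so the other three properties fall back to their declared defaults.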

SPI Manifest Alignment with Workbench Catalog

The SPI manifest schema is aligned with the catalog data fields of the WALL-E Workbench/Catalog systems. As a result, models packaged with the SPI framework can be indexed and displayed directly in the AI.Portal catalog with consistent metadata.

Core Model Metadata Fields

# Standard metadata fields aligned with Workbench Catalog
model:
  name: "text-to-video-gen"  # Maps to "Model Name" in Workbench
  version: "1.0.0"  # Maps to "Model Version" in Workbench
  company: "ICVR"  # Maps to "Model Company" in Workbench
  repository_url: "https://github.com/icvr/text-to-video-gen"  # Maps to "Model Repository URL" in Workbench
  weights_url: "https://storage.icvr.io/models/text-to-video-gen/weights"  # Maps to "Model Weights URL" in Workbench
  documentation_url: "https://docs.icvr.io/models/text-to-video-gen"  # Maps to "Model Documentation URL" in Workbench
  forum_url: "https://community.icvr.io/models/text-to-video-gen"  # Maps to "Model Forums/Discussion URL" in Workbench
  example_outputs_url: "https://examples.icvr.io/models/text-to-video-gen"  # Maps to "Model Example Outputs URL" in Workbench

  # Model classification aligned with Workbench Model Type
  model_type: "generative_video"  # Maps to "Model Type" dropdown in Workbench

  # Inputs/Outputs aligned with Workbench categorization
  inputs:
    - "text"
    - "image"  # Optional reference image
  outputs:
    - "video"
    - "image"  # Preview frames

  # Usage information
  license_type: "proprietary"  # Maps to "Model License Type" in Workbench
  usage_category: "commercial"  # Maps to "Usage Category" in Workbench (commercial, non-commercial, testing)
  trained_on_copyrighted_data: false  # Maps to "Was the model trained on copyrighted data?" in Workbench

  # Trainability information
  is_trainable: false  # Maps to "Is this model trainable?" in Workbench
  is_fine_tunable: true  # Maps to "Is this model fine-tunable?" in Workbench

  # Task capabilities aligned with Workbench Task categories
  tasks:
    "2D":
      - "concept_art"
    "VFX":
      - "generative_FX"

  # Model-specific technical information
  technical_dependencies: [
    "pytorch>=2.0.0",
    "diffusers>=0.21.4",
    "transformers>=4.30.2",
    "accelerate>=0.20.0"
  ]  # Maps to "Known model technical dependencies" in Workbench

  # User sentiment information
  user_sentiment: "positive"  # Maps to "What is the general user sentiment surrounding the quality of this model?" in Workbench
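
The field mapping documented in the comments above can be expressed as a lookup table. The sketch below assumes a hypothetical function `to_catalog_record`; the labels are copied from the comments for the scalar fields only, and how the Workbench actually ingests them is not specified here.

```python
# Manifest key -> Workbench field label, copied from the comments above.
WORKBENCH_FIELDS = {
    "name": "Model Name",
    "version": "Model Version",
    "company": "Model Company",
    "repository_url": "Model Repository URL",
    "weights_url": "Model Weights URL",
    "documentation_url": "Model Documentation URL",
    "forum_url": "Model Forums/Discussion URL",
    "example_outputs_url": "Model Example Outputs URL",
    "model_type": "Model Type",
    "license_type": "Model License Type",
    "usage_category": "Usage Category",
    "is_trainable": "Is this model trainable?",
    "is_fine_tunable": "Is this model fine-tunable?",
}


def to_catalog_record(model: dict) -> dict:
    """Rename known manifest keys to Workbench labels; list the rest separately."""
    fields = {label: model[key] for key, label in WORKBENCH_FIELDS.items()
              if key in model}
    unmapped = sorted(set(model) - set(WORKBENCH_FIELDS))
    return {"fields": fields, "unmapped": unmapped}


result = to_catalog_record({
    "name": "text-to-video-gen",
    "version": "1.0.0",
    "model_type": "generative_video",
    "tasks": {"VFX": ["generative_FX"]},
})
print(result)
```

Reporting unmapped keys separately makes it visible when a manifest carries metadata that the catalog would not display.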

Domain-Specific Output Specifications

The SPI manifest includes detailed output specifications that align with the Workbench catalog categories:

# Video Output Specifications (aligns with Workbench Video Output fields)
output_specifications:
  video:
    resolution:
      - "1920x1080 (HD)"
      - "3840x2160 (4K)"
    frame_rate:
      - "24 fps"
      - "30 fps"
    bit_depth:
      - "8-bit"
    color_space:
      - "Rec. 709"
    color_profile:
      - "sRGB"

  # Text Output Specifications (if applicable)
  text:
    plain_text:
      - "natural_language"
    structured_text:
      - "JSON"

  # Audio Output Specifications (if applicable)
  audio:
    channels:
      - "stereo"
    sample_rate:
      - "48 kHz"
    bit_depth:
      - "24-bit"

Catalog Description Information

The SPI manifest includes fields for catalog display content:

# Catalog display information
catalog:
  thumbnail: "/app/assets/thumbnail.jpg"  # Will be extracted from container during ingestion
  summary: "High-quality text-to-video generation model capable of creating realistic videos from text prompts with optional image conditioning."
  media_carousel:
    - "/app/assets/example1.mp4"
    - "/app/assets/example2.mp4"
    - "/app/assets/example3.jpg"
  long_description: |
    This state-of-the-art text-to-video generation model creates high-quality, 
    temporally consistent videos from text prompts. The model supports various 
    resolutions up to 4K and can generate videos of up to 30 seconds in length.

    Key features:
    - High-quality video generation from text prompts
    - Support for image conditioning to guide video style
    - Adjustable parameters for motion strength and video quality
    - Fast generation using optimized diffusion techniques
    - Support for both creative and realistic video styles

  # Scorecard information for benchmark results
  scorecard:
    summary: "Excellent performance in visual quality and prompt adherence, with good temporal consistency."

Container Structure

Standard directory structure within the model container:

/app/
  ├── run.py                 # Main entry point script
  ├── manifest.yml           # SPI manifest
  ├── model/                 # Model code and weights
  │   ├── weights/           # Model weights
  │   └── code/              # Model-specific code
  ├── inputs/                # Input data directory (mounted at runtime)
  │   ├── files/             # Input files (optional, defined by manifest)
  │   └── config/            # Configuration files (optional)
  └── outputs/               # Output data directory (mounted at runtime)
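
A layout check for this structure might look like the following sketch. The function `missing_entries` is hypothetical. Treating inputs/ and outputs/ as required is also an assumption, since both are mounted at runtime and may not exist in the image itself.

```python
import tempfile
from pathlib import Path

# Required entries, taken from the tree above; the optional inputs/files and
# inputs/config subdirectories are deliberately not enforced.
REQUIRED_FILES = ["run.py", "manifest.yml"]
REQUIRED_DIRS = ["model/weights", "model/code", "inputs", "outputs"]


def missing_entries(root: Path) -> list:
    """Return the required files and directories that are absent under root."""
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


# Build a throwaway layout that lacks manifest.yml and outputs/.
with tempfile.TemporaryDirectory() as tmp:
    app = Path(tmp) / "app"
    for directory in ["model/weights", "model/code", "inputs"]:
        (app / directory).mkdir(parents=True)
    (app / "run.py").write_text("print('entry point')\n")
    report = missing_entries(app)

print(report)
```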

Python SDK Structure

The SDK provides utilities for developing, testing, and validating SPI-compliant models:

# Example SDK Structure
import nova_spi

# Create a new SPI project
project = nova_spi.Project.create("my-model")

# Define model metadata
project.set_metadata(
    name="my-model",
    version="1.0.0",
    description="Example text generation model",
    domain="text"
)

# Define input parameters
project.add_input(
    name="prompt",
    type="string",
    description="Input prompt",
    required=True
)

# Define output parameters
project.add_output(
    name="generated_text",
    type="string",
    description="Generated text output"
)

# Save manifest
project.save_manifest()

# Validate the project
validation_result = project.validate()
print(validation_result.summary())

# Build container
container = project.build_container()

# Test locally
test_result = container.test(inputs={"prompt": "Hello, world!"})
print(test_result.outputs["generated_text"])

# Submit to platform
submission = container.submit()
print(f"Submission ID: {submission.id}")

SPI Runtime Library

The SPI runtime library provides utilities for model execution within the container:

# /app/spi/utils.py
import json
import os
import sys
import logging
from pathlib import Path
from typing import Any, Dict

class SPIRuntime:
    """SPI runtime utilities for model execution."""

    def __init__(self, manifest_path="/app/spi/manifest.yml"):
        """Initialize SPI runtime with manifest path."""
        self.manifest = self._load_manifest(manifest_path)
        self.logger = self._setup_logger()

    def parse_inputs(self) -> Dict[str, Any]:
        """Parse inputs from environment and files."""
        inputs = {}
        for input_spec in self.manifest["inputs"]:
            name = input_spec["name"]
            env_name = f"INPUT_{name.upper()}"

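            if input_spec["type"] == "file":
                mount_path = input_spec.get("mount_path", "/app/inputs")
                pattern = input_spec.get("file_pattern", "*")
                # Note: Path.glob() does not expand brace sets such as
                # "*.{mp4,webm}"; such patterns must be expanded beforehand.
                # sorted() keeps the file order deterministic across runs.
                file_paths = sorted(Path(mount_path).glob(pattern))
                inputs[name] = [str(p) for p in file_paths]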
            else:
                # Parse other input types from environment variables
                if env_name in os.environ:
                    value = os.environ[env_name]
                    inputs[name] = self._parse_value(value, input_spec["type"])
                elif input_spec.get("required", False):
                    self.logger.error(f"Required input '{name}' not provided")
                    sys.exit(1)
                elif "default" in input_spec:
                    inputs[name] = input_spec["default"]

        return inputs

    def write_outputs(self, outputs: Dict[str, Any]):
        """Write outputs according to manifest specifications."""
        output_file = os.environ.get("SPI_OUTPUT_FILE", "/app/outputs/output.json")

        # Validate outputs against manifest
        output_data = {}

        for output_spec in self.manifest["outputs"]:
            name = output_spec["name"]
            if name not in outputs and output_spec.get("required", False):
                self.logger.error(f"Required output '{name}' not provided")
                sys.exit(1)

            if name in outputs:
                if output_spec["type"] == "file":
                    # Handle file outputs - collect files matching pattern
                    pattern = output_spec.get("file_pattern", "*")
                    output_data[name] = self._collect_output_files(pattern)
                else:
                    # Handle scalar outputs
                    output_data[name] = outputs[name]

        # Write output file
        os.makedirs(os.path.dirname(output_file), exist_ok=True)
        with open(output_file, "w") as f:
            json.dump(output_data, f)

        self.logger.info(f"Outputs written to {output_file}")

    def _load_manifest(self, path):
        """Load manifest from file."""
        import yaml
        try:
            with open(path) as f:
                return yaml.safe_load(f)
        except Exception as e:
            print(f"Error loading manifest: {e}", file=sys.stderr)
            sys.exit(1)

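    def _setup_logger(self):
        """Set up logging to stdout."""
        logger = logging.getLogger("spi")
        logger.setLevel(logging.INFO)
        # Attach the handler only once: getLogger() returns a shared instance,
        # so a second SPIRuntime would otherwise duplicate every log line.
        if not logger.handlers:
            handler = logging.StreamHandler(sys.stdout)
            handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
            logger.addHandler(handler)
        return logger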

    def _parse_value(self, value, type_name):
        """Parse string value to appropriate type."""
        try:
            if type_name == "string":
                return value
            elif type_name == "integer":
                return int(value)
            elif type_name == "float":
                return float(value)
            elif type_name == "boolean":
                return value.lower() in ("true", "yes", "1")
            elif type_name == "array":
                return json.loads(value)
            elif type_name == "object":
                return json.loads(value)
            else:
                return value
        except Exception as e:
            self.logger.error(f"Error parsing value '{value}' as {type_name}: {e}")
            sys.exit(1)

    def _collect_output_files(self, pattern):
        """Collect output files matching pattern, in sorted order."""
        output_dir = os.environ.get("SPI_OUTPUT_DIR", "/app/outputs")
        # sorted() keeps the reported file order deterministic across runs.
        paths = sorted(Path(output_dir).glob(pattern))
        return [str(p) for p in paths]
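
The runtime above reads non-file inputs from `INPUT_<NAME>` environment variables and decodes them in `_parse_value`. The caller's side of that convention could be sketched as follows; `build_input_env` is a hypothetical helper, and how the platform actually passes the variables to the container is not specified in this document.

```python
import json


def build_input_env(inputs: dict) -> dict:
    """Encode non-file inputs as INPUT_<NAME> strings that _parse_value can decode."""
    env = {}
    for name, value in inputs.items():
        if isinstance(value, bool):
            # Check bool before anything else: bool is a subclass of int.
            text = "true" if value else "false"
        elif isinstance(value, (list, dict)):
            # Arrays and objects travel as JSON, matching json.loads() above.
            text = json.dumps(value)
        else:
            text = str(value)
        env[f"INPUT_{name.upper()}"] = text
    return env


env = build_input_env({
    "prompt": "Hello, world!",
    "frame_rate": 30,
    "use_cache": False,
    "tags": ["demo", "short"],
})
print(env)
```

Encoding booleans as "true"/"false" matters because `_parse_value` treats only "true", "yes", and "1" as true, so Python's default "False" string would also decode correctly but "True" relies on the lower-casing step.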

UI/UX

CLI Interface

The CLI interface is designed for ML Engineers and Workbench Engineers who need to develop, test, and submit models from their workstations:

┌──────────────────────────────────────────────────────────┐
│ WALL-E SPI CLI                                           │
├──────────────────────────────────────────────────────────┤
│ Commands:                                                │
│   login          Authenticate with Wall-E infrastructure │
│   create         Create a new SPI project                │
│   init-env       Initialize development environment      │
│   validate       Validate manifest or project            │
│   scan           Run security and quality scans          │
│   build          Build container image                   │
│   test           Run tests on local container            │
│   benchmark      Run performance benchmarks              │
│   model [op]     Model-related operations                │
│   container [op] Container-related operations            │
│   submit         Submit model to platform                │
│   status         Check status of submitted model         │
│   help           Show help for commands                  │
└──────────────────────────────────────────────────────────┘

Sample CLI command arguments

# core features
Arguments:
  -v, --verbose           Verbose logging
  -vv, --very-verbose     Very verbose logging
  -s, --silent            Silent, only output errors
  -h, --help              Show help and exit
  --version               Show version and exit

# nova-spi login
Description: Authenticate with Wall-E infrastructure
Arguments:
  --username STRING       Username for authentication (optional, will prompt if not provided)
  --password STRING       Password for authentication (optional, will prompt if not provided)
  --token STRING          Authentication token (alternative to username/password)
  --server URL            Server URL (default: https://aiportal.icvr.io)

# nova-spi create
Description: Create a new SPI project
Arguments:
  NAME                    Project name (required)
  --domain STRING         Model domain [text|image|video|audio|multimodal] (required)
  --template STRING       Project template to use (optional)
  --description STRING    Project description (optional)
  --version STRING        Initial version (default: 0.1.0)
  --output-dir PATH       Directory to create project in (default: current directory)

# nova-spi init-env
Description: Initialize development environment & run the container
Arguments:
  --cuda-version STRING   CUDA version to use (default: 11.8)
  --python-version STRING Python version to use (default: 3.12)
  --base-image STRING     Docker base image (default: determined by domain)
  --force                 Force re-initialization of existing environment
  --with-jupyter          Include Jupyter notebook support
  --attach                Run VSCode and attach to the running container

# nova-spi validate
Description: Validate manifest or project
Arguments:
  PATH                    Path to manifest file or project directory (required)
  --strict                Perform strict validation (fail on warnings)
  --fix                   Attempt to fix validation issues
  --output PATH           Write validation report to file

# nova-spi scan
Description: Run security and quality scans
Arguments:
  PATH                    Path to project directory (required)
  --scan-type STRING      Type of scan [security|quality|all] (default: all)
  --severity STRING       Minimum issue severity to report [critical|high|medium|low|info] (default: medium)
  --output PATH           Write scan report to file
  --fix                   Attempt to fix issues automatically

# nova-spi build
Description: Build container image
Arguments:
  --tag STRING            Image tag (default: derived from manifest)
  --no-cache              Disable Docker build cache
  --platform STRING       Target platform (default: linux/amd64)
  --push                  Push image to registry after building
  --registry URL          Registry URL (default: from configuration)

# nova-spi test
Description: Run tests on local container
Arguments:
  --input KEY=VALUE       Input parameter (can be specified multiple times)
  --input-file PATH       File containing input parameters in JSON format
  --case STRING           Predefined test case from manifest
  --output PATH           Directory to save outputs (default: ./test_outputs)
  --debug                 Enable debug mode with verbose logging
  --interactive           Start an interactive session with the container

# nova-spi benchmark
Description: Run performance benchmarks
Arguments:
  --dataset PATH          Path to benchmark dataset (optional)
  --preset STRING         Benchmark preset [quick|standard|thorough] (default: standard)
  --iterations INT        Number of iterations (default: 10)
  --output PATH           Directory to save results (default: ./benchmark_results)
  --compare STRING        Compare against another model or baseline

# nova-spi submit
Description: Submit model to platform
Arguments:
  --ticket-type STRING    Ticket type [ingest|update] (default: ingest)
  --description STRING    Submission description (optional)
  --notes STRING          Additional notes for reviewers (optional)
  --skip-validation       Skip pre-submission validation
  --priority STRING       Submission priority [low|normal|high] (default: normal)

# nova-spi status
Description: Check status of submitted model
Arguments:
  ID                      Submission or ticket ID (required)
  --watch                 Continuously monitor status updates
  --output PATH           Write status report to file

# nova-spi help
Description: Show help for commands
Arguments:
  COMMAND                 Command to show help for (optional)
  --verbose               Show detailed help with examples
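
Several commands above accept a repeatable `--input KEY=VALUE` argument. One way to parse it with the standard library's argparse is sketched below for the `test` command only. This illustrates the argument shape and is not the CLI's actual implementation; `key_value` and `build_parser` are hypothetical names.

```python
import argparse


def key_value(text: str) -> tuple:
    """Split 'KEY=VALUE' on the first '=' so values may themselves contain '='."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected KEY=VALUE, got {text!r}")
    return key, value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser covering a subset of the 'test' command's arguments."""
    parser = argparse.ArgumentParser(prog="nova-spi")
    commands = parser.add_subparsers(dest="command", required=True)
    test = commands.add_parser("test", help="Run tests on local container")
    test.add_argument("--input", action="append", type=key_value, default=[],
                      metavar="KEY=VALUE", help="Input parameter (repeatable)")
    test.add_argument("--output", default="./test_outputs",
                      help="Directory to save outputs")
    test.add_argument("--debug", action="store_true", help="Enable debug mode")
    return parser


args = build_parser().parse_args(
    ["test", "--input", "prompt=a=b", "--input", "steps=4", "--debug"]
)
inputs = dict(args.input)
print(args.command, inputs, args.output, args.debug)
```

Values arrive as strings; converting them to the types declared in the manifest would be a separate step.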

Sample CLI model command arguments

Note: This functionality is planned for the v2 pass of SPI. The design is not final and will be revisited after v1 is complete.

$ nova-spi help model

WALL-E SPI - Model Management Commands
======================================

Model commands manage AI models within the WALL-E platform, allowing you to list, inspect,
execute, and manage available models.

USAGE:
  nova-spi model [COMMAND] [OPTIONS]

COMMANDS:
  list        List available models
  info        Get detailed information about a model
  run         Execute a model
  pull        Download a model for local use
  push        Upload a local model to the registry
  versions    List available versions of a model
  api-info    Get API access information for a model
  logs        View execution logs for a model

EXAMPLES:

  # List all available models
  $ nova-spi model list

  # List models filtered by domain
  $ nova-spi model list --domain video

  # Get detailed information about a model
  $ nova-spi model info model-34567

  # Get detailed information about a specific model version
  $ nova-spi model info text-generator:1.2.0

  # Execute a model with required inputs
  $ nova-spi model run model-34567 --input prompt="A serene mountain lake at sunset"

  # Execute a model with multiple parameters and save to output directory
  $ nova-spi model run model-34567 \
    --input prompt="A serene mountain lake at sunset" \
    --input resolution="1080p" \
    --input duration=10 \
    --output ./outputs

  # Upload a file for use as model input
  $ nova-spi model run model-34567 \
    --upload-input ./reference.mp4 \
    --input prompt="Enhance this video" \
    --output ./outputs

  # Stream logs during model execution
  $ nova-spi model run model-34567 --input prompt="Test" --output ./outputs --watch-logs

  # Get API access information for a model
  $ nova-spi model api-info model-34567

  # Pull a model for local use
  $ nova-spi model pull text-generator:1.2.0

  # Get model runtime logs
  $ nova-spi model logs model-34567 --execution-id exec-12345

  # List available versions of a model
  $ nova-spi model versions text-generator

COMMAND DETAILS:

LIST OPTIONS:
  --domain STRING       Filter by model domain [text|image|video|audio|multimodal]
  --status STRING       Filter by status [active|inactive|deprecated]
  --approved            Show only approved models
  --tags TAG[,TAG...]   Filter by tags
  --limit NUMBER        Limit number of results (default: 20)
  --output FORMAT       Output format [table|json|yaml] (default: table)
  --sort-by FIELD       Sort results by field [name|date|rating] (default: name)

INFO OPTIONS:
  MODEL_ID              Model ID or name:version (required)
  --format FORMAT       Output format [detailed|summary|json|yaml] (default: detailed)
  --show-manifest       Include full manifest in output
  --show-benchmarks     Include benchmark results in output

RUN OPTIONS:
  MODEL_ID              Model ID or name:version (required)
  --input KEY=VALUE     Input parameter (can be specified multiple times)
  --input-file PATH     Read inputs from JSON/YAML file
  --upload-input PATH   Upload file to use as input (can be specified multiple times)
  --temp-storage        Use temporary storage for artifacts (auto-cleanup)
  --output PATH         Directory to save outputs (default: ./outputs)
  --watch-logs          Stream logs during execution
  --execution-id ID     Specify custom execution ID for tracking
  --async               Run asynchronously and return immediately
  --timeout SECONDS     Execution timeout in seconds (default: 600)
  --gpu-count NUMBER    Number of GPUs to use (default: from manifest)
  --memory STRING       Memory allocation (default: from manifest)

PULL OPTIONS:
  MODEL_ID              Model ID or name:version (required)
  --dest PATH           Destination directory (default: ./models)
  --with-weights        Include model weights (may be large)
  --cache               Use cached version if available

PUSH OPTIONS:
  PATH                  Path to model directory (required)
  --name STRING         Model name (required if not in manifest)
  --version STRING      Version (required if not in manifest)
  --force               Force push even if model exists
  --skip-validation     Skip pre-push validation

VERSIONS OPTIONS:
  MODEL_NAME            Model name (required)
  --limit NUMBER        Limit number of versions (default: 10)
  --include-deprecated  Include deprecated versions

API-INFO OPTIONS:
  MODEL_ID              Model ID or name:version (required)
  --format FORMAT       Output format [curl|python|javascript|json] (default: curl)
  --generate-token      Generate temporary access token
  --token-expires MINS  Token expiration in minutes (default: 60)

LOGS OPTIONS:
  MODEL_ID              Model ID or name:version (required)
  --execution-id ID     Execution ID (required)
  --follow              Stream logs in real-time
  --tail NUMBER         Show only last N lines (default: all)
  --filter STRING       Filter logs by pattern
  --level LEVEL         Minimum log level [debug|info|warning|error] (default: info)
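The `--input KEY=VALUE` option listed under RUN OPTIONS can be repeated. As a minimal sketch of how such pairs might be folded into a parameter dictionary before validation (the helper name `parse_input_pairs` and the integer coercion are illustrative assumptions, not the actual behaviour of the SPI CLI):

```python
def parse_input_pairs(pairs):
    """Turn ["prompt=Hello", "duration=10"] into a parameter dictionary.

    Hypothetical helper: the real CLI may parse and coerce values differently.
    """
    params = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        try:
            params[key] = int(value)  # plain integers become int
        except ValueError:
            params[key] = value  # everything else stays a string
    return params


inputs = parse_input_pairs([
    "prompt=A serene mountain lake at sunset",
    "resolution=1080p",
    "duration=10",
])
print(inputs)
```

Splitting on the first `=` only keeps prompts that contain an equals sign intact; stricter type handling would more naturally come from the manifest's parameter definitions.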

Example CLI workflows

Creating and Testing a New Model:

# Note: the user must first authenticate with nova-spi login
# Create new project
$ nova-spi create my-text-model --domain text
Created new SPI project 'my-text-model'

# Initialize development environment
$ cd my-text-model
$ nova-spi init-env
Setting up development environment...
Created Docker development environment
Installed dependencies
Created template files
Development environment ready!

# Validate manifest
$ nova-spi validate manifest.yml
✓ Manifest is valid
ℹ Suggestions:
  - Add more detailed description
  - Consider adding example test cases

# Run security scan
$ nova-spi scan .
Running security scan...
✓ No critical vulnerabilities found
ℹ 3 medium vulnerabilities detected (see report.html)

# Build container
$ nova-spi build
Building container image...
Step 1/15 : FROM pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime
...
Successfully built my-text-model:1.0.0

# Test locally
$ nova-spi test --input prompt="Hello, world!"
Running test with inputs:
  prompt: Hello, world!
✓ Test passed
Output:
  generated_text: "Hello, world! It's a beautiful day..."

# Run benchmarks
$ nova-spi benchmark
Running benchmarks...
✓ Completed 100 requests in 25.3s
  Average latency: 253ms
  P95 latency: 312ms
  Throughput: 3.95 req/s
  Memory usage: 4.2GB
  Full report saved to 'benchmark_report.html'

# Submit to platform
$ nova-spi submit
Uploading model (23MB)... Done!
Submitting to Workbench... Done!
✓ Model submitted successfully
  Submission ID: WB-20250423-127
  Status: Pending review
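The figures in the benchmark step of the workflow above are related: 100 requests in 25.3 s gives 100 / 25.3 ≈ 3.95 req/s and, if requests are issued one at a time, an average latency of 253 ms. A minimal sketch of how such statistics might be derived from raw per-request latencies follows. The function name, the sequential-request assumption, and the nearest-rank P95 method are all assumptions; the actual benchmark tool may compute these differently.

```python
def summarize_latencies(latencies_ms):
    """Summarize latencies of sequentially issued requests (hypothetical helper)."""
    if not latencies_ms:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies_ms)
    count = len(ordered)
    total_s = sum(ordered) / 1000.0
    # Nearest-rank percentile: ceil(0.95 * count) computed in integer arithmetic.
    p95_index = (95 * count + 99) // 100 - 1
    return {
        "requests": count,
        "total_s": total_s,
        "avg_ms": sum(ordered) / count,
        "p95_ms": ordered[p95_index],
        "throughput_rps": count / total_s,
    }


# 100 sequential requests averaging 253 ms, as in the sample output
stats = summarize_latencies([253.0] * 100)
print(f"{stats['throughput_rps']:.2f} req/s")
```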

VSCode Integration

The SPI SDK includes VSCode integration for an improved development experience:

  • Syntax highlighting for manifest files
  • Schema validation for manifest editing
  • Code snippets for common patterns
  • Integrated testing and validation
  • Container management commands

ComfyUI Integration

A module (adapter) will be developed to execute any available model as a node in the ComfyUI interface. This will make our models available in the AI workflow pipelines.

Application Architecture

System Components

The SPI framework consists of the following components:

  1. SPI Core Library
    • Manifest parsing and validation
    • Container management
    • Execution orchestration
    • Parameter handling
  2. SPI SDK
    • Python library for model integration
    • Development utilities
    • Local testing framework
    • Submission client
  3. SPI CLI
    • Command-line interface to SDK
    • Development workflow automation
    • Container and environment management
    • Platform integration
  4. Validation Services
    • Security scanning
    • Template testing
    • Performance benchmarking
    • Code quality assessment
  5. Containerization Layer
    • Docker-based model packaging
    • Standard base images
    • Resource management
    • Execution environment

Architecture Diagram

---
config:
  layout: elk
---
flowchart TD
    subgraph subGraph0["ML Engineer Workstation"]
        CLI["SPI CLI"]
        SDK["SPI SDK"]
        Docker["Local Docker Engine"]
    end
    subgraph aiportal["AI.Portal Platform"]
        API["API Gateway"]
        WB["Workbench"]
        VS["Validation Services"]
        TM["Task Manager"]
        BS["Benchmark Service"]
        MC["Model Catalog"]
        DS["Deployment Service"]
        CR["Container Registry"]
    end
    CLI --> SDK
    SDK --> Docker
    API --> WB
    WB --> VS & TM
    VS --> BS
    TM --> MC
    DS --> MC & CR
    CLI -. RBAC 🔒 .-> API
    Docker -. RBAC 🔒 .-> API & CR

Container Execution Flow

The execution flow for running an SPI-compliant model:

  1. Preparation Stage
    • Container image is retrieved from registry
    • Input parameters are validated against manifest
    • Input files are staged to appropriate locations
    • Resource requirements are allocated
  2. Execution Stage
    • Container is launched with appropriate mounts and environment variables
    • Entry point command is executed
    • Standard output and error streams are captured
    • Resource usage is monitored
  3. Output Processing Stage
    • Output files are collected based on manifest patterns
    • Output data is validated against specified schemas
    • Results are packaged according to output specifications
    • Execution metrics are recorded
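The input validation in the preparation stage and the pattern-based collection in the output processing stage can be sketched in a few lines. The manifest layout used here (an inputs list with `name`, `type`, and `required` keys, plus a list of output glob patterns) is an assumption for illustration only; the authoritative structure is the one defined in Appendix E.

```python
from pathlib import Path

# Hypothetical mapping from manifest type names to Python types.
TYPE_MAP = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_inputs(input_specs, provided):
    """Return a list of problems found when checking inputs against the manifest."""
    errors = []
    known = {spec["name"] for spec in input_specs}
    for spec in input_specs:
        name = spec["name"]
        type_name = spec.get("type", "string")
        if name not in provided:
            if spec.get("required", False):
                errors.append(f"missing required input: {name}")
            continue
        if not isinstance(provided[name], TYPE_MAP.get(type_name, str)):
            errors.append(f"input {name!r} should be of type {type_name}")
    errors.extend(f"unknown input: {name}" for name in provided if name not in known)
    return errors


def collect_outputs(output_dir, patterns):
    """Collect the files under output_dir that match the manifest's glob patterns."""
    root = Path(output_dir)
    found = set()
    for pattern in patterns:
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)


specs = [
    {"name": "prompt", "type": "string", "required": True},
    {"name": "duration", "type": "integer"},
]
print(validate_inputs(specs, {"duration": "ten"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid parameter in a single pass.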

Technical Implementation

SPI Manifest Schema

The SPI manifest schema defines the structure and constraints for model metadata, execution configuration, and input/output parameters. See Appendix E: SPI Manifest Schema.

Python SDK Implementation

See Appendix F: Python SDK Implementation for sample code structure of the core SDK classes.

SPI Integration with AI.Portal Workbench Workflow

The SPI framework is designed to integrate seamlessly with the Workbench workflow, supporting the complete model evaluation and ingestion process. This integration ensures that models processed through the SPI framework meet all the requirements for inclusion in the AI.Portal catalog.

Ticket Creation and Initial Evaluation

During the Ticket Creation and Initial Evaluation phases of the Workbench workflow, the SPI framework extracts relevant metadata from the model container and manifest to pre-populate fields in the Workbench interface:

# Example code for extracting Workbench-compatible metadata from SPI manifest
import yaml


def extract_workbench_metadata(manifest_path):
    """Extract metadata from SPI manifest for Workbench ticket creation."""
    with open(manifest_path, 'r') as f:
        manifest = yaml.safe_load(f)

    # Extract basic model information
    metadata = {
        "model_name": manifest["model"]["name"],
        "model_version": manifest["model"]["version"],
        "model_company": manifest["model"].get("company", ""),
        "model_url": manifest["model"].get("repository_url", ""),
        "model_type": manifest["model"].get("model_type", ""),
        "requested_use_cases": manifest["model"].get("tasks", {}),
        "usage_category": manifest["model"].get("usage_category", ""),
        "usage_description": manifest["model"].get("description", ""),

        # Additional URLs
        "model_repository_url": manifest["model"].get("repository_url", ""),
        "model_weights_url": manifest["model"].get("weights_url", ""),
        "model_documentation_url": manifest["model"].get("documentation_url", ""),
        "model_forums_url": manifest["model"].get("forum_url", ""),
        "model_example_outputs_url": manifest["model"].get("example_outputs_url", ""),

        # Model I/O information
        "model_inputs": manifest["model"].get("inputs", []),
        "model_outputs": manifest["model"].get("outputs", []),

        # Licensing and copyright information
        "model_license_type": manifest["model"].get("license_type", ""),
        "trained_on_copyrighted_data": manifest["model"].get("trained_on_copyrighted_data", "Unsure"),

        # Output specifications
        "output_specifications": manifest.get("output_specifications", {}),

        # Additional technical information
        "is_trainable": manifest["model"].get("is_trainable", "Unsure"),
        "is_fine_tunable": manifest["model"].get("is_fine_tunable", "Unsure"),
        "user_sentiment": manifest["model"].get("user_sentiment", "Unknown"),
        "technical_dependencies": manifest["model"].get("technical_dependencies", [])
    }

    return metadata

The Workbench interface can use this extracted metadata to pre-populate the Initial Evaluation forms, allowing the Workbench Engineer to verify and supplement the information as needed.
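Because many fields fall back to an empty value or to a placeholder such as "Unsure" or "Unknown", the pre-populated form could flag which entries still need the Workbench Engineer's attention. A minimal sketch, assuming a dictionary shaped like the one returned by `extract_workbench_metadata` above (the helper `fields_needing_review` is hypothetical and not part of the SDK):

```python
# Placeholder defaults used by the extraction code when a field is absent.
PLACEHOLDERS = {"", "Unsure", "Unknown"}


def fields_needing_review(metadata):
    """List metadata fields that are empty or still hold a placeholder default."""
    pending = []
    for field, value in metadata.items():
        if value is None:
            pending.append(field)
        elif isinstance(value, (dict, list)) and not value:
            pending.append(field)
        elif isinstance(value, str) and value.strip() in PLACEHOLDERS:
            pending.append(field)
    return sorted(pending)


sample = {
    "model_name": "text-generator",
    "model_version": "1.2.0",
    "model_license_type": "",
    "is_trainable": "Unsure",
    "user_sentiment": "Unknown",
    "technical_dependencies": [],
}
print(fields_needing_review(sample))
```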

Technical Evaluation with SPI

The Technical Evaluation phase of the Workbench workflow leverages the SPI framework for container-based model validation:

  1. Container Preparation
    • SPI container is loaded with model code and weights
    • SPI manifest is validated against schema requirements
    • Environment variables are configured based on model specifications
  2. Automated Testing
    • SPI test cases from manifest are executed
    • Input/output validation is performed
    • Resource utilization is monitored and recorded
  3. Template Testing
    • Standard domain-specific test templates are applied
    • Results are captured and analyzed
    • Comparison to reference outputs is performed
  4. Integration with Workbench UI
    • Test results are displayed in the Workbench interface
    • Pass/fail indicators are shown for each test case
    • Resource utilization metrics and performance statistics are reported

The SPI CLI provides commands specifically for the Workbench Technical Evaluation workflow:

# Execute the technical evaluation for a model
$ nova-spi evaluate /path/to/model/container
Running technical evaluation...
✓ Manifest validation passed
✓ Security scan passed (2 low-severity issues found)
Running test cases:
  - basic_generation: ✓ Passed
  - complex_prompt: ✓ Passed
  - edge_case_long_input: ✓ Passed
Resource utilization:
  - Peak memory: 8.4 GB
  - Average GPU utilization: 78%
  - Total execution time: 128.5s
Technical evaluation passed successfully!

Model Scoring and Benchmarking

Note: The detailed implementation of benchmarking services is still in development. The current design outlines the conceptual framework, but specific implementation details, metrics, and evaluation methodologies will be refined in future iterations. The benchmarking system will initially focus on core metrics like performance, resource utilization, and output quality across various model domains, with domain-specific metrics added progressively.

The SPI framework integrates with the Model Scoring phase of the Workbench workflow:

  1. Benchmark Configuration
    • Benchmark parameters are loaded from SPI manifest
    • Domain-specific metrics are configured based on model type
    • Test cases are prepared based on benchmark specifications
  2. Benchmark Execution
    • Benchmark tests are executed in standardized environment
    • Results are captured and normalized
    • Comparison to baseline models is performed
  3. Score Generation
    • Individual metrics are calculated and weighted
    • Overall score is generated based on weighted metrics
    • Performance relative to similar models is determined
  4. Scorecard Generation
    • Visual representation of benchmark results
    • Detailed breakdown by metric category
    • Recommendations for model usage

Example output from the SPI benchmarking tool for a text-to-video model:

Text-to-Video Model Benchmark Results
Overall Score: 89% (Excellent)
Ranked #2 in generative video models

Detail Scores:
┌────────────────────────────┬───────────┬──────────┬──────────────┐
│ Metric                     │ Score     │ Weight   │   Weighted   │
├────────────────────────────┼───────────┼──────────┼──────────────┤
│ Visual Quality             │ 92%       │ 30%      │ 27.6%        │
│ Temporal Consistency       │ 84%       │ 20%      │ 16.8%        │
│ Prompt Adherence           │ 93%       │ 30%      │ 27.9%        │
│ Generation Speed           │ 87%       │ 10%      │ 8.7%         │
│ Resource Efficiency        │ 78%       │ 10%      │ 7.8%         │
└────────────────────────────┴───────────┴──────────┴──────────────┘

Resource Utilization:
- Average generation time: 15.2s per 5s video
- Peak memory usage: 18.2 GB
- GPU utilization: 92%

Strengths:
- Excellent visual quality and detail
- Strong adherence to text prompts
- Good handling of complex scenes

Areas for Improvement:
- Occasional temporal flickering in high-motion scenes
- Higher than average memory requirements
- Limited support for longer video generation

Benchmark completed in 32m 48s with 50 test prompts.
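The overall score in the example scorecard is the weighted sum of the individual metric scores: 27.6 + 16.8 + 27.9 + 8.7 + 7.8 = 88.8%, reported as 89%. A minimal sketch of that calculation, using the numbers from the table (the function name and the check that weights sum to 1 are assumptions, not a description of the benchmarking service):

```python
# (score in percent, weight) pairs copied from the example scorecard
METRICS = {
    "Visual Quality": (92, 0.30),
    "Temporal Consistency": (84, 0.20),
    "Prompt Adherence": (93, 0.30),
    "Generation Speed": (87, 0.10),
    "Resource Efficiency": (78, 0.10),
}


def overall_score(metrics):
    """Weighted sum of metric scores; weights are expected to sum to 1."""
    total_weight = sum(weight for _, weight in metrics.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(score * weight for score, weight in metrics.values())


score = overall_score(METRICS)
print(f"Overall score: {score:.1f}% (reported as {round(score)}%)")
```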

Metadata Extraction for Catalog

The SPI framework includes utilities for extracting catalog-ready metadata from the model container:

# Example code for extracting catalog metadata from SPI container
import io
import os
import tarfile
import tempfile

import docker
import yaml


def copy_from_container(container, src_path, dest_dir):
    """Copy a single file out of the container into dest_dir."""
    # get_archive() returns a tar stream, not the raw file contents
    stream, _ = container.get_archive(src_path)
    with tarfile.open(fileobj=io.BytesIO(b"".join(stream))) as tar:
        member = tar.getmembers()[0]
        dest_path = os.path.join(dest_dir, os.path.basename(member.name))
        with open(dest_path, 'wb') as f:
            f.write(tar.extractfile(member).read())
    return dest_path


def extract_catalog_metadata(container_id):
    """Extract catalog metadata from SPI container."""
    client = docker.from_env()
    container = client.containers.get(container_id)

    # Persistent temporary directory: the returned paths must outlive this
    # function, so the caller is responsible for cleaning it up
    temp_dir = tempfile.mkdtemp(prefix='spi-catalog-')

    # Extract manifest
    manifest_path = copy_from_container(container, '/app/spi/manifest.yaml', temp_dir)

    # Extract catalog information from manifest
    manifest_metadata = extract_workbench_metadata(manifest_path)

    # Extract thumbnail (optional)
    try:
        manifest_metadata['thumbnail_path'] = copy_from_container(
            container, '/app/assets/thumbnail.jpg', temp_dir)
    except docker.errors.NotFound:
        pass  # No thumbnail found

    # Extract media carousel files; assumes the manifest lists them under a
    # top-level 'catalog' section, which extract_workbench_metadata() does not return
    with open(manifest_path, 'r') as f:
        catalog = (yaml.safe_load(f) or {}).get('catalog', {})

    media_paths = []
    for media_file in catalog.get('media_carousel', []):
        try:
            media_paths.append(copy_from_container(container, media_file, temp_dir))
        except docker.errors.APIError:
            pass  # File not found or access error

    manifest_metadata['media_paths'] = media_paths
    return manifest_metadata

This extracted metadata can be used to populate the AI.Portal catalog entry for the model, ensuring consistency between the SPI manifest and the catalog display.

SPI Command-Line Integration with Workbench

The SPI CLI includes specific commands for interacting with the Workbench workflow:

# Submit model to Workbench for evaluation
$ nova-spi submit --ticket-type ingest text-to-video-gen:1.0.0
Uploading model (345MB)... Done!
Submitting to Workbench... Done!
✓ Model submitted successfully
  Ticket ID: WB-20250423-127
  Status: Pending initial evaluation
  Access URL: https://aiportal.icvr.io/workbench/tickets/WB-20250423-127

# Check status of Workbench ticket
$ nova-spi ticket status WB-20250423-127
Ticket: WB-20250423-127
Model: text-to-video-gen:1.0.0
Status: In technical evaluation
Current step: Template testing
Last updated: 2025-04-23 14:32:21
Assigned to: Chris Swiatek

# Attach additional information to ticket
$ nova-spi ticket update WB-20250423-127 --add-file examples/high_res_sample.mp4
Uploading file (28MB)... Done!
File attached to ticket WB-20250423-127

These commands enable seamless integration between local model development using the SPI framework and the Workbench evaluation workflow.

Automatic SPI Integration Using an AI Assistant

This functionality is planned for a later version. It will include:

  • Automatic analysis of the code base & relevant information resources to identify:
    • Input & output parameters
    • Model metadata & limitations
    • Inference code
    • Hardware & software requirements
    • Dependencies
  • Based on this analysis we can generate:
    • Development environment for this specific model
    • SPI integration files
    • Tips on how to proceed with the model ingest