Kling Video Models

Kling 2.6 Pro Image-to-Video

Latest image-to-video model from Kuaishou with sound generation, enhanced dynamics, and cinematic quality.

$0.0595/s video

Kling Video O1 Text-to-video

Kling Omni Video O1 is Kuaishou's first unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Text-to-Video mode generates cinematic videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. Available through a ready-to-use REST API with no cold starts.

$0.0952/s video

Kling Video O1 Reference-to-video

Kling Omni Video O1 Reference-to-Video generates creative videos using character, prop, or scene references from multiple viewpoints. It extracts subject features and creates new video content while maintaining identity consistency across frames. Available through a ready-to-use REST API with no cold starts.

$0.0952/s video

Kling Video O1 Image-to-video

Kling Omni Video O1 Image-to-Video transforms static images into dynamic cinematic videos using MVL (Multi-modal Visual Language) technology. It maintains subject consistency while adding natural motion, physics simulation, and seamless scene dynamics. Available through a ready-to-use REST API with no cold starts.

$0.0952/s video

Kling v2.5 Turbo Pro Text-to-video

Delivers high-speed text-to-video generation with cinematic motion precision and enhanced temporal stability.

$0.0595/s video

Kling v2.5 Turbo Pro Image-to-video

Transforms stills into lifelike video clips at 2× faster speed while preserving fine texture and lighting consistency.

$0.0595/s video

Kling v2.1 i2v Pro Start-end-frame

Supports start-to-end frame conditioning for controlled motion continuity and smoother scene transitions.

Kling v1.6 Multi i2v Pro

Generates multi-subject video from images with improved coherence and advanced motion-tracking accuracy.

Kling v1.6 Multi i2v Standard

A cost-efficient option for basic image-to-video generation with balanced speed and detail.

Kling Effects

Adds post-processing and stylistic motion effects, expanding creative editing within Kling’s video suite.

Kling v2.0 i2v Master

Produces cinematic 1080p clips with refined lighting, camera realism, and cross-frame character stability.

Kling Lipsync Text-to-Video

Animates lip movements directly from text, enabling natural dialogue and speech-aligned video synthesis.

$0.0238/s video

Kling v2.1 t2v Master

Interprets complex text prompts with advanced motion logic and enhanced dynamic-camera rendering.

Kling v2.0 t2v Master

The foundational cinematic model combining high-fidelity visuals with realistic human motion generation.

Kling Lipsync audio-to-video

Synchronizes facial motion with real audio input for expressive, speech-driven video avatars.

$0.1275/s video

Kling v2.1 i2v Master

Delivers professional-grade image-to-video generation with precise motion continuity and visual depth.

Kling v2.1 i2v Pro

Balances generation speed and fidelity, producing sharp, fluid image-to-video results for general creative use.

Kling v1.6 t2v Standard

Entry-level text-to-video generator offering stable motion and prompt alignment for short-form outputs.

$0.0382/s video

Kling v1.6 i2v Pro

Upgraded image-to-video variant with smoother motion blending and improved texture realism.

Kling v2.1 i2v Standard

A fast, reliable 720p model optimized for quick visual drafts and efficient prototyping.

Kling v1.6 i2v Standard

Lightweight early-generation model providing foundational image-to-video conversion at minimal cost.

Kling Video O1 Video-edit-fast

Kling Omni Video O1 Video-Edit enables conversational video editing through natural language commands. Remove objects, change backgrounds, modify styles, adjust weather or lighting, and transform scenes with simple text instructions like 'remove pedestrians' or 'change daytime to dusk'. Available through a ready-to-use REST API with no cold starts.

$0.3825/s video

Kling Video O1 Video-edit

$0.7140/s video
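Because the listed prices are quoted per second of video, a rough budget for a clip can be worked out with simple multiplication. The sketch below is illustrative only: the `PRICE_PER_SECOND` table copies a few of the rates listed above, the `estimate_cost` helper is a hypothetical name, and it assumes billing scales linearly with output duration with no minimum charge or rounding, which the listing does not confirm.

```python
from decimal import Decimal

# USD per second of generated video, copied from the listing above.
PRICE_PER_SECOND = {
    "Kling 2.6 Pro Image-to-Video": Decimal("0.0595"),
    "Kling Video O1 Text-to-video": Decimal("0.0952"),
    "Kling v2.5 Turbo Pro Text-to-video": Decimal("0.0595"),
    "Kling Lipsync Text-to-Video": Decimal("0.0238"),
    "Kling Video O1 Video-edit-fast": Decimal("0.3825"),
    "Kling Video O1 Video-edit": Decimal("0.7140"),
}


def estimate_cost(model: str, seconds: float, clips: int = 1) -> Decimal:
    """Estimate USD cost, assuming billing is linear in output seconds."""
    if seconds <= 0 or clips <= 0:
        raise ValueError("seconds and clips must be positive")
    rate = PRICE_PER_SECOND[model]  # raises KeyError if no price is listed
    # Convert via str() so a float such as 7.5 does not carry binary noise.
    return rate * Decimal(str(seconds)) * clips


if __name__ == "__main__":
    for name in ("Kling Video O1 Video-edit-fast", "Kling Video O1 Video-edit"):
        print(f"{name}: ${estimate_cost(name, 10)} for a 10-second clip")
```

Using `Decimal` rather than `float` keeps the four-decimal prices exact, which matters when per-second rates are summed over many clips.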

Features - Kling Video Models

Advanced Prompt Comprehension

Accurately interprets complex text, actions, and camera cues for coherent, story-driven output.

Fluid, Realistic Motion

Enhanced spatiotemporal modeling produces natural character movement and cinematic flow.

4K-Level Visual Quality

Generates detailed 1080p and early-4K clips with stable lighting, texture, and depth.

Dynamic Scene Editing

Add, swap, or remove subjects and objects using simple text or image inputs.

Precise Frame Control

Adjust camera angles, timing, and transitions with frame-level accuracy.

Unified t2v + i2v Pipeline

Integrates text-to-video and image-to-video generation with seamless temporal consistency.

What You Can Do - Kling Video Models

Generate realistic video sequences from simple text prompts.

Transform photos into expressive video clips with motion continuity.

Achieve scene-level coherence ideal for storytelling, advertising, and visual effects.

Produce 16:9, 9:16, or square-format cinematic outputs for social or production use.

Iterate fast between Standard, Pro, and Master modes to balance speed and quality.

Run Kling Models

Why Use Kling Video Models on Atlas Cloud

Combining the advanced Kling Video Models with Atlas Cloud's GPU-accelerated platform delivers unmatched performance, scalability, and developer experience.

Kling Effects running on Atlas Cloud, showing how AI transforms a single frame into diverse motion styles.

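Performance and Flexibility

Low latency:
GPU-optimized inference for real-time responses.

Unified API:
Run Kling Video Models, GPT, Gemini, and DeepSeek through a single integration.

Transparent pricing:
Predictable per-token billing with serverless options.

Enterprise and Scale

Developer experience:
SDKs, analytics, fine-tuning tools, and templates.

Reliability:
99.99% uptime, RBAC, and compliance-ready logging.

Security and compliance:
SOC 2 Type II, HIPAA compliance, and data sovereignty within the United States.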

Explore More Families

Z.ai LLM Models

The Z.ai LLM family pairs strong language understanding and reasoning with efficient inference to keep costs low, offering flexible deployment and tooling that make it easy to customize and scale advanced AI across real-world products.

Seedance 1.5 Video Models

Seedance is ByteDance’s family of video generation models, built for speed, realism, and scale. Its AI analyzes motion, setting, and timing to generate matching ambient sounds, then adds creative depth through spatial audio and atmosphere, making each video feel natural, immersive, and story-driven.

Moonshot LLM Models

The Moonshot LLM family delivers cutting-edge performance on real-world tasks, combining strong reasoning with ultra-long context to power complex assistants, coding, and analytical workflows, making advanced AI easier to deploy in production products and services.

Wan2.6 Video Models

Wan 2.6 is Alibaba’s state-of-the-art multimodal video generation model, capable of producing high-fidelity, audio-synchronized videos from text or images. It lets you create videos of up to 15 seconds while preserving narrative flow and visual integrity, making it well suited to YouTube Shorts, Instagram Reels, Facebook clips, and TikTok videos.

Flux.2 Image Models

The Flux.2 Series is a comprehensive family of AI image generation models. Across the lineup, Flux supports text-to-image, image-to-image, reconstruction, contextual reasoning, and high-speed creative workflows.

Nano Banana Image Models

Nano Banana is a fast, lightweight image generation model for playful, vibrant visuals. Optimized for speed and accessibility, it creates high-quality images with smooth shapes, bold colors, and clear compositions—perfect for mascots, stickers, icons, social posts, and fun branding.

Image and Video Tools

Open, advanced large-scale image generative models that power high-fidelity creation and editing with modular APIs, reproducible training, built-in safety guardrails, and elastic, production-grade inference at scale.

LTX-2 Video Models

LTX-2 is a complete AI creative engine. Built for real production workflows, it delivers synchronized audio and video generation, 4K video at 48 fps, multiple performance modes, and radical efficiency, all with the openness and accessibility of running on consumer-grade GPUs.

Qwen Image Models

Qwen-Image is Alibaba’s open image generation model family. Built on advanced diffusion and Mixture-of-Experts design, it delivers cinematic quality, controllable styles, and efficient scaling, empowering developers and enterprises to create high-fidelity media with ease.

OpenAI Model Families

Explore OpenAI’s language and video models on Atlas Cloud: ChatGPT for advanced reasoning and interaction, and Sora-2 for physics-aware video generation.

Hailuo Video Models

MiniMax Hailuo video models deliver text-to-video and image-to-video at native 1080p (Pro) and 768p (Standard), with strong instruction following and realistic, physics-aware motion.

Wan2.5 Video Models

Wan 2.5 is Alibaba’s state-of-the-art multimodal video generation model, capable of producing high-fidelity, audio-synchronized videos from text or images. It delivers realistic motion, natural lighting, and strong prompt alignment across 480p to 1080p outputs—ideal for creative and production-grade workflows.
