247 models in total; 48 currently shown

Latest

Seed3D 2.0 Image-to-3D

ByteDance Seed3D 2.0 — generates a textured, PBR-shaded 3D model (glb/obj/usd/usdz) from a single input image. Returns a downloadable .zip archive containing the 3D file.

Midjourney V8.1 Image-to-Video

Midjourney V8.1 animates an input image into four 5-second videos at 480p or 720p.

Midjourney V8.1 Text-to-Image

Midjourney V8.1 generates four images from a text prompt, with optional native 2K HD, a style reference, and aspect-ratio / stylize / chaos / weird controls.

Hunyuan 3D Rapid Image-to-3D

Tencent Hunyuan 3D Rapid (Express) — fast lightweight 3D mesh generation from a single image, with optional PBR materials. Outputs GLB/OBJ/USDZ/FBX/STL/MP4.

Hunyuan 3D Rapid Text-to-3D

Tencent Hunyuan 3D Rapid (Express) — fast lightweight 3D mesh generation from a text prompt, with optional PBR materials. Outputs GLB/OBJ/USDZ/FBX/STL/MP4.

Hunyuan 3D Pro Image-to-3D

Tencent Hunyuan 3D Pro (v3.1) — high-quality textured 3D mesh generation from a single image, with optional PBR materials and custom face count. Outputs GLB/OBJ/USDZ/FBX/STL.

Hunyuan 3D Pro Text-to-3D

Tencent Hunyuan 3D Pro (v3.1) — high-quality textured 3D mesh generation from a text prompt, with optional PBR materials and custom face count. Outputs GLB/OBJ/USDZ/FBX/STL.

Nano Banana 2 Reference to Image

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

Nano Banana 2 Reference to Image Developer

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

Grok Imagine Video v1.5 Image-to-Video

xAI Grok Imagine Video v1.5 animates a starting frame image with natural-language motion prompts at 480p or 720p.

Grok Imagine Image Quality Text-to-Image

xAI Grok Imagine generates polished visuals from natural-language prompts at 1K or 2K resolution, with 14 aspect ratios.

Grok Imagine Image Quality Edit

xAI Grok Imagine edits one or more reference images with natural-language instructions at 1K or 2K resolution. Supports single image and multi-image (<IMAGE_0>, <IMAGE_1>) reference editing.

Gemini Omni Flash Image-to-Video Developer

Gemini Omni Flash is Google's multimodal video generation model. This image-to-video variant creates subject-consistent videos from up to 7 reference images combined with a text prompt, preserving visual identity across the full generated video.

Gemini Omni Flash Text-to-Video Developer

Gemini Omni Flash is Google's multimodal video generation model. This text-to-video variant generates high-quality cinematic videos from text prompts with support for multiple resolutions, aspect ratios, and controllable duration.

HappyHorse-1.0 Text-to-video

Generates videos from text prompts with HappyHorse 1.0, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

HappyHorse-1.0 Image-to-video

Animates a first-frame image into video with optional prompt guidance, 720P or 1080P output, and durations from 3 to 15 seconds.

HappyHorse-1.0 Reference-to-video

Reference-to-video

Generates videos from one to nine reference images and a text prompt, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

HappyHorse-1.0 Video-edit

Video-to-video

Edits an input video with text instructions and optional reference images, supporting 720P or 1080P output.

OpenAI GPT Image 2 Text-to-Image

GPT Image 2 Text-to-Image is OpenAI's fast, cost-efficient text-to-image generator powered by GPT-5 guidance. Create photorealistic shots, product renders, concept art, and stylized graphics from natural-language prompts (optionally conditioned on an image). Supports custom aspect ratios, seeds, negative prompts, hex color hints, and style presets. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

OpenAI GPT Image 2 Edit

GPT Image 2 Edit is OpenAI's image model for precise, natural-language edits. Add or remove objects, swap backgrounds, retouch faces, adjust colors and lighting, edit text and graphics, crop or resize, and apply hex color control. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Baidu ERNIE Image Turbo Text-to-image

A fast, low-latency version of ERNIE Image by Baidu, optimized for rapid iteration and scalable image generation. It balances speed and quality, making it well suited to real-time and high-throughput scenarios.

Seedance 2.0 Text-to-Video

Generate videos from text prompts with native audio and optional web search.

From ≈ $0.112/second

Seedance 2.0 Image-to-Video

Generate videos from a first-frame image (and optional last-frame) with native audio.

From ≈ $0.112/second

Seedance 2.0 Reference-to-Video

Multimodal video generation from reference images, videos, and audio. Supports video editing and extension.

From ≈ $0.112/second

Seedance 2.0 Fast Text-to-Video

Fast video generation from text prompts with native audio.

From ≈ $0.09/second

Seedance 2.0 Fast Image-to-Video

Fast video generation from first-frame image (and optional last-frame) with native audio.

From ≈ $0.09/second

Seedance 2.0 Fast Reference-to-Video

Fast multimodal video generation from reference images, videos, and audio. Supports video editing and extension.

From ≈ $0.09/second
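
The Seedance 2.0 entries above list "from" prices per second of generated video, so each figure reads as a lower bound rather than a fixed rate. A minimal sketch of that arithmetic follows, assuming cost scales linearly with clip duration (the listing does not say how billing is computed). The function name `estimate_min_cost` and the tier labels are hypothetical and used only for illustration.

```python
from decimal import Decimal

# "From" per-second rates as listed on this page (USD). These are starting
# prices; the real rate may be higher depending on options not shown here.
LISTED_FROM_RATES = {
    "Seedance 2.0": Decimal("0.112"),
    "Seedance 2.0 Fast": Decimal("0.09"),
}


def estimate_min_cost(tier: str, seconds: int) -> Decimal:
    """Lower-bound cost estimate, ASSUMING cost is rate x duration."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return LISTED_FROM_RATES[tier] * seconds


if __name__ == "__main__":
    for tier in LISTED_FROM_RATES:
        cost = estimate_min_cost(tier, 10)
        print(f"{tier}: at least ${cost} for a 10-second clip")
```

`Decimal` is used instead of `float` so the currency amounts are not affected by binary rounding.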

Wan-2.7 Text-to-video

Generates videos from text prompts with multi-shot narrative, audio generation, and sound-image synchronization.

Wan-2.7 Image-to-video

Animates images into videos with first-frame, first-and-last-frame, video continuation, and audio-driven modes.

Wan-2.7 Reference-to-video

Video-to-video

Generates character-driven videos from reference images and videos, with multi-subject and voice-cloning support.

Wan-2.7 Video-edit

Video-to-video

Edits videos using text instructions, reference images, and style transfer with multi-modal input support.

Veo 3.1 Lite Text-to-video

High-efficiency Veo 3.1 Lite text-to-video: create video with synchronized audio from text prompts. Targets high-volume applications with strong price efficiency; 720p/1080p and flexible duration options. Does not support 4K outputs or Extension.

Veo 3.1 Lite Start-End Frame to Video

Veo 3.1 Lite start-end frame to video: generate motion between a first and last frame with audio. Lightweight, developer-oriented option with 8s duration and 720p/1080p. Does not support 4K outputs or Extension.

Veo 3.1 Lite Image-to-video

High-efficiency Veo 3.1 Lite image-to-video: animate an input image into video with synchronized audio. Cost-effective for scalable workflows; supports 720p/1080p and common aspect ratios. Does not support 4K outputs or Extension.

Vidu Q3-Mix Reference to Video

Vidu Q3-Mix Reference-to-Video generates videos from 1-4 reference images with consistent subjects. Offers strong visual quality with intelligent scene transitions, smooth dynamic effects, and audio support up to 1080p.

Vidu Q3 Reference to Video

Vidu Q3 Reference-to-Video generates videos from 1-4 reference images with consistent subjects. Features intelligent camera switching with better consistency across multiple camera positions, audio support, and resolutions up to 1080p.

Wan-2.2-turbo-spicy Image-to-video Lora

Fast image-to-video generation with custom LoRA support. Powered by Wan 2.2 rCM turbo with high/low noise LoRA injection. Supports 480p, 720p, and 1080p output.

Wan-2.2-turbo-spicy Image-to-video

Fast image-to-video generation powered by Wan 2.2 with rCM turbo acceleration. Supports 480p, 720p, and 1080p (via VSR upscaling) output with 5s or 8s duration.

Wan-2.7 Text-to-image

Generates images from text prompts with Wan 2.7 image, supporting fast iteration and strong prompt fidelity for illustration and photorealistic outputs.

Wan-2.7 Image-to-image

Edits and recomposes images with Wan 2.7 image using text instructions, multi-image references, and optional interaction boxes.

Wan-2.7 Pro Text-to-image

Generates images from text prompts with Wan 2.7 image pro, supporting higher fidelity outputs and 4K-ready workflows.

Wan-2.7 Pro Image-to-image

Edits and recomposes images with Wan 2.7 image pro using text instructions and multi-image references for higher quality outputs.

Nano Banana 2 Text-to-Image Developer

Google's lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts.

Nano Banana 2 Text-to-Image

Google's lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts.

Nano Banana 2 Edit Developer

Google's advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words.

Nano Banana 2 Edit

Google's advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words.

Qwen Image 2.0 Text-to-image

Qwen Image 2.0 is an advanced text-to-image model with enhanced image quality and improved prompt understanding. Supports output up to 2K. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Qwen Image 2.0 Edit

Qwen Image 2.0 Edit is an advanced image-editing model with improved quality and better understanding of instructions. Supports output up to 2K. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.