feat(apionly): add Alibaba Cloud Models#9055
Open
Pfannkuchensack wants to merge 57 commits intoinvoke-ai:mainfrom
Open
feat(apionly): add Alibaba Cloud Models#9055Pfannkuchensack wants to merge 57 commits intoinvoke-ai:mainfrom
Pfannkuchensack wants to merge 57 commits intoinvoke-ai:mainfrom
Conversation
…pi_keys.yaml Add 'external', 'external_image_generator', and 'external_api' to Zod enum schemas (zBaseModelType, zModelType, zModelFormat) to match the generated OpenAPI types. Remove redundant union workarounds from component prop types and Record definitions. Fix type errors in ModelEdit (react-hook-form Control invariance), parsing.tsx (model identifier narrowing), buildExternalGraph (edge typing), and ModelSettings import/export buttons. Move external_gemini_base_url and external_openai_base_url into api_keys.yaml alongside the API keys so all external provider config lives in one dedicated file, separate from invokeai.yaml.
Add combined resolution preset selector for external models that maps aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash now send imageConfig (aspectRatio + imageSize) via generationConfig instead of text-based aspect ratio hints used by Gemini 2.5 Flash. Backend: ExternalResolutionPreset model, resolution_presets capability field, image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic. Frontend: ExternalSettingsAccordion with combo resolution select, dimension slider disabling for fixed-size models, and panel schema constraint wiring for Steps/Guidance/Seed controls.
- Remove negative_prompt, steps, guidance, reference_image_weights, reference_image_modes from external model nodes (unused by any provider) - Remove supports_negative_prompt, supports_steps, supports_guidance from ExternalModelCapabilities - Add provider_options dict to ExternalGenerationRequest for provider-specific parameters - Add OpenAI-specific fields: quality, background, input_fidelity - Add Gemini-specific fields: temperature, thinking_level - Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini, DALL-E 3, DALL-E 2 - Fix OpenAI provider to use output_format (GPT Image) vs response_format (DALL-E) and send model ID in requests - Add fixed aspect ratio sizes for OpenAI models (bucketing) - Add ExternalProviderRateLimitError with retry logic for 429 responses - Add provider-specific UI components in ExternalSettingsAccordion - Simplify ParamSteps/ParamGuidance by removing dead external overrides - Update all backend and frontend tests
Add AlibabaCloudProvider supporting Qwen Image and Wan model families via the DashScope API. Includes sync (multimodal-generation) and async (image-generation with task polling) request modes, five starter models (Qwen Image 2.0 Pro, 2.0, Max, Wan 2.6 T2I, Qwen Image Edit Max), config fields for API key and base URL, and frontend registration.
# Conflicts: # invokeai/backend/model_manager/starter_models.py
…ope config fields
- Gemini recall: write temperature, thinking_level, image_size to image metadata; wire external graph as metadata receiver; add recall handlers. - Canvas: gate regional guidance, inpaint mask, and control layer for external models. - Canvas: throw a clear error on outpainting for external models (was falling back to inpaint and hitting an API-side mask/image size mismatch). - Workflow editor: add ui_model_provider_id filter so OpenAI and Gemini nodes only list their own provider's models. - Workflow editor: silently drop seed when the selected model does not support it instead of raising a capability error. - Remove the legacy external_image_generation invocation and the graph-builder fallback; providers must register a dedicated node. - Regenerate schema.ts. - remove Gemini debug dumps to outputs/external_debug
…rnal graph - Export imageSizeChanged from paramsSlice (required by the new ImageSize recall handler). - Emit the external graph's metadata model entry via zModelIdentifierField since ExternalApiModelConfig is not part of the AnyModelConfig union.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds Alibaba Cloud DashScope as an external image generation provider on top of the external-models base (PR #8884). Introduces five starter models covering Qwen Image 2.0 Pro / 2.0 / Max / Edit Max and Wan 2.6 Text-to-Image.
What:
invokeai/app/services/external_generation/providers/alibabacloud.pyExternalProvidersForm.tsxandproviders/__init__.pyapi_keys.yaml)Why:
DashScope exposes Alibaba's Qwen Image family (strong bilingual text rendering) and Wan 2.6 (photorealistic T2I), expanding the set of external providers users can pick from.
How:
Implements the external-provider interface from PR #8884. All five models share similar capability shapes; Qwen Image Edit Max is the only img2img entry.
Related Issues / Discussions
QA Instructions
api_keys.yaml.max_images_per_request=4works for each model.Merge Plan
Merge after #8884 lands. No DB migrations. Config-default changes are additive.
Checklist
What's Newcopy (if doing a release after this PR)