You might receive this error when using StableDiffusionInpaintPipeline:
The config of pipeline.unet: FrozenDict([('sample_size', 32), ('in_channels', 4), ('out_channels', 4), ('center_input_sample', False), ('flip_sin_to_cos', True), ('freq_shift', 0), ('down_block_types', ['CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'DownBlock2D']), ('up_block_types', ['UpBlock2D', 'CrossAttnUpBlock2D', 'CrossAttnUpBlock2D', 'CrossAttnUpBlock2D']), ('only_cross_attention', False), ('block_out_channels', [320, 640, 1280, 1280]), ('layers_per_block', 2), ('downsample_padding', 1), ('mid_block_scale_factor', 1), ('act_fn', 'silu'), ('norm_num_groups', 32), ('norm_eps', 1e-05), ('cross_attention_dim', 768), ('attention_head_dim', 8), ('dual_cross_attention', False), ('use_linear_projection', False), ('num_class_embeds', None), ('upcast_attention', False), ('_class_name', 'UNet2DConditionModel'), ('_diffusers_version', '0.11.1'), ('class_embed_type', None), ('mid_block_type', 'UNetMidBlock2DCrossAttn'), ('resnet_time_scale_shift', 'default'), ('_name_or_path', '/root/.cache/huggingface/diffusers/models--xxx--xxx/snapshots/f28aa3dadbeed18d88a15c35566db357d657b43b/unet')]) expects 4 but received num_channels_latents: 4 + num_channels_mask: 1 + num_channels_masked_image: 4 = 9. Please verify the config of pipeline.unet or your mask_image or image input.
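
The arithmetic in the message can be reproduced with a small sketch. The inpainting pipeline feeds the UNet the latents (4 channels), the mask (1 channel), and the masked-image latents (4 channels), 9 channels in total, while the config above reports in_channels of 4. That usually means a regular text-to-image checkpoint was loaded instead of an inpainting one. The helper name inpaint_channel_report is hypothetical and not part of diffusers; the default channel counts are taken from the error message.

```python
def inpaint_channel_report(
    unet_in_channels,
    num_channels_latents=4,       # from the error message
    num_channels_mask=1,          # from the error message
    num_channels_masked_image=4,  # from the error message
):
    """Hypothetical helper (not a diffusers API): mirror the channel sum
    reported in the error and compare it with the UNet's in_channels."""
    total = num_channels_latents + num_channels_mask + num_channels_masked_image
    return total == unet_in_channels, total


# In a real session the first argument would come from
# pipeline.unet.config.in_channels; here the value 4 is copied from the error.
ok, total = inpaint_channel_report(unet_in_channels=4)
print(ok, total)  # False 9
```

A UNet whose in_channels is 9 would pass this check, so loading a checkpoint trained for inpainting is the likely fix.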