Stable Diffusion pipeline always outputs 512*512 images, regardless of the input resolution


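I'm building an inpainting app and I'm almost getting the results I want, except that the pipeline object outputs a 512*512 image no matter what resolution I pass in. I'm running it on CPU, using the ONNX-converted, AMD-friendly version of Stable Diffusion.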

Here is the code I think is relevant:

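from typing import Literal

import cv2
import numpy as np
from PIL import Image
from diffusers import OnnxStableDiffusionInpaintPipeline


class CustomDiffuser: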
    def __init__(self, provider:Literal['CPUExecutionProvider', 'DmlExecutionProvider']='CPUExecutionProvider'):

        self.pipe_text2image = None
        self.pipe_inpaint = None
        self.image = None
        self.sam = None
        self.provider = provider


    def load_model_for_inpainting(
            self, 
            path: str = '../stable_diffusion_onnx_inpainting', 
            safety_checker=None
    ):
        self.pipe_inpaint = OnnxStableDiffusionInpaintPipeline.from_pretrained(path, provider=self.provider, revision='onnx', safety_checker=safety_checker)        


    def inpaint_with_prompt(
            self, 
            image: cv2.typing.MatLike | Image.Image, 
            mask: cv2.typing.MatLike | Image.Image,
            height: int, 
            width: int,             
            prompt: str = '', 
            negative: str = '',
            steps: int = 10, 
            cfg: float =  7.5,
            noise: float = 0.75
    ):

        pipe = self.pipe_inpaint

        image = image.resize((width, height))
        mask = mask.resize((width, height))

        output_image = pipe(
            prompt,
            image,
            mask,
            #strength=noise,
            guidance_scale=cfg
        )

        return output_image

  
diffuser = CustomDiffuser('CPUExecutionProvider')
    
diffuser.load_model_for_inpainting('C:/path/to/repository/stable_diffusion_onnx_inpainting')

output = diffuser.inpaint_with_prompt(
    Image.open(image_path),
    Image.fromarray(headless_selfie_mask.astype(np.uint8)),
    576, #height first 
    384,                
    'a picture of a man dressed in a darth vader costume, full body shot, front view, light saber',
    ''
)
 
python pytorch onnx stable-diffusion
1 Answer

You need to pass the height and width to the pipeline call, like this:

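output_image = pipe(
    prompt,
    image,
    mask,
    height=height,  # output height; falls back to 512 if omitted
    width=width,    # output width; falls back to 512 if omitted
    #strength=noise,
    guidance_scale=cfg
)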
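
One caveat when picking sizes: as far as I know, the diffusers Stable Diffusion pipelines raise a ValueError when height or width is not divisible by 8. Below is a minimal sketch of a helper that snaps a requested size down to the nearest multiple of 8 before you resize the image and mask. The names snap_to_multiple and inpaint_size are hypothetical and not part of diffusers.

```python
def snap_to_multiple(value: int, multiple: int = 8) -> int:
    """Round value down to the nearest positive multiple of `multiple`."""
    if value < multiple:
        raise ValueError(f"{value} is smaller than {multiple}")
    return (value // multiple) * multiple


def inpaint_size(height: int, width: int) -> dict:
    """Build the height/width keyword arguments for the pipeline call."""
    return {"height": snap_to_multiple(height), "width": snap_to_multiple(width)}


# The sizes from the question are already multiples of 8, so they pass through unchanged.
size = inpaint_size(576, 384)
print(size)
```

You would then resize the image and mask to (size["width"], size["height"]) and call the pipeline with something like pipe(prompt, image, mask, guidance_scale=cfg, **size).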

If you check the source code, height and width both default to 512.
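
If you would rather confirm this without reading the source, here is a small sketch that reads parameter defaults with the standard inspect module. The function fake_pipeline_call is a hypothetical stand-in that only mirrors the defaults described above, so the snippet runs without diffusers installed. parameter_defaults is also just an illustrative helper name.

```python
import inspect


def parameter_defaults(func, *names):
    """Return {name: default} for the named parameters of func."""
    params = inspect.signature(func).parameters
    return {name: params[name].default for name in names}


# Hypothetical stand-in with the same height/width defaults the answer describes.
def fake_pipeline_call(prompt, image, mask_image, height=512, width=512):
    return height, width


print(parameter_defaults(fake_pipeline_call, "height", "width"))
# prints {'height': 512, 'width': 512}
```

With diffusers installed, passing OnnxStableDiffusionInpaintPipeline.__call__ instead of the stand-in should show the real defaults for your installed version.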
