Stable Diffusion (SD) and Stable Diffusion XL (SDXL)

  • Image Generation API

    This API allows you to generate images from textual descriptions using the Stable Diffusion model. Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text description.

  • Try it out

    Get your API key at the following site: https://airight.io/api-key

  • How to use

  • Parameters:

    • callback: (required) A URL that we call via POST to deliver the result once the image is generated.


    • modelId: (required) The model ID determines the default values for the optional parameters. Its value is 65 for SD and 71 for SDXL.


    • prompt: (required) A textual description of the desired image. More specific and detailed prompts yield better results.


    • negative_prompt: (optional) A textual description of what should not be included in the image. Helps in preventing unwanted elements or artifacts.



    • lora: (optional) LoRA is a technique used in Stable Diffusion to customize the model's behavior without extensive retraining. By introducing a small, trainable component to the model, a LoRA can significantly alter its output style or content.

      ex: '{name}:{weight},{name}:{weight},…'

      name is the name of the LoRA model (it can differ from the filename). Valid values come from the API https://developers.airight.io/nft-market-backend-service/models/prompt/lora15 for SD and https://developers.airight.io/nft-market-backend-service/models/prompt/lora-sdxl for SDXL.

      weight is the emphasis applied to the LoRA model, similar to a keyword weight. The default is 1; setting it to 0 disables the model.
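      As a sketch, the lora string can be assembled with a small helper (the helper and the LoRA names below are ours for illustration, not part of the API):

```python
def build_lora_string(loras):
    """Join (name, weight) pairs into the '{name}:{weight},...' format."""
    return ",".join(f"{name}:{weight}" for name, weight in loras)

# Two hypothetical LoRA names; the second is disabled with weight 0.
print(build_lora_string([("add_detail", 0.8), ("film_grain", 0)]))
# → add_detail:0.8,film_grain:0
```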


    • steps: (optional) The number of diffusion steps. More steps yield higher-quality images but increase generation time. Its value is an integer from 0 to 50.


    • cfg_scale: (optional) Controls how closely the generated image adheres to the prompt. Higher values lead to images more aligned with the prompt, while lower values allow for more creative variations. Its value is a float from 0 to 15. Default is 7.5.


    • seed: (optional) A random number that acts as a starting point for the generation process. Changing the seed produces different variations of the same prompt. Default is -1.


    • sampler: (optional) The algorithm used for sampling during the generation process. Different samplers offer trade-offs between speed, quality, and diversity. Common samplers include Euler, LMS, and DPM++ 2M. Default is DPM++ 2M.

      Its value is one of the following: 'Euler', 'LMS', 'Heun', 'DPM2', 'DPM2 a', 'DPM++ 2S a', 'DPM++ 2M', 'DPM++ SDE', 'DPM++ 2M SDE', 'DPM++ 2M SDE Heun', 'DPM++ 2M SDE Heun Exponential', 'DPM++ 3M SDE', 'DPM++ 3M SDE Exponential', 'DPM fast', 'DPM adaptive', 'Restart', 'DDIM', 'PLMS', 'UniPC'


    • width, height: (optional) The dimensions of the generated image in pixels. Default is 512 for each.


    • enable_hires: (optional) An option that allows the model to generate higher resolution images than the default setting. Default is false.


    • batch_size: (optional) The number of images generated per request. Default is 1; the range is 1 to 4.


    • adetailer: (optional) A Stable Diffusion Automatic1111 web UI extension that automates inpainting and more. It saves you time and is great for quickly fixing common issues like garbled faces. Default is false.


    • vae: (optional) Stands for variational autoencoder, the part of the neural network model that encodes and decodes images to and from the smaller latent space so that computation is faster. Your images will come out sharper and crisper. Its value is one of the following:

      'Automatic', 'vae-ft-mse-840000-ema-pruned.ckpt', 'kl-f8-anime.ckpt', 'kl-f8-anime2.ckpt', 'YOZORA.vae.pt', 'orangemix.vae.pt', 'blessed2.vae.pt', 'animevae.pt', 'ClearVAE.safetensors'


    • embedding: (optional) The result of textual inversion, a method to define new keywords in a model without modifying it. The method has gained attention because it is capable of injecting new styles or objects into a model with as few as 3–5 sample images.

      Its value is one of the following:

      'face_yolov8s', 'face_yolov8m', 'face_yolov8n', 'hand_yolov8s.pt', 'person_yolov8n-seg.pt', 'person_yolov8m-seg.pt', 'person_yolov8s-seg.pt', 'mediapipe_face_full', 'DPM++ 2M SDE', 'Restart', 'DPM++ 2M SDE Exponential', 'DPM++ 2M SDE Heun', 'DPM++ 2M SDE Heun Exponential'


    • hr_upscaler: (optional) is an AI upscaler option to enhance the detail of the image.

      Its value is one of the following:

      'Latent', 'Latent (antialiased)', 'Latent (bicubic)', 'Latent (bicubic antialiased)', 'Latent (nearest)', 'Latent (nearest-exact)', 'None', 'Lanczos', 'Nearest', '4x-UltraSharp', 'LDSR', 'R-ESRGAN 4x+', 'R-ESRGAN 4x+ Anime6B', 'ScuNET GAN', 'ScuNET PSNR', 'SwinIR 4x'


    • hr_denoising_strength: (optional) The balance between preserving the original image and creating a completely new one. Denoising strength determines how much noise is added to an image before the sampling steps; it is a common setting in image-to-image applications in Stable Diffusion.

      The value of denoising strength ranges from 0 to 1: 0 means no noise is added to the input image, while the default of 1 means the input image is completely replaced with noise.


    • hr_steps: (optional) Similar to steps, but applied to the AI upscaler when enable_hires is enabled.
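  • Example request body

    The parameters above can be collected into a JSON body like the sketch below. This section does not document the submit endpoint or authentication, so pair the body with the endpoint and API key from https://airight.io/api-key; the callback URL and prompt here are placeholders of our own.

```python
import json

# Request body built from the documented parameter names.
payload = {
    "callback": "https://example.com/sd-callback",  # placeholder; your own POST handler
    "modelId": 65,          # 65 = SD, 71 = SDXL
    "prompt": "a photo-realistic lighthouse at sunset, highly detailed",
    "negative_prompt": "blurry, lowres, artifacts",
    "steps": 30,            # integer, 0-50
    "cfg_scale": 7.5,       # float, 0-15 (default 7.5)
    "seed": -1,             # -1 = random seed
    "sampler": "DPM++ 2M",  # default sampler
    "width": 512,
    "height": 512,
    "batch_size": 1,        # 1-4
}
print(json.dumps(payload, indent=2))
```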

  • Response: The API returns a job ID. If we are unable to reach your callback URL with the response, you can fetch the result by calling the API https://developers.airight.io/nft-market-backend-service/sdk/art/{id}.
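    As a minimal sketch, the fallback result URL for a returned job ID can be built like this (we assume the endpoint is then fetched with a plain HTTP GET; the job ID below is made up):

```python
RESULT_URL = "https://developers.airight.io/nft-market-backend-service/sdk/art/{id}"

def result_url(job_id):
    """Build the fallback result URL for a job ID returned by the API."""
    return RESULT_URL.format(id=job_id)

print(result_url(12345))
# → https://developers.airight.io/nft-market-backend-service/sdk/art/12345
```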
