This is an asynchronous API; only the task_id is returned initially. Use this task_id to query the Get Subject Training Result API to retrieve the results of the training task.
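For example, here is a minimal polling sketch in Python; the host, endpoint path, and response field names are illustrative assumptions, so consult the Get Subject Training Result API reference for the exact contract:

```python
import time

import requests

API_KEY = "<your_api_key>"
# NOTE: host, path, and response fields are illustrative assumptions;
# see the Get Subject Training Result API reference for the exact contract.
RESULT_URL = "https://api.example.com/v3/training/subject"

def wait_for_training(task_id: str, interval: float = 10.0) -> dict:
    """Poll the task result API until the training task reaches a final status."""
    while True:
        resp = requests.get(
            RESULT_URL,
            params={"task_id": task_id},
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
        resp.raise_for_status()
        task = resp.json()
        # Final statuses per the task status enum: SUCCESS, CANCELED, FAILED.
        if task.get("task_status") in ("SUCCESS", "CANCELED", "FAILED"):
            return task
        time.sleep(interval)
```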
This parameter controls the extent of model parameter updates during each iteration. A higher learning rate results in larger updates, potentially speeding up the learning process but risking overshooting the optimal solution. Conversely, a lower learning rate ensures smaller, more precise adjustments, which may lead to a more stable convergence at the cost of slower training.
Enum: 1e-4, 1e-5, 1e-6, 2e-4, 5e-5
This parameter specifies the maximum number of training steps to execute before halting the training process. It caps the duration of training, ensuring that the model does not train indefinitely. For example, if max_train_steps is set to 2000 and the parameter image_dataset_items contains 10 images, the number of training steps per image is 200. Minimum value is 1.
A seed is a number from which Stable Diffusion generates noise, which makes training deterministic. Using the same seed and set of parameters will produce an identical LoRA each time. Minimum value is 1.
This parameter specifies the type of learning rate scheduler to be used during the training process. The scheduler dynamically adjusts the learning rate according to one of the specified strategies.
Enum: constant, linear, cosine, cosine_with_restarts, polynomial, constant_with_warmup
This parameter determines the number of initial training steps during which the learning rate increases gradually; it is effective only when lr_scheduler is set to one of the following modes: linear, cosine, cosine_with_restarts, polynomial, or constant_with_warmup. The warmup phase helps stabilize the training process before the main learning rate schedule begins. Minimum value is 0, indicating no warmup.
This parameter specifies a prompt that best describes the images associated with an instance. It is essential for accurately conveying the content or theme of the images, facilitating better context or guidance for operations such as classification, tagging, or generation.
This parameter is used to specify a prompt that focuses the training process on a specific subject, in this case, a person. It guides the model to tailor its learning and output generation towards this defined class, enhancing specificity and relevance in tasks such as image recognition or generation related to human features or activities.
Enum: person
This parameter enables the option to preserve prior knowledge or settings in a model. When set to true, it ensures that existing configurations or learned patterns are maintained during updates or further training, enhancing the model’s stability and consistency over time.
This parameter specifies the weight assigned to the prior loss in the model’s loss function. It must be greater than 0 to have an effect. Setting this parameter helps control the influence of prior knowledge on the training process, balancing new data learning with the retention of previously learned information.
This parameter determines whether the text encoder component of the model should undergo training. Enabling this setting (true) allows the text encoder to adapt and improve its understanding of textual input based on the specific data and tasks at hand, potentially enhancing overall model performance.
This parameter specifies the rank for the LoRA (Low-Rank Adaptation) modification. Valid values range from 4 to 128. Adjusting this parameter allows for tuning the complexity and capacity of the LoRA layers within the model, impacting both performance and computational efficiency. Range: [4, 128].
This parameter sets the scaling factor (alpha) for the Low-Rank Adaptation (LoRA) layers within the model. It accepts values ranging from 4 to 128. Adjusting lora_alpha modifies the degree of adaptation applied to the pre-trained layers, influencing the learning capability and the granularity of the adjustments made during training. Range: [4, 128].
This parameter specifies the rank of the LoRA (Low-Rank Adaptation) modification applied specifically to the text encoder component of the model. Valid values range from 4 to 128. By setting this parameter, you can tune the complexity and impact of the LoRA adjustments on the text encoder, potentially enhancing its performance and adaptability to new textual data. Range: [4, 128].
This parameter defines the scaling factor (alpha) for Low-Rank Adaptation (LoRA) specifically applied to the text encoder component of the model. It accepts values ranging from 4 to 128. The lora_text_encoder_alpha parameter adjusts the degree of adaptation applied, allowing for finer control over how the text encoder processes and learns from textual input, thereby impacting the overall effectiveness and efficiency of the model. Range: [4, 128].
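To see how the expert parameters above fit together, here is a hedged Python sketch of an expert_setting payload; the exact field names and nesting are assumptions inferred from the descriptions in this section, so verify them against the API reference:

```python
# Illustrative expert_setting payload. Field names are inferred from the
# parameter descriptions above; verify the exact schema in the API reference.
expert_setting = {
    "instance_prompt": "a close photo of ohwx man",
    "class_prompt": "person",        # Enum: person
    "learning_rate": "1e-4",         # Enum: 1e-4, 1e-5, 1e-6, 2e-4, 5e-5
    "max_train_steps": 500,          # Minimum: 1
    "seed": 42,                      # Minimum: 1; fixes noise for deterministic training
    "lr_scheduler": "cosine",
    "lr_warmup_steps": 0,            # Minimum: 0 (no warmup)
    "with_prior_preservation": True,
    "prior_loss_weight": 1.0,        # must be greater than 0 to have an effect
    "train_text_encoder": True,
    "lora_r": 32,                    # Range: [4, 128]
    "lora_alpha": 32,                # Range: [4, 128]
    "lora_text_encoder_r": 32,       # Range: [4, 128]
    "lora_text_encoder_alpha": 32,   # Range: [4, 128]
}
```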
Represents the current status of a task, particularly useful for monitoring and managing the progress of training tasks. Each status indicates a specific phase in the task’s lifecycle.
Enum: UNKNOWN, QUEUING, TRAINING, SUCCESS, CANCELED, FAILED
Currently we only support uploading images in png / jpeg / webp format.
Each task supports uploading up to 50 images. To achieve good final results, the uploaded images should meet some basic conditions, such as: the portrait centered in the frame, no watermarks, and a clear, sharp picture.
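A minimal sketch of the upload flow, assuming a Get image upload URL endpoint that returns an upload_url and an assets_id (both the path and the response fields are assumptions; see that step in the API reference):

```python
import requests

API_KEY = "<your_api_key>"
# NOTE: path and response fields are illustrative assumptions;
# see the Get image upload URL step in the API reference.
UPLOAD_URL_ENDPOINT = "https://api.example.com/v3/assets/training_dataset"

def upload_image(path: str) -> str:
    """Request an upload URL, upload the image bytes, and return its assets_id."""
    # Only png / jpeg / webp images are supported.
    suffix = path.rsplit(".", 1)[-1].lower()
    if suffix not in ("png", "jpg", "jpeg", "webp"):
        raise ValueError(f"unsupported image format: {suffix}")

    resp = requests.post(
        UPLOAD_URL_ENDPOINT,
        json={"file_extension": suffix},
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    resp.raise_for_status()
    body = resp.json()

    # Upload the raw image bytes to the returned pre-signed URL.
    with open(path, "rb") as f:
        requests.put(body["upload_url"], data=f).raise_for_status()
    return body["assets_id"]
```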
In this step, we will begin the model training process, which is expected to take approximately 10 minutes, depending on server availability.
There are four types of parameters for model training: model info parameters, dataset parameters, components parameters, and expert parameters. You can set them according to the table below.
Here are some tips to train a good model:
At least 10 photos of faces that meet the requirements.
For the parameter instance_prompt, we suggest using “a close photo of ohwx <man|woman>”.
For the parameter base_model, the value v1-5-pruned-emaonly has better generalization ability and can be used in combination with various base models, such as dreamshaper 2.5D, while the value epic-realism has a strong sense of realism.
| Type | Parameters | Description |
| --- | --- | --- |
| Model info parameters | name | Name of your training model |
| Model info parameters | base_model | base_model type |
| Model info parameters | width | Target image width |
| Model info parameters | height | Target image height |
| Dataset parameters | image_dataset_items | Array consisting of imageUrl and image caption |
| Dataset parameters | - image_dataset_items.assets_id | Image assets_id, which can be found in the step Get image upload URL |
| Components parameters | components | Array consisting of name and args; these are common parameters configured for training |
| Components parameters | - components.name | Type of component. Enum: face_crop_region, resize, face_restore |
| Components parameters | - components.args | Detailed values for components.name |
| Expert parameters | expert_setting | Expert parameters |
| Expert parameters | - instance_prompt | Captions for all the training images; here is a guide on how to write an effective prompt: Click Here |
| Expert parameters | - batch_size | Batch size of training |
| Expert parameters | - max_train_steps | Maximum training steps; 500 is enough for LoRA model training |
| Expert parameters | - … | More expert parameters can be accessed in the API reference |
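Putting the four parameter types together, a training request might look like the following Python sketch; the host, endpoint path, and the exact shape of components.args are assumptions, so check the API reference for the authoritative schema:

```python
import requests

API_KEY = "<your_api_key>"
# NOTE: host and path are illustrative assumptions; see the API reference.
TRAIN_URL = "https://api.example.com/v3/training/subject"

payload = {
    # Model info parameters
    "name": "my_portrait_lora",
    "base_model": "v1-5-pruned-emaonly",
    "width": 512,
    "height": 512,
    # Dataset parameters: assets_id values come from the Get image upload URL step
    "image_dataset_items": [
        {"assets_id": "<assets_id_1>"},
        {"assets_id": "<assets_id_2>"},
    ],
    # Components parameters: the args shapes below are placeholders
    "components": [
        {"name": "face_crop_region", "args": []},
        {"name": "resize", "args": []},
        {"name": "face_restore", "args": []},
    ],
    # Expert parameters (see the expert_setting sketch earlier in this section)
    "expert_setting": {
        "instance_prompt": "a close photo of ohwx man",
        "max_train_steps": 500,
    },
}

resp = requests.post(
    TRAIN_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()
task_id = resp.json()["task_id"]  # poll the task result API with this id
print("training task submitted:", task_id)
```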
After the model has been deployed successfully, we can download the model files or generate images directly.
3.2.1 Use the generated models to create images
To use the trained LoRA models, we need to add model_name into the request of the endpoint /v3/async/txt2img or /v3/async/img2img. Currently, trained LoRA models cannot be used in the /v3 endpoint.
Below is an example of how to generate images with the trained model:
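This is a minimal sketch that assumes trained LoRA models are referenced by model_name inside a loras list of the request body; confirm the exact field layout in the /v3/async/txt2img reference:

```python
import requests

API_KEY = "<your_api_key>"
# NOTE: host and request schema are illustrative assumptions;
# see the /v3/async/txt2img API reference for the exact format.
TXT2IMG_URL = "https://api.example.com/v3/async/txt2img"

payload = {
    "request": {
        "prompt": "a close photo of ohwx man, best quality",
        "model_name": "v1-5-pruned-emaonly",
        "width": 512,
        "height": 512,
        "image_num": 1,
        "steps": 30,
        # Reference the trained LoRA by its model_name; the strength
        # field is an assumed per-LoRA weight.
        "loras": [
            {"model_name": "<trained_lora_model_name>", "strength": 0.8},
        ],
    },
}

resp = requests.post(
    TXT2IMG_URL,
    json=payload,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",  # only JSON is supported
    },
)
resp.raise_for_status()
print("task_id:", resp.json()["task_id"])  # poll the task result API with this id
```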
Please set the Content-Type header to application/json in your HTTP request to indicate that you are sending JSON data. Currently, only JSON format is supported.
HTTP status codes in the 2xx range indicate that the request has been successfully accepted, while status codes in the 5xx range indicate internal server errors.
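As a sketch of how a client might act on this convention (the helper name and retry policy are illustrative):

```python
import requests

def submit_json(url: str, payload: dict, api_key: str) -> str | None:
    """POST a JSON payload and interpret the status code per the convention above."""
    resp = requests.post(
        url,
        json=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # only JSON is supported
        },
    )
    if 200 <= resp.status_code < 300:
        return resp.json().get("task_id")  # request accepted
    if resp.status_code >= 500:
        return None  # internal server error; retry later with backoff
    resp.raise_for_status()  # other codes: inspect the response before retrying
    return None
```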