Back to Templates Library
Can't be used normally? Clickhereto let us know.
GLM-Image
Favorite
Copy Link
Official template
novitalabs/glm-image:v0.01
Updated time: 31 Mar 2026
README
Configuration

Run GLM-Image on Novita

What is GLM-Image

GLM-Image is an image generation model adopts a hybrid autoregressive + diffusion decoder architecture, effectively pushing the upper bound of visual fidelity and fine-grained details. In general image generation quality, it aligns with industry-standard LDM-based approaches, while demonstrating significant advantages in knowledge-intensive image generation scenarios.

You can check the details on the official website of the project.

Run GLM-Image on Novita

Step 1: Console Entry
Launch the GPU interface and select Get Started to access deployment management.
图片 Step 2: Package Selection
Locate GLM-image in the template repository and begin installation sequence.
图片 Step 3: Infrastructure Setup
Configure computing parameters including memory allocation, storage requirements, and network settings. Select Deploy to implement.
图片 Step 4: Review and Create
Double-check your configuration details and cost summary. When satisfied, click Deploy to start the creation process.
图片 Step 5: Wait for Creation
After initiating deployment, the system will automatically redirect you to the instance management page. Your instance will be created in the background.
图片 Step 6: Monitor Download Progress
Track the image download progress in real-time. Your instance status will change from Pulling to Running once deployment is complete. You can view detailed progress by clicking the arrow icon next to your instance name. 图片 Step 7: Verify Instance Status
Click the Logs button to view instance logs and confirm that the InvokeAI service has started properly.
图片

How to get started

example text2image.py

1import torch 2from diffusers.pipelines.glm_image import GlmImagePipeline 3pipe = GlmImagePipeline.from_pretrained("zai-org/GLM-Image", torch_dtype=torch.bfloat16, device_map="cuda") 4prompt = "A beautifully designed modern food magazine style dessert recipe illustration, themed around a raspberry mousse cake. The overall layout is clean and bright, divided into four main areas: the top left features a bold black title 'Raspberry Mousse Cake Recipe Guide', with a soft-lit close-up photo of the finished cake on the right, showcasing a light pink cake adorned with fresh raspberries and mint leaves; the bottom left contains an ingredient list section, titled 'Ingredients' in a simple font, listing 'Flour 150g', 'Eggs 3', 'Sugar 120g', 'Raspberry puree 200g', 'Gelatin sheets 10g', 'Whipping cream 300ml', and 'Fresh raspberries', each accompanied by minimalist line icons (like a flour bag, eggs, sugar jar, etc.); the bottom right displays four equally sized step boxes, each containing high-definition macro photos and corresponding instructions, arranged from top to bottom as follows: Step 1 shows a whisk whipping white foam (with the instruction 'Whip egg whites to stiff peaks'), Step 2 shows a red-and-white mixture being folded with a spatula (with the instruction 'Gently fold in the puree and batter'), Step 3 shows pink liquid being poured into a round mold (with the instruction 'Pour into mold and chill for 4 hours'), Step 4 shows the finished cake decorated with raspberries and mint leaves (with the instruction 'Decorate with raspberries and mint'); a light brown information bar runs along the bottom edge, with icons on the left representing 'Preparation time: 30 minutes', 'Cooking time: 20 minutes', and 'Servings: 8'. The overall color scheme is dominated by creamy white and light pink, with a subtle paper texture in the background, featuring compact and orderly text and image layout with clear information hierarchy." 5image = pipe( 6prompt=prompt, 7height=32 * 32, 8width=36 * 32, 9num_inference_steps=30, 10guidance_scale=1.5, 11generator=torch.Generator(device="cuda").manual_seed(42), 12).images[0] 13image.save("output_t2i.png")

You can modify the prompt in text2image.py to run or use existing examples to run directly.

python3 text2image

1$ python3 text2image.py 2Couldn't connect to the Hub: Cannot reach https://huggingface.co/api/models/zai-org/GLM-Image: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.. 3Will try to load from local cache. 4Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00, 1.47it/s] 5Loading weights: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 111/111 [00:00<00:00, 1391.52it/s, Materializing param=shared.weight] 6Loading pipeline components...: 71%|██████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 5/7 [00:02<00:00, 2.91it/s]Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. 7Loading weights: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1011/1011 [00:02<00:00, 359.59it/s, Materializing param=model.vqmodel.quantize.embedding.weight] 8Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:07<00:00, 1.02s/it] 9100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:11<00:00, 2.69it/s]

output_t2i.png 图片

example image2image.py

1import torch 2from diffusers.pipelines.glm_image import GlmImagePipeline 3from PIL import Image 4 5pipe = GlmImagePipeline.from_pretrained("zai-org/GLM-Image", torch_dtype=torch.bfloat16, device_map="cuda") 6image_path = "cond.jpg" 7prompt = "Replace the background of the snow forest with an underground station featuring an automatic escalator." 8image = Image.open(image_path).convert("RGB") 9image = pipe( 10 prompt=prompt, 11 image=[image], # can input multiple images for multi-image-to-image generation such as [image, image1] 12 height=33 * 32, 13 width=32 * 32, 14 num_inference_steps=30, 15 guidance_scale=1.5, 16 generator=torch.Generator(device="cuda").manual_seed(42), 17).images[0] 18 19image.save("output_i2i.png")

You can modify the prompt and image in text2image.py to run or use existing examples to run directly.

1$ python3 image2image.py 2 3Couldn't connect to the Hub: Cannot reach https://huggingface.co/api/models/zai-org/GLM-Image: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.. 4Will try to load from local cache. 5Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. 6Loading weights: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1011/1011 [00:02<00:00, 360.88it/s, Materializing param=model.vqmodel.quantize.embedding.weight] 7Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00, 1.46it/s] 8Loading weights: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 111/111 [00:00<00:00, 1426.62it/s, Materializing param=shared.weight] 9Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:07<00:00, 1.03s/it] 10100%|███████████████████████████████████████████████████████████████████████████████████| 30/30 [00:10<00:00, 2.97it/s]