Guide

Everything you need to know about the controls in picoDiffusion. Each section explains what a control does, what values to start with, and how to experiment.

All of this information is also available inside the tool itself: every control has a 💬 button that shows the same help text. This page is here as a general learning resource you can read through at your own pace, without needing to have the tool open. It is a bit dry, but if that is your thing, have at it.

Checkpoint

The checkpoint is the core model that generates your images. Each checkpoint has learned from different training data, so they each have their own strengths and visual style.

Some are great at photorealism, others excel at illustration or anime. The checkpoint name often hints at what it does best. You can find checkpoints on sites like civitai.com along with example images so you can see what each one produces.

You need to select a checkpoint before generating. The first time you pick one, it takes a moment to load into memory. After that, generating is much faster. If you switch to a different checkpoint, it will need to load again.

Different checkpoints can produce wildly different results from the same prompt. If you are not happy with your images, try a different checkpoint before changing anything else; it can make a huge difference.

Important: picoDiffusion uses Stable Diffusion 1.5. Make sure you download checkpoints built for sd1.5 (or 1.4; the two are compatible). SDXL or SD 2.x checkpoints will not work here.
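
If you are curious what loading a checkpoint looks like under the hood, here is a minimal sketch using the Hugging Face diffusers library. This is an illustration of the general idea, not picoDiffusion's actual code, and the checkpoint file name is a placeholder for whatever sd1.5 file you downloaded.

    # Minimal sketch: load an sd1.5 checkpoint file, then generate.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors",  # placeholder: any sd1.5 (or 1.4) checkpoint
        torch_dtype=torch.float16,    # half precision to save GPU memory
    ).to("cuda")

    # The slow part is the load above. Once the pipeline is in memory,
    # repeated generations are much faster.
    image = pipe("a lighthouse at dusk").images[0]
    image.save("lighthouse.png")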

VAE

The VAE (Variational Auto-Encoder) handles the final conversion from the model's internal representation into the actual pixels you see. It has a noticeable impact on colour accuracy and sharpness.

Every checkpoint has a VAE built in, and most of the time it works well. You can leave this set to none and the built-in one will be used.

If your images look washed out, dull, or have muted colours, try selecting a different VAE. A popular choice is vae-ft-mse-840000, which tends to produce vibrant, accurate colours with most checkpoints.

Not every VAE works well with every checkpoint. If switching VAE makes things look worse, go back to none and let the checkpoint use its own.
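
For the curious, swapping the VAE is a one-line override once the pipeline is loaded. A sketch with diffusers (same caveat as above: an illustration, not picoDiffusion's internals); stabilityai/sd-vae-ft-mse is the published home of vae-ft-mse-840000.

    # Sketch: override the checkpoint's built-in VAE with vae-ft-mse-840000.
    import torch
    from diffusers import AutoencoderKL, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    )

    # Skip this assignment (the "none" setting) to keep the built-in VAE.
    pipe.vae = AutoencoderKL.from_pretrained(
        "stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16
    )
    pipe.to("cuda")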

LoRA

A LoRA (Low-Rank Adaptation) is a small add-on file that adjusts the checkpoint's behaviour. It can add a new visual style, teach the model a specific character or concept, or fine-tune how it renders certain things.

LoRAs are completely optional. You can generate great images without one. When you do use one, the lora weight slider controls how much influence it has.

A good starting point is a weight of 0.7 to 1.0. If the effect is too subtle, increase the weight. If the image looks distorted or over-styled, lower it. Each LoRA is different, so a little experimentation goes a long way.

LoRAs are often designed for a specific checkpoint. A LoRA that looks amazing with one checkpoint might do very little with another, or even make things worse. Check the LoRA's description to see which checkpoint it was trained with for the best results.

Trigger words

Many LoRAs require a specific word or phrase in your prompt to activate their effect. For example, a Silent Movie Frame LoRA might need the word "silentmovie" in your prompt.

Without the trigger word, the LoRA is still applied to the model weights, but nothing in your prompt steers generation toward the concept the LoRA learned. The trigger word is the link between the text and the visual concept. Always check the LoRA's download page for its trigger word and include it in your prompt.

Important: Like checkpoints, LoRAs are version-specific. picoDiffusion uses sd1.5, so make sure you download LoRAs made for sd1.5 (or 1.4). SDXL LoRAs will not work here.
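
Putting the LoRA sections together, here is a sketch of loading a LoRA and activating it with its trigger word, again using diffusers as a stand-in. The file name and the "silentmovie" trigger word are hypothetical, following the example above.

    # Sketch: apply an sd1.5 LoRA and include its trigger word in the prompt.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")

    pipe.load_lora_weights("silent_movie_frame.safetensors")  # hypothetical file

    # "silentmovie" is the (hypothetical) trigger word that links the prompt
    # to the concept the LoRA learned. Without it, the effect is much weaker.
    image = pipe("silentmovie, a detective in a rainy alley").images[0]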

LoRA Weight

This controls how much influence the selected LoRA has on your image.

0.0: No effect at all, the same as not using a LoRA.
0.5 to 0.8: A gentle touch. The style is present but blends naturally with the checkpoint. Good for subtle adjustments.
0.8 to 1.0: The recommended starting range for most LoRAs. The style is clearly visible, and this is where many LoRAs are designed to work best.
Above 1.0: Pushes the effect harder. Can produce striking, heavily stylised results, but can also cause distortion or artefacts, so adjust gradually.

Every LoRA responds differently to weight changes. Some are subtle even at 1.0 and need to be pushed higher. Others are very strong and look best at 0.5. Try a few values and see what you like. A good starting point is 0.8.
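
Because every LoRA responds differently, the quickest way to find a weight you like is to lock the seed and sweep the weight. A sketch of that experiment, with the same diffusers assumption and placeholder file names as the earlier blocks:

    # Sketch: compare LoRA weights on a locked seed so only the weight changes.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_lora_weights("silent_movie_frame.safetensors")  # hypothetical file

    for weight in (0.5, 0.8, 1.0, 1.2):
        image = pipe(
            "silentmovie, a detective in a rainy alley",
            cross_attention_kwargs={"scale": weight},  # the LoRA weight
            generator=torch.Generator("cuda").manual_seed(42),  # locked seed
        ).images[0]
        image.save(f"lora_weight_{weight}.png")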

Sampler

The sampler is the algorithm that builds your image step by step, starting from random noise and gradually refining it into a picture. Different samplers take slightly different approaches, which can affect the look and feel of the result.

DPM++ 2M SDE Karras: The default and our recommendation. Produces detailed, high-quality results with good variety. The "SDE" part adds a small amount of randomness at each step, which helps produce richer textures and more natural-looking images.
DPM++ 2M Karras: Same family as the default, but without the randomness. Produces cleaner, more predictable results. Good when you want more control and less variation between runs.
Euler a: The "a" stands for ancestral, meaning it adds randomness like SDE does. Produces a softer, slightly painterly look. A popular choice for artistic and illustrative styles.
Euler: The deterministic version of Euler a. No randomness, so the same seed gives very consistent results even if you change the step count. Good for fine-tuning.
DDIM: One of the oldest and most reliable samplers. Fast, predictable, and produces clean results even at lower step counts (15 to 20). A good pick for quick, consistent results.
UniPC: Designed to reach good results in fewer steps than other samplers. If generation time matters and you want to keep steps low, this is worth trying.

There is no single best sampler; it depends on the checkpoint, the prompt, and personal taste. A fun way to explore is to lock your seed, then switch between samplers to see how each one interprets the same prompt differently.
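
In code, samplers are interchangeable scheduler objects on a loaded pipeline, which makes that locked-seed comparison easy to script. A diffusers sketch (illustrative; picoDiffusion's own plumbing may differ):

    # Sketch: lock a seed and compare two samplers on the same prompt.
    import torch
    from diffusers import (
        DPMSolverMultistepScheduler,
        EulerAncestralDiscreteScheduler,
        StableDiffusionPipeline,
    )

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")
    prompt = "a lighthouse at dusk"

    # DPM++ 2M SDE Karras, the default recommendation.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config,
        use_karras_sigmas=True,
        algorithm_type="sde-dpmsolver++",
    )
    gen = torch.Generator("cuda").manual_seed(42)
    pipe(prompt, generator=gen).images[0].save("dpm_sde_karras.png")

    # Euler a: same seed, different interpretation.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    gen = torch.Generator("cuda").manual_seed(42)
    pipe(prompt, generator=gen).images[0].save("euler_a.png")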

Prompt

The prompt describes what you want to see in the image. The more detail you include, the more control you have over the result.

Describe the subject, the setting, the style, the lighting, the mood: anything that matters to you. Separate ideas with commas. Things mentioned earlier in the prompt tend to have a bit more influence.

For example:

pembroke welsh corgi, sitting on wooden floor, small laptop, warm lighting, soft shadows, cute, stylized

Every word in your prompt is something the model tries to include. Concrete descriptions like "warm lighting" or "soft shadows" work well because the model has seen many images with those properties during training and knows what they look like.

You will see people use quality tags like "highly detailed", "sharp focus", or "4k" in their prompts. These are not magic switches; they are suggestions, and how much they help depends on the checkpoint you are using. Some checkpoints respond well to them, others mostly ignore them.

What works brilliantly with one checkpoint might do nothing with another. The best way to learn is to try things, change one thing at a time, and see what happens.

Negative Prompt

The negative prompt tells the model what you want to avoid in the image. It steers the generation away from unwanted elements.

Common things to include: "blurry", "low quality", "bad anatomy", "watermark", "extra limbs", "deformed".

A good negative prompt helps prevent common issues like distorted hands, weird faces, or blurry details. You do not need a huge list; a handful of key terms usually does the job well.

Here is a solid starting negative prompt you can use and adjust:

bad anatomy, low quality, blurry, watermark, deformed, extra limbs, disfigured, poorly drawn face, poorly drawn hands, duplicate, signature

Like everything else, what works in the negative prompt can vary between checkpoints. If you are getting artefacts that the negative prompt is not fixing, try rewording it or adding more specific terms for what you are seeing.
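
In code terms, the negative prompt is just a second string passed alongside the main prompt. A sketch, with the same diffusers assumption as the earlier blocks:

    # Sketch: prompt and negative prompt passed side by side.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe(
        prompt="pembroke welsh corgi, sitting on wooden floor, warm lighting",
        negative_prompt=(
            "bad anatomy, low quality, blurry, watermark, deformed, "
            "extra limbs, disfigured, poorly drawn face, poorly drawn hands"
        ),
    ).images[0]
    image.save("corgi.png")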

Steps

Steps controls how many rounds of refinement the model does. Each step takes the image from noisy and rough to clearer and more detailed.

10 to 20: Fast results. Good for quickly testing a prompt idea before committing to a full render.
24 to 30: The sweet spot for most images. Good quality without a long wait. Start here.
40 and above: Diminishing returns. The image may get slightly more refined, but each extra step adds generation time and the improvement becomes harder to notice.

More steps means longer generation time. A good workflow: test your prompt at 15 to 20 steps first. Once you find a composition you like, lock the seed and re-render at 24 to 30 steps for the final version.
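
That test-first workflow is easy to script. A sketch (same assumptions as the earlier blocks): preview cheaply, then re-render the keeper with the seed locked.

    # Sketch: quick preview at few steps, then a final render at more steps.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")

    prompt = "a lighthouse at dusk"
    seed = 42  # lock this once you find a composition you like

    preview = pipe(
        prompt,
        num_inference_steps=16,  # fast: good enough to judge the composition
        generator=torch.Generator("cuda").manual_seed(seed),
    ).images[0]

    final = pipe(
        prompt,
        num_inference_steps=28,  # the sweet spot for the finished image
        generator=torch.Generator("cuda").manual_seed(seed),
    ).images[0]
    final.save("final.png")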

CFG Scale

CFG (Classifier-Free Guidance) controls how closely the model follows your prompt versus how much creative freedom it takes.

1 to 5: Very loose. The model takes creative liberties and the result may drift from your prompt, but this can produce interesting surprises.
7 to 10: The sweet spot. The model follows your prompt faithfully while still filling in natural-looking details on its own. Start at 7.
12 to 20: Very strict. The model pushes hard to match your prompt, which can result in oversaturated colours, harsh contrast, or visual artefacts.

If your image looks too generic, try increasing CFG. If it looks harsh or over-processed, lower it. Some checkpoints prefer lower CFG values than others, so if a checkpoint's results look consistently overcooked, try dropping the CFG before changing anything else.
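
In most Stable Diffusion tooling this knob is a single parameter, commonly called guidance_scale. A locked-seed sweep makes its effect obvious (sketch; same assumptions as above):

    # Sketch: sweep CFG on a locked seed to find a checkpoint's sweet spot.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")

    for cfg in (4, 7, 10, 14):
        image = pipe(
            "a lighthouse at dusk",
            guidance_scale=cfg,  # the CFG scale
            generator=torch.Generator("cuda").manual_seed(42),
        ).images[0]
        image.save(f"cfg_{cfg}.png")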

Seed

The seed is a number that sets the random starting point for generation. Every image begins from random noise, and the seed determines what that noise looks like.

The same seed with the same settings and prompt will produce the same image every time. This makes it easy to reproduce a result you like, or to make small tweaks without changing the whole composition.

Leave the seed blank and a random one will be chosen each time. After generating, the seed that was used is shown below the image so you can always find it again.

The lock seed checkbox keeps the same seed across multiple generations. This is really useful when you want to fine-tune your prompt, adjust the CFG, or try different samplers while keeping the same basic image.

Note: the same seed will produce different images if you change the checkpoint, sampler, or image size. The seed controls the starting noise, but everything else affects how that noise gets turned into a picture.
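
Under the hood, the seed initialises the random-number generator that produces the starting noise. A sketch of the pick-a-random-seed-then-reuse-it pattern (diffusers again, as an assumed stand-in):

    # Sketch: draw a random seed, record it, and reuse it to reproduce an image.
    import random

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")

    seed = random.randint(0, 2**32 - 1)  # the "blank seed" case: pick one at random
    print("seed used:", seed)            # record it so you can find it again

    # Same seed, same settings, same prompt: the same image every time.
    for name in ("first.png", "again.png"):
        image = pipe(
            "a lighthouse at dusk",
            generator=torch.Generator("cuda").manual_seed(seed),
        ).images[0]
        image.save(name)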

Output Size

Width and height set the dimensions of the generated image in pixels. The sliders move in steps of 32 pixels, and this is not arbitrary: Stable Diffusion works on a compressed internal version of the image at one-eighth of the pixel resolution, so dimensions need to be multiples of 8, and multiples of 32 or 64 line up even more neatly with the model's internal layers. Steps of 32 give you fine-grained control while keeping the dimensions on boundaries the model likes.

512x512 is the resolution Stable Diffusion 1.5 was trained at. It produces the most reliable and well-composed results.

Smaller sizes (256 to 384) generate much faster and are perfect for experimenting with prompts and settings.

Larger sizes (640 to 768) give more detail and room for complex scenes, but use more GPU memory and take longer. If you go too large for your GPU, you will get an out-of-memory error; just lower the size and try again. For reference, on our Quadro P2000 (4GB VRAM), 512x512 generates in about 20 to 25 seconds at 24 steps. 768x768 is the maximum our card can handle and takes about 80 seconds. Your results will vary depending on your card.

Non-square sizes are great for specific compositions. Try 512x768 for portraits or 768x512 for landscapes.

A good workflow: Generate at a small size first to explore ideas quickly. When you get an image you like, lock the seed, increase the width and height, and generate again. The starting noise changes with the size, so the result will not be pixel-identical, but you will usually get a similar composition with more detail.
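
To make the one-eighth relationship concrete, and to show that small-first workflow in code, here is a final sketch under the same assumptions as the earlier blocks:

    # Sketch: pixel dimensions map to an internal grid at 1/8 resolution,
    # which is why sizes need to be multiples of 8 (the sliders use 32).
    import torch
    from diffusers import StableDiffusionPipeline

    width, height = 512, 768  # portrait; both multiples of 32
    assert width % 8 == 0 and height % 8 == 0
    print(f"internal grid: {width // 8} x {height // 8}")  # 64 x 96

    pipe = StableDiffusionPipeline.from_single_file(
        "my_checkpoint.safetensors", torch_dtype=torch.float16
    ).to("cuda")

    # Explore small and fast, then re-render larger with the seed locked.
    draft = pipe(
        "a lighthouse at dusk", width=384, height=384,
        generator=torch.Generator("cuda").manual_seed(42),
    ).images[0]
    final = pipe(
        "a lighthouse at dusk", width=width, height=height,
        generator=torch.Generator("cuda").manual_seed(42),
    ).images[0]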

General Tips

Change one thing at a time

When you are experimenting, change one setting per generation. If you change the prompt, sampler, and CFG all at once, you will not know which change made the difference.

Use seed lock for comparisons

Lock the seed, then change one setting. This keeps the basic composition the same so you can clearly see the effect of your change.

Start small, finish big

Generate at 256x256 or 384x384 while experimenting. When you find something you like, lock the seed and regenerate at 512x512 or larger for the final version.

The checkpoint matters most

If your images are not looking how you want, switching checkpoints will make a bigger difference than any other setting. Each checkpoint has its own personality.

Negative prompts are not magic

They help steer the model away from common problems, but they cannot fix everything. If you are getting persistent issues, it is usually a prompt or checkpoint problem, not a negative prompt problem.