Question 1

What's the difference between fine-tuning and prompt engineering?

Accepted Answer

Prompt engineering changes the instructions you give a model at runtime (no model weights change). Fine-tuning updates the model’s weights using training data so it learns your preferred patterns (tone, labels, formats, domain terms). Prompting is usually faster and cheaper to start; fine-tuning is useful when you need consistent behavior that prompts alone can’t reliably achieve.
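As a rough illustration of the distinction above, the sketch below takes the same labeled examples and uses them two ways: pasted into a runtime prompt (prompt engineering) or serialized as training records (fine-tuning data). The example data, the function names, and the `prompt`/`completion` record schema are assumptions made for illustration; real training services each define their own required format.

```python
import json

# Hypothetical labeled examples for a sentiment task (illustrative data only).
EXAMPLES = [
    ("The checkout page keeps crashing.", "negative"),
    ("Delivery was faster than expected.", "positive"),
]


def build_few_shot_prompt(examples, query):
    """Prompt engineering: the examples travel with every request at runtime."""
    lines = ["Classify the sentiment as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)


def build_training_records(examples):
    """Fine-tuning: the same examples become training data, one JSON object per line.

    The {"prompt", "completion"} schema is an assumed, generic layout.
    """
    return "\n".join(
        json.dumps({"prompt": text, "completion": label})
        for text, label in examples
    )


if __name__ == "__main__":
    print(build_few_shot_prompt(EXAMPLES, "Support never replied."))
    print(build_training_records(EXAMPLES))
```

In the first case the model's weights are untouched and the examples cost prompt tokens on every call; in the second, the examples are consumed once during training and no longer need to appear in the prompt.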

Question 2

When should I use fine-tuning?

Accepted Answer

Use fine-tuning when you have enough high-quality examples and you need consistent, repeatable outputs—such as strict JSON formats, domain-specific terminology, or a specialized classification/extraction task. Start with retrieval-augmented generation (RAG) or prompt engineering if your main goal is to inject up-to-date knowledge or reference internal documents; fine-tune when the model needs to learn a behavior, not just look up facts.
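One way to judge whether prompts alone are consistent enough is to measure how often outputs satisfy the required format. The sketch below is a minimal check for the strict-JSON case; the function names, the required keys, and the sample outputs are hypothetical, and the pass rate that counts as "good enough" is a judgment call for your application.

```python
import json


def is_valid_record(raw, required_keys):
    """Return True if raw parses as a JSON object containing every required key."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and required_keys.issubset(obj.keys())


def format_pass_rate(outputs, required_keys):
    """Fraction of model outputs (strings) that meet the format requirement."""
    if not outputs:
        return 0.0
    passed = sum(is_valid_record(o, required_keys) for o in outputs)
    return passed / len(outputs)


if __name__ == "__main__":
    # Hypothetical outputs collected from a prompt-only setup.
    sample_outputs = [
        '{"name": "widget", "price": 9.5}',
        'Sure! Here is the JSON you asked for.',
        '{"name": "gadget"}',
        '[1, 2, 3]',
    ]
    rate = format_pass_rate(sample_outputs, {"name", "price"})
    print(f"format pass rate: {rate:.2f}")
```

If the pass rate stays below what the application can tolerate after reasonable prompt iteration, that is a signal the model needs to learn the behavior, which is the case fine-tuning addresses.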

Question 3

How much does fine-tuning cost?

Accepted Answer

Cost depends on (1) model size, (2) number of training tokens/examples, (3) training method (full fine-tune vs parameter-efficient methods like LoRA), (4) training duration/epochs, and (5) where you run it (managed service vs self-managed GPUs). You typically pay for compute during training plus storage for datasets/artifacts, and then pay for inference when serving the tuned model. Parameter-efficient fine-tuning can reduce training cost significantly compared with updating all model parameters.
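Two of these cost drivers can be estimated with simple arithmetic. The sketch below counts total training tokens (examples × tokens per example × epochs) and compares trainable parameters for one weight matrix under full fine-tuning versus LoRA, where a rank-r adapter adds two small matrices with r × (d_out + d_in) parameters in total. The matrix size, rank, and dataset figures are illustrative assumptions, not measurements, and the function names are hypothetical.

```python
def estimate_training_tokens(examples, tokens_per_example, epochs):
    """Total tokens processed during training, a common billing unit."""
    return examples * tokens_per_example * epochs


def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for a rank-r LoRA adapter on the same matrix.

    LoRA trains B (d_out x rank) and A (rank x d_in) and keeps the
    original matrix frozen.
    """
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Illustrative numbers only.
    tokens = estimate_training_tokens(examples=1000, tokens_per_example=500, epochs=3)
    full = full_finetune_params(4096, 4096)
    lora = lora_params(4096, 4096, rank=8)
    print(f"training tokens: {tokens:,}")
    print(f"full fine-tune params: {full:,}")
    print(f"LoRA params (rank 8): {lora:,}")
    print(f"trainable fraction: {lora / full:.4%}")
```

Trainable parameter count is only one input to cost: the forward pass still runs through the full frozen model, so the savings show up mainly in optimizer state, gradient memory, and the size of the saved artifact rather than as a proportional cut in compute.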
