🧠 Fine-Tune Your Own LLM
AI / ML Feb 11, 2026 2 min read
How to Train an Open-Source LLM with Your Own Dataset — A High-Level Guide
New to AI development and unsure where to start? This step-by-step guide walks you through preparing a custom dataset, fine-tuning Llama 3.2 on Hugging Face, and using the trained model in your own codebase.

Introduction

Are you new to AI development? Struggling with where to start? Or do you want to train a model on your own dataset and actually use it? In this post, I'll give you a high-level overview so you can get started right away.

Here's our goal: we want to build a question-answering system. Note — this is not RAG (Retrieval-Augmented Generation). Instead, we will fine-tune an open-source model on our own dataset, host it on a server, and access it from anywhere using credentials. Keep in mind that this is a high-level overview; I can't explain every detail in a single post, but I'm confident you'll walk away with a clear understanding of how to train and use a model.

Step 1 — Prepare Your Dataset

First, start with a PDF file. Extract all the text from the PDF and split it into chunks of roughly 1,000 words each.
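As a rough sketch, the chunking step could look like this (the PDF-extraction part assumes a library such as pypdf; the helper name is just illustrative):

```python
def extract_chunks(text, words_per_chunk=1000):
    """Split raw text into chunks of roughly `words_per_chunk` words each."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]

# In practice the text would come from your PDF, e.g. with pypdf:
#   from pypdf import PdfReader
#   text = " ".join(page.extract_text() for page in PdfReader("source.pdf").pages)
```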

Next, use any LLM (such as GPT or Gemini): send each chunk to the model and ask it to generate five possible questions based on that chunk. Now you have your questions from the LLM, and you have the chunk data as the corresponding answers.

Create a dataset in JSONL (JSON Lines) format. Here's an example of what the output looks like:

{"answer":"The Amazon rainforest is the largest tropical rainforest in the world. It covers much of northwestern Brazil.","question":"Which country contains most of the Amazon rainforest?"}
{"answer":"The Amazon rainforest is the largest tropical rainforest in the world. It covers much of northwestern Brazil.","question":"What type of forest is the Amazon rainforest?"}

Step 2 — Fine-Tune the Model on Hugging Face

Now it's time to fine-tune an open-source model. Follow these steps:

  1. Go to HuggingFace.co.
  2. Search for "Llama 3.2 3B Instruct" and open the model page.
  3. Log in (or sign up), then click Request Access.
  4. Wait approximately 10 minutes for Meta's approval.

After approval, start fine-tuning via AutoTrain:

  1. On the model page (top right), click Train → AutoTrain.
  2. Click Create New Project.

Configure the AutoTrain Space:

  • Space Name: e.g., test-model-llama-3.2
  • Description: e.g., Testing Llama 3.2
  • Space SDK: Docker
  • Docker Template: AutoTrain (leave as default)

Select Hardware (this is important):

  • Choose NVIDIA A10G Small (~$1/hour). Llama 3B needs several GB of GPU memory, and this GPU works reliably.

Set Important Parameters:

  • Pause on failure → Set to 0 so the Space keeps running after an error and you can read the full error logs.
  • Visibility → Private
  • License → Leave empty

Then click Create Space. Hugging Face will spin up the Docker container and prepare the training environment.

Configure Training Inside AutoTrain:

  1. Set the Project Name (e.g., test-llama).
  2. Select the Base Model: Llama 3.2 3B Instruct.
  3. Upload your training file — it must be a .jsonl file (the dataset you prepared earlier).
  4. Set Number of Epochs to 1.
  5. Leave all other hyperparameters (learning rate, batch size, optimizer, scheduler) as default — do not change them.
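Before uploading, it's worth checking that every line of your file is valid JSON with the expected keys, since a single malformed line can fail the job. A small sanity-check sketch (the helper name is illustrative):

```python
import json

def validate_jsonl(path, required_keys=("question", "answer")):
    """Fail loudly if any line is malformed JSON or misses a required key."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            record = json.loads(line)  # raises on malformed JSON
            for key in required_keys:
                assert key in record, f"line {lineno} is missing {key!r}"
    return True
```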

Click Start Training. Once training completes, the fine-tuned model will appear under your Hugging Face profile as a new model.

Step 3 — Use Your Model in Code

Now you can load and use your fine-tuned model directly in Python:

from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer

# Authenticate so the Hub can serve your private fine-tuned model
login(token="YOUR_HF_TOKEN")

model_path = "YOUR_MODEL_NAME"  # e.g. "your-username/test-llama"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",   # place the weights on the available GPU(s)
    torch_dtype="auto"
).eval()

messages = [
    {"role": "user", "content": "Which country contains most of the Amazon rainforest?"}
]

# Format the conversation using the model's chat template
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

# Move the inputs to the same device as the model (works on CPU or GPU)
output_ids = model.generate(
    input_ids.to(model.device),
    max_new_tokens=500
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    output_ids[0][input_ids.shape[1]:],
    skip_special_tokens=True
)

print(response)

Final Thoughts

Your model will now answer questions based on your training data. You might face a few difficulties along the way, but trust me — if you use any AI assistant (Gemini, ChatGPT, etc.), it will help you resolve errors and clear up any confusion. Happy coding! 🚀