How are we going to fine-tune?
-
We will be using Unsloth for fine-tuning
-
There are other options for fine-tuning private models as well
What is a GPU (Graphics Processing Unit)?
- A massively parallel processor designed to perform thousands of mathematical operations at the same time
- Originally built for
- Rendering Video game graphics
- 3D Transformations
- Pixel Shading
- Texture mapping
-
Today, GPUs are used for
- Deep learning
- LLM Training
- Fine-Tuning
- Simulations
- Scientific computing
-
A GPU contains thousands of smaller cores
- We measure GPU power in FLOPS (floating-point operations per second)
- 1 FLOP = one floating-point calculation
- 1 TFLOPS = 1 trillion calculations per second
- 1 PFLOPS = 1 quadrillion calculations per second
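To make these units concrete, here is a back-of-the-envelope sketch of how long a large matrix multiplication would take at a given throughput. The 10 TFLOPS figure is an illustrative assumption, not the spec of any particular card:

```python
# Rough estimate of matrix-multiply time at a given sustained throughput.
# The throughput number below is an illustrative assumption.

def matmul_flops(n: int) -> int:
    """An (n x n) by (n x n) matrix multiply costs ~2*n^3 floating-point
    operations: n multiplies + n adds for each of the n*n output elements."""
    return 2 * n ** 3

TFLOP = 10 ** 12  # 1 TFLOPS = 1 trillion floating-point operations per second

n = 4096
flops = matmul_flops(n)
seconds_at_10_tflops = flops / (10 * TFLOP)

print(f"{flops:,} FLOPs")
print(f"~{seconds_at_10_tflops * 1000:.1f} ms at a sustained 10 TFLOPS")
```

LLM training and fine-tuning are dominated by exactly this kind of matrix multiplication, repeated billions of times, which is why FLOPS is the headline number for GPUs.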
-
What makes a GPU powerful?
- CUDA Cores
- Tensor Cores
- VRAM (Video RAM – GPU Memory)
- Memory Bandwidth
- Interconnect (Multi-GPU)
-
CUDA: A parallel computing platform and programming model created by NVIDIA. It allows developers to use NVIDIA GPUs for general-purpose computing, not just graphics
- CUDA cores are simple math processors designed for floating-point operations (thousands exist inside modern GPUs)
- A Tensor Core is a specialized hardware unit inside modern NVIDIA GPUs designed specifically to accelerate:
- Matrix multiplication operations used in deep learning
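The core operation a Tensor Core accelerates is a fused matrix multiply-accumulate, D = A x B + C. A naive Python sketch of that operation, which the hardware performs for an entire small tile of the matrices in a single step:

```python
# Naive sketch of the matrix multiply-accumulate (D = A @ B + C) that a
# Tensor Core computes for a small tile (e.g. 4x4) in one hardware operation.

def matmul_accumulate(A, B, C):
    """Return D = A @ B + C for square matrices given as lists of lists."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) + C[i][j] for j in range(n)]
        for i in range(n)
    ]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 0], [0, 1]]
print(matmul_accumulate(A, B, C))
```

Doing this for a whole tile at once, instead of one multiply or add at a time as a CUDA core would, is what makes Tensor Cores so much faster for deep learning workloads.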
- VRAM = Video Random Access Memory; this is memory attached directly to the GPU, and VRAM is extremely fast
-
When fine tuning, VRAM stores
- Model Weights
- Activations
- Gradients
- Optimizer states (Adam, etc.)
- Temporary buffers
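The items above can be turned into a rough VRAM estimate. This is a minimal sketch assuming mixed-precision full fine-tuning with Adam (fp16 weights and gradients, an fp32 master copy of the weights, and two fp32 Adam moments per parameter); activations and temporary buffers come on top and depend on batch size and sequence length, so they are left out here:

```python
# Back-of-the-envelope VRAM estimate for full fine-tuning with Adam.
# Assumptions (illustrative, not exact): fp16 weights + fp16 gradients,
# an fp32 master copy of the weights, and two fp32 Adam moment tensors.
# Activations and temporary buffers are extra and scale with batch size.

def finetune_vram_gb(num_params: float) -> float:
    bytes_per_param = (
        2      # fp16 model weights
        + 2    # fp16 gradients
        + 4    # fp32 master copy of weights
        + 4    # fp32 Adam first moment (m)
        + 4    # fp32 Adam second moment (v)
    )
    return num_params * bytes_per_param / 1024 ** 3

print(f"7B model: ~{finetune_vram_gb(7e9):.0f} GB before activations")
```

Numbers like this are why parameter-efficient methods such as LoRA (which Unsloth builds on) matter: training only a small adapter shrinks the gradient and optimizer-state terms dramatically, so the model fits on a single consumer GPU.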

-
TPU (Tensor Processing Unit): A TPU is a custom AI accelerator chip designed by Google, built specifically to accelerate:
- Tensor operations
