Transformers

Earlier, RNNs were used for language translation and other NLP tasks; they process text sequentially, one token at a time.
Transformers were then introduced, built around the concept of attention. Refer Here for a GPT-2 visualization.
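To make the attention idea more concrete, here is a minimal sketch of scaled dot-product attention in plain Python. The function names (softmax, attention) and the tiny 2-dimensional vectors are assumptions chosen for illustration; a real transformer uses learned query/key/value projections, many attention heads, and tensor libraries rather than Python lists.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Compare this token's query with every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Mix the value vectors according to the attention weights.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs

# Hypothetical 2-dimensional vectors standing in for a 3-token sentence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
print(result)
```

Unlike an RNN, every token here is compared with every other token in one pass, so nothing has to be processed strictly in order.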
Datasets used in LLaMA training:
- CommonCrawl
- Wikipedia
- Books
- GitHub code
- ArXiv
Once the model has been trained on all of this data, it is referred to as a pre-trained model. At this stage the model behaves like a scholar (very good at predicting the next token).
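As a rough illustration of what "predicting the next token" means, the toy sketch below counts which word follows which in a tiny made-up corpus and returns the most frequent follower. This is only an assumption-laden stand-in: the corpus and the predict_next helper are hypothetical, and a real pre-trained model learns these probabilities with a transformer over vastly more text rather than by simple counting.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "training data"; real pre-training uses vastly more text.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each token follows each other token.
follow_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follow_counts[current][nxt] += 1

def predict_next(token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = follow_counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

print(predict_next("the"))  # prints "cat" ("cat" follows "the" twice)
```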
The model is then trained further with SFT (Supervised Fine-Tuning). After this tuning it becomes an instruct model (capable of chat/conversation).
Most current LLMs are derived from the transformer architecture.

Business Use Cases

I want ChatGPT (GPT models)
- to restart my servers when CPU utilization has been at 100% for 10 minutes straight
- to send a Happy New Year message to all of my colleagues, with last year's achievements attached
- to give me current sales information for my product lines
- to help answer customers who have questions about my products