Gen-AI Developer Classroom notes 03/Jan/2026

Token, Tokenizer, Embedding

  • LLMs don't understand text directly; they operate on numbers.
  • A sentence is broken into multiple tokens. The program that does this is called a tokenizer.
  • The tokenizer assigns each token a numeric token ID.

  • Each token is converted into a vector (an array of decimal numbers). A vector here is a numerical value with many dimensions.

  • Embeddings work as follows:

    • The model has a vocabulary of tokens. Each token has a token ID and is stored in vector form.
  • Vectors that lie close to each other in this high-dimensional space are considered similar in meaning.
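
The steps above (text to tokens, tokens to token IDs, token IDs to vectors, then comparing vectors) can be sketched in a few lines of Python. This is only a toy illustration: the whitespace tokenizer, the five-word vocabulary, and the 3-dimensional vectors are all made up for this example. Real tokenizers usually split text into subword pieces, and real embeddings are learned during training and have far more dimensions. Cosine similarity is used here as one common way to measure closeness.

```python
import math

def tokenize(text):
    """Toy tokenizer (assumption): lowercase the text and split on whitespace."""
    return text.lower().split()

# Hypothetical vocabulary: token -> token ID
vocab = {"kids": 0, "children": 1, "river": 2, "bank": 3, "play": 4}

# Hypothetical embedding table: row i is the vector for token ID i.
# The numbers are hand-picked for illustration, not learned.
embeddings = [
    [0.90, 0.10, 0.00],  # kids
    [0.85, 0.15, 0.05],  # children
    [0.10, 0.90, 0.20],  # river
    [0.20, 0.80, 0.30],  # bank
    [0.50, 0.20, 0.90],  # play
]

def encode(text):
    """Convert text into a list of token IDs using the toy vocabulary."""
    return [vocab[token] for token in tokenize(text)]

def cosine_similarity(a, b):
    """Closeness of two vectors: values near 1.0 mean they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

token_ids = encode("Kids play river bank")
vectors = [embeddings[i] for i in token_ids]  # one vector per token

sim_kids_children = cosine_similarity(embeddings[vocab["kids"]],
                                      embeddings[vocab["children"]])
sim_kids_river = cosine_similarity(embeddings[vocab["kids"]],
                                   embeddings[vocab["river"]])

print("token IDs:", token_ids)
print("kids vs children:", round(sim_kids_children, 3))
print("kids vs river:", round(sim_kids_river, 3))
```

With these hand-picked vectors, "kids" and "children" score much higher than "kids" and "river", which is the sense in which nearby points are "similar".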

Parameters

  • In a neural network, each neuron computes a weighted sum of its inputs plus a bias and passes the result through an activation function.

  • In an LLM, the parameters are these weights and biases. The parameter count of a model is the total number of them.

  • More parameters increase the model's capacity to learn patterns.
  • A model's effectiveness also depends on how it is trained, not only on its size.
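
To make the idea of weights, biases, and parameter counts concrete, here is a minimal sketch. It assumes a sigmoid activation and a small fully connected network, both chosen only for illustration. The function names and layer sizes are hypothetical, and real LLMs use transformer layers with many more parameters, although they are counted on the same principle.

```python
import math

def sigmoid(z):
    """One possible activation function (assumption): squashes z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """A single neuron: weighted sum of inputs plus bias, then the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    Each layer with n_in inputs and n_out neurons has
    n_in * n_out weights and n_out biases.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Weighted sum: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5
output = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)

# Illustrative network: 3 inputs -> 4 hidden neurons -> 2 outputs
# (3*4 + 4) + (4*2 + 2) = 16 + 10 = 26 parameters
n_params = count_parameters([3, 4, 2])

print("neuron output:", output)
print("parameter count:", n_params)
```

Training adjusts exactly these numbers, so a model advertised with billions of parameters has billions of such weights and biases to tune.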

Next topic

  • Transformer
  • Experiment with an RNN
    Prompt: Act as an RNN. In simple terms, show me how you would translate the following sentences into Hindi: "Kids are playing on a river bank. It is extremely breezy out there."
  • Experiment with a transformer
    Prompt: Now act as a transformer with attention and solve the same problem.

By continuous learner

enthusiastic technology learner
