Concepts Every Professional Should Know
Core Definitions
- Large Language Model (LLM): A neural network trained on large amounts of text to predict the next token in a sequence.
- Example: If you type “Please click the button to submit your…” the LLM predicts “…application.” (or “…form.”)
- Tokenization: The process of breaking input text into discrete, smaller units called tokens (words or sub-words).
- Example: Input word: “Walking”; tokens: “Walk” + “ing”. In this case, the tokenizer separates the stem (“walk”) from the suffix (“ing”). This allows the model to mathematically relate “walking,” “swimming,” and “running” because they all share the same “ing” token.
- Vectors (Embeddings): A representation of each word as a point in an n-dimensional space, positioned so that words with similar meanings cluster together.
- Example: In an AI’s internal “map,” the vector for “Dog” will be mathematically closer to “Puppy” than it is to “Airplane.”
- Attention: A mechanism that adds context to a word by weighing the other words in the sequence, whether near or far, to determine its meaning in that sentence.
- Example: In the sentence “The bank was overflowing with water,” attention focuses on “water” to know you mean a river bank, not a financial bank.
- Self-Supervised Learning: A training method where the model hides a section of input and tries to predict it, learning the underlying structure without human labels.
- Transformer: A neural network architecture that uses stacked attention blocks to process sequences and predict the next token.
- Example: The underlying technology that allows ChatGPT to understand the relationship between the first and last sentence of a long essay.
- Fine-Tuning: Taking a pretrained base model and training it further on a smaller, task-specific dataset (for example, curated question-answer pairs) to specialize it for a particular field.
- Retrieval-Augmented Generation (RAG): Fetching relevant documents from a database at query time and adding them to the prompt, so the model can draw on up-to-date or private information it was not trained on.
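To make the Tokenization entry above concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary `VOCAB` and the function `tokenize` are hypothetical and invented for illustration. Production tokenizers learn their vocabularies from data and may split these words differently.

```python
# Hypothetical toy vocabulary, invented for this example.
# Real tokenizers learn tens of thousands of entries from training text.
VOCAB = {"walk", "swim", "run", "n", "m", "ing"}


def tokenize(word, vocab):
    """Split a word into the longest vocabulary pieces, left to right."""
    word = word.lower()
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens


for w in ["Walking", "Swimming", "Running"]:
    print(w, tokenize(w, VOCAB))
```

Under this toy vocabulary, all three words end in the same "ing" token, which is the shared piece the example above refers to.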
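The "Dog" versus "Puppy" versus "Airplane" comparison in the Vectors entry can be sketched with cosine similarity, one common way to measure how close two vectors are. The three-dimensional values in `EMBEDDINGS` are made up for illustration. Real models learn vectors with hundreds or thousands of dimensions.

```python
import math

# Hypothetical 3-dimensional vectors, invented for illustration only.
EMBEDDINGS = {
    "dog": [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.90, 0.15],
    "airplane": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


dog_puppy = cosine_similarity(EMBEDDINGS["dog"], EMBEDDINGS["puppy"])
dog_airplane = cosine_similarity(EMBEDDINGS["dog"], EMBEDDINGS["airplane"])

print(f"dog vs puppy:    {dog_puppy:.3f}")
print(f"dog vs airplane: {dog_airplane:.3f}")
```

A score near 1 means the vectors point in almost the same direction. With these toy values, "dog" scores much higher against "puppy" than against "airplane".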
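The "bank" and "water" example in the Attention entry can be sketched with scaled dot-product attention, the form of attention used in Transformers. The two-dimensional vectors below are invented so that "bank" scores highest against "water". As a simplification, the keys are reused as the values, whereas a real model computes queries, keys, and values with learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output


# Hypothetical vectors, invented for illustration only.
tokens = ["the", "bank", "overflowing", "water"]
keys = [[0.1, 0.1], [0.5, 0.5], [0.2, 1.5], [0.1, 2.0]]
bank_query = [0.0, 2.0]

weights, output = attention(bank_query, keys, keys)
for token, weight in zip(tokens, weights):
    print(f"{token:12s} {weight:.2f}")
```

The printed weights show how much "bank" draws on each word. With these toy numbers most of the weight falls on "water" and "overflowing".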
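The retrieve-then-prompt flow described in the RAG entry can be sketched as follows. The documents, the `retrieve` and `build_prompt` helpers, and the prompt wording are all hypothetical. Word overlap stands in for the vector similarity search a real system would typically use, and no model is actually called.

```python
import re

# Hypothetical private documents, invented for illustration only.
DOCUMENTS = [
    "Refund policy: customers may return items within 30 days of purchase.",
    "Shipping: standard delivery takes 3 to 5 business days.",
    "Support hours: the help desk is open Monday to Friday, 9am to 5pm.",
]


def words(text):
    """Lowercase a text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Prepend the retrieved documents to the question."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


question = "Can I return an item for a refund?"
prompt = build_prompt(question, DOCUMENTS)
print(prompt)
```

The resulting prompt is what would be sent to the LLM. The model itself is unchanged, and only the text it receives now carries the relevant context.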
We will dive deep into these concepts in the upcoming pages. My goal is to break down these complex technical pillars into clear, actionable insights.
