Common AI Terms

A comprehensive glossary of common AI terms covering machine learning, deep learning, NLP, generative AI, and more. Essential for anyone working in or interested in AI.

A
A/B Testing
A method for comparing different versions of a model or application to evaluate their performance against real-world data. Commonly used in online settings.
AI Agents
An artificial intelligence (AI) agent is a software program that can interact with its environment, collect data, and use the data to perform self-determined tasks to meet predetermined goals. Humans set goals, but an AI agent independently chooses the best actions it needs to perform to achieve those goals. For example, consider a contact center AI agent that aims to resolve customer queries. The agent will automatically ask the customer different questions, look up information in internal documents, and respond with a solution. Based on the customer responses, it determines if it can resolve the query itself or pass it on to a human.
AI Alignment
The research area focused on ensuring that AI systems pursue goals that are beneficial to humans and aligned with human values.
AI Ethics
AI ethics refers to the issues that AI stakeholders such as engineers and government officials must consider to ensure that the technology is developed and used responsibly. This means adopting and implementing systems that support a safe, secure, unbiased, and environmentally friendly approach to artificial intelligence.
Accuracy
Accuracy represents the overall correctness of the model’s predictions. It’s the ratio of correctly classified instances (both positive and negative) to the total number of instances. While accuracy is a useful metric, it can be misleading when dealing with imbalanced datasets (where one class is significantly more frequent than others).
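For illustration, a minimal Python sketch of the computation, using toy labels:

```python
# A minimal sketch of computing accuracy from true and predicted labels (toy data).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(f"accuracy = {accuracy:.2f}")  # 5 of 6 predictions correct -> 0.83
```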
Active Learning
A type of machine learning where the algorithm actively queries the user or information source to obtain new data labels for training. This is particularly useful when labeling data is expensive or time-consuming.
Advanced Analytics
Advanced analytics is the process of using complex machine learning (ML) and visualization techniques to derive data insights beyond traditional business intelligence. Modern organizations collect vast volumes of data and analyze it to discover hidden patterns and trends. They use the information to improve business process efficiency and customer satisfaction. With advanced analytics, you can take this one step further and use data for future and real-time decision-making. Advanced analytics techniques also derive meaning from unstructured data like social media comments or images. They can help your organization solve complex problems more efficiently. Advancements in cloud computing and data storage have made advanced analytics more affordable and accessible to all organizations.
Adversarial Attacks
Techniques for manipulating input data to deceive or mislead AI models. Important for understanding vulnerabilities and improving robustness.
Algorithm
An algorithm is a sequence of rules given to an AI machine to perform a task or solve a problem. Common algorithms include classification, regression, and clustering.
Application Programming Interface (API)
An API, or application programming interface, is a set of protocols that determine how two software applications will interact with each other. APIs tend to be written in programming languages such as C++ or JavaScript.
Artificial General Intelligence (AGI)
Artificial general intelligence (AGI) is a field of theoretical AI research that attempts to create software with human-like intelligence and the ability to self-teach. The aim is for the software to be able to perform tasks that it is not necessarily trained or developed for.

Current artificial intelligence (AI) technologies all function within a set of pre-determined parameters. For example, AI models trained in image recognition and generation cannot build websites. AGI is a theoretical pursuit to develop AI systems that possess autonomous self-control, a reasonable degree of self-understanding, and the ability to learn new skills. It can solve complex problems in settings and contexts that were not taught to it at the time of its creation. AGI with human abilities remains a theoretical concept and research goal.
Artificial Intelligence (AI)
Artificial intelligence is a field of science concerned with building computers and machines that can reason, learn, and act in such a way that would normally require human intelligence or that involves data whose scale exceeds what humans can analyze.

AI is a broad field that encompasses many different disciplines, including computer science, data analytics and statistics, hardware and software engineering, linguistics, neuroscience, and even philosophy and psychology.

On an operational level for business use, AI is a set of technologies that are based primarily on machine learning and deep learning, used for data analytics, predictions and forecasting, object categorization, natural language processing, recommendations, intelligent data retrieval, and more.
Artificial Intelligence for IT Operations (AIOps)
Artificial intelligence for IT operations (AIOps) is a process where you use artificial intelligence (AI) techniques to maintain IT infrastructure. You automate critical operational tasks like performance monitoring, workload scheduling, and data backups. AIOps technologies use modern machine learning (ML), natural language processing (NLP), and other advanced AI methodologies to improve IT operational efficiency. They bring proactive, personalized, and real-time insights to IT operations by collecting and analyzing data from many different sources.
Artificial Superintelligence (ASI)
Hypothetical AI systems that surpass human intelligence in all aspects, including creativity, problem-solving, and general wisdom. ASI is a purely speculative concept with potential implications that are widely debated, ranging from utopian advancements to existential risks. There is no scientific consensus on the feasibility or timeline for achieving ASI.
Attention Mechanism
A technique used in deep learning, particularly in Transformers, that allows the model to focus on different parts of the input when generating the output. This enables the model to weigh the importance of different words or features in the input sequence.
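As an illustrative sketch (not any particular model's implementation), here is scaled dot-product attention in NumPy; the Q, K, V matrices are toy data:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention weights over keys and return the weighted sum of values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
    return weights @ V                                    # weighted combination of values

# Toy example: 3 tokens, 4-dimensional representations
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)        # (3, 4)
```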
Audio-to-Text Converter
An audio-to-text converter is transcription software that automatically recognizes speech and transcribes what is being said into its equivalent written format. Traditionally, a human would listen to the audio file and type it into a text file to repurpose the spoken content for different media. But now, using artificial intelligence, computers can easily convert audio to text in a short time and make the content usable for different purposes like search, subtitles, and insights.
Autonomous
A machine is described as autonomous if it can perform its task or tasks without needing human intervention.
Autoregressive Models
Autoregressive models are a class of machine learning (ML) models that automatically predict the next component in a sequence by taking measurements from previous inputs in the sequence. Autoregression is a statistical technique used in time-series analysis that assumes that the current value of a time series is a function of its past values. Autoregressive models use similar mathematical techniques to determine the probabilistic correlation between elements in a sequence. They then use the knowledge derived to guess the next element in an unknown sequence. For example, during training, an autoregressive model processes several English language sentences and identifies that the word “is” always follows the word “there.” It then generates a new sequence that has “there is” together.
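For illustration, a toy Python bigram model in the spirit of the "there is" example above; real autoregressive models are far larger and fully probabilistic:

```python
from collections import Counter, defaultdict

# Toy autoregressive bigram model: predict the next word from the previous one.
corpus = "there is a cat . there is a dog . there is a bird .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1                      # count which word follows which

def predict_next(word):
    """Return the most frequent next word observed after `word`."""
    return counts[word].most_common(1)[0][0]

print(predict_next("there"))  # 'is' -- learned that "is" follows "there"
```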
B
Backward Chaining
A method where the model starts with the desired output and works in reverse to find data that might support it.
Bias
Assumptions made by a model that simplify the process of learning to do its assigned task. Most supervised machine learning models perform better with low bias, as these assumptions can negatively affect results.
Big Data
Big data refers to the large data sets that can be studied to reveal patterns and trends to support business decisions. It’s called “big” data because organizations can now gather massive amounts of complex data using data collection tools and systems. Big data can be collected very quickly and stored in a variety of formats.
Black Box
In AI, a “black box” refers to a system or model whose internal workings are opaque or not easily understood. While you can see the inputs and outputs, the processes that transform the input into the output remain hidden or difficult to interpret. This lack of transparency can make it challenging to understand why an AI system made a particular decision, raising concerns about trust, accountability, and debugging. Deep learning models, particularly large neural networks, are often considered black boxes due to their complex architectures and numerous parameters. The opposite of a black box is a “white box” or “glass box” model, where the internal logic and decision-making processes are transparent and explainable.
Boosting
Boosting is a method used in machine learning to reduce errors in predictive data analysis. Data scientists train machine learning software, called machine learning models, on labeled data to make guesses about unlabeled data. A single machine learning model might make prediction errors depending on the accuracy of the training dataset. For example, if a cat-identifying model has been trained only on images of white cats, it may occasionally misidentify a black cat. Boosting tries to overcome this issue by training multiple models sequentially to improve the accuracy of the overall system.
Bot
A bot is an automated software application that performs repetitive tasks over a network. It follows specific instructions to imitate human behavior but is faster and more accurate. A bot can also run independently without human intervention. For example, bots can interact with websites, chat with site visitors, or scan through content. While most bots are useful, outside parties design some bots with malicious intent. Organizations secure their systems from malicious bots and use helpful bots for increased operational efficiency.
Bounding Box
Commonly used in image or video tagging, this is an imaginary box drawn on visual information. The contents of the box are labeled to help a model recognize them as a distinct type of object.
C
Chain-of-Thought Prompting
A prompting technique that encourages LLMs to break down complex problems into smaller, more manageable steps, leading to more accurate and reasoned responses.
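For illustration, a hypothetical chain-of-thought prompt; the wording and the expected response style are illustrative, not from any specific model:

```python
# A hypothetical chain-of-thought prompt; the cue "Let's think step by step"
# nudges the model to produce intermediate reasoning before the final answer.
prompt = (
    "Q: A cafeteria had 23 apples. It used 20 to make lunch and bought 6 more. "
    "How many apples does it have?\n"
    "A: Let's think step by step."
)
# Expected style of response: "23 - 20 = 3 apples remain; 3 + 6 = 9. The answer is 9."
print(prompt)
```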
Chatbot
A chatbot is a program or application that users can converse with through voice or text. Chatbots were first developed in the 1960s, and the technology powering them has changed over time. Chatbots traditionally use predefined rules to converse with users and provide scripted answers. Contemporary chatbots use natural language processing (NLP) to understand users, and they can respond to complex questions with great depth and accuracy. Your organization can use chatbots to scale, personalize, and improve communication in everything from customer service workflows to DevOps management.
Classification
A supervised learning technique that assigns data points to predefined categories or classes.
Clustering
An unsupervised learning technique that groups similar data points together based on their inherent patterns.
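For illustration, a minimal clustering sketch, assuming scikit-learn is installed; the 2-D points are toy data:

```python
# Group unlabeled 2-D points into two clusters with k-means.
from sklearn.cluster import KMeans

points = [[1, 1], [1.2, 0.9], [0.8, 1.1], [8, 8], [8.2, 7.9], [7.8, 8.1]]
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)  # e.g. [0 0 0 1 1 1] -- two groups found from structure alone
```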
Cognitive Computing
Cognitive computing is essentially the same as AI. It’s a computerized model that focuses on mimicking human thought processes such as pattern recognition and learning. Marketing teams sometimes use this term to eliminate the sci-fi mystique of AI.
Cognitive Search
Cognitive search is a search engine technology that uses artificial intelligence (AI) to quickly find relevant and accurate search results for various types of queries. Modern enterprises store vast information—like manuals, FAQs, research reports, customer service guides, and human resources documentation—across various systems. Cognitive search technologies scan large databases of disparate information and correlate data to discover answers to users’ questions. For example, you can ask a question such as, “How much was spent on machinery repairs last year?” Then, cognitive search maps the question to the relevant documents and returns a specific answer.
Common Objects in Context (COCO)
The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.
Computational Learning Theory
A field within artificial intelligence that is primarily concerned with creating and analyzing machine learning algorithms.
Computer Vision
Computer vision is a technology that machines use to automatically recognize images and describe them accurately and efficiently. Today, computer systems have access to a large volume of images and video data sourced from or created by smartphones, traffic cameras, security systems, and other devices. Computer vision applications use artificial intelligence and machine learning (AI/ML) to process this data accurately for object identification and facial recognition, as well as classification, recommendation, monitoring, and detection.
Computer Vision Tasks
Specific tasks within computer vision, such as object detection, image segmentation, image classification, and facial recognition.
Conversational AI
Conversational artificial intelligence (AI) is a technology that makes software capable of understanding and responding to voice-based or text-based human conversations. Traditionally, human chat with software has been limited to preprogrammed inputs where users enter or speak predetermined commands. Conversational AI goes much beyond that. It can recognize all types of speech and text input, mimic human interactions, and understand and respond to queries in various languages. Organizations use conversational AI for various customer support use cases, so the software responds to customer queries in a personalized manner.
Corpus
A large dataset of written or spoken material that can be used to train a machine to perform linguistic tasks.
D
Data Augmentation
Data augmentation is the process of artificially generating new data from existing data, primarily to train new machine learning (ML) models. ML models require large and varied datasets for initial training, but sourcing sufficiently diverse real-world datasets can be challenging because of data silos, regulations, and other limitations. Data augmentation artificially increases the dataset by making small changes to the original data. Generative artificial intelligence (AI) solutions are now being used for high-quality and fast data augmentation in various industries.
Data Cleansing
Data cleansing is an essential process for preparing raw data for machine learning (ML) and business intelligence (BI) applications. Raw data may contain numerous errors, which can affect the accuracy of ML models and lead to incorrect predictions and negative business impact.

Key steps of data cleansing include modifying and removing incorrect and incomplete data fields, identifying and removing duplicate information and unrelated data, and correcting formatting, missing values, and spelling errors.
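For illustration, a minimal sketch of these steps using pandas (assumed available); the records are toy data:

```python
import pandas as pd

# Toy raw data with a formatting issue, a duplicate, and a missing value.
df = pd.DataFrame({
    "name": ["Alice", "alice ", "Bob"],
    "age": [30, 30, None],
})
df["name"] = df["name"].str.strip().str.title()   # correct formatting
df = df.drop_duplicates()                         # remove duplicate records
df["age"] = df["age"].fillna(df["age"].median())  # fill missing values
print(df)
```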
Data Drift
The phenomenon where the statistical properties of the input data change over time, leading to a decrease in model performance. This is a crucial concept for maintaining AI systems in production.
Data Governance
The overall management of the availability, usability, integrity, and security of data used in an enterprise. Crucial for responsible AI development.
Data Lakehouse
A combined data storage and processing architecture that combines the flexibility of data lakes with the structure and performance of data warehouses. Increasingly relevant for AI applications.
Data Lineage
The process of tracking and visualizing the flow of data from its origin to its destination, including transformations and dependencies. Important for understanding and debugging AI systems.
Data Mining
Data mining is the process of sorting through large data sets to identify patterns that can improve models or solve problems.
Data Preparation
Data preparation is the process of preparing raw data so that it is suitable for further processing and analysis. Key steps include collecting, cleaning, and labeling raw data into a form suitable for machine learning (ML) algorithms and then exploring and visualizing the data. Data preparation can take up to 80% of the time spent on an ML project. Using specialized data preparation tools is important to optimize this process.
Data Science
Data science is an interdisciplinary field of technology that uses algorithms and processes to gather and analyze large amounts of data to uncover patterns and insights that inform business decisions.
Dataset
A collection of related data points, usually with a uniform order and tags.
Deep Learning
Deep learning is a method in artificial intelligence (AI) that teaches computers to process data in a way that is inspired by the human brain. Deep learning models can recognize complex patterns in pictures, text, sounds, and other data to produce accurate insights and predictions. You can use deep learning methods to automate tasks that typically require human intelligence, such as describing images or transcribing a sound file into text.
Diffusion Models
A class of generative models that gradually add noise to training data and then learn to reverse this process to generate new data samples. Stable Diffusion is a prominent example.
E
Edge AI
Deploying AI models directly on edge devices (e.g., smartphones, sensors) for faster processing and reduced latency.
Embeddings
Embeddings are numerical representations of real-world objects that machine learning (ML) and artificial intelligence (AI) systems use to understand complex knowledge domains like humans do. As an example, computing algorithms understand that the difference between 2 and 3 is 1, indicating a close relationship between 2 and 3 as compared to 2 and 100. However, real-world data includes more complex relationships. For example, a bird-nest and a lion-den are analogous pairs, while day-night are opposite terms. Embeddings convert real-world objects into complex mathematical representations that capture inherent properties and relationships between real-world data. The entire process is automated, with AI systems self-creating embeddings during training and using them as needed to complete new tasks.
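For illustration, a toy Python sketch comparing hypothetical 3-dimensional embeddings with cosine similarity; real embeddings typically have hundreds or thousands of dimensions:

```python
import numpy as np

# Hypothetical embedding vectors, for illustration only.
embeddings = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.85, 0.75, 0.2]),
    "car": np.array([0.1, 0.2, 0.95]),
}

def cosine_similarity(a, b):
    """Similarity of two vectors, ignoring their magnitudes."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high: related concepts
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low: unrelated concepts
```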
Emergent Behavior
Emergent behavior, also called emergence, is when an AI system shows unpredictable or unintended capabilities.
Enterprise AI
Enterprise artificial intelligence (AI) is the adoption of advanced AI technologies within large organizations. Taking AI systems from prototype to production introduces several challenges around scale, performance, data governance, ethics, and regulatory compliance. Enterprise AI includes policies, strategies, infrastructure, and technologies for widespread AI use within a large organization. Even though it requires significant investment and effort, enterprise AI is important for large organizations as AI systems become more mainstream.
Entity Annotation
The process of labeling unstructured sentences with information so that a machine can read them. This could involve labeling all people, organizations and locations in a document, for example.
Entity Extraction
An umbrella term referring to the process of adding structure to data so that a machine can read it. Entity extraction may be done by humans or by a machine learning model.
Explainable AI (XAI)
A set of techniques and methods that aim to make AI decision-making more transparent and understandable to humans. This is important for building trust and ensuring responsible use of AI.
F
F1 Score
The F1 score is the harmonic mean of precision and recall. It provides a balanced measure of the model’s performance, especially when dealing with imbalanced datasets. It’s particularly useful when you want to find a good balance between minimizing both false positives and false negatives. An F1 score reaches its best value at 1 (perfect precision and recall) and worst at 0.
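For illustration, a minimal Python sketch computing precision, recall, and the F1 score from toy labels:

```python
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```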
Facial Recognition
Facial recognition is software that identifies or confirms a person’s identity using their face. It works by identifying and measuring facial features in an image. Facial recognition can identify human faces in images or videos, determine if the face in two images belongs to the same person, or search for a face among a large collection of existing images. Biometric security systems use facial recognition to uniquely identify individuals during user onboarding or logins as well as strengthen user authentication activity. Mobile and personal devices also commonly use facial recognition technology for device security.
Feature Engineering
Model features are the inputs that machine learning (ML) models use during training and inference to make predictions. ML model accuracy relies on a precise set and composition of features. For example, in an ML application that recommends a music playlist, features could include song ratings, which songs were listened to previously, and song listening time. It can take significant engineering effort to create features. Feature engineering involves the extraction and transformation of variables from raw data, such as price lists, product descriptions, and sales volumes so that you can use features for training and prediction. The steps required to engineer features include data extraction and cleansing and then feature creation and storage.
Federated Learning
A distributed machine learning approach where models are trained on decentralized datasets held by multiple devices or organizations without sharing the raw data. Preserves privacy and enables collaborative learning.
Few-Shot Learning
The ability of a model to learn new tasks with only a limited number of labeled examples, as opposed to requiring large training datasets.
Forecast
A forecast is a prediction made by studying historical data and past patterns. Businesses use software tools and systems to analyze large amounts of data collected over a long period. The software then predicts future demand and trends to help companies make more accurate financial, marketing, and operational decisions.
Foundation Model
Trained on massive datasets, foundation models (FMs) are large deep learning neural networks that have changed the way data scientists approach machine learning (ML). Rather than develop artificial intelligence (AI) from scratch, data scientists use a foundation model as a starting point to develop ML models that power new applications more quickly and cost-effectively. The term foundation model was coined by researchers to describe ML models trained on a broad spectrum of generalized and unlabeled data and capable of performing a wide variety of general tasks such as understanding language, generating text and images, and conversing in natural language.
G
General AI (Strong AI or AGI – Artificial General Intelligence)
Hypothetical AI systems with human-level cognitive abilities. They would possess the capacity to understand, learn, and apply knowledge across a wide range of tasks, just like humans. AGI remains a theoretical concept, and no such systems exist currently.
Generative AI
Generative AI is a category of artificial intelligence algorithms that can create new content, ranging from text and code to images, music, and even videos. Instead of simply analyzing or classifying existing data, generative AI learns the underlying patterns and structure of the input data and then uses this knowledge to generate similar but novel outputs. This creative capability distinguishes it from other forms of AI, like discriminative AI, which focuses on classifying or distinguishing between different input data. Generative AI often utilizes techniques like deep learning, particularly generative adversarial networks (GANs) and variational autoencoders (VAEs), to achieve this generative process.
Generative Adversarial Network (GAN)
A generative adversarial network (GAN) is a deep learning architecture. It trains two neural networks to compete against each other to generate more authentic new data from a given training dataset. For instance, you can generate new images from an existing image database or original music from a database of songs. A GAN is called adversarial because it trains two different networks and pits them against each other. One network generates new data by taking an input data sample and modifying it as much as possible. The other network tries to predict whether the generated data output belongs in the original dataset. In other words, the predicting network determines whether the generated data is fake or real. The system generates newer, improved versions of fake data values until the predicting network can no longer distinguish fake from original.
Generative Pre-trained Transformers (GPT)
Generative Pre-trained Transformers, commonly known as GPT, are a family of neural network models that use the transformer architecture and are a key advancement in artificial intelligence (AI), powering generative AI applications such as ChatGPT. GPT models give applications the ability to create human-like text and content (images, music, and more), and answer questions in a conversational manner. Organizations across industries are using GPT models and generative AI for Q&A bots, text summarization, content generation, and search.
Guardrails
Guardrails are restrictions and rules placed on AI systems to make sure that they handle data appropriately and don’t generate unethical content.
H
Hallucination
Hallucination refers to an incorrect response from an AI system, or false information in an output that is presented as factual information.
Human-in-the-Loop
AI systems that involve human interaction and feedback in their operation, often for tasks that require human judgment or oversight.
Hyperparameter Tuning
When you’re training machine learning models, each dataset and model needs a different set of hyperparameters, which are a kind of variable. The only way to determine these is through multiple experiments, where you pick a set of hyperparameters and run them through your model. This is called hyperparameter tuning. In essence, you’re training your model sequentially with different sets of hyperparameters. This process can be manual, or you can pick one of several automated hyperparameter tuning methods.

Whichever method you use, you need to track the results of your experiments. You’ll have to apply some form of statistical analysis, such as the loss function, to determine which set of hyperparameters gives the best result. Hyperparameter tuning is an important and computationally intensive process.
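For illustration, a minimal grid-search sketch, assuming scikit-learn is installed; the dataset and hyperparameter grid are illustrative:

```python
# Try every hyperparameter combination and keep the best by cross-validation.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
grid = {"n_estimators": [10, 50], "max_depth": [2, 4, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=3)
search.fit(X, y)
print(search.best_params_)  # the combination with the best cross-validated score
```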
I
Intelligent Automation
Intelligent automation (IA) is the process of using artificial intelligence (AI) to make self-improving software automation. Robotic process automation (RPA) is a software technology that automates repetitive and labor-intensive back-office workflows like filling in forms, searching for information, or sorting invoices. RPA robots are software robots that interact with any digital system like people do. Intelligent automation technologies make RPA bots smarter to self-learn more complex tasks and use cases. Intelligent automation combines AI technologies like natural language processing (NLP), generative AI, and optical character recognition (OCR) to streamline business operations.
Intelligent Document Processing (IDP)
Intelligent document processing (IDP) automates manual data entry from paper-based documents or document images so that it can integrate with other digital business processes. For example, consider a business process workflow that automatically issues orders to suppliers when stock levels are low. Although the process is automated, no order is shipped until the supplier receives payment. The supplier sends an invoice via email, and the accounts team enters the data manually before completing payment—introducing manual checkpoints that create bottlenecks or errors. Instead, IDP systems automatically extract invoice data and enter it in the required format in the accounting system. You can use document processing to automate document management with the use of machine learning (ML) and various artificial intelligence (AI) technologies.
Intent
Commonly used in training data for chatbots and other natural language processing tasks, this is a type of label that defines the purpose or goal of what is said. For example, the intent for the phrase “turn the volume down” could be “decrease volume”.
L
LLMOps
Short for Large Language Model Operations, it refers to the set of processes and tools for developing, deploying, and managing LLM-based applications. It’s analogous to MLOps but specifically tailored to the unique challenges of LLMs.
Label
A part of training data that identifies the desired output for that particular piece of data.
LangChain
LangChain is an open source framework for building applications based on large language models (LLMs). LLMs are large deep-learning models pre-trained on large amounts of data that can generate responses to user queries—for example, answering questions or creating images from text-based prompts. LangChain provides tools and abstractions to improve the customization, accuracy, and relevancy of the information the models generate. For example, developers can use LangChain components to build new prompt chains or customize existing templates. LangChain also includes components that allow LLMs to access new data sets without retraining.
Large Language Models
Large language models, also known as LLMs, are very large deep learning models that are pre-trained on vast amounts of data. The underlying transformer is a set of neural networks that consist of an encoder and a decoder with self-attention capabilities. The encoder and decoder extract meanings from a sequence of text and understand the relationships between words and phrases in it.

Transformer LLMs are capable of unsupervised training, although a more precise explanation is that transformers perform self-learning. It is through this process that transformers learn to understand basic grammar, languages, and knowledge.

Unlike earlier recurrent neural networks (RNN) that sequentially process inputs, transformers process entire sequences in parallel. This allows the data scientists to use GPUs for training transformer-based LLMs, significantly reducing the training time.

Transformer neural network architecture allows the use of very large models, often with hundreds of billions of parameters. Such large-scale models can ingest massive amounts of data, often from the internet, but also from sources such as the Common Crawl, which comprises more than 50 billion web pages, and Wikipedia, which has approximately 57 million pages.
Limited Memory
Limited memory is a type of AI system that receives knowledge from real-time events and stores it in the database to make better predictions.
Linear Regression
Linear regression is a data analysis technique that predicts the value of unknown data by using another related and known data value. It mathematically models the unknown or dependent variable and the known or independent variable as a linear equation. For instance, suppose that you have data about your expenses and income for last year. Linear regression techniques analyze this data and determine that your expenses are half your income. They then calculate an unknown future expense by halving a future known income.
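For illustration, a minimal NumPy sketch in the spirit of the income/expenses example above; the figures are made up:

```python
import numpy as np

# Fit expenses = slope * income + intercept by least squares (toy figures).
income = np.array([40_000, 50_000, 60_000, 70_000])
expenses = np.array([20_500, 24_800, 30_200, 34_900])

slope, intercept = np.polyfit(income, expenses, deg=1)
print(f"expenses ≈ {slope:.2f} * income + {intercept:.0f}")  # slope near 0.5

# Predict an unknown future expense from a known future income.
future_income = 80_000
print(f"predicted expenses: {slope * future_income + intercept:.0f}")
```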
M
MLOps (Machine Learning Operations)
A set of practices for automating and streamlining the lifecycle of machine learning models, from development to deployment and monitoring.
Machine Learning (ML)
Machine learning is a subset of artificial intelligence that enables a system to learn and improve autonomously from large amounts of data, using techniques such as neural networks and deep learning, without being explicitly programmed.

Machine learning allows computer systems to continuously adjust and enhance themselves as they accrue more “experiences.” Thus, the performance of these systems can be improved by providing larger and more varied datasets to be processed.
Machine Translation
Machine translation is the process of using artificial intelligence to automatically translate text from one language to another without human involvement. Modern machine translation goes beyond simple word-to-word translation to communicate the full meaning of the original language text in the target language. It analyzes all text elements and recognizes how the words influence one another.
Model
A broad term referring to the product of AI training, created by running a machine learning algorithm on training data.
Model Deployment
The process of making a trained machine learning model available for use in applications or systems.
Model Monitoring
Continuously tracking the performance of deployed models to detect and address issues like data drift, concept drift, and performance degradation.
Multimodal AI
AI systems that can process and integrate information from multiple modalities, such as text, images, audio, and video.
N
Narrow AI
Also called “Weak AI” – AI systems designed and trained for a specific task or a narrow range of tasks. They excel in their designated area but lack the ability to generalize their knowledge to other domains. Examples include image recognition systems, spam filters, and recommendation engines. Most AI systems in use today fall under this category.
Natural Language Generation (NLG)
This refers to the process by which a machine turns structured data into text or speech that humans can understand. Essentially, NLG is concerned with what a machine writes or says as the end part of the communication process.
Natural Language Processing Tasks
Specific tasks within NLP, such as sentiment analysis, named entity recognition, machine translation, and text summarization.
Natural Language Understanding (NLU)
As a subset of natural language processing, natural language understanding deals with helping machines to recognize the intended meaning of language — taking into account its subtle nuances and any grammatical errors.
Natural language processing (NLP)
Natural language processing (NLP) is a machine learning technology that gives computers the ability to interpret, manipulate, and comprehend human language. Organizations today have large volumes of voice and text data from various communication channels like emails, text messages, social media newsfeeds, video, audio, and more. They use NLP software to automatically process this data, analyze the intent or sentiment in the message, and respond in real time to human communication.
Neural Network
A neural network is a method in artificial intelligence (AI) that teaches computers to process data in a way that is inspired by the human brain. It is a type of machine learning (ML) process, called deep learning, that uses interconnected nodes or neurons in a layered structure that resembles the human brain. It creates an adaptive system that computers use to learn from their mistakes and improve continuously. Thus, artificial neural networks attempt to solve complicated problems, like summarizing documents or recognizing faces, with greater accuracy.
O
Optical Character Recognition (OCR)
Optical Character Recognition (OCR) is the process that converts an image of text into a machine-readable text format. For example, if you scan a form or a receipt, your computer saves the scan as an image file. You cannot use a text editor to edit, search, or count the words in the image file. However, you can use OCR to convert the image into a text document with its contents stored as text data.
Overfitting
Overfitting is an undesirable machine learning behavior that occurs when the machine learning model gives accurate predictions for training data but not for new data. When data scientists use machine learning models for making predictions, they first train the model on a known data set. Then, based on this information, the model tries to predict outcomes for new data sets. An overfit model can give inaccurate predictions and cannot perform well for all types of new data.
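For illustration, a minimal sketch of overfitting, assuming scikit-learn is installed: an unconstrained decision tree scores nearly perfectly on its training data but worse on held-out data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, split into training and held-out test sets.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)  # no depth limit
print(deep.score(X_tr, y_tr), deep.score(X_te, y_te))  # ~1.0 on train, lower on test
```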
P
Parameter
A configurable value within a model that is learned during training, as distinct from hyperparameters, which are set before training begins.
Pre-training
The initial phase of training a large language model on a massive dataset, typically unsupervised, to learn general language patterns. This is a key step in creating foundation models.
Precision
In the context of classification, precision measures the proportion of correctly predicted positive instances out of all instances predicted as positive. In simpler terms, it answers the question: “Of all the items the model identified as positive, how many were actually positive?” A high precision means that the model has a low false positive rate.
Prescriptive Analytics
Prescriptive analytics is a type of analytics that uses technology to analyze data for factors such as possible situations and scenarios, past and present performance, and other resources to help organizations make better strategic decisions.
Prompt
A prompt is an input that a user feeds to an AI system in order to get a desired result or output.
Prompt Engineering
Prompt engineering is the process where you guide generative artificial intelligence (generative AI) solutions to generate desired outputs. Even though generative AI attempts to mimic humans, it requires detailed instructions to create high-quality and relevant output. In prompt engineering, you choose the most appropriate formats, phrases, words, and symbols that guide the AI to interact with your users more meaningfully. Prompt engineers use creativity plus trial and error to create a collection of input texts, so an application’s generative AI works as expected.
Python
A popular general-purpose programming language, widely used in machine learning and data science.
Q
Quantum Computing
Quantum computing is the process of using quantum-mechanical phenomena such as entanglement and superposition to perform calculations. Quantum machine learning runs machine learning algorithms on quantum computers to expedite work, since certain computations can be performed much faster than on a classical computer.
R
Recall (Sensitivity or True Positive Rate)
Recall measures the proportion of correctly predicted positive instances out of all actual positive instances. It answers the question: “Of all the items that are truly positive, how many did the model correctly identify?” A high recall means that the model has a low false negative rate.
Recurrent Neural Network (RNN)
A recurrent neural network (RNN) is a deep learning model that is trained to process and convert a sequential data input into a specific sequential data output. Sequential data is data—such as words, sentences, or time-series data—where sequential components interrelate based on complex semantics and syntax rules. An RNN is a software system that consists of many interconnected components mimicking how humans perform sequential data conversions, such as translating text from one language to another. RNNs are largely being replaced by transformer-based artificial intelligence (AI) and large language models (LLM), which are much more efficient in sequential data processing.
Regression
A supervised learning technique that predicts a continuous value, such as price or temperature.
Reinforcement Learning
Reinforcement learning (RL) is a machine learning (ML) technique that trains software to make decisions to achieve the most optimal results. It mimics the trial-and-error learning process that humans use to achieve their goals. Software actions that work towards your goal are reinforced, while actions that detract from the goal are ignored.

RL algorithms use a reward-and-punishment paradigm as they process data. They learn from the feedback of each action and self-discover the best processing paths to achieve final outcomes. The algorithms are also capable of delayed gratification. The best overall strategy may require short-term sacrifices, so the best approach they discover may include some punishments or backtracking along the way. RL is a powerful method to help artificial intelligence (AI) systems achieve optimal outcomes in unseen environments.
Reinforcement Learning from Human Feedback (RLHF)
Reinforcement learning from human feedback (RLHF) is a machine learning (ML) technique that uses human feedback to optimize ML models to self-learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize rewards, making their outcomes more accurate. RLHF incorporates human feedback in the rewards function, so the ML model can perform tasks more aligned with human goals, wants, and needs. RLHF is used throughout generative artificial intelligence (generative AI) applications, including in large language models (LLM).
Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization’s internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.
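As a schematic sketch only, the RAG flow might look like the following; embed, vector_store, and llm are hypothetical stand-ins for real embedding, retrieval, and generation components:

```python
def answer_with_rag(question, vector_store, llm, embed, k=3):
    """Schematic RAG pipeline; all components here are hypothetical stand-ins."""
    query_vector = embed(question)                       # 1. embed the user question
    documents = vector_store.search(query_vector, k=k)   # 2. retrieve relevant passages
    context = "\n".join(documents)
    prompt = (                                           # 3. augment the prompt
        f"Answer using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)                                   # 4. generate a grounded answer
```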
S
Sentiment Analysis
Sentiment analysis is the process of analyzing digital text to determine if the emotional tone of the message is positive, negative, or neutral. Today, companies have large volumes of text data like emails, customer support chat transcripts, social media comments, and reviews. Sentiment analysis tools can scan this text to automatically determine the author’s attitude towards a topic. Companies use the insights from sentiment analysis to improve customer service and increase brand reputation.
Speech-to-Text
Speech-to-text is speech recognition software that enables the recognition and translation of spoken language into text through computational linguistics. It is also known as speech recognition or computer speech recognition. Specific applications, tools, and devices can transcribe audio streams in real time to display text and act on it.
Stable Diffusion
Stable Diffusion is a generative artificial intelligence (generative AI) model that produces unique photorealistic images from text and image prompts. It originally launched in 2022. Besides images, you can also use the model to create videos and animations. The model is based on diffusion technology and uses latent space. This significantly reduces processing requirements, and you can run the model on desktops or laptops equipped with GPUs. Stable Diffusion can be fine-tuned to meet your specific needs with as little as five images through transfer learning.

Stable Diffusion is available to everyone under a permissive license. This differentiates Stable Diffusion from its predecessors.
Structured Data
Structured data is data that is defined and searchable. This includes data like phone numbers, dates, and product SKUs.
Supervised Learning
Supervised learning is a type of machine learning in which a model is trained on labeled data so that it learns to map inputs to the correct outputs. It is much more common than unsupervised learning.
Synthetic Data
Synthetic data is non-human-created data that mimics real-world data. It is created by computing algorithms and simulations based on generative artificial intelligence technologies. A synthetic data set has the same mathematical properties as the actual data it is based on, but it does not contain any of the same information. Organizations use synthetic data for research, testing, new development, and machine learning research. Recent innovations in AI have made synthetic data generation efficient and fast but have also increased its importance in data regulatory concerns.
T
Temperature
A parameter in LLMs that controls the randomness of the generated output. Higher temperatures lead to more creative and unpredictable text, while lower temperatures result in more deterministic and focused responses.
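For illustration, a minimal NumPy sketch of temperature-scaled softmax over hypothetical token scores:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert raw model scores into a probability distribution."""
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
print(softmax_with_temperature(logits, 0.5))  # low T: sharp, near-deterministic
print(softmax_with_temperature(logits, 2.0))  # high T: flatter, more random
```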
Text Analysis
Text analysis is the process of using computer systems to read and understand human-written text for business insights. Text analysis software can independently classify, sort, and extract information from text to identify patterns, relationships, sentiments, and other actionable knowledge. You can use text analysis to efficiently and accurately process multiple text-based sources such as emails, documents, social media content, and product reviews, like a human would.
Text Classification
Text classification is the process of assigning predetermined categories to open-ended text documents using artificial intelligence and machine learning (AI/ML) systems. Many organizations have large document archives and business workflows that continually generate documents at scale—like legal documents, contracts, research documents, user-generated data, and email. Text classification is the first step to organize, structure, and categorize this data for further analytics. It allows automatic document labeling and tagging. This saves your organization thousands of hours you’d otherwise need to read, understand, and classify documents manually.
Text-to-Speech (TTS)
Text-to-speech, also known as TTS, is a technology that converts written words into audible speech. An AI voice generator communicates with users when reading a screen is impossible or inconvenient. Text-to-speech technology opens up applications and information to be used in new ways, improving accessibility for individuals who cannot read text on a screen.
Text-to-speech technology has evolved over the last few decades. Deep learning makes it possible to produce very natural-sounding speech that includes pitch, rate, pronunciation, and inflection changes. Today, computer-generated speech is used in various use cases and is becoming ubiquitous in user interfaces. Newsreaders, gaming, public announcement systems, e-learning, telephony, IoT apps and devices, and personal assistants are just starting points.
Time Series Analysis
Methods for analyzing data collected over time to identify trends, seasonality, and other patterns.
Token
A token is a basic unit of text that an LLM uses to understand and generate language. A token may be an entire word or parts of a word.
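For illustration, a hypothetical subword split; real tokenizers learn their splits from data, so actual token boundaries vary by model:

```python
# A hypothetical subword tokenization, for illustration only.
text = "unbelievable"
tokens = ["un", "believ", "able"]  # one word can map to several tokens
print(f"{text!r} -> {tokens} ({len(tokens)} tokens)")
```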
Top-k Sampling
A decoding strategy in LLMs that selects the next token from the k most probable tokens based on the model’s predictions.
Top-p (Nucleus) Sampling
A decoding strategy in LLMs that selects the next token from the smallest set of tokens whose cumulative probability exceeds a threshold p.
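For illustration, a minimal NumPy sketch of both top-k and top-p filtering over a hypothetical next-token distribution:

```python
import numpy as np

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, then renormalize."""
    idx = np.argsort(probs)[-k:]
    filtered = np.zeros_like(probs)
    filtered[idx] = probs[idx]
    return filtered / filtered.sum()

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability exceeds p."""
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1
    filtered = np.zeros_like(probs)
    filtered[order[:cutoff]] = probs[order[:cutoff]]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.1, 0.05, 0.05])  # hypothetical next-token distribution
print(top_k_filter(probs, 2))    # only the 2 most probable tokens remain
print(top_p_filter(probs, 0.85)) # tokens kept until cumulative probability exceeds 0.85
```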
Training Data
Training data is the information or examples given to an AI system to enable it to learn, find patterns, and create new content.
Transfer Learning
Transfer learning (TL) is a machine learning (ML) technique where a model pre-trained on one task is fine-tuned for a new, related task. Training a new ML model is a time-consuming and intensive process that requires a large amount of data, computing power, and several iterations before it is ready for production. Instead, organizations use TL to retrain existing models on related tasks with new data. For example, if a machine learning model can identify images of dogs, it can be trained to identify cats using a smaller image set that highlights the feature differences between dogs and cats.
Transformers
Transformers are a type of neural network architecture that transforms or changes an input sequence into an output sequence. They do this by learning context and tracking relationships between sequence components. For example, consider this input sequence: “What is the color of the sky?” The transformer model uses an internal mathematical representation that identifies the relevancy and relationship between the words color, sky, and blue. It uses that knowledge to generate the output: “The sky is blue.”

Organizations use transformer models for all types of sequence conversions, from speech recognition to machine translation and protein sequence analysis.
Turing Test
The Turing test was created by computer scientist Alan Turing to evaluate a machine’s ability to exhibit intelligence equal to humans, especially in language and behavior. When facilitating the test, a human evaluator judges conversations between a human and machine. If the evaluator cannot distinguish between responses, then the machine passes the Turing test.
U
Unstructured Data
Unstructured data is data that is undefined and difficult to search. This includes audio, photo, and video content. Most of the data in the world is unstructured.
Unsupervised Learning
Unsupervised learning is a type of machine learning in which an algorithm is trained with unclassified and unlabeled data so that it acts without supervision.
V
Validation Data
Structured like training data with an input and labels, this data is used to test a recently trained model against new data and to analyze performance, with a particular focus on checking for overfitting.
Variance
The amount by which a model’s learned function changes in response to fluctuations in its training data. Despite being flexible, models with high variance are prone to overfitting and low predictive accuracy because they are overly reliant on their training data.
Variation
Also called queries or utterances, these work in tandem with intents for natural language processing. The variation is what a person might say to achieve a certain purpose or goal. For example, if the intent is “pay by credit card,” the variation might be “I’d like to pay by card, please.”
Voice Recognition
Voice recognition, also called speech recognition, is a method of human-computer interaction in which computers listen and interpret human dictation (speech) and produce written or spoken outputs. Examples include Apple’s Siri and Amazon’s Alexa, devices that enable hands-free requests and tasks.
W
Weak AI
Also called narrow AI, this is a model that has a set range of skills and focuses on one particular set of tasks. Most AI currently in use is weak AI, unable to learn or perform tasks outside of its specialist skill set.
Z
Zero-Shot Learning
The ability of a model to perform tasks it has never seen before, without any specific training data for those tasks. This is a highly sought-after capability in AI.