AI and RAG Systems Glossary

This glossary explains key terms related to Artificial Intelligence (AI), Machine Learning (ML), and Retrieval-Augmented Generation (RAG) systems. These definitions support our clients and partners as they explore AI-based solutions for data retrieval, predictive modeling, fraud detection, and more.

Backtesting

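The process of using historical data to evaluate how well a predictive model or trading strategy would have performed. To give a fair estimate, each prediction should use only the data that was available before the point being predicted. It is a key step in validating the accuracy and reliability of a machine learning or financial forecasting model.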
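
As a rough illustration, a walk-forward backtest can be sketched in a few lines of Python. The function name walk_forward_backtest, the moving-average forecast, and the sample values are all assumptions made for this example, not a recommended model. The point is the structure: each forecast is built only from earlier observations, then scored against what actually happened.

```python
from statistics import mean

def walk_forward_backtest(series, window=3):
    """Score a naive moving-average forecast on historical data.

    Each point is forecast from the mean of the previous `window` points,
    so the forecast never sees the value it is being scored against.
    Returns the mean absolute error over all forecasted points.
    """
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    errors = []
    for t in range(window, len(series)):
        past = series[t - window:t]      # strictly earlier observations
        forecast = mean(past)            # hypothetical stand-in for a real model
        errors.append(abs(series[t] - forecast))
    return mean(errors)

# Hypothetical historical values, for illustration only.
history = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0]
mae = walk_forward_backtest(history, window=3)
print(f"Mean absolute error: {mae:.3f}")
```

In practice the moving average would be replaced by the model under evaluation, refit or updated at each step using only the past slice.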

Clustering

A machine learning technique that groups similar data points into clusters based on a chosen measure of similarity or distance. Clustering is typically unsupervised, meaning the algorithm discovers groupings without labeled training data.

  • K-Means Clustering: A widely used algorithm that partitions data into k clusters based on proximity to cluster centroids.
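  • Elbow Method: A visual technique for choosing the number of clusters in K-Means. The within-cluster sum of squared distances is plotted against k, and the "elbow" where further increases in k bring little improvement suggests a reasonable value.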
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Clusters data based on density, identifying outliers as noise.
  • PCA (Principal Component Analysis): A dimensionality reduction method often used to simplify datasets before clustering.
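  • K-Distance Graph: A sorted plot of each point's distance to its k-th nearest neighbor, used with DBSCAN to estimate a suitable epsilon (neighborhood size) parameter, typically at the point where the curve bends sharply.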
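
To make K-Means and the Elbow Method more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not a production implementation: the helper names (sq_dist, farthest_first_init, kmeans) and the sample points are invented for this example, and the starting centroids are picked with a simple deterministic farthest-first rule rather than the random or k-means++ initialization most libraries use.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first_init(points, k):
    """Deterministic start (an assumption for this sketch): take the first
    point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update.

    Returns (labels, centroids, inertia), where inertia is the
    within-cluster sum of squared distances used by the Elbow Method.
    """
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break                        # assignments stable: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia

# Hypothetical data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

# Elbow Method: compare inertia across candidate values of k.
inertias = {k: kmeans(points, k)[2] for k in (1, 2, 3)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

For this data the inertia falls sharply from k=1 to k=2 and only slightly from k=2 to k=3, which is the "elbow" pattern that points to two clusters.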

Censoring (Survival Analysis)

Censoring occurs when the outcome of interest is only partially observed. This is common in time-to-event data analysis, such as medical studies or loan default timelines.

  • Right-Censoring: The most common type. The event has not occurred by the end of the observation period (e.g., a loan has not defaulted yet), so only a lower bound on the event time is known.
  • Left-Censoring: The event occurred before tracking began, so only an upper bound on the event time is known.
  • Interval-Censoring: The event occurred within a known time range, but the exact time is unknown.
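
One common way to handle right-censored data is the Kaplan-Meier estimator, which keeps censored subjects in the "at risk" group until the time they drop out of observation instead of discarding them. The sketch below is a minimal pure-Python version for illustration; the function name kaplan_meier and the loan figures are assumptions made for this example, and it covers right-censoring only.

```python
def kaplan_meier(durations, event_observed):
    """Kaplan-Meier survival estimate for right-censored data.

    durations:      time until the event or until observation ended
    event_observed: True if the event occurred, False if right-censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, event_observed))
    at_risk = len(data)          # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all subjects whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            # Only observed events lower the survival estimate.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without counting as events.
        at_risk -= leaving
    return curve

# Hypothetical loans: months observed, and whether a default was seen.
months = [6, 12, 12, 18, 24]
defaulted = [True, False, True, True, False]
curve = kaplan_meier(months, defaulted)
for t, s in curve:
    print(t, round(s, 3))
```

Here the loans censored at 12 and 24 months never count as defaults, but they still contribute to the number of loans at risk up to the time they leave observation.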

Coming Soon

We are actively expanding this glossary to include additional AI and RAG terms such as:

  • Embedding Models
  • Vector Databases and Similarity Search Libraries (e.g., Weaviate, FAISS)
  • Prompt Engineering
  • Fine-Tuning
  • Semantic Search
  • Retriever-Reader Architecture

Check back soon or contact us to request a term be added.
