AI and RAG Systems Glossary
This glossary explains key terms related to Artificial Intelligence (AI), Machine Learning (ML), and Retrieval-Augmented Generation (RAG) systems. These definitions support our clients and partners as they explore AI-based solutions for data retrieval, predictive modeling, fraud detection, and more.
Backtesting
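The process of evaluating a predictive model or trading strategy on historical data to estimate how it would have performed had it been in use at the time. Each prediction should rely only on information that was available at that point in the history. It is a key step in validating the accuracy and reliability of a machine learning or financial forecasting model.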
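As an illustration, a walk-forward backtest can be sketched in a few lines. This is a minimal sketch, not a production method: the moving-average forecaster, the function name walk_forward_backtest, the window parameter, and the sample history are all assumptions chosen for the example.

```python
from statistics import mean


def walk_forward_backtest(series, window=3):
    """Step through the history and forecast each point from the
    `window` values that precede it, then report the mean absolute error.

    The moving-average forecaster is a stand-in (assumption); any model
    that only sees data before time t could be swapped in.
    """
    errors = []
    for t in range(window, len(series)):
        forecast = mean(series[t - window:t])  # uses only data before t
        errors.append(abs(series[t] - forecast))
    return mean(errors)


# Hypothetical historical values, for illustration only.
history = [10, 12, 11, 13, 14, 13, 15]
mae = walk_forward_backtest(history, window=3)
print(f"Mean absolute error over the backtest: {mae:.2f}")
```

The important design choice is that the forecast at each step never sees the value it is being scored against, which is what separates a backtest from simply fitting the model to the full history.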
Clustering
A machine learning technique that groups similar data points into clusters according to a chosen similarity or distance measure. Clustering is typically unsupervised, meaning the algorithm discovers groupings without labeled training data.
- K-Means Clustering: A widely used algorithm that partitions data into k clusters based on proximity to cluster centroids.
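- Elbow Method: A visual technique for choosing the number of clusters in K-Means. The within-cluster sum of squared distances is plotted against k, and the "elbow" where the curve flattens suggests a reasonable value.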
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Clusters data based on density, identifying outliers as noise.
- PCA (Principal Component Analysis): A dimensionality reduction method often used to simplify datasets before clustering.
- K-Distance Graph: A sorted plot of each point's distance to its k-th nearest neighbor, used with DBSCAN to estimate a suitable epsilon (neighborhood radius) parameter.
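To make the K-Means idea concrete, the following is a minimal sketch in plain Python. The function name kmeans, the fixed iteration count, the naive "first k points" initialization, and the sample points are assumptions made for illustration; library implementations use more careful initialization and convergence checks.

```python
import math


def kmeans(points, k, iterations=10):
    """Partition `points` (tuples of floats) into k clusters.

    Initialization simply takes the first k points as centroids,
    which is an assumption made to keep the example deterministic.
    """
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels


# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Running the same function for several values of k and recording the total squared distance from each point to its centroid would give the curve used by the Elbow Method.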
Censoring (Survival Analysis)
Censoring occurs when the outcome of interest is only partially observed. This is common in time-to-event data analysis, such as medical studies or loan default timelines.
- Right-Censoring: The most common type. The event has not occurred by the end of the observation period (e.g., a loan has not yet defaulted).
- Left-Censoring: The event occurred before tracking began, so the exact time is unknown.
- Interval-Censoring: The event occurred within a known time range, but the exact time is unknown.
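Right-censored data is commonly stored as a duration paired with a flag saying whether the event was actually observed. The sketch below shows one way to build such records. The function name observe, the month-based time units, and the loan examples are hypothetical and exist only for illustration.

```python
from typing import Optional, Tuple


def observe(start: float, event_time: Optional[float], study_end: float) -> Tuple[float, bool]:
    """Return (duration, event_observed) for one subject.

    If the event happens after `study_end`, or never happens, the subject
    is right-censored: we only know the event time exceeds the duration.
    """
    if event_time is not None and event_time <= study_end:
        return event_time - start, True
    return study_end - start, False


# Hypothetical loans as (origination month, default month or None, last observed month).
loans = [
    (0, 14, 24),    # defaulted at month 14: event observed
    (6, None, 24),  # no default by month 24: right-censored
    (10, 30, 24),   # default occurs after observation ends: right-censored
]
records = [observe(*loan) for loan in loans]
print(records)
```

Keeping the censored rows, rather than dropping them or treating them as defaults, preserves the partial information that the loan survived at least that long.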
Coming Soon
We are actively expanding this glossary to include additional AI and RAG terms such as:
- Embedding Models
- Vector Databases and Similarity Search Libraries (e.g., Weaviate, FAISS)
- Prompt Engineering
- Fine-Tuning
- Semantic Search
- Retriever-Reader Architecture
Check back soon, or contact us to request that a term be added.