Model Lab

Upload your data and Model Lab benchmarks 8 ML algorithms simultaneously — with automated ranking, diagnostics, SHAP feature importance, and a one-click research report. No ML expertise required.

Overview

8 ML algorithms

Random Forest, GBM, XGBoost, SVM, KNN, LDA, Naive Bayes, and Decision Tree — all compared in one run.

Automatic ranking

A composite score (Accuracy × 0.7 + CV × 0.3) ranks every model. The winner is highlighted automatically.

AI recommendation & report

Get a plain-language model recommendation plus a one-click HTML report with diagnostics and SHAP importance.

Leakage guardrails

Suspicious perfect scores and target leakage are flagged and excluded from the Best Model — so a high score you cannot trust never wins.

Tune, predict & simulate

Tune the winning model with Optuna, predict new data, compare models, and explore what-if scenarios — all from the saved model.

What you need

Data file: CSV or Excel — one file for all models
Target variable: The column you want to predict (classification or regression)
Feature variables: The predictors — numeric or categorical

What you get

Model ranking: All 8 models compared by composite score
Best model card: Winner, accuracy, CV score, generalization gap
Auto recommendation: AI picks the best model with plain-language reasoning
Diagnostics: Overfitting, CV stability, sample size, performance tier
SHAP importance: Feature influence scores for the winning model
Report: One-click HTML download — print to PDF

How it works

Results save automatically — no button to press

Every time you run an ML analysis in Statistical Lab and results appear on screen, the result is saved to Model Lab automatically in the background. You do not upload anything here — Model Lab accumulates results from Statistical Lab and ranks them.

1

Go to Statistical Lab and run an ML analysis

Open any ML analysis — Random Forest, GBM, XGBoost, SVM, KNN, LDA, Naive Bayes, or Decision Tree. Upload data, select variables, and run. The moment results appear, they are saved to Model Lab automatically.

2

Run more models on the same data

Go back and run a different algorithm on the same dataset. Each result saves automatically. Run as many as you want — Model Lab keeps them all.

Task type: Models in Statistical Lab
Classification: Random Forest, GBM, XGBoost, SVM, KNN, LDA, Naive Bayes, Decision Tree
Regression: Random Forest, GBM, XGBoost, SVR, KNN, Decision Tree, Ridge, Lasso
Clustering: K-Means, K-Medoids, DBSCAN, HDBSCAN, GMM, HCA
3

Open Model Lab to compare

Open Model Lab from the dashboard. All saved results appear ranked by composite score. Each panel below shows what you will see.

Model Lab — comparison view (clustering example)

Winner Summary

The top card shows the best-performing model with its Accuracy, CV Score, Generalization Gap, and a plain-language verdict on deployment readiness.

Accuracy: Percentage of correct predictions on the held-out test set.
CV Score: Mean accuracy across k cross-validation folds, a more stable estimate of generalization.
Generalization Gap: Train accuracy minus CV score. Gap < 5% is acceptable; > 10% signals overfitting.
Composite Score: Weighted rank metric: Accuracy × 0.7 + CV × 0.3. Used to order all models.
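
The ranking arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming accuracy and CV score are expressed as fractions between 0 and 1. The function names, the "borderline" label for the 5–10% band, and the sample numbers are all hypothetical and are not taken from Model Lab itself.

```python
def composite_score(accuracy: float, cv_score: float) -> float:
    """Weighted rank metric from the text: Accuracy x 0.7 + CV x 0.3."""
    return 0.7 * accuracy + 0.3 * cv_score


def generalization_gap(train_accuracy: float, cv_score: float) -> float:
    """Train accuracy minus CV score, as defined above."""
    return train_accuracy - cv_score


def gap_verdict(gap: float) -> str:
    """Thresholds from the text: < 5% acceptable, > 10% signals overfitting."""
    if gap < 0.05:
        return "acceptable"
    if gap > 0.10:
        return "overfitting"
    return "borderline"  # name for the 5-10% band is an assumption


# Hypothetical results for three models (illustrative numbers only).
results = {
    "Random Forest": {"accuracy": 0.90, "cv": 0.86, "train": 0.99},
    "XGBoost": {"accuracy": 0.91, "cv": 0.88, "train": 0.95},
    "KNN": {"accuracy": 0.84, "cv": 0.83, "train": 0.86},
}

# Sort model names by composite score, highest first.
ranked = sorted(
    results,
    key=lambda name: composite_score(results[name]["accuracy"], results[name]["cv"]),
    reverse=True,
)

for name in ranked:
    r = results[name]
    gap = generalization_gap(r["train"], r["cv"])
    print(name, round(composite_score(r["accuracy"], r["cv"]), 3), gap_verdict(gap))
```

Because accuracy carries 70% of the weight, a model with a slightly better test accuracy can outrank one with a slightly better CV score, which is why the gap is reported next to the composite score.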

Model Rank Table

All models sorted by composite score. Columns show Accuracy, F1, AUC-ROC (classification), and CV Score ± standard deviation. The winning row is highlighted.

Auto Recommendation

Shows two sections, WHY THIS MODEL and WATCH OUT FOR, based on your dataset's sample size (n), variable count, and class balance.

Model Diagnostics

Each model is checked against four criteria — ✅ pass, ⚠️ warning, ❌ fail.

No overfitting: Generalization gap below threshold.
Adequate sample size: n meets the minimum for the number of predictors.
Stable CV variance: Fold-level standard deviation below 5%.
High performance: Accuracy or R² meets the Excellent / Good / Moderate / Poor threshold.
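
As an illustration of the CV stability criterion, the sketch below computes the mean and standard deviation of per-fold scores and compares the spread to the 5% threshold mentioned above. The fold scores are invented, and the choice of population standard deviation (`pstdev`) is an assumption; the text does not say which variant Model Lab uses.

```python
from statistics import mean, pstdev


def cv_summary(fold_scores):
    """Return (mean, standard deviation) of the per-fold scores."""
    return mean(fold_scores), pstdev(fold_scores)


def cv_is_stable(fold_scores, max_sd=0.05):
    """Stable when the fold-level standard deviation is below the threshold."""
    _, sd = cv_summary(fold_scores)
    return sd < max_sd


# Hypothetical 5-fold accuracy scores for two models.
steady_folds = [0.86, 0.88, 0.84, 0.87, 0.85]
erratic_folds = [0.95, 0.70, 0.88, 0.75, 0.92]

for label, folds in [("steady", steady_folds), ("erratic", erratic_folds)]:
    avg, sd = cv_summary(folds)
    print(f"{label}: CV {avg:.3f} ± {sd:.3f}, stable={cv_is_stable(folds)}")
```

Two models with similar mean CV scores can differ sharply here: a wide spread across folds suggests the score depends heavily on which rows land in the training split.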

SHAP Feature Importance

Mean absolute SHAP values for the winning model. A longer bar = more influence. Features near zero can often be removed without affecting accuracy.
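
To make "mean absolute SHAP value" concrete, the sketch below averages the absolute per-row contributions for each feature and sorts the result. It assumes the SHAP values have already been computed elsewhere (rows are observations, columns are features); the feature names and numbers are hypothetical.

```python
def mean_abs_shap(shap_values, feature_names):
    """Rank features by the mean of |SHAP value| across all rows."""
    n_rows = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n_rows
        for j, name in enumerate(feature_names)
    }
    # Highest mean |SHAP| first, i.e. the longest bar in the chart.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per observation, one column per feature.
features = ["age", "income", "zip_digit"]
shap_matrix = [
    [0.20, -0.50, 0.01],
    [-0.10, 0.40, 0.00],
    [0.30, -0.60, -0.02],
]

ranking = mean_abs_shap(shap_matrix, features)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking the absolute value first matters: a feature that pushes predictions up for some rows and down for others would otherwise average out near zero despite being influential.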

Deployment Readiness

A 5-point checklist: performance threshold, generalization gap, CV stability, sample adequacy, feature signal. Each is pass / warn / fail.

Click into any model to open its full detail card — metrics, variables, feature importance, confusion matrix, and model-health checks:

Per-model detail card
Model Lab — detail card (continued)
4

Export the report

Click Download Report at the top of the results panel to get a formatted HTML file — or use AI Chat to ask follow-up questions about the results.

Download Report

Click Download Report to generate an HTML file containing the winner summary, rank table, diagnostics, SHAP importance, AI interpretation, and deployment readiness.

To save as PDF: Open in browser → Ctrl + P (Cmd + P on Mac) → Save as PDF.

AI Chat

Open AI Chat to ask follow-up questions about the model ranking, feature importance, or next analytical steps. The AI has access to the current run's metrics.

Task types

Model Lab supports three task types. Each produces different results and uses a different set of metrics.

Classification

Auto-detected

Predicts which category an observation belongs to. Used when your target variable is categorical — e.g. spam/not-spam, species, customer segment.

Results shown

Accuracy: Correct predictions / total predictions
F1 Score: Harmonic mean of precision and recall
AUC-ROC: Area under the receiver operating curve
CV Score ± SD: Cross-validation mean and stability
Confusion Matrix: Per-class prediction breakdown
SHAP Importance: Feature influence on predicted class
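
The accuracy and F1 definitions above can be worked through from a binary confusion matrix. This is a minimal sketch for the two-class case only; the counts are invented, and the text does not specify how Model Lab averages F1 across more than two classes.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example accuracy is 0.700 while F1 is about 0.727; the two diverge more strongly when classes are imbalanced, which is why both appear in the rank table.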

Diagnostics

Overfitting check (train vs CV gap)
CV stability (fold variance)
Sample size adequacy (≥200 recommended)
Performance tier (Excellent/Good/Moderate/Poor)

Regression

Auto-detected

Predicts a continuous numeric value. Used when your target is a number — e.g. price, temperature, sales volume.

Results shown

R²: Variance explained by the model (0–1)
RMSE: Root mean squared error (lower is better)
MAE: Mean absolute error (lower is better)
CV Score ± SD: Cross-validation R² mean and stability
SHAP Importance: Feature contribution to predicted value
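
The three regression metrics can be computed directly from actual and predicted values. The sketch below uses the standard textbook definitions with invented numbers; it is an illustration of the formulas, not Model Lab's own code.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """R², RMSE, and MAE for paired actual and predicted values."""
    n = len(y_true)
    errors = [actual - predicted for actual, predicted in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)  # sum of squared errors
    mean_y = sum(y_true) / n
    sst = sum((y - mean_y) ** 2 for y in y_true)  # total sum of squares
    return {
        "r2": 1 - sse / sst,
        "rmse": sqrt(sse / n),
        "mae": sum(abs(e) for e in errors) / n,
    }


# Hypothetical held-out targets and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]

scores = regression_metrics(actual, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

RMSE penalizes the single 1.0-unit miss more heavily than MAE does, so a large gap between the two usually points to a few big errors rather than many small ones.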

Diagnostics

Overfitting check (train vs CV R² gap)
CV stability (fold variance)
Sample size adequacy
R² performance tier

Clustering

No target variable

Groups observations into natural clusters without a predefined label. Used for segmentation, pattern discovery, or anomaly detection. No target variable is required — only feature variables.

Results shown

Silhouette Score: Cluster cohesion vs separation (−1 to 1; ≥0.5 good)
Davies-Bouldin: Average similarity between clusters (lower = better)
Calinski-Harabasz: Cluster density ratio (higher = better)
k (clusters): Number of clusters found or specified
Cluster Sizes: Observations per cluster
Cluster Profiles: Mean feature values per cluster
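
To show what the silhouette score measures, here is a small pure-Python version of the standard definition using Euclidean distance. The points and labels are invented, and the distance metric is an assumption; the text does not say which one Model Lab uses.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for single-point clusters
            continue
        # a: mean distance to the other members of the same cluster (cohesion).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster (separation).
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Two tight, well-separated groups (hypothetical 2-D data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]

score = silhouette_score(points, labels)
print(f"silhouette: {score:.3f}")
```

With clusters this compact and far apart, the score lands well above the 0.5 mark that the table treats as good. Overlapping clusters pull `a` and `b` together and drive the score toward zero.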

Diagnostics

Silhouette quality (strong/reasonable/weak)
Davies-Bouldin separation check (< 1.0 good)
Sample size adequacy (≥50 recommended)
Cluster balance check

Guardrails — when a score is too good to trust

A perfect or near-perfect score is usually a warning, not a win. Model Lab automatically checks every saved model for target leakage, duplicated targets, and suspicious perfect scores. Flagged models are excluded from the Best Model, recommendations, and comparisons — so a score you cannot trust never rises to the top.

Example: R² = 1.000

If a model scores a perfect 1.000, it almost always means one of your feature columns secretly contains the answer (leakage). Model Lab marks it red, removes it from the leaderboard's top spot, and tells you to remove the offending column and re-run.

What gets flagged

Target leakage — a feature that encodes the answer
Duplicated target — the target copied into a feature
Suspicious perfect score — 100% accuracy / R² = 1.000
Class imbalance & overfitting gap (warnings)

What Model Lab does

Shows a red warning on the model card (Model Health)
Excludes the model from Best Model selection
Keeps it out of recommendations & comparisons
Re-checks even after tuning — gains can’t hide leakage
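
The two simplest checks in this list, a duplicated target and a suspicious perfect score, can be sketched as follows. This is a simplified illustration under stated assumptions: the column names, model records, and exact-equality test are hypothetical, and real leakage detection would need broader checks than this.

```python
def guardrail_flags(columns, target, score):
    """Return the reasons a model should not be trusted (empty list if none)."""
    flags = []
    target_values = columns[target]
    for name, values in columns.items():
        # Duplicated target: a feature column identical to the target.
        if name != target and values == target_values:
            flags.append(f"duplicated target in '{name}'")
    # Suspicious perfect score: 100% accuracy or an R² that rounds to 1.000.
    if round(score, 3) >= 1.0:
        flags.append("suspicious perfect score")
    return flags


def best_model(models):
    """Pick the highest-scoring model among those with no flags."""
    trusted = [m for m in models if not m["flags"]]
    return max(trusted, key=lambda m: m["score"]) if trusted else None


# Hypothetical data: 'price_copy' leaks the target into the features.
leaky_data = {
    "price": [100, 150, 200, 250],
    "price_copy": [100, 150, 200, 250],
    "rooms": [2, 3, 3, 4],
}
clean_data = {"price": [100, 150, 200, 250], "rooms": [2, 3, 3, 4]}

models = [
    {"name": "XGBoost", "score": 1.0,
     "flags": guardrail_flags(leaky_data, "price", 1.0)},
    {"name": "Ridge", "score": 0.87,
     "flags": guardrail_flags(clean_data, "price", 0.87)},
]

winner = best_model(models)
print(winner["name"], "wins;", models[0]["name"], "flagged for", models[0]["flags"])
```

The higher score does not win here: the flagged model is filtered out before the maximum is taken, which mirrors the exclusion behavior described above.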

Model card showing a leakage warning (Model Health): a red 'leakage suspected' banner on a flagged model

Tuning the winner

Once the quick comparison finds your top model, you can squeeze out more performance by tuning its hyperparameters. Model Lab uses Optuna to search within a time budget, then re-checks guardrails so a higher score never hides leakage. Only tree-based winners (XGBoost, Random Forest, GBM) can be tuned.

1

Run Auto Compare and find the winner

When the leaderboard settles and the top model is XGBoost, Random Forest, or GBM, a "Tune the winner" panel appears below it.

2

Pick a preset

Fast (~1 min), Balanced (~3 min), or Thorough (~4 min). Longer presets search more hyperparameter combinations and are more likely to find a better setting.

3

Read the result

Model Lab shows the before → after score gain, the best parameters found, the number of trials, and a fresh guardrail re-check verdict.

4

Use the tuned model

The tuned model is saved automatically — use it right away with Predict or the What-if simulator.

The trust difference: after tuning, guardrails run again. If the gain came from leakage, you get a red warning instead of false confidence — the higher number alone is never treated as success.
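
The tune-then-re-check flow can be illustrated with a plain random search under a time budget. This is a standard-library stand-in written for illustration, not Optuna's API and not Model Lab's implementation; the objective function, parameter ranges, and budget are all hypothetical.

```python
import random
import time


def tune(objective, baseline_params, search_space, budget_seconds,
         max_trials=1000, seed=0):
    """Random search within a time budget; keeps the best parameters seen."""
    rng = random.Random(seed)
    before = objective(baseline_params)
    best_params, best_score = dict(baseline_params), before
    trials = 0
    deadline = time.monotonic() + budget_seconds
    while trials < max_trials and time.monotonic() < deadline:
        # Sample each parameter uniformly from its (low, high) range.
        candidate = {k: rng.uniform(low, high)
                     for k, (low, high) in search_space.items()}
        score = objective(candidate)
        trials += 1
        if score > best_score:
            best_params, best_score = candidate, score
    return {"before": before, "after": best_score,
            "best_params": best_params, "trials": trials}


def toy_objective(params):
    # Hypothetical stand-in for a cross-validated score; peaks at 0.95.
    return (0.95 - (params["learning_rate"] - 0.1) ** 2
            - 0.0001 * (params["max_depth"] - 6) ** 2)


result = tune(
    toy_objective,
    baseline_params={"learning_rate": 0.3, "max_depth": 3.0},
    search_space={"learning_rate": (0.01, 0.5), "max_depth": (2.0, 12.0)},
    budget_seconds=0.2,
)

# Re-check after tuning: a score that rounds to 1.000 is a warning, not a win.
needs_review = round(result["after"], 3) >= 1.0
print(f"{result['before']:.3f} -> {result['after']:.3f} "
      f"in {result['trials']} trials, needs_review={needs_review}")
```

A longer budget simply allows more trials, which is the trade-off behind the Fast, Balanced, and Thorough presets. Optuna uses smarter sampling than uniform random draws, so treat this only as a picture of the workflow.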

Tune the winner panel: Fast / Balanced / Thorough preset buttons and the result card with the before/after gain

Predict & compare

Every trained model is saved and reusable — it does not vanish when the analysis ends. Score new data with the saved model, or run the same data through several models to see where they agree and disagree.

Predict

Click Predict on any saved model card. Enter a single row, or upload a CSV for batch scoring. Download the predictions as a CSV.

Single row — fill in the feature values
CSV batch — up to tens of thousands of rows
Missing or unseen categories handled gracefully
Only the training features are required (extras ignored)

Compare predictions

Use Compare predictions in the group header. Select two or more models, upload one CSV, and see each model's prediction side by side.

Rows where models disagree are highlighted
Classification — disagreement rate shown
Regression — prediction range per row
Download the full comparison as CSV
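
The two comparison summaries can be sketched as below: a disagreement rate for classification and a per-row prediction range for regression. The model names and predictions are invented, and defining disagreement as "not all models give the same label" is an assumption about how the rate is computed.

```python
def disagreement_rate(predictions):
    """Share of rows where the models do not all predict the same label."""
    rows = list(zip(*predictions.values()))
    disagreeing = [i for i, row in enumerate(rows) if len(set(row)) > 1]
    return len(disagreeing) / len(rows), disagreeing


def prediction_ranges(predictions):
    """Per-row spread (max minus min) across the models' numeric predictions."""
    return [max(row) - min(row) for row in zip(*predictions.values())]


# Hypothetical classification predictions from three models on four rows.
class_preds = {
    "Random Forest": ["a", "b", "a", "b"],
    "XGBoost": ["a", "b", "b", "b"],
    "KNN": ["a", "a", "b", "b"],
}
rate, rows_to_review = disagreement_rate(class_preds)
print(f"disagreement rate: {rate:.0%}, rows: {rows_to_review}")

# Hypothetical regression predictions from two models on two rows.
reg_preds = {"Random Forest": [10.0, 20.0], "XGBoost": [12.0, 19.5]}
print("ranges:", prediction_ranges(reg_preds))
```

Rows with disagreement or a wide range are the ones worth inspecting by hand, since they are where the choice of model changes the outcome.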

Predict dialog — single row and CSV batch

Compare predictions — models side by side

What-if simulator

Open What-if on any saved model to explore how its prediction responds as you change the inputs. Move a slider or pick a category and the prediction updates live.

Read it correctly: this shows how the model's prediction reacts to inputs; it is not the real-world effect of changing that variable. The model learned associations, not causes. Use it to understand the model, not to conclude “changing X causes Y.”

1

Start from the average, or load a row

Sliders and dropdowns start at the data’s average / most-common values. Or upload a CSV row to explore one specific case.

2

Change inputs and watch the prediction

Drag a slider or pick a category — the predicted value updates live as the model re-scores.

3

Sweep one variable

Pick a numeric variable and plot the model’s response curve across its whole range, holding the others fixed (like a PDP).
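
A one-variable sweep can be pictured as re-scoring the same baseline row while only one input changes. In this sketch `toy_model` is a hypothetical stand-in for a saved model's predict function, and the baseline values and range are invented.

```python
def sweep(predict, baseline, variable, low, high, steps=5):
    """Vary one input across [low, high] while holding the others fixed."""
    curve = []
    for i in range(steps):
        value = low + (high - low) * i / (steps - 1)
        row = dict(baseline, **{variable: value})  # copy; baseline is untouched
        curve.append((value, predict(row)))
    return curve


def toy_model(row):
    # Hypothetical stand-in for a saved model's prediction.
    return 2.0 * row["size"] + 0.5 * row["age"]


# Start from average-like values, then sweep 'size' over its range.
baseline = {"size": 50.0, "age": 10.0}
curve = sweep(toy_model, baseline, "size", low=0.0, high=100.0, steps=5)

for value, prediction in curve:
    print(f"size={value:>5.1f} -> prediction={prediction:.1f}")
```

The resulting curve describes the model's response along one slice through the data, with every other input pinned at the baseline. A different baseline row can produce a different curve when the model has learned interactions.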

Needs a freshly saved model. The simulator reads variable ranges stored when a model is saved. Models saved before this feature was added show a short note — just re-run and save the model again to enable What-if.

What-if simulator — sliders, live prediction, and response curve