Model Lab

Run every model.
Find the best one automatically.

Upload your data and Model Lab benchmarks 8 ML algorithms simultaneously — with leakage guardrails, hyperparameter tuning, prediction, what-if simulation, and a one-click research report.

Model Lab — benchmark, tune, and ship the best model

8 · ML Algorithms
Auto · Leakage Guardrails
Optuna · Hyperparameter Tuning
What-if · Predict & Simulate

01 · MODEL COMPARISON

Run 8 models at once. Pick the winner.

One dataset. Eight algorithms. Automatic ranking.

Upload your data and Model Lab runs all applicable models simultaneously — classification or regression. A composite score ranks every model, highlighting the best performer without manual trial and error.

8 ML algorithms — RF, GBM, XGBoost, SVM, KNN, LDA, NB, DT
Composite score ranking — objective, reproducible
Side-by-side metrics: Accuracy, F1, AUC, CV, Gap
Auto-detects classification vs regression, auto-selects variables
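
As a rough illustration of the workflow above, the sketch below benchmarks several scikit-learn classifiers on the iris data and ranks them. The `composite` formula (mean of accuracy, macro F1, and CV mean, minus the train-test gap) is an assumption made for this example, not Model Lab's actual score, and XGBoost is left out so the example depends on scikit-learn only.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(random_state=0),
    "KNN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
    "NB": GaussianNB(),
    "DT": DecisionTreeClassifier(random_state=0),
}


def benchmark(name, model):
    """Fit one model and collect the side-by-side metrics."""
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    pred = model.predict(X_test)
    acc = accuracy_score(y_test, pred)
    f1 = f1_score(y_test, pred, average="macro")
    cv = cross_val_score(model, X, y, cv=5)  # fits clones, 5 stratified folds
    gap = max(train_acc - acc, 0.0)          # train-test gap as an overfit signal
    # Hypothetical composite score: reward accuracy, F1 and CV, penalise the gap.
    composite = (acc + f1 + cv.mean()) / 3 - gap
    return {"model": name, "accuracy": acc, "f1": f1,
            "cv_mean": cv.mean(), "cv_std": cv.std(),
            "gap": gap, "composite": composite}


ranking = sorted(
    (benchmark(name, model) for name, model in models.items()),
    key=lambda row: row["composite"],
    reverse=True,
)

for rank, row in enumerate(ranking, start=1):
    print(f"{rank}  {row['model']:<4} acc={row['accuracy']:.3f} "
          f"f1={row['f1']:.3f} cv={row['cv_mean']:.3f}±{row['cv_std']:.3f} "
          f"gap={row['gap']:.3f}")
```

Fixing `random_state` everywhere is what makes a ranking like this reproducible from run to run.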

Model Ranking · iris.csv · Classification

#	MODEL	ACCURACY	F1	CV
🥇	LDA	100.0%	1.000	98.0±2.7%
🥈	Random Forest	97.3%	0.973	96.1±3.1%
🥉	SVM	96.7%	0.967	95.3±4.5%
4	SVM	96.7%	0.967	95.3±4.5%
5	Decision Tree	93.3%	0.933	95.3±3.4%
6	Random Forest	90.0%	0.900	94.7±2.7%

02 · GUARDRAILS

A 100% score is usually a problem, not a win.

We check whether the score is real — not just high.

Most tools celebrate a perfect score. Model Lab does the opposite: it flags target leakage, duplicated targets, and suspicious perfect scores, then excludes those models from the leaderboard and recommendations. The trust layer no chatbot gives you.

Detects target leakage, duplicated targets, suspicious perfect scores
Leaky models excluded from Best Model, recommendations, and comparisons
Class imbalance and overfit-gap warnings, severity-tagged
Re-checked even after tuning — a higher score never hides leakage
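
To make the guardrail idea concrete, here is a minimal sketch of two of the checks listed above: a feature that is just a relabelled copy of the target, and a suspiciously perfect score. The function names, the 0.999 cutoff, and the rule that only a target copy triggers exclusion are assumptions for illustration, not Model Lab's actual logic. The copy check is crude and only suits discrete targets.

```python
def find_target_copies(features, target):
    """Return names of features whose values map one-to-one onto the target."""
    n_classes = len(set(target))
    copies = []
    for name, values in features.items():
        mapping = {}
        # Every feature value must always appear with the same target value.
        consistent = all(
            mapping.setdefault(value, label) == label
            for value, label in zip(values, target)
        )
        # Same number of distinct values as classes means a pure relabelling.
        if consistent and len(mapping) == n_classes:
            copies.append(name)
    return copies


def guardrail_check(features, target, test_score, perfect=0.999):
    """Collect warnings; exclude the model only when a target copy is found."""
    copies = find_target_copies(features, target)
    flags = [f"'{name}' maps one-to-one onto the target" for name in copies]
    if test_score >= perfect:
        flags.append(f"suspicious perfect score ({test_score:.3f})")
    return {"flags": flags, "excluded": bool(copies)}


# Toy data: 'churn_code' is the target under another name.
features = {
    "age": [25, 40, 31, 58],
    "churn_code": ["N", "Y", "N", "Y"],
}
target = [0, 1, 0, 1]

report = guardrail_check(features, target, test_score=1.0)
for flag in report["flags"]:
    print("WARNING:", flag)
print("excluded from leaderboard:", report["excluded"])
```

Running the same check again on the tuned model is what keeps a higher post-tuning score from masking leakage.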

R² = 1.000

Leakage suspected — excluded from Best Model

03 · DIAGNOSTICS & READINESS

Overfitting? CV instability? Ready to ship?

Every model checked, then a clear go / no-go verdict.

Model Lab evaluates overfitting risk, CV stability, sample adequacy, and predictive performance — then rolls them into a deployment-readiness verdict. Green, amber, or red. No ambiguity.

Overfitting check: train-test gap threshold by category
CV stability: fold-level variance flagged if σ > 5%
Sample-size adequacy: dynamic threshold per variable count
Deployment verdict: Ready / Needs Review / Not Ready, with what to fix
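
The checks above can be sketched as a single function that rolls individual diagnostics into a verdict. Only the σ > 5% CV rule comes from the list above; the 10% gap threshold, the 50-rows-per-variable rule, and the mapping from issue count to verdict are assumptions chosen for illustration.

```python
from statistics import pstdev


def readiness(train_score, test_score, cv_scores, n_rows, n_features,
              max_gap=0.10, max_cv_std=0.05, rows_per_feature=50):
    """Return a verdict plus the list of issues that led to it."""
    issues = []

    # Overfitting: a large train-test gap suggests memorisation.
    gap = train_score - test_score
    if gap > max_gap:
        issues.append(f"overfitting gap {gap:.1%} exceeds {max_gap:.0%}")

    # CV stability: flag fold-to-fold spread above the threshold.
    cv_std = pstdev(cv_scores)
    if cv_std > max_cv_std:
        issues.append(f"CV std {cv_std:.1%} exceeds {max_cv_std:.0%}")

    # Sample adequacy: hypothetical threshold that grows with variable count.
    needed = rows_per_feature * n_features
    if n_rows < needed:
        issues.append(f"n={n_rows}, recommended >= {needed}")

    if not issues:
        verdict = "Ready"
    elif len(issues) == 1:
        verdict = "Needs Review"
    else:
        verdict = "Not Ready"
    return {"verdict": verdict, "issues": issues}


result = readiness(
    train_score=1.00, test_score=0.98,
    cv_scores=[0.97, 1.00, 0.93, 1.00, 1.00],
    n_rows=150, n_features=4,
)
print(result["verdict"])
for issue in result["issues"]:
    print(" -", issue)
```

Returning the issues alongside the verdict is what lets a report say what to fix, not just that something is wrong.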

Model Health · LDA · ✅ Healthy

✅ No overfitting detected

⚠️ Moderate sample size (n=150, recommended ≥200)

⚠️ Moderate CV variance (±2.7%)

✅ High predictive performance

Auto Recommendation · iris.csv

Recommended Model

Linear Discriminant Analysis (LDA)

With n=150 and 4 numeric features, LDA handles linear boundaries exceptionally well.

WHY THIS MODEL

✓ Strong CV Score: 98.0%

✓ Low overfitting gap: 2.0%

✓ F1 Score: 100.0%

WATCH OUT FOR

⚠ Small sample (n=150, rec. ≥200)

04 · ALGORITHM ADVISOR

Which model fits your data — and what to analyze next.

Context-aware suggestions, not generic advice.

Model Lab reads your dataset — sample size, variable count, class balance — and recommends the right algorithm. After you find the winner, it points you to the next analysis (feature importance, decision tree, segmentation) and jumps you there in one click.

Reads n, variables, class balance, data structure
Plain-language justification for every recommendation
Best model → next analysis, with variables pre-filled
Flags edge cases: imbalanced classes, small n, high dimensionality

Feature Importance (SHAP) · LDA

#	FEATURE	MEAN |SHAP|
1	Petal.Length	0.5
2	Petal.Width	0.3
3	Sepal.Length	0.1
4	Sepal.Width	0.1

Mean |SHAP| values · n=150

05 · FEATURE IMPORTANCE & COMPARISON

Which variable actually moves the needle?

SHAP values, permutation importance — and cross-model agreement.

See model-native importance and SHAP values for every predictor. Then compare importance across all models in one table — variables that every model agrees on are the signals you can trust.

SHAP mean |φ| — model-agnostic, consistent across algorithms
Permutation importance — direct effect on accuracy
Cross-model importance table — "robust" badge for agreed-upon variables
Sortable — find disagreements between models fast
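
A cross-model importance table of the kind described above might be sketched as follows. It uses scikit-learn's `permutation_importance` only; SHAP is not computed here. The "robust" rule (a feature must be in every model's top 2) and the choice of three models are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, stratify=data.target, random_state=0
)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(),
}

# importance_table[model][feature] = mean accuracy drop when the feature is shuffled
importance_table = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    result = permutation_importance(
        model, X_test, y_test, n_repeats=20, random_state=0
    )
    importance_table[name] = dict(zip(data.feature_names, result.importances_mean))


def top_features(scores, k):
    """Names of the k features with the highest importance for one model."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


# Hypothetical "robust" badge: the feature is in the top 2 of every model.
TOP_K = 2
robust = set.intersection(
    *(top_features(scores, TOP_K) for scores in importance_table.values())
)

for feature in data.feature_names:
    row = "  ".join(f"{m}={importance_table[m][feature]:+.3f}" for m in models)
    badge = "  [robust]" if feature in robust else ""
    print(f"{feature:<20} {row}{badge}")
```

Measuring importance on held-out rows, as here, ties each number to its effect on test accuracy rather than on training fit.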

06 · HYPERPARAMETER TUNING

Squeeze more out of your winning model.

Optuna search within a time budget — then re-checked for leakage.

Pick a preset — Fast, Balanced, or Thorough — and Model Lab tunes the winning model with Optuna, searching hyperparameters within a time budget. It shows the before/after gain, the best parameters, and re-runs guardrails so a higher score never hides leakage.

Presets map to time budgets: Fast (~1 min) / Balanced (~3 min) / Thorough (~4 min)
Bayesian (TPE) search via Optuna: each trial is guided by earlier results, unlike grid or random search
Before → after gain, best parameters, trials run
Post-tuning guardrail re-check — the gain is real, not leakage

0.90 → 0.927

Thorough preset · 120 trials · re-checked clean

07 · PREDICT & SIMULATE

Your model doesn’t vanish when the analysis ends.

Save the model, predict new data, and explore what-ifs.

Every trained model is saved and reusable. Score new data one row at a time or in bulk, compare how different models predict the same rows, and explore how predictions respond as you change inputs — clearly labeled as model response, not causation.

Trained models persisted — reuse without re-training
Predict new data: single row or CSV batch, downloadable
Compare predictions across models — see where they disagree
What-if simulator: watch predictions react (not a causal claim)
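
The persist, predict, and what-if steps above could look roughly like this. The example uses `pickle` and an LDA model on iris purely for illustration. The `what_if` helper is hypothetical, and how Model Lab actually stores models is not described here.

```python
import pickle

import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

data = load_iris()
model = LinearDiscriminantAnalysis().fit(data.data, data.target)

# Persist the trained model and load it back without re-training.
blob = pickle.dumps(model)
restored = pickle.loads(blob)


def what_if(fitted_model, row, feature_index, values):
    """Vary one input over `values`, hold the rest fixed, return class probabilities.

    This shows how the model responds to its inputs; it is not a causal estimate.
    """
    rows = np.tile(row, (len(values), 1))  # one copy of the row per scenario
    rows[:, feature_index] = values
    return fitted_model.predict_proba(rows)


base_row = data.data[0].copy()
petal_length = 2                       # column index of 'petal length (cm)'
sweep = np.linspace(1.0, 7.0, 7)
probs = what_if(restored, base_row, petal_length, sweep)

for value, p in zip(sweep, probs):
    summary = ", ".join(
        f"{name}={prob:.2f}" for name, prob in zip(data.target_names, p)
    )
    print(f"petal length {value:.1f} cm -> {summary}")
```

Batch prediction is the same call on many rows: `restored.predict(new_rows)` for any array with the same four columns.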

What-if

Model response to inputs — not a causal claim

Auto Report · iris.csv

Download HTML

Winner Summary

🏆 Linear Discriminant Analysis (LDA)

Composite Score: 99.4 · Accuracy: 100.0% · CV: 98.0±2.7%

AI Interpretation

The LDA model achieved excellent performance (Accuracy=100.0%, CV=98.0±2.7%) with strong generalization (Gap=2.0%). Petal.Length was the most influential feature (0.5 SHAP)...

Rank Table · Winner Summary · Diagnostics · SHAP · Deployment

08 · AUTO REPORT

One click. A research-grade report.

Download a complete model comparison report.

Model Lab generates a structured report covering winner summary, full rank table, diagnostic results, guardrail checks, feature importance, and an AI-written interpretation.

Winner summary, rank table, diagnostics, guardrails, importance
AI-written model interpretation
One-click download — print to PDF from browser
Clean, professional layout — no extra formatting needed

Skari

More analysis tools.

Analysis Recommendation

Statistical Lab

Customer Insight Lab

Industry & Policy Lab

Operations Research Lab

Data Editor

Build your model in minutes.
