Optimizer

The LatticeZero Optimizer automatically tunes scoring profile weights to maximize discrimination between known actives and decoys for your specific target. It uses differential evolution with holdout cross-validation to produce robust, generalizable profiles.

How It Works

The Problem

The default scoring profile uses generic weights for all 14 physics terms. But different targets respond differently — a kinase cares about hinge hydrogen bonds, while a metalloprotease needs metal coordination. The Optimizer finds the weights that best separate your known actives from decoys.

The Algorithm

The Optimizer uses SciPy Differential Evolution (DE) — a robust global optimization algorithm:

  1. Initialization — Start from target-class priors (not random). Each class has physics-informed starting weights based on known binding mechanisms.
  2. Multi-seed holdout — Data is split into 5 train/test folds. The objective minimizes the negative mean holdout AUC across all folds.
  3. Differential evolution — DE explores the weight space, guided by the holdout objective. Parameters: maxiter=500, Sobol initialization, mutation range (0.5, 1.0).
  4. L2 regularization — A penalty term prevents extreme weights, keeping profiles physically interpretable.
  5. Polish step — After DE converges, a local optimizer (L-BFGS-B) fine-tunes the result.
  6. Validation — Final profile is evaluated on all holdout folds with bootstrap confidence intervals.
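
The steps above can be sketched with SciPy's differential_evolution. This is a minimal illustration, not the LatticeZero implementation: the fold construction, the weight bounds, the prior vector, and the helper names (auc, make_folds, objective, optimize) are all assumptions, as is the convention that a higher score means "more active-like". Only maxiter, Sobol initialization, the mutation range, the L2 penalty, and the L-BFGS-B polish come from the description above.

```python
import numpy as np
from scipy.optimize import differential_evolution

def auc(scores, labels):
    """Tie-corrected AUC: P(active score > decoy score), ties count 0.5."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def make_folds(labels, n_folds=5, test_frac=0.3, seed=0):
    """Hypothetical multi-seed holdout: one stratified random split per seed.

    Returns a list of boolean masks that select the holdout rows of each fold.
    """
    folds = []
    for k in range(n_folds):
        rng = np.random.default_rng(seed + k)
        test = np.zeros(labels.size, dtype=bool)
        for cls in (0, 1):
            idx = rng.permutation(np.flatnonzero(labels == cls))
            test[idx[: max(1, int(test_frac * idx.size))]] = True
        folds.append(test)
    return folds

def objective(w, X, labels, folds, lam=0.001):
    """Negative mean holdout AUC plus an L2 penalty on the weights."""
    scores = X @ w
    mean_auc = np.mean([auc(scores[t], labels[t]) for t in folds])
    return -mean_auc + lam * np.dot(w, w)

def optimize(X, labels, prior, maxiter=500, seed=0):
    folds = make_folds(labels, seed=seed)
    bounds = [(-2.0, 2.0)] * X.shape[1]      # assumed weight range
    result = differential_evolution(
        objective, bounds, args=(X, labels, folds),
        x0=prior,                 # warm start from the class prior
        init="sobol", mutation=(0.5, 1.0),
        maxiter=maxiter,
        polish=True,              # SciPy polishes with L-BFGS-B
        seed=seed,
    )
    return result.x
```

Here X is the pre-computed feature matrix (one row per ligand, one column per scoring term) and labels marks actives with 1 and decoys with 0. The x0 and init="sobol" arguments require SciPy 1.7 or later.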

Target-Class Priors

Instead of starting from scratch, the Optimizer uses class-specific priors as warm-starts:

Target Class       Key Prior Weights
Kinase             High E_hbond, moderate burial, strain penalty
Metalloprotease    High E_coul, metal coordination, moderate E_disp
Nuclear receptor   High aromaticBurial, depth, moderate E_disp
Protease           Balanced E_coul and E_disp, moderate strain
PPI                High burial, aromaticBurial, contactArea
GPCR               High E_hbond, burial, moderate E_desolv
Reductase          High E_coul, depth, moderate E_hbond

Fifteen target classes are supported in total; the table above lists seven of them. If your target doesn't match any class, the default prior uses equal weights.

Using the Optimizer

Prerequisites

You need:

  • A prepared target with compiled scoring grid
  • Known actives — SDF file with confirmed binders (minimum 20 recommended)
  • Decoys — SDF file with non-binders (minimum 200 recommended, ideally 30x actives)
  • Pre-docked 3D poses for all compounds
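
Before uploading, it can help to confirm that the files meet the recommended sizes. The sketch below counts records in SDF text by the "$$$$" record terminator and compares the counts against the thresholds listed above. The function names and the warning messages are illustrative assumptions, not part of LatticeZero.

```python
def count_sdf_records(text):
    """Count molecules in SDF text; each record ends with a '$$$$' line."""
    return sum(1 for line in text.splitlines() if line.strip() == "$$$$")

def check_dataset(n_actives, n_decoys, min_actives=20, min_decoys=200, ideal_ratio=30):
    """Return a list of warnings; an empty list means the recommendations are met."""
    warnings = []
    if n_actives < min_actives:
        warnings.append(f"only {n_actives} actives (recommended: {min_actives}+)")
    if n_decoys < min_decoys:
        warnings.append(f"only {n_decoys} decoys (recommended: {min_decoys}+)")
    if n_actives and n_decoys / n_actives < ideal_ratio:
        ratio = n_decoys / n_actives
        warnings.append(f"decoy:active ratio {ratio:.1f} is below the ideal {ideal_ratio}")
    return warnings

# Example with in-memory SDF text; for real files, read the text with open(path).read().
sample = "mol_a\n...\n$$$$\nmol_b\n...\n$$$$\n"
print(count_sdf_records(sample))      # 2
print(check_dataset(25, 750))         # []
```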

Running Optimization

  1. Navigate to the Optimizer page
  2. Select your prepared target
  3. Upload actives SDF and decoys SDF
  4. Select the target class (or "auto-detect")
  5. Click Optimize

The optimization typically takes 30-60 seconds. Progress is displayed in real-time.

Understanding Results

After optimization, you'll see:

Performance metrics:

  • Holdout AUC — mean AUC across all cross-validation folds (the key metric)
  • Bootstrap CI — 95% confidence interval from bootstrap resampling
  • Holdout variance — spread across folds (low = stable, high = data-sensitive)
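
A percentile bootstrap is one common way to obtain such an interval. The sketch below resamples actives and decoys separately with replacement and reports the 2.5th and 97.5th percentiles of the resampled AUCs. Whether the Optimizer uses a stratified percentile bootstrap, and how many resamples it draws, is not stated here, so treat n_boot, the stratification, and the function names as assumptions.

```python
import random

def auc(actives, decoys):
    """Fraction of (active, decoy) pairs ranked correctly; ties count 0.5."""
    wins = sum((a > d) + 0.5 * (a == d) for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))

def bootstrap_ci(actives, decoys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling each class separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        a = rng.choices(actives, k=len(actives))   # sample with replacement
        d = rng.choices(decoys, k=len(decoys))
        stats.append(auc(a, d))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The inputs are the score lists for actives and decoys from a scored holdout set. A wide interval usually indicates too few actives rather than a poor profile.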

Profile visualization:

  • Weight bar chart — relative contribution of each scoring term
  • Before/After comparison — default vs. optimized AUC
  • ROC curve — with confidence band from bootstrap

Credibility gates — automatic checks that the result is trustworthy:

  • Bootstrap AUC lower bound > 0.5 (better than random)
  • Holdout variance < 0.01 (stable across folds)
  • No single weight > 50% of total (no degenerate solutions)
  • Improvement over default > 0.05 AUC (meaningful gain)
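
The four gates can be expressed as simple threshold checks. The sketch below is an illustration under stated assumptions: it uses the population variance of the fold AUCs, measures a weight's share by absolute value, and computes the improvement from the mean holdout AUC. The function name and the dictionary keys are hypothetical.

```python
def credibility_gates(ci_lower, holdout_aucs, weights, default_auc):
    """Evaluate the four credibility gates; returns a dict of gate -> passed."""
    mean_auc = sum(holdout_aucs) / len(holdout_aucs)
    variance = sum((a - mean_auc) ** 2 for a in holdout_aucs) / len(holdout_aucs)
    total = sum(abs(w) for w in weights)
    # Share of the largest weight; an all-zero vector is treated as degenerate.
    max_share = max(abs(w) for w in weights) / total if total else 1.0
    return {
        "better_than_random": ci_lower > 0.5,
        "stable_across_folds": variance < 0.01,
        "no_dominant_weight": max_share <= 0.5,
        "meaningful_gain": mean_auc - default_auc > 0.05,
    }

gates = credibility_gates(
    ci_lower=0.82,
    holdout_aucs=[0.90, 0.92, 0.88, 0.91, 0.89],
    weights=[0.4, 0.3, 0.2, 0.1],
    default_auc=0.70,
)
print(all(gates.values()))   # True
```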

Saving the Profile

If the result passes credibility gates:

  1. Review the optimized weights
  2. Click Save as Profile
  3. Name your profile (e.g., "My Target v1")
  4. The profile is now available in IsoDock and IsoScore

Performance Examples

Results from DEKOIS2 benchmark targets:

Target  Class             Default AUC  Optimized AUC  Time
HIVRT   Viral             0.506        0.944          35s
PPARG   Nuclear receptor  0.707        0.934          32s
HMGR    Reductase         0.721        0.967          29s
CATL    Protease          0.704        0.845          28s
ACE     Metalloprotease   0.820        0.950          40s
SRC     Kinase            0.548        0.711          33s

Note: Optimization time depends on dataset size and hardware speed. The vectorized numpy implementation processes ~130K objective evaluations per optimization run.

Tips for Best Results

  1. More actives = better — 50+ actives gives more reliable holdout estimates than 20
  2. Quality decoys matter — Property-matched decoys (like DEKOIS2) prevent trivial solutions. Random molecules as decoys can lead to overfit profiles that exploit size/charge differences.
  3. Match the target class — Correct class prior gives DE a head start, converging faster to better solutions
  4. Check credibility gates — If any gate fails, the result may not generalize. Consider adding more data or trying a different target class.
  5. Iterate — Run optimization 2-3 times with different random seeds. Consistent results across runs indicate a robust profile.

Technical Details

Feature Budget

The Optimizer works with up to 14 scoring features. Target-class feature masks control which features are active:

  • Kinase: 12 features (metal excluded)
  • Metalloprotease: all 14 features
  • Default: 13 features (metal excluded unless detected)
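
One way to apply such a mask is to let the optimizer search only over the active features and then expand the result back to the full 14-term vector, with masked-out terms fixed at zero. The sketch below illustrates that idea only. The index chosen for the metal term and the helper names are hypothetical, and the actual per-class masks are not reproduced here.

```python
N_FEATURES = 14
METAL_INDEX = 13   # hypothetical position of the metal term

def make_mask(excluded):
    """Boolean mask over all features; True means the feature is optimized."""
    return [i not in excluded for i in range(N_FEATURES)]

def expand_weights(active_weights, mask):
    """Place optimized weights into a full-length vector; masked terms get 0.0."""
    if len(active_weights) != sum(mask):
        raise ValueError("weight count must match the number of active features")
    it = iter(active_weights)
    return [next(it) if active else 0.0 for active in mask]

mask = make_mask({METAL_INDEX})              # a 13-feature mask with metal excluded
full = expand_weights([1.0] * 13, mask)
print(len(full), full[METAL_INDEX])          # 14 0.0
```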

Objective Function

minimize: -mean(holdout_AUC) + lambda * ||w||^2

Where:

  • holdout_AUC is tie-corrected Mann-Whitney U AUC computed on each holdout fold
  • lambda is the L2 regularization strength (default: 0.001)
  • w is the weight vector
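
As a small worked example of these definitions, the sketch below computes a tie-corrected AUC from average ranks (the Mann-Whitney U statistic divided by the number of active/decoy pairs) and plugs it into the objective. It assumes that higher scores rank actives first; the function names are illustrative.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_auc(actives, decoys):
    """AUC = U / (n_actives * n_decoys), with U from the actives' rank sum."""
    ranks = average_ranks(list(actives) + list(decoys))
    n_a, n_d = len(actives), len(decoys)
    u = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    return u / (n_a * n_d)

def objective(fold_aucs, w, lam=0.001):
    """-mean(holdout_AUC) + lambda * ||w||^2"""
    return -sum(fold_aucs) / len(fold_aucs) + lam * sum(x * x for x in w)

# Four pairs: three ranked correctly and one tie (0.5 vs 0.5), so 3.5 / 4.
print(mann_whitney_auc([0.9, 0.5], [0.5, 0.1]))   # 0.875
```

With a single fold AUC of 0.875 and w = [1.0, 2.0], the penalty is 0.001 * 5 = 0.005, so the objective is approximately -0.870.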

Vectorized Computation

The Optimizer uses numpy broadcasting for fast AUC computation:

  • Feature matrix pre-computation: O(N * F) where N = ligands, F = features
  • Scoring: matrix-vector multiply scores = X @ w
  • AUC: vectorized concordance counting via broadcasting

This enables ~130,000 DE evaluations in under 30 seconds on typical hardware.
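
The broadcasting idea can be illustrated by scoring a whole population of candidate weight vectors at once. This is a sketch of the general technique, not the Optimizer's code: the function name, the argument layout, and the batching over candidates are assumptions.

```python
import numpy as np

def batch_auc(X_act, X_dec, W):
    """Tie-corrected AUC for each candidate weight vector.

    X_act: (A, F) features of actives, X_dec: (D, F) features of decoys,
    W: (P, F) candidate weight vectors. Returns an array of shape (P,).
    """
    s_act = W @ X_act.T                                # (P, A) active scores
    s_dec = W @ X_dec.T                                # (P, D) decoy scores
    diff = s_act[:, :, None] - s_dec[:, None, :]       # (P, A, D) pairwise differences
    concordant = (diff > 0).sum(axis=(1, 2))
    ties = (diff == 0).sum(axis=(1, 2))
    return (concordant + 0.5 * ties) / (X_act.shape[0] * X_dec.shape[0])

X_act = np.array([[1.0, 0.0], [2.0, 0.0]])
X_dec = np.array([[0.0, 0.0], [1.0, 0.0]])
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
print(batch_auc(X_act, X_dec, W))   # AUCs of 0.875, 0.125 and 0.5
```

The intermediate array holds P * A * D elements, so for large decoy sets it may be necessary to process the candidates in chunks.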

Next Steps