Two-Candidate Next-Word/Token Probability (Robust)

Compare P(A|context) vs P(B|context) from a pretrained causal LM (no fine-tuning).

  • Proper tie handling and numerical guards.
  • Optional length normalization (per-token).
  • Use Swap to sanity-check symmetry: exchanging A and B should exchange their results.
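
The comparison described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: it assumes the per-token log-probabilities of each candidate have already been obtained from the model, and the function name `compare_candidates`, the `tie_tol` threshold, and the two-way renormalized "share" are hypothetical choices made for the example.

```python
import math

def compare_candidates(logps_a, logps_b, normalize=False, tie_tol=1e-9):
    """Compare two candidates from their per-token log-probabilities.

    logps_a / logps_b: log P(token_i | context, earlier tokens) for each
    candidate (assumed to be computed elsewhere by a causal LM).
    normalize: if True, use the mean log-prob per token instead of the sum.
    tie_tol: hypothetical tolerance below which scores count as a tie.
    """
    if not logps_a or not logps_b:
        raise ValueError("each candidate needs at least one token")

    score_a = sum(logps_a)
    score_b = sum(logps_b)
    if normalize:
        # Per-token length normalization.
        score_a /= len(logps_a)
        score_b /= len(logps_b)

    top = max(score_a, score_b)
    if top == float("-inf"):
        # Guard: both candidates have zero probability, so the ratio is undefined.
        return {"score_a": score_a, "score_b": score_b,
                "share_a": 0.5, "share_b": 0.5, "winner": "tie"}

    # Two-way softmax, shifted by the max so exp() cannot overflow.
    exp_a = math.exp(score_a - top)
    exp_b = math.exp(score_b - top)
    share_a = exp_a / (exp_a + exp_b)

    diff = score_a - score_b
    if abs(diff) <= tie_tol:
        winner = "tie"
    elif diff > 0:
        winner = "A"
    else:
        winner = "B"

    return {"score_a": score_a, "score_b": score_b,
            "share_a": share_a, "share_b": 1.0 - share_a, "winner": winner}


# Single-token candidates with P(A|context) = 0.6 and P(B|context) = 0.2.
result = compare_candidates([math.log(0.6)], [math.log(0.2)])
swapped = compare_candidates([math.log(0.2)], [math.log(0.6)])
print(result["winner"], swapped["winner"])
```

Swapping the arguments should flip the winner and exchange the two shares, which is the symmetry check the Swap control is meant to support. Length normalization can change the outcome when the candidates tokenize to different lengths, so it is worth reporting which mode was used.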