A Study of Nash Equilibria in Pokémon

This project builds a Pokémon battle AI to investigate where the game's Nash equilibria actually lie. There's no point keeping the findings to myself, so I'm sharing them here — with playable, GPU-free demos of the learned policies.

Stage 3b

Cycling in an extreme matchup — Goodra-Hisui vs Cloyster

Goal. Build an extreme favorable/unfavorable matchup and see whether the decision collapses to a single option — always switch, or always attack.

Setup. I paired a Special-Defense-focused Hisuian Goodra with a Defense-focused Cloyster and tuned their movesets to create a clear-cut favorable/unfavorable matchup. Attack and Special Attack are set to the same value, and every other real (in-battle) stat is identical between the two except Defense and Special Defense, which are swapped. Shock Wave (Special / Electric / 60 BP) hits Cloyster super-effectively and Bulldoze (Physical / Ground / 60 BP) hits Hisuian Goodra super-effectively, but each does very little to the other Pokémon, so switching sharply cuts the damage taken. Movesets are mirrored across the two teams, so for both sides "switching out of a bad matchup gives you a good one," which makes cycling battles likely. Both Pokémon hold a Covert Cloak, which blocks the secondary effects of incoming moves. I trained the AI on this setup and studied the strategy it converged to.

Result. Even in a seemingly clear-cut matchup, surprisingly complex mind games emerged. The tricky part is that, constrained by real moves, I couldn't make the relationship perfectly symmetric: Bulldoze is neutral (1×) against Cloyster, while Shock Wave is resisted (½×) by Hisuian Goodra. As a result, one very strong line was to hold back Shock Wave even in a favorable matchup, make a bait switch in anticipation of the opponent's switch, and land Bulldoze instead.
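The asymmetry can be checked directly from the type chart. The sketch below is a minimal illustration, not the project's code: it hard-codes only the type interactions this stage needs (every other pair is treated as neutral), and the names `CHART`, `TYPES`, `MOVES`, and `multiplier` are made up for this example.

```python
# Partial type chart: only the non-neutral pairs relevant to Stage 3b.
# Any (attacking type, defending type) pair not listed is neutral (1x).
CHART = {
    ("Electric", "Water"): 2.0,
    ("Electric", "Dragon"): 0.5,
    ("Ground", "Steel"): 2.0,
}

TYPES = {
    "Goodra-Hisui": ("Steel", "Dragon"),
    "Cloyster": ("Water", "Ice"),
}

MOVES = {"Shock Wave": "Electric", "Bulldoze": "Ground"}


def multiplier(move: str, target: str) -> float:
    """Multiply the per-type effectiveness over both of the target's types."""
    result = 1.0
    for defending_type in TYPES[target]:
        result *= CHART.get((MOVES[move], defending_type), 1.0)
    return result


for move in MOVES:
    for target in TYPES:
        print(f"{move:10s} -> {target:12s}: {multiplier(move, target)}x")
```

Both moves are 2× on their intended target, but off-target Bulldoze is 1× while Shock Wave is only ½×, which is the gap that makes Bulldoze the safer move to throw into an expected switch.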


Stage 3c

A truly extreme matchup — Goodra vs Cloyster

Goal. Build a truly extreme favorable/unfavorable matchup and test whether the decision collapses to a single option.

Setup. I paired a Special-Defense-focused Goodra with a Defense-focused Cloyster, then gave them two fictional moves: a 60-BP physical Fairy move and a 60-BP special Fighting move. Real stats are identical, and because Fairy is super effective against Dragon-type Goodra but neutral against Water/Ice Cloyster, while Fighting is the reverse, the type-effectiveness relationships are perfectly symmetric.

Result. The strategy became considerably simpler than in Stage 3b, yet complex mind games still emerged. Knocking out just one of the opponent's two Pokémon denies them any further switching, which gives you an edge, so bait-switching to focus fire on a single target was effective. In the end, a favorable matchup did not collapse to "always attack," nor an unfavorable one to "always switch."
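Why neither side settles on a single option can be seen in a one-turn toy model. The sketch below solves a 2×2 zero-sum game in closed form; it is only an illustration, since the real battle is a multi-turn game with damage rolls, and the payoff numbers are invented for the example rather than taken from the trained AI. Rows are the player in the favorable matchup (attack, or bait-switch); columns are the opponent (stay, or switch).

```python
def solve_2x2(a, b, c, d):
    """Equilibrium of the zero-sum game [[a, b], [c, d]] (payoffs to the row player).

    Returns (p, q, value): p is the probability the row player picks row 1,
    q is the probability the column player picks column 1.
    """
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # Saddle point: both sides play a pure strategy.
        p = 1.0 if min(a, b) >= min(c, d) else 0.0
        q = 1.0 if max(a, c) <= max(b, d) else 0.0
        return p, q, float(maximin)
    # No saddle point: each side mixes so the other is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical payoffs (arbitrary units of advantage for the row player):
#                 opponent stays   opponent switches
# attack               +3                -1
# bait-switch          -2                +2
p_attack, q_stay, value = solve_2x2(3, -1, -2, 2)
print(p_attack, q_stay, value)  # 0.5 0.375 0.5
```

With these made-up payoffs the favored side attacks only half the time, and the opponent stays in 37.5% of the time. Any payoff table where the best reply depends on what the other side does has the same shape: a pure "always attack" would simply be exploited by always switching.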


Stage 3d

Neutral coverage in an extreme matchup — Goodra-Hisui vs Cloyster

Goal. Return to Stage 3b and give every Pokémon two neutral attacks — Crunch and Dark Pulse — to see how a useful hit that does not exploit a weakness changes the equilibrium.

Setup. Species, stats, teams, and super-effective moves are identical to Stage 3b. Every Pokémon now has its original coverage move plus 80-BP physical Crunch and 80-BP special Dark Pulse.

Result. The equilibrium switch rate fell from 55.1% to 23.6%, and the average battle length fell from 30.0 to 11.0 turns. Yet the strategy became more mixed: even in favorable matchups, a neutral attack is chosen 54.7% of the time to punish an expected switch.


Stage 3e

Neutral coverage in a truly extreme matchup — Goodra vs Cloyster

Goal. Stage 3d added neutral coverage to Stage 3b. Stage 3e applies the same change to Stage 3c, completing a 2×2 grid. That grid isolates a question Stage 3d alone could not answer: did 3d's dramatic collapse in switching come from the neutral attacks themselves, or from the asymmetry that was baked into Stage 3b?

Setup. Species, stats, and the super-effective move assignment are identical to Stage 3c. Every Pokémon now also carries 80-BP physical Crunch and 80-BP special Dark Pulse. Because the matchup stays perfectly symmetric, the equilibrium value must satisfy V(s) + V(mirror of s) = 1, where the mirror of s is the same state with the two sides swapped. This is a strict check on the solver that Stage 3d could not provide, and it holds to within one quantization step across all 3,380,000 states.
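A check of this kind can be sketched in a few lines. Everything concrete below is an assumption made for illustration: that V is stored as an integer from 0 to `LEVELS` (so one quantization step is 1/`LEVELS`), that a state is a pair of per-side summaries, and that `mirror` simply swaps the two sides. The project's actual state encoding and quantization are not described here.

```python
LEVELS = 255  # assumed: values stored as integers 0..255


def mirror(state):
    """Swap the two sides of a state (hypothetical state layout)."""
    side_1, side_2 = state
    return (side_2, side_1)


def symmetry_violations(table, levels=LEVELS, tol_steps=1):
    """List states where V(s) + V(mirror(s)) misses 1 by more than tol_steps."""
    bad = []
    for state, quantized_value in table.items():
        total = quantized_value + table[mirror(state)]
        if abs(total - levels) > tol_steps:
            bad.append((state, total))
    return bad


# Toy table with made-up values; each side is (active Pokémon, HP %).
toy = {
    (("Goodra", 80), ("Cloyster", 40)): 180,
    (("Cloyster", 40), ("Goodra", 80)): 76,  # 180 + 76 = 256, one step off: accepted
    (("Goodra", 50), ("Goodra", 50)): 128,   # self-mirror: 128 + 128 = 256, accepted
}
print(symmetry_violations(toy))  # []
```

The one-step tolerance matters because a self-mirrored state has an exact value of ½, which an odd number of steps cannot represent, so rounding alone can put the sum one step away from 1.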

Result. The answer is the neutral attacks themselves. The switch rate fell from 54.1% to 24.5% and the average battle length from 31.3 to 11.1 turns, almost exactly Stage 3d's numbers even with the asymmetry removed. A neat surprise also appeared: an attack that looks strictly worse on every damage roll is still played. When the foe is nearly fainted, the excess damage of the "better" move is wasted, and only the damage that would carry over to a switch-in matters.
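The figures quoted for Stages 3d and 3e can be laid out as the 2×2 comparison the Goal describes. The snippet below only re-arranges those published numbers; the names `GRID` and `effects` are made up for this example.

```python
# (switch % before, switch % after, avg turns before, avg turns after),
# as quoted in the Result paragraphs of Stage 3d and Stage 3e.
GRID = {
    "asymmetric (3b -> 3d)": (55.1, 23.6, 30.0, 11.0),
    "symmetric  (3c -> 3e)": (54.1, 24.5, 31.3, 11.1),
}


def effects(grid):
    """Drop in switch rate (percentage points) and in turns, per matchup type."""
    out = {}
    for name, (switch_before, switch_after, turns_before, turns_after) in grid.items():
        out[name] = (
            round(switch_before - switch_after, 1),
            round(turns_before - turns_after, 1),
        )
    return out


for name, (switch_drop, turn_drop) in effects(GRID).items():
    print(f"{name}: switching -{switch_drop} points, battles -{turn_drop} turns")
```

Adding neutral attacks cuts switching by 31.5 points in the asymmetric matchup and 29.6 points in the symmetric one, a difference of under 2 points, so the asymmetry accounts for very little of the effect.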

