Phase 2.75: Wiki Ingestion (Versioned, No Math)¶
This page is Developer Documentation. It summarizes Phase 2.75 deliverables and constraints for maintainers.
Phase 2.75 introduces a reliable ingestion pipeline for wiki-derived data.
Scope is intentionally limited to data capture and change detection:
- fetch selected entity tables from wiki pages (cards, bots, guardian chips, ultimate weapons),
- store raw values as strings (whitespace-normalized only),
- detect changes via deterministic hashing,
- version changes without overwriting prior records.
This phase does not feed wiki data into the Analysis Engine or charts.
Data model¶
Wiki-derived rows are stored in definitions.models.WikiData as versioned records:
entity_idis a stable internal key (derived from the canonical name).content_hashidentifies the exact row payload (raw_row).- when content changes, a new row is inserted; prior rows are retained.
last_seenupdates when unchanged content is observed again.- entities missing from the latest scrape are marked
deprecated=True(never deleted).
Management command¶
Fetch and diff the configured wiki table(s):
python manage.py fetch_wiki_data --check
Fetch the three card-list tables under #List_of_Cards:
python manage.py fetch_wiki_data --target cards_list --check
Fetch bots, guardian chips, and ultimate weapons:
python manage.py fetch_wiki_data --target bots --check
python manage.py fetch_wiki_data --target guardian_chips --check
python manage.py fetch_wiki_data --target ultimate_weapons --check
Persist changes to the database:
python manage.py fetch_wiki_data --write
Useful options:
--url: override the page URL (defaults depend on--target)--target:slots,cards_list,bots,guardian_chips, orultimate_weapons--table-index: choose table indexes explicitly (repeatable)--check: dry-run (no DB writes)--write: apply changes (required to persist)
Notes:
- For
--target cards_list, each ingested row includes_wiki_table_labelinraw_rowto preserve which table it came from (e.g., rarity headings). - Some wiki tables repeat column names (ex: multiple “Bits” or “Stones” columns); ingestion suffixes duplicates with
__2,__3, ... to avoid data loss. - Ingestion skips non-entity rows such as table "Total" summary rows and rows that are effectively empty placeholders (empty/
-/null). - For leveled upgrade tables, ingestion also skips rows where all parameter value cells are placeholders (even if Level/cost columns are present).
- During
rebuild_wiki_definitions, placeholder/total leveled rows are also ignored when creating parameter level tables (UWs/Guardians/Bots). - Bot ingestion selects tables under the
#Costanchor to avoid picking up lab tables (currently validated for Thunder Bot).
Next step (optional): Populate Phase 3 models¶
Once definitions.WikiData has been populated, you can rebuild the structured
Definitions layer for browsing/admin/debug visibility:
python manage.py rebuild_wiki_definitions --check
python manage.py rebuild_wiki_definitions --write
Preview field-level diffs before rebuilding:
python manage.py rebuild_wiki_definitions --target guardians --diffs
This step still does not add gameplay math or derived analysis; it simply
materializes structured rows with source_wikidata pointers for traceability.