Core Parsers
Developer-facing API reference for the core.parsers package.
Package
core.parsers
Parsing utilities for raw game text inputs.
Modules
core.parsers.battle_report
Best-effort Battle Report parsing utilities.
Phase 1 intentionally extracted only a small subset of run metadata needed for the first chart. Phase 3 extends the extracted subset to support Battle History table columns while keeping the same guiding rules:
- Unknown labels are non-fatal.
- Raw report text is always preserved unchanged when persisted.
_COMPACT_VALUE_RE = re.compile('^\\$?[\\d,]+(?:\\.\\d+)?[A-Za-z]?$')
module-attribute
_LABELS = {'battle date': 'battle_date', 'tier': 'tier', 'wave': 'wave', 'game time': 'game_time', 'real time': 'real_time', 'killed by': 'killed_by', 'coins': 'coins_earned', 'coins earned': 'coins_earned', 'cash earned': 'cash_earned', 'interest earned': 'interest_earned', 'gem blocks tapped': 'gem_blocks_tapped', 'cells earned': 'cells_earned', 'reroll shards earned': 'reroll_shards_earned'}
module-attribute
_LABEL_KEYS_BY_LENGTH = tuple(sorted(_LABELS.keys(), key=len, reverse=True))
module-attribute
_LABEL_SEPARATOR = '(?:[ \\t]*:[ \\t]*|\\t+[ \\t]*|[ \\t]{2,})'
module-attribute
_LABEL_VALUE_RE = re.compile(f'(?im)^[ \t]*(?P<label>.+?){_LABEL_SEPARATOR}(?P<value>.*?)[ \t]*$')
module-attribute
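To illustrate how the separator and label/value patterns split a line, the sketch below rebuilds both expressions from their documented values and runs them over a small, made-up report fragment. The sample text and the `split_lines` helper are illustrative assumptions, not part of the package.

```python
import re

# Rebuilt from the documented values of _LABEL_SEPARATOR and _LABEL_VALUE_RE.
LABEL_SEPARATOR = r"(?:[ \t]*:[ \t]*|\t+[ \t]*|[ \t]{2,})"
LABEL_VALUE_RE = re.compile(
    rf"(?im)^[ \t]*(?P<label>.+?){LABEL_SEPARATOR}(?P<value>.*?)[ \t]*$"
)


def split_lines(raw_text: str) -> list[tuple[str, str]]:
    """Hypothetical helper: return (label, value) for every matching line."""
    return [
        (m.group("label"), m.group("value"))
        for m in LABEL_VALUE_RE.finditer(raw_text)
    ]


# Made-up sample: one colon separator, one tab, one run of spaces.
sample = "Tier: 11\nWave\t4521\nCoins earned    1.24T\n"
pairs = split_lines(sample)
```

A single space never counts as a separator here, which is why "Coins earned" survives as one label.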
ParsedBattleReport
dataclass
Parsed output for Battle Report ingestion.
Attributes:
| Name | Type | Description |
|---|---|---|
| checksum | str | SHA-256 checksum of the normalized raw text. |
| battle_date | datetime \| None | Parsed battle datetime (UTC) if present. |
| tier | int \| None | Parsed tier value if present. |
| wave | int \| None | Parsed wave value if present. |
| game_time_seconds | int \| None | Parsed game time duration in seconds if present. |
| real_time_seconds | int \| None | Parsed real time duration in seconds if present. |
| killed_by | str \| None | Parsed killed-by label if present. |
| coins_earned | int \| None | Parsed coins earned as an integer if present. |
| coins_earned_raw | str \| None | Raw coins earned string if present. |
| cash_earned | int \| None | Parsed cash earned as an integer if present. |
| cash_earned_raw | str \| None | Raw cash earned string if present. |
| interest_earned | int \| None | Parsed interest earned as an integer if present. |
| interest_earned_raw | str \| None | Raw interest earned string if present. |
| gem_blocks_tapped | int \| None | Parsed gem blocks tapped as an integer if present. |
| cells_earned | int \| None | Parsed cells earned as an integer if present. |
| reroll_shards_earned | int \| None | Parsed reroll shards earned as an integer if present. |
RawBattleReportFields
dataclass
Raw field values extracted from a Battle Report.
This dataclass stores the untrusted, raw string values extracted from the report. Normalization/parsing into typed values happens separately.
Attributes:
| Name | Type | Description |
|---|---|---|
| battle_date | str \| None | Raw battle date string if present. |
| tier | str \| None | Raw tier string if present. |
| wave | str \| None | Raw wave string if present. |
| game_time | str \| None | Raw game time string if present. |
| real_time | str \| None | Raw real time string if present. |
| killed_by | str \| None | Raw "Killed By" string if present. |
| coins_earned | str \| None | Raw coins earned string if present. |
| cash_earned | str \| None | Raw cash earned string if present. |
| interest_earned | str \| None | Raw interest earned string if present. |
| gem_blocks_tapped | str \| None | Raw gem blocks tapped string if present. |
| cells_earned | str \| None | Raw cells earned string if present. |
| reroll_shards_earned | str \| None | Raw reroll shards earned string if present. |
Unit
UnitContract
dataclass
Contract describing the expected unit type for a parsed value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| unit_type | UnitType | Expected UnitType for the value. | required |
| allow_zero | bool | Whether a numeric zero is considered valid. | True |
UnitType
Bases: Enum
Supported unit categories for Phase 1.5.
UnitValidationError
Bases: ValueError
Raised when a quantity does not satisfy a unit contract.
_dedupe_preserve_order(items)
Return a tuple with duplicates removed (stable order).
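One common way to get stable de-duplication is to lean on the insertion order of `dict` keys. The sketch below assumes that approach; the package's actual implementation is not shown in this reference and may differ.

```python
from collections.abc import Iterable


def dedupe_preserve_order(items: Iterable[str]) -> tuple[str, ...]:
    """Drop repeated items while keeping the first occurrence of each."""
    # dict keys keep insertion order, so they are the unique items
    # in first-seen order.
    return tuple(dict.fromkeys(items))
```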
_extract_raw_fields(raw_text)
Extract raw values from Battle Report text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| RawBattleReportFields | RawBattleReportFields with best-effort extracted strings. Unknown labels, missing sections, and malformed lines are treated as non-fatal. |
_extract_unknown_unit_suffixes(raw_text)
Extract unknown compact magnitude suffixes from Battle Report values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| set[str] | A set of magnitude suffixes not recognized by the compact parser. |
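As a rough illustration of the idea, the sketch below scans already-split value strings rather than raw report text and collects trailing letters that are not in a caller-supplied set. The compact-value pattern is copied from the documented `_COMPACT_VALUE_RE`; the `unknown_suffixes` name, its signature, and the `known` set are assumptions made for the example.

```python
import re

# Same pattern as the documented _COMPACT_VALUE_RE.
COMPACT_VALUE_RE = re.compile(r"^\$?[\d,]+(?:\.\d+)?[A-Za-z]?$")


def unknown_suffixes(values: list[str], known: set[str]) -> set[str]:
    """Hypothetical helper: collect trailing letters not present in `known`."""
    found: set[str] = set()
    for value in values:
        value = value.strip()
        # Only compact-looking values are inspected; anything else is skipped.
        if COMPACT_VALUE_RE.match(value) and value[-1].isalpha():
            if value[-1] not in known:
                found.add(value[-1])
    return found


result = unknown_suffixes(
    ["7.67M", "$55.90M", "12Z", "1,234", "n/a"],
    known={"K", "M", "B", "T"},  # assumed set, for illustration only
)
```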
_iter_label_value_lines(raw_text)
Return a best-effort list of (label, value) pairs from report text.
Notes
Battle Reports contain a mix of sections and labels. This function tolerates extra whitespace, reordered sections, and previously unseen labels by extracting only the labels we know how to interpret. This keeps parsing robust when the clipboard converts tabs into spaces (including single spaces).
_normalize_label(label)
Normalize a Battle Report label for dictionary lookup.
_parse_battle_date(value)
Parse a battle date string into a timezone-aware UTC datetime.
_parse_compact_int(value, *, unit_type, allow_zero=False)
Parse compact Battle Report numbers (e.g. 7.67M, $55.90M) into an int.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | str \| None | Raw value string. | required |
| unit_type | UnitType | Unit category for parsing. | required |
| allow_zero | bool | Whether to accept zero as a valid value. | False |
_parse_hms_seconds(value)
Parse HH:MM:SS or MM:SS formatted durations.
_parse_int(value)
Parse a base-10 integer if possible.
_parse_real_time_seconds(value)
Parse a real-time duration string into seconds.
_parse_text(value)
Return a trimmed string, or None when empty.
_parse_unit_duration_seconds(value)
Parse durations like 1h 2m 3s or 45m 10s.
_split_name_list(raw)
Split a comma-delimited name list into normalized display names.
_try_parse_iso_datetime(value)
Try parsing ISO-8601 datetime strings (best-effort).
battle_date_is_fallback(raw_text)
Return True when the Battle Report lacks a usable battle date line.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| bool | True when the report lacks a Battle Date entry, otherwise False. |
compute_battle_report_checksum(raw_text)
Compute a deterministic checksum for a Battle Report.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| str | A hex-encoded SHA-256 checksum. |
Notes
The checksum is computed on a normalized form of the raw text to make pastes robust to common newline differences. The stored raw text is not modified.
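The note above can be sketched as follows. The exact normalization is not specified in this reference, so unifying newline styles and trimming surrounding whitespace is an assumption; the hashing itself uses the standard `hashlib.sha256`.

```python
import hashlib


def compute_checksum(raw_text: str) -> str:
    """Hash a newline-normalized copy of the text; the input is not modified."""
    # Assumed normalization: CRLF and CR become LF, then outer whitespace is trimmed.
    normalized = raw_text.replace("\r\n", "\n").replace("\r", "\n").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# The same report pasted with Windows or Unix newlines gives one checksum.
windows_paste = compute_checksum("Tier: 1\r\nWave: 2\r\n")
unix_paste = compute_checksum("Tier: 1\nWave: 2")
```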
extract_bot_usage(raw_text)
Extract bot usage name list from raw Battle Report text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, ...] | Tuple of bot display names in best-effort order. |
extract_ultimate_weapon_usage(raw_text)
Extract Ultimate Weapon usage name lists from raw Battle Report text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, ...] | Tuple of (combat_ultimate_weapons, utility_ultimate_weapons), where each is a tuple of UW display names in best-effort order. |
Notes
This parser is intentionally best-effort:
- Unknown labels are ignored.
- Unknown or empty list items are ignored.
- The raw text is never modified; this only extracts names for run association tables.
fallback_battle_date(parsed_battle_date, *, parsed_at)
Return a usable battle date, falling back to the parse timestamp.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parsed_battle_date | datetime \| None | Parsed battle date from the report, if present. | required |
| parsed_at | datetime \| None | Timestamp when the report was imported. | required |
Returns:
| Type | Description |
|---|---|
| datetime \| None | Parsed battle date when available; otherwise the import timestamp. |
is_known_magnitude_suffix(suffix)
Return True when a compact magnitude suffix is recognized.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| suffix | str | Raw magnitude suffix (e.g. | required |
Returns:
| Type | Description |
|---|---|
| bool | True when the suffix is supported by the compact number parser. |
parse_battle_report(raw_text)
Parse a Battle Report into typed metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| ParsedBattleReport | ParsedBattleReport containing a checksum and any extracted metadata. |
parse_quantity(raw_value, *, unit_type=UnitType.count)
Parse a compact quantity string into a normalized Decimal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_value | str | Raw value string (e.g. | required |
| unit_type | UnitType | Unit category to assign for non-annotated values. | count |
Returns:
| Type | Description |
|---|---|
| Quantity | Quantity where |
Notes
- A leading x forces unit_type=multiplier and parses the remainder.
- A trailing % forces unit_type=multiplier and normalizes as a fraction (e.g. 15% -> 0.15).
- Magnitude suffixes are case-insensitive except for Q (quintillion).
- Supported suffixes include lowercase k..q and uppercase Q.
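The rules in the notes can be sketched as below. The `Quantity` field names, the two-member `UnitType` enum, and every power of ten other than Q (quintillion) are assumptions made for the example; only the x, % and Q behavior comes from the notes.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class UnitType(Enum):  # stand-in for the package's enum
    count = "count"
    multiplier = "multiplier"


@dataclass(frozen=True)
class Quantity:  # hypothetical field names
    raw_value: str
    normalized_value: Decimal
    unit_type: UnitType


# Assumed exponents; only Q = quintillion is stated in the notes.
CASE_INSENSITIVE = {"k": 3, "m": 6, "b": 9, "t": 12}
CASE_SENSITIVE = {"q": 15, "Q": 18}


def parse_quantity(raw_value: str, *, unit_type: UnitType = UnitType.count) -> Quantity:
    text = raw_value.strip().replace(",", "")
    scale = Decimal(1)
    if text[:1] == "x":  # a leading x forces the multiplier unit
        unit_type, text = UnitType.multiplier, text[1:]
    if text.endswith("%"):  # a percentage is normalized to a fraction
        unit_type, text, scale = UnitType.multiplier, text[:-1], Decimal("0.01")
    elif text and text[-1].isalpha():
        suffix = text[-1]
        # q and Q are looked up exactly; other suffixes ignore case.
        exponent = CASE_SENSITIVE.get(suffix, CASE_INSENSITIVE.get(suffix.lower()))
        if exponent is None:
            raise ValueError(f"unknown magnitude suffix: {suffix!r}")
        scale, text = Decimal(10) ** exponent, text[:-1]
    return Quantity(raw_value, Decimal(text) * scale, unit_type)
```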
parse_validated_quantity(raw_value, *, contract)
Parse and validate a quantity string against a strict unit contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_value | str | Raw Battle Report value (e.g. | required |
| contract | UnitContract | UnitContract describing the expected unit type. | required |
Returns:
| Type | Description |
|---|---|
| ValidatedQuantity | ValidatedQuantity with a non-None Decimal value. |
Raises:
| Type | Description |
|---|---|
| UnitValidationError | When the parsed unit type does not match the contract. |
| ValueError | When the value cannot be parsed into a numeric Decimal. |
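The validation step alone might look like the sketch below, which checks an already-parsed value against a contract. The stand-in classes mirror the documented names, but the `validate` helper, the enum members, and the choice to raise UnitValidationError for a disallowed zero are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class UnitType(Enum):  # stand-in for the package's enum
    count = "count"
    multiplier = "multiplier"


@dataclass(frozen=True)
class UnitContract:  # mirrors the documented parameters
    unit_type: UnitType
    allow_zero: bool = True


class UnitValidationError(ValueError):
    """Raised when a quantity does not satisfy a unit contract."""


def validate(value: Decimal, unit_type: UnitType, *, contract: UnitContract) -> Decimal:
    """Hypothetical helper: return the value if it satisfies the contract."""
    if unit_type is not contract.unit_type:
        raise UnitValidationError(
            f"expected {contract.unit_type.name}, got {unit_type.name}"
        )
    if value == 0 and not contract.allow_zero:
        raise UnitValidationError("zero is not allowed by this contract")
    return value
```

Because UnitValidationError derives from ValueError, callers that only catch ValueError still see contract violations.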
record_unrecognized_unit_suffixes(raw_text)
Persist unknown magnitude suffixes found in Battle Report values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_text | str | Raw Battle Report text as pasted by the user. | required |
Returns:
| Type | Description |
|---|---|
| set[str] | A set of unknown magnitude suffixes recorded in the Unit table. |