Open Data Infrastructure Gaps

Open data infrastructure gaps are the specific datasets and publication failures that block accountability analysis even when other data is available. This page catalogs what’s missing across the federal and Alberta records, and what each gap blocks.

What this page shows now
A gap catalog grouped by category (entity registries, procurement, transfers, mandate text, machine-readability), a federal/provincial scorecard on shared evaluation criteria, and a short list of gaps the project has already closed in production.
What is missing to prove more
Bulk open access to the Alberta Corporate Registry, stable cross-jurisdiction vendor and recipient identifiers, and machine-readable amendment and bid history are the load-bearing gaps that show up across the other ten challenges.
Why it matters
Without these datasets you can name accountability questions but not answer them, which is why so many of the Agency 2026 challenges stall before producing a finding.

I

Federal vs Alberta · data availability

55 dataset classes that bear on this chapter, federal vs Alberta side-by-side

Federal

Bulk machine-readable
17
Available · no bulk export
14
Gated · account required
1
Partial coverage
9
Not published
14

Alberta

Bulk machine-readable
11
Available · no bulk export
16
Gated · account required
1
Partial coverage
3
Not published
24
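
The two tallies can be checked and compared with a few lines of Python. This is a minimal sketch: the counts are copied from the scorecard above, and the variable and function names are illustrative only.

```python
# Status counts copied from the scorecard above; names are illustrative.
STATUSES = ["bulk", "no_bulk", "gated", "partial", "not_published"]
federal = dict(zip(STATUSES, [17, 14, 1, 9, 14]))
alberta = dict(zip(STATUSES, [11, 16, 1, 3, 24]))


def share(counts, status):
    """Fraction of dataset classes that fall in the given status."""
    return counts[status] / sum(counts.values())


# Both columns should account for all 55 dataset classes.
assert sum(federal.values()) == 55
assert sum(alberta.values()) == 55

for name, counts in [("Federal", federal), ("Alberta", alberta)]:
    print(
        f"{name}: bulk {share(counts, 'bulk'):.0%}, "
        f"not published {share(counts, 'not_published'):.0%}"
    )
```

On these counts, roughly 31% of federal classes are bulk machine-readable against 20% for Alberta, and about 25% of federal classes are not published against about 44% for Alberta.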

Portal & registry

  • Open-data portal
    Federal · Bulk
    Format
    CKAN catalogue (~40k datasets), per-dataset API
    Lag
    Varies by dataset
    License
    Open Government Licence — Canada
    Linkable ID
    Per-dataset CKAN ID
    Alberta · Bulk
    Format
    CKAN catalogue, per-dataset API
    Lag
    Varies by dataset
    License
    Open Government Licence — Alberta
    Linkable ID
    Per-dataset CKAN ID
  • Corporate registry (status, directors, addresses)
    Federal · Bulk
    Format
    Corporations Canada XML bulk + API (federal CBCA corps only)
    Lag
    Daily
    License
    Open Government Licence — Canada
    Linkable ID
    Federal corporation number
    Alberta · No bulk
    Format
    Per-lookup paid; no open bulk
    Lag
    License
    Linkable ID
    Note
    Single biggest provincial gap — blocks Ghost Capacity, Zombie Recipients, Related Parties
  • Beneficial ownership
    Federal · Partial
    Format
    CBCA ISC (federal corps only, partial fields)
    Lag
    Daily
    License
    Open Government Licence — Canada
    Linkable ID
    Federal corporation number
    Alberta
    Format
    Not published
    Lag
    License
    Linkable ID
  • Charities / non-profit
    Federal · Bulk
    Format
    CRA T3010 multi-table CSV; Charities Listings CSV
    Lag
    12–18 months from FY end
    License
    Open Government Licence — Canada
    Linkable ID
    Business Number (BN)
    Alberta
    Format
    Societies Registry not open bulk
    Lag
    License
    Linkable ID
  • Unified entity registry (recipients/vendors/orgs)
    Federal
    Format
    Does not exist
    Lag
    License
    Linkable ID
    Alberta
    Format
    Does not exist
    Lag
    License
    Linkable ID

Spending disclosure

  • Grants & contributions
    Federal · Bulk
    Format
    Proactive G&C consolidated CSV
    Lag
    Quarterly
    License
    Open Government Licence — Canada
    Linkable ID
    Free-text recipient name (no stable ID)
    Alberta · Bulk
    Format
    Grant Payments CSV (CKAN)
    Lag
    Annual + monthly delta
    License
    Open Government Licence — Alberta
    Linkable ID
    Free-text recipient name; resolved post-ingest (spec 043)
  • Contracts
    Federal · Bulk
    Format
    Proactive Contracts ≥$10k CSV
    Lag
    Quarterly
    License
    Open Government Licence — Canada
    Linkable ID
    Free-text vendor name
    Alberta · Bulk
    Format
    Sole-source service contracts CSV (CKAN)
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Free-text vendor name; resolved post-ingest
  • Procurement (tenders & awards)
    Federal · Bulk
    Format
    CanadaBuys Contract History (2009+) + Tender Notices bulk CSV
    Lag
    Continuous
    License
    Open Government Licence — Canada
    Linkable ID
    Solicitation number
    Alberta · Gated
    Format
    Alberta Purchasing Connection — vendor account required (not open-data; portal access only)
    Lag
    Live (account)
    License
    Terms of use, no open licence
    Linkable ID
    Note
    Web scraping required
  • Selected payments ledger
    Federal
    Format
    No federal equivalent
    Lag
    License
    Linkable ID
    Alberta · Bulk
    Format
    Blue Book (GRF selected payments) XLSX (CKAN)
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Free-text vendor/recipient name
  • Travel & expense
    Federal · Bulk
    Format
    Proactive travel / hospitality CSV
    Lag
    Quarterly
    License
    Open Government Licence — Canada
    Linkable ID
    Per-event row
    Alberta · Bulk
    Format
    Travel & expense disclosure CSV (CKAN)
    Lag
    Quarterly
    License
    Open Government Licence — Alberta
    Linkable ID
    Per-event row
  • Contract amendments linked to parent
    Federal
    Format
    Inconsistent across departments
    Lag
    License
    Linkable ID
    Alberta
    Format
    No parent_contract_id in source
    Lag
    License
    Linkable ID
  • Unit-price / line-item detail
    Federal
    Format
    Not collected
    Lag
    License
    Linkable ID
    Alberta
    Format
    Not collected
    Lag
    License
    Linkable ID
  • Bid histories (losing bidders)
    Federal · Gated
    Format
    ATIP-only — no public dataset
    Lag
    License
    Linkable ID
    Alberta
    Format
    Not published
    Lag
    License
    Linkable ID

Budget & estimates

  • Estimates / budget (current)
    Federal · Partial
    Format
    GC InfoBase DP+DRR CSV (2018+ only)
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Program inventory ID
    Alberta · Bulk
    Format
    Government Estimates XLSX (CKAN)
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Ministry × line item × FY
  • Fiscal plan / budget workbook
    Federal · No bulk
    Format
    Budget docs HTML/PDF mix
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Alberta · Bulk
    Format
    Fiscal Plan XLSX (CKAN, full workbook)
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Plan section × line × FY
  • Historical estimates (pre-digital)
    Federal · No bulk
    Format
    PDF — PDF parsing required
    Lag
    Historical
    License
    Open Government Licence — Canada
    Linkable ID
    Alberta · No bulk
    Format
    PDF (pre-2019) — PDF parsing required (spec 017)
    Lag
    Historical
    License
    Open Government Licence — Alberta
    Linkable ID
  • Public Accounts (transfer payments / detail)
    Federal · No bulk
    Format
    Public Accounts Vol II §9 / Vol III §6 PDF — PDF parsing required
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Recipients ≥$100k only (Vol III §6)
    Alberta · No bulk
    Format
    Public Accounts PDF
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
  • Departmental / annual reports
    Federal · No bulk
    Format
    Departmental Results Reports HTML/PDF — PDF parsing required
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Per-department
    Alberta · No bulk
    Format
    Ministry annual reports PDF — PDF parsing required
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Per-ministry
  • Business plans / mandate text
    Federal · No bulk
    Format
    Mandate letters HTML (structured)
    Lag
    Per-government
    License
    Open Government Licence — Canada
    Linkable ID
    Per-minister
    Alberta · No bulk
    Format
    Ministry business plans PDF — PDF parsing required
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Per-ministry
  • Program evaluations
    Federal · No bulk
    Format
    PDF (5-year cycle) — PDF parsing required
    Lag
    5-year cycle, backlogs common
    License
    Open Government Licence — Canada
    Linkable ID
    Per-department
    Alberta
    Format
    Not systematically published
    Lag
    License
    Linkable ID
  • Pre-2018 program-level harmonized time-series
    Federal
    Format
    Inventory rebuilt; old series don't join
    Lag
    License
    Linkable ID
    Alberta
    Format
    Schemas vary across ministries / FYs
    Lag
    License
    Linkable ID

Intergovernmental transfers

  • Major Federal Transfers (per province)
    Federal · Bulk
    Format
    Finance Canada CSV (1980+, per province)
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Province × program × FY
    Alberta · Bulk
    Format
    Fiscal-plan federal revenue (aggregate only)
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Aggregate revenue category
  • CHT / CST / Equalization / TFF allocation
    Federal · Bulk
    Format
    Annual + monthly CSV
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Program × province × FY
    Alberta
    Format
    Receives only — no separate AB publication
    Lag
    License
    Linkable ID
  • Bilateral agreement texts
    Federal · No bulk
    Format
    PDF, variable granularity by province — PDF parsing required
    Lag
    Per-agreement
    License
    Open Government Licence — Canada
    Linkable ID
    Per-agreement
    Alberta
    Format
    Party to agreements; not republished
    Lag
    License
    Linkable ID
  • ICIP project-level
    Federal · Bulk
    Format
    CSV
    Lag
    Periodic
    License
    Open Government Licence — Canada
    Linkable ID
    Project ID
    Alberta
    Format
    Party to programs; not republished
    Lag
    License
    Linkable ID
  • Canada Community-Building Fund (Gas Tax)
    Federal · Bulk
    Format
    CSV per-project
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    Project ID
    Alberta
    Format
    Party; not republished
    Lag
    License
    Linkable ID
  • CMHC housing-bilateral progress
    Federal · Partial
    Format
    Summaries only; per-project details not public
    Lag
    6-monthly
    License
    Open Government Licence — Canada
    Linkable ID
    Alberta
    Format
    Not published
    Lag
    License
    Linkable ID
  • Province-level use-of-funds reporting
    Federal
    Format
    Not required, not collected, not published
    Lag
    License
    Linkable ID
    Alberta
    Format
    Not required by federal agreements
    Lag
    License
    Linkable ID
  • Federal-transfer ↔ provincial public-accounts reconciliation
    Federal
    Format
    No dataset exists
    Lag
    License
    Linkable ID
    Alberta
    Format
    No dataset exists
    Lag
    License
    Linkable ID
  • Equalization formula modelling + inputs
    Federal
    Format
    Not published; 2024 extension bypassed provincial consultation
    Lag
    License
    Linkable ID
    Alberta
    Format
    n/a (recipient province)
    Lag
    License
    Linkable ID

Lobbying & ethics

  • Lobbyist registry (registrations)
    Federal · Bulk
    Format
    Commissioner of Lobbying bulk CSV (1996+)
    Lag
    Daily
    License
    Open Government Licence — Canada
    Linkable ID
    Registration number
    Alberta · No bulk
    Format
    Oracle APEX, no bulk export — web scraping required (spec 008)
    Lag
    Continuous
    License
    Site terms
    Linkable ID
    Registration number
  • Former public office holder disclosures (name, role, dates)
    Federal · Bulk
    Format
    Designated Public Office Holder (DPOH) fields on consultant/in-house registrations — bulk CSV; structured Name, Position Held, Employer, Start, End
    Lag
    Daily
    License
    Open Government Licence — Canada
    Linkable ID
    Registration number → DPOH-history rows
    Alberta · Partial
    Format
    Structured Name/Position/Start/End table embedded in registration PDFs — present in source but not extracted in v1; ~9.5% of filings populate it (spec 062 extracts)
    Lag
    Continuous
    License
    Site terms
    Linkable ID
    Registration number → FPOH rows
  • Monthly communication reports
    Federal · Bulk
    Format
    Bulk CSV monthly
    Lag
    Monthly
    License
    Open Government Licence — Canada
    Linkable ID
    Communication ID
    Alberta · Partial
    Format
    Rolled into AB lobbyist scrape
    Lag
    Continuous
    License
    Site terms
    Linkable ID
  • Lobbying compliance decisions
    Federal · No bulk
    Format
    HTML — web scraping required
    Lag
    Per-decision
    License
    Open Government Licence — Canada
    Linkable ID
    Decision ID
    Alberta
    Format
    Limited public surface
    Lag
    License
    Linkable ID
  • Conflict-of-interest / ethics filings
    Federal · No bulk
    Format
    Ethics Commissioner HTML/PDF — scraping + PDF parsing
    Lag
    Per-filing
    License
    Open Government Licence — Canada
    Linkable ID
    Per-individual
    Alberta · No bulk
    Format
    MLA declared interests HTML/PDF — scraping + PDF parsing
    Lag
    Annual
    License
    Site terms
    Linkable ID
    Per-MLA
  • Orders in Council / appointees
    Federal · No bulk
    Format
    OIC Canada HTML search, no bulk API — web scraping required
    Lag
    Continuous
    License
    Open Government Licence — Canada
    Linkable ID
    OIC number
    Note
    OIC Canada exposes a current HTML search; long historical tail requires Library and Archives Canada / Wayback for pre-2010 lookups.
    Alberta · No bulk
    Format
    PDF — PDF parsing required (via takealbertaback)
    Lag
    Continuous
    License
    Open Government Licence — Alberta
    Linkable ID
    OIC number
    Note
    AB OIC PDFs published continuously, but the upstream classified/extracted dataset (takealbertaback) currently covers 2023-01-10 onward only — pre-2023 OICs exist in King's Printer archives but are not yet machine-extracted.
  • Employee / public-service directory
    Federal · No bulk
    Format
    GEDS HTML (current snapshot only) — scraping; Wayback archived continuously through 2025
    Lag
    Continuous (no historicals)
    License
    Open Government Licence — Canada
    Linkable ID
    Per-employee email
    Alberta · No bulk
    Format
    Searchable HTML directory (current snapshot only) — scraping required; Wayback archived 2016–2020 then ceased; Premier's Office political-staff titles (Principal Secretary, Press Secretary, Issues Manager) excluded
    Lag
    Continuous (no historicals; 2020-04→present Wayback blackout)
    License
    Site terms
    Linkable ID
    Per-employee phone / org-unit
  • Post-employment cooling-off waivers
    Federal
    Format
    Not published
    Lag
    License
    Linkable ID
    Alberta
    Format
    Not published
    Lag
    License
    Linkable ID

Education & sub-recipients

  • School authority registry + lineage
    Federal
    Format
    n/a (provincial jurisdiction)
    Lag
    License
    Linkable ID
    Alberta · Bulk
    Format
    Alberta Education CKAN
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Authority code
  • School board audited financials
    Federal
    Format
    n/a
    Lag
    License
    Linkable ID
    Alberta · No bulk
    Format
    PDF per board — PDF parsing required
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
  • Health authority vendor disclosure
    Federal · Partial
    Format
    CIHI aggregate indicators only
    Lag
    Annual
    License
    CIHI terms
    Linkable ID
    Alberta · No bulk
    Format
    AHS vendor disclosure PDF, no bulk — PDF parsing required
    Lag
    Annual
    License
    Site terms
    Linkable ID
  • Municipal SOFI / financial info returns
    Federal · Partial
    Format
    FCM aggregates only
    Lag
    Annual
    License
    FCM terms
    Linkable ID
    Alberta · No bulk
    Format
    Per-municipality PDF/XLSX, not aggregated — PDF parsing required
    Lag
    Annual
    License
    Open Government Licence — Alberta
    Linkable ID
    Municipality code

Adverse signals

  • Court records
    Federal · Bulk
    Format
    CanLII API (federal + provincial coverage)
    Lag
    Continuous
    License
    CanLII free licence
    Linkable ID
    Citation
    Alberta · No bulk
    Format
    CanLII covers AB; Alberta Courts daily lists HTML — web scraping required
    Lag
    Continuous
    License
    Site terms
    Linkable ID
    Citation
  • Securities regulator decisions
    Federal · Partial
    Format
    CSA / IIROC mixed
    Lag
    Per-decision
    License
    Site terms
    Linkable ID
    Decision ID
    Alberta · No bulk
    Format
    Alberta Securities Commission HTML + PDF — scraping + PDF parsing
    Lag
    Per-decision
    License
    Site terms
    Linkable ID
    Decision ID
  • Energy / utilities regulator decisions
    Federal · Partial
    Format
    NEB / CER mixed
    Lag
    Per-decision
    License
    Open Government Licence — Canada
    Linkable ID
    Decision ID
    Alberta · No bulk
    Format
    AER PDF; AUC PDF — PDF parsing required
    Lag
    Per-decision
    License
    Site terms
    Linkable ID
    Decision ID
  • Competition / consumer-protection decisions
    Federal · No bulk
    Format
    Competition Bureau consent agreements HTML — web scraping required
    Lag
    Per-decision
    License
    Open Government Licence — Canada
    Linkable ID
    Decision ID
    Alberta
    Format
    n/a
    Lag
    License
    Linkable ID
  • Financial-sector enforcement
    Federal · No bulk
    Format
    OSFI / CRTC enforcement bulletins HTML — web scraping required
    Lag
    Per-bulletin
    License
    Open Government Licence — Canada
    Linkable ID
    Bulletin ID
    Alberta
    Format
    n/a (federal jurisdiction)
    Lag
    License
    Linkable ID
  • Ineligible / debarred suppliers
    Federal · No bulk
    Format
    PSPC Ineligible Suppliers HTML, no history — web scraping required
    Lag
    Continuous (no historicals)
    License
    Open Government Licence — Canada
    Linkable ID
    Vendor name
    Alberta
    Format
    Not published
    Lag
    License
    Linkable ID
  • Professional discipline rolls
    Federal · Partial
    Format
    Per-regulator, mixed
    Lag
    Per-decision
    License
    Site terms
    Linkable ID
    Member ID
    Alberta · No bulk
    Format
    Law Society of AB HTML — web scraping required
    Lag
    Per-decision
    License
    Site terms
    Linkable ID
    Member ID
  • Revoked charities
    Federal · Bulk
    Format
    CSV (in T3010 List of Charities)
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    BN
    Alberta · Bulk
    Format
    Uses federal CRA
    Lag
    Annual
    License
    Open Government Licence — Canada
    Linkable ID
    BN
  • News event streams
    Federal · Bulk
    Format
    GDELT API; Google News RSS
    Lag
    Real-time
    License
    GDELT / Google ToS
    Linkable ID
    Article URL
    Alberta · Bulk
    Format
    Uses federal/global feeds
    Lag
    Real-time
    License
    GDELT / Google ToS
    Linkable ID
    Article URL
  • Canadian news archives
    Federal · Partial
    Format
    Licensed / paywalled (CP, Postmedia, CBC)
    Lag
    Continuous
    License
    Commercial
    Linkable ID
    Article URL
    Alberta · Partial
    Format
    Licensed / paywalled
    Lag
    Continuous
    License
    Commercial
    Linkable ID
    Article URL

Cross-cutting

  • Vendor stable ID across disclosures
    Federal
    Format
    Free-text vendor name only
    Lag
    License
    Linkable ID
    Alberta
    Format
    Free-text upstream; resolved post-ingest only inside ab_spending
    Lag
    License
    Linkable ID
  • Integrated risk register / watchlist
    Federal
    Format
    Not published
    Lag
    License
    Linkable ID
    Alberta
    Format
    Not published
    Lag
    License
    Linkable ID
  • 'Still active' attestation at payment time
    Federal
    Format
    Not collected anywhere
    Lag
    License
    Linkable ID
    Alberta
    Format
    Not collected anywhere
    Lag
    License
    Linkable ID

Legend: Bulk = bulk machine-readable; No bulk = available, no bulk export; Gated = account required; Partial = partial coverage; no badge = not published.

III

Full gap catalog · what should be public but isn’t

Drawn from the live ab_spending data-quality log and the active spec roadmap, grouped by category. Each entry is a real gap, not an aspiration.

Spending

Historical budget estimates (2000–2018)

Should be
Continuous fiscal estimates time series across 25+ years.
Actually published
PDFs only across three CKAN packages with three distinct layout eras. No bulk extract, no machine-readable summary table.
Why it matters
Cost-of-government trend coverage collapsed from 25 years to 6 until spec 017 added a layout-era-aware parser. Without it, no one can plot health spending as a share of budget over a generation.

Ministers' office monthly expenses

Should be
Single CSV with date, ministry, person, category, amount.
Actually published
~125 separate CKAN packages, ~15 monthly PDFs each (per-ministry); attribution metadata lives only in filenames; rows collapse to 'Unknown Ministry' without a custom resolver.
Why it matters
Ministers' office spending is the most politically salient discretionary line. Specs 005 + 016 closed this with a filename-pattern resolver feeding 5,254 attributed rows.
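
A filename-pattern resolver of the kind described can be sketched as follows. This is not the spec 005 / 016 implementation: the filename convention, the regular expression, and the function name are all invented for illustration, since the real patterns are not documented on this page.

```python
import re
from typing import Optional

# Hypothetical filename convention, invented for illustration only:
#   <ministry-slug>-expenses-<yyyy>-<mm>.pdf
FILENAME_RE = re.compile(
    r"^(?P<ministry>[a-z][a-z-]*?)-expenses-(?P<year>\d{4})-(?P<month>\d{2})\.pdf$"
)


def resolve_attribution(filename: str) -> Optional[dict]:
    """Recover ministry and period from a filename, or None if it does not match."""
    match = FILENAME_RE.match(filename.lower())
    if match is None:
        # The caller decides whether to log the miss or fall back to 'Unknown Ministry'.
        return None
    return {
        "ministry": match["ministry"].replace("-", " ").title(),
        "year": int(match["year"]),
        "month": int(match["month"]),
    }


print(resolve_attribution("advanced-education-expenses-2021-03.pdf"))
print(resolve_attribution("scan0001.pdf"))
```

Returning None for unmatched names, rather than a default ministry, keeps unattributed rows visible so they can be counted and fixed instead of silently pooled.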

Wire transfer payees (2019–2022)

Should be
Structured wire-transfer recipient table aligned with disclosure schema.
Actually published
Three legacy XLSX files with non-standard sheet structure; parser fails on EmptyCell tokens (data-quality issue DQ-002).
Why it matters
Vendor analysis blocked for the period covered by these files; the data exists but is shaped wrong for any pipeline that wasn't hand-coded for it.

People

Lobbyist registry — bulk export

Should be
Bulk download (CSV or API) of registrations across active, archived, and terminated statuses.
Actually published
Live Oracle APEX portal at albertalobbyistregistry.ca; ~19,453 registrations across 1,297 paginated listing pages; session-bound URLs; no robots.txt; no rate-limit guidance.
Why it matters
Influence-and-contract analysis (does the firm lobbying for X also receive Y in grants?) is impossible without a flat, bulk copy of the data. Spec 008 closed this with a politeness-bounded crawler.
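
The idea of a politeness-bounded crawl can be sketched as a loop that fetches one listing page at a time and pauses between requests. This is an assumption-laden sketch, not the spec 008 crawler: `fetch` is a caller-supplied, hypothetical function, and the two-second delay and retry count are placeholder values.

```python
import time
from typing import Callable, Iterator, Tuple


def crawl_pages(
    fetch: Callable[[int], str],
    total_pages: int,
    delay_seconds: float = 2.0,   # placeholder value, not a documented limit
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[int, str]]:
    """Fetch listing pages one at a time, pausing between requests.

    `fetch` returns the HTML of one listing page and raises on failure.
    """
    for page in range(1, total_pages + 1):
        for attempt in range(1, max_retries + 1):
            try:
                html = fetch(page)
                break
            except Exception:
                if attempt == max_retries:
                    raise
                # Wait a little longer after each failed attempt.
                sleep(delay_seconds * attempt)
        yield page, html
        # Fixed pause so the portal is never hit in bursts.
        sleep(delay_seconds)


# Usage with a stand-in fetcher; a real one would hold the portal session.
for page, html in crawl_pages(lambda p: f"<html>page {p}</html>", 3, sleep=lambda s: None):
    print(page, len(html))
```

At an assumed two seconds per page, the pauses alone for 1,297 listing pages add up to about 43 minutes, which is the cost of crawling a portal that publishes no rate-limit guidance.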

Political contributions

Should be
Bulk CSV of donors and amounts by election period and recipient.
Actually published
Elections Alberta annual returns as PDFs; downloadable XLSX in some years; threshold-filtered (≥$250 post-2013, ≥$375 pre-2013); amendment history requires dedup logic.
Why it matters
The donor↔vendor crosswalk is the canonical favouritism question. Without bulk donations, you cannot test it.
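
The amendment dedup step can be sketched as "keep the latest amendment per contribution". The field names (`return_id`, `donor`, `recipient`, `amendment_no`) are hypothetical; the real Elections Alberta returns may identify amendments differently.

```python
from typing import Dict, Iterable, List, Tuple


def latest_amendments(rows: Iterable[dict]) -> List[dict]:
    """Keep only the highest-numbered amendment of each contribution record."""
    latest: Dict[Tuple[str, str, str], dict] = {}
    for row in rows:
        # Hypothetical identity of a contribution across amended returns.
        key = (row["return_id"], row["donor"], row["recipient"])
        current = latest.get(key)
        if current is None or row["amendment_no"] > current["amendment_no"]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"return_id": "R1", "donor": "J. Doe", "recipient": "Party A", "amendment_no": 0, "amount": 500},
    {"return_id": "R1", "donor": "J. Doe", "recipient": "Party A", "amendment_no": 1, "amount": 450},
    {"return_id": "R2", "donor": "K. Roe", "recipient": "Party B", "amendment_no": 0, "amount": 300},
]
for row in latest_amendments(rows):
    print(row["donor"], row["amount"])
```

Because the source is threshold-filtered, totals computed from the deduplicated rows are best read as lower bounds: contributions under the disclosure threshold never appear.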

PSBCTA executive compensation

Should be
Unified compensation file with stable organization keys across ~400 reporting bodies.
Actually published
Single consolidated CSV at psctanew.alberta.ca, but body names require manual aliasing to the organization graph (row-weighted alias hit rate 92.9%).
Why it matters
Crown-corp and agency executive pay is largely invisible without PSBCTA. Core-GoA sunshine list alone covers a small slice of the public sector.
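
A row-weighted hit rate counts matched rows rather than matched names, so large reporting bodies dominate the figure. The sketch below shows the difference on made-up data; the function name and the sample names are illustrative and do not reproduce the project's 92.9% calculation.

```python
from collections import Counter
from typing import Dict, List, Tuple


def alias_hit_rates(body_names: List[str], aliases: Dict[str, str]) -> Tuple[float, float]:
    """Return (row-weighted, name-weighted) alias hit rates.

    body_names holds one entry per compensation row; aliases maps a raw
    body name to an organization key.
    """
    rows_per_name = Counter(body_names)
    matched_rows = sum(n for name, n in rows_per_name.items() if name in aliases)
    matched_names = sum(1 for name in rows_per_name if name in aliases)
    return matched_rows / len(body_names), matched_names / len(rows_per_name)


# Made-up data: one large body is aliased, two small ones are not.
names = ["Big Agency"] * 8 + ["Small Board A", "Small Board B"]
row_rate, name_rate = alias_hit_rates(names, {"Big Agency": "org-001"})
print(f"row-weighted {row_rate:.1%}, name-weighted {name_rate:.1%}")
```

A high row-weighted rate can therefore coexist with many small bodies still unaliased, which is worth checking before treating the organization graph as complete.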

Procurement & Contracts

Sole-source justifications

Should be
Structured field on every contract recording the procurement method and rationale.
Actually published
Embedded in PDF disclosures or unstructured text; no canonical column.
Why it matters
Sole-source analysis (a leading corruption indicator) requires extraction by hand, which means it doesn't get done at scale.

Cross-dataset stable vendor ID

Should be
Canonical vendor key shared across grants, contracts, transfer payments, and charities.
Actually published
Each dataset uses its own naming. 'Acme Corp.', 'Acme Corporation', and 'ACME CORP' are three vendors until reconciled.
Why it matters
This is the load-bearing gap behind half the challenges in this objective. Spec 043 introduced a registry-first entity resolver as the AB-side fix.
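
The three-spellings problem can be illustrated with a simple name-normalization key. This is only a sketch of the name-matching fallback, not the registry-first resolver from spec 043; the suffix list and function name are assumptions.

```python
import re

# Illustrative suffix list, not the project's actual rules.
LEGAL_SUFFIXES = {
    "corp", "corporation", "inc", "incorporated",
    "ltd", "limited", "co", "company",
}


def vendor_key(name: str) -> str:
    """Collapse case, punctuation, and trailing legal suffixes into a comparison key."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


variants = ["Acme Corp.", "Acme Corporation", "ACME CORP"]
print({vendor_key(v) for v in variants})
```

Name keys like this merge obvious variants but can also merge unrelated firms that share a name, which is one reason to prefer a registry identifier wherever one exists.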

Outcomes & Reconciliation

Budget-vs-disclosure reconciliation

Should be
Single ministry taxonomy linking budgeted dollars to disclosed payments.
Actually published
Two different nomenclatures (budget side: 'Health'; disclosure side: 'Primary & Preventative Health'), creating a 100% reconciliation gap until aliased (data-quality issue DQ-001).
Why it matters
Citizens cannot answer 'where did the budgeted money actually go?' without reconciliation. A 100% gap is the difference between accountability theatre and real audit.
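
The effect of aliasing on the reconciliation gap can be shown with a small calculation. The dollar amounts are made up, the function is hypothetical, and the alias mapping is assumed for illustration from the two labels quoted above.

```python
from typing import Dict, Optional


def reconciliation_gap(
    budget: Dict[str, float],
    disclosed: Dict[str, float],
    aliases: Optional[Dict[str, str]] = None,
) -> float:
    """Share of disclosed dollars whose ministry label has no budget-side match."""
    aliases = aliases or {}
    total = sum(disclosed.values())
    unmatched = sum(
        amount
        for ministry, amount in disclosed.items()
        if aliases.get(ministry, ministry) not in budget
    )
    return unmatched / total if total else 0.0


budget = {"Health": 100.0}                            # made-up amount
disclosed = {"Primary & Preventative Health": 40.0}   # made-up amount

print(reconciliation_gap(budget, disclosed))
print(reconciliation_gap(budget, disclosed, {"Primary & Preventative Health": "Health"}))
```

With no alias table every disclosed dollar is unmatched; a single alias entry is enough to close the gap in this toy case.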

Education property tax requisition history

Should be
Annual per-municipality breakdown back to 2000.
Actually published
PDF only, starting 2019 (data-quality issue DQ-022).
Why it matters
Property owners cannot see their share of education finance; school funding is half-transparent without it.

Infrastructure

Data-freshness SLO

Should be
Per-dataset published refresh cadence and last-refreshed timestamp surfaced to citizens.
Actually published
Implicit in CKAN metadata at best; no SLA, no dashboard. Many 'annual' datasets quietly lapse.
Why it matters
Stale data passes for current data, which means analysis silently drifts. ab_spending derives this from source_datasets but it isn't exposed publicly.
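
A freshness check of this kind reduces to comparing a last-refreshed date against the stated cadence plus a grace period. The sketch below is not how ab_spending computes it; the cadence labels and grace windows are assumptions, not published SLOs.

```python
from datetime import date, timedelta

# Illustrative maximum ages (cadence plus grace); assumptions, not published SLOs.
MAX_AGE = {
    "monthly": timedelta(days=45),
    "quarterly": timedelta(days=120),
    "annual": timedelta(days=455),
}


def is_stale(cadence: str, last_refreshed: date, today: date) -> bool:
    """True when a dataset has gone longer than its cadence allows without a refresh."""
    return today - last_refreshed > MAX_AGE[cadence]


print(is_stale("annual", date(2022, 3, 31), date(2024, 1, 1)))
print(is_stale("quarterly", date(2024, 1, 1), date(2024, 3, 1)))
```

Passing `today` explicitly keeps the check reproducible, so a dashboard and a test suite can agree on what counted as stale on a given date.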

Machine-readability standard

Should be
Each dataset graded against a public scorecard (CSV/API/PDF/scrape/none).
Actually published
No grading exists. The publisher decides the format and citizens take what's given.
Why it matters
Without a scorecard, there's no accountability for the choice between PDF and CSV. A short class-by-class grading appears below — the work isn't hard, it's just not done.

PDF rendering failures (ad-hoc)

Should be
Every published document machine-readable on first request.
Actually published
1 of 19,453 lobbyist registry PDFs (CL-13927-01) fails APEX BLOB-to-PDF render server-side (data-quality issue DQ-017).
Why it matters
Not catastrophic on its own, but signals a dead-letter pattern that politeness-bounded crawlers can't retry. Provenance ends up incomplete.

IV

Machine-readability scorecard

Ten representative AB dataset classes graded by what's actually published, not what should be. The grade reflects ingestion friction, not data quality.

A = stable open API with versioning. B = CSV / structured download with content-hash tracking. C = HTML table scrape required. D = PDF-only, parser-dependent. F = no published form (crawler required).

Machine-readability scorecard: representative Alberta dataset classes graded A–F by ingestion friction.
Dataset class | Current format | Grade | Note
Blue Book — supplies & procurement | CKAN XLSX (annual) | B | Stable schema, indexed on CKAN. No API yet.
Historical government estimates | CKAN PDF (2000–2018) | D | Three layout eras across 20 years. Spec 017 parser closes the gap; the upstream is unchanged.
Ministers' office expenses | CKAN PDF (~125 packages) | D | Per-ministry monthly; attribution lives only in filenames.
Lobbyist registry | Live Oracle APEX portal | F | No bulk export. 19,453 records behind paginated search.
Political contributions (Elections AB) | Annual XLSX/PDF | B | Threshold-filtered. Amendment history needs dedup.
PSBCTA executive compensation | Consolidated CSV | B | Single endpoint. ~400 body names need aliasing.
Provincial royalty revenue | CKAN XLSX (1970+) | B | Long history. 8 unmapped category aggregates.
Education property tax requisition | CKAN PDF (annual, 2019+) | D | Per-municipality page layout. Limited history.
Provincial debt (TRM quarterly) | CKAN PDF (quarterly) | D | Layout drift across fiscal years. Point-in-time, not FY-aligned.
School authority funding | CKAN PDF per authority (1,296 files) | D | 1 PDF per authority per year. pdfplumber extracts cleanly; grain is the problem.
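
The scorecard can be tallied with a few lines of Python. The grades are copied from the ten rows above; the shortened class labels and the helper name are illustrative.

```python
from collections import Counter

# Grades copied from the ten scorecard rows above; labels are shortened.
SCORECARD = {
    "Blue Book": "B",
    "Historical government estimates": "D",
    "Ministers' office expenses": "D",
    "Lobbyist registry": "F",
    "Political contributions": "B",
    "PSBCTA executive compensation": "B",
    "Provincial royalty revenue": "B",
    "Education property tax requisition": "D",
    "Provincial debt": "D",
    "School authority funding": "D",
}


def share_at_or_below(scorecard, grades=("D", "F")):
    """Fraction of dataset classes whose grade is in the given set."""
    return sum(1 for g in scorecard.values() if g in grades) / len(scorecard)


tally = Counter(SCORECARD.values())
print(dict(tally))
print(f"D or F: {share_at_or_below(SCORECARD):.0%}")
```

On this sample, four classes grade B, five grade D, and one grades F, so six of the ten need a PDF parser or a crawler before any analysis can start.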