Data model

When Pulsy answers a performance question, it isn’t querying raw tables. It goes through a semantic layer: one Snowflake Cortex Agent (AMAZON_ADS_AGENT, running claude-sonnet-4-6) that routes the question to one of six domain analysts, each backed by a curated semantic view. Understanding those six is how you predict what Pulsy can and can’t answer.

The schemas behind it

Everything lives in the Snowflake database AMAZON_MARKETING:

Schema	What it holds
`PULSEAD`	The main source layer — 218 tables of ingested Amazon Ads data.
`PULSY`	Gold/reference layer — 65 tables (TaCoS, hourly, Brand Analytics). Reference only; a schema, not the product.
`SEMANTIC_LAYER`	The agent-facing layer: the six views + the Cortex Agent + helper views.
`VALIDATION`	QA systems (VADER, AQUA) that gate semantic-view quality.

The six analysts

One agent fronts six Cortex Analyst tools (plus a data_to_chart visualizer) and routes by intent. Each analyst maps to one semantic view:

Analyst	View	Covers	Key gotcha
SponsoredAds	`SPONSORED_ADS_PERFORMANCE`	SP/SB/SD performance in one place — ROAS, ACOS, CPC, CTR, sales, CVR.	Requires a `PROFILE_ID` filter; SP/SB/SD are combined by `UNION ALL`, never joined.
SearchTerm	`SEARCH_TERM_V4`	Customer search terms vs. your keywords, plus marketplace Brand Analytics search frequency.	Brand Analytics has no `PROFILE_ID` and a weekly/monthly grain — must not be joined to daily SP/SB data.
Product	`PRODUCT_INSIGHT`	ASIN-level performance, cross-sell (advertised vs. purchased ASIN), BSR.	Product metadata is deduped to one row/ASIN to prevent fan-out.
Budget	`BUDGET_MANAGEMENT`	Burn rate, missed sales/clicks, Amazon’s recommended budgets.	Filter `BUDGET_SCOPE_TYPE` to a single value (campaign or portfolio) or you double-count.
DSP	`DSP_PERFORMANCE`	Programmatic DSP creative/line-item performance.	Keyed on `ADVERTISER_ID`, not `PROFILE_ID`; cost is in millicents (÷100,000 → USD); KPIs differ (DPVR, ATC).
NTB	`BRAND_NTB`	New-to-brand acquisition across SB/SD/DSP.	DSP NTB ≠ SB/SD NTB (Amazon-wide vs. brand-specific) — don’t compare directly. SP has no NTB data.

The pattern to carry: Sponsored Ads and DSP are different worlds. Sponsored Ads is profile-scoped, dollar-denominated, and combined across SP/SB/SD; DSP is advertiser-scoped, millicent-denominated, and has its own KPIs. A question that crosses them needs two analysts, not one.

How current the data is

The semantic layer is refreshed nightly at 08:00 KST (an AWS EventBridge schedule, cron(0 8 * * ? *), Asia/Seoul). That governs the semantic assets — the views and their metadata. The freshness of the underlying PULSEAD source tables is governed separately by Amazon Ads ingestion, not by this refresh. So “the data is a day old” is the safe default assumption for semantic-layer answers.

Where the definitions live (SSOT)

Two repos own this layer, and they’re two generations of the same capability:

data_semanticview is the source of truth for what exists — the deployed view DDL (02_develop/semantic_views/deploy/*.sql) is “the only version executed in Snowflake,” plus the analyst routing and the VADER/AQUA validation.
snowflake-semantic-agent is the newer automated refresh runtime that regenerates and publishes the semantic assets (the control / knowledge / execution planes). It shares the SEMANTIC_LAYER schema.

Don’t describe it as a clean “runtime consumes definitions” split — both can produce SEMANTIC_LAYER assets; the manual layer is being superseded by the automated one.