# Configuration

mcp-data-platform uses YAML configuration with environment variable expansion. Variables in the format `${VAR_NAME}` are replaced with their environment values at load time.
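For illustration, the substitution step behaves like Python's `string.Template`, which uses the same `${NAME}` syntax. This is a sketch of the mechanism, not the platform's actual loader; in particular, its handling of unset variables may differ:

```python
import os
from string import Template

# Hypothetical stand-in for the platform's ${VAR_NAME} expansion step.
os.environ["TRINO_USER"] = "analyst"
raw = "user: ${TRINO_USER}"
expanded = Template(raw).safe_substitute(os.environ)
print(expanded)  # user: analyst
```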
## How Configuration Works

The platform has two configuration modes that control how settings are stored and whether they can be changed at runtime:

- **File mode (default):** Configuration is loaded from a YAML file at startup and is read-only. This is the simplest deployment — no database required.
- **Database mode:** Adding `database.dsn` unlocks persistent platform features (audit logging, knowledge capture, session externalization). Setting `config_store.mode: database` additionally enables runtime configuration mutations through the admin API.
| What you configure | What it unlocks |
|---|---|
| YAML file only | Read-only config, in-memory sessions, no audit |
| `database.dsn` | Audit logging, knowledge capture, OAuth persistence, database-backed sessions |
| `database.dsn` + `config_store.mode: database` | All of the above, plus runtime config mutations via admin API |
| `database.dsn` + `admin.enabled: true` | REST endpoints for system health, config, personas, auth keys, audit |
See Operating Modes for the full comparison and Admin API for the REST endpoints.
## Configuration File

Create a `platform.yaml` file:
```yaml
apiVersion: v1

server:
  name: mcp-data-platform
  transport: stdio

toolkits:
  trino:
    primary:
      host: trino.example.com
      port: 443
      user: ${TRINO_USER}
      password: ${TRINO_PASSWORD}
      ssl: true
      catalog: hive
      schema: default
  datahub:
    primary:
      url: https://datahub.example.com
      token: ${DATAHUB_TOKEN}
  s3:
    primary:
      region: us-east-1
      access_key_id: ${AWS_ACCESS_KEY_ID}
      secret_access_key: ${AWS_SECRET_ACCESS_KEY}

injection:
  trino_semantic_enrichment: true
  datahub_query_enrichment: true
  s3_semantic_enrichment: true
  column_context_filtering: true  # Only include SQL-referenced columns (default: true)
```
## Config Versioning

Every configuration file should include an `apiVersion` field as the first key. This enables safe schema evolution with deprecation warnings and migration tooling.
| Field | Type | Default | Description |
|---|---|---|---|
| `apiVersion` | string | `v1` | Config schema version. Omitting it defaults to `v1` for backward compatibility. |

Supported versions: `v1` (current)
### Version Lifecycle

- **current**: Actively supported, no warnings
- **deprecated**: Still works, emits a warning at startup with migration guidance
- **removed**: Rejected at startup with an error pointing to the migration tool
### Migration Tool

Migrate config files to the latest version:

```bash
# From file to stdout
mcp-data-platform migrate-config --config platform.yaml

# From stdin to file
cat platform.yaml | mcp-data-platform migrate-config --output migrated.yaml

# Specify target version
mcp-data-platform migrate-config --config platform.yaml --target-version v1
```

The migration tool preserves `${VAR}` environment variable references.
## Server Configuration

```yaml
server:
  name: mcp-data-platform  # Server name reported to clients
  transport: stdio         # stdio or http
  address: ":8080"         # Listen address for HTTP transports
  tls:
    enabled: false
    cert_file: ""
    key_file: ""
```
| Field | Type | Default | Description |
|---|---|---|---|
| `name` | string | `mcp-data-platform` | Server name in MCP handshake |
| `transport` | string | `stdio` | Transport protocol: `stdio` or `http` (`sse` accepted for backward compatibility) |
| `address` | string | `:8080` | Listen address for HTTP transports |
| `tls.enabled` | bool | `false` | Enable TLS for HTTP transport |
| `tls.cert_file` | string | - | Path to TLS certificate |
| `tls.key_file` | string | - | Path to TLS private key |
**HTTP Transport Security**

When using HTTP transport without TLS, a warning is logged. For production deployments, always enable TLS to encrypt credentials in transit.
## Prompts
The platform registers MCP prompts at three levels:
1. **Auto-registered** — `platform-overview` is built dynamically from `server.description` and enabled toolkits. It lists what the platform can do based on which toolkits (DataHub, Trino, S3, Portal, Knowledge) are configured.
2. **Operator-configured prompts** — Defined in `server.prompts`. They support typed arguments with `{placeholder}` substitution in content.
3. **Workflow prompts** — Registered automatically when the required toolkits are present. They provide guided multi-step workflows (e.g., `explore-available-data`, `create-interactive-dashboard`, `create-a-report`, `trace-data-lineage`).
Operator-configured prompts override any auto-registered prompt with the same name. Toolkits (Portal, Knowledge) may also register their own prompts via the `PromptDescriber` interface.

```yaml
server:
  description: "ACME Corp analytics platform"
  prompts:
    - name: routing_rules
      description: "How to route queries between systems"
      content: |
        Before querying, determine if you need ENTITY STATE or ANALYTICS...
    - name: explore-topic
      description: "Explore data about a specific topic"
      content: "Find all datasets related to {topic} and summarize key metrics."
      arguments:
        - name: topic
          description: "The topic to explore"
          required: true
```
| Field | Type | Default | Description |
|---|---|---|---|
| `server.prompts[].name` | string | required | Prompt name |
| `server.prompts[].description` | string | - | Prompt description |
| `server.prompts[].content` | string | required | Prompt content (supports `{arg_name}` placeholders) |
| `server.prompts[].arguments` | array | `[]` | Typed arguments for the prompt |
| `server.prompts[].arguments[].name` | string | required | Argument name (maps to `{name}` in content) |
| `server.prompts[].arguments[].description` | string | - | Argument description shown to clients |
| `server.prompts[].arguments[].required` | bool | `false` | Whether the argument is required |
Built-in workflow prompts:
| Prompt | Required Toolkits | Description |
|---|---|---|
| `explore-available-data` | DataHub | Discover datasets about a topic |
| `create-interactive-dashboard` | DataHub, Trino, Portal | Full workflow: discover, query, visualize, save |
| `create-a-report` | DataHub, Trino | Discover data, query it, produce a Markdown report |
| `trace-data-lineage` | DataHub | Trace upstream/downstream lineage for a dataset |
All registered prompts (platform + toolkit) are included in the `platform_info` tool response and visible in the platform-info app's Prompts tab.
## Streamable HTTP Configuration

The HTTP transport serves both legacy SSE (`/sse`, `/message`) and Streamable HTTP (`/`) endpoints. Streamable HTTP session behavior is configured under `server.streamable`:
| Field | Type | Default | Description |
|---|---|---|---|
| `session_timeout` | duration | `30m` | How long an idle session persists before cleanup |
| `stateless` | bool | `false` | Disable session tracking (no `Mcp-Session-Id` validation) |
## Authentication Configuration

```yaml
auth:
  allow_anonymous: false  # Require authentication (default)
  oidc:
    enabled: true
    issuer: "https://auth.example.com/realms/platform"
    client_id: "mcp-data-platform"
    audience: "mcp-data-platform"
    role_claim_path: "realm_access.roles"
    role_prefix: "dp_"
    clock_skew_seconds: 30  # Allowed clock drift
    max_token_age: 24h      # Reject tokens older than this
  api_keys:
    enabled: true
    keys:
      - key: ${API_KEY_ADMIN}
        name: "admin"
        roles: ["admin"]
```
| Field | Type | Default | Description |
|---|---|---|---|
| `allow_anonymous` | bool | `false` | Allow unauthenticated requests |
| `oidc.enabled` | bool | `false` | Enable OIDC authentication |
| `oidc.issuer` | string | - | OIDC issuer URL |
| `oidc.client_id` | string | - | OAuth client ID |
| `oidc.audience` | string | - | Expected token audience |
| `oidc.role_claim_path` | string | `roles` | Path to roles in token claims |
| `oidc.role_prefix` | string | - | Filter roles to those with this prefix |
| `oidc.clock_skew_seconds` | int | `30` | Allowed clock skew for time claims |
| `oidc.max_token_age` | duration | `0` | Max token age (`0` = no limit) |
| `api_keys.enabled` | bool | `false` | Enable API key authentication |
| `api_keys.keys` | array | - | List of API key configurations |
**Fail-Closed Security**

Authentication follows a fail-closed model. Missing tokens, invalid signatures, expired tokens, or missing required claims (`sub`, `exp`) all result in denied access.
### Browser Sessions (OIDC Login for Portal UI)

When both `auth.oidc` and `auth.browser_session` are enabled, the portal UI offers SSO login via the configured OIDC provider. The flow uses authorization code with PKCE and stores the session in an HMAC-SHA256 signed JWT cookie.

```yaml
auth:
  oidc:
    enabled: true
    issuer: "https://auth.example.com/realms/platform"
    client_id: "mcp-data-platform"
    client_secret: "${OIDC_CLIENT_SECRET}"
    audience: "mcp-data-platform"
    role_claim_path: "realm_access.roles"
    role_prefix: "dp_"
    scopes: [openid, profile, email]
  browser_session:
    enabled: true
    signing_key: "${SESSION_SIGNING_KEY}"  # openssl rand -base64 32
    ttl: 8h
    secure: true
```
| Field | Type | Default | Description |
|---|---|---|---|
| `browser_session.enabled` | bool | `false` | Enable cookie-based browser sessions |
| `browser_session.signing_key` | string | - | Base64-encoded HMAC key (32+ bytes) |
| `browser_session.ttl` | duration | `8h` | Session lifetime |
| `browser_session.secure` | bool | `true` | HTTPS-only cookies (set `false` for local dev) |
| `browser_session.cookie_name` | string | `mcp_session` | Cookie name |
| `browser_session.domain` | string | - | Cookie domain restriction |
The portal UI automatically detects OIDC availability and shows an SSO button. API key authentication remains as a fallback. MCP protocol clients are unaffected — browser sessions only apply to the portal HTTP endpoints.
**Session Limitations**

Sessions are stateless (no server-side store). Individual sessions cannot be revoked. Rotating `signing_key` invalidates all active sessions. Users must re-authenticate after the TTL expires.
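These limitations follow from the signature scheme itself. A minimal sketch of HMAC-SHA256 signing and verification (illustrative only, not the platform's actual cookie format) shows why rotating the key invalidates every outstanding session:

```python
import hashlib
import hmac

def sign(payload: bytes, key: bytes) -> bytes:
    return hmac.new(key, payload, hashlib.sha256).digest()

key = b"example-32-byte-signing-key....."  # placeholder, not a real secret
payload = b'{"sub":"alice","exp":1735689600}'
sig = sign(payload, key)

# A session verifies only while the server still holds the same key.
print(hmac.compare_digest(sig, sign(payload, key)))             # True
print(hmac.compare_digest(sig, sign(payload, b"rotated-key")))  # False
```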
## Database Configuration

The `database` block configures the PostgreSQL connection used by audit logging, knowledge capture, session externalization, OAuth persistence, and (optionally) the config store.
| Field | Type | Default | Description |
|---|---|---|---|
| `dsn` | string | - | PostgreSQL connection string |
| `max_open_conns` | int | `25` | Maximum open database connections |
**What the database unlocks**

Setting `dsn` enables audit logging, knowledge capture, session externalization, and OAuth persistence. Without it, these features degrade to in-memory or noop implementations.
## Config Store

The `config_store` block controls where platform configuration is persisted. By default, configuration is loaded from the YAML file and is read-only. Setting `mode` to `database` enables runtime config mutations via the admin API.
| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `file` | Config storage mode: `file` or `database` |
**`file` mode:** Configuration is loaded from YAML at startup. Read-only. Admin API mutation endpoints (config import, persona CRUD, auth key CRUD) return `409 Conflict`. This is the default and requires no database.

**`database` mode:** Configuration is persisted to the PostgreSQL `config_versions` table. Requires `database.dsn` to be configured. Supports import, export, history, and runtime mutations via the admin API. On startup, bootstrap fields (`server`, `database`, `auth`, `admin`, `config_store`, `apiVersion`) are always loaded from the YAML file and override database values.
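The bootstrap precedence can be sketched as a simple merge. This is an assumption about the observable behavior, not the actual loader code: database values win for everything except the bootstrap fields, which always come from the file:

```python
# Bootstrap fields always taken from the YAML file, per the docs above.
BOOTSTRAP_FIELDS = {"apiVersion", "server", "database", "auth", "admin", "config_store"}

def effective_config(file_cfg: dict, db_cfg: dict) -> dict:
    """Sketch: start from database config, then overlay bootstrap fields from file."""
    merged = dict(db_cfg)
    for key in BOOTSTRAP_FIELDS & file_cfg.keys():
        merged[key] = file_cfg[key]
    return merged

file_cfg = {"server": {"transport": "http"}, "database": {"dsn": "postgres://..."}}
db_cfg = {"server": {"transport": "stdio"}, "audit": {"enabled": True}}
print(effective_config(file_cfg, db_cfg))
# {'server': {'transport': 'http'}, 'audit': {'enabled': True}, 'database': {'dsn': 'postgres://...'}}
```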
See Operating Modes for the full comparison of deployment configurations.
## Tool Visibility Configuration

The `tools` block controls which tools appear in `tools/list` responses. This is a visibility filter for reducing LLM token usage — it hides tools from discovery but does not affect authorization. Persona-level tool filtering (see Tool Filtering) remains the security boundary for `tools/call`.
| Field | Type | Default | Description |
|---|---|---|---|
| `tools.allow` | array | `[]` | Tool name patterns to include in `tools/list` |
| `tools.deny` | array | `[]` | Tool name patterns to exclude from `tools/list` |
Semantics:
- No patterns configured: all tools visible (default)
- Allow only: only matching tools appear
- Deny only: all tools appear except denied
- Both: allow patterns are evaluated first, then deny removes from that set
Patterns use `filepath.Match` syntax — `*` matches any sequence of non-separator characters. For example, `trino_*` matches `trino_query`, `trino_execute`, and `trino_describe_table`.
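For a quick check of how a pattern will behave, Python's `fnmatch.fnmatchcase` approximates Go's `filepath.Match` for separator-free tool names like these (a convenience sketch, not the platform's matcher):

```python
from fnmatch import fnmatchcase

# fnmatchcase's `*` matches any characters; equivalent to filepath.Match
# here because tool names contain no path separators.
tools = ["trino_query", "trino_execute", "trino_describe_table", "datahub_search"]
visible = [t for t in tools if fnmatchcase(t, "trino_*")]
print(visible)  # ['trino_query', 'trino_execute', 'trino_describe_table']
```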
**When to use this**

Deployments that only use a subset of toolkits (e.g., only Trino) can hide unused tools to save tokens. A full tool list is 26-33 tools; filtering to `trino_*` reduces it to 8.
**Not a security boundary**

Tool visibility filtering only affects `tools/list` responses. A user who knows a tool name can still call it via `tools/call` if their persona allows it. Use persona tool filtering for access control.
## Admin API Configuration

The `admin` block enables and configures the REST API for system health, configuration management, persona CRUD, auth key management, and audit queries.
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable admin REST API |
| `persona` | string | `admin` | Persona required for admin access |
| `path_prefix` | string | `/api/v1/admin` | URL prefix for admin endpoints |
**HTTP transport required**

The admin API is served over HTTP. It is not available when running in stdio transport mode.
The admin portal provides a web-based dashboard for audit log exploration, tool execution testing, and system monitoring. Enable it with `portal.enabled: true`. When enabled, it is served at `/portal/`. See Admin API for the full endpoint reference and Admin Portal for the visual guide.
## Audit Configuration

The `audit` block controls audit logging of MCP tool calls. Audit events are written asynchronously to PostgreSQL.
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable audit logging |
| `log_tool_calls` | bool | `false` | Log MCP tool call events |
| `retention_days` | int | `90` | Days to retain audit events |
**Requires database**

Audit logging requires `database.dsn` to be configured. Both `enabled` and `log_tool_calls` must be true for tool call events to be recorded.
See Audit Logging for query examples and retention details.
## Session Configuration

The `sessions` block controls how MCP session state is stored. In-memory sessions are lost on restart; database-backed sessions survive restarts and support multi-replica deployments.
| Field | Type | Default | Description |
|---|---|---|---|
| `store` | string | `memory` | Backend: `memory` or `database` |
| `ttl` | duration | streamable `session_timeout` | Session lifetime |
| `idle_timeout` | duration | streamable `session_timeout` | Idle eviction threshold |
| `cleanup_interval` | duration | `1m` | Cleanup routine interval |
**Requires database**

The `database` store requires `database.dsn` to be configured.
See Session Externalization for architecture details and multi-replica considerations.
## Toolkit Configuration

### Trino

```yaml
toolkits:
  trino:
    primary:  # Instance name (can be any identifier)
      host: trino.example.com
      port: 443
      user: analyst
      password: ${TRINO_PASSWORD}
      catalog: hive
      schema: default
      ssl: true
      ssl_verify: true
      timeout: 120s
      default_limit: 1000
      max_limit: 10000
      read_only: false
      connection_name: primary
```
| Field | Type | Default | Description |
|---|---|---|---|
| `host` | string | required | Trino coordinator hostname |
| `port` | int | `8080` (`443` if SSL) | Trino coordinator port |
| `user` | string | required | Trino username |
| `password` | string | - | Trino password (if auth enabled) |
| `catalog` | string | - | Default catalog |
| `schema` | string | - | Default schema |
| `ssl` | bool | `false` | Enable SSL/TLS |
| `ssl_verify` | bool | `true` | Verify SSL certificates |
| `timeout` | duration | `120s` | Query timeout |
| `default_limit` | int | `1000` | Default row limit for queries |
| `max_limit` | int | `10000` | Maximum allowed row limit |
| `read_only` | bool | `false` | Restrict to read-only queries |
| `connection_name` | string | instance name | Display name for this connection |
### DataHub

```yaml
toolkits:
  datahub:
    primary:
      url: https://datahub.example.com
      token: ${DATAHUB_TOKEN}
      timeout: 30s
      default_limit: 10
      max_limit: 100
      max_lineage_depth: 5
      connection_name: primary
      read_only: true
```
| Field | Type | Default | Description |
|---|---|---|---|
| `url` | string | required | DataHub GMS URL |
| `token` | string | - | DataHub access token |
| `timeout` | duration | `30s` | API request timeout |
| `default_limit` | int | `10` | Default search result limit |
| `max_limit` | int | `100` | Maximum search result limit |
| `max_lineage_depth` | int | `5` | Maximum lineage traversal depth |
| `connection_name` | string | instance name | Display name for this connection |
| `read_only` | bool | `false` | Restrict to read operations (disables write tools) |
### S3

```yaml
toolkits:
  s3:
    primary:
      region: us-east-1
      endpoint: ""             # Custom endpoint for MinIO, etc.
      access_key_id: ${AWS_ACCESS_KEY_ID}
      secret_access_key: ${AWS_SECRET_ACCESS_KEY}
      session_token: ""
      profile: ""              # AWS profile name
      use_path_style: false    # Use path-style URLs
      timeout: 30s
      disable_ssl: false
      read_only: true          # Restrict to read operations
      max_get_size: 10485760   # 10MB
      max_put_size: 104857600  # 100MB
      connection_name: primary
      bucket_prefix: ""        # Filter to buckets with this prefix
```
| Field | Type | Default | Description |
|---|---|---|---|
| `region` | string | `us-east-1` | AWS region |
| `endpoint` | string | - | Custom S3 endpoint (for MinIO, etc.) |
| `access_key_id` | string | - | AWS access key ID |
| `secret_access_key` | string | - | AWS secret access key |
| `session_token` | string | - | AWS session token (for temporary creds) |
| `profile` | string | - | AWS credentials profile name |
| `use_path_style` | bool | `false` | Use path-style S3 URLs |
| `timeout` | duration | `30s` | Request timeout |
| `disable_ssl` | bool | `false` | Disable SSL (for local testing) |
| `read_only` | bool | `false` | Restrict to read operations |
| `max_get_size` | int64 | `10485760` | Max bytes to read from objects |
| `max_put_size` | int64 | `104857600` | Max bytes to write to objects |
| `connection_name` | string | instance name | Display name for this connection |
| `bucket_prefix` | string | - | Only show buckets with this prefix |
## Cross-Injection Configuration

```yaml
injection:
  trino_semantic_enrichment: true   # Add DataHub context to Trino results
  datahub_query_enrichment: true    # Add Trino availability to DataHub results
  s3_semantic_enrichment: true      # Add DataHub context to S3 results
  datahub_storage_enrichment: true  # Add S3 availability to DataHub results
  column_context_filtering: true    # Only include SQL-referenced columns (default: true)

  # Session metadata deduplication (avoids repeating metadata for same table)
  session_dedup:
    enabled: true         # Default: true
    mode: reference       # reference (default), summary, none
    entry_ttl: 5m         # Defaults to semantic.cache.ttl
    session_timeout: 30m  # Defaults to server.streamable.session_timeout
```
| Field | Type | Default | Description |
|---|---|---|---|
| `trino_semantic_enrichment` | bool | `false` | Enrich Trino results with DataHub metadata |
| `datahub_query_enrichment` | bool | `false` | Add query availability to DataHub search results |
| `s3_semantic_enrichment` | bool | `false` | Enrich S3 results with DataHub metadata |
| `datahub_storage_enrichment` | bool | `false` | Add S3 availability to DataHub results |
| `column_context_filtering` | bool | `true` | Limit column enrichment to SQL-referenced columns |
| `session_dedup.enabled` | bool | `true` | Whether session dedup is active |
| `session_dedup.mode` | string | `reference` | Repeat query content: `reference`, `summary`, or `none` |
| `session_dedup.entry_ttl` | duration | semantic cache TTL | How long a table stays "already sent" |
| `session_dedup.session_timeout` | duration | streamable session timeout | Idle session cleanup interval |
## Semantic and Query Provider Configuration

Specify which toolkit instance provides semantic metadata and query execution:

```yaml
semantic:
  provider: datahub  # Provider type: datahub or noop
  instance: primary  # Which DataHub instance to use
  cache:
    enabled: true
    ttl: 5m

query:
  provider: trino    # Provider type: trino or noop
  instance: primary  # Which Trino instance to use

storage:
  provider: s3       # Provider type: s3 or noop
  instance: primary  # Which S3 instance to use
```
## Persona Configuration

Personas define tool access based on user roles. The security model follows a default-deny approach.

```yaml
personas:
  definitions:
    analyst:
      display_name: "Data Analyst"
      roles: ["analyst", "data_engineer"]
      tools:
        allow: ["trino_*", "datahub_*"]
        deny: ["*_delete_*", "*_drop_*"]
    admin:
      display_name: "Administrator"
      roles: ["admin"]
      tools:
        allow: ["*"]
  default_persona: analyst
```
| Field | Type | Default | Description |
|---|---|---|---|
| `definitions` | map | - | Named persona configurations |
| `definitions.<name>.display_name` | string | - | Human-readable name |
| `definitions.<name>.roles` | array | - | Roles that map to this persona |
| `definitions.<name>.tools.allow` | array | `[]` | Allowed tool patterns |
| `definitions.<name>.tools.deny` | array | `[]` | Denied tool patterns |
| `default_persona` | string | - | Persona for users without role match |
**Default-Deny Security**

Users without a resolved persona have no tool access. The built-in default persona denies all tools. You must define explicit personas with tool access for your users.
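The resolution rule can be sketched as follows. This is an illustrative model of the documented behavior, assuming deny patterns take precedence and that no allow match means denied:

```python
from fnmatch import fnmatchcase

def tool_allowed(tool: str, allow: list, deny: list) -> bool:
    # Default-deny: the tool must match an allow pattern and no deny pattern.
    if not any(fnmatchcase(tool, p) for p in allow):
        return False
    return not any(fnmatchcase(tool, p) for p in deny)

analyst = {"allow": ["trino_*", "datahub_*"], "deny": ["*_delete_*", "*_drop_*"]}
print(tool_allowed("trino_query", **analyst))         # True
print(tool_allowed("trino_delete_table", **analyst))  # False (deny wins)
print(tool_allowed("s3_list_buckets", **analyst))     # False (no allow match)
```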
## Knowledge Capture Configuration

Knowledge capture records domain knowledge shared during AI sessions and provides a workflow for applying approved insights to the DataHub catalog. See Knowledge Capture for the full feature documentation.

```yaml
knowledge:
  enabled: true
  apply:
    enabled: true
    datahub_connection: primary
    require_confirmation: true
```
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable knowledge capture toolkit (`capture_insight` tool) |
| `apply.enabled` | bool | `false` | Enable the `apply_knowledge` tool for admin review and catalog write-back |
| `apply.datahub_connection` | string | - | DataHub instance name for write-back operations |
| `apply.require_confirmation` | bool | `false` | Require explicit `confirm: true` on apply actions |
**Prerequisites**

Knowledge capture requires `database.dsn` to be configured. The `apply_knowledge` tool requires the admin persona.
## MCP Apps Configuration

MCP Apps provide interactive UI components that enhance tool results. The platform provides the infrastructure; you provide the HTML/JS/CSS apps.

```yaml
mcpapps:
  enabled: true
  apps:
    query_results:
      enabled: true
      assets_path: "/etc/mcp-apps/query-results"
      tools:
        - trino_query
        - trino_execute
      csp:
        resource_domains:
          - "https://cdn.jsdelivr.net"
```
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable MCP Apps infrastructure |
| `apps` | map | - | Named app configurations |
| `apps.<name>.enabled` | bool | `true` | Enable this app |
| `apps.<name>.assets_path` | string | required | Absolute path to app directory |
| `apps.<name>.tools` | array | required | Tools this app enhances |
| `apps.<name>.csp.resource_domains` | array | - | Allowed CDN origins |
See MCP Apps Configuration for complete options.
## Resource Templates Configuration

Resource templates expose platform data as browseable, parameterized MCP resources using RFC 6570 URI templates.
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable resource templates |
When enabled, the platform registers these resource templates:
- `schema://{catalog}.{schema}/{table}` — Table schema with column types and descriptions
- `glossary://{term}` — Glossary term definitions
- `availability://{catalog}.{schema}/{table}` — Query availability and row counts
Clients that support resource browsing (e.g., Claude Desktop) will show these as navigable resources alongside tools.
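For simple `{var}` placeholders like the ones above, a client can resolve a concrete resource URI with Level-1 expansion, which amounts to plain string replacement (a sketch; full RFC 6570 expansion also percent-encodes values):

```python
def expand(template: str, **params: str) -> str:
    # Level-1 expansion: replace each {name} with its value.
    for name, value in params.items():
        template = template.replace("{" + name + "}", value)
    return template

uri = expand("schema://{catalog}.{schema}/{table}",
             catalog="hive", schema="default", table="orders")
print(uri)  # schema://hive.default/orders
```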
## Custom Resources Configuration

Custom resources let you expose arbitrary static content as named MCP resources — brand assets, operational limits, environment docs, or any structured blob that agents can read by URI. They are registered whenever `resources.custom` is non-empty, independent of `resources.enabled`.

```yaml
resources:
  custom:
    - uri: "brand://theme"
      name: "Brand Theme"
      description: "Primary brand colors and site URL"
      mime_type: "application/json"
      content: |
        {
          "colors": {"primary": "#FF6B35", "secondary": "#004E89"},
          "url": "https://example.com"
        }
    - uri: "brand://logo"
      name: "Brand Logo SVG"
      mime_type: "image/svg+xml"
      content_file: "/etc/platform/logo.svg"
```
| Field | Type | Required | Description |
|---|---|---|---|
| `uri` | string | Yes | Unique resource URI (e.g., `brand://theme`, `docs://limits`) |
| `name` | string | Yes | Human-readable name shown in `resources/list` |
| `description` | string | No | Optional description for MCP clients |
| `mime_type` | string | Yes | MIME type (e.g., `application/json`, `image/svg+xml`, `text/plain`) |
| `content` | string | One of | Inline content (text, JSON, SVG, etc.) |
| `content_file` | string | One of | Absolute path to a file; read on every request (supports hot-reload) |
`content` and `content_file` are mutually exclusive. Invalid entries (missing required fields, or both or neither content fields set) are skipped with a warning at startup; valid entries in the same list are still registered.
## Progress Notifications Configuration

Progress notifications send granular updates to MCP clients during long-running Trino queries. The client must include `_meta.progressToken` in the request to receive updates.
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable progress notifications |
When enabled, Trino query execution sends progress updates including rows scanned, bytes processed, and query stage information. Clients that don't send a `progressToken` receive no notifications (zero overhead).
## Client Logging Configuration

Client logging sends server-to-client log messages via the MCP `logging/setLevel` protocol. Messages include enrichment decisions, timing data, and platform diagnostics.
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable client logging |
There is zero overhead if the client hasn't subscribed via `logging/setLevel`. When active, log messages report semantic cache hits/misses, enrichment timing, and cross-injection decisions.
## Elicitation Configuration

Elicitation requests user confirmation before potentially expensive or sensitive operations. It requires client-side elicitation support (e.g., Claude Desktop) and gracefully degrades to a no-op if the client doesn't support elicitation.

```yaml
elicitation:
  enabled: true
  cost_estimation:
    enabled: true
    row_threshold: 1000000
  pii_consent:
    enabled: true
```
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable elicitation |
| `cost_estimation.enabled` | bool | `false` | Prompt before expensive queries |
| `cost_estimation.row_threshold` | int | `1000000` | Row count threshold from `EXPLAIN IO` estimates |
| `pii_consent.enabled` | bool | `false` | Prompt when a query accesses PII-tagged columns |
**Client support required**

Elicitation uses the MCP `elicitation/create` capability. Clients that don't support elicitation will not receive prompts — queries proceed without confirmation.
## Icons Configuration

Icons add visual metadata to tools, resources, and prompts in MCP list responses. Upstream toolkits (Trino, DataHub, S3) provide default icons; this configuration overrides or extends them.

```yaml
icons:
  enabled: true
  tools:
    trino_query:
      src: "https://example.com/custom-trino.svg"
      mime_type: "image/svg+xml"
  resources:
    "schema://{catalog}.{schema}/{table}":
      src: "https://example.com/schema.svg"
  prompts:
    knowledge_capture:
      src: "https://example.com/knowledge.svg"
```
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Enable icon injection middleware |
| `tools` | map | - | Icon overrides keyed by tool name |
| `resources` | map | - | Icon overrides keyed by resource URI |
| `prompts` | map | - | Icon overrides keyed by prompt name |
| `*.src` | string | - | Icon source URL |
| `*.mime_type` | string | - | Icon MIME type (e.g., `image/svg+xml`) |
**Default icons**

Each upstream toolkit provides a default icon for all its tools. You only need this configuration if you want to customize or override those defaults.
## Environment Variables

Common environment variables:
| Variable | Description |
|---|---|
| `TRINO_USER` | Trino username |
| `TRINO_PASSWORD` | Trino password |
| `DATAHUB_TOKEN` | DataHub access token |
| `AWS_ACCESS_KEY_ID` | AWS access key |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key |
| `AWS_SESSION_TOKEN` | AWS session token |
| `DATABASE_URL` | PostgreSQL connection string (for audit/OAuth) |
## Complete Example

```yaml
apiVersion: v1

server:
  name: mcp-data-platform
  transport: http
  address: ":8080"

database:
  dsn: ${DATABASE_URL}

config_store:
  mode: database

portal:
  enabled: true

admin:
  enabled: true
  persona: admin

# Hide unused tools from tools/list to save LLM tokens
tools:
  allow:
    - "trino_*"
    - "datahub_*"
    - "capture_insight"
  deny:
    - "*_delete_*"

audit:
  enabled: true
  log_tool_calls: true
  retention_days: 90

sessions:
  store: database
  ttl: 30m
  idle_timeout: 30m
  cleanup_interval: 1m

auth:
  api_keys:
    enabled: true
    keys:
      - key: ${API_KEY_ADMIN}
        name: "admin"
        roles: ["admin"]

toolkits:
  trino:
    primary:
      host: trino.example.com
      port: 443
      user: ${TRINO_USER}
      password: ${TRINO_PASSWORD}
      ssl: true
      catalog: hive
      schema: default
      default_limit: 1000
      max_limit: 10000
  datahub:
    primary:
      url: https://datahub.example.com
      token: ${DATAHUB_TOKEN}
      default_limit: 10
      max_limit: 100
  s3:
    primary:
      region: us-east-1
      read_only: true

semantic:
  provider: datahub
  instance: primary
  cache:
    enabled: true
    ttl: 5m

query:
  provider: trino
  instance: primary

storage:
  provider: s3
  instance: primary

injection:
  trino_semantic_enrichment: true
  datahub_query_enrichment: true
  s3_semantic_enrichment: true
  column_context_filtering: true

resources:
  enabled: true

progress:
  enabled: true

client_logging:
  enabled: true

elicitation:
  enabled: true
  cost_estimation:
    enabled: true
    row_threshold: 1000000

personas:
  definitions:
    analyst:
      display_name: "Data Analyst"
      roles: ["analyst"]
      tools:
        allow: ["trino_query", "trino_execute", "trino_explain", "datahub_*"]
        deny: ["*_delete_*"]
    admin:
      display_name: "Administrator"
      roles: ["admin"]
      tools:
        allow: ["*"]
  default_persona: analyst
```
## Next Steps
- Operating Modes - Standalone, file + DB, and bootstrap + DB config modes
- Admin API - REST endpoints for system, config, personas, auth keys, audit
- Tools - Available tools and parameters
- Multi-Provider - Configure multiple instances
- Authentication - Add authentication
- Personas - Role-based access control
- MCP Apps - Interactive UI for tool results
- Middleware Reference - Request processing chain details