Snowflake External Functions
External functions in Snowflake allow you to call custom code running as serverless functions in cloud providers, extending Snowflake's capabilities with custom APIs and third-party services.
What are External Functions?
- Execute code in a remote cloud service (AWS Lambda, Azure Functions, GCP Cloud Functions)
- Return results directly to Snowflake queries
- Process data in batches for efficiency
Architecture Overview
External functions in Snowflake follow this architecture:
- Snowflake Side β SQL Query Processing β API Integration Layer β CREATE EXTERNAL FUNCTION
- HTTP/S Connection β Snowflake sends batched row data to the cloud provider
- Cloud Provider Side β API Gateway / Load Balancer β Lambda functions (1..N) β Custom Business Logic (APIs, ML Models, Transforms)
Key Concepts
DfExternal Function
DfBatching
API Integration Setup
AWS API Gateway + Lambda
-- Create API integration
CREATE OR REPLACE API INTEGRATION aws_api_integration
API_PROVIDER = aws_api_gateway
API_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-api-role'
API_ALLOWED_PREFIXES = ('https://abc123.execute-api.us-east-1.amazonaws.com/prod')
ENABLED = TRUE;
-- Check integration status
DESC INTEGRATION aws_api_integration;
Azure API Management + Functions
CREATE OR REPLACE API INTEGRATION azure_api_integration
API_PROVIDER = azure_api_management
AZURE_TENANT_ID = 'your-tenant-id'
AZURE_AD_APPLICATION_ID = 'your-app-id'
API_ALLOWED_PREFIXES = ('https://myapim.azure-api.net/')
ENABLED = TRUE;
GCP API Gateway + Cloud Functions
CREATE OR REPLACE API INTEGRATION gcp_api_integration
API_PROVIDER = google_cloud
GCP_SERVICE_ACCOUNT = 'snowflake@my-project.iam.gserviceaccount.com'
API_ALLOWED_PREFIXES = ('https://my-gateway-abc123.uc.r.appspot.com/')
ENABLED = TRUE;
Creating External Functions
Simple External Function
CREATE OR REPLACE EXTERNAL FUNCTION sentiment_analysis(text VARCHAR)
RETURNS VARCHAR
API_INTEGRATION = aws_api_integration
MAX_BATCH_ROWS = 100
AS 'https://abc123.execute-api.us-east-1.amazonaws.com/prod/sentiment';
Complex External Function with Multiple Returns
CREATE OR REPLACE EXTERNAL FUNCTION translate_text(
input_text VARCHAR,
source_lang VARCHAR,
target_lang VARCHAR
)
RETURNS TABLE (translated_text VARCHAR, confidence DECIMAL(5,4))
API_INTEGRATION = azure_api_integration
MAX_BATCH_ROWS = 50
AS 'https://myapim.azure-api.net/translate';
Using External Functions
-- Simple call
SELECT
customer_feedback,
sentiment_analysis(customer_feedback) as sentiment
FROM feedback_table;
-- Table function call
SELECT
f.translated_text,
f.confidence
FROM my_table,
LATERAL TABLE(translate_text(text, 'en', 'es')) f;
Lambda Function Example (Python)
import json
import boto3
from datetime import datetime
def lambda_handler(event, context):
"""
Snowflake External Function Handler
Receives batch of rows and returns results
"""
# Parse Snowflake request
rows = event['data']
results = []
for row in rows:
row_number = row[0]
input_data = row[1:]
# Custom processing logic
result = process_data(input_data)
results.append({
'row_number': row_number,
'data': [result]
})
return {'data': results}
def process_data(data):
# Custom business logic here
# Example: sentiment analysis, translation, etc.
return {"processed": True, "timestamp": datetime.utcnow().isoformat()}
Performance Considerations
External function performance depends on API response time, network latency, and batch size. Optimal batch sizes typically range from 100-500 rows. Monitor function duration to avoid timeouts (default 10 minutes).
Batching Strategy
| Batch Size | Latency | Throughput | Use Case |
|---|---|---|---|
| 10-50 | Low | Moderate | Real-time APIs |
| 100-250 | Medium | High | Batch processing |
| 500-1000 | High | Maximum | Bulk transformations |
Error Handling
-- Create function with error handling
CREATE OR REPLACE EXTERNAL FUNCTION safe_api_call(input VARCHAR)
RETURNS VARCHAR
API_INTEGRATION = aws_api_integration
MAX_BATCH_ROWS = 100
AS 'https://api.example.com/process';
-- Handle errors in queries
SELECT
input_data,
safe_api_call(input_data) as result
FROM my_table
WHERE safe_api_call(input_data) IS NOT NULL;
Monitoring and Logging
-- Check function invocations
SELECT *
FROM TABLE(INFORMATION_SCHEMA.EXTERNAL_FUNCTION_INVOCATION_HISTORY(
START_TIME => DATEADD(hour, -24, CURRENT_TIMESTAMP()),
END_TIME => CURRENT_TIMESTAMP()
))
ORDER BY START_TIME DESC;
- External functions extend Snowflake with custom cloud-native code
- API Integrations configure secure connections to cloud services
- Batch processing optimizes performance and reduces API calls
- Support for AWS, Azure, and GCP serverless functions
- Monitor invocation history for performance optimization