πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Snowflake Advanced UDFs and UDTFs

🟒 Free Lesson

Advertisement

Snowflake Advanced UDFs and UDTFs

Advanced UDFs and UDTFs in Snowflake enable custom data transformations, complex calculations, and reusable business logic that extends Snowflake's built-in capabilities.

UDF vs UDTF ComparisonUDF (Scalar)Single value in/outUDTF (Table)Multi-row outputUDAF (Aggregate)Multi-row in/outUDTF GeneratorRow generationSELECT func()WHERE, GROUP BYFROM func()LATERAL FLATTENGROUP BY func()Window functionsTABLE(func())Row generationUDF: scalar | UDTF: table | UDAF: aggregate | Generator: row production

What are UDFs and UDTFs?

  • UDFs return a single value per call (scalar)
  • UDTFs return multiple rows (table functions)
  • Supported in SQL, JavaScript, Python, Java, and Scala

Architecture Overview

The UDF/UDTF architecture includes two main types:

User-Defined Functions (UDFs)

  • Scalar UDFs β€” Single value in/out
  • Stored UDFs β€” Persisted result caching
  • Languages β€” SQL, JavaScript, Python, Java, Scala
  • Usage β€” SELECT, WHERE, GROUP BY
  • Determinism β€” Deterministic or non-deterministic

User-Defined Table Functions (UDTFs)

  • Table Functions β€” Multi-row output
  • Generator Functions β€” Row generation
  • Usage β€” LATERAL FLATTEN, TABLE()
  • Returns β€” Result sets; can process arrays and objects

Comparison

TypeOutputUsagePerformance
UDFSingle valueExpressionsFast
UDTFMultiple rowsTABLE()Medium
Stored ProcedureDML/DDLSide effectsVaries
External FunctionCloud APIsNetwork latencyVariable

Scalar UDFs

SQL Scalar UDF

CREATE OR REPLACE FUNCTION calculate_discount(price DECIMAL(10,2), discount_pct DECIMAL(5,2))
RETURNS DECIMAL(10,2)
LANGUAGE SQL
IMMUTABLE
AS
  $$
    SELECT price * (1 - discount_pct / 100)
  $$;

-- Usage
SELECT
  product_name,
  price,
  calculate_discount(price, 15) as discounted_price
FROM products;

JavaScript Scalar UDF

CREATE OR REPLACE FUNCTION parse_email(email VARCHAR)
RETURNS OBJECT
LANGUAGE JAVASCRIPT
IMMUTABLE
AS
  $$
    const parts = email.split('@');
    return {
      username: parts[0],
      domain: parts[1],
      isValid: parts.length === 2 && parts[1].includes('.')
    };
  $$;

-- Usage
SELECT
  email,
  parse_email(email):username::STRING as username,
  parse_email(email):domain::STRING as domain
FROM users;

Python Scalar UDF

CREATE OR REPLACE FUNCTION sentiment_score(text VARCHAR)
RETURNS FLOAT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('textblob')
HANDLER = 'get_sentiment'
AS
  $$
    from textblob import TextBlob
    
    def get_sentiment(text):
      blob = TextBlob(text)
      return blob.sentiment.polarity
  $$;

User-Defined Table Functions (UDTFs)

SQL UDTF

CREATE OR REPLACE FUNCTION generate_date_range(start_date DATE, end_date DATE)
RETURNS TABLE (date DATE, day_name VARCHAR, week_number INTEGER)
LANGUAGE SQL
AS
  $$
    WITH date_series AS (
      SELECT DATEADD(day, SEQ4(), :start_date) as date
      FROM TABLE(GENERATOR(ROWCOUNT => DATEDIFF('day', :start_date, :end_date) + 1))
    )
    SELECT
      date,
      DAYNAME(date) as day_name,
      WEEKOFYEAR(date) as week_number
    FROM date_series
  $$;

-- Usage
SELECT * FROM TABLE(generate_date_range('2024-01-01', '2024-12-31'));

JavaScript UDTF

CREATE OR REPLACE FUNCTION split_string_to_rows(input VARCHAR, delimiter VARCHAR)
RETURNS TABLE (value VARCHAR, position INTEGER)
LANGUAGE JAVASCRIPT
AS
  $$
    const values = INPUT_DELIMITER.split(DELIMITER);
    for (let i = 0; i < values.length; i++) {
      yield {VALUE: values[i], POSITION: i + 1};
    }
  $$
  INPUT_DELIMITER => (delimiter VARCHAR, delimiter VARCHAR);

-- Usage
SELECT * FROM TABLE(split_string_to_rows('a,b,c,d', ','));

Python UDTF for Time Series

CREATE OR REPLACE FUNCTION moving_average(values ARRAY, window_size INTEGER)
RETURNS TABLE (index INTEGER, value FLOAT, moving_avg FLOAT)
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
HANDLER = 'compute_moving_avg'
AS
  $$
    class compute_moving_avg:
      def __init__(self):
        self.values = []
        self.window_size = 0
      
      def process(self, values, window_size):
        self.values = values
        self.window_size = window_size
        for i in range(len(values)):
          start = max(0, i - window_size + 1)
          window_vals = values[start:i+1]
          moving_avg = sum(window_vals) / len(window_vals)
          yield (i, values[i], moving_avg)
  $$;

Stored UDFs for Caching

CREATE OR REPLACE FUNCTION expensive_calculation(input VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
CALLED ON NULL INPUT
STABLE
AS
  $$
    SELECT MD5(input || CURRENT_DATE())
  $$;

-- Stored UDF with caching
CREATE OR REPLACE FUNCTION cached_lookup(key VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
STABLE
AS
  $$
    SELECT value FROM lookup_table WHERE lookup_key = key
  $$;

Use IMMUTABLE for functions with no side effects and consistent outputs. Use STABLE for functions that depend on session state. Use VOLATILE for functions with random or time-dependent outputs.

UDF Best Practices

AspectBest PracticeImpact
DeterminismMark as IMMUTABLE/STABLEEnables caching
NULL handlingUse COALESCE or CALLED ON NULLPrevents errors
Language choiceMatch to complexityPerformance
Batch sizeProcess in batchesMemory efficiency
Error handlingValidate inputsReliability
  • UDFs return single values, UDTFs return result sets
  • SQL, JavaScript, Python, Java, and Scala are supported
  • IMMUTABLE/STABLE/VOLATILE affect caching behavior
  • UDTFs are essential for row-generating transformations
  • Stored UDFs cache results for repeated calls
⭐

Premium Content

Snowflake Advanced UDFs and UDTFs

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert Snowflake Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement