⚙️ SQL Patterns for Optimizing IoT Queries in TimescaleDB

IoT applications generate massive amounts of time-series data—millions of rows from thousands of sensors, often with unpredictable reporting intervals. As your dataset grows, queries that once took milliseconds can take minutes or more if not properly optimized.

In this post, we’ll explore practical SQL patterns for retrieving the latest sensor readings efficiently, and how to use continuous aggregates for historical analytics—based on real-world deployments using TimescaleDB on PostgreSQL.

Understanding IoT Table Design

How you structure your sensor data impacts everything from ingestion to query performance. Let’s look at two common formats:

1. Narrow Table Format (Flexible + Efficient)

This design works well when:

Sensors sample at different times or rates
You want to avoid schema changes when adding new sensors
You support hysteresis (only recording changes)

An example of a narrow table format is:

Narrow table format:

ts	sensor_id	value
2024-10-31 11:17:30.000	1007	23.45

Pair this with a sensor metadata table:

Sensor metadata table:

sensor_id	sensor_name	units
1007	temperature	degreesC
1012	heat_mode	on/off
1013	cooling_mode	on/off
1041	occupancy	number of people in room

2. Medium-Width Format (Multi-type Flexibility)

Adds columns for different data types (float, int, bool, text, json).

Medium-width table format:

ts	sensor_id	d	i	b	t	j
2024-10-31 11:17:30.000	1007	23.45	null	null	null	null
2024-10-31 11:17:47.000	1012	null	null	TRUE	null	null
2024-10-31 11:18:01.000	1041	null	4	null	null	null

Add a constraint to enforce that at least one value is recorded:

CONSTRAINT at_least_one_not_null
  CHECK ((d IS NOT NULL) OR (i IS NOT NULL) OR (b IS NOT NULL) OR (j IS NOT NULL) OR (t IS NOT NULL))

Querying the Latest Sensor Values (Efficiently)

The Problem:
Naive queries (like SELECT MAX(ts)) don’t scale well. With 1,000+ sensors and 40M+ rows, full-table scans can take over an hour.

The Goal:
Get the latest reading for each sensor—fast.

Pattern: SELECT DISTINCT ON + Time Constraint
This is a reliable pattern for dashboards or status displays.

WITH latest_data AS (
  SELECT DISTINCT ON (sensor_id) ts, sensor_id, d
  FROM iot_data
  WHERE d IS NOT NULL
    AND ts > CURRENT_TIMESTAMP - INTERVAL '1 week'  --  crucial
  ORDER BY sensor_id, ts DESC
)
SELECT
  sensor_id, sensors.sensor_name, ts, d
FROM latest_data
LEFT JOIN sensors ON latest_data.sensor_id = sensors.sensor_id
ORDER BY sensor_id, ts;

Why it works:

DISTINCT ON gets the most recent row per sensor_id
Time constraint avoids scanning old or inactive data
CTE makes the join cleaner and improves readability

Without the time filter, this query can take >60 minutes on large datasets.

2. Pattern: JOIN LATERAL for Targeted Subqueries
Great when querying a subset of known sensors.

SELECT sensor_list.sensor_id, latest_data.ts, latest_data.d
FROM sensors sensor_list
LEFT JOIN LATERAL (
  SELECT ts, d
  FROM iot_data raw_data
  WHERE sensor_id = sensor_list.sensor_id
  ORDER BY ts DESC
  LIMIT 1
) latest_data ON true
WHERE latest_data.d IS NOT NULL
  AND latest_data.ts > CURRENT_TIMESTAMP - INTERVAL '1 week'
ORDER BY sensor_list.sensor_id, latest_data.ts;

Why it works:

Limits scans to known sensors (from sensors table)
LIMIT 1 exits early once it finds the latest value
Cleaner for subsets (e.g., by location or device type)

3. Pattern: Continuous Aggregates for Historical Analysis

Timescale’s continuous aggregates precompute and store results—perfect for queries over longer time ranges.

CREATE MATERIALIZED VIEW temperature_daily_summary
WITH (timescaledb.continuous) AS
SELECT
  device_id,
  time_bucket('1 day', ts) AS day,
  AVG(d) AS avg_temp,
  MAX(d) AS max_temp,
  MIN(d) AS min_temp
FROM iot_data
GROUP BY device_id, day;

Then, set a policy to keep the view fresh:

SELECT add_continuous_aggregate_policy('temperature_daily_summary',
  start_offset => INTERVAL '1 month',
  end_offset => INTERVAL '1 day',
  schedule_interval => INTERVAL '1 hour');

Works seamlessly with compression and tiered storage.

Performance Takeaways

Always use time constraints (e.g. last 1h, 1d, 1w)—this avoids unnecessary scans
Use sensor metadata as your driving table when possible (it’s smaller)
Use LIMIT 1, JOIN LATERAL, or DISTINCT ON patterns strategically
For historical views, use continuous aggregates (they’re fast and storage-efficient)

Summary: Choosing the Right Pattern

Pattern	Best For	Notes
`SELECT DISTINCT ON`	General latest-value queries	Simple, fast with time constraint
`JOIN LATERAL`	Targeted sensor queries	Efficient for subsets
Continuous Aggregates	Historical analytics	Fast + compressed + refreshable

Learn More

Try it Yourself

Source link

Trending News

Finance

Finance

Category Collection

SQL Patterns for Optimizing IoT Queries in TimescaleDB

Understanding IoT Table Design

Querying the Latest Sensor Values (Efficiently)

Performance Takeaways

Summary: Choosing the Right Pattern

Learn More

Try it Yourself

Leave a Reply Cancel reply

Trending News

Finance

Finance

Category Collection

Understanding IoT Table Design

Querying the Latest Sensor Values (Efficiently)

Performance Takeaways

Summary: Choosing the Right Pattern

Learn More

Try it Yourself

Leave a Reply Cancel reply

Related News

Encryption and Information Security – DEV Community

How Hidden Channels and Miner Teams Shape Transaction Transparency

Searching among 3.2 Billion Common Crawl URLs with

How We Collected Bitcoin/Ethereum Data and Tested Miner Collusion