Finding Booking Islands: Adapting a Network Monitoring Pattern [>_]

From Network Alarms to Hotel Bookings

As a network engineer, I frequently deal with alarm events from monitoring systems. Devices generate overlapping alerts—one alarm might trigger at 10:00 and clear at 10:15, while another starts at 10:05 and runs until 10:20. The key question is: when do we actually have continuous issues versus isolated incidents?

This is the classic "gaps and islands" problem. I originally built this query to analyze alarm clustering on network devices, but I've adapted it here to use hotel bookings as the example. Why? Two reasons:

Hotels are more intuitive - Most people understand overlapping reservations better than SNMP trap sequences
The pattern is universal - Once you see that network alarms and hotel bookings are fundamentally the same problem, you realize this SQL pattern applies everywhere: server uptime windows, user session analysis, equipment maintenance schedules, or any time-series data where events can overlap

The algorithm doesn't care whether you're tracking alert_id on network switches or booking_id at beach resorts. The structure is identical.

The Problem

Whether it's network devices or hotel properties, you need to answer:

When do we have continuous activity versus quiet periods?
How many individual events cluster together during busy times?
What's the average intensity during peak periods?
Where are the gaps we should investigate?

Real-World Example

Here's what the booking pattern looks like visually:

Timeline: March 1 ──────────────────────────────────────────────> March 14

Oceanview Property:
                    ISLAND 1                    GAP           ISLAND 2
         ├─────────────────────────────┤                ├──────────────────┤
Mar 1    Mar 2    Mar 3    Mar 4    Mar 5    Mar 6 ... Mar 10   Mar 11   Mar 12   Mar 13   Mar 14
  │        │        │        │        │        │             │        │        │        │        │
  ├────────────────┤ Booking 1 (2 rooms)
           ├────────────────┤ Booking 2 (1 room)
                    ├────────────────┤ Booking 3 (3 rooms)
                                      │                       │
                                      └───── 4-day gap ───────┘
                                                               │
                                                               ├────────────────┤ Booking 4 (2 rooms)
                                                                        ├────────────────┤ Booking 5 (1 room)
                                                                                 ├────────────────┤ Booking 6 (2 rooms)

Island 1: March 1-6 (3 overlapping/touching bookings)
Gap: March 6-10 (no activity)
Island 2: March 10-14 (3 overlapping bookings)

In my network monitoring context, these would be alarm clusters on a device, separated by periods of normal operation.

The Data

Here's the schema adapted for hotels (my original used device_id, alert_id, start_time, end_time):

DROP TABLE IF EXISTS hotel_bookings;
CREATE TABLE hotel_bookings (
    id SERIAL NOT NULL PRIMARY KEY,
    property_id TEXT NOT NULL,
    booking_id TEXT NOT NULL,
    check_in TIMESTAMP NOT NULL,
    check_out TIMESTAMP NOT NULL,
    room_count INTEGER NOT NULL DEFAULT 1
);

-- Sample data
INSERT INTO hotel_bookings (property_id, booking_id, check_in, check_out, room_count)
VALUES 
    -- Property 1: Overlapping bookings, then a gap, then more bookings
    ('oceanview', 'bk-001', '2025-03-01 14:00:00', '2025-03-03 11:00:00', 2),
    ('oceanview', 'bk-002', '2025-03-02 15:00:00', '2025-03-04 11:00:00', 1),
    ('oceanview', 'bk-003', '2025-03-04 14:00:00', '2025-03-06 11:00:00', 3),
    ('oceanview', 'bk-004', '2025-03-10 14:00:00', '2025-03-12 11:00:00', 2),
    ('oceanview', 'bk-005', '2025-03-11 15:00:00', '2025-03-13 11:00:00', 1),
    ('oceanview', 'bk-006', '2025-03-12 14:00:00', '2025-03-14 11:00:00', 2),
    
    -- Property 2: Sparser activity
    ('downtown', 'bk-101', '2025-03-01 14:00:00', '2025-03-03 11:00:00', 1),
    ('downtown', 'bk-102', '2025-03-02 15:00:00', '2025-03-04 11:00:00', 2),
    ('downtown', 'bk-103', '2025-03-08 14:00:00', '2025-03-10 11:00:00', 1);

The Solution

This is the exact same algorithm I use for network alarms, just with different column names:

WITH ordered_bookings AS (
    -- Step 1: Order events chronologically and look at when the previous event ended
    -- In my network monitoring version, this would be ordered_alarms with device_id
    -- LAG() gives us the end time of the event that came immediately before
    SELECT 
        property_id,
        booking_id,
        check_in,
        check_out,
        room_count,
        LAG(check_out) OVER (
            PARTITION BY property_id 
            ORDER BY check_in
        ) AS prev_checkout
    FROM hotel_bookings
),
island_starts AS (
    -- Step 2: Detect where new islands begin
    -- An island starts when there's no previous event (first one) 
    -- OR when there's a time gap between events
    -- This is the key insight: if current start > previous end, we have a gap
    SELECT 
        *,
        CASE 
            WHEN prev_checkout IS NULL OR check_in > prev_checkout 
            THEN 1 
            ELSE 0 
        END AS is_island_start
    FROM ordered_bookings
),
island_groups AS (
    -- Step 3: Assign island numbers using a running sum
    -- Every time is_island_start = 1, we increment the counter
    -- All events in the same island get the same number
    SELECT 
        *,
        SUM(is_island_start) OVER (
            PARTITION BY property_id 
            ORDER BY check_in
        ) AS island_number
    FROM island_starts
)
-- Step 4: Aggregate by island to get summary statistics
-- In network monitoring, this tells me: how long was the outage? how many alarms fired?
SELECT 
    property_id,
    island_number,
    MIN(check_in) AS island_start,
    MAX(check_out) AS island_end,
    COUNT(*) AS booking_count,
    ROUND(AVG(room_count), 2) AS avg_rooms_per_booking,
    SUM(room_count) AS total_room_nights,
    ROUND(
        EXTRACT(EPOCH FROM (MAX(check_out) - MIN(check_in))) / 86400, 
        2
    ) AS island_duration_days
FROM island_groups
GROUP BY property_id, island_number
ORDER BY property_id, island_number;

Understanding the Results

property_id	island_number	island_start	island_end	booking_count	avg_rooms	total_room_nights	duration_days
downtown	1	2025-03-01 14:00	2025-03-04 11:00	2	1.50	3	2.88
downtown	2	2025-03-08 14:00	2025-03-10 11:00	1	1.00	1	1.88
oceanview	1	2025-03-01 14:00	2025-03-06 11:00	3	2.00	6	4.88
oceanview	2	2025-03-10 14:00	2025-03-14 11:00	3	1.67	5	3.88

In the hotel context: oceanview has two strong occupancy islands separated by a 4-day gap. In the network context: a device had two outage windows with multiple alarms each, separated by a period of stability.

Same pattern, different domain.

The Core Algorithm

Three SQL features working together:

LAG() - Look at the previous event's end time
CASE - Detect gaps (current start > previous end = new island)
SUM() OVER() - Running total that increments at each gap, creating island IDs

Try It Yourself

Switch to SQL tab to try out other challenges or try adapting it to your domain:

Network ops: Replace with device alarms
DevOps: Server uptime windows
Analytics: User session clustering
Maintenance: Equipment downtime periods

The table structure is always the same: an entity ID, an event ID, start time, end time, and optionally a metric.

Why This Matters

When I first encountered this problem with network alarms, I thought it was specific to monitoring systems. Then I needed the same logic for analyzing user sessions in application logs. Then for equipment maintenance windows. Then I saw hotel revenue managers asking the same question.

The realization: most time-series problems are the same problem wearing different hats. Master this pattern once, and you'll recognize it everywhere. The SQL doesn't care what domain you're in—it just finds continuous periods in overlapping time ranges.

That's the power of understanding patterns rather than memorizing solutions.