Use desktop for interactive SQL sandbox

SQL for Flow Diagrams: Visualizing Money Flows Between Accounts

The Backstory

A few years back, I was analyzing network traffic trying to figure out which servers were talking to each other the most. Millions of flow records. The boss wanted a visual.

I ended up building a flow diagram, you know, those charts that show connections between nodes. The hard part wasn't the visualization, it was the SQL to aggregate source→target→value tuples.

Here's what the end result looked like:

graph LR Server_A[Server A] -->|25.3GB| Database[Database Server] Server_B[Server B] -->|18.7GB| Database[Database Server] Server_C[Server C] -->|12.4GB| API_Gateway[API Gateway] API_Gateway[API Gateway] -->|8.9GB| Server_A[Server A]

The pattern? Simple:

  1. Aggregate flows: SELECT source, target, SUM(value) GROUP BY source, target
  2. Rank by volume: ROW_NUMBER() OVER (ORDER BY value DESC)
  3. Take top N flows
  4. Generate visualization syntax

That same pattern works for any source→target→value data. Network traffic, supply chains, email threads, API calls...

But the most useful application I've found? Tracking where my money actually goes.

The Real Problem: Budget Tracking

The Budgeting Problem

You have multiple bank accounts, credit cards, and investment accounts. Money flows between them all the time:

  • Paycheck → Checking
  • Checking → Savings
  • Checking → Credit Card (payment)
  • Credit Card → Expenses
  • Investment Account → Checking (dividend)

You want to visualize: Where is my money actually going?

The Schema

DROP TABLE IF EXISTS transactions;
CREATE TABLE transactions (
  id SERIAL PRIMARY KEY,
  from_account TEXT NOT NULL,
  to_account TEXT NOT NULL,
  amount NUMERIC(10,2) NOT NULL,
  category TEXT,
  transaction_date DATE NOT NULL,
  description TEXT
);

-- Sample data
INSERT INTO transactions (from_account, to_account, amount, category, transaction_date, description) VALUES
  ('Paycheck', 'Checking', 3500.00, 'Income', '2025-03-01', 'Salary deposit'),
  ('Checking', 'Savings', 500.00, 'Transfer', '2025-03-02', 'Monthly savings'),
  ('Checking', 'Credit Card', 1200.00, 'Payment', '2025-03-05', 'CC payment'),
  ('Credit Card', 'Groceries', 450.00, 'Expense', '2025-03-01', 'Whole Foods'),
  ('Credit Card', 'Restaurants', 280.00, 'Expense', '2025-03-03', 'Various dining'),
  ('Credit Card', 'Gas', 120.00, 'Expense', '2025-03-04', 'Shell station'),
  ('Checking', 'Rent', 1500.00, 'Expense', '2025-03-01', 'Monthly rent'),
  ('Checking', 'Utilities', 150.00, 'Expense', '2025-03-10', 'Electric + Internet'),
  ('Checking', 'Entertainment', 200.00, 'Expense', '2025-03-15', 'Movies + Games'),
  ('Investment', 'Checking', 45.00, 'Income', '2025-03-20', 'Dividend payment'),
  ('Paycheck', 'Checking', 3500.00, 'Income', '2025-03-15', 'Salary deposit'),
  ('Checking', 'Savings', 500.00, 'Transfer', '2025-03-16', 'Monthly savings'),
  ('Credit Card', 'Shopping', 340.00, 'Expense', '2025-03-12', 'Amazon orders'),
  ('Checking', 'Healthcare', 75.00, 'Expense', '2025-03-08', 'Doctor copay');

The Money Flow Query

Same pattern, different domain:

WITH money_flows AS (
  -- Step 1: Aggregate flows between accounts
  SELECT 
    from_account AS source,
    to_account AS target,
    SUM(amount) AS total_amount,
    COUNT(*) AS num_transactions
  FROM transactions
  WHERE transaction_date >= '2025-03-01' 
    AND transaction_date < '2025-04-01'
  GROUP BY from_account, to_account
),
ranked_flows AS (
  -- Step 2: Rank by dollar amount
  SELECT 
    source,
    target,
    total_amount,
    num_transactions,
    ROW_NUMBER() OVER (ORDER BY total_amount DESC) AS rank
  FROM money_flows
),
top_flows AS (
  -- Step 3: Get top 15 flows
  SELECT 
    source,
    target,
    total_amount AS value,
    num_transactions
  FROM ranked_flows
  WHERE rank <= 15
)
-- Step 4: Format for visualization
SELECT 
  source,
  target,
  value,
  '$' || ROUND(value, 2) AS formatted_value,
  num_transactions
FROM top_flows
ORDER BY value DESC;

Output:

source        | target         | value   | formatted_value | num_transactions
--------------|----------------|---------|-----------------|------------------
Paycheck      | Checking       | 7000.00 | $7000.00        | 2
Rent          | Checking       | 1500.00 | $1500.00        | 1
Credit Card   | Checking       | 1200.00 | $1200.00        | 1
Checking      | Savings        | 1000.00 | $1000.00        | 2
Groceries     | Credit Card    | 450.00  | $450.00         | 1
Shopping      | Credit Card    | 340.00  | $340.00         | 1
Restaurants   | Credit Card    | 280.00  | $280.00         | 1
Entertainment | Checking       | 200.00  | $200.00         | 1
Utilities     | Checking       | 150.00  | $150.00         | 1
Gas           | Credit Card    | 120.00  | $120.00         | 1
Healthcare    | Checking       | 75.00   | $75.00          | 1
Dividend      | Investment     | 45.00   | $45.00          | 1

Visualization:

graph LR Paycheck[Paycheck] -->|$7000| Checking[Checking] Checking[Checking] -->|$1500| Rent[Rent] Checking[Checking] -->|$1200| Credit_Card[Credit Card] Checking[Checking] -->|$1000| Savings[Savings] Credit_Card[Credit Card] -->|$450| Groceries[Groceries] Credit_Card[Credit Card] -->|$340| Shopping[Shopping] Credit_Card[Credit Card] -->|$280| Restaurants[Restaurants] Checking[Checking] -->|$200| Entertainment[Entertainment] Checking[Checking] -->|$150| Utilities[Utilities] Credit_Card[Credit Card] -->|$120| Gas[Gas] Checking[Checking] -->|$75| Healthcare[Healthcare] Investment[Investment] -->|$45| Checking[Checking]

And just like that, you can see where your money is flowing!

The Pattern

Notice the pattern here? Whether it's:

  • Network traffic: src_ipdst_ip with bytes
  • Money flows: from_accountto_account with amount
  • Supply chain: warehousestore with units
  • Email threads: senderrecipient with message_count
  • API calls: service_aservice_b with request_count

Real-World Tips

From my experience building these:

  1. Always limit to top N flows

    • Sankey diagrams get messy with 100+ nodes
    • Top 10-20 flows usually tell the story
  2. Consider bidirectional flows

    • Networks: A→B and B→A might be the same conversation
    • Money: They're definitely different (income vs. expense)
  3. Add human-readable labels

    • Convert IPs to hostnames
    • Add port numbers → service names (443 → HTTPS)
    • Format large numbers (7523234 bytes → 7.2 MB)
  4. Filter by time window

    • "Last 24 hours" or "This month"
    • Rolling windows are more useful than all-time
  5. Watch for self-loops

    • WHERE source != target to exclude same-account transfers
    • Unless that's interesting data (servers talking to themselves)

Advanced: Multi-Level Flow Diagrams

Want to get fancy? Show flows at multiple levels:

-- Budget example: Income → Checking → Categories → Specific Expenses
WITH level1_flows AS (
  -- Income sources → Checking
  SELECT 'Income' AS source, 'Checking' AS target, SUM(amount) AS value
  FROM transactions
  WHERE category = 'Income'
),
level2_flows AS (
  -- Checking → Categories
  SELECT 'Checking' AS source, category AS target, SUM(amount) AS value
  FROM transactions
  WHERE from_account = 'Checking' AND category != 'Transfer'
  GROUP BY category
),
level3_flows AS (
  -- Categories → Specific accounts
  SELECT category AS source, to_account AS target, SUM(amount) AS value
  FROM transactions
  WHERE from_account = 'Checking' AND category NOT IN ('Income', 'Transfer')
  GROUP BY category, to_account
)
SELECT * FROM level1_flows
UNION ALL
SELECT * FROM level2_flows
UNION ALL
SELECT * FROM level3_flows
ORDER BY value DESC;

This creates a multi-tier flow:

Income → Checking → [Rent, Groceries, Entertainment] → [Landlord, Whole Foods, Netflix]

Try It Yourself

Want to build your own Sankey flow query? Here's the template:

  1. Identify your source and target

    • What entities are connected?
    • What direction does the flow go?
  2. Define your metric

    • What are you measuring? (bytes, dollars, count, etc.)
  3. Aggregate the flows

    SELECT source, target, SUM(metric) AS value
    FROM table
    GROUP BY source, target
  4. Rank and filter

    WHERE rank <= 10
    ORDER BY value DESC
  5. Format for your visualization tool

    • Mermaid syntax
    • D3.js JSON
    • Plotly format
    • Whatever your frontend expects

Wrapping Up

This pattern saved my ass multiple times:

  • Network troubleshooting (identifying top talkers)
  • Budget analysis (personal finance)
  • Application performance (microservice calls)
  • Customer journey mapping (clickstream data)
  • Supply chain visibility (logistics data)

The SQL is straightforward once you see the pattern. The hard part is recognizing when you have a flow problem in the first place.

Next time someone asks "where's all this [traffic/money/data] going?" — you'll know exactly what query to write.


Try it on SQLBook: Head over to sqlbook.io, or fork this book to spin up a PostgreSQL sandbox, and paste in the budget tracking example. Modify it for your own use case. See if you can generate valid Mermaid syntax that renders a flow diagram.

Challenge: Can you modify the query to:

  1. Show only flows above a certain threshold?
  2. Include percentage of total for each flow?
  3. Detect circular flows (A→B→C→A)?
  4. Calculate the "efficiency" of money flows (how many hops from income to final expense)?

Drop your solutions in the comments!


Tags: #SQL #DataVisualization #FlowDiagrams #NetworkAnalysis #PostgreSQL #Budgeting #FlowAnalysis

Difficulty: Intermediate
Topics: Aggregation, Window Functions, Data Visualization, CTEs

SQL Output

Getting Started
1Switch to a SQL tab and select a database
2Write or paste your SQL in the editor
3Press Alt+Enter to run
Tip:Select text in the editor to run only that portion
SQL for Flow Diagrams-Following the Money Updated 1/28/26, 1:08 AM
Notes SQL
Alt+Enter to run