Querying Maximum Portfolio Value per Client in MySQL Using Multi-Column Grouping and Subqueries

Dec 03, 2025 · Programming · 8 views · 7.8

Keywords: MySQL | GROUP BY | Subquery

Abstract: This article provides an in-depth exploration of complex GROUP BY operations in MySQL, focusing on a practical case study of client portfolio management. It systematically analyzes how to combine subqueries, JOIN operations, and aggregate functions to retrieve the highest portfolio value for each client. The discussion begins with identifying issues in the original query, then constructs a complete solution including test data creation, subquery design, multi-table joins, and grouping optimization, concluding with a comparison of alternative approaches.

Problem Context and Original Query Analysis

In database application development, complex data query scenarios involving multi-table joins and sophisticated grouping are common. The specific case discussed in this article originates from a real Stack Overflow question: the user needs to query the highest portfolio value for each client from client and portfolio tables, where portfolio value is calculated by summing the cash and stocks fields. The key challenge is that a client may have multiple portfolios, and the original query fails to group correctly by client ID, causing duplicate client entries in the results.

The original query code is:

SELECT clients.id, clients.name, portfolios.id, SUM(portfolios.portfolio + portfolios.cash) AS total
FROM clients, portfolios
WHERE clients.id = portfolios.client_id
GROUP BY portfolios.id, clients.id
ORDER BY total DESC
LIMIT 30

The main issue with this query lies in the GROUP BY portfolios.id, clients.id clause. By grouping by both portfolio ID and client ID, each portfolio creates an independent group, even if they belong to the same client. This violates the core requirement of returning only the highest-value portfolio per client.

Test Data Preparation and Basic Query

To clearly demonstrate the solution, we first create test data. The following SQL statements create two tables and insert sample data:

CREATE TABLE client (
    client_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(64)
);

CREATE TABLE portfolio (
    portfolio_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
    client_id INTEGER REFERENCES client.id,
    cash DECIMAL(10,2),
    stocks DECIMAL(10,2)
);

INSERT INTO client (name) VALUES ('John Doe'), ('Jane Doe');

INSERT INTO portfolio (client_id, cash, stocks) VALUES 
    (1, 11.11, 22.22),
    (1, 10.11, 23.22),
    (2, 30.30, 40.40),
    (2, 40.40, 50.50);

If portfolio ID were not required, the problem would be much simpler:

SELECT client_id, name, MAX(cash + stocks)
FROM client JOIN portfolio USING (client_id)
GROUP BY client_id;

This query correctly returns the highest portfolio value for each client but cannot identify which specific portfolio achieved this value.

Complete Solution: Subqueries and Multi-Table Joins

To obtain both portfolio ID and maximum value simultaneously, a more complex query structure is needed. The solution consists of three logical steps:

Step 1: Calculate Maximum Portfolio Value per Client

First, create a subquery that calculates the maximum portfolio value for each client:

SELECT client_id, MAX(cash + stocks) AS maxtotal
FROM portfolio
GROUP BY client_id;

This subquery returns a temporary table (aliased as maxima) containing each client's ID and their corresponding maximum portfolio value.

Step 2: Filter Portfolios That Achieve Maximum Value

Next, join the portfolio table with the subquery result from the previous step, retaining only those portfolios whose value equals the client's maximum:

SELECT portfolio_id, cash + stocks 
FROM portfolio 
JOIN (
    SELECT client_id, MAX(cash + stocks) AS maxtotal 
    FROM portfolio
    GROUP BY client_id
) AS maxima USING (client_id)
WHERE cash + stocks = maxtotal;

Here, USING (client_id) is used for the join, a simplified join syntax requiring the join field to have the same name in both tables. The WHERE cash + stocks = maxtotal condition ensures only portfolios with value equal to the maximum are selected.

Step 3: Incorporate Client Information and Handle Ties

Finally, join the client table to obtain client names and handle potential tie situations (multiple portfolios with the same value for the same client):

SELECT client_id, name, portfolio_id, cash + stocks
FROM client
JOIN portfolio USING (client_id)
JOIN (
    SELECT client_id, MAX(cash + stocks) AS maxtotal
    FROM portfolio 
    GROUP BY client_id
) AS maxima USING (client_id)
WHERE cash + stocks = maxtotal
GROUP BY client_id, cash + stocks;

The final GROUP BY client_id, cash + stocks ensures only one row is returned per client, even if multiple portfolios have the same value. MySQL in this case arbitrarily selects one row from each group (typically the first encountered).

Alternative Approaches and Comparison

In the original Stack Overflow question, another answer proposed a solution using the CONCAT function:

SELECT clients.id, clients.name, portfolios.id, SUM(portfolios.portfolio + portfolios.cash) AS total
FROM clients, portfolios
WHERE clients.id = portfolios.client_id
GROUP BY CONCAT(portfolios.id, "-", clients.id)
ORDER BY total DESC
LIMIT 30

This approach creates "pseudo-grouping" by concatenating portfolio ID and client ID into a single string, but has significant drawbacks:

  1. Unclear logic: CONCAT grouping essentially creates the same groups as the original query (one per portfolio), only with a different form of grouping key
  2. Performance issues: String concatenation and comparison are generally slower than integer comparison
  3. Unpredictable results: When multiple portfolios have the same value, there's no guarantee of returning the expected result

In contrast, the subquery-based solution offers:

  1. Clear logic: Explicitly separates "calculate maximum value" and "filter matching records" steps
  2. Accurate results: Ensures returning the highest-value portfolio for each client
  3. Strong extensibility: Easily modifiable for more complex requirements, such as returning the top N highest-value portfolios

Performance Optimization and Best Practices

In production environments, this type of query may require further optimization:

Indexing Strategy: Creating a composite index on the portfolio table's client_id column and the computed expression (cash + stocks) can significantly improve subquery performance:

CREATE INDEX idx_portfolio_client_value ON portfolio(client_id, (cash + stocks));

Window Function Alternative (MySQL 8.0+): For users with newer MySQL versions, window functions provide a more concise solution:

WITH ranked_portfolios AS (
    SELECT 
        c.client_id,
        c.name,
        p.portfolio_id,
        p.cash + p.stocks AS total,
        ROW_NUMBER() OVER (PARTITION BY p.client_id ORDER BY p.cash + p.stocks DESC) AS rn
    FROM client c
    JOIN portfolio p ON c.client_id = p.client_id
)
SELECT client_id, name, portfolio_id, total
FROM ranked_portfolios
WHERE rn = 1;

This approach uses the ROW_NUMBER() window function to rank portfolios for each client by value in descending order, then selects the top-ranked record.

Conclusion and Summary

This article demonstrates a methodology for handling complex grouping requirements through a specific MySQL query case study:

  1. Problem Analysis: Accurately understanding the issues in the original query is the first step toward solution. In this case, the problem was that the grouping logic failed to meet the core requirement of returning only the highest-value portfolio per client.
  2. Stepwise Solution: Decompose complex problems into multiple logical steps—calculate aggregate values, filter matching records, incorporate related information.
  3. Subquery Application: Subqueries are powerful tools for handling "aggregate then match" scenarios, particularly when combined with JOIN operations.
  4. Tie Handling: When multiple records satisfy conditions, explicit strategies are needed to determine which record to return, typically implemented through additional GROUP BY clauses or window functions.

This subquery-based solution applies not only to portfolio query scenarios but also to various data analysis tasks requiring "records corresponding to maximum/minimum/average values per group," such as sales record analysis, user behavior statistics, and performance monitoring. Mastering this pattern enables developers to solve complex SQL query problems more efficiently.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.