Get the most sold product for each country from the NORTHWIND database

Question:

Good day guys, I've been struggling with this for the past day and I just can't seem to figure it out.

My task is to derive the most sold product for each country from the popular open-source database NORTHWIND: https://northwinddatabase.codeplex.com

I was able to get to this stage. Here is my code in SQL Server:

--Get most sold product for each country
WITH TotalProductsSold AS 
(
    SELECT od.ProductID, SUM(od.Quantity) AS TotalSold
        FROM [Order Details] AS od
        GROUP BY od.ProductID
)
SELECT MAX(TotalProductsSold.TotalSold) AS MostSoldQuantity, s.Country --,p.ProductName
    FROM Products AS p
    INNER JOIN TotalProductsSold
    ON TotalProductsSold.ProductID = p.ProductID
    INNER JOIN Suppliers AS s
    ON s.SupplierID = p.SupplierID
    GROUP BY s.Country
    ORDER BY MostSoldQuantity DESC

This gives me one row per country with its MostSoldQuantity.

That's all good, but I want to find out the product name that goes with the MostSoldQuantity.

Thanks a lot!

P.S. I put a comment, --p.ProductName, where I thought it would work, but it didn't. If someone could explain why GROUP BY does not automatically let me derive the product name for that row, that would be great.

Answer:

First, start with the total quantity of products sold per country, not just per product. Then rank them and pick only the rows with rank = 1. Something like...

WITH
    ProductQuantityByCountry AS 
(
    -- Total quantity sold per supplier country, per product
    SELECT
       s.Country,
       p.ProductID,
       SUM(od.Quantity)   AS Quantity
    FROM
        [Order Details]   AS od
    INNER JOIN
        Products          AS p
            ON  p.ProductID = od.ProductID
    INNER JOIN
        Suppliers         AS s
            ON  s.SupplierID = p.SupplierID
    GROUP BY
       s.Country,
       p.ProductID
),
    RankedProductQuantityByCountry
AS
(
    -- Rank the products within each country, highest quantity first
    SELECT
        RANK() OVER (PARTITION BY Country ORDER BY Quantity DESC)  AS countryRank,
        *
    FROM
        ProductQuantityByCountry
)
-- Keep only the top-ranked product(s) for each country
SELECT
    *
FROM
    RankedProductQuantityByCountry
WHERE
    countryRank = 1

Note that one country may supply identical quantities of different products, so two products could both have rank = 1. Look into ROW_NUMBER() and/or DENSE_RANK() for other but similar behaviours to RANK().
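To make the difference between the three functions concrete, here is a minimal sketch in T-SQL. The Sales rows are made-up sample data, not taken from Northwind, and are chosen so that two products tie for first place.

```sql
-- Hypothetical sample data: products 1 and 2 tie on quantity within one country.
WITH Sales (Country, ProductID, Quantity) AS
(
    SELECT *
    FROM (VALUES
        ('USA', 1, 100),
        ('USA', 2, 100),
        ('USA', 3,  80)
    ) AS v (Country, ProductID, Quantity)
)
SELECT
    Country,
    ProductID,
    Quantity,
    -- Ties share a rank, and the next rank is skipped: 1, 1, 3
    RANK()       OVER (PARTITION BY Country ORDER BY Quantity DESC)            AS rnk,
    -- Ties share a rank, and no rank is skipped: 1, 1, 2
    DENSE_RANK() OVER (PARTITION BY Country ORDER BY Quantity DESC)            AS denseRnk,
    -- Every row gets a distinct number; ProductID is added as a tie-breaker: 1, 2, 3
    ROW_NUMBER() OVER (PARTITION BY Country ORDER BY Quantity DESC, ProductID) AS rowNum
FROM
    Sales
ORDER BY
    ProductID;
```

Filtering on rnk = 1 or denseRnk = 1 would return both tied products, while rowNum = 1 would return exactly one. Without the extra ProductID in the ORDER BY, which of the tied rows gets rowNum = 1 is not guaranteed.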

A simple thought exercise to cover why SQL doesn't let you put ProductName in your final query is to ask a question.

What should SQL do in this case?

SELECT
    MAX(TotalProductsSold.TotalSold) AS MostSoldQuantity,
    MIN(TotalProductsSold.TotalSold) AS LeastSoldQuantity,
    s.Country,
    p.ProductName
FROM
    blahblahblah
GROUP BY
    s.Country
ORDER BY
    MostSoldQuantity DESC

The presence of both a MIN and a MAX makes things ambiguous: should ProductName come from the row with the highest quantity or from the row with the lowest?

It may be clear to you that you want to perform an operation by country, and that the operation is to pick the product with the highest sales volume from that country. But it's not actually explicit, and small changes to the query could have very confusing consequences for any inferred behaviour. Instead, SQL's declarative syntax provides a very clear, explicit and deterministic description of the problem to be solved.

If an expression isn't mentioned in the GROUP BY clause, you can't SELECT it without aggregating it. This is so that there is no ambiguity as to what is meant or what the SQL engine is supposed to do.
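As a small illustration of that rule, here is a sketch in T-SQL. The @Sales table variable and its rows are hypothetical sample data, not part of Northwind.

```sql
-- Hypothetical sample data, one row per country and product.
DECLARE @Sales TABLE (Country nvarchar(15), ProductName nvarchar(40), Quantity int);

INSERT INTO @Sales (Country, ProductName, Quantity)
VALUES
    (N'USA', N'Chai',   80),
    (N'USA', N'Tofu',  100),
    (N'UK',  N'Chang',  50);

-- Rejected by SQL Server (Msg 8120): ProductName is neither grouped nor aggregated.
-- SELECT Country, ProductName, MAX(Quantity) FROM @Sales GROUP BY Country;

-- Accepted: ProductName is aggregated. Note that MIN(ProductName) is simply the
-- alphabetically first name in the group, not the name of the best seller.
SELECT
    Country,
    MAX(Quantity)    AS MostSoldQuantity,
    MIN(ProductName) AS FirstNameAlphabetically
FROM
    @Sales
GROUP BY
    Country;

-- Accepted: ProductName is part of the GROUP BY, so there is one row per product.
SELECT
    Country,
    ProductName,
    SUM(Quantity) AS Quantity
FROM
    @Sales
GROUP BY
    Country,
    ProductName;
```

For USA, the first accepted query returns a MostSoldQuantity of 100 next to the name Chai, even though Chai sold 80. Each aggregate is computed independently over the group, which is why the engine cannot guess which row's ProductName you meant.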

By requiring you to stipulate "get the total sales per country per product" at one level of the query, you can then cleanly state "and then pick the highest ranked per country" at another level of the query.
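The two levels can be sketched end to end on a tiny data set. This is only an illustration under stated assumptions: the @OrderLines table variable and its rows are made up, and the country and product name are stored directly on each row instead of being joined in from Suppliers and Products.

```sql
-- Hypothetical sample data: several order lines per country and product.
DECLARE @OrderLines TABLE (Country nvarchar(15), ProductName nvarchar(40), Quantity int);

INSERT INTO @OrderLines (Country, ProductName, Quantity)
VALUES
    (N'USA', N'Chai',   30),
    (N'USA', N'Chai',   50),
    (N'USA', N'Tofu',  100),
    (N'UK',  N'Chang',  20),
    (N'UK',  N'Chang',  30),
    (N'UK',  N'Ikura',  40);

WITH
    ProductTotals AS
(
    -- Level 1: get the total sales per country per product
    SELECT
        Country,
        ProductName,
        SUM(Quantity) AS TotalSold
    FROM
        @OrderLines
    GROUP BY
        Country,
        ProductName
),
    Ranked AS
(
    -- Level 2: rank the products within each country
    SELECT
        Country,
        ProductName,
        TotalSold,
        RANK() OVER (PARTITION BY Country ORDER BY TotalSold DESC) AS countryRank
    FROM
        ProductTotals
)
-- Pick the highest ranked product(s) per country, with the product name attached
SELECT
    Country,
    ProductName,
    TotalSold AS MostSoldQuantity
FROM
    Ranked
WHERE
    countryRank = 1
ORDER BY
    MostSoldQuantity DESC;
```

With this sample data the result is Tofu (100) for USA and Chang (50) for UK. ProductName is available in the final SELECT because it was part of the GROUP BY at the first level.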

This can feel like you end up with queries that are longer than "should" be necessary. But it also results in queries that are completely unambiguous, both for compiling the query down to an execution plan and for other coders who will read your code in the future.
