SELECT product_id, toMonth(order_date) AS month, sum(quantity) AS total_quantity
FROM sales -- table name assumed; substitute your own sales table
GROUP BY product_id, month
ORDER BY product_id, month
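This query groups the sales data by product_id and month and calculates the total quantity sold for each combination of product and month. The toMonth() function extracts the month number (1 to 12) from the order_date column, so the same month in different years falls into the same group. If the data spans several years and you want them kept apart, group by toStartOfMonth(order_date) instead.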
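To make the grouping concrete, here is a minimal Python sketch of what the query computes. The sample rows and the function name are hypothetical and exist only for illustration; ClickHouse performs this aggregation internally and far more efficiently.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (product_id, order_date, quantity)
sales = [
    (1, date(2023, 1, 5), 10),
    (1, date(2023, 1, 20), 5),
    (1, date(2023, 2, 3), 7),
    (2, date(2023, 1, 9), 4),
    (2, date(2024, 1, 15), 6),  # same month number, different year
]

def total_quantity_by_product_month(rows):
    """Emulate GROUP BY product_id, toMonth(order_date) with sum(quantity)."""
    totals = defaultdict(int)
    for product_id, order_date, quantity in rows:
        # order_date.month plays the role of toMonth(order_date)
        totals[(product_id, order_date.month)] += quantity
    # Sorting mirrors ORDER BY product_id, month
    return sorted((pid, month, qty) for (pid, month), qty in totals.items())

result = total_quantity_by_product_month(sales)
for row in result:
    print(row)
```

Note that the two January orders for product 2 come from different years but end up in one group, because only the month number is used as the key.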
Example 2: Web Traffic Data
Suppose we have a web traffic table with the following columns: timestamp, ip_address, page_url, user_agent. We want to calculate the number of page views by browser type and operating system. Here’s how we can do this using rollup:
SELECT
    CASE
        WHEN user_agent LIKE '%Firefox%' THEN 'Firefox'
        WHEN user_agent LIKE '%Chrome%' THEN 'Chrome'
    END AS browser,
    CASE
        WHEN user_agent LIKE '%Windows%' THEN 'Windows'
        WHEN user_agent LIKE '%Mac OS%' THEN 'Mac OS'
    END AS os,
    count(*) AS page_views
FROM web_traffic -- table name assumed; substitute your own web traffic table
GROUP BY ROLLUP(browser, os)
ORDER BY browser, os
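This query groups the web traffic data by browser and operating system and counts the page views for each combination. The ROLLUP modifier adds a hierarchy of subtotals that follows the column order: one subtotal row per browser (across all operating systems) and one grand total row. It does not produce subtotals per operating system on its own; that would require CUBE or GROUPING SETS. Because neither CASE expression has an ELSE branch, user agents that match no pattern are grouped under NULL.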
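The set of rows that ROLLUP(browser, os) generates can be sketched in Python as follows. This is an illustration under stated assumptions, not how ClickHouse implements it: the sample data and function name are hypothetical, the browser and os values are assumed to be already extracted from user_agent, and None is just a placeholder for a rolled-up column (how ClickHouse fills those columns depends on column types and settings).

```python
from collections import Counter

# Hypothetical (browser, os) pairs, already derived from user_agent
page_hits = [
    ("Chrome", "Windows"),
    ("Chrome", "Windows"),
    ("Chrome", "Mac OS"),
    ("Firefox", "Windows"),
]

def rollup_counts(rows):
    """Emulate count(*) with GROUP BY ROLLUP(browser, os)."""
    counts = Counter()
    for browser, os_name in rows:
        counts[(browser, os_name)] += 1  # detail level: (browser, os)
        counts[(browser, None)] += 1     # subtotal per browser
        counts[(None, None)] += 1        # grand total
    return dict(counts)

rollup = rollup_counts(page_hits)
for key, page_views in rollup.items():
    print(key, page_views)
```

The sketch never produces a key such as (None, "Windows"): ROLLUP drops columns from right to left, so there is no per-operating-system subtotal.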
Example 3: Employee Data
Suppose we have an employee table with the following columns: employee_id, department, job_title, salary. We want to calculate the average salary by department and job title, and also the average per department, the average per job title, and the overall average for all employees. Here’s how we can do this using cube:
SELECT department, job_title, avg(salary) AS avg_salary
FROM employee -- table name assumed; substitute your own employee table
GROUP BY CUBE(department, job_title)
ORDER BY department, job_title
This query groups the employee data by department and job title and calculates the average salary for each combination. Unlike ROLLUP, the CUBE modifier aggregates over every possible combination of the listed columns, so the result also contains the average per department, the average per job title, and the overall average for all employees.
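As a rough illustration of the grouping sets behind CUBE(department, job_title), the following Python sketch computes the same averages on a few hypothetical rows. The data, the function name, and the use of None as a placeholder for an aggregated-away column are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import product

# Hypothetical rows: (department, job_title, salary)
employees = [
    ("Eng", "Developer", 100),
    ("Eng", "Developer", 120),
    ("Eng", "Manager", 150),
    ("Sales", "Manager", 90),
]

def cube_avg_salary(rows):
    """Emulate avg(salary) with GROUP BY CUBE(department, job_title)."""
    salaries = defaultdict(list)
    for department, job_title, salary in rows:
        # Each row contributes to all 2^2 = 4 grouping sets:
        # (dept, title), (dept), (title), and the grand total ()
        for keep_dept, keep_title in product((True, False), repeat=2):
            key = (department if keep_dept else None,
                   job_title if keep_title else None)
            salaries[key].append(salary)
    return {key: sum(vals) / len(vals) for key, vals in salaries.items()}

cube = cube_avg_salary(employees)
for key, avg_salary in cube.items():
    print(key, avg_salary)
```

With n columns, CUBE produces 2^n grouping sets while ROLLUP produces n + 1, so the result of CUBE grows quickly as columns are added.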
In summary, grouping, rollup, and cube are SQL operations for aggregating data along multiple dimensions or attributes. In ClickHouse, they are expressed through the GROUP BY clause together with the ROLLUP and CUBE modifiers, written either as GROUP BY ROLLUP(...) and GROUP BY CUBE(...) or as GROUP BY ... WITH ROLLUP and GROUP BY ... WITH CUBE. ROLLUP adds hierarchical subtotals and a grand total, while CUBE adds subtotals for every combination of the grouping columns, which lets you analyze large data sets at several levels of detail in a single query.