Month: October 2014

Using UNION ALL together with ORDER SIBLINGS BY in hierarchical SQL queries

In Hierarchical queries using the Oracle RDBMS – Determine ordering within each level, I showed how to sort the rows of a hierarchical SQL query within each tree level by using ORDER SIBLINGS BY instead of ORDER BY. There is one detail to keep in mind, though: even the ORDER SIBLINGS BY is done on the overall result of the query, which can cause problems when using UNION ALL to take additional rows into the result:

SELECT 0, 0, 'xyz' FROM DUAL
UNION ALL
SELECT LEVEL, Id, substr( LPAD (' ', 3 * (LEVEL - 1)) || Label, 0, 30) Node
FROM TreeNode
START WITH Parent = 0
CONNECT BY PRIOR Id = Parent
ORDER SIBLINGS BY Label;

ERROR at line 7:
ORA-30929: ORDER SIBLINGS BY clause not allowed here

Using parantheses to explicitly group the second select within the UNION ALL does not help, but there is a simple solution: just wrap the hierarchical query into a sub select, and then do a SELECT * from the sub select:

SELECT 0, 0, 'xyz' FROM DUAL
UNION ALL
SELECT * FROM (
	SELECT LEVEL, Id, substr( LPAD (' ', 3 * (LEVEL - 1)) || Label, 0, 30) Node
	FROM TreeNode
	START WITH Parent = 0
	CONNECT BY PRIOR Id = Parent
	ORDER SIBLINGS BY Label
);

                 0                  0 'XYZ'
------------------ ------------------ ---------
                 0                  0 xyz
                 1                  1 Root Node
                 2                  2    Node 1
                 3                  3       Node 1.1
                 3                  4       Node 1.2
...

SQL “GROUP BY” by example

GROUP BY example

The GROUP BY clause is used to group selected rows by a certain expression, means it groups all rows for which the given expression returns the same value. Then, for each group, a single result is calculated by using aggregate functions (the number of result rows is the same as the number of groups). See Oracle® Database SQL Language Reference: group_by_clause for more information. As an example, lets assume we have some revenue information for each month:

Description Month     Revenue
----------- --------- -------
Deal 1      January   100
Deal 2      January   150
Deal 3      February  200
Deal 4      February  50
Deal 5      March     120

Using GROUP BY, we can group the data by Month and calculate a single result for each group, for example the summary of all revenues for each month:

Description Month     Revenue
----------- --------- -------

Deal 1      January   100
Deal 2      January   150
===========================
             Summary: 250

Deal 3      February  200
Deal 4      February  50
===========================
             Summary: 250

Deal 5      March     120
=========================== 
             Summary: 120

The result will then only contain the three rows with the summarized revenue values.

GROUP BY SQL query

Lets see how this works in SQL – first, we create a table with some data, according to the example from above:

CREATE TABLE t_revenue (
   c_id          NUMBER(18) NOT NULL,
   c_description VARCHAR2(100) NOT NULL,
   c_month       VARCHAR2(20) NOT NULL,
   c_amount      NUMBER(18) NOT NULL
);

INSERT INTO t_revenue VALUES(0, 'Deal 1', 'January', 100);
INSERT INTO t_revenue VALUES(1, 'Deal 2', 'January', 150);
INSERT INTO t_revenue VALUES(2, 'Deal 3', 'February', 200);
INSERT INTO t_revenue VALUES(3, 'Deal 4', 'February', 50);
INSERT INTO t_revenue VALUES(4, 'Deal 5', 'March', 120);
COMMIT;

The most simple example calculates the summary of all revenues for each month, as above:

SELECT SUM(c_amount)    -- The expression (aggregate function) which is applied to each group of rows
FROM t_revenue
GROUP BY c_month;       -- The expression by which to group the rows

SUM(C_AMOUNT)
-------------
          120
          250
          250

While this query works as expected, we can not see which month the results refer to – but we can simply add the month column to the result:

SELECT c_month, SUM(c_amount)
FROM t_revenue
GROUP BY c_month;

C_MONTH              SUM(C_AMOUNT)
-------------------- -------------
March                          120
January                        250
February                       250

(Note that, with this data model, it is not easy to sort by the month – we can not simply ORDER BY c_month because that would cause “February” to appear before “January”).

ORA-00979: not a GROUP BY expression

One error which often occurs with GROUP BY is ORA-00979: not a GROUP BY expression. That means what we have added an expression to the projection of the SQL statement (the columns which we want to select) which would not result in a single value. For example, we could try to add the c_description column as follows:

SELECT c_description, c_month, SUM(c_amount)
FROM t_revenue
GROUP BY c_month;

But since the c_description column can not be condensed to a single value for each group (the values neither have the same value within the group as specified in the GROUP BY clause, nor are they condensed to a single value through an aggregate function) we get:

SELECT c_description, c_month, SUM(c_amount)
       *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

However, we could use another aggregate function, for example LISTAGG which returns a single result by concatenating the values of a given column for each row within the group. The LISTAGG function also requires some additional syntax to define the ordering of the values within the list. See the LISTAGG documentation for more information.

SELECT LISTAGG(c_description, ';') WITHIN GROUP (ORDER BY c_description) Deals, 
       c_month, SUM(c_amount)
FROM t_revenue
GROUP BY c_month;

DEALS            C_MONTH      SUM(C_AMOUNT)
---------------- ------------ -------------
Deal 3;Deal 4    February               250
Deal 1;Deal 2    January                250
Deal 5           March                  120

What is a schema in an Oracle database?

There is sometimes a little confusion about the term Schema in Oracle databases. In General, a database schema is a collection of database objects (Tables, Views, Packages, …) which conceptually belong together. However, in Oracle RDBMS, a schema is also tightly coupled to a user. Essentially, once there is a user, there is a schema (which can be empty though, as for user “RUN” in the following diagram):

The Schema name is the same as the user name – generally, in Oracle databases, the terms “User” and “Schema” are the same. A typical deployment scenario is to have a runtime user (“RUN” in the example above) who does not own any database objects, and an owning user who owns the database objects. The application then accesses the database through the runtime user, which can be granted or denied access to the various database objects which are owned by the owning user. See also Difference between a User and a Schema in Oracle? on stackoverflow for more information.