Year: 2014

SQL “GROUP BY” by example

GROUP BY example

The GROUP BY clause is used to group selected rows by a certain expression, means it groups all rows for which the given expression returns the same value. Then, for each group, a single result is calculated by using aggregate functions (the number of result rows is the same as the number of groups). See Oracle® Database SQL Language Reference: group_by_clause for more information. As an example, lets assume we have some revenue information for each month:

Description Month     Revenue
----------- --------- -------
Deal 1      January   100
Deal 2      January   150
Deal 3      February  200
Deal 4      February  50
Deal 5      March     120

Using GROUP BY, we can group the data by Month and calculate a single result for each group, for example the summary of all revenues for each month:

Description Month     Revenue
----------- --------- -------

Deal 1      January   100
Deal 2      January   150
===========================
             Summary: 250

Deal 3      February  200
Deal 4      February  50
===========================
             Summary: 250

Deal 5      March     120
=========================== 
             Summary: 120

The result will then only contain the three rows with the summarized revenue values.

GROUP BY SQL query

Lets see how this works in SQL – first, we create a table with some data, according to the example from above:

CREATE TABLE t_revenue (
   c_id          NUMBER(18) NOT NULL,
   c_description VARCHAR2(100) NOT NULL,
   c_month       VARCHAR2(20) NOT NULL,
   c_amount      NUMBER(18) NOT NULL
);

INSERT INTO t_revenue VALUES(0, 'Deal 1', 'January', 100);
INSERT INTO t_revenue VALUES(1, 'Deal 2', 'January', 150);
INSERT INTO t_revenue VALUES(2, 'Deal 3', 'February', 200);
INSERT INTO t_revenue VALUES(3, 'Deal 4', 'February', 50);
INSERT INTO t_revenue VALUES(4, 'Deal 5', 'March', 120);
COMMIT;

The most simple example calculates the summary of all revenues for each month, as above:

SELECT SUM(c_amount)    -- The expression (aggregate function) which is applied to each group of rows
FROM t_revenue
GROUP BY c_month;       -- The expression by which to group the rows

SUM(C_AMOUNT)
-------------
          120
          250
          250

While this query works as expected, we can not see which month the results refer to – but we can simply add the month column to the result:

SELECT c_month, SUM(c_amount)
FROM t_revenue
GROUP BY c_month;

C_MONTH              SUM(C_AMOUNT)
-------------------- -------------
March                          120
January                        250
February                       250

(Note that, with this data model, it is not easy to sort by the month – we can not simply ORDER BY c_month because that would cause “February” to appear before “January”).

ORA-00979: not a GROUP BY expression

One error which often occurs with GROUP BY is ORA-00979: not a GROUP BY expression. That means what we have added an expression to the projection of the SQL statement (the columns which we want to select) which would not result in a single value. For example, we could try to add the c_description column as follows:

SELECT c_description, c_month, SUM(c_amount)
FROM t_revenue
GROUP BY c_month;

But since the c_description column can not be condensed to a single value for each group (the values neither have the same value within the group as specified in the GROUP BY clause, nor are they condensed to a single value through an aggregate function) we get:

SELECT c_description, c_month, SUM(c_amount)
       *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

However, we could use another aggregate function, for example LISTAGG which returns a single result by concatenating the values of a given column for each row within the group. The LISTAGG function also requires some additional syntax to define the ordering of the values within the list. See the LISTAGG documentation for more information.

SELECT LISTAGG(c_description, ';') WITHIN GROUP (ORDER BY c_description) Deals, 
       c_month, SUM(c_amount)
FROM t_revenue
GROUP BY c_month;

DEALS            C_MONTH      SUM(C_AMOUNT)
---------------- ------------ -------------
Deal 3;Deal 4    February               250
Deal 1;Deal 2    January                250
Deal 5           March                  120

What is a schema in an Oracle database?

There is sometimes a little confusion about the term Schema in Oracle databases. In General, a database schema is a collection of database objects (Tables, Views, Packages, …) which conceptually belong together. However, in Oracle RDBMS, a schema is also tightly coupled to a user. Essentially, once there is a user, there is a schema (which can be empty though, as for user “RUN” in the following diagram):

The Schema name is the same as the user name – generally, in Oracle databases, the terms “User” and “Schema” are the same. A typical deployment scenario is to have a runtime user (“RUN” in the example above) who does not own any database objects, and an owning user who owns the database objects. The application then accesses the database through the runtime user, which can be granted or denied access to the various database objects which are owned by the owning user. See also Difference between a User and a Schema in Oracle? on stackoverflow for more information.

Passing an array to a function which expects constant array elements

In C, I recently wanted to pass a pointer to an array of chars to a function, and since the function is not allowed to modify the array contents, I declared it as const. The rationale behind this is

char (*param)[20];

defines a pointer param to an array of non-constant chars, and

const char (*param)[20];

defines a pointer param to an array of constant chars. So, the straight forward definition of the function looked like this, and I also verified that the compiler properly catches attempts to write to the array elements:

void f(const char (*param)[20]) {
   /* (*param)[2] = 'A'; */ /* correctly catched by the compiler: assignment of read-only location '(*param)[2]' */
}

Then, I wanted to call the function as follows:

int main() {
   char (*data)[20] = calloc(20, 1);

   f(data);

   return 0;
}

But, I got the following warning when compiling this code:

$  gcc -Wall -pedantic -o so so.c
so.c: In function 'main':
so.c:15:4: warning: passing argument 1 of 'f' from incompatible pointer type [enabled by default]
    f(data);
    ^
so.c:3:6: note: expected 'const char (*)[20]' but argument is of type 'char (*)[20]'
 void f(const char (*param)[20]) {
      ^

Why does this happen? It should be possible to pass a pointer to any non-const data to a function which expects a pointer to const data – this is for example the way how the standard string functions are declared, like strcpy(). The source parameter is a const char*, but we can pass both a pointer to const or to non-const data. As long as the called function does not modify the contents the pointer points to, this should work well. While trying to understand why the above does not work, it turned out that “this is how C works”, and the GCC FAQ explains it: T (*)[n] is not assignment compatible to const T (*)[n] (note though that T* would be assignment compatible to const T* as described above). The C FAQ (if the site is down, use something like https://archive.today/mGvX7 instead) boils this down to the fact that it is not allowed to assign a char** pointer to a const char** pointer. It is only allowed for simple pointers to data (as in the strcpy() example above), but not for pointers to pointers (at any level, recursively). The FAQ also lists the steps which show how the unmodifiable data could be modified without using an explicit cast if this was allowed. Hence the reason that the compiler correctly prints this warning.

Solutions

The simplest solution is to remove the const from the function’s parameter list. However, this would not allow the compiler to catch write access to the array within the function anymore. If we want to keep the const, we need to cast when calling the function (but explicit casts especially between non-const and const usually indicates other deeper issues, so this should be avoided). A much better solution is to wrap the array in a structure. With this approach, it is possible to keep the const so that the compiler catches modifications of the array, and the function can be called without warning. The drawback is that accessing the data inside the function requires a bit more effort to reference the array structure member:

#include <stdlib.h>

typedef struct _arr11 { char data[11]; } arr11;

void f(const arr11* param) {
   /* error: assignment of read-only location 'param->data[3]' */
   /* param->data[3] = 'A';*/
}

int main() {
   arr11* data = calloc(sizeof(arr11), 1);

   f(data);    /* No warning! */

   return 0;
}

JCA code sample: encrypting data with AES

The following code fragments show how a text can be encrypted and decrypted with the JCA (Java Cryptography Architecture) API using the AES cryptographic algorithm. Lets assume that we have a text which we want to encrypt, and also a key which we want to use for the encryption (remember that AES is a symmetric encryption algorithm, so the same key is used for both encryption and decryption):

String text = "Sample String which we want to encrypt";
String key = "Bar12345Bar12345";

The first thing we need to do is to get the cipher (the cryptographic algorithm) we want to use, and we also need to create a Key which represents our String-based key we defined above so that it can be used in JCE API calls. Cryptographic algorithms are normally working on binary data, so we also need to get the binary representation of the key first, using an appropriate character set:

Cipher cipher = Cipher.getInstance("AES");
Key aesKey = new SecretKeySpec(key.getBytes(StandardCharsets.ISO_8859_1), "AES");

The Key class has a method getEncoded() which returns the binary representation of the key as a byte[] array. We can use this method do print a hexadecimal dump of the key, and we will get the following output:

0000: 42 61 72 31 32 33 34 35 42 61 72 31 32 33 34 35  Bar12345Bar12345

Note that the key which we created is exactly 16 bytes = 128 bits long. This is the minimum key size supported by AES – and with a default JDK installation, we can not use larger keys. See … for more information how to use larger keys (192 bit and 256 bit). Next, we also need to get the binary representation of the text we want to encrypt, so that it can be fed into the encryption algorithm:

byte[] plaintext = text.getBytes(StandardCharsets.ISO_8859_1);

The dump of the plaintext data looks like this:

0000: 53 61 6D 70 6C 65 20 53 74 72 69 6E 67 20 77 68  Sample String wh
0010: 69 63 68 20 77 65 20 77 61 6E 74 20 74 6F 20 65  ich we want to e
0020: 6E 63 72 79 70 74                                ncrypt

Now, we can use the Ciper.doFinal() method to encrypt the whole plaintext:

cipher.init(Cipher.ENCRYPT_MODE, aesKey);
byte[] ciphertext = cipher.doFinal(plaintext);

The dump of the ciphertext (which is the encrypted data) looks like this:

0000: 29 0F 31 B5 7E D4 BD 02 69 8E C3 8E 87 C9 8A 4A  ).1µ~Ô½.i?Ã??É?J
0010: 2D C6 FB F9 E8 4E C9 F7 34 61 33 9B 46 27 57 49  -ÆûùèNÉ÷4a3?F'WI
0020: 31 44 53 5B 58 C6 1F 8A 99 A0 F5 18 5C EB 6A 05  1DS[XÆ.?? õ.\ëj.

One thing to mention is that the length of the ciphertext has been extended to a multiple of 128 bits – this is the block size which AES is using. Now we can use the ciphertext and decrypt it using the original key:

cipher.init(Cipher.DECRYPT_MODE, aesKey);
byte[] decrypted = cipher.doFinal(ciphertext);

The hexadecimal dump of the decrypted array shows that the data has been properly decrypted:

0000: 53 61 6D 70 6C 65 20 53 74 72 69 6E 67 20 77 68  Sample String wh
0010: 69 63 68 20 77 65 20 77 61 6E 74 20 74 6F 20 65  ich we want to e
0020: 6E 63 72 79 70 74                                ncrypt

The complete runnable sample is available at https://github.com/afester/CodeSamples/blob/master/Java/JCE/src/com/example/AESSample.java.