AGGREGATE AND ANALYSIS FUNCTIONS AS
WELL AS WINDOW FUNCTION
Once
your database has been loaded with data, your users or applications will, of
course, want to use that data to run queries, perform analysis, produce
reports, extract data, and so forth. Oracle Database 11g provides many
sophisticated aggregation and analysis functions that can help ease the pain
sometimes associated with analyzing data in large databases.
Aggregation Functions
Oracle
Database 11g provides extensions to the standard SQL group by clause of
the select statement that generate other totals as part of the result set that
previously required multiple queries, nested subqueries, or importing into
spreadsheet type applications. These extensions are rollup and cube.
Rollup
The
rollup extension generates subtotals for attributes specified in the group by
clause, plus another row representing the grand total. The following is an
example of the rollup extension
select c.cust_gender gender,
b.channel_class
channel_class,
to_char(a.time_id, 'yyyy-mm') month,
count(*) unit_count,
sum(a.amount_sold) amount_sold
from sales a, channels b,
customers c
where a.channel_id = b.channel_id
and a.cust_id = c.cust_id
and to_char(a.time_id, 'yyyy-mm') between '2001-01' and '2001-02'
group by rollup(c.cust_gender,
b.channel_class,
to_char(a.time_id, 'yyyy-mm'));
/*
---LOGIC
counts and sums of
amount_sold are returned at the following levels:
1-By GENDER,
CHANNEL_CLASS, and MONTH
2-Subtotals by
CHANNEL_CLASS within GENDER
3-Subtotals by GENDER
4-Grand total
*/
GENDER
|
CHANNEL_CLASS
|
MONTH
|
UNIT_COUNT
|
AMOUNT_SOLD
|
F
|
Direct
|
2001-01
|
4001
|
387000.9
|
F
|
Direct
|
2001-02
|
3208
|
365860.13
|
F
|
Direct
|
7209
|
752861.03
|
|
F
|
Others
|
2001-01
|
2486
|
242615.9
|
F
|
Others
|
2001-02
|
2056
|
229633.52
|
F
|
Others
|
4542
|
472249.42
|
|
F
|
Indirect
|
2001-01
|
1053
|
138395.21
|
F
|
Indirect
|
2001-02
|
1470
|
189425.88
|
F
|
Indirect
|
2523
|
327821.09
|
|
F
|
14274
|
1552931.54
|
||
M
|
Direct
|
2001-01
|
7038
|
719146.28
|
M
|
Direct
|
2001-02
|
6180
|
641192.61
|
M
|
Direct
|
13218
|
1360338.89
|
|
M
|
Others
|
2001-01
|
4310
|
414603.03
|
M
|
Others
|
2001-02
|
3751
|
391792.61
|
M
|
Others
|
8061
|
806395.64
|
|
M
|
Indirect
|
2001-01
|
1851
|
211947.81
|
M
|
Indirect
|
2001-02
|
2520
|
285219.79
|
M
|
Indirect
|
4371
|
497167.6
|
|
M
|
25650
|
2663902.13
|
||
39924
|
4216833.67
|
--Cube
The cube extension takes rollup a step further by generating
subtotals for each combination of the group by attributes, totals by attribute,
and the grand total. The following is an example of the cube extension
select c.cust_gender gender,
b.channel_class
channel_class,
to_char(a.time_id, 'yyyy-mm') month,
count(*) unit_count,
sum(a.amount_sold) amount_sold
from sales a, channels b,
customers c
where a.channel_id = b.channel_id
and a.cust_id = c.cust_id
and to_char(a.time_id, 'yyyy-mm') between '2001-01' and '2001-02'
group by cube(c.cust_gender,
b.channel_class,
to_char(a.time_id, 'yyyy-mm'));
GENDER
|
CHANNEL_CLASS
|
MONTH
|
UNIT_COUNT
|
AMOUNT_SOLD
|
39924
|
4216833.67
|
|||
2001-01
|
20739
|
2113709.13
|
||
2001-02
|
19185
|
2103124.54
|
||
Direct
|
20427
|
2113199.92
|
||
Direct
|
2001-01
|
11039
|
1106147.18
|
|
Direct
|
2001-02
|
9388
|
1007052.74
|
|
Others
|
12603
|
1278645.06
|
||
Others
|
2001-01
|
6796
|
657218.93
|
|
Others
|
2001-02
|
5807
|
621426.13
|
|
Indirect
|
6894
|
824988.69
|
||
Indirect
|
2001-01
|
2904
|
350343.02
|
|
Indirect
|
2001-02
|
3990
|
474645.67
|
|
F
|
14274
|
1552931.54
|
||
F
|
2001-01
|
7540
|
768012.01
|
|
F
|
2001-02
|
6734
|
784919.53
|
|
F
|
Direct
|
7209
|
752861.03
|
|
F
|
Direct
|
2001-01
|
4001
|
387000.9
|
F
|
Direct
|
2001-02
|
3208
|
365860.13
|
F
|
Others
|
4542
|
472249.42
|
|
F
|
Others
|
2001-01
|
2486
|
242615.9
|
F
|
Others
|
2001-02
|
2056
|
229633.52
|
F
|
Indirect
|
2523
|
327821.09
|
|
F
|
Indirect
|
2001-01
|
1053
|
138395.21
|
F
|
Indirect
|
2001-02
|
1470
|
189425.88
|
M
|
25650
|
2663902.13
|
||
M
|
2001-01
|
13199
|
1345697.12
|
|
M
|
2001-02
|
12451
|
1318205.01
|
|
M
|
Direct
|
13218
|
1360338.89
|
|
M
|
Direct
|
2001-01
|
7038
|
719146.28
|
M
|
Direct
|
2001-02
|
6180
|
641192.61
|
M
|
Others
|
8061
|
806395.64
|
|
M
|
Others
|
2001-01
|
4310
|
414603.03
|
M
|
Others
|
2001-02
|
3751
|
391792.61
|
M
|
Indirect
|
4371
|
497167.6
|
|
M
|
Indirect
|
2001-01
|
1851
|
211947.81
|
M
|
Indirect
|
2001-02
|
2520
|
285219.79
|
---rank
and dense_rank Functions
select prod_id,
sum(quantity_sold),
rank () over (order by sum(quantity_sold) desc) as rank,
dense_rank () over (order by sum(quantity_sold) desc) as dense_rank
from sales
where to_char(time_id, 'yyyy-mm') = '2001-06'
group by prod_id;
PROD_ID
|
SUM(QUANTITY_SOLD)
|
RANK
|
DENSE_RANK
|
24
|
762
|
1
|
1
|
30
|
627
|
2
|
2
|
147
|
578
|
3
|
3
|
33
|
552
|
4
|
4
|
133
|
550
|
5
|
5
|
40
|
550
|
5
|
5
|
48
|
541
|
7
|
6
|
120
|
538
|
8
|
7
|
23
|
535
|
9
|
8
|
119
|
512
|
10
|
9
|
124
|
503
|
11
|
10
|
140
|
484
|
12
|
11
|
148
|
472
|
13
|
12
|
139
|
464
|
14
|
13
|
123
|
459
|
15
|
14
|
131
|
447
|
16
|
15
|
25
|
420
|
17
|
16
|
135
|
415
|
18
|
17
|
137
|
407
|
19
|
18
|
146
|
401
|
20
|
19
|
134
|
393
|
21
|
20
|
45
|
389
|
22
|
21
|
132
|
387
|
23
|
22
|
121
|
375
|
24
|
23
|
16
|
352
|
25
|
24
|
117
|
348
|
26
|
25
|
28
|
343
|
27
|
26
|
130
|
343
|
27
|
26
|
113
|
341
|
29
|
27
|
145
|
340
|
30
|
28
|
116
|
338
|
31
|
29
|
26
|
328
|
32
|
30
|
47
|
325
|
33
|
31
|
41
|
317
|
34
|
32
|
141
|
310
|
35
|
33
|
129
|
282
|
36
|
34
|
42
|
277
|
37
|
35
|
27
|
266
|
38
|
36
|
32
|
264
|
39
|
37
|
138
|
257
|
40
|
38
|
118
|
252
|
41
|
39
|
38
|
246
|
42
|
40
|
115
|
246
|
42
|
40
|
114
|
241
|
44
|
41
|
122
|
240
|
45
|
42
|
128
|
240
|
45
|
42
|
142
|
224
|
47
|
43
|
18
|
215
|
48
|
44
|
15
|
214
|
49
|
45
|
20
|
212
|
50
|
46
|
34
|
209
|
51
|
47
|
36
|
204
|
52
|
48
|
43
|
200
|
53
|
49
|
44
|
198
|
54
|
50
|
46
|
197
|
55
|
51
|
17
|
196
|
56
|
52
|
144
|
188
|
57
|
53
|
13
|
173
|
58
|
54
|
22
|
162
|
59
|
55
|
35
|
161
|
60
|
56
|
143
|
161
|
60
|
56
|
19
|
159
|
62
|
57
|
37
|
159
|
62
|
57
|
29
|
158
|
64
|
58
|
31
|
154
|
65
|
59
|
125
|
147
|
66
|
60
|
126
|
123
|
67
|
61
|
39
|
97
|
68
|
62
|
14
|
95
|
69
|
63
|
127
|
91
|
70
|
64
|
21
|
83
|
71
|
65
|
select * from
(select prod_id,
sum(quantity_sold),
rank () over (order by sum(quantity_sold) desc) as rank,
dense_rank () over (order by sum(quantity_sold) desc) as dense_rank
from sales
where to_char(time_id, 'yyyy-mm') = '2001-06'
group by prod_id)
where rank < 11;
PROD_ID
|
SUM(QUANTITY_SOLD)
|
RANK
|
DENSE_RANK
|
24
|
762
|
1
|
1
|
30
|
627
|
2
|
2
|
147
|
578
|
3
|
3
|
33
|
552
|
4
|
4
|
133
|
550
|
5
|
5
|
40
|
550
|
5
|
5
|
48
|
541
|
7
|
6
|
120
|
538
|
8
|
7
|
23
|
535
|
9
|
8
|
119
|
512
|
10
|
9
|
Cume_dist
select prod_id,
sum(quantity_sold),
cume_dist () over (order by sum(quantity_sold) asc) as cume_dist
from sales
group by prod_id
order by sum(quantity_sold) desc;
PROD_ID
|
SUM(QUANTITY_SOLD)
|
CUME_DIST
|
24
|
762
|
1
|
30
|
627
|
0.985915493
|
147
|
578
|
0.971830986
|
33
|
552
|
0.957746479
|
40
|
550
|
0.943661972
|
133
|
550
|
0.943661972
|
48
|
541
|
0.915492958
|
120
|
538
|
0.901408451
|
23
|
535
|
0.887323944
|
percent_rank Functions
The percent_rank
function is similar to the cume_dist function, but calculates a percentage
ranking of a value relative to its group. Again, without getting into the
theory, we can make some points about percent_rank:
· The range of values returned by the function is always
between 0 and 1.
· The row with a rank of 1 will have a percent rank of 0.
· The formula for calculating percent rank is
select prod_id,
sum(quantity_sold),
cume_dist () over (order by sum(quantity_sold) asc) as cume_dist
from sales
where to_char(time_id, 'yyyy-mm') = '2001-06'
group by prod_id
order by sum(quantity_sold) desc;
PROD_ID
|
SUM(QUANTITY_SOLD)
|
CUME_DIST
|
24
|
762
|
1
|
30
|
627
|
0.985915493
|
147
|
578
|
0.971830986
|
33
|
552
|
0.957746479
|
40
|
550
|
0.943661972
|
133
|
550
|
0.943661972
|
48
|
541
|
0.915492958
|
The ntile Function
select b.prod_subcategory,
sum(a.quantity_sold),
ntile(4) over (ORDER BY SUM(a.quantity_sold) desc) as quartile
from sales a, products b
where a.prod_id = b.prod_id
and to_char(a.time_id, 'yyyy-mm') = '2001-06'
group by b.prod_subcategory;
PROD_SUBCATEGORY
|
SUM(A.QUANTITY_SOLD)
|
QUARTILE
|
Accessories
|
3230
|
1
|
Y Box Games
|
2572
|
1
|
Recordable CDs
|
2278
|
1
|
Camera Batteries
|
2192
|
1
|
Recordable DVD Discs
|
2115
|
1
|
Documentation
|
1931
|
1
|
Modems/Fax
|
1314
|
2
|
CD-ROM
|
1076
|
2
|
Y Box Accessories
|
1050
|
2
|
Printer Supplies
|
956
|
2
|
Memory
|
748
|
2
|
Camera Media
|
664
|
3
|
Home Audio
|
370
|
3
|
Game Consoles
|
352
|
3
|
Operating Systems
|
343
|
3
|
Bulk Pack Diskettes
|
270
|
3
|
Portable PCs
|
215
|
4
|
Desktop PCs
|
214
|
4
|
Camcorders
|
196
|
4
|
Monitors
|
178
|
4
|
Cameras
|
173
|
4
|
Row_Number
Function
select b.prod_subcategory,
sum(a.quantity_sold),
ntile(4) over (ORDER BY SUM(a.quantity_sold) desc) as quartile,
row_number () over (order by sum(quantity_sold) desc) as rownumber
from
sales a, products b
where a.prod_id = b.prod_id
and to_char(a.time_id, 'yyyy-mm') = '2001-06'
group by b.prod_subcategory;
PROD_SUBCATEGORY
|
SUM(A.QUANTITY_SOLD)
|
QUARTILE
|
ROWNUMBER
|
Accessories
|
3230
|
1
|
1
|
Y Box Games
|
2572
|
1
|
2
|
Recordable CDs
|
2278
|
1
|
3
|
Camera Batteries
|
2192
|
1
|
4
|
Recordable DVD Discs
|
2115
|
1
|
5
|
Documentation
|
1931
|
1
|
6
|
Modems/Fax
|
1314
|
2
|
7
|
CD-ROM
|
1076
|
2
|
8
|
Y Box Accessories
|
1050
|
2
|
9
|
Printer Supplies
|
956
|
2
|
10
|
Memory
|
748
|
2
|
11
|
Camera Media
|
664
|
3
|
12
|
Home Audio
|
370
|
3
|
13
|
Game Consoles
|
352
|
3
|
14
|
Operating Systems
|
343
|
3
|
15
|
Bulk Pack Diskettes
|
270
|
3
|
16
|
Portable PCs
|
215
|
4
|
17
|
Desktop PCs
|
214
|
4
|
18
|
Camcorders
|
196
|
4
|
19
|
Monitors
|
178
|
4
|
20
|
Cameras
|
173
|
4
|
21
|
Windowing
Functions
Windowing functions can only be used in
the select and order by clauses. They can be used to calculate the following:
·
Moving sum
·
Moving average
·
Cumulative sum
·
Statistical functions
select b.prod_category,
to_char(a.time_id, 'yyyy-mm'),
sum(a.quantity_sold),
sum(sum(a.quantity_sold)) over (partition by b.prod_category
order by to_char(a.time_id, 'yyyy-mm')
rows unbounded preceding)
as cume_sum
from sales a,
products b
where a.prod_id
= b.prod_id
and b.prod_category_id
between 202 and
204
and to_char(a.time_id, 'yyyy-mm') between '2001-01' and '2001-06'
group by
b.prod_category,
to_char(a.time_id, 'yyyy-mm')
order by b.prod_category, to_char(a.time_id, 'yyyy-mm');
PROD_CATEGORY
|
TO_CHAR(A.TIME_ID,'YYYY-MM')
|
SUM(A.QUANTITY_SOLD)
|
CUME_SUM
|
Hardware
|
2001-01
|
281
|
281
|
Hardware
|
2001-02
|
306
|
587
|
Hardware
|
2001-03
|
1029
|
|
Hardware
|
2001-04
|
439
|
1468
|
Hardware
|
2001-05
|
413
|
1881
|
Hardware
|
2001-06
|
429
|
2310
|
Peripherals and Accessories
|
2001-01
|
5439
|
5439
|
Peripherals and
Accessories
|
2001-02
|
5984
|
11423
|
Peripherals and
Accessories
|
2001-03
|
5104
|
16527
|
Peripherals and
Accessories
|
2001-04
|
5619
|
22146
|
Peripherals and
Accessories
|
2001-05
|
4955
|
27101
|
Peripherals and
Accessories
|
2001-06
|
5486
|
32587
|
Photo
|
2001-01
|
2802
|
2802
|
Photo
|
2001-02
|
2220
|
5022
|
Photo
|
2001-03
|
2982
|
8004
|
Photo
|
2001-04
|
2824
|
10828
|
Photo
|
2001-05
|
2359
|
13187
|
Photo
|
2001-06
|
3225
|
16412
|
No comments:
Post a Comment