官方定义:
Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic_clause. For each row, a sliding window of rows is defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time.
Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVINGclauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in the select list or ORDER BY clause.
Analytic functions are commonly used to compute cumulative, moving, centered, and reporting aggregates.
有以下几个关键点:
对一组数据进行计算,返回多行
不需要进行多表联合,提高性能
在所有表连接和所有WHERE, GROUP BY和HAVING字句之后处理,在ORDER BY子句之前处理
只能位于SELECT或者ORDER BY子句
常用analytic_function
arguments个数:0~3
arguments类型:数字类型或可以隐式转为为数字类型的非数字类型
analytic_clause
在FROM,WHERE,GROUP BY和HAVING子句之后进行计算
在SELECT和ORDER BY子句指定带analytic_clause的分析函数
query_partition_clause
根据expr对查询结果进行分组
忽略该语句则查询结果为一个分组
分析函数使用上面的分支,不带括号
Expr可以是常量,字段,非分析函数,函数表达式
order_by_clause
windowing_clause
Type | Used For |
---|---|
Reporting | Calculating shares, for example, market share. Works with these functions: SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, RATIO_TO_REPORT, and new statistical functions. Note that the DISTINCT keyword may be used in those reporting functions that support DISTINCT in aggregate mode. |
Windowing | Calculating cumulative and moving aggregates. Works with these functions: SUM, AVG, MIN, MAX, COUNT, VARIANCE, STDDEV, FIRST_VALUE, LAST_VALUE, and new statistical functions. Note that the DISTINCTkeyword is not supported in windowing functions except for MAX and MIN. |
Ranking | Calculating ranks, percentiles, and n-tiles of the values in a result set. |
LAG/LEAD | Finding a value in a row a specified number of rows from a current row. |
FIRST/LAST | First or last value in an ordered group. |
Hypothetical Rank and Distribution | The rank or percentile that a row would have if inserted into a specified data set. |
select employee_id,last_name,department_id,salary,
avg(salary) over () avg_sal,
max(salary) over () max_sal,
min(salary) over () min_sal,
sum(salary) over () sum_sal,
count(salary) over () count_sal
from employees order by department_id;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID SALARY AVG_SAL MAX_SAL MIN_SAL SUM_SAL COUNT_SAL
----------- --------------- ------------- ---------- ---------- ---------- ---------- ---------- ----------
200 Whalen 10 4400 6461.83178 24000 2100 691416 107
201 Hartstein 20 13000 6461.83178 24000 2100 691416 107
202 Fay 20 6000 6461.83178 24000 2100 691416 107
114 Raphaely 30 11000 6461.83178 24000 2100 691416 107
119 Colmenares 30 2500 6461.83178 24000 2100 691416 107
115 Khoo 30 3100 6461.83178 24000 2100 691416 107
116 Baida 30 2900 6461.83178 24000 2100 691416 107
117 Tobias 30 2800 6461.83178 24000 2100 691416 107
118 Himuro 30 2600 6461.83178 24000 2100 691416 107
203 Mavris 40 6500 6461.83178 24000 2100 691416 107
198 OConnell 50 2600 6461.83178 24000 2100 691416 107
......
select employee_id,last_name,department_id,salary,
avg(salary) over (partition by department_id) avg_sal,
max(salary) over (partition by department_id) max_sal,
min(salary) over (partition by department_id) min_sal,
sum(salary) over (partition by department_id) sum_sal,
count(salary) over (partition by department_id) count_sal
from employees order by department_id;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID SALARY AVG_SAL MAX_SAL MIN_SAL SUM_SAL COUNT_SAL
----------- --------------- ------------- ---------- ---------- ---------- ---------- ---------- ----------
200 Whalen 10 4400 4400 4400 4400 4400 1
201 Hartstein 20 13000 9500 13000 6000 19000 2
202 Fay 20 6000 9500 13000 6000 19000 2
114 Raphaely 30 11000 4150 11000 2500 24900 6
119 Colmenares 30 2500 4150 11000 2500 24900 6
115 Khoo 30 3100 4150 11000 2500 24900 6
116 Baida 30 2900 4150 11000 2500 24900 6
117 Tobias 30 2800 4150 11000 2500 24900 6
118 Himuro 30 2600 4150 11000 2500 24900 6
203 Mavris 40 6500 6500 6500 6500 6500 1
198 OConnell 50 2600 3475.55556 8200 2100 156400 45
......
select employee_id,last_name,e1.department_id,job_id,salary
from employees e1
where e1.salary=(select max(salary) from employees e2 where e1.department_id=e2.department_id)
order by department_id;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID JOB_ID SALARY
----------- --------------- ------------- ---------- ----------
200 Whalen 10 AD_ASST 4400
201 Hartstein 20 MK_MAN 13000
114 Raphaely 30 PU_MAN 11000
203 Mavris 40 HR_REP 6500
121 Fripp 50 ST_MAN 8200
103 Hunold 60 IT_PROG 9000
204 Baer 70 PR_REP 10000
145 Russell 80 SA_MAN 14000
100 King 90 AD_PRES 24000
108 Greenberg 100 FI_MGR 12008
205 Higgins 110 AC_MGR 12008
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 298340369
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 44 | 5 (20)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 44 | 5 (20)| 00:00:01 |
| 2 | NESTED LOOPS | | 1 | 44 | 5 (20)| 00:00:01 |
| 3 | NESTED LOOPS | | 10 | 44 | 5 (20)| 00:00:01 |
| 4 | VIEW | VW_SQ_1 | 1 | 16 | 4 (25)| 00:00:01 |
|* 5 | FILTER | | | | | |
| 6 | HASH GROUP BY | | 1 | 7 | 4 (25)| 00:00:01 |
| 7 | TABLE ACCESS FULL | EMPLOYEES | 107 | 749 | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 |
|* 9 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 28 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter(MAX("SALARY")>0)
8 - access("E1"."DEPARTMENT_ID"="ITEM_1")
9 - filter("E1"."SALARY"="MAX(SALARY)")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
18 consistent gets
0 physical reads
0 redo size
1178 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
11 rows processed
select emp.*
from (select employee_id,last_name,department_id,job_id,salary,
max(salary) over (partition by department_id) max_sal
from employees
order by department_id) emp
where salary=max_sal
order by department_id;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID JOB_ID SALARY MAX_SAL
----------- --------------- ------------- ---------- ---------- ----------
200 Whalen 10 AD_ASST 4400 4400
201 Hartstein 20 MK_MAN 13000 13000
114 Raphaely 30 PU_MAN 11000 11000
203 Mavris 40 HR_REP 6500 6500
121 Fripp 50 ST_MAN 8200 8200
103 Hunold 60 IT_PROG 9000 9000
204 Baer 70 PR_REP 10000 10000
145 Russell 80 SA_MAN 14000 14000
100 King 90 AD_PRES 24000 24000
108 Greenberg 100 FI_MGR 12008 12008
205 Higgins 110 AC_MGR 12008 12008
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID JOB_ID SALARY MAX_SAL
----------- --------------- ------------- ---------- ---------- ----------
178 Grant SA_REP 7000 7000
12 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 720055818
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 6848 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 107 | 6848 | 3 (0)| 00:00:01 |
| 2 | WINDOW SORT | | 107 | 2996 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMPLOYEES | 107 | 2996 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("SALARY"="MAX_SAL")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
1312 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
12 rows processed
可以看到使用分析函数的SQL性能有一定提升。
select employee_id,last_name,department_id,job_id,salary,
sum(salary) over (partition by department_id,job_id) job_sal1,
sum(salary) over (partition by department_id) dept_sal2
from employees
order by department_id;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID JOB_ID SALARY JOB_SAL1 DEPT_SAL2
----------- --------------- ------------- ---------- ---------- ---------- ----------
200 Whalen 10 AD_ASST 4400 4400 4400
201 Hartstein 20 MK_MAN 13000 13000 19000
202 Fay 20 MK_REP 6000 6000 19000
118 Himuro 30 PU_CLERK 2600 13900 24900
119 Colmenares 30 PU_CLERK 2500 13900 24900
115 Khoo 30 PU_CLERK 3100 13900 24900
116 Baida 30 PU_CLERK 2900 13900 24900
117 Tobias 30 PU_CLERK 2800 13900 24900
114 Raphaely 30 PU_MAN 11000 11000 24900
203 Mavris 40 HR_REP 6500 6500 6500
198 OConnell 50 SH_CLERK 2600 64300 156400
......
select department_id,job_id,
sum(salary) job_sal1,
sum(sum(salary)) over (partition by department_id) dept_sal2
from employees
group by department_id,job_id
order by department_id;
DEPARTMENT_ID JOB_ID JOB_SAL1 DEPT_SAL2
------------- ---------- ---------- ----------
10 AD_ASST 4400 4400
20 MK_MAN 13000 19000
20 MK_REP 6000 19000
30 PU_CLERK 13900 24900
30 PU_MAN 11000 24900
40 HR_REP 6500 6500
50 SH_CLERK 64300 156400
50 ST_CLERK 55700 156400
50 ST_MAN 36400 156400
60 IT_PROG 28800 28800
70 PR_REP 10000 10000
DEPARTMENT_ID JOB_ID JOB_SAL1 DEPT_SAL2
------------- ---------- ---------- ----------
80 SA_MAN 61000 304500
80 SA_REP 243500 304500
90 AD_PRES 24000 58000
90 AD_VP 34000 58000
100 FI_ACCOUNT 39600 51608
100 FI_MGR 12008 51608
110 AC_ACCOUNT 8300 20308
110 AC_MGR 12008 20308
SA_REP 7000 7000
20 rows selected.
select emp.*,100 * round(job_sal1/dept_sal2, 2)||'%' Percent
from (select department_id,job_id,
sum(salary) job_sal1,
sum(sum(salary)) over (partition by department_id) dept_sal2
from employees
group by department_id,job_id) emp
where job_sal1>dept_sal2*0.5;
DEPARTMENT_ID JOB_ID JOB_SAL1 DEPT_SAL2 PERCENT
------------- ---------- ---------- ---------- -----------------------------------------
10 AD_ASST 4400 4400 100%
20 MK_MAN 13000 19000 68%
30 PU_CLERK 13900 24900 56%
40 HR_REP 6500 6500 100%
60 IT_PROG 28800 28800 100%
70 PR_REP 10000 10000 100%
80 SA_REP 243500 304500 80%
90 AD_VP 34000 58000 59%
100 FI_ACCOUNT 39600 51608 77%
110 AC_MGR 12008 20308 59%
SA_REP 7000 7000 100%
11 rows selected.
select emp.*
from (select department_id,job_id,
sum(salary) job_sal1,
sum(sum(salary)) over (partition by department_id) dept_sal2,
ratio_to_report(sum(salary)) over (partition by department_id) job_to_dept_sal3
from employees
group by department_id,job_id) emp
where job_to_dept_sal3>0.5;
DEPARTMENT_ID JOB_ID JOB_SAL1 DEPT_SAL2 JOB_TO_DEPT_SAL3
------------- ---------- ---------- ---------- ----------------
10 AD_ASST 4400 4400 1
20 MK_MAN 13000 19000 .684210526
30 PU_CLERK 13900 24900 .558232932
40 HR_REP 6500 6500 1
60 IT_PROG 28800 28800 1
70 PR_REP 10000 10000 1
80 SA_REP 243500 304500 .799671593
90 AD_VP 34000 58000 .586206897
100 FI_ACCOUNT 39600 51608 .767322896
110 AC_MGR 12008 20308 .591294071
SA_REP 7000 7000 1
11 rows selected.
select employee_id,last_name,department_id,hire_date,salary,
ratio_to_report(salary) over(partition by department_id) as pct1,
ratio_to_report(salary) over() as pct2
from employees;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID HIRE_DATE SALARY PCT1 PCT2
----------- --------------- ------------- ------------------ ---------- ---------- ----------
200 Whalen 10 17-SEP-03 4400 1 .006363752
201 Hartstein 20 17-FEB-04 13000 .684210526 .018801995
202 Fay 20 17-AUG-05 6000 .315789474 .008677844
114 Raphaely 30 07-DEC-02 11000 .441767068 .01590938
119 Colmenares 30 10-AUG-07 2500 .100401606 .003615768
115 Khoo 30 18-MAY-03 3100 .124497992 .004483553
116 Baida 30 24-DEC-05 2900 .116465863 .004194291
117 Tobias 30 24-JUL-05 2800 .112449799 .00404966
118 Himuro 30 15-NOV-06 2600 .104417671 .003760399
203 Mavris 40 07-JUN-02 6500 1 .009400997
198 OConnell 50 21-JUN-07 2600 .016624041 .003760399
......
Cumulative Aggregate Function
select employee_id,last_name,department_id,salary,
sum(salary) over (partition by department_id order by department_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) dept_sal_cum1,
sum(salary) over (order by department_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) dept_sal_cum2
from employees;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID SALARY DEPT_SAL_CUM1 DEPT_SAL_CUM2
----------- --------------- ------------- ---------- ------------- -------------
200 Whalen 10 4400 4400 691416
201 Hartstein 20 13000 19000 691416
202 Fay 20 6000 19000 691416
114 Raphaely 30 11000 24900 691416
119 Colmenares 30 2500 24900 691416
115 Khoo 30 3100 24900 691416
116 Baida 30 2900 24900 691416
117 Tobias 30 2800 24900 691416
118 Himuro 30 2600 24900 691416
203 Mavris 40 6500 6500 691416
198 OConnell 50 2600 156400 691416
......
和以下SQL等价:
select employee_id,last_name,department_id,salary,
sum(salary) over (partition by department_id) dept_sal_cum1,
sum(salary) over () dept_sal_cum2
from employees;
select employee_id,last_name,department_id,salary,
sum(salary) over (partition by department_id order by department_id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) dept_sal_cum1,
sum(salary) over (order by department_id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) dept_sal_cum2
from employees;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID SALARY DEPT_SAL_CUM1 DEPT_SAL_CUM2
----------- --------------- ------------- ---------- ------------- -------------
200 Whalen 10 4400 4400 4400
201 Hartstein 20 13000 13000 17400
202 Fay 20 6000 19000 23400
114 Raphaely 30 11000 11000 34400
119 Colmenares 30 2500 13500 36900
115 Khoo 30 3100 16600 40000
116 Baida 30 2900 19500 42900
117 Tobias 30 2800 22300 45700
118 Himuro 30 2600 24900 48300
203 Mavris 40 6500 6500 54800
198 OConnell 50 2600 2600 57400
......
和以下SQL等价:
select employee_id,last_name,department_id,salary,
sum(salary) over (partition by department_id order by department_id ROWS UNBOUNDED PRECEDING) dept_sal_cum1,
sum(salary) over (order by department_id ROWS UNBOUNDED PRECEDING) dept_sal_cum2
from employees;
select employee_id,last_name,department_id,salary,
sum(salary) over (partition by department_id order by department_id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) dept_sal_cum1,
sum(salary) over (partition by department_id order by department_id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING) dept_sal_cum2
from employees;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID SALARY DEPT_SAL_CUM1 DEPT_SAL_CUM2
----------- --------------- ------------- ---------- ------------- -------------
200 Whalen 10 4400 4400
201 Hartstein 20 13000 19000
202 Fay 20 6000 13000 19000
114 Raphaely 30 11000 13500
119 Colmenares 30 2500 11000 16600
115 Khoo 30 3100 13500 19500
116 Baida 30 2900 16600 22300
117 Tobias 30 2800 19500 24900
118 Himuro 30 2600 22300 24900
203 Mavris 40 6500 6500
198 OConnell 50 2600 5200
......
Moving Aggregate Function
select employee_id,last_name,department_id,salary,
sum(salary) over (partition by department_id order by department_id ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) dept_sal_cum1,
sum(salary) over (partition by department_id order by department_id ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) dept_sal_cum2
from employees;
EMPLOYEE_ID LAST_NAME DEPARTMENT_ID SALARY DEPT_SAL_CUM1 DEPT_SAL_CUM2
----------- --------------- ------------- ---------- ------------- -------------
200 Whalen 10 4400 4400 4400
201 Hartstein 20 13000 13000 19000
202 Fay 20 6000 19000 6000
114 Raphaely 30 11000 11000 13500
119 Colmenares 30 2500 13500 5600
115 Khoo 30 3100 5600 6000
116 Baida 30 2900 6000 5700
117 Tobias 30 2800 5700 5400
118 Himuro 30 2600 5400 2600
203 Mavris 40 6500 6500 6500
198 OConnell 50 2600 2600 5200
......
Centered Aggregate
SELECT hire_date, SUM(salary) AS sum_sal1,
AVG(SUM(salary)) OVER (ORDER BY hire_date RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND INTERVAL '1' DAY FOLLOWING) AS CENTERED_1_DAY_AVG
FROM employees
GROUP BY hire_date;
HIRE_DATE SUM_SAL1 CENTERED_1_DAY_AVG
------------------ ---------- ------------------
13-JAN-01 17000 17000
07-JUN-02 36808 36808
16-AUG-02 9000 10504
17-AUG-02 12008 10504
07-DEC-02 11000 11000
01-MAY-03 7900 7900
18-MAY-03 3100 3100
17-JUN-03 24000 24000
14-JUL-03 3600 3600
17-SEP-03 4400 4400
17-OCT-03 3500 3500
......