Oracle SQL*Net相关的空闲等待事件


空闲等待事件,从字面上理解似乎并不需要我们花多大心思去关心,就让它等在那里呗,万事总有它存在的道理,特别对于SQL*Net系列等待事件而言,细细琢磨还是有不少文章可做的。

---------<    Part 1. 那些我们经常见到的SQL*Net等待  >---------
SQL*NET message to client等待事件
Server process通知client准备接收服务器端返回的数据,通知完成后SQL*NET message to client等待事件立即结束,然后server process开始将要返回的数据从Buffer cache复制到PGA,再从PGA传送给client。SQL*NET message to client等待事件发生的次数与SQL*NET message from client发生次数相当。在10046 Trace文件里我们一般可以观察到"SQL*NET message to client"总是先于"SQL*NET message from client"出现

SQL*NET message from client等待事件
笼统的说就是服务器进程空闲等待来自客户端进程的消息响应。
最常见的情况是session连接到数据库上之后什么事情都不干,这时我们到v$session里可以查询到该session的status='ACTIVE', event='SQL*NET message from client',表示server process正在等待client process发指令,例如:
(如无特殊说明,本文中session 1用来执行测试用的SQL,session 2用来监测session 1执行时产生的统计信息)
---session 1:
select * from v$mystat where rownum=1;
      SID STATISTIC#      VALUE
---------- ---------- ----------
      862          0          0

 ---session 2: 查询session 1的状态
col event format a40
set linesize 120
select sid,event,status from v$session where sid=862 and event in ('SQL*Net message to client','SQL*Net message from client');
      SID EVENT                                    STATUS
---------- ---------------------------------------- --------
      862 SQL*Net message from client              INACTIVE

---session 2: 如果session 1一直无所事事,那么SQL*Net message from client事件的等待次数一直不会增加,直到下一次该事件被触发total_waits、time_waited才会累计上升
select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');


EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      30      45497


select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');


EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      30      45497


total_waits比较好理解,等一次往上加1;TIME_WAITED(单位是百分之一秒)会把之前等待的事件累计上去,例如Session 1等待了30秒后再次发起了一条select * from dual,那么在session 2里再次运行select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client')后的结果基本就是:


EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      33      48497    <---45497+空闲等待的30秒时间


所谓空闲等待事件的头衔应该就是这样来的吧,纯粹的空闲等待事件并无任何研究价值,我们只关心"SQL*Net message from client"等待事件的次数何时会上升,根据实验结果我们发现有以下两种情况:
第一种情况:
client端向Server端发起一条SQL命令,只要该命令包含了命令关键字,"SQL*Net message from client"事件就会增长。
验证过程如下:
  a) client端发起一条命令后有结果行返回,事件次数从30->36
      ---session 2:
        select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');
 EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      30      45497


  ---session 1:
  SQL> select * from dual;
 
 DUMMY
 -----
 X


  ---session 2:
  select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');
    EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      36      230203


  b) client端发起一条命令后无结果返回,事件次数从36->42
      ---session 2:
      select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');
 EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      36      230203


  ---session 1:
  SQL> select * from scott.empty_tab1;
 
 USERNAME                          USER_ID CREATED
      ------------------------------ ---------- -----------


  ---session 2:
  select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');
    EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      42      263051


  c) client端发起一条语法正确的命令,虽然查询的对象不存在,事件次数还是从42->47
      ---session 2:
        select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');
 EVENT                                    TOTAL_WAITS TIME_WAITED
        ---------------------------------------- ----------- -----------
        SQL*Net message from client                      42      263051


  ---session 1:
  SYS@tstdb1-SQL> select * from notexist1;
 select * from notexist1
              *
 ERROR at line 1:
 ORA-00942: table or view does not exist


  ---session 2:
  SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');


    EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      47      384038


  d) client端发起得命令语法检查不通过,事件次数还是从63->68,因为命令里包含了select这个命令关键字
      ---session 2:
        SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client')


 EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      63      472134


  ---session 1:
  SQL> select * fro dual;
        select * fro dual
 
 ORA-00923: FROM keyword not found where expected
 
  ---session 2:
  SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client')


  EVENT                                    TOTAL_WAITS TIME_WAITED
  ---------------------------------------- ----------- -----------
  SQL*Net message from client                      68      477107
 
  e) client端发起的命令里不包含任何命令关键字,事件次数维持不变
    ---session 2:
        SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');


 EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      68      477107
 
    ---session 1:
  SYS@tstdb1-SQL> selec;
        SP2-0042: unknown command "selec" - rest of line ignored. 
       
        ---session 2:
  SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=862 and event in ('SQL*Net message from client');


 EVENT                                    TOTAL_WAITS TIME_WAITED
 ---------------------------------------- ----------- -----------
 SQL*Net message from client                      68      477107


因client在向Server端发起命令这一过程的时间非常短暂,所以我们不大容易观察到active的session等待在SQL*NET message from client事件上。我们可以用如下的过程去模拟active session的等待:
---session 2:
SYS@tstdb1-SQL> select sid,event,status from v$session where sid=862;


      SID EVENT                                    STATUS
---------- ---------------------------------------- --------
      862 SQL*Net message from client              INACTIVE
     
---session 1:
declare
v_sql varchar2(100);
begin
while ( true ) loop
v_sql:='select * from scott.t1';
execute immediate v_sql;
end loop;
end;
/


---session 2:在session 1执行期间观察SQL*Net message from client处于active,表示client端正在发送命令给server端,因为client的命令是一个无限循环,所以发送命令的这一过程始终没有结束,因此一直处于ACTIVE状态
SYS@tstdb1-SQL> select sid,event,status from v$session where sid=862;


      SID EVENT                                    STATUS
---------- ---------------------------------------- --------
      862 SQL*Net message from client              ACTIVE
     
第二种情况:
当服务器进程把PGA里的结果数据返回给客户端之后,也会触发SQL*NET message from client等待,这时等待的是客户端收到结果后给服务器进程的确认。服务器进程在返回客户端结果集的时候不是一次全量返回的,而是按照一定单位数量分多批次来返回,特别是在结果集记录数较多的时候服务器端和客户端会经历多次交互的过程,这一过程对最终用户或者接收结果的应用程序来是透明的。我们最熟悉的SQL*PLUS就是一个典型的客户端进程,我们可以通过设置set arraysize 30来限制服务器端每次最多返回30行记录给客户端,若一查询语句总共需要返回100行记录,那么以每30行为单位,服务器和客户端总共需要4个回合(30->30->30->10)才能完成100条记录的传送,当我们打开SQL*PLUS里autotrace功能执行这条查询语句就会发现SQL*Net roundtrips to/from client的统计值为4,SQL*NET message from client等待事件发生的次数近似等于SQL*Net roundtrips to/from client的统计值,对SQL*PLUS客户端程序而言,arraysize设置越大roundtrips就会较小,SQL*NET message from client等待事件次数也会越少,也即意味着服务器与客户端的通信次数减少。针对这一情况我们也来做两组简单的测试:


第(1)组:设置arraysize等于返回的行数
---session 1
SYS@tstdb1-SQL> show arraysize
arraysize 15


select count(*)from scott.t1;
  COUNT(*)
----------
        21


select p.spid,s.sid,s.serial# from v$session s,v$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;
SPID                            SID    SERIAL#
------------------------ ---------- ----------
5439878                        266      48189
       
---session 2
col event format a40
select event,total_waits,time_waited from v$session_event where sid=266 and event in ('SQL*Net message from client');
EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      14        6875


---session 1: 设定arraysize等于表的行数
set arraysize 21


select * from scott.t1;
。。。省略输出


---session 2
select event,total_waits,time_waited from v$session_event where sid=266 and event in ('SQL*Net message from client');
EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      17      29997


上述结果得出session 1 执行Select后"SQL*Net message from client"次数从14->17,即增加了3次,分析这3次分别在什么时候产生的,可以通过session 1的trace文件获得:
*** 2016-01-22 13:41:37.686
WAIT #4573489544: nam='SQL*Net message from client' ela= 132467556 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11895166310146
CLOSE #4573458848:c=0,e=24,dep=0,type=0,tim=11895166311042
CLOSE #4573489544:c=0,e=45,dep=0,type=3,tim=11895166311131
=====================
PARSING IN CURSOR #4573469256 len=23 dep=0 uid=0 oct=3 lid=0 tim=11895166311320 hv=3534831217 ad='7000001ba6b8878' sqlid='1qgrwqv9b2gmj'
 select * from scott.t1
END OF STMT
PARSE #4573469256:c=0,e=129,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=838529891,tim=11895166311319
EXEC #4573469256:c=0,e=63,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=838529891,tim=11895166311456
WAIT #4573469256: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11895166311500
FETCH #4573469256:c=0,e=124,p=0,cr=2,cu=0,mis=0,r=1,dep=0,og=1,plh=838529891,tim=11895166311659    <---r=1表示返回了21条里的第1条
WAIT #4573469256: nam='SQL*Net message from client' ela= 271 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11895166311965
WAIT #4573469256: nam='SQL*Net message to client' ela= 0 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11895166311995
FETCH #4573469256:c=0,e=49,p=0,cr=1,cu=0,mis=0,r=20,dep=0,og=1,plh=838529891,tim=11895166312036  <---r=20表示返回了21条里的后20条
STAT #4573469256 id=1 cnt=21 pid=0 pos=1 obj=41088 op='TABLE ACCESS FULL T1 (cr=3 pr=0 pw=0 time=113 us cost=3 size=357 card=21)'
WAIT #4573469256: nam='SQL*Net message from client' ela= 1102 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11895166313235
PARSE #4573489544:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=11895166313300
WAIT #4573489544: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11895166313634
EXEC #4573489544:c=0,e=290,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=11895166313672


第1次出现SQL*Net message from client等待事件出现在client向server端发送命令的时候;
第2次出现在server端进程返回第一行数据的时候,client端接收到第1行数据时对server端的确认响应;
第3次是client端在接收到的第2~21行数据后对server端作出的确认响应;


---session 1:从autotrace里也可以看到roundtrips=2
SYS@tstdb1-SQL> select * from scott.t1;


21 rows selected.

 


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
      1216  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        21  rows processed


第(2)组:设置arraysize小于返回的行数
---session 1:
SYS@tstdb1-SQL> set arraysize 15
SYS@tstdb1-SQL> show arraysize
arraysize 15


---session 2:
SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=266 and event in ('SQL*Net message from client');


EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      23      111804


---session 1:
select * from scott.t1;


---session 2:
SYS@tstdb1-SQL> select event,total_waits,time_waited from v$session_event where sid=266 and event in ('SQL*Net message from client')


EVENT                                    TOTAL_WAITS TIME_WAITED
---------------------------------------- ----------- -----------
SQL*Net message from client                      27      198030

 


前后对比发现SQL*Net message from client从23->27,增长了4次,从tracefile里可以清楚的看出这21行记录是按照1->15->5分三批次返回的
FETCH #4573469256:c=0,e=119,p=0,cr=2,cu=0,mis=0,r=1,dep=0,og=1,plh=838529891,tim=11896171488908
WAIT #4573469256: nam='SQL*Net message from client' ela= 272 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11896171489214
WAIT #4573469256: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11896171489244
FETCH #4573469256:c=0,e=40,p=0,cr=1,cu=0,mis=0,r=15,dep=0,og=1,plh=838529891,tim=11896171489275
WAIT #4573469256: nam='SQL*Net message from client' ela= 735 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11896171490033
WAIT #4573469256: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11896171490068
FETCH #4573469256:c=0,e=31,p=0,cr=1,cu=0,mis=0,r=5,dep=0,og=1,plh=838529891,tim=11896171490091
STAT #4573469256 id=1 cnt=21 pid=0 pos=1 obj=41088 op='TABLE ACCESS FULL T1 (cr=4 pr=0 pw=0 time=107 us cost=3 size=357 card=21)'
WAIT #4573469256: nam='SQL*Net message from client' ela= 410 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=11896171490602


---session 1:从autotrace里也可以看到roundtrips=3
SYS@tstdb1-SQL> set autotrace traceonly statistics
SYS@tstdb1-SQL> select * from scott.t1


21 rows selected.

 


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
      1402  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        21  rows processed
       
当我们Select一个大表时,往往需要等待很长时间才能得到所有记录,这之中的大部分时间都耗在了网络传输上,这期间session绝大部分情况下都处于INACTIVE状态下,等待的事件往往是"SQL*NET message from client",就像下面这样
SYS@tstdb1-SQL> select sid,event,status from v$session where sid=266;


      SID EVENT                                    STATUS
---------- ---------------------------------------- --------
      266 SQL*Net message from client              INACTIVE


有同学可能会问,既然结果记录还在返回过程中,为何不是ACTIVE状态,我们之前的测试清楚的回答了这个问题:全量的结果集是按批次返回的,每当一个批次的结果返回至客户端,客户端发起给服务器端的确认响应时,才会触发一次"SQL*Net message from client"等待事件,因为这个响应的瞬间极短,所以我们几乎无法观察到针对SQL*Net message from client的ACTIVE等待。


讲了这么多,可能有的人又要问了研究SQL*Net message from client对DBA而言有何意义?
答:可以大致估算服务器与客户端间的网络传输效率
我们挑选一张百万级记录的大表ca_mms_file来做测试


---session 1: 执行大表的扫描,设置arraysize=100,ad._rec_sid表用于记录sid信息
create table ad.rec_sid (sid number,serial# number);
truncate table ad.rec_sid;
sqlplus -S system/asdf3_14@shzwbcv1 << EOF > /dev/null
insert into ad.rec_sid select s.sid,s.serial# from v\$session s,v\$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;
commit;
set arraysize 100
select * from ad.ca_mms_file;
EOF


---session 2: 执行下列过程,保持对于session 1的持续统计
while [ true ]
do
sqlplus -S '/as sysdba' << EOF
set feedback off
set serveroutput on
declare
v_sql2 varchar2(2000):='select ev.sid,event,total_waits,time_waited from v\$session_event ev,ad.rec_sid rs where ev.sid=rs.sid and event in (''SQL*Net message from client'')';
v_event varchar2(300);
v_total_waits number;
v_sid number;
v_time_waited number;
begin
execute immediate v_sql2 into v_sid,v_event,v_total_waits,v_time_waited;
dbms_output.put_line(v_event||' sid:  '||v_sid||'  total_waits:'||v_total_waits||'  time_waited:'||v_time_waited);
end;
/
EOF
sleep 1
done


SQL*Net message from client sid:  2362  total_waits:42  time_waited:1725
SQL*Net message from client sid:  2362  total_waits:44  time_waited:1842
SQL*Net message from client sid:  2362  total_waits:46  time_waited:1961
SQL*Net message from client sid:  2362  total_waits:48  time_waited:2080
SQL*Net message from client sid:  2362  total_waits:50  time_waited:2194
SQL*Net message from client sid:  2362  total_waits:52  time_waited:2299
SQL*Net message from client sid:  2362  total_waits:54  time_waited:2412
SQL*Net message from client sid:  2362  total_waits:56  time_waited:2530
。。。省略了部分输出


可以大致计算出返回的速率为(2530-1725)/(56-42)*10=575ms,因为arraysize=100,575ms是指每100条记录的传输时间,严格来575ms=服务器从PGA里抓取100条记录耗时+100条记录传输到客户端的耗时,值得一提的是因session 1将输出重定义到了/dev/null,所以这个575ms并未包含写入磁盘的事件,如果将Session 1的输出重定向到磁盘上的某个文件,那么时长会因为包含了IO响应而明显延长,以下是将/dev/null替换为一个文件路径后的输出,可以看出100条记录的处理时长增加到了1191ms,其中一半的时长消耗在了IO上

SQL*Net message from client sid:  2362  total_waits:26  time_waited:1507
SQL*Net message from client sid:  2362  total_waits:26  time_waited:1507
SQL*Net message from client sid:  2362  total_waits:27  time_waited:1634
SQL*Net message from client sid:  2362  total_waits:28  time_waited:1756
SQL*Net message from client sid:  2362  total_waits:29  time_waited:1877
SQL*Net message from client sid:  2362  total_waits:30  time_waited:1998
SQL*Net message from client sid:  2362  total_waits:31  time_waited:2113
SQL*Net message from client sid:  2362  total_waits:32  time_waited:2229
SQL*Net message from client sid:  2362  total_waits:33  time_waited:2341


其实还有一个决定服务器与客户端传输效率的因素,那就是SDU size,下面会有详细阐述


SQL*NET more data to client等待事件:
客户端通过arraysize能够指定服务器每一批次传送给客户端的记录条数,还有一个与此功能类似但更贴近网络层面的参数SDU(Session Data Unit),SDU定义了服务器端一次能够向客户端传送多少个字节,大小范围从512 bytes~65535 bytes。当arraysize指定的行数换算成字节后的值大于SDU设定值时,就会触发SQL*NET more data to client等待。举个例子:
客户端的SQL*Plus里设定了,100行为一批次进行返回
set arraysize 100
客户端的sqlnet.ora里定义了sdu_size=8192 bytes
default_sdu_size=8192


假设每行记录的长度为100bytes,100行的总长度就是10000bytes,比8192要大,所以在访问这100条记录的过程中会触发一次SQL*NET more data to client等待,服务器端会先返回8192bytes给客户端,再返回剩下的1808bytes。当然这一切对于客户端来说是透明的:客户端仍然是以每批次100行的速度从服务器端获取结果,只不过在network层面这100行会被拆分成8192bytes、1808bytes两个包。
---session 1:作为客户端连上database server,创建一个测试表Scott.t12
SQL> create table scott.t12 as select owner,table_name,tablespace_name from dba_tables where rownum<200;


Table created.


SQL> SELECT distinct DBMS_ROWID.rowid_block_number(rowid),DBMS_ROWID.rowid_relative_fno(rowid) from scott.t12;


DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                                827                                    6


SQL> select count(*) from scott.t12;


  COUNT(*)
----------
      199


exec dbms_stats.gather_table_stats('scott','t12');


SQL> select table_name,avg_row_len from dba_tables where table_name='T12';


TABLE_NAME                    AVG_ROW_LEN
------------------------------ -----------
T12                                    24


select p.spid,s.sid,s.serial# from v$session s,v$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;


SPID                            SID    SERIAL#
------------------------ ---------- ----------
15859916                        730      1333


set arraysize 199    <---设置每次返回199行,因为AVG_ROW_LEN=24,所以总的字节数为4776bytes


---session 2:
col event format a40
set linesize 120
select sid,event,total_waits,time_waited,average_wait from v$session_event where sid=730 and event like 'SQL*Net%';
      SID EVENT                                    TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
---------- ---------------------------------------- ----------- ----------- ------------
      730 SQL*Net message to client                        27          0            0
      730 SQL*Net message from client                      26      61386        2361


---session 1:
set arraysize 199
select * from scott.t12;


---session 2: session 1的执行后,没有出现SQL*NET more data to client等待事件,这是因为sdu_size默认值为8192bytes > 4776bytes,所以能在一个网络包里传送过去
SYS@tstdb1-SQL> select sid,event,total_waits,time_waited,average_wait from v$session_event where sid=730 and event like 'SQL*Net%';


        SID EVENT                                    TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
---------- ---------------------------------------- ----------- ----------- ------------
      730 SQL*Net message to client                        29          0            0
      730 SQL*Net message from client                      28      72039      2572.82


接下来我们人工将客户端的sdu_size调整为3072bytes,调整的方法有多种,下面有列举:


---session 1:
select p.spid,s.sid,s.serial# from v$session s,v$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;
SPID                            SID    SERIAL#
------------------------ ---------- ----------
6357954                        467      6611


---session 2:
select sid,event,total_waits,time_waited,average_wait from v$session_event where sid=467 and event like 'SQL*Net%';
      SID EVENT                                    TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
---------- ---------------------------------------- ----------- ----------- ------------
      467 SQL*Net message to client                        12          0            0
      467 SQL*Net message from client                      11        468        42.52


---session 1:
set arraysize 199
select * from scott.t12;


---session 2: 观察到多了一次SQL*Net more data to client等待,这时因为4776bytes要分为3072bytes->1704bytes两批次进行传送
select sid,event,total_waits,time_waited,average_wait from v$session_event where sid=467 and event like 'SQL*Net%';


      SID EVENT                                    TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
---------- ---------------------------------------- ----------- ----------- ------------
      467 SQL*Net message to client                        14          0            0
      467 SQL*Net more data to client                        1          0            0
      467 SQL*Net message from client                      13        3342      257.11
     
附:SDU可以通过以下三种方式进行定义:
服务器端的sqlnet.ora定义、客户端的sqlnet.ora定义:
default_sdu_size=3072


服务器端的listener.ora定义
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SDU=3072)    <---
      (GLOBAL_DBNAME = manaux1)
      (Oracle_HOME = /oracle/app/oracle/product/11.2.0/db_1)
      (SID_NAME = manaux)
    )
  )


客户端的tnsnames.ora里定义:
tstdb1 =
  (DESCRIPTION =
    (sdu=3072)    <---
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.141.209)(PORT = 1521))
    )
    (CONNECT_DATA =
      (Service_name = tstdb1)
      (SERVER = DEDICATED)
    )
  )

 


在上节利用"SQL*Net message from client"事件跟踪服务器->客户端传输效率的例子中,如果要使ad.ca_mms_file表的全表扫描更快的返回结果,我们可以调大sdu_size,调整方法如下:
select owner,table_name,avg_row_len from dba_tables where table_name='CA_MMS_FILE';


OWNER                          TABLE_NAME                    AVG_ROW_LEN
------------------------------ ------------------------------ -----------
AD                            CA_MMS_FILE                            181


array=100的情况下,Sdu_size至少应修改成> 18100 bytes (181*100),把sdu_size设为20000的情况下,得到如下统计结果
SQL*Net message from client sid:  10  total_waits:94  time_waited:3771
SQL*Net message from client sid:  10  total_waits:96  time_waited:3864
SQL*Net message from client sid:  10  total_waits:99  time_waited:4003
SQL*Net message from client sid:  10  total_waits:101  time_waited:4096
SQL*Net message from client sid:  10  total_waits:103  time_waited:4189
SQL*Net message from client sid:  10  total_waits:106  time_waited:4328
SQL*Net message from client sid:  10  total_waits:108  time_waited:4421
SQL*Net message from client sid:  10  total_waits:111  time_waited:4561
SQL*Net message from client sid:  10  total_waits:114  time_waited:4699
SQL*Net message from client sid:  10  total_waits:116  time_waited:4792
SQL*Net message from client sid:  10  total_waits:119  time_waited:4931


平均下来每传送100条记录耗时约为464ms,比sdu_size采用默认值的情况下降低了111ms,提速近20%,SQL*Net more data to client等待事件大大减少,
所以合理调大sdu size设置值也能显著提高服务器端与客户端传输效率
   
---------<    Part 2 把SQL*Net 三类事件串联起来连同session logical reads分析一下它们之间的联系  >---------
(1) arraysize<=SDU size, CTAS方式创建测试表
---session 1: 创建测试表
create tablespace ts0111 datafile '/oradata06/testaaaaa/ts0111.dbf' size 32m segment space management auto;


create table scott.t1 tablespace ts0111 as select * from all_users;


select count(*) from scott.t1 ;
  COUNT(*)
----------
        21


SQL> select header_file,header_block from dba_segments where segment_name='T1' and owner='SCOTT';


HEADER_FILE HEADER_BLOCK
----------- ------------
          5          130                                <---segment header
       
SELECT distinct DBMS_ROWID.rowid_block_number(rowid),DBMS_ROWID.rowid_relative_fno(rowid) from scott.t1 ;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                              131                                    5          <---真正存放数据的块仅一个


---session 2: 获取session基本信息,确认所有行的长度和<SDU_SIZE,SDU_SIZE默认情况下是8192字节
select sum(length(user_id)+length(username)+length(created)) from scott.t1 ;  <---确保所有记录总长<默认的SDU值8192字节,否则会出现SQL*NET more data to client事件,后面会有专针对于SQL*NET more data to client的讨论
SUM(LENGTH(USER_ID)+LENGTH(USERNAME)+LENGTH(CREATED))
-----------------------------------------------------
                                                550


SQL> select p.spid,s.sid,s.serial# from v$session s,v$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;


SPID                            SID    SERIAL#
------------------------ ---------- ----------
17105026                        266      1433


---session 3:记录session 2的执行前的SQL*NET等待事件和logical reads次数
set linesize 120
col event format a40
select sid,event,total_waits,time_waited from v$session_event where sid=266 and event in ('SQL*Net message to client','SQL*Net message from client','SQL*Net more data to client');
      SID EVENT                                    TOTAL_WAITS TIME_WAITED
---------- ---------------------------------------- ----------- -----------
      266 SQL*Net message to client                        139          0
      266 SQL*Net message from client                      138      188288


col name format a26
select n.name,s.sid,s.VALUE from v$sesstat s,v$statname n where n.name='session logical reads' and sid=266 and n.statistic#=s.statistic#;
NAME                              SID      VALUE
-------------------------- ---------- ----------
session logical reads            266      4537


跟踪session 2的sid,tracefile里可以记录逻辑读访问哪些块,前提是必须保证这些块不在buffer cache里,可以先flush buffer_cache
exec DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id=>266,serial_num=>1433,waits=>TRUE,plan_stat=>NULL);


---session 2: 执行select * from scott.t1 ;
set arraysize 21
select /*+ dynamic_sampling(0) */ * from scott.t1 ;


---session 3:记录session 2的执行后的SQL*NET等待事件,session logical reads次数
set linesize 120
col event format a40
select sid,event,total_waits,time_waited from v$session_event where sid=266 and event in ('SQL*Net message to client','SQL*Net message from client','SQL*Net more data to client');


      SID EVENT                                    TOTAL_WAITS TIME_WAITED
---------- ---------------------------------------- ----------- -----------
      266 SQL*Net message to client                        142          0
      266 SQL*Net message from client                      141      199871


select n.name,s.sid,s.VALUE from v$sesstat s,v$statname n where n.name='session logical reads' and sid=266 and n.statistic#=s.statistic#;
NAME                              SID      VALUE
-------------------------- ---------- ----------
session logical reads            266      4541
     
结束session 2的跟踪
exec DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id=>266,serial_num=>1433);     
如果出现了物理读,跟踪的结果中就会发现 5/130(seg hdr)、5/131(data blk)两个block会被访问到,在tracefile里可以观察到如下的字样:
WAIT #4573470248: nam='db file sequential read' ela= 16513 file#=5 block#=130 blocks=1 obj#=41088 tim=11189835682677
WAIT #4573470248: nam='db file sequential read' ela= 3417 file#=5 block#=131 blocks=1 obj#=41088 tim=11189835686504


如果没有经过物理读我们也可以通过跟踪基表x$bh来发现,方法如下:
***记录x$bh当前值:
SYS@tstdb1-SQL> select file#,dbablk,tch from x$bh where obj=41088;


    FILE#    DBABLK        TCH
---------- ---------- ----------
        5        128          0
        5        128          0
        5        130          2
        5        130          0
        5        129          0
        5        129          0
        5        129          0
        5        131          2
        5        131          0
        5        131          0
        5        131          0
        5        131          0
        5        131          0
        5        131          0
        5        131          0


15 rows selected.


***执行了select * from scott.t1;


***再次获取x$bh当前值后发现5/130、5/131两个block的touch count均增加了1,值得注意的是虽然只有两个block被访问过但并不一定意味着只会产生两次logical reads,这时因为x$bh里对于同一个block的访问在3秒钟内只会计数一次,一个block被访问可能会造成多次logical reads
 select file#,dbablk,tch from x$bh where obj=41088


    FILE#    DBABLK        TCH
---------- ---------- ----------
        5        128          0
        5        128          0
        5        130          3
        5        130          0
        5        129          0
        5        129          0
        5        129          0
        5        131          3
        5        131          0
        5        131          0
        5        131          0
        5        131          0
        5        131          0
        5        131          0
        5        131          0


总结,arraysize<=SDU size, CTAS方式创建的测试表执行select前后:
logical reads从4537->4541,共4次,其中2次用于读取segment header,1次用于读取表里的第一行,1次用于读取block里的剩余行
SQL*Net message to client从139->142,共3次,其中1次用于将表里的第一行数据从buffer cache复制到PGA前,Server process通知client准备接收服务器端返回的第一行数据;1次用于将表中block里除第一行外的剩余行从buffer cache复制到PGA前,Server process通知client准备接收服务器端返回的剩余行数据;另1次的用途未知;读取segment header时不会产生"SQL*Net message to client"事件,因为客户端请求访问的永远是表里的数据而不是段头信息。
SQL*Net message from client从138->141,共3次,其中1次用于client向server发送命令,1次在用于Server process将PGA里的第一行数据返回给client后client给server process的确认响应,1次用于Server process将PGA里该block里的剩余数据返回给client后client给server process的确认响应。


上述场景使用的测试表scott.t1里的记录是在CTAS建表的同时插入进来的,实践发现与CTAS创建完空表后再用insert into插入记录所形成的测试表其在SQL*Net事件的等待次数上有所差异


(2) arraysize<=SDU size, CTAS方式创建空表,之后使用insert into语句往这张空表插入记录
---session 1: 创建测试表
create tablespace ts0111 datafile '/oradata06/testaaaaa/ts0111.dbf' size 32m segment space management auto;


create table scott.t11 tablespace ts0111 as select * from all_users where 1=2;


insert into scott.t11 select * from all_users;


SQL> select header_file,header_block from dba_segments where segment_name='T11' and owner='SCOTT';


HEADER_FILE HEADER_BLOCK
----------- ------------
          5          154                                <---segment header
       
SELECT distinct DBMS_ROWID.rowid_block_number(rowid),DBMS_ROWID.rowid_relative_fno(rowid) from scott.t11 ;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                              159                                    5            <---真正存放数据的块仅一个


---session 2: 获取session基本信息,确认所有行的长度和<SDU_SIZE,SDU_SIZE默认情况下是8192字节
select sum(length(user_id)+length(username)+length(created)) from scott.t1 ;  <---确保所有记录总长<默认的SDU值8192字节,否则会出现SQL*NET more data to client事件,后面会有专针对于SQL*NET more data to client的讨论
SUM(LENGTH(USER_ID)+LENGTH(USERNAME)+LENGTH(CREATED))
-----------------------------------------------------
                                                550


SQL> select p.spid,s.sid,s.serial# from v$session s,v$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;


SPID                            SID    SERIAL#
------------------------ ---------- ----------
5374224                        467      3927


---session 3:记录session 2的执行前的SQL*NET等待事件和logical reads次数,x$bh中的touch count,并且执行flush buffer_cache
set linesize 120
col event format a40
select sid,event,total_waits,time_waited from v$session_event where sid=467 and event in ('SQL*Net message to client','SQL*Net message from client','SQL*Net more data to client');
      SID EVENT                                    TOTAL_WAITS TIME_WAITED
---------- ---------------------------------------- ----------- -----------
      467 SQL*Net message to client                        42          0
      467 SQL*Net message from client                      41      108403


col name format a26
select n.name,s.sid,s.VALUE from v$sesstat s,v$statname n where n.name='session logical reads' and sid=467 and n.statistic#=s.statistic#;
NAME                              SID      VALUE
-------------------------- ---------- ----------
session logical reads            467      1407


alter system flush buffer_cache;


SYS@tstdb1-SQL> set pagesize 20
SYS@tstdb1-SQL> select file#,dbablk,tch from x$bh where obj=41299;


    FILE#    DBABLK        TCH
---------- ---------- ----------
        5        157          0
        5        157          0
        5        154          0
        5        154          0
        5        159          0
        5        159          0
        5        156          0
        5        156          0
        5        153          0
        5        158          0
        5        158          0
        5        155          0
        5        155          0
        5        152          0


跟踪session 2的sid,tracefile里可以记录逻辑读访问哪些块,前提是必须保证这些块不在buffer cache里,可以先flush buffer_cache
exec DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id=>467,serial_num=>3927,waits=>TRUE,plan_stat=>NULL);


---session 2: 执行select * from scott.t11
set arraysize 21
select /*+ dynamic_sampling(0) */ * from scott.t11 ;


---session 3:记录session 2的执行后的SQL*NET等待事件,session logical reads次数
set linesize 120
col event format a40
select sid,event,total_waits,time_waited from v$session_event where sid=467 and event in ('SQL*Net message to client','SQL*Net message from client','SQL*Net more data to client');


      SID EVENT                                    TOTAL_WAITS TIME_WAITED
---------- ---------------------------------------- ----------- -----------
      467 SQL*Net message to client                        45          0
      467 SQL*Net message from client                      44      136331


col name format a30
select n.name,s.sid,s.VALUE from v$sesstat s,v$statname n where n.name='session logical reads' and sid=467 and n.statistic#=s.statistic#;
NAME                                  SID      VALUE
------------------------------ ---------- ----------
session logical reads                467      1415
     
结束session 2的跟踪
exec DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id=>467,serial_num=>3927);   
 
生成的Tracefile里包含的内容:
*** 2016-01-14 14:48:52.745
WAIT #4574029704: nam='SQL*Net message from client' ela= 279270707 driver id=1650815232 #bytes=1 p3=0 obj#=41299 tim=11207996794058  <--第1次"SQL*Net message from client"等待
CLOSE #4573414144:c=0,e=25,dep=0,type=0,tim=11207996794949
CLOSE #4574029704:c=0,e=48,dep=0,type=3,tim=11207996795047
=====================
PARSING IN CURSOR #4573414144 len=51 dep=0 uid=0 oct=3 lid=0 tim=11207996795258 hv=3660860863 ad='7000001b8499ec0' sqlid='0ckm0urd38mdz'
select /*+ dynamic_sampling(0) */ * from scott.t11
END OF STMT
PARSE #4573414144:c=0,e=156,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=711494848,tim=11207996795257
EXEC #4573414144:c=0,e=63,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=711494848,tim=11207996795393
WAIT #4573414144: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=41299 tim=11207996795436  <--第1次"SQL*Net message to client"等待
WAIT #4573414144: nam='db file sequential read' ela= 466 file#=5 block#=154 blocks=1 obj#=41299 tim=11207996796009  <--发生1次session logical reads
WAIT #4573414144: nam='db file scattered read' ela= 433 file#=5 block#=155 blocks=5 obj#=41299 tim=11207996796755    <--又发生5次session logical reads
FETCH #4573414144:c=0,e=1384,p=6,cr=7,cu=0,mis=0,r=1,dep=0,og=1,plh=711494848,tim=11207996796853
WAIT #4573414144: nam='SQL*Net message from client' ela= 292 driver id=1650815232 #bytes=1 p3=0 obj#=41299 tim=11207996797184  <--第2次"SQL*Net message from client"等待
WAIT #4573414144: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=41299 tim=11207996797220  <--第2次"SQL*Net message to client"等待
FETCH #4573414144:c=0,e=50,p=0,cr=1,cu=0,mis=0,r=20,dep=0,og=1,plh=711494848,tim=11207996797260
STAT #4573414144 id=1 cnt=21 pid=0 pos=1 obj=41299 op='TABLE ACCESS FULL T11 (cr=8 pr=6 pw=0 time=1375 us cost=2 size=819 card=21)'
WAIT #4573414144: nam='SQL*Net message from client' ela= 1157 driver id=1650815232 #bytes=1 p3=0 obj#=41299 tim=11207996798516  <--第3次"SQL*Net message from client"等待
PARSE #4574029704:c=0,e=45,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=11207996798594
WAIT #4574029704: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=41299 tim=11207996798997  <--第3次"SQL*Net message to client"等待
EXEC #4574029704:c=0,e=345,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=11207996799039
~


通过跟踪基表x$bh也可以发现,scott.t11表里有154~159六个block被访问过了:
SYS@tstdb1-SQL> select file#,dbablk,tch from x$bh where obj=41299;


    FILE#    DBABLK        TCH
---------- ---------- ----------
        5        157          1
        5        157          0
        5        157          0
        5        154          1
        5        154          0
        5        154          0
        5        159          1
        5        159          0
        5        159          0
        5        156          1
        5        156          0
        5        156          0
        5        153          0
        5        158          1
        5        158          0
        5        158          0
        5        155          1
        5        155          0
        5        155          0
        5        152          0


15 rows selected.


总结,arraysize<=SDU size, CTAS方式创建了空表,再用insert into往空表中插入记录的场景下,对测试表执行select前后:
logical reads从1407->1415,共8次,其中2次用于读取segment header,1次用于读取表里的第一行,1次用于读取第一个block 155里的剩余行,剩余4次用于读取156~159这四个块
SQL*Net message to client从139->142,共3次,其中1次在读取block里的第一行时,Server process通知client准备接收服务器端返回的第一行数据,1次在读取block里的剩余行时,Server process通知client准备接收服务器端返回的剩余行数据,另1次用途未知
SQL*Net message from client从138->141,共3次,其中1次用于client向server发送命令,1次用于Server process将PGA里的第一行数据返回给client后client给server process的确认响应,另1次用于Server process将PGA里该block里的剩余数据返回给client后client给server process的确认响应。


(1)和(2)的区别在于session logical reads数量增加了,SQL*NET系列等待事件数量保持一致,这是因为CTAS创建出来的表其占用的blocks数量是最少的刚好能够存放下现有的记录,而insert into创建出来的表是按照extent为基本单位扩展的,因此HWM比前者要高,进而全表扫描时经历的blocks会更多,session logical reads值也就更大了。有兴趣的TX可以使用dbms_space.space_usage,或者dump segment header观察这两种方式创建出来的表的HWM各是多少


(3) arraysize > SDU size的情况
以我们之前创建的scott.t12表为例,设置客户端的Sdu_size=3072 bytes
---session 1: 连接上数据库
select count(*) from scott.t12;
  COUNT(*)
----------
      199
     
select owner,table_name,avg_row_len from dba_tables where table_name='T12';


OWNER                          TABLE_NAME                    AVG_ROW_LEN
------------------------------ ------------------------------ -----------
SCOTT                          T12                                    24


select p.spid,s.sid,s.serial# from v$session s,v$process p where s.sid=sys_context('userenv','sid') and s.paddr=p.addr;


SPID                            SID    SERIAL#
------------------------ ---------- ----------
15138938                        664          3
     
set arraysize 199


---session 2:
select sid,event,total_waits,time_waited,average_wait from v$session_event where sid=664 and event like 'SQL*Net%';
    SID EVENT                                    TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
---------- ---------------------------------------- ----------- ----------- ------------
      664 SQL*Net message to client                        19          0            0
      664 SQL*Net message from client                      18      27997      1555.38
     
select n.name,s.sid,s.VALUE from v$sesstat s,v$statname n where n.name='session logical reads' and sid=664 and n.statistic#=s.statistic#;
NAME                                                                    SID      VALUE
---------------------------------------------------------------- ---------- ----------
session logical reads                                                  664        776


---session 1:
select * from scott.t12;


---session 2:
select sid,event,total_waits,time_waited,average_wait from v$session_event where sid=664 and event like 'SQL*Net%';
        SID EVENT                                    TOTAL_WAITS TIME_WAITED AVERAGE_WAIT
---------- ---------------------------------------- ----------- ----------- ------------
      664 SQL*Net message to client                        22          0            0
      664 SQL*Net more data to client                        1          0            0
      664 SQL*Net message from client                      21      36219      1724.73
     
select n.name,s.sid,s.VALUE from v$sesstat s,v$statname n where n.name='session logical reads' and sid=664 and n.statistic#=s.statistic#;
NAME                                                                    SID      VALUE
---------------------------------------------------------------- ---------- ----------
session logical reads                                                  664        779

logical reads从776->779,共3次,其中1次用于读取segment header,1次用于读取表里的第一行,1次用于读取第一个block里的剩余行
SQL*Net message to client从19->22,共3次,其中1次在读取block里的第一行时,Server process通知client准备接收服务器端返回的第一行数据,1次在读取block里的剩余行时,Server process通知client准备接收服务器端返回的剩余行数据,另1次用途未知
SQL*Net message from client从18->21,共3次,其中1次用于client向server发送命令,1次用于Server process将PGA里的第一行数据返回给client后client给server process的确认响应,另1次用于Server process将PGA里该block里的剩余数据返回给client后client给server process的确认响应。
SQL*Net more data to client 的1次用于网络层拆分成3072bytes->1704 bytes两个包进行传送所致


简单总结一下:
SQL*NET message to client与SQL*NET message from client成对出现;
尽可能将arraysize调大以使结果集在较少的批次内返回,以降低服务器和客户端的交互次数,减少session logical reads;
尽可能调大sdu_size,以减少SQL*NET more data to client事件的发生次数,提高服务器与客户端间的传输效率;

本文永久更新链接地址

相关内容