ORACLE字符集检察工具CSSCAN
我们在创建数据库的时候,通常会把字符集设置为ZHS16GBK,有些时候,ZHS16GBK字符集满足不了业务的需求,会更换字符集,以AL32UTF8为例,ZHS16GBK字符集每个汉字以2个字符存放,而AL32UTF8字符集每个汉字是以3个自己存放,这样,有可能就要增大某些表的字段长度。ORACLE提供了CSSCAN工具来查看字符集转换的时候哪些字段需要增大到多大的长度。(当然,也可以写PL/SQL将所有的表为CHAR,VARCHAR2的字段都增大1.5倍),既然ORACLE提供了CSSCAN工具,我们就要学着使用。
在我们创建数据库的时候,CSSCAN工具一般不会被安装,我们需要手动执行$ORACLE_HOME/rdbms/admin/csminst.sql手动创建,这个操作很简单:
SQL> @?/rdbms/admin/csminst.sql
用户已创建。
授权成功。
…
…
有可能会遇到如下的错误
drop public synonym csm$parameters
*
第 1 行出现错误:
ORA-01432: 要删除的公用同义词不存在
查看csminst.sql文件,您会发现
drop public synonym csm$parameters
/
create public synonym csm$parameters for csmig.csm$parameters
/
这个错误是由于csminst.sql这个脚本在创建同义词csm$parameters之前,会先删除这个同义词,然后在创建,也就是如果这个同义词存在,先删除,然后再创建,这个错误可以忽略。
CSSCAN工具安装完后,就可以使用CSSCAN工具查看字符集转换之后,哪些字段需要增大长度。操作也很简单,我们拿T_FONDS_INFO表为例:
SQL> desc T_FONDS_INFO
名称 是否为空? 类型
----------------------------------------- -------- -------------------
FOND_CODE VARCHAR2(2)
FOND_NAME VARCHAR2(33)
FONDS_NUMBER VARCHAR2(40)
ADMITER VARCHAR2(3)
TERM VARCHAR2(35)
FILE_SUM NUMBER
PRINCIPLE VARCHAR2(315)
FOND_INTRO CLOB
REMARK VARCHAR2(70)
利用CSSCAN查看哪些字段需要增大
C:\Documents and Settings\Administrator>csscan system/oracle table= T_FONDS_INFO tochar= al32utf8 log=d:\css.log
Character Set Scanner v2.2 : Release 11.2.0.1.0 - Production on 星期五 9月 23 16:36:01 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Enter array fetch buffer size: 1024000 >--默认为1M,可根据实际情况输入大一点的值
Enter number of scan processes to utilize(1..64): 1 >--多少个进程来检查
Enumerating tables to scan...
. process 1 scanning STREAM.T_FONDS_INFO[AAASgRAAEAAAADwAAA]
Creating Database Scan Summary Report...
Creating Individual Exception Report...
Scanner terminated successfully.
执行完之后会在D:\下看到css.log.err、css.log1.out和css.log1.txt文件,检查css.log1.err文件可以看到哪些字段需要修改到多大。例如:
User : STREAM
Table : T_FONDS_INFO
Column: TERM
Type : VARCHAR2(35)
Number of Exceptions : 3
Max Post Conversion Data Size: 46
此部分说明STREAM下的T_FONDS_INFO表的TERM字段更改为AL32UTF8字符集之后,有3条数据需要增大字段长度,需要从原长度35增大到46,下面是数据在AL32UTF8字符集下的实际长度:
ROWID Exception Type Size Cell Data(first 30 bytes)
------------------ ------------------ ----- ------------------------------
AAASgRAAEAAAAD0AAB exceed column size 46 光绪二十七年1901--光绪二十八年
AAASgRAAEAAAAD1AAB exceed column size 40 光绪二十八年1902--宣统三年1911
AAASgRAAEAAAAD1AAC exceed column size 40 光绪二十一年1895--宣统四年1912
------------------ ------------------ ----- ------------------------------
可以通过csscan help=y查看CSSCAN的各个参数
C:\Documents and Settings\Administrator>csscan help=y
You can let Scanner prompt you for parameters by entering the CSSCAN
command followed by your username/password:
Example: CSSCAN \"SYSTEM/MANAGER AS SYSDBA\"
Or, you can control how Scanner runs by entering the CSSCAN command
followed by various parameters. To specify parameters, you use keywords:
Example: CSSCAN \"SYSTEM/MANAGER AS SYSDBA\" FULL=y TOCHAR=utf8 ARRAY=1024000 PROCESS=3
Keyword Default Prompt Description
---------- ------- ------ -------------------------------------------------
USERID yes username/password
FULL N yes scan entire database
USER yes owner of tables to be scanned
TABLE yes list of tables to scan
COLUMN yes list of columns to scan
EXCLUDE list of tables to exclude from scan
TOCHAR yes new database character set name
FROMCHAR current database character set name
TONCHAR new national character set name
FROMNCHAR current national character set name
ARRAY 1024000 yes size of array fetch buffer
PROCESS 1 yes number of concurrent scan process
MAXBLOCKS split table if block size exceed MAXBLOCKS
CAPTURE N capture convertible data
SUPPRESS maximum number of exceptions logged for each table
FEEDBACK report progress every N rows
BOUNDARIES list of column size boundaries for summary report
LASTRPT N generate report of the last database scan
LOG scan base file name of report files
PARFILE parameter file name
PRESERVE N preserve existing scan results
LCSD N no enable language and character set detection
LCSDDATA LOSSY no define the scope of the detection
HELP N show help screen (this screen)
QUERY N select clause to scan subset of tables or columns
Scanner terminated successfully.
Linux系统下运行CSSCAN工具可能会遇到csscan: error while loading shared libraries: libclntsh.so.11.1: cannot open shared object file: No such file or directory错误,可以用下面的方法解决:
[oracle@dbserver2 bin]$ env|grep LD_LIB
LD_LIBRARY_PATH=/etc/emc/rsa/cst/lib
[oracle@dbserver2 bin]$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
当然,也可以利用数据字典生成脚本直接修改所有表的CHAR类型的字段长度为原值的1.5倍,例如:
创建spool.sql文件,内容如下:
set heading off
set feedback OFF
set echo off
set pages 1000 lines 120
spool d:\length.sql
select 'alter table '||table_name||' modify'||'('||COLUMN_NAME||' '||data_type||'('||round(DATA_LENGTH*1.5)||'));' from user_tab_columns where data_type like '%CHAR%';
spool off
在SQL*PLUS里执行
@d:\spool.sql
这样生成的length.sql文件内容如下:
alter table T_SPECIAL modify(CLASS_NUM VARCHAR2(45));
alter table T_SPECIAL modify(CLASS_NUM_A VARCHAR2(45));
alter table T_FONDS_INFO_RAW modify(FOND_ID VARCHAR2(15));
alter table T_FONDS_INFO_RAW modify(FOND_CODE VARCHAR2(60));
直接在SQL*PLUS里面运行D:\length.sql即可讲所有表的%CHAR%类型的字段长度都增大1.5倍。