Top DBA Shell Scripts for Monitoring the Database

8 08 2008

As I was browsing the internet, I found this article and script collection of Daniel T. Liu http://www.dbazine.com/oracle/or-articles/liu2#instance

Please feel free to test his scripts and hope it could help you too in your day to day database administrations.
Introduction

This article focuses on the DBA’s daily responsibilities for monitoring Oracle databases and provides tips and techniques on how DBAs can turn their manual, reactive monitoring activities into a set of proactive shell scripts. The article first reviews some commonly used Unix commands by DBAs. It explains the Unix Cron jobs that are used as part of the scheduling mechanism to execute DBA scripts. The article covers eight important scripts for monitoring Oracle database:

+ Check instance availability
+ Check listener availability
+ Check alert log files for error messages
+ Clean up old log files before log destination gets filled
+ Analyze tables and indexes for better performance
+ Check tablespace usage
+ Find out invalid objects
+ Monitor users and transactions

UNIX Basics for the DBA
Basic UNIX Command

The following is a list of commonly used Unix command:

+ ps – Show process
+ grep – Search files for text patterns
+ mailx – Read or send mail
+ cat – Join files or display them
+ cut – Select columns for display
+ awk – Pattern-matching language
+ df – Show free disk space

Here are some examples of how the DBA uses these commands:

+ List available instances on a server:

$ ps -ef | grep smon
oracle 21832     1  0   Feb 24 ?       19:05 ora_smon_oradb1
oracle   898     1  0   Feb 15 ?        0:00 ora_smon_oradb2
dliu 25199 19038  0 10:48:57 pts/6    0:00 grep smon
oracle 27798     1  0 05:43:54 ?        0:00 ora_smon_oradb3
oracle 28781     1  0   Mar 03 ?        0:01 ora_smon_oradb4

+ List available listeners on a server:

$ ps -ef | grep listener | grep -v grep
oracle 23879    1  0   Feb 24 ?  33:36 /8.1.7/bin/tnslsnr listener_db1 -inherit
oracle 27939    1  0 05:44:02 ?  0:00  /8.1.7/bin/tnslsnr listener_db2 -inherit
oracle 23536    1  0   Feb 12 ?  4:19  /8.1.7/bin/tnslsnr listener_db3 -inherit
oracle 28891    1  0   Mar 03 ?  0:01  /8.1.7/bin/tnslsnr listener_db4 -inherit

+ Find out file system usage for Oracle archive destination:

$ df -k | grep oraarch
/dev/vx/dsk/proddg/oraarch 71123968 4754872 65850768  7%  /u09/oraarch

+ List number of lines in the alert.log file:

$ cat alert.log | wc -l
2984

+ List all Oracle error messages from the alert.log file:

$ grep ORA- alert.log
ORA-00600: internal error code, arguments: [kcrrrfswda.1], [], [], [], [], []
ORA-00600: internal error code, arguments: [1881], [25860496], [25857716], []

CRONTAB Basics

A crontab file is comprised of six fields:
Minute     0-59
Hour     0-23
Day of month     1-31
Month     1 – 12
Day of Week     0 – 6, with 0 = Sunday
Unix Command or Shell Scripts

+ To edit a crontab file, type:

Crontab -e

+ To view a crontab file, type:

Crontab -l
0  4 * * 5       /dba/admin/analyze_table.ksh
30 3  * * 3,6    /dba/admin/hotbackup.ksh /dev/null 2>&1

In the example above, the first entry shows that a script to analyze a table runs every Friday at 4:00 a.m. The second entry shows that a script to perform a hot backup runs every Wednesday and Saturday at 3:00 a.m.
Top DBA Shell Scripts for Monitoring the Database

The eight shell scripts provided below cover 90 percent of a DBA’s daily monitoring activities. You will need to modify the UNIX environment variables as appropriate.
Check Oracle Instance Availability

The oratab file lists all the databases on a server:

$ cat /var/opt/oracle/oratab
###################################################################
## /var/opt/oracle/oratab                                        ##
###################################################################
oradb1:/u01/app/oracle/product/8.1.7:Y
oradb2:/u01/app/oracle/product/8.1.7:Y
oradb3:/u01/app/oracle/product/8.1.7:N
oradb4:/u01/app/oracle/product/8.1.7:Y

The following script checks all the databases listed in the oratab file, and finds out the status (up or down) of databases:

###################################################################
## ckinstance.ksh ##
###################################################################
ORATAB=/var/opt/oracle/oratab
echo “`date`   “
echo  “Oracle Database(s) Status `hostname` :\n”

db=`egrep -i “:Y|:N” $ORATAB | cut -d”:” -f1 | grep -v “\#” | grep -v “\*”`
pslist=”`ps -ef | grep pmon`”
for i in $db ; do
echo  “$pslist” | grep  “ora_pmon_$i”  > /dev/null 2>$1
if (( $? )); then
echo “Oracle Instance – $i:       Down”
else
echo “Oracle Instance – $i:       Up”
fi
done

Use the following to make sure the script is executable:

$ chmod 744 ckinstance.ksh
$ ls -l ckinstance.ksh
-rwxr–r–   1 oracle     dba     657 Mar  5 22:59 ckinstance.ksh*

Here is an instance availability report:

$ ckinstance.ksh
Mon Mar  4 10:44:12 PST 2002
Oracle Database(s) Status for DBHOST server:
Oracle Instance – oradb1:   Up
Oracle Instance – oradb2:   Up
Oracle Instance – oradb3:   Down
Oracle Instance – oradb4:   Up

Check Oracle Listener’s Availability

A similar script checks for the Oracle listener. If the listener is down, the script will restart the listener:

#######################################################################
## cklsnr.sh                                                         ##
#######################################################################
#!/bin/ksh
DBALIST=”primary.dba@company.com,another.dba@company.com”;export DBALIST
cd /var/opt/oracle
rm -f lsnr.exist
ps -ef | grep mylsnr | grep -v grep  > lsnr.exist
if [ -s lsnr.exist ]
then
echo
else
echo “Alert” | mailx -s “Listener ‘mylsnr’ on `hostname` is down” $DBALIST
TNS_ADMIN=/var/opt/oracle; export TNS_ADMIN
ORACLE_SID=db1; export ORACLE_SID
ORAENV_ASK=NO; export ORAENV_ASK
PATH=$PATH:/bin:/usr/local/bin; export PATH
. oraenv
LD_LIBRARY_PATH=${ORACLE_HOME}/lib;export LD_LIBRARY_PATH
lsnrctl start mylsnr
fi

Check Alert Logs (ORA-XXXXX)

Some of the environment variables used by each script can be put into one profile:

#######################################################################
## oracle.profile ##
#######################################################################
EDITOR=vi;export EDITOR ORACLE_BASE=/u01/app/oracle; export
ORACLE_BASE ORACLE_HOME=$ORACLE_BASE/product/8.1.7; export
ORACLE_HOME LD_LIBRARY_PATH=$ORACLE_HOME/lib; export
LD_LIBRARY_PATH TNS_ADMIN=/var/opt/oracle;export
TNS_ADMIN NLS_LANG=american; export
NLS_LANG NLS_DATE_FORMAT=’Mon DD YYYY HH24:MI:SS’; export
NLS_DATE_FORMAT ORATAB=/var/opt/oracle/oratab;export
ORATAB PATH=$PATH:$ORACLE_HOME:$ORACLE_HOME/bin:/usr/ccs/bin:/bin:/usr/bin:/usr/sbin:/
sbin:/usr/openwin/bin:/opt/bin:.; export
PATH DBALIST=”primary.dba@company.com,another.dba@company.com”;export
DBALIST

The following script first calls oracle.profile to set up all the environment variables. The script also sends the DBA a warning e-mail if it finds any Oracle errors:

####################################################################
## ckalertlog.sh                                                  ##
####################################################################
#!/bin/ksh
. /etc/oracle.profile
for SID in `cat $ORACLE_HOME/sidlist`
do
cd $ORACLE_BASE/admin/$SID/bdump
if [ -f alert_${SID}.log ]
then
mv alert_${SID}.log alert_work.log
touch alert_${SID}.log
cat alert_work.log >> alert_${SID}.hist
grep ORA- alert_work.log > alert.err
fi
if [ `cat alert.err|wc -l` -gt 0 ]
then
mailx -s “${SID} ORACLE ALERT ERRORS” $DBALIST < alert.err
fi
rm -f alert.err
rm -f alert_work.log
done

Clean Up Old Archived Logs

The following script cleans up old archive logs if the log file system reaches 90-percent capacity:

$ df -k | grep arch
Filesystem                kbytes   used     avail    capacity  Mounted on
/dev/vx/dsk/proddg/archive 71123968 30210248 40594232   43%  /u08/archive

#######################################################################
## clean_arch.ksh                                                    ##
#######################################################################
#!/bin/ksh
df -k | grep arch > dfk.result
archive_filesystem=`awk  -F” “  ‘{ print $6 }’ dfk.result`
archive_capacity=`awk  -F” “  ‘{ print $5 }’ dfk.result`

if [[ $archive_capacity > 90% ] ]
then
echo “Filesystem ${archive_filesystem} is ${archive_capacity} filled”
# try one of the following option depend on your need
find $archive_filesystem -type f -mtime +2 -exec rm -r {} \;
tar
rman
fi

Analyze Tables and Indexes (for Better Performance)

Below, I have shown an example on how to pass parameters to a script:

####################################################################
## analyze_table.sh ##
####################################################################
#!/bin/ksh #
input parameter: 1: password # 2: SID if (($#<1)) then echo “Please enter
‘oracle’
user password as the first parameter !” exit 0 fi if (($#<2)) then echo
“Please enter
instance name as the second parameter!” exit 0 fi

To execute the script with parameters, type:

$ analyze_table.sh manager oradb1

The first part of script generates a file analyze.sql, which contains the syntax for analyzing table. The second part of script analyzes all the tables:

#####################################################################
## analyze_table.sh ##
#####################################################################
sqlplus -s <<!
oracle/$1@$2
set heading off
set feed off
set pagesize 200
set linesize 100
spool analyze_table.sql
select ‘ANALYZE TABLE ‘ || owner || ‘.’ || segment_name ||
‘ ESTIMATE STATISTICS SAMPLE 10 PERCENT;’
from dba_segments
where segment_type = ‘TABLE’
and owner not in (‘SYS’, ‘SYSTEM’);
spool off
exit
!
sqlplus -s <<!
oracle/$1@$2
@./analyze_table.sql
exit
!

Here is an example of analyze.sql:

$ cat analyze.sql
ANALYZE TABLE HIRWIN.JANUSAGE_SUMMARY ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE HIRWIN.JANUSER_PROFILE ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE APPSSYS.HIST_SYSTEM_ACTIVITY ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE HTOMEH.QUEST_IM_VERSION ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE JSTENZEL.HIST_SYS_ACT_0615 ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE JSTENZEL.HISTORY_SYSTEM_0614 ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE JSTENZEL.CALC_SUMMARY3 ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE IMON.QUEST_IM_LOCK_TREE ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE APPSSYS.HIST_USAGE_SUMMARY ESTIMATE STATISTICS SAMPLE 10 PERCENT;
ANALYZE TABLE PATROL.P$LOCKCONFLICTTX ESTIMATE STATISTICS SAMPLE 10 PERCENT;

Check Tablespace Usage

This scripts checks for tablespace usage. If tablespace is 10 percent free, it will send an alert e-mail.

#####################################################################
## ck_tbsp.sh ##
#####################################################################
#!/bin/ksh
sqlplus -s <<!
oracle/$1@$2
set feed off
set linesize 100
set pagesize 200
spool tablespace.alert
SELECT F.TABLESPACE_NAME,
TO_CHAR ((T.TOTAL_SPACE – F.FREE_SPACE),’999,999′) “USED (MB)”,
TO_CHAR (F.FREE_SPACE, ‘999,999′) “FREE (MB)”,
TO_CHAR (T.TOTAL_SPACE, ‘999,999′) “TOTAL (MB)”,
TO_CHAR ((ROUND ((F.FREE_SPACE/T.TOTAL_SPACE)*100)),’999′)||’ %’ PER_FREE
FROM   (
SELECT       TABLESPACE_NAME,
ROUND (SUM (BLOCKS*(SELECT VALUE/1024
FROM V\$PARAMETER
WHERE NAME = ‘db_block_size’)/1024)
) FREE_SPACE
FROM DBA_FREE_SPACE
GROUP BY TABLESPACE_NAME
) F,
(
SELECT TABLESPACE_NAME,
ROUND (SUM (BYTES/1048576)) TOTAL_SPACE
FROM DBA_DATA_FILES
GROUP BY TABLESPACE_NAME
) T
WHERE F.TABLESPACE_NAME = T.TABLESPACE_NAME
AND (ROUND ((F.FREE_SPACE/T.TOTAL_SPACE)*100)) < 10;
spool off
exit
!
if [ `cat tablespace.alert|wc -l` -gt 0 ]
then
cat tablespace.alert -l tablespace.alert > tablespace.tmp
mailx -s “TABLESPACE ALERT for ${2}” $DBALIST < tablespace.tmp
fi

An example of the alert mail output is as follows:

TABLESPACE_NAME     USED (MB)   FREE (MB)         TOTAL (MB)            PER_FREE
——————- ——— ———– ——————- ——————
SYSTEM              2,047             203               2,250                9 %
STBS01                302              25                327                 8 %
STBS02                241              11                252                 4 %
STBS03                233              19                252                 8 %

Find Out Invalid Database Objects

The following finds out invalid database objects:

#####################################################################
## invalid_object_alert.sh ##
#####################################################################
#!/bin/ksh
. /etc/oracle.profile
sqlplus -s <<!
oracle/$1@$2
set          feed off
set heading off
column object_name format a30
spool invalid_object.alert
SELECT  OWNER, OBJECT_NAME, OBJECT_TYPE, STATUS
FROM    DBA_OBJECTS
WHERE   STATUS = ‘INVALID’
ORDER BY OWNER, OBJECT_TYPE, OBJECT_NAME;
spool off
exit
!
if [ `cat invalid_object.alert|wc -l` -gt 0 ]
then
mailx -s “INVALID OBJECTS for ${2}” $DBALIST < invalid_object.alert
fi
$ cat invalid_object.alert

OWNER        OBJECT_NAME           OBJECT_TYPE          STATUS
———————————————————————-
HTOMEH       DBMS_SHARED_POOL            PACKAGE BODY          INVALID
HTOMEH       X_$KCBFWAIT                 VIEW                  INVALID
IMON         IW_MON                      PACKAGE               INVALID
IMON         IW_MON                      PACKAGE BODY          INVALID
IMON         IW_ARCHIVED_LOG             VIEW                  INVALID
IMON         IW_FILESTAT                 VIEW                  INVALID
IMON         IW_SQL_FULL_TEXT            VIEW                  INVALID
IMON         IW_SYSTEM_EVENT1            VIEW                  INVALID
IMON         IW_SYSTEM_EVENT_CAT         VIEW                  INVALID
LBAILEY      CHECK_TABLESPACE_USAGE      PROCEDURE             INVALID
PATROL       P$AUTO_EXTEND_TBSP          VIEW                  INVALID
SYS          DBMS_CRYPTO_TOOLKIT         PACKAGE               INVALID
SYS          DBMS_CRYPTO_TOOLKIT         PACKAGE BODY          INVALID
SYS          UPGRADE_SYSTEM_TYPES_TO_816 PROCEDURE             INVALID
SYS          AQ$_DEQUEUE_HISTORY_T       TYPE                  INVALID
SYS          HS_CLASS_CAPS               VIEW                  INVALID
SYS          HS_CLASS_DD                 VIEW                  INVALID

Monitor Users and Transactions (Dead Locks, et al)

This script sends out an alert e-mail if dead lock occurs:

###################################################################
## deadlock_alert.sh ##
###################################################################
#!/bin/ksh
. /etc/oracle.profile
sqlplus -s <<!
oracle/$1@$2
set feed off
set heading off
spool deadlock.alert
SELECT   SID, DECODE(BLOCK, 0, ‘NO’, ‘YES’ ) BLOCKER,
DECODE(REQUEST, 0, ‘NO’,'YES’ ) WAITER
FROM     V$LOCK
WHERE    REQUEST > 0 OR BLOCK > 0
ORDER BY block DESC;
spool off
exit
!
if [ `cat deadlock.alert|wc -l` -gt 0 ]
then
mailx -s “DEADLOCK ALERT for ${2}” $DBALIST < deadlock.alert
fi





Tracking oracle Logons

1 08 2008

There are a couple ways of tracking logon and logoff information within an Oracle database, however few offer the simplicity and flexibility of the use of an logon and logoff trigger. Here is how I built a simple set of triggers and accompanying table to track usernames, logon time, logoff time, machine connected from and program used.

I performed these steps as system. If you want to install these as another user you will need to assign the appropriate permissions first.

The first step is to create a table to store the information. This table may grow quickly if this is a very active system so we want to put it somewhere other than the system tablespace. For reference, mine currently has around 3000 records in it and is around 200k in size. Of course mileage may vary.

CREATE TABLE session_audit
(
user_id VARCHAR2(30),
session_id NUMBER,
host VARCHAR2(30),
program VARCHAR2(60),
logon_time DATE,
logoff_time DATE
)
TABLESPACE users;

This will be the table you query to determine login information. Of course you will probably want to purge data from here occasionally to keep it from getting too big.

Next we create the logon trigger which will populate all but the logoff_time of the session_audit table. This uses the sys_context function to lookup the host and session id (different from the SID) of the session. The program is retrieved by a subquery on the V$SESSION table.

CREATE OR REPLACE TRIGGER tr_session_audit_logon
AFTER LOGON ON DATABASE
DECLARE
session_id number;
BEGIN
select sys_context(’USERENV’,'SESSIONID’) into session_id from dual;
IF session_id != 0 –ignore internal connections
THEN
BEGIN
INSERT INTO session_audit (
user_id,
session_id,
host,
program,
logon_time,
logoff_time
)
VALUES(
user,
session_id,
sys_context(’USERENV’,'HOST’),
(SELECT program FROM v$session
WHERE sys_context(’USERENV’,'SESSIONID’) = AUDSID),
sysdate,
NULL
);
END;
END IF;
END;

UPDATE: I have found that the old version of this resulted in an ORA-1427 error when someone made an internal connection. The problem also came up when dbms_jobs were run. The code now ignores internal connections (where the session id is 0).

Finally we create the logoff trigger to fill in the logoff_time.

CREATE OR REPLACE TRIGGER tr_session_audit_logoff
BEFORE LOGOFF ON DATABASE
BEGIN
UPDATE session_audit
SET logoff_time = sysdate
WHERE sys_context(’USERENV’,'SESSIONID’) = session_id;
END;

That’s it. You can now search the session_audit table for logins durring a time window or all logins for a specific user. You could even check who is using a certain application.

I mentioned one possible issue, that you don’t want the session_audit table to fill up your system tablespace, however it is also important to mention that if your logon trigger fails for some reason people will not be able to log in, so watch the space, no matter where you put it.

Donald Burleson (who desperately needs a new portrait for his website) of Burleson Consulting offers these instructions for a similar trigger, however he is grabbing more information and fills most of it in at the end.

oracle, sql, dba, database administration, database development, database security, database, oracle security

  1. Gary Says:
    November 9th, 2005 at 2:38 pm Very nice! When I set this up on my test server I found that logins from network machines that weren’t “hosts” did not populate the host column. I modified the login trigger as follows to catch the v$session.terminal field if sys_context(’USERENV’,’HOST’) is null
    since both fields are char(30).


    case
    when sys_context(’USERENV’,’HOST’) is not null then sys_context(’USERENV’,’HOST’)
    else
    (SELECT TERMINAL FROM v$session
    WHERE sys_context(’USERENV’,’SESSIONID’) = AUDSID)
    end,

    Thank you.