Systems Engineering and RDBMS

Archive for the ‘Unix’ Category

Learning Hadoop

Posted by decipherinfosys on March 20, 2012

In a recently concluded project, we had the opportunity to work with Hadoop.  There was a learning curve since none of us had worked with it before.  Here are some URLs to help you get started with your learning process in this regard:

Basics of Hadoop:

The article on gigaom or the series of articles on cloudera’s site will get you started:

http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop/

http://www.cloudera.com/what-is-hadoop/

Sign up with Cloudera and you will have access to a lot of very good learning material on Hadoop.  For example:

http://www.cloudera.com/resource/introduction-to-apache-mapreduce-and-hdfs/  is a good starter’s video on MapReduce and HDFS.

or this one: http://www.cloudera.com/resource/apache-hadoop-ecosystem/ for understanding the Hadoop ecosystem.

And this whitepaper from Gartner on Hadoop and MapReduce for Big Data Analytics:

http://info.cloudera.com/GartnerReportHadoopJanuary2011.html

If you prefer text to video tutorials for your learning, here is a good chapter on HDFS: http://www.aosabook.org/en/hdfs.html

Setting up Hadoop cluster:

And once you are ready to jump in, there are some excellent tutorials by Michael G. Noll to guide you:

To set up your first Hadoop node: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

And then multiple node cluster: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

And here are some additional good tutorial references: http://www.delicious.com/jhofman/tutorials+hadoop

Microsoft and BigData

Recently, MSFT also announced its support for Apache Hadoop.  You can read more about MSFT’s big data solution here:

http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data-solution.aspx

and the work done by HortonWorks for extending Apache Hadoop to Windows:

http://hortonworks.com/blog/extending-apache-hadoop-to-millions-of-new-microsoft-users/

Posted in Big Data, Linux, SQL Server, Unix, Windows | 1 Comment »

High Availability and Disaster Recovery

Posted by decipherinfosys on January 28, 2011

These two terms are often used interchangeably, but they mean different things.

High Availability (HA) typically refers to solutions that use fault tolerance and/or load balancing to keep applications continuously available.  HA is essentially the ability to continue operations when a component fails, whether that is a CPU, memory, a disk, or a complete server.  With HA, there is usually no loss of service.  Clustering servers together is a form of HA; so are redundant power supplies, redundant network controllers, proper RAID arrays, and proper load balancing.  The primary goal of an HA environment is uptime: providing continuous service without disruption.

Disaster Recovery (DR) also provides increased availability, but it is the process/ability to restore the operations that are critical for the business after a disaster, whether human induced or natural (a power failure at the production site, floods, an earthquake, a hurricane, etc.).  The key difference between DR and HA is recovery time: recovery in a DR scenario typically takes longer than in an HA scenario.  With DR, there is a small loss of service while the DR systems are activated and take over the load at the DR site.

Here are some posts/articles which delve into these differences in more detail:

http://www.channeldb2.com/profiles/blogs/disaster-recovery-high

http://nexus.realtimepublishers.com/sgudb.php

http://www.drj.com/2010-articles/online-exclusive/understanding-high-availability-and-disaster-recovery-in-your-overall-recovery-strategy.html

http://gestaltit.com/all/tech/storage/bas/define-high-availability-disaster-recovery/

Posted in DB2 LUW, Oracle, SQL Server, Technology, Unix | 1 Comment »

NMON (Nigel’s monitor) Analyzer for AIX

Posted by decipherinfosys on August 6, 2009

Nmon is a free tool available for both AIX and Linux to monitor server performance in terms of I/O, CPU usage, top processes, etc. It is bundled with AIX and is available for free on Linux. It is widely used by AIX system administrators and performance tuning specialists.

Nmon has a very small footprint when in use but it captures very useful information.  Normally it is located under the /usr/local/bin directory.

┌─nmon────────U=Top-with-WLM─────Host=xxxxx────────Refresh=1 secs───18:00.26─┐
│ CPU-Utilisation-Small-View ──────────────────────────────────────────────────│
│                           0----------25-----------50----------75----------100│
│CPU User%  Sys% Wait% Idle%|           |            |           |            |│
│  0   0.0   0.0   0.0 100.0|        >                                        |│
│  1   0.0   0.0   0.0 100.0|>                                                |│
│  2   4.0   1.0   0.0  95.0|U         >                                      |│
│  3   0.0   0.0   0.0 100.0|     >                                           |│
│  4   4.0   1.0   0.0  95.0|U         >                                      |│
│  5   0.0   0.0   0.0 100.0|>                                                |│
│Physical Averages          +-----------|------------|-----------|------------+│
│All   1.8   0.6   0.0  97.7|>                                                |│
│                           +-----------|------------|-----------|------------+│

Above is a small fragment of a screenshot of nmon at work. It can be scheduled as a cron job and its output can be sent to a file for later diagnostics.
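For scheduled collection, nmon has a recording mode that writes samples to a file instead of drawing the interactive screen. A crontab entry along these lines is a common setup (a sketch: the -f, -s, -c and -m flags are nmon's file-output, interval, count and output-directory options, and /var/perf is just an example path):

```shell
# Every midnight: capture one sample every 60 seconds, 1440 samples (24 hours),
# writing a hostname_date_time.nmon file under /var/perf (example path).
0 0 * * * /usr/local/bin/nmon -f -s 60 -c 1440 -m /var/perf
```

The resulting .nmon file is exactly the kind of output file the analyzer takes as input.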

With this brief introduction to nmon out of the way, let us talk about nmon analyzer, which is the topic of this blog post. This is another free tool from IBM, which consolidates data from nmon output and presents it in very user friendly graphs and charts. Basically, it takes the output files generated by the nmon tool as input and churns out various graphs/charts in an Excel file which one can print, mail, or even publish on the web.

There is only one caveat: IBM does not officially support the tool, so one cannot seek any help from IBM. Here is the link from which it can be downloaded for free. The tool is downloaded as a zip file that contains the Excel file, a sample input file, and the user documentation.

Resources:

  • Wikipedia entry – here.
  • Article – here.
  • Article – here.

Posted in Unix | 1 Comment »

Getting previous date in korn shell on AIX (ksh)

Posted by decipherinfosys on July 28, 2009

Recently, at one of our client sites, we had to write a shell script which runs every night and goes through the database errors from the previous day’s logs.  There are different ways of manipulating dates in the Korn shell itself, but for all of them we have to write the logic ourselves. Since we have to go through the previous day’s files, we have to be extra careful with range conditions. For example, on March 1st we have to process data for February 28th or 29th depending on the leap year; on January 1st we have to process data for December 31st of the previous year, and so on.

Instead of doing date arithmetic in the Korn shell, we decided to use the Oracle database’s date logic, since the system date (sysdate) can be manipulated easily there. The following shell script shows three methods of obtaining the previous day’s date. Please copy the following contents into a file on the AIX box, save it as test.sh, and execute the script. Before execution, please make sure that you have the proper privileges to execute the script.

#!/bin/ksh

DB_USER=${1}
DB_PSWD=${2}
DB_NAME=${3}

############
#Method 1:
############
v_prev_date=`TZ=bb24 date +%Y%m%d`
echo 'Yesterday --> '$v_prev_date

##############################################
# Method 2:
# one way of getting system date from Oracle
# and pass it to shell script variable.
##############################################
SQLQUERY="select to_char(sysdate - 1,'YYYYMMDD'),to_char(sysdate,'YYYYMMDD') from dual"
print "
set pagesize 0;
set feedback off;
set heading off;
$SQLQUERY;
"| sqlplus -S $DB_USER/$DB_PSWD@$DB_NAME > TEMP_FILE

v_prev_date=`cat TEMP_FILE | awk '{print $1}'`
v_curr_date=`cat TEMP_FILE | awk '{print $2}'`
print 'Method 2 --> '$v_prev_date
print 'Method 2 --> '$v_curr_date

###################################################
# Method 3:
# Alternate way of getting system date from Oracle
# and pass it to shell script variable.
###################################################
var1=`sqlplus -S $DB_USER/$DB_PSWD@$DB_NAME << EOF
set echo off term off feed off ver off pages 0
select to_char(sysdate - 1,'YYYYMMDD'), to_char(sysdate,'YYYYMMDD') from dual;
exit;
EOF`

v_prev_date=`echo $var1 | awk '{print $1}'`
v_curr_date=`echo $var1 | awk '{print $2}'`

print 'Method 3 --> '$v_prev_date
print 'Method 3 --> '$v_curr_date

Upon executing the above shell script, we get the following results.

./test.sh scott tiger orcl

Result is

Yesterday --> 20090726
Method 2 --> 20090726
Method 2 --> 20090727
Method 3 --> 20090726
Method 3 --> 20090727

To produce the first line of the output, we are using AIX’s date command along with the TZ (timezone) variable. The 24 is the offset, in hours, used to get the previous date, and bb is just a string; it can be any string, so instead of ‘bb’ one can use ‘aa’ or ‘deci’ etc. If we use 48 instead of 24, it displays the day before yesterday’s date.
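The same trick can be tried on its own, with one portability caveat worth stating as an assumption: some C libraries (glibc on Linux, for instance) require the timezone name in TZ to be at least three characters, so a three-letter name such as "aaa" is a safer choice than "bb" outside AIX:

```shell
#!/bin/sh
# A POSIX TZ value of the form NAMEhh means "hh hours west of UTC",
# so an offset of 24 shifts the clock back a full day relative to UTC.
yesterday=`TZ=aaa24 date +%Y%m%d`
echo "yesterday --> $yesterday"
```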

In method 2, the output of the SQL is written to a file (TEMP_FILE) and then we read the file to get both the previous and the current date.

In method 3, instead of writing the output to a file, we assign it straight to a shell variable and then use the awk command on that variable to get the desired result. Since we are relying on the database’s date arithmetic, every edge case is taken care of: be it January 1st, 2010 or March 1st, 2008, we get the correct results of 20091231 and 20080229 respectively.

This shell script also serves as an example of how we can assign SQL*Plus output to a shell variable. In most scenarios, we pass a shell variable to a SQL script, but in this case it is the reverse.
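The capture-and-parse pattern itself does not depend on Oracle at all. Stripped to its essentials, with a literal string standing in for the single line sqlplus would print, it looks like this:

```shell
#!/bin/sh
# The two dates below stand in for the line sqlplus would have printed.
var1="20090726 20090727"
# Pull individual fields out of the captured output with awk.
v_prev_date=`echo $var1 | awk '{print $1}'`
v_curr_date=`echo $var1 | awk '{print $2}'`
echo "prev=$v_prev_date curr=$v_curr_date"
```

This prints prev=20090726 curr=20090727, the same extraction the script above performs on real sqlplus output.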

Resources:

  • Article – here.
  • Unix.com article – here.

Posted in Oracle, Unix | 4 Comments »

Creating an empty file in Unix and Windows

Posted by decipherinfosys on February 15, 2009

Recently at a client site, we came across a situation where we had to create empty files of a specific size on Windows.  We had to do this for R&D purposes in order to mimic ASM in a Windows environment without actually having different disks.

Windows has a command line utility called fsutil.exe. It is mainly used for maintaining file and disk properties and is an advanced utility, but one of its finer uses is creating an empty file of a specific size. Following is the command to create an empty file.

C:\>fsutil file createnew c:\ttt.txt 2048
File c:\ttt.txt is created

C:\>dir ttt.txt
Volume in drive C is C-Drive
Volume Serial Number is 2482-1E9C

Directory of C:\

02/12/2009 05:10 PM 2,048 ttt.txt

Similarly, in Unix there is the touch command. The touch command updates the access and modification times of each file specified as a command line argument. If a time value is not specified, touch uses the current time. If we specify a file that does not exist, touch creates that file with the current date and time unless the -c flag is specified; with -c, it will not create the file if it does not exist. The following command shows that.

$ touch -c ttt.txt
$ ls -l ttt.txt
ls: 0653-341 The file ttt.txt does not exist.

Now let us issue the same command without the -c flag.

$ touch ttt.txt
$ ls -l ttt.txt
-rw-r--r-- 1 b01234 uga 0 Feb 13 17:19 ttt.txt
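Note that touch only creates zero-byte files. For the Unix counterpart of the fsutil example above, a file of a specific size, dd is the usual tool (the output path here is just an example):

```shell
#!/bin/sh
# Write 2 blocks of 1024 zero bytes each, giving a 2,048 byte file,
# matching the size used in the fsutil example above.
dd if=/dev/zero of=/tmp/ttt.dat bs=1024 count=2 2>/dev/null
ls -l /tmp/ttt.dat
```

On systems that support it, a sparse file of the same apparent size can be reserved without writing the blocks, e.g. with `dd if=/dev/zero of=/tmp/ttt.dat bs=1 count=0 seek=2048`.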

Posted in Linux, Unix, Windows | 1 Comment »

Back to the basics: Frequently used VI editor commands

Posted by decipherinfosys on April 4, 2008

VI is the most popular editor on UNIX. Today we will recap some of the most frequently used VI editor commands. To edit a file using VI, use vi filename. Following is the list of frequently used VI commands.

Single character commands:

a => append after the cursor
A => append at the end of the line
b => move back one word at a time.
B => move back one word, treating the whole whitespace-delimited word as one unit.
C => change from the cursor to the end of the line.
e => move the cursor to the end of the next word.
G => takes to the end of the file.
h => move one position on left
H => takes you to the first line or top of the display screen. If preceded by a number, it takes you to that line from the top. e.g. 5H takes you to the 5th line from the top.
i => insert from current position or before cursor.
I => insert at the beginning of the line.
j => move one line down.
J => append the next line to the current line on which cursor is positioned.
k => move one line up.
l => move one position on right.
L => takes to last line or bottom of display screen.
m => set a mark. e.g. ma will mark the current position as a. Similarly, we can use mb or mc etc. to mark different positions. Marked positions can be recalled from anywhere in the file using 'a or 'b or 'c. (' is the single quote key to the left of the <ENTER> key.)
n => repeat last search in forward direction. If you reach the end of the file, then search will again start from the beginning of the file.
N => repeat last search in backward direction. If you reach the beginning of the file, search will again start from the end of the file.
o => open the new line after current line.
O => open the new line before current line.
p => insert from buffer. (yanking ‘yy’ will put lines in buffer)
r => replace current character where cursor is positioned. It can be replaced by other single character.
R => replace characters from current position onward, one character at a time until ‘Esc’ key is pressed.
s => substitute current character. Substitution could be one or more characters.
S => substitute entire line.
u => undo previous change.
U => undo all the changes made to the current line.
w => move one word forward.
x => delete current character.
{ => move back one paragraph. A paragraph is the text between two blank lines.
} => move forward one paragraph
% => match the parenthesis
~ => reverse the case (if upper then do it lower and vice versa).
$ => move to last column.
. => repeat the last command.
0 => move to beginning of the line.
Esc => exit out of the edit mode.

Double character commands:

dd => delete the current line. If number precedes dd, then those many lines will be deleted. e.g. 10dd will delete 10 lines from current position.
cc => change entire line
yy => yank/copy the line in buffer. It can be placed in the file using p option. If number precedes yy, then those many lines will be copied in buffer. e.g. 10yy will copy 10 lines in buffer, which can be put into file anywhere using p.
cw => change word.
ce => change to the end of the current word.
ZZ => save changes and exit the file.

Commands with <Ctrl> Key:

Ctrl-l => refresh screen
Ctrl-b => move one screen backward.
Ctrl-f => move one screen forward.
Ctrl-g => status of the file.

These are some of the basic and frequently used commands in the VI editor. For a full reference sheet of VI commands, you can just google "vi commands" and you will find a lot of good, concise information.

Posted in Unix | 1 Comment »

Putting files on a remote Server (Oracle/Unix)

Posted by decipherinfosys on October 15, 2007

Sometimes, when working with an Oracle database, we need to put files on a remote server to be picked up by some other process, or we have to ftp file(s) for backups. One can transfer a file manually using ftp, either to put it on the remote server or to get it from there, but a manual process has the disadvantage of requiring human intervention. We can automate the process and schedule it as a cron job so that the script is invoked at a specific time and the file transfer takes place automatically. Following is a small shell script which automates putting a file on a remote server.  Cut and paste the following text on a UNIX machine and save it as a shell file (.sh extension).

#!/bin/sh
ftp_server="10.80.10.27"
exp_dir="/home/oracle/decipher"

ftp -n ${ftp_server} << EOF
user ftpuser ftpuser\*
cd /customerdb/ora_bkup
lcd ${exp_dir}
bin
put decipher_backup.dmp
EOF

•    The first line indicates that we are using the Bourne shell, the default shell on UNIX. A shell is simply a command interpreter: it reads commands from the keyboard and executes them. When more than one command is put together in a file, it is called a shell script; executing the script executes all the commands within it.
•    On the second and third lines, we declare variables and assign values to them.
•    On the fourth line, we invoke an ftp connection with the -n option.  The -n option turns off automatic login, so the ftp client will not immediately prompt for a userid and password to connect to the remote server. Without it, ftp tries to log in immediately and prompts for them. '<<' is redirection syntax and indicates that everything up to the terminator is treated as input to the ftp command.
•    On the fifth line, we give the credentials to log in to the remote server by providing the userid and password.
•    On the sixth line, we change to the directory where we want to put the file.
•    On the seventh line, we use the lcd (local change directory) command to change the directory on the local machine from which we would like to transfer the file.
•    On the eighth line, we change the transfer mode to binary. The default is ascii mode; if the file is a zip file or an export dump file, it is advisable to transfer it in binary mode, otherwise it may get corrupted during transfer.
•    On the ninth line, we use the put command to put the file on the remote server.
•    The tenth line is EOF, which ends the redirection; any command in the script after this point is not treated as ftp input and is executed normally.

In similar fashion, one can get a file from the remote server using the get command. The script can be enhanced further to check for the success or failure of the ftp process as well.
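As a sketch of that enhancement (the log file name and the reply-code handling here are assumptions, not part of the original script): capture the ftp session's output to a file and scan it for 4xx/5xx reply codes, which in the FTP protocol signal transient and permanent failures respectively.

```shell
#!/bin/sh
# Scan a captured ftp session log for 4xx/5xx server reply codes.
check_ftp_log() {
    log_file="$1"
    if grep -Eq '^[45][0-9][0-9] ' "$log_file"; then
        echo "FTP FAILED"
        return 1
    fi
    echo "FTP OK"
    return 0
}

# Example: a log containing the usual "226 Transfer complete." reply passes.
printf '226 Transfer complete.\n' > /tmp/ftp_good.log
check_ftp_log /tmp/ftp_good.log
```

In the original script this would mean redirecting the heredoc's output, e.g. `ftp -n ${ftp_server} > ftp.log 2>&1 << EOF`, and calling the check afterwards.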

Posted in Oracle, Unix | Leave a Comment »

Sharing Files between Windows and Linux/Unix machines

Posted by decipherinfosys on June 10, 2007

It can get quite frustrating at times when you want to access files on the linux/unix machines which host your Oracle or DB2 LUW databases. MSFT offers a simple solution via a program suite called Services for Unix (SFU). Samba also has a suite of tools for UNIX systems for sharing files over the network with MSFT Windows. Here are a couple of resources to help you sort out file sharing issues:

http://samba.org --> Samba's web-site and a lot of very good information and tools.

http://support.microsoft.com/kb/304040 --> Configuring File Sharing in Windows XP

Windows Services for Unix download – here and the homepage for SFU information – here.

Posted in Linux, Unix, Windows | Leave a Comment »

Some useful Unix commands

Posted by decipherinfosys on February 23, 2007

Here is a cheat-sheet for the Unix commands to find swap, RAM and OS version on different Unix and Linux versions:
To Find Swap, RAM, and OS Version

OS        SWAP                  RAM                                        OS VERSION
-------------------------------------------------------------------------------------
AIX       /usr/sbin/lsps -a     /usr/sbin/lsattr -HE -l sys0 -a realmem    oslevel
HP        swapinfo -q           dmesg | grep -i mem                        uname -a
Tru64     swapon -s             vmstat -P                                  /usr/sbin/sizer -v
Solaris   swap -s               /usr/sbin/prtconf | grep -i memory         uname -r
Linux     free                  free                                       uname -a
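As a Linux-only illustration of the last row: free(1) gets its numbers from /proc/meminfo, so the same totals can be pulled directly from there.

```shell
#!/bin/sh
# Linux-specific: total RAM and swap straight from /proc/meminfo,
# which is the same source free(1) reads.
grep -E '^(MemTotal|SwapTotal):' /proc/meminfo
```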

Posted in Linux, Unix | Leave a Comment »

Detecting CPU Bottleneck

Posted by decipherinfosys on February 22, 2007

When do you categorize a server as running into a CPU bottleneck?  And how do you go about collecting that information on Windows and Unix systems?  The yardstick that is commonly used is this: if an individual user process is consuming greater than 10% of the CPU, there is a need to investigate further.  Also, if the cumulative CPU usage is consistently more than 80%, that too indicates a CPU bottleneck. If no individual process is using a lot of CPU, then you are trying to put too much work on the server, and should either upgrade to faster processors or add more processing power. Here is what you can do to automatically capture this information on different operating systems:

HP/UX or AIX or Linux or Solaris:

"sar -o <filename> 60 10 >/dev/null 2>&1 &" will sample the CPU activity every minute for 10 minutes and save the output to a file.  If you are running into issues, reduce the time-frame for the data collection.

If user+nice+system > 80% for an extended period of time, then you have a CPU bottleneck.
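That 80% check can be automated once the sar report is saved as text. A small sketch, assuming the common sar layout where the last column of the "Average" line is %idle (column positions vary by platform, so adjust the awk field if needed):

```shell
#!/bin/sh
# Flag a CPU bottleneck from a saved sar text report:
# busy = 100 - %idle, where %idle is the last field of the Average line.
check_sar_report() {
    awk '/^Average/ {
        busy = 100 - $NF
        if (busy > 80) print "BOTTLENECK busy=" busy "%"
        else           print "OK busy=" busy "%"
    }' "$1"
}

# Example with a two-line mock report (97% busy):
printf 'CPU %%usr %%sys %%idle\nAverage 90 7 3\n' > /tmp/sar.txt
check_sar_report /tmp/sar.txt
# prints: BOTTLENECK busy=97%
```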

AIX specific:

"vmstat 10 500" will display system CPU usage every 10 seconds. If id=0 or us+sy>80 for an extended period of time, this indicates that the system is CPU-bound.

Solaris specific:

"top" will show you the top processes by CPU activity

Note: You may have to install SUNWaccu and SUNWaccr packages if sar is not installed.

Windows:

a) Use the Task Manager and select the Processes tab.
b) Then click the "CPU" column title to sort the processes by CPU usage.

A better approach, though, is to use perfmon (Performance Monitor) and schedule it to collect the data:

c) Type perfmon from the command line (Start/Run, then perfmon, and Enter).
d) Then click on the "+" symbol and add "% Processor Time" for each of the processors on your system.  You can save this as a .blg file and then do further analysis on it later on.

Posted in Linux, Unix, Windows | Leave a Comment »