Monday, September 13, 2010

Running a perl program

Explicitly invoking the Perl compiler/interpreter
(note that >>> is only the shell prompt)
>>>perl program.pl
Many command-line options are available here, e.g.
-w print all warnings
-c check but do not execute
-d run under debugger control

Implicitly invoking the Perl compiler/interpreter
This is system specific. On Unix, for example, put the interpreter path on the first line of the script:
#!/usr/bin/perl
print "Please enter your name: ";
This is the popular shebang notation. You can also add options to it:
#!/usr/bin/perl -w
which turns on warnings, just as the command-line option does.

More operators in perl

x         The repetition operator. Returns a string consisting of the left operand repeated the number of times specified by the right operand. In an array context, if the left operand is a list in parens, it repeats the list.

print '-' x 80; # print row of dashes
print '-' x80; # avoid: without the space, older Perls read x80 as an identifier
print "\t" x ($tab/8), ' ' x ($tab%8); # tab over
@ones = (1) x 80; # an array of 80 1's
@ones = (5) x @ones; # set all elements to 5

x=        The repetition assignment operator. Only works on scalars.
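
As a quick check of the repetition operators described above, here is a minimal sketch. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $rule  = '-' x 20;    # a string of 20 dashes
my @zeros = (0) x 5;     # list repetition: five zeros
my $line  = 'ab';
$line x= 3;              # same as $line = $line x 3

print "$rule\n";
print "@zeros\n";
print "$line\n";
```

Note the parentheses around (0): without them the left operand is a plain scalar and x builds a string instead of a list.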

..          The range operator, which is really two different operators depending on the context. In a list (array) context, it returns a list of values counting (by ones) from the left value to the right value. This is useful for writing "for (1..10)" loops and for doing slice operations on arrays.

In a scalar context, .. returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each .. operator maintains its own boolean state. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. (It doesn't become false till the next time the range operator is evaluated. It can test the right operand and become false on the same evaluation it became true (as in awk), but it still returns true once. If you don't want it to test the right operand till the next evaluation (as in sed), use three dots (...) instead of two.)

The right operand is not evaluated while the operator is in the "false" state, and the left operand is not evaluated while the operator is in the "true" state. The precedence is a little lower than || and &&.

The value returned is either the null string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string 'E0' appended to it, which doesn't affect its numeric value, but gives you something to search for if you want to exclude the endpoint. You can exclude the beginning point by waiting for the sequence number to be greater than 1. If either operand of scalar .. is static (a constant), that operand is implicitly compared to the $. variable, the current line number.



Examples:

As a scalar operator:

if (101 .. 200) { print; } # print 2nd hundred lines
next line if (1 .. /^$/); # skip header lines
s/^/> / if (/^$/ .. eof()); # quote body

As an array operator:

for (101 .. 200) { print; } # print $_ 100 times
@foo = @foo[$[ .. $#foo]; # an expensive no-op
@foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items
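
The scalar (flip-flop) form is easier to follow with a small runnable sketch. The subroutine names marked_block and marked_body and the BEGIN/END markers are made up for this illustration. Both operands are patterns, so nothing is compared against $. here.

```perl
use strict;
use warnings;

# Return the lines from a BEGIN marker through an END marker, inclusive.
sub marked_block {
    my @out;
    for (@_) {
        push @out, $_ if /^BEGIN$/ .. /^END$/;
    }
    return @out;
}

# Same range, but drop both marker lines by looking at the sequence number.
sub marked_body {
    my @out;
    for (@_) {
        my $seq = /^BEGIN$/ .. /^END$/;   # scalar context: flip-flop
        next unless $seq;                 # outside the range
        next if $seq == 1;                # first line of the range
        next if $seq =~ /E0$/;            # last line of the range
        push @out, $_;
    }
    return @out;
}

my @lines = qw(a BEGIN b c END d);
print join(',', marked_block(@lines)), "\n";
print join(',', marked_body(@lines)), "\n";
```

Each .. in the source keeps its own state, so the two subroutines do not interfere with each other.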

Comparison operators in perl


cmp    String comparison, returning -1, 0, or 1.
<=>    Numeric comparison, returning -1, 0, or 1. (It looks like a flying saucer.)

 =~       Certain operations search or modify the string "$_" by default. This operator makes that kind of operation work on some other string. The right argument is a search pattern, substitution, or translation. The left argument is what is supposed to be searched, substituted, or translated instead of the default "$_". The return value indicates the success of the operation. (If the right argument is an expression other than a search pattern, substitution, or translation, it is interpreted as a search pattern at run time. This is less efficient than an explicit search, since the pattern must be compiled every time the expression is evaluated.) The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.


!~       Just like =~ except the return value is negated.
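
Here is a small sketch of these operators in use. The data and variable names are invented for the example.

```perl
use strict;
use warnings;

my @nums  = sort { $a <=> $b } (10, 9, 100);   # numeric order
my @words = sort { $a cmp $b } (10, 9, 100);   # string order

my $text       = "Subject: hello";
my $is_subject = ($text =~ /^Subject:/) ? 1 : 0;   # match against $text, not $_
my $no_digits  = ($text !~ /\d/) ? 1 : 0;          # true when the pattern does NOT match
(my $upper = $text) =~ tr/a-z/A-Z/;                # translate a copy of $text

print "@nums\n@words\n$upper\n";
```

The two sorts give different orders because cmp compares character by character, so "100" sorts before "9".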

Operators in perl - bitwise operators

Bitwise operators
>> - Right shift: shifts the bits of the left operand to the right by the number of bits given by the right operand.


<< - Left shift: shifts the bits of the left operand to the left by the number of bits given by the right operand.

~ - Ones' complement (every bit is inverted)
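
A short sketch of the three operators on a small number. The mask 0xFF is my own addition, used only to keep the complement to eight bits so the result is easy to read.

```perl
use strict;
use warnings;

my $flags = 6;                 # binary 0110
my $right = $flags >> 1;       # binary 0011
my $left  = $flags << 2;       # binary 011000
my $inv   = ~$flags & 0xFF;    # complement, masked to the low 8 bits

printf "%d %d %d\n", $right, $left, $inv;
```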

Operators in perl - relational operators

Note that eq is for strings and == is for numbers. Here are the numeric operators and the corresponding string operators.


Comparison                 Numeric   String
Equal                      ==        eq
Not equal                  !=        ne
Greater than               >         gt
Less than                  <         lt
Greater than or equal to   >=        ge
Less than or equal to      <=        le

Logical operators
&&, and
||, or

Logical expression
The next few structures rely on a test being true or false. In Perl any non-zero number and non-empty string is counted as true. The number zero, zero by itself in a string, and the empty string are counted as false. Here are some tests on numbers and strings.

$a == $b   # Is $a numerically equal to $b?
           # Beware: Don't use the = operator, which assigns.
$a != $b   # Is $a numerically unequal to $b?
$a eq $b   # Is $a string-equal to $b?
$a ne $b   # Is $a string-unequal to $b?


You can also use logical and, or and not:

($a && $b)   # Are $a and $b both true?
($a || $b)   # Is either $a or $b true?
!($a)        # Is $a false?

Note that Perl is a loosely typed language: it lets you apply numeric operators to strings, which can produce surprises. Here is one example.
So far we have been testing numbers, but there is more to life than numbers. There are strings too, and these need testing too.
$name  = 'Mark';

$goodguy = 'Tony';

if ($name == $goodguy) {
print "Hello, Sir.\n";
} else {
print "Begone, evil peon!\n";
}
Something seems to have gone wrong here. Obviously Mark is different to Tony, so why does perl consider them equal?
Mark and Tony are equal -- numerically. Neither string looks like a number, so in a numeric comparison each one counts as 0, and 0 == 0. We should be testing them as strings, not as numbers. To do this, simply replace == with eq and everything will work as expected.
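
The difference can be seen side by side in this sketch. The result variables are my own names. The numeric warning is switched off inside the block only so the demonstration runs quietly; with -w left on, perl would complain that the arguments are not numeric, which is a useful hint in real code.

```perl
use strict;
use warnings;

my $name    = 'Mark';
my $goodguy = 'Tony';

my $same_as_numbers;
{
    no warnings 'numeric';    # silence "isn't numeric" for this demo only
    $same_as_numbers = ($name == $goodguy) ? 1 : 0;   # both count as 0
}
my $same_as_strings = ($name eq $goodguy) ? 1 : 0;    # compared as text

print "numeric: $same_as_numbers, string: $same_as_strings\n";
```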

Understanding 0 in perl

There are many Perl constructs which test for Truth. Some are if, while and unless. So it is important you know what truth is, as defined by Perl, not your tax forms. There are three main rules:
  1. Any string is true except for "" and "0".
  2. Any number is true except for 0. This includes negative numbers.
  3. Any undefined variable is false. An undefined variable is one which doesn't have a value, i.e. has not been assigned to.
Some example code to illustrate the point:
&isit;                   # $test1 is at this moment undefined

$test1="hello"; # a string, not equal to "" or "0"
&isit;

$test1=0.0; # $test1 is now a number, effectively 0
&isit;

$test1="0.0"; # $test1 is a string, but NOT effectively 0 !
&isit;

sub isit {
if ($test1) { # tests $test1 for truth or not
print "$test1 is true\n";
} else { # else statement if it is not true
print "$test1 is false\n";
}
}


The first test fails because $test1 is undefined. This means it has not been created by
assigning a value to it. So according to Rule 3 it is false. The last two tests are
interesting. Of course, 0.0 is the same as 0 in a numeric context. But it is not the
same as 0 in a string context, so in that case it is true.
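
The three rules can be checked in one go with a small table of values. This is only a sketch; is_true is a helper name invented here.

```perl
use strict;
use warnings;

# Reduce any value to 1 or 0 according to Perl's idea of truth.
sub is_true {
    my ($value) = @_;
    return $value ? 1 : 0;
}

my @cases = (
    [ 'undef', undef ],
    [ '""',    ''    ],
    [ '"0"',   '0'   ],
    [ '0.0',   0.0   ],
    [ '"0.0"', '0.0' ],
    [ '"00"',  '00'  ],
    [ '-1',    -1    ],
);

for my $case (@cases) {
    my ($label, $value) = @$case;
    printf "%-6s is %s\n", $label, is_true($value) ? 'true' : 'false';
}
```

Only the exact strings "" and "0" are false, so "0.0" and "00" count as true even though they are numerically zero.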
So far we have been testing single variables. What's more useful is testing the result of an expression. For example, $x * 2 is an expression, and so is $var1 + $var2 . It is the end result of these expressions that is evaluated for truth.
An example demonstrates the point:
$x=5;
$y=5;

if ($x - $y) {
print '$x - $y is ',$x-$y," which is true\n";
} else {
print '$x - $y is ',$x-$y," which is false\n";
}
The test fails because 5-5 of course is 0, which is false. The print statement might look a little strange. Remember that print is a list operator? So we hand it a list. First item, a single-quoted string. It is single quoted because we do not want to perform variable interpolation on it. Next item is an expression which is evaluated, and the result printed. Finally, a double-quoted string is used because we want to print a newline, and without the double quotes the \n won't be interpreted as a newline.
What is probably more useful than testing a specific variable for truth is equality testing. For example, has your lucky number been drawn?
$lucky=15;
$drawnum=15;

if ($lucky == $drawnum) {
print "Congratulations!\n";
} else {
print "Guess who hasn't won!\n";
}
The important point about the above code is the equality operator, == .

if else and unless in perl


Perl provides the usual if/else conditional statement. It has the following form:

if ($a)
{
print "The string is not empty\n";
}
else
{
print "The string is empty\n";
}
For this, remember that an empty string is considered to be false. The code will also report "empty" if $a is the string "0", because that is false too.
Using ! and unless
A condition can be negated with the ! (not) operator.
Sometimes it is more readable to use unless instead of if (!...) . The switch-case statement familiar to C programmers is not available in Perl. You can simulate it in other ways. See the manual pages.
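
To show that the two spellings behave the same, here is a minimal sketch. The subroutine names are invented for the example.

```perl
use strict;
use warnings;

# Negated test written with if and the ! operator.
sub check_with_not {
    my ($value) = @_;
    if (!$value) {
        return 'empty';
    }
    return 'not empty';
}

# The same test written with unless.
sub check_with_unless {
    my ($value) = @_;
    unless ($value) {
        return 'empty';
    }
    return 'not empty';
}

for my $value ('', '0', 'abc') {
    print "'$value': ", check_with_not($value), ' / ', check_with_unless($value), "\n";
}
```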

elsif 

Note that elsif is spelled with only one e: it is not written elseif or else if.
if ($age > $max) {
print "Too old !\n";
} elsif ($age < $min) {
print "Too young !\n";
} else {
print "Just right !\n";
}

Escape character in perl

There's something else worth knowing about here: the backslash, \ . Placed before a $ in a double-quoted string, it 'escapes' the special meaning of $ .
Escaping means that just the $ symbol is printed instead of it referring to a variable.
Actually \ has a deeper meaning -- it escapes all of Perl's special characters, not just $ . Also, it turns some non-special characters into something special. Like what? Like n . Add the magic \ and the humble 'n' becomes the mighty NewLine! The \ character can also escape itself. So if you want to print a single \ try:
print "the MS-DOS path is c:\\scripts\\";
Oh, '\' is also used for other things like references.
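
A few escapes side by side, as a sketch. The variable names and the price value are made up.

```perl
use strict;
use warnings;

my $price = 5;

my $literal = "cost: \$price";   # the $ is escaped, so no interpolation
my $interp  = "cost: $price";    # the variable is interpolated
my $path    = "c:\\scripts\\";   # each \\ yields one backslash
my $single  = 'c:\scripts\\';    # in single quotes only \\ and \' are escapes
my $two     = "a\nb";            # \n is one newline character

print "$literal\n$interp\n$path\n";
```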

Datatypes in perl

Perl has 3 main datatypes - scalars, arrays and hashes which are prefixed with a symbol such as $ @ % when used as variables.

Scalars
So Perl is working, and you are working with Perl. Now for something more interesting than simple printing. Variables. Let's take simple scalar variables first. A scalar variable is a single value. Like $var=10 which sets the variable $var to the value of 10. Later, we'll look at lists like arrays and hashes, where @var refers to more than one value. For the moment, remember that Scalar is Singular.

Also note that a scalar can hold an integer, a floating-point number or a string. There are no separate types for these as there are in languages like C.
Arrays and hashes are covered in more detail in other posts.
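
As a minimal sketch of the three datatypes and their prefixes (all names and values below are invented for illustration):

```perl
use strict;
use warnings;

my $count = 10;                        # scalar holding an integer
my $ratio = 2.5;                       # scalar holding a float
my $name  = 'Mark';                    # scalar holding a string

my @names = ('Mark', 'Tony');          # array: an ordered list of scalars
my %age   = (Mark => 30, Tony => 25);  # hash: scalars looked up by key

# A single element is a scalar, so it is written with $.
print "first name: $names[0]\n";
print "Tony's age: $age{Tony}\n";
print "how many names: ", scalar(@names), "\n";
```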

Loosely typed language
The idea of forcing a programmer to declare what sort of variable is being created is called typing. As Perl doesn't by default enforce any rules on typing, it is said to be a loosely typed language, as opposed to something like C++ which is strongly typed. Keep in mind that Perl was designed for short, quick tasks rather than very big projects.

Shebang

#!/usr/bin/perl
The #! at the start of this line is called the shebang.

The function of the 'shebang' line is to tell the shell how to execute the file. Under UNIX, this makes sense. Under Win32, the system must already know how to execute the file before it is loaded so the line is not needed.

However, the line is not completely ignored, as it is searched for any switches you may have given Perl (for example -w to turn on warnings).

You may also choose to add the line so your scripts run directly on UNIX without modification, as UNIX boxes probably do need it. Win32 systems do not.

Perl Reserved Words

Reserved Literals

__END__ indicates the logical end of the script (^D and ^Z are synonyms)
__FILE__ current filename
__LINE__ current line number


Reserved Filehandles

<> Null filehandle - input from either stdin, or files specified on the command line. Pseudonym for the internal filehandle <ARGV>.
DATA read data only from the main script, but not from any required file or evaluated string.
STDERR Output to stderr
STDIN Input from stdin
STDOUT Output to stdout


Reserved Variables

$_ The default input and pattern-searching space. The following pairs are equivalent:

    while (<>) {...}    while ($_ = <>) {...}   # only equivalent in while!

    /^Subject:/         $_ =~ /^Subject:/

    tr/a-z/A-Z/         $_ =~ tr/a-z/A-Z/

    chop                chop($_)

Also: $ARG
@_ The parameters passed to a subroutine. The array itself is local to the subroutine, but its values are references to the variables that are passed so updates to the members of the array will update the corresponding parameter value.
$<digit> Contains the subpattern from the corresponding set of parentheses in the last pattern matched, not counting patterns matched in nested blocks that have been exited already. These variables are all read-only.
$& The string matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval() enclosed by the current BLOCK). This variable is read-only.

Also: $MATCH
$` The string preceding whatever was matched by the last successful pattern match, not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK. This variable is read-only.

Also: $PREMATCH
$' The string following whatever was matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval() enclosed by the current BLOCK). Example: 

     $_ = 'abcdefghi';

     /def/;

     print "$`:$&:$'\n";         # prints abc:def:ghi

This variable is read-only.

Also: $POSTMATCH
$+ The last bracket matched by the last search pattern. This is useful if you don't know which of a set of alternative patterns matched. For example:

     /Version: (.*)|Revision: (.*)/ && ($rev = $+);

This variable is read-only.

Also: $LAST_PAREN_MATCH
$* Set to 1 to do multiline matching within a string, 0 to tell Perl that it can assume that strings contain a single line, for the purpose of optimizing pattern matches. Pattern matches on strings containing multiple newlines can produce confusing results when " $* " is 0. Default is 0. Note that this variable only influences the interpretation of " ^ " and " $ ". A literal newline can be searched for even when $* == 0 . The '/m' modifier should be used instead when pattern matching.

Also: $MULTILINE_MATCHING
$. The current input line number of the last filehandle that was read. This variable should be considered read-only. Remember that only an explicit close on the filehandle resets the line number. Since " <> " never does an explicit close, line numbers increase across ARGV files.

Also: $NR, $INPUT_LINE_NUMBER, input_line_number HANDLE EXPR
$/ The input record separator, newline by default. Works like awk 's RS variable, including treating blank lines as delimiters if set to the null string. You may set it to a multicharacter string to match a multi-character delimiter. Note that setting it to "\n\n" means something slightly different than setting it to "" , if the file contains consecutive blank lines. Setting it to "" will treat two or more consecutive blank lines as a single blank line. Setting it to "\n\n" will blindly assume that the next input character belongs to the next paragraph, even if it's a newline. 

   undef $/;

    $_ = <FH>;      # whole file now here

    s/\n[ \t]+/ /g;

Also: $RS, $INPUT_RECORD_SEPARATOR, input_record_separator HANDLE EXPR
$| If set to nonzero, forces a flush after every write or print on the currently selected output channel. Default is 0. Note that STDOUT will typically be line buffered if output is to the terminal and block buffered otherwise. Setting this variable is useful primarily when you are outputting to a pipe, such as when you are running a Perl script under rsh and want to see the output as it's happening.

Also: $OUTPUT_AUTOFLUSH, autoflush HANDLE EXPR
$, The output field separator for the print operator. Ordinarily the print operator simply prints out the comma separated fields you specify. In order to get behavior more like awk, set this variable as you would set awk 's OFS variable to specify what is printed between fields.

Also: $OFS, $OUTPUT_FIELD_SEPARATOR, output_field_separator HANDLE EXPR
$\ The output record separator for the print operator. Ordinarily the print operator simply prints out the comma separated fields you specify, with no trailing newline or record separator assumed.

Also: $ORS, $OUTPUT_RECORD_SEPARATOR, output_record_separator HANDLE EXPR
$" Delimiter used when interpreting an array as a scalar. This value separates each element of the array in the resulting string - default a space.

Also: $LIST_SEPARATOR
$; The subscript separator for multi-dimensional array emulation, which can be declared directly in v.5. If you refer to a hash element as 

         $foo{$a,$b,$c}

it really means 

         $foo{join($;, $a, $b, $c)}

But don't put 

         @foo{$a,$b,$c}      # a slice--note the @

which means 

         ($foo{$a},$foo{$b},$foo{$c})

Default is "\034", the same as SUBSEP in awk . Note that if your keys contain binary data there might not be any safe value for " $; ".

Also: $SUBSEP, $SUBSCRIPT_SEPARATOR
$# The output format for printed numbers. The initial value is '%.20g'. Deprecated in Perl 5.

Also: $OFMT
$% The current page number of the currently selected output channel.

Also: $FORMAT_PAGE_NUMBER, format_page_number HANDLE EXPR
$= The current page length (printable lines) of the currently selected output channel. Default is 60.

Also: $FORMAT_LINES_PER_PAGE, format_lines_per_page HANDLE EXPR
$- The number of lines left on the page of the currently selected output channel.

Also: $FORMAT_LINES_LEFT, format_lines_left HANDLE EXPR
$~ The name of the current report format for the currently selected output channel. Default is name of the filehandle.

Also: $FORMAT_NAME, format_name HANDLE EXPR
$^ The name of the current top-of-page format for the currently selected output channel. Default is name of the filehandle with _TOP appended.

Also: $FORMAT_TOP_NAME, format_top_name HANDLE EXPR
$: The current set of characters after which a string may be broken to fill continuation fields (starting with ^) in a format. Default is " \n-", to break on whitespace or hyphens.

Also: $FORMAT_LINE_BREAK_CHARACTERS, format_line_break_characters HANDLE EXPR
$^L What formats output to perform a formfeed. Default is \f.

Also: $FORMAT_FORMFEED, format_formfeed HANDLE EXPR
$^A The current value of the write() accumulator for format() lines. A format contains formline() commands that put their result into $^A. After calling its format, write() prints out the contents of $^A and empties it. So you never actually see the contents of $^A unless you call formline() yourself and then look at it.

Also: $ACCUMULATOR
$? The status returned by the last pipe close, backtick ( `` ) command, or system() operator. Note that this is the status word returned by the wait() system call, so the exit value of the subprocess is actually ( $? >> 8 ). Thus on many systems, $? & 255 gives which signal, if any, the process died from, and whether there was a core dump.

Also: $CHILD_ERROR
$! If used in a numeric context, yields the current value of errno, with all the usual caveats. (This means that you shouldn't depend on the value of " $! " to be anything in particular unless you've gotten a specific error return indicating a system error.) If used in a string context, yields the corresponding system error string. You can assign to " $! " in order to set errno if, for instance, you want " $! " to return the string for error n , or you want to set the exit value for the die() operator.

Also: $ERRNO, $OS_ERROR
$@ The Perl syntax error message from the last eval() command. If null, the last eval() parsed and executed correctly (although the operations you invoked may have failed in the normal fashion).

Also: $EVAL_ERROR
$$ The process number of the Perl running this script.

Also: $PID, $PROCESS_ID
$< The real uid of this process.

Also: $UID, $REAL_USER_ID
$> The effective uid of this process. Example: 

         $< = $>;            # set real to effective uid

         ($<,$>) = ($>,$<);  # swap real and effective uid

Also: $EUID, $EFFECTIVE_USER_ID
$( The real gid of this process. If you are on a machine that supports membership in multiple groups simultaneously, gives a space separated list of groups you are in. The first number is the one returned by getgid() , and the subsequent ones by getgroups() , one of which may be the same as the first number.

Also: $GID, $REAL_GROUP_ID
$) The effective gid of this process. If you are on a machine that supports membership in multiple groups simultaneously, gives a space separated list of groups you are in. The first number is the one returned by getegid() , and the subsequent ones by getgroups() , one of which may be the same as the first number.

Note: "$<", "$>", "$(" and "$)" can only be set on machines that support the corresponding set[re][ug]id() routine. "$(" and "$)" can only be swapped on machines supporting setregid().

Also: $EGID, $EFFECTIVE_GROUP_ID
$0 Contains the name of the file containing the Perl script being executed. Assigning to " $0 " modifies the argument area that the ps(1) program sees. This is more useful as a way of indicating the current program state than it is for hiding the program you're running.

Also: $PROGRAM_NAME
$[ The index of the first element in an array, and of the first character in a substring. Default is 0, but you could set it to 1 to make Perl behave more like awk (or Fortran) when subscripting and when evaluating the index() and substr() functions.

As of Perl 5, assignment to " $[ " is treated as a compiler directive, and cannot influence the behavior of any other file. Its use is discouraged.
$] The string printed out when you say perl -v . It can be used to determine at the beginning of a script whether the perl interpreter executing the script is in the right range of versions. If used in a numeric context, returns the version + patchlevel / 1000. Example: 

    # see if getc is available

    ($version,$patchlevel) =

             $] =~ /(\d+\.\d+).*\nPatch level: (\d+)/;

    print STDERR "(No filename completion available.)\n"

             if $version * 1000 + $patchlevel < 2016;

or, used numerically, 

    warn "No checksumming!\n" if $] < 3.019;

Also: $PERL_VERSION
$^D The current value of the debugging flags.

Also: $DEBUGGING
$^F The maximum system file descriptor, ordinarily 2. System file descriptors are passed to exec()ed processes, while higher file descriptors are not. Also, during an open(), system file descriptors are preserved even if the open() fails. (Ordinary file descriptors are closed before the open() is attempted.) Note that the close-on-exec status of a file descriptor will be decided according to the value of $^F at the time of the open, not the time of the exec.

Also: $SYSTEM_FD_MAX
$^I The current value of the inplace-edit extension. Use undef to disable inplace editing.

Also: $INPLACE_EDIT
$^P The internal flag that the debugger clears so that it doesn't debug itself. You could conceivably disable debugging yourself by clearing it.

Also: $PERLDB
$^T The time at which the script began running, in seconds since the epoch (beginning of 1970). The values returned by the -M, -A and -C filetests are based on this value.

Also: $BASETIME
$^W The current value of the warning switch, either TRUE or FALSE.

Also: $WARNING
$^X The name that the Perl binary itself was executed as, from C's argv[0].

Also: $EXECUTABLE_NAME
$ARGV The name of the current file when reading from <>.
@ARGV The array @ARGV contains the command line arguments intended for the script. Note that $#ARGV is generally the number of arguments minus one, since $ARGV[0] is the first argument, NOT the command name. See "$0" for the command name.
@INC The array @INC contains the list of places to look for Perl scripts to be evaluated by the do EXPR, require, or use constructs. It initially consists of the arguments to any -I command line switches, followed by the default Perl library, probably "/usr/local/lib/perl", followed by ".", to represent the current directory.
%INC The hash %INC contains entries for each filename that has been included via do or require. The key is the filename you specified, and the value is the location of the file actually found. The require command uses this hash to determine whether a given file has already been included.
$ENV{expr} The hash %ENV contains your current environment. Setting a value in %ENV changes the environment for child processes.
$SIG{expr} The hash %SIG is used to set signal handlers for various signals.
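
A few of the variables listed above can be tried out in a short sketch. The subroutine double_in_place and the sample data are invented for the example. It shows that @_ aliases the caller's variables, that $" controls how an interpolated array is joined, and what the match variables hold after a successful match.

```perl
use strict;
use warnings;

# @_ aliases the caller's variables, so this changes them in place.
sub double_in_place {
    $_ *= 2 for @_;
    return;
}

my @n = (1, 2, 3);
double_in_place(@n);

my $joined;
{
    local $" = '-';      # separator used when an array is interpolated
    $joined = "@n";
}

$_ = 'abcdefghi';
my ($pre, $match, $post);
if (/def/) {
    ($pre, $match, $post) = ($`, $&, $');
}

print "$joined\n";
print "$pre:$match:$post\n";
```

Using local keeps the change to $" inside the block, so the rest of the program still sees the default space.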

Using Environment Variables

These are available as the '%ENV' hash, and retrieved as follows:
    $variable = $ENV{'variable'};
The following are commonly used:
CONTENT_LENGTH The number of bytes of data passed through standard input via the POST method.
DOCUMENT_ROOT Path on the server to the web site (root URL) being accessed.
GATEWAY_INTERFACE Protocol used to communicate with the server.
HTTP_COOKIE The contents of all cookies that are visible to the script - each cookie delimited by a semicolon (';').
HTTP_USER_AGENT The name (eg. 'Mozilla') and version (eg. '4.03') of the browser used to access the server and the type of platform used by the browser (eg. 'Macintosh;I;PPC').
HTTP_REFERER The URL of the page that was being viewed when the script was accessed, either via a link or by direct entering of the URL. Can be used to check that the user has come from a recognised source, to help with security.
QUERY_STRING Holds the data input via the GET method.
REQUEST_METHOD Either "POST" or "GET", depending on what is set on the method attribute of the form tag. NB This value will be "POST" if method="POST", even if some data is passed via GET (e.g. data appended directly to the action URL). In that case both methods are in use, even though REQUEST_METHOD is "POST".
SERVER_SOFTWARE Name and version of the web server.
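
When a script is run outside a web server these variables are usually not set, so it helps to fall back to a default. The helper env_or_default and the fallback strings below are my own assumptions for this sketch, not part of any CGI library.

```perl
use strict;
use warnings;

# Look up an environment variable, falling back to a default when it is unset.
sub env_or_default {
    my ($name, $default) = @_;
    return defined $ENV{$name} ? $ENV{$name} : $default;
}

my $method = env_or_default('REQUEST_METHOD', '(not set)');
my $query  = env_or_default('QUERY_STRING',   '');

print "method=$method query=$query\n";
```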

Running a perl program

Explicitly invoking the Perl compiler/interpreter
(note that >>> is only the shell prompt)
>>>perl program.pl
Many command-line options are available here, e.g.
-w print all warnings
-c check but do not execute
-d run under debugger control

Implicitly invoking the Perl compiler/interpreter
This is system specific. On Unix, for example:
#!/usr/bin/perl
print "Please enter your name: ";
Options such as -w can be used here too, e.g.
#!/usr/bin/perl -w

The other step is to make the Perl file executable with the chmod command, after which it can be run like any other executable file.

Command line arg in perl

In perl, command-line arguments are stored in the array named @ARGV.

$ARGV[0] contains the first argument, $ARGV[1] contains the second argument, etc. So if you're just looking for one command line argument you can test for  $ARGV[0], and if you're looking for two you can test for  $ARGV[1], and so on. (Or you can use a loop, as shown below.)
$#ARGV is the subscript of the last element of the @ARGV array, so the number of arguments on the command line is $#ARGV + 1.

Here's a simple Perl program that prints the number of command-line arguments it's given, and the values of the arguments:
$numArgs = $#ARGV + 1;
print "thanks, you gave me $numArgs command-line arguments:\n";

foreach $argnum (0 .. $#ARGV) {

print "$ARGV[$argnum]\n";

}
When a program is executed, the command line arguments go into a special array called @ARGV. In a foreach loop with no loop variable, the special variable $_ holds the current element of @ARGV.
Eg.

foreach (@ARGV) {
print "$_\n";
}

This program prints the command line arguments on the screen.
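
The same idea can be wrapped in a subroutine so it is easy to try with any list. describe_args is a name made up for this sketch, and it uses scalar(@args) as an alternative to $#ARGV + 1 for counting.

```perl
use strict;
use warnings;

# Build a report of the arguments: a count line, then one line per argument.
sub describe_args {
    my @args = @_;
    my @out  = ('got ' . scalar(@args) . ' argument(s)');
    for my $i (0 .. $#args) {
        push @out, "$i: $args[$i]";
    }
    return @out;
}

print "$_\n" for describe_args(@ARGV);
```

Run it as perl myscript.pl one two to see two numbered lines after the count.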


Another useful way to get parameters into a program -- this time without user input. The relevance to filehandles is as follows. Run the following perl script as:
perl myscript.pl stuff.txt out.txt
while (<>) {
print;
}
If you don't specify anything in the angle brackets, whatever is in @ARGV is used instead. And after it finishes with the first file, it will carry on with the next and so on. You'll need to remove non-file elements from @ARGV before you use this.

It can be shorter still:

perl myscript.pl stuff.txt out.txt

print while <>;

This takes a little explanation. As you know, many things in Perl, including filehandles, can be evaluated in list or scalar context. The result that is returned depends on the context.
If a filehandle is evaluated in scalar context, it returns the first line of whatever file it is reading from. If it is evaluated in list context, it returns a list, the elements of which are the lines of the files it is reading from.
The print function is a list operator, and therefore evaluates everything it is given in list context. As the filehandle is evaluated in list context, it is given a list !
Who said short is sweet? Not my girlfriend, but that's another story. The shortest scripts are not usually the easiest to understand, and not even always the quickest. Aside from knowing what you want to achieve with the program from a functional point of view, you should also know whether you are coding for maximum performance, easy maintenance or whatever -- because chances are those goals may be to some extent mutually exclusive.

Avoiding mistakes in perl

You can control perl's level of "strictness" using command line switches and pragmas

the -w switch

#!/usr/local/bin/perl -w
# deliberate_mistake.pl


$variable = 5; $varaible++;

print "new value = $variable\n";


>>>deliberate_mistake.pl
Name "main::varaible" used only once: possible typo at deliberate_mistake.pl line 5.
new value = 5

the strict pragma

This forces all variables to be explicitly declared, typically using the my keyword.

#!/usr/local/bin/perl -w
# deliberate_mistake2.pl

use strict; # forces all variables to be declared

my $variable = 5; $varaible = $variable + 1;

print "new value = $variable\n";

Notice that with the -w switch alone this typo would only produce a warning and the script would still run. Under use strict it stops compilation:

>>> deliberate_mistake2.pl
Global symbol "$varaible" requires explicit package name at deliberate_mistake2.pl line 7.
Execution of lectures/1_intro/deliberate_mistake2.pl aborted due to compilation errors.
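For comparison, here is a sketch of the same idea with the typo fixed. It uses the warnings pragma as a lexically scoped alternative to the -w switch. The increment sub is an illustrative name only.

```perl
#!/usr/bin/perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # lexically scoped alternative to the -w switch

# Hypothetical helper: with strict on, a typo such as $varaible inside
# this sub would stop compilation instead of quietly creating a global.
sub increment {
    my ($variable) = @_;
    $variable = $variable + 1;
    return $variable;
}

print "new value = ", increment(5), "\n";
```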

Thursday, August 26, 2010

Introduction to perl

Perl is a scripting language which was originally developed by Larry Wall. The source code can be "executed" directly using perl and there is no explicit compilation step involved. The perl program is usually installed in /usr/bin/perl. Perl is in many aspects quite similar to the classic unix programs awk and sed, but perl has gone a long way from there. Today you can even do object oriented programming and design graphical user interfaces with perl. Perl can easily be extended in its capabilities with libraries. The perl archive at CPAN has many of them. This first article will however not go into advanced topics. Instead I would like to show you some basics and have more advanced things in later articles.
Perl is a very useful scripting language. It is a universal tool for everyone with some programming skills.

A simple program


#!/usr/bin/perl
print "Hello World!\n";
When you click Run > Run or press F9, the program will launch and print the message “Hello World!”

Let's look at the program in more details.

The first line of the program is a special comment. Comments in a Perl program run from the pound sign (#) to the end of the line. There is no sign for a block comment. On UNIX systems, the two characters #! at the start of the first line tell the system which interpreter should run the program, here the one stored in the file /usr/bin/perl.


The second line starts with the print statement, followed by the string “Hello World!\n” and then a semicolon. Every Perl statement has to end with a semicolon (;).

You can also launch a Perl program by double-clicking on the source code file. There is no *.exe or *.dll file when you write Perl programs. If you want to distribute a Perl program, just copy the source code to the other machine.

Sunday, August 8, 2010

DB programming in Perl

Needing to create a database and an interface for interacting with it, today I learned how to work with MySQL.

I also got to work with Perl DBI for interacting with MySQL. Database access steps are similar to those in other languages:

For using MySQL, go here.

Once your MySQL database is ready, and assuming that you have created the database named mydb (how? refer to the link above!), you can continue on your safari into Perl DBI.

We begin our database programming by connecting to it with the following steps.

use DBI; # to use Perl DBI

my $dsn = 'DBI:mysql:mydb:localhost'; # DSN string

my $db_user_name = "myusername"; # user name for db
my $db_password = "mypasswd"; # passwd for db

my $dbh = DBI->connect($dsn, $db_user_name, $db_password); # connect to mydb

Now, we are ready for processing queries and statements on our database.

For a query like ‘select’, which returns a record set, the process is:

$statement = "select * from mytable";
my $sth = $dbh->prepare($statement);
$sth->execute or die "can't execute the query: " . $sth->errstr;

Now for traversing through the records (rows) we have APIs to help:

while(@row = $sth->fetchrow_array) {

# access @row as an array:
# $row[0] is first field(column)
# $row[1] is second field (column) …
}

# … other stuff

$sth->finish; # reinitialize the handle when needed. we are done.

For queries like ‘insert’, which do not return any records:

$statement = "insert into mytable values(0, 'Manoj', '402A, West Avenue')";
$dbh->do($statement) or print $DBI::errstr;

finally,

$dbh->disconnect; # disconnect the db.

following are some links that can come to your rescue in case you get toads !


http://safari.oreilly.com/1565926994
http://www.perl.com/pub/a/1999/10/DBI.html

Saturday, August 7, 2010

Reading entire file in one go.

Solution 1:
std::ifstream in("circle.cc");
std::istreambuf_iterator < char > begin(in), end;
std::string str(begin, end);

Solution 2:
std::ifstream input("circle.cc");
std::ostringstream temp;
temp << input.rdbuf();
std::string str = temp.str();

In both the cases,
std::cout << str;
will print out entire file.

Perl solution:
open CIRCLE, "circle.cc";
@mylist = <CIRCLE>;
close CIRCLE;

C++ and Perl both take 3 lines of code.
Not bad for a system programming language!!
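The Perl solution above reads the file into an array of lines. To get the whole file in one string, like the C++ versions, a common idiom is to undefine the input record separator $/ locally. A minimal sketch, where slurp is a hypothetical helper name and the throwaway file stands in for circle.cc:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    local $/;                 # undefine the input record separator
    my $content = <$fh>;      # one read now returns the entire file
    close($fh);
    return $content;
}

# Demo on a throwaway file.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "line one\nline two\n";
close($tmp_fh);
print slurp($tmp_name);
```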

Thursday, August 5, 2010

File Handling in perl: open, read, write and close files

Some Files Are Standard

In an effort to make programs more uniform, there are three connections that always exist when your program starts. These are STDIN, STDOUT, and STDERR. Actually, these names are file handles. File handles are variables used to manipulate files. Just like you need to grab the handle of a hot pot before you can pick it up, you need a file handle before you can use a file. Table 9.1 describes the three file handles.
Table 9.1 - The Standard File Handles

Name      Description
STDIN     Reads program input. Typically this is the computer's keyboard.
STDOUT    Displays program output. This is usually the computer's monitor.
STDERR    Displays program errors. Most of the time, it is equivalent to STDOUT, which means the error messages will be displayed on the computer's monitor.
You've been using the STDOUT file handle without knowing it for every print() statement in this book. The print() function uses STDOUT as the default if no other file handle is specified. Later in this chapter, in the "Examples: Printing Revisited" section, you will see how to send output to a file instead of to the monitor.
The association of these file handles can be altered from the command line. We can use:
$ perl prog.pl < input # STDIN
$ perl prog.pl > output # STDOUT
$ perl prog.pl 2> errors # STDERR
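To see the two output handles being separated, something like the following can be tried. It is only a sketch: report is a made-up helper and the messages are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: write a normal message and an error message
# to whichever two handles the caller passes in.
sub report {
    my ($out, $err) = @_;
    print {$out} "result: 42\n";
    print {$err} "warning: something looked odd\n";
}

report(\*STDOUT, \*STDERR);
```

Saved under a name of your choice and run with > output 2> errors, the first message should land in output and the second in errors.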

Using STDIN

Reading a line of input from the standard input, STDIN, is one of the easiest things that you can do in Perl. The following three-line program will read a line from the keyboard and then display it. This will continue until you press Ctrl+Z on DOS systems or Ctrl+D on UNIX systems.

while (<STDIN>) {
print();
}
The <> characters, when used together, are called the diamond operator. They tell Perl to read a line of input from the file handle inside the operators. In this case, STDIN. Later, you'll use the diamond operators to read from other file handles.
In this example, the diamond operator assigned the value of the input string to $_. Then, the print() function was called with no parameters, which tells print() to use $_ as the default parameter. Using the $_ variable can save a lot of typing but I'll let you decide which is more readable. Here is the same program without using $_.

while ($inputLine = <STDIN>) {
print($inputLine);
}
When you pressed Ctrl+Z or Ctrl+D, you told Perl that the input file was finished. This caused the diamond operator to return the undefined value which Perl equates to false and caused the while loop to end. In DOS (and therefore in all of the flavors of Windows), 26 - the value of Ctrl+Z - is considered to be the end-of-file indicator for text files: when a value of 26 is encountered while reading in text mode, the rest of the file is ignored. UNIX works differently: Ctrl+D (value 4) is intercepted by the terminal, which then signals end of input to the program, and a byte with the value 4 inside an ordinary file has no special meaning.

#Making echo command in perl:
#echo0.pl
while (<STDIN>) {
print();
}

#echo1.pl
$line = <STDIN>;
print "echo:$line";

#!/usr/local/bin/perl
# echo2.pl : reads in multiple lines and prints them all
# keeps echoing back what the user types
while ( $line = <STDIN> ) {
print "echo:$line";
}

$_ as the default input variable

#!/usr/local/bin/perl
# echo3.pl

# keeps echoing back what the user types
while ( <STDIN> ) {
print "echo:$_"; #instead of just writing print(); like in echo0.pl
}


Using Redirection to Change STDIN and STDOUT

DOS and UNIX let you change the standard input from being the keyboard to being a file by changing the command line that you use to execute Perl programs. Until now, you probably used a command line similar to:
perl -w 09lst01.pl
In the previous example, Perl read the keyboard to get the standard input. But, if there was a way to tell Perl to use the file 09LST01.PL as the standard input you could have the program print itself. Pretty neat, huh? Well, it turns out that you can change the standard input. It's done this way:

perl -w 09lst01.pl < 09lst01.pl
The < character is used to redirect the standard input to the 09LST01.PL file. You now have a program that duplicates the functionality of the DOS type command. And it only took three lines of Perl code!
You can redirect standard output to a file using the > character. So if you wanted a copy of 09LST01.PL to be sent to OUTPUT.LOG you could use this command line:

perl -w 09lst01.pl <09lst01.pl >output.log
Keep this use of the < and > characters in mind. You'll be using them again shortly when we talk about the open() function. The < character will signify that files should be opened for input and the > will be used to signify an output file. But first, let's continue talking about accessing files listed on the command line.

Using the Diamond Operator (<>)

If no file handle is used with the diamond operator, Perl will examine the @ARGV special variable. If @ARGV has no elements, then the diamond operator will read from STDIN - either from the keyboard or from a redirected file.


while (<>) {
print();
}

The command line to run the program might look like this:

perl -w 09lst02.pl 09lst01.pl 09lst02.pl
And the output would be:

while (<STDIN>) {
print();
}
while (<>) {
print();
}
Perl will create the @ARGV array from the command line. Each file name on the command line - after the program name - will be added to the @ARGV array as an element. When the program runs, the diamond operator starts reading from the filename in the first element of the array. When that entire file has been read, the next file is read from, and so on, until all of the elements have been used. When the last file has been finished, the while loop will end.
Using the diamond operator to iterate over a list of filenames is very handy. You can use it in the middle of your program by explicitly assigning a list of filenames to the @ARGV array. Listing 9.3 shows what this might look like in a program.


@ARGV = ("09lst01.pl", "09lst02.pl");
while (<>) {
print();
}

This program displays:

while (<STDIN>) {
print();
}
while (<>) {
print();
}
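While the diamond operator walks through several files, Perl keeps the name of the current file in $ARGV and the current line number in $. (the line number only restarts per file if ARGV is closed at each end of file). A small sketch of this, where label_lines is a hypothetical helper and the two throwaway files stand in for real ones:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the named files with <> and prefix every
# line with "file:line-number:".
sub label_lines {
    local @ARGV = @_;            # <> reads the files listed here
    my @labelled;
    while (my $line = <>) {
        push @labelled, join(':', $ARGV, $., $line);
        close ARGV if eof;       # restart $. for the next file
    }
    return @labelled;
}

# Demo with two throwaway files.
my @names;
for my $text ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh);
    push @names, $name;
}
print label_lines(@names);
```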
Next, we will take a look at the ways that Perl lets you test files, and following that, the functions that can be used with files.

Reading into variables in perl
Though we have already seen how to read from a file handle, here is some more detail.
Let's suppose we have a file handle named FILE to read from. Then <FILE> will:
– read one line at a time (scalar context)
– read all lines at once (array context)
– return undef at EOF
Eg.
$first = <FILE>; # first line
$second = <FILE>; # second line
@rest = <FILE>; # everything else


• The new-line character is at the end of each line
– can be removed with chomp()
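The points above can be tried without a real file by opening a filehandle on a string. This is a sketch only: the sample text is made up, and a lexical handle $fh is used here in place of FILE.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\ngamma\ndelta\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!\n";

my $first  = <$fh>;     # scalar context: one line
my $second = <$fh>;     # scalar context: the next line
my @rest   = <$fh>;     # list context: all remaining lines
my $after  = <$fh>;     # nothing left, so this is undef
close($fh);

chomp($first, $second); # remove the trailing new-line characters
chomp(@rest);

print "first=$first second=$second rest=@rest\n";
```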

#Eg making wc function of unix in perl 
#!/usr/local/bin/perl
# wc.pl - imitates unix wc (word count) program

$lines = 0;
$words = 0;
$characters = 0;
while ( <> ) {
$lines++;
@words = split(' ', $_);
$words += scalar(@words);
$characters += length($_);
}
print " $lines $words $characters\n";

Input from a file

Use open function - 

open(filehandle, mode, filename);
– returns undef if file cannot be opened

#!/usr/local/bin/perl
# sort.pl - sorts the lines in a file alphabetically

# open a file and assign the filehandle F
open(F, "myfile.txt") or die("can't open myfile.txt: $!\n");
# read in the whole file into an array of lines
@lines = ();
while(<F>) {
push(@lines, $_);
}
close(F); # close the filehandle

# sort the lines
@lines = sort @lines; # sort returns a new sorted list rather than sorting in place, so the result must be assigned
# print out the lines, now sorted
foreach (@lines) {
print; # means the same as print $_;
}
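The same program can also be written with the three-argument form of open, which keeps the mode separate from the file name, and with a lexical filehandle. A sketch under those assumptions, where sorted_lines is a hypothetical helper and the throwaway file stands in for myfile.txt:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines of a file in sorted order.
sub sorted_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    my @lines = <$fh>;            # list context reads every line
    close($fh);
    return sort @lines;           # call this in list context
}

# Demo on a throwaway file.
my ($tmp, $name) = tempfile(UNLINK => 1);
print {$tmp} "pear\napple\nfig\n";
close($tmp);
print sorted_lines($name);
```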

Writing to file

open(F, ">output.txt");
We add > before the file name to open it for writing. If the file already exists, its contents are overwritten.
 
 
# filewrite.pl
open(F, ">output.txt") or die( "cannot write to output.txt : $!\n");
print F "Hello World!\n";
close(F);

Appending

open(F, ">>output.txt");
print F "Goodbye world!\n";
close(F);
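Putting writing and appending together, the following sketch writes a file, appends to it, and reads it back to check the result. It assumes a temporary directory so that no real output.txt is touched, and it uses the three-argument open with lexical filehandles.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);      # throwaway directory
my $path = "$dir/output.txt";

# '>' creates the file, or empties it if it already exists
open(my $out, '>', $path) or die "cannot write to $path: $!\n";
print {$out} "Hello World!\n";
close($out);

# '>>' keeps the existing contents and adds to the end
open(my $app, '>>', $path) or die "cannot append to $path: $!\n";
print {$app} "Goodbye world!\n";
close($app);

# read the file back to see both lines
open(my $in, '<', $path) or die "cannot read $path: $!\n";
my @lines = <$in>;
close($in);
print @lines;
```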

File Test Operators

Perl has many operators that you can use to test different aspects of a file. For example, you can use the -e operator to ensure that a file exists before deleting it. Or you can check that a file can be written to before appending to it. By checking the feasibility of the impending file operation, you can reduce the number of errors that your program will encounter.
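A few of these operators in action, as a sketch: -e tests existence, -r and -w test whether the file is readable and writable, and -s returns the size in bytes (false for an empty file). describe_file is a hypothetical helper, and the throwaway file is created only for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: summarise a few file tests for one path.
sub describe_file {
    my ($path) = @_;
    return "$path: does not exist" unless -e $path;
    my @facts;
    push @facts, 'readable' if -r $path;
    push @facts, 'writable' if -w $path;
    my $size = -s $path;              # size in bytes, false when empty
    push @facts, $size ? "$size bytes" : 'empty';
    return "$path: " . join(', ', @facts);
}

# Demo on a throwaway file and on a name that does not exist.
my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "hello";
close($fh);
print describe_file($name), "\n";
print describe_file("$name.missing"), "\n";
```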