Tue Nov 18 11:17:14 PST 2008

    Notes on Perl

      Introduction

      Perl inherits many ideas from UNIX tools but rejects the UNIX philosophy that several simple tools are better than a single complex tool. It combines ideas from the Bourne shell, and the following programs: tr, sed, awk, ed, grep, egrep, plus the programming language C. The result is available for most operating systems and popular for writing scripts that react to requests from the WWW for pages using the Common Gateway Interface.

      See [ importance_0498.html ] for a discussion of the importance of Perl.

      Perl is a block structured language. Variables declared with local are dynamically scoped; Perl 5 added my for static (lexical) scoping. It uses special characters ($, @, %, &) to distinguish different types of data (scalar, array, associative array, subprogram).

      PERL on Windows

      Question From Alumnus (Anonymized)
        Hello, I hope everything is going good at the school for you both. I have continued to work at XYZ Systems since I graduated and we've run into a problem that requires us to use PERL 5.8.4 and I have been unable to locate a pre-compiled or source code with compilation instruction for a windows system.

      Answer From Dr. Gomez
      1. Try: [ index.php?title=Main_Page ]
      2. Also: [ win32 in download ]
      3. These guys have 5.8.8 binaries for windows, but this will work only if you are not sensitive to minor version number: [ index.mhtml ]

      Warnings for UNIX Experts

    1. Look out when assigning to scalars:
               $var = whatever
      is ok in Perl, but abnormal in 'sh'.
      Operations on an element in array @a or associative array %a use $a.
      Don't use the sed \(....\) brackets for subpatterns. Use (....).
      Don't use the sed \1..\9 but $1..$9 for strings that match patterns.
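      Parentheses like ( ... ) group as in egrep, but they also capture the matched text in $1..$9.
    2. Not $i but $ARGV[i-1] for command line arguments to a script, and $_[i-1] for arguments to subprograms.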
    3. 1/2 is 0.5 not 0!
    4. Security gotcha: Strings handed to a shell are reinterpreted and not quoted. This lets malicious users execute unexpected commands on your server.
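
      The first three warnings can be checked with a short script. This is only a sketch: the names $var, @a, %a, $user and $host and the sample address are made up for illustration.

```perl
use strict;
use warnings;

# Assignment: spaces around "=" are fine in Perl
# (in sh, "var = x" would try to run a command named var).
my $var = "whatever";

# One element of array @a or hash %a is a scalar, so it is written with "$".
my @a = (10, 20, 30);
my %a = (one => 1, two => 2);
my $second = $a[1];      # element of @a
my $two    = $a{two};    # element of %a

# Plain parentheses capture; the captured text is in $1, $2 (not \1, \2).
my ($user, $host) = ("", "");
if ("dick\@example.org" =~ /^(\w+)\@([\w.]+)$/) {
    ($user, $host) = ($1, $2);
}

# Division gives a fraction unless int() is applied.
my $half = 1 / 2;

print "$var $second $two $user $host $half\n";
# prints: whatever 20 2 dick example.org 0.5
```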

      Lexicon

      Perl uses the American Standard Code for Information Interchange or:
    5. |-ASCII
    6. (ASCII)|-
      1. white_space::= ASCII.SP | ASCII.HT | ..., -- space, tab, etc.
      2. doublequote::="\"",
      3. quote::="'",
      4. backquote::="`",
      5. backslash::="\\".
      6. semicolon::=";".
      7. colon::=":".
      8. dollar::="$".
      9. left_brace::="{".
      10. right_brace::="}".
      11. EOLN::=End of line.

    7. type_indicator::= dollar | "@" | "%".
    8. variable::= all_variable | scalar_variable | array_variable | associative_array_variable.
    9. all_variable::= "*" identifier, -- "*"i indicates "$"i, "%"i, and "@"i.
    10. scalar_variable::= dollar identifier.
    11. array_variable::= "@" identifier.
    12. associative_array_variable::= "%" identifier.
    13. subprogram::= "&" identifier.

    14. get_line_of_standard_input::= "<>", -- reads the files named in @ARGV, or standard input if there are none; also see the [ read_while_file f doing p ] cliche.

    15. default_variable::= "$_" | "@_" | "%_".
    16. environment_variable::= "%ENV".
    17. (above)|-environment_variable ==>associative_array_variable.
    18. arguments::= "@ARGV".
    19. (above)|-arguments ==>array_variable.
    20. name_formal_arguments::= "local(" List(variable) ")" "=" arguments";".

    21. string::=single_quoted_string | double_quoted_string | back_quoted_string.
    22. single_quoted_string::= quote #non(quote) quote.
    23. double_quoted_string::= doublequote # (non(doublequote|backslash) | escape ) doublequote.
    24. back_quoted_string::= backquote #non(backquote) backquote.

    25. comment::= shell_comment, -- Perl does not accept C comments.
    26. C_comment::= "/*" #character "*/", -- C only, not valid in Perl.
    27. shell_comment::= "#" #non_end_of_line.

      Perl Patterns

    28. perl_sub_string::= "(" regular_expression ")", -- like UNIX "\(....\)".
    29. sub_string_variable::= dollar digit, -- nesting is ok, but the UNIX/sed/ed backslashes must not be used.
    30. perl_anchors::= "\\" ( "b" | "B" ),
               b    word boundary (between \w and \W)
               B    non-word boundary.

    31. perl_wild_characters::= "\\" ("d" | "D" | "w" | "W" | "s" | "S"),
               d    decimal digit [0-9]
               D    non-digit
               w    word-character [0-9a-z_A-Z]
               W    non-word character
               s    whitespace [  \t\n\r\f]
               S    non-whitespace character.
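
      A minimal sketch of the anchors and wild characters in use. The sample line and the variable names are invented for illustration.

```perl
use strict;
use warnings;

my $line = "Order 66 shipped on 2008-11-18";

# \d+ matches a run of digits; parentheses capture into $1, $2, $3,
# which a match in list context returns as a list.
my ($year, $month, $day) = $line =~ /(\d+)-(\d+)-(\d+)/;

# \b anchors at a word boundary, so "ship" as a whole word
# does not match inside "shipped".
my $whole_word = ($line =~ /\bship\b/) ? 1 : 0;   # 0
my $prefix     = ($line =~ /\bship/)   ? 1 : 0;   # 1

# \s+ splits on runs of whitespace; \w+ picks out word-like tokens.
my @fields = split /\s+/, $line;   # 5 fields
my @words  = $line =~ /\w+/g;      # 7 tokens: the date breaks at each "-"
```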

      Some Special Variables

          $|   -- controls output buffering; set to 1 to flush after every write or print on the currently selected output handle.
          $%, $=, $-, $~, $^ -- used in page and column lay out for reports
          $1 .. $9  -- Parts of matched patterns in parentheses
          $&   -- the string matched by the last pattern
          $`,$&,$'  -- Before match, match, after match
          $$   -- UNIX Process Id
          $?   -- status report from pipe, sub-shell, etc
          $*   -- set to 1 to do multiline matches, 0 for efficient one line matching (deprecated in Perl 5 in favor of the /m and /s pattern options)
          $/   -- end of input record separator, set to "" to treat paragraphs as a record.
          $0   -- name of perl script
          $[   -- base of arrays, defaults to 0.
          $;   -- separates dimensions in multidimensional index
          $!   -- error number
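          $@   -- Perl error message from the last eval
          $<, $>,$(, $) -- user and group ids on UNIX
          $:   -- Characters after which a string can be broken when it is word wrapped in a format
          $#A  -- index of the last element of array @A (one less than the number of elements when $[ is 0)
          ARGV -- file_handle that reads the files named on the command line
          <ARGV>    -- reads the next line from each argument file in turn
          $ARGV     -- name of the file currently being read by <ARGV>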
          @ARGV     -- array of arguments
          %ENV -- environment handed to perl program by operating system.
          STDERR    -- Standard error output
          STDIN     -- Standard input
          STDOUT    -- Standard output
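
      A few of these variables can be seen at work in a short sketch. The array @A and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

my @A = ("a", "b", "c");
my $last_index = $#A;          # 2, the index of the last element
my $count      = scalar(@A);   # 3, the number of elements

# $`, $& and $' hold the text before, inside and after the last match.
my ($before, $matched, $after) = ("", "", "");
if ("hello world" =~ /o w/) {
    ($before, $matched, $after) = ($`, $&, $');
}

# $@ holds the message from the last failed eval.
eval { die "broken\n" };
my $message = $@;

# $0 is the script name and $$ the process id.
print "script $0 running as process $$\n";
```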

      Some Perl Operators

    32. infix_operator::= "," | "=" | ".." | "||" | "&&" | "|" | "^" | "&" | "<<" | ">>" | "+" | "-" | "." | "*" | "/" | "%" | "x" | "**" .
    33. relational_operator::= "==" | "eq" | "!=" | "ne" | "<=>" | "cmp" | "<" | "lt" | ">" | "gt" | ">=" | "ge" | "<=" | "le".
    34. pattern_binding_operator::= "=~" | "!~".
    35. assignment_operator::= "=" | infix_operator "=".
    36. prefix::= "!" | "~" | "++" | "--" | "-".
    37. postfix::="++" | "--".

      Some Perl Functions


      "If it looks like a function call then it is a function call."[Camel]


      (math): atan2(y,x), cos(r), exp(x), log (base e), rand(expr)(in 0..(_)), sin(r), sqrt, srand.

    38. chop::=`removes the last character in a string and returns it. Often used to remove the newline character(s) from input. Defaults to operate on "$_"`.
    39. defined::=true if lvalue has been given a value, else false.
      (delete): delete $a{k} removes the item with key k from associative array %a.
    40. die(s)::=`exit from eval or perl and produce error message s as $@ (from eval) or on STDERR`.
    41. dump::statement, produce core dump.
    42. each::associative_array->iterator, returns the next item in an associative array as an array of a Key and a Value.
    43. eval::expression->statement, treat result of expression as a program and execute it as a subprogram - dangerous if the expression is unchecked.
    44. exec(s)::statement, replace this perl program by program in list, is dangerous when used with unchecked tainted user supplied data.
    45. exit(n)::statement, terminate perl program ahead of time.
    46. goto(l)::statement, transfer control -- inefficient but used for translating from 'sed'.
    47. grep(e,l)::=`Sets $_ to each element in list l and returns an array of those that make e true`.
    48. hex(s)::=numeric value of string s read as a hexadecimal number.
    49. index(s,ss,p)::= `return first position at or after p in s which starts with string ss, if any, else returns $[-1`.
    50. index(s,ss)::=index(s,ss,$[).
    51. index(ss)::=not valid Perl; index needs both s and ss and does not default to "$_".
    52. int(e)::=integer part of an expression, truncating towards zero, so int(-3.7) is -3 rather than the floor -4.
    53. join(e,l)::= concatenate items in l with e as a separator.
    54. keys::associative_array->array=list of keys in associative array.
    55. length(s)::= number of characters in s.
    56. oct(s)::= numeric value of string s read as an octal number (or as hexadecimal if s starts with 0x).
    57. ord(e)::= ASCII code value for first character in value of expression e.
    58. pop(a)::statement, remove the last element of array a and return it.
    59. print(e,...)::statement, outputs one or more expressions as strings to a file or output.
    60. printf(f,e,...)::statement, formatted print -- like C.
    61. push(a, l)::=put l at end of array a.
      (range): e1 .. e2 = an array of numbers/characters between e1 and e2 inclusive.
    62. reverse(a)::=reverse scalar(string) or order of items in an array.
      (rindex): see index.... but in reverse order.
    63. s/p/r/::= substitute string r for pattern p, see substitute below.
    64. scalar(e)::=evaluate expression e in scalar context.
    65. shift(a)::=remove first item of array a and shift items up.
    66. sort(s,l)::=reorder items in l so that for each pair $a,$b with $a before $b in the result, s($a,$b)<=0.
    67. splice(a,p,n,l)::= remove items p .. p+n-1 from a and replace by l.
    68. split(/pattern/, e, limit)::=`value of expression e is split at occurrences of pattern (into at most limit pieces)...`.
    69. substr(s, p, l)::=substring s[p].. s[p+l-1].
    70. tr::statement, translate strings.
    71. undef(v)::statement, make value of v undefined.
      (unshift): unshift(a, l) is the opposite of shift, it puts l at the front of array a.
    72. values::associative_array->array, values found in an associative array.
    73. vec(s,o,b)::=element o of string s treated as a vector of unsigned integers that are each b bits wide.
    74. write::statement, writes a formatted record, see Formats below.
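
      Several of the functions above are easy to misread, so here is a short sketch with the values each call returns. The variable names and sample data are invented for illustration.

```perl
use strict;
use warnings;

my $from_hex = hex("ff");                      # 255
my $from_oct = oct("755");                     # 493
my $to_hex   = sprintf("%x", 255);             # "ff", number to hex string
my $trunc    = int(-3.7);                      # -3, truncates towards zero
my $pos      = index("hello world", "o");      # 4
my $next     = index("hello world", "o", 5);   # 7
my $missing  = index("hello world", "z");      # -1

my @list    = (5, 3, 11);
my @sorted  = sort { $a <=> $b } @list;        # (3, 5, 11)
my $joined  = join("-", @sorted);              # "3-5-11"
my @parts   = split(/:/, "a:b:c:d", 2);        # ("a", "b:c:d")
my @removed = splice(@list, 1, 1, "x", "y");   # @list is now (5, "x", "y", 11)
my @evens   = grep { $_ % 2 == 0 } (1 .. 6);   # (2, 4, 6)
my $piece   = substr("abcdef", 2, 3);          # "cde"
```

      Note that hex and oct go from string to number; sprintf is one way to go the other way.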

      Syntax

        There is a good but informal description at [ perlsyn.html ]. What follows are some incomplete (but formal) jottings. I've been scratching down the syntax of the various languages I use like this since roughly 1966. I found the following quote from the Perl FAQ a good excuse for not trying a complete description:
          In the words of Chaim Frenkel: 'Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors.'

      1. element_in_array::= dollar identifier "[" numeric_expression "]".
      2. element_in_associative_array::= dollar identifier "{" expression "}".

        String Handling

      3. pattern_bind::= variable ("=~" | "!~" ) pattern.

      4. pattern::= "/" regular_expression "/".
      5. regular_expression::= `the standard UNIX RE plus some special Perl features, perl_anchors, perl_wild_characters, perl_sub_string, minus \(...\)`.
            .    any single character except end of line
            *    any number of previous including none
            +    one or more of the previous
            ?    zero or one of the previous ("optional" -- compare O in Notation)
            [abc]     One of the listed elements
            [^ab]     one of the unlisted elements
         	...
      6. translate::= |[t:char](tr t pattern_with_no(t) t #non(t) t tr_options).
      7. substitute::= |[t:char](s t pattern_with_no(t) t #non(t) t s_options).
      8. match::= |[t:char](m t pattern_with_no(t) t m_options).
      9. pattern_with_no(t)::=pattern with no unescaped t.
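
        The point of letting any character t act as the delimiter is to avoid escaping "/" inside the pattern. A small sketch, with a made-up path and variable names:

```perl
use strict;
use warnings;

my $path = "/usr/local/bin/perl";

# m with "," as the delimiter: no need to escape "/" in the pattern.
my ($name) = $path =~ m,([^/]+)$,;

# s with "|" as the delimiter; the g option replaces every occurrence.
(my $windows = $path) =~ s|/|\\|g;

# tr maps characters one for one and returns how many it translated.
my $shout = "perl";
my $n = ($shout =~ tr/a-z/A-Z/);
```

        After this runs $name is "perl", $windows has every "/" replaced by a backslash, $shout is "PERL" and $n is 4.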

      10. chop_of_end_of_line::= "chop(" expression ")", the easiest way to remove a "\n" at the end of a string.
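
        One caution: chop removes the last character whatever it is. Perl 5 also provides chomp, which removes only a trailing newline. The sketch below compares the two on invented strings:

```perl
use strict;
use warnings;

my $with_newline    = "last line\n";
my $without_newline = "last line";

chop(my $a1 = $with_newline);      # "last line"
chop(my $a2 = $without_newline);   # "last lin", chop removed a real character

chomp(my $b1 = $with_newline);     # "last line"
chomp(my $b2 = $without_newline);  # "last line", unchanged
```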

        File Handling

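      11. file_operations::= "open(" filehandle "," filename ")" | "close(" filehandle ")" | "print" filehandle expression | ..., -- the filename string can start with a mode such as "<", ">" or ">>", and there is no comma after the filehandle in print.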
      12. file_tests::= "-" ("r" | "w" | "x" | ...)
        • -r readable
        • -w writable
        • -x executable
        • ...
      13. get_next_line_of_input_on(f)::= "<" f ">".
      14. read_while_file f doing p::= "while(<"f">)" p,
      15. |-perl(while(<f>)p) = perl(while(defined($_= <f>)) p),
      16. |-perl(while(<>)p) = perl(while(<ARGV>) p), -- the files named in @ARGV in turn, or STDIN if there are none.
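
        The read_while_file cliche might look like this in practice. This is a sketch under assumptions: the scratch file name is hypothetical, and it uses the three-argument open with a lexical filehandle, a later Perl 5 idiom rather than the bareword form in the grammar above.

```perl
use strict;
use warnings;

my $file = "notes_on_perl_demo.txt";   # hypothetical scratch file

open(my $out, ">", $file) or die "cannot write $file: $!";
print $out "first\nsecond\nthird\n";   # no comma after the filehandle
close($out);

my $was_readable = (-r $file) ? 1 : 0; # file test: is it readable?

my @lines;
open(my $in, "<", $file) or die "cannot read $file: $!";
while (<$in>) {          # same as: while (defined($_ = <$in>))
    chomp;               # strip the trailing newline from $_
    push @lines, $_;
}
close($in);
unlink($file);           # remove the scratch file again
```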

        Subprograms

      17. subprogram_declaration::= "sub" identifier ";".
      18. subprogram_definition::= "sub" identifier block, note that the formal parameters are not declared as part of the subprogram. They are implicitly available as "@_". The perl way to define a subprogram with three arguments x,y,z is to write:
         	sub foo{
         		local($x,$y,$z)=@_;
         		...
         	}
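
        A complete version of the same pattern, as a sketch only: the subprograms volume and count_args are invented, and my is used in place of local so that the script also runs under "use strict".

```perl
use strict;
use warnings;

sub volume {
    my ($x, $y, $z) = @_;   # name the three arguments
    return $x * $y * $z;
}

# The number of arguments is not fixed; @_ holds whatever was passed.
sub count_args { return scalar(@_) }

my $v = volume(2, 3, 4);          # 24
my $n = count_args("a", "b");     # 2
```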

      19. local_declaration::= "local(" List(variable) ")" O( initialization ) -- dynamic scoping.
      20. private_declaration::= "my(" List(variable) ")" O( initialization ) -- static scoping, added in Perl 5.
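
        The difference between the two kinds of scoping shows up when one subprogram calls another. In this sketch (all names are made up) show reads the global $x, and only the local version changes what it sees:

```perl
use strict;
use warnings;

our $x = "global";

sub show { return $x }      # reads the package variable $x

sub with_local {
    local $x = "local";     # dynamic: visible to everything called from here
    return show();
}

sub with_my {
    my $x = "my";           # static: visible only inside this block
    return show();
}

my @results = (with_local(), with_my(), show());
# ("local", "global", "global")
```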

      21. dangerous_shell_escape::="system(" os_command_line ")".
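      22. possibly_safe_shell_escape::="system(" os_command "," arguments ")" -- with a list of two or more items no shell is involved: the command is run directly and the arguments are passed to it unparsed.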
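
        A sketch of the list form, assuming the command to run is the perl interpreter itself (its path is in $^X). The hostile input string is invented for illustration.

```perl
use strict;
use warnings;

my $input = 'hello; echo INJECTED';   # hypothetical hostile user input

# List form: the program named by $^X is run directly with these
# arguments, so the ";" in $input is just data and no shell parses it.
my $status = system($^X, "-e", 'print "got: $ARGV[0]\n"', $input);

# The single string form would hand the whole line to a shell instead:
#   system("echo $input");   # a UNIX shell would also run: echo INJECTED
```

        The list form removes the shell from the picture; it does not make the called program safe if that program itself misuses its arguments.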

      23. evaluate_string::= "eval" "(" expression ")".
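
        A minimal sketch of eval and the $@ variable, using made-up strings. The block form eval { ... } shown last traps errors without compiling a string, which avoids the danger noted earlier for unchecked expressions.

```perl
use strict;
use warnings;

my $sum = eval("2 + 3 * 4");        # the string is compiled and run: 14

my $bad = eval("2 + )");            # syntax error: eval returns undef
my $syntax_error = $@;              # and leaves the message in $@

eval { die "no such record\n" };    # block form traps the die
my $trapped = $@;                   # "no such record\n"
```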

        Control Structure

      24. selection::= simple_selection | "if" "(" expression ")" block #( "elsif" "(" expression ")" block ) O( "else" block).

      25. simple_selection::= statement O(("if" | "unless") expression).
      26. if_statement::= "if" "(" expression ")" block.
      27. if_then_else_statement::="if" "(" expression ")" block "else" block.

      28. block::= "{" #statement "}".

      29. loop::=simple_loop | O(label ":") O(loop_clause) block.
      30. simple_loop::=statement O(("while" | "until") expression).

      31. loop_clause::= ("while"|"until")"(" condition ")" | C_for_loop | "foreach" variable "(" array ")".
      32. C_for_loop::= "for" "(" O(expression) ";" O(expression) ";" O(expression) ")".
      33. while_statement::= "while" "(" expression ")" block.

      34. inner_loop_control::= ("next" | "last" | "redo" ) O(label), note that redo and last can be used inside any (labeled) block.

      35. statement::= block | loop | selection | local_declaration | private_declaration | inner_loop_control | expression";"|...
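
        The control structures above can be combined as in this sketch. The label ROW and all variable names are invented; "foreach my" is the Perl 5 form.

```perl
use strict;
use warnings;

my @found;
ROW: foreach my $row (1 .. 5) {
    next ROW if $row == 2;                     # skip to the next $row
    foreach my $col (1 .. 5) {
        next if $col > $row;                   # affects the inner loop only
        last ROW if $row * $col > 12;          # leave both loops
        push @found, "$row,$col" unless $col % 2;   # even columns only
    }
}
# @found is ("3,2", "4,2")

my $size;
my $n = 7;
if    ($n < 5)  { $size = "small"  }
elsif ($n < 10) { $size = "medium" }
else            { $size = "large"  }

my $i = 0;
$i++ while $i * $i < 50;    # simple_loop: statement first, then the modifier
# $i is 8
```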

        Long Literal

        Perl inherits an interesting idea from the UNIX Bourne shell: that the data for an operation can be supplied by a series of lines:
      36. long_string::= "<<" (identifier | single_quoted_string | double_quoted_string) ";" #line line_equal_to_content_of_string.
         	$example = << 'ENDIT';
         	Even the longest string
         	must end it somewhere.
         	ENDIT
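        With an unquoted or double quoted terminator each line can contain variables that are interpolated into the string. This is very handy for form letters... With a single quoted terminator, as in the example above, the lines are taken literally.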
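
        The two kinds of terminator can be compared directly. A sketch with made-up names; note that the terminator line must start in the first column:

```perl
use strict;
use warnings;

my $name = "Alice";

# Double quoted terminator: $name is interpolated.
my $letter = <<"END_LETTER";
Dear $name,
your order has shipped.
END_LETTER

# Single quoted terminator: the text is taken literally.
my $literal = <<'END_LITERAL';
Dear $name,
END_LITERAL
```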

        Formats

        Perl has a very special technique for formatting data into reports. For example all write statements that go to a particular filehandle can be forced to fit a given page layout using:
      37. page_header::="format" "top" "=" #line dot_line.

      38. formatted_output::= "format" file_handle "=" #(format_line data_line) dot_line.

        Modules

        Files and directories form a module hierarchy as of Perl 5 (a bit like Java here!) with the C++ double-colon symbol:
         		directory::file
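
        As an illustration, the standard module File::Basename lives in the file Basename.pm inside a directory File. A sketch, assuming a UNIX style path as sample data:

```perl
use strict;
use warnings;
use File::Basename;   # loads File/Basename.pm from a directory listed in @INC

# Imported names can be called directly or with the full module name.
my $file = basename("/usr/local/bin/perl");                   # "perl"
my $dir  = File::Basename::dirname("/usr/local/bin/perl");    # "/usr/local/bin"
```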

        Objects and Structures

        Added in Perl 5 and very incomplete.
         		$variable={};	-- assign a reference to an empty anonymous hash, the usual basis of an object
        		$variable->{FIELD};  -- access the field FIELD through the reference
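
        A sketch of how these two lines grow into a small class. The package Counter, its field COUNT and its methods are hypothetical and not part of the notes above.

```perl
use strict;
use warnings;

package Counter;                 # hypothetical class name

sub new {
    my ($class, %args) = @_;
    my $self = { COUNT => $args{start} || 0 };   # reference to an anonymous hash
    return bless $self, $class;                  # mark the hash as a Counter object
}

sub increment {
    my ($self) = @_;
    return ++$self->{COUNT};
}

package main;

my $variable = {};               # empty anonymous hash
$variable->{FIELD} = 42;         # store a field through the reference

my $counter = Counter->new(start => 5);
$counter->increment;             # method call: Counter::increment($counter)
my $count = $counter->{COUNT};   # 6
```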

      . . . . . . . . . ( end of section Syntax)

      Glossary

    75. lvalue::="left hand value" -- an expression that can be put meaningfully on the left hand of an assignment, from BCPL.
    76. filehandle::="an identifier which is a program's internal name for an open file", cf filename.
    77. filename::="a string that is used by an operating system to identify a file".

      Notation

    78. O::=optional (_).
    79. non::=any character except those in (_).
    80. List::= (_) #("," (_)).
    81. #::= any number including none of (_)
    82. |-#X = O( X # X ).

      See Also

      Online documentation for perl can often be accessed by running the command:
       	perldoc perl


      (Camel): Larry Wall & Randal L Schwartz, Programming perl, O'Reilly & Associates (Nutshell book).


      (FAQ): //ftp.flirble.org/pub/languages/perl/CPAN/doc/manual/html/pod/perlfaq.html


      (reference_on_WWW): [ http://reference.perl.com/ ]


      (smith99): B. Smith <BCS@DNAI.COM >posted the following on comp.software_eng in April 1999:



        (home_page): [ perl ]


        (archives): [ CPAN.html ] (CPAN archive, comprehensive collection of Perl programs and modules including most recent versions of Perl itself)



      (usenet): The following place is full of rather rude experts:
       		comp.lang.perl.misc
      so don't forget to try this command
       	perldoc perl
      to see if your system has the documentation. Then search the newsgroup archive, [ http://www.dejanews.com/ ], to see if your question/problem has already been addressed, or at least to get some background. Then ask your question at //comp.lang.perl.misc

    83. ASCII::= See http://www.csci.csusb.edu/dick/samples/comp.text.ASCII.html.
    84. C::= See http://www.csci.csusb.edu/dick/samples/c.html.
    85. MATHS::= See http://www.csci.csusb.edu/dick/maths/.
    86. regular_expression::= See http://www.csci.csusb.edu/dick/samples/regular_expressions.html.
    87. UNIX::=an operating system, See [ http://www.csci.csusb.edu/cs360/ ] [ unix.syntax.html] . [ unix.commands.html] .

    . . . . . . . . . ( end of section Notes on Perl)