(March 2000): At this time the match is case sensitive but allows full awk
regular expressions. So to select all items that contain "Object", "object", "OBJECT", etc.
you need to use the pattern:
[Oo][Bb][Jj][Ee][Cc][Tt]
After three or four months of thought I decided that it would be nice to make case sensitivity optional, but that non-optional case insensitivity is still worth more than non-optional case sensitivity:
| Option | Value |
|---|---|
| Case sensitive search | Worst |
| Case insensitive search | Better |
| Case sensitive option | Best |
The program executes as a CGI on a Solaris UNIX box. In time the box may become either a Linux box or a BSD server. The solution must survive porting. Older awk does not have a simple way to turn off case sensitivity in pattern matching. A quick test (testawk below) shows that I can't use IGNORECASE on a Solaris box.
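Since the solution has to survive porting, the available awk features can be probed at run time rather than assumed. The following is a minimal sketch, not part of the original program: the function names `awk_has_toupper` and `awk_has_ignorecase` are hypothetical, and the optional argument naming the awk binary is an assumption added so that different awks can be compared. IGNORECASE is a gawk extension, so the second probe is expected to fail on other awks.

```shell
#!/bin/sh
# awk_has_toupper [AWK]: succeed if the named awk (default: awk)
# turns "x" into "X". An awk without the function prints "x" unchanged.
awk_has_toupper() {
    probe=$(echo x | "${1:-awk}" '{ print toupper($0) }' 2>/dev/null)
    [ "$probe" = "X" ]
}

# awk_has_ignorecase [AWK]: succeed if IGNORECASE=1 makes /x/ match "X".
awk_has_ignorecase() {
    probe=$(echo X | "${1:-awk}" 'BEGIN { IGNORECASE = 1 } /x/ { print "yes" }' 2>/dev/null)
    [ "$probe" = "yes" ]
}

if awk_has_toupper; then
    echo "toupper works: case can be folded inside awk"
else
    echo "toupper missing: fold case with another tool before awk"
fi
```

Running the probes once at installation time, or at the top of the CGI script, would show which approach a new server supports.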
The standard technique is to change all letters to the same case before the match.
However this must not be done for every line in the file being searched! Case is significant in identifying items in the bibliography and in recognising the beginning and end of items. The layout of the file (in XBNF) is as follows:
The prefilter could be written in awk as
!/^\./{ $0 = toupper( $0 ); };
if toupper worked (see testawk) on Sun Solaris.
The UNIX sed command can do the prefilter with a 'y' command:
sed '/^\./!y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
This works when tested (testsed).
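To show how the sed prefilter might feed a search, here is a minimal sketch. The function names `upcase_body` and `isearch` are hypothetical, and the search itself is reduced to printing matching lines rather than whole bibliography items. It assumes an awk that accepts `-v` (POSIX awk, or nawk on Solaris); note that `-v` interprets backslash escapes in the pattern. The user's pattern is upper-cased with `tr` so that it matches the upper-cased text.

```shell
#!/bin/sh
# upcase_body: upper-case every line except directive lines starting with ".".
upcase_body() {
    sed '/^\./!y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
}

# isearch PATTERN: read a bibliography on stdin and print the filtered
# lines that match PATTERN, ignoring case in non-directive lines.
isearch() {
    # Fold the pattern to upper case so it agrees with the filtered text.
    pattern=$(printf '%s\n' "$1" |
        tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
    upcase_body | awk -v pat="$pattern" '$0 ~ pat { print }'
}

# Example: the directive lines pass through sed untouched and do not match.
printf '.Set\nsample title Object\n.Key99\n' | isearch object
# prints: SAMPLE TITLE OBJECT
```

Because directive lines keep their original case, an upper-cased pattern only matches them where they were already upper case, which keeps the item delimiters out of the results in this sketch.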
Code
If this had not worked (or if I need the speed) I can recode in C.
The operations are
| Number | Description | When |
|---|---|---|
| 1 | putchar(ch); | Once for each character in a directive line |
| 2 | putchar(toupper(ch)) | Once for each character in a normal line |
| 3 | ch=getchar(); | Read ahead one character at the start, then read again to replace each character once it has been output |
. . . . . . . . . ( end of section A Filter to help Case Insensitive Searches on a WWW Bibliography)
Other Data
(testawk):
orion:/u/faculty/dick
$ awk 'BEGIN{IGNORECASE=1;}
> /x/{print "x is in " $0}
> !/x/{print "x is not in " $0}'
xxx
x is in xxx
x
x is in x
X
x is not in X
XXX
x is not in XXX
Apparently even the 'toupper' function is dysfunctional:
$ awk '{print toupper($0);}'
test
test
xxx yyy zzz AAA
xxx yyy zzz AAA
(testsed): with outputs tabbed in:
orion:/u/faculty/dick
$ sed '/^\./!y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
This is a test
.Set
	THIS IS A TEST
.Key99
	.Set
sample title object
	.Key99
	SAMPLE TITLE OBJECT
. . . . . . . . . ( end of section Other Data)
Glossary and Links