The gawk Scripting Language (Linux in a Nutshell, 3rd Edition)





Variable and array assignment
Group listing of commands
Alphabetical summary of commands



For more information, see the O'Reilly book sed & awk, 2d ed., by Dale Dougherty and Arnold Robbins.



13.1. Conceptual Overview

With gawk, you can:





Conveniently process a text file as though it were made up of records and fields in a textual database.
Use variables to change the database.
Execute shell commands from a script.
Perform arithmetic and string operations.
Use programming constructs such as loops and conditionals.
Define your own functions.
Process the result of shell commands.
Process command-line arguments more gracefully.
Produce formatted reports.
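
As a rough illustration of the "records and fields" and "formatted reports" items above, the following sketch treats two made-up, colon-separated lines as records and prints an aligned report. The sample data and the AWK helper variable are assumptions for the example; the helper prefers gawk and falls back to any POSIX awk, because only portable features are used.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# Two invented records in password-file style: name:password:uid:gid
report=$(printf 'root:x:0:0\nalice:x:1000:1000\n' |
  "$AWK" -F: '{ printf "%-8s uid=%d\n", $1, $3 }')

printf '%s\n' "$report"
```

Here printf pads the first field to eight characters, so the uid column lines up regardless of name length.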






Copyright © 2001 O'Reilly & Associates. All rights reserved.






13.2. Command-Line Syntax

gawk's syntax has two forms:

    gawk [options] 'script' var=value file(s)
    gawk [options] -f scriptfile var=value file(s)



You can specify a script directly on the command line, or you can store a script in a scriptfile and specify it with -f. Multiple -f options are allowed; gawk concatenates the files. This feature is useful for including libraries.
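
A minimal sketch of the library idea, assuming two hypothetical script files (lib.awk and main.awk) created in a temporary directory: the first defines a function, the second calls it, and both are passed with separate -f options. The AWK helper variable is also an assumption; it prefers gawk and falls back to any POSIX awk.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

tmp=$(mktemp -d)

# Hypothetical "library" file: defines a helper function only.
printf 'function twice(x) { return 2 * x }\n' > "$tmp/lib.awk"

# Hypothetical main script: uses the function defined in the library.
printf '{ print twice($1) }\n' > "$tmp/main.awk"

# Both files are read as if they were one program.
result=$(printf '21\n' | "$AWK" -f "$tmp/lib.awk" -f "$tmp/main.awk")
rm -r "$tmp"

printf '%s\n' "$result"
```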

gawk operates on one or more input files. If none are specified (or if - is specified), gawk reads from the standard input.

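Variables can be assigned a value on the command line. The value assigned to a variable can be a literal, a shell variable ($name), or a command substitution (`cmd`). Such an assignment takes effect only when gawk reaches it in the argument list, just before it opens the next input file, so the value is not available in a BEGIN procedure. To set a variable before the script begins execution, use the -v option.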
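
The timing difference can be seen in a small sketch. The variable name n and the one-line input are invented for the example, and the AWK helper (gawk if present, otherwise any POSIX awk) is an assumption.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

prog='BEGIN { printf "begin=[%s] ", n } { print "main=[" n "]" }'

# n=5 as an operand: assigned only when awk reaches it, after BEGIN has run.
late=$(printf 'a\n' | "$AWK" "$prog" n=5 -)

# -v n=5: assigned before the script starts, so BEGIN sees it.
early=$(printf 'a\n' | "$AWK" -v n=5 "$prog")

printf '%s\n%s\n' "$late" "$early"
```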

For example, to print the first three (colon-separated) fields of the password file, use -F to set the field separator to a colon:

    gawk -F: '{print $1; print $2; print $3}' /etc/passwd

Numerous examples are shown later in Section 13.3, "Patterns and Procedures".



13.2.1. Options

All options exist in both traditional POSIX (one-letter) format and GNU-style (long) format. Some recognized options are:

--
    Treat all subsequent text as commands or filenames, not options.

-f scriptfile, --file=scriptfile
    Read gawk commands from scriptfile instead of from the command line.

-v var=value, --assign=var=value
    Assign a value to variable var. This allows assignment before the script begins execution.

-Fc, --field-separator=c
    Set the field separator to character c. This is the same as setting the variable FS. c may be a regular expression. Each input line, or record, is divided into fields by whitespace (blanks or tabs) or by some other user-definable field separator. Fields are referred to by the variables $1, $2,..., $n. $0 refers to the entire record.

-W option
    All -W options are specific to gawk, as opposed to awk. An alternate syntax is --option (e.g., --compat). option may be one of:

    compat
        Same as traditional.

    copyleft
        Print copyleft notice and exit.

    copyright
        Same as copyleft.

    help
        Print syntax and list of options, then exit.

    lint
        Warn about commands that might not port to other versions of awk or that gawk considers problematic.

    lint-old
        Like lint but compares to an older version of awk.

    posix
        Expect exact compatibility with POSIX; additionally, ignore \x escape sequences, **, and **=.

    re-interval
        Allow use of {n,m} intervals in regular expressions.

    source=script
        Treat script as gawk commands. Like the 'script' argument but lets you mix commands from files (using -f options) with commands on the gawk command line.

    traditional
        Behave exactly like traditional (non-GNU) awk.

    usage
        Same as help.

    version
        Print version information and exit.
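
As a small sketch of the -F option taking a regular expression rather than a single character, the following splits one invented input line on either a colon or a comma. The AWK helper (gawk if present, otherwise any POSIX awk) is an assumption of the example.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# The bracket expression [:,] makes both ":" and "," act as field separators.
fields=$(printf 'a:b,c\n' | "$AWK" -F'[:,]' '{ print NF, $3 }')

printf '%s\n' "$fields"
```

The output shows the field count followed by the third field.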






13.3. Patterns and Procedures

gawk scripts consist of patterns and procedures:

    pattern { procedure }

Both are optional. If pattern is missing, {procedure} is applied to all records. If {procedure} is missing, the matched record is printed. By default, each line of input is a record, but you can specify a different record separator through the RS variable.
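
A minimal sketch of changing the record separator, assuming an invented input in which records end with semicolons instead of newlines. The AWK helper (gawk if present, otherwise any POSIX awk) is likewise an assumption.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# With RS set to ";", each semicolon-terminated chunk is one record.
records=$(printf 'a;b;c;' |
  "$AWK" 'BEGIN { RS = ";" } { out = out "[" $0 "]" } END { print out }')

printf '%s\n' "$records"
```

Each record is wrapped in brackets so the record boundaries are visible on a single output line.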



13.3.1. Patterns

A pattern can be any of the following:

    /regular expression/
    relational expression
    pattern-matching expression
    pattern,pattern
    BEGIN
    END

Some rules regarding patterns include:





















Expressions can be composed of quoted strings, numbers, operators, functions, defined variables, or any of the predefined variables described later under "gawk System Variables."

Regular expressions use the extended set of metacharacters and are described in Chapter 9, "Pattern Matching".

In addition, when a regular expression is matched against a field (for example, $1 ~ /^a/), ^ and $ refer to the beginning and end of that field, respectively, rather than the beginning and end of the record.

Relational expressions use the relational operators listed under "Operators" later in this chapter. Comparisons can be either string or numeric. For example, $2 > $1 selects lines for which the second field is greater than the first.

Pattern-matching expressions use the operators ~ (match) and !~ (don't match). See "Operators" later in this chapter.

The BEGIN pattern lets you specify procedures that take place before the first input record is processed. (Generally, you set global variables here.)

The END pattern lets you specify procedures that take place after the last input record is read.

If there are multiple BEGIN or END patterns, their associated actions are taken in the order in which they appear in the script.

pattern,pattern specifies a range of lines. This syntax cannot include BEGIN or END as a pattern.
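
Two of the rules above, sketched on invented input: a range pattern that selects everything from a START line through an END line, and a regular expression whose ^ anchor applies to a field when matched against that field. The marker strings and the AWK helper (gawk if present, otherwise any POSIX awk) are assumptions of the example.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# Range pattern: collect the lines from /START/ through /END/, inclusive.
range=$(printf 'x\nSTART\ny\nEND\nz\n' |
  "$AWK" '/START/,/END/ { out = out sep $0; sep = " " } END { print out }')

# ^ against the record ($0) versus against the second field ($2).
# The record "ab ba" does not begin with b, but its second field does.
anchors=$(printf 'ab ba\n' | "$AWK" '{ print ($0 ~ /^b/), ($2 ~ /^b/) }')

printf '%s\n%s\n' "$range" "$anchors"
```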



Except for BEGIN and END, patterns can be combined with the Boolean operators || (OR), && (AND), and ! (NOT).

In addition to other regular-expression operators, GNU awk supports POSIX character lists, which are useful for matching non-ASCII characters in languages other than English. These lists are recognized only within [ ] ranges. A typical use would be [[:lower:]], which in English is the same as [a-z]. See Chapter 9, "Pattern Matching" for a complete list of POSIX character lists.
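
A short sketch of a POSIX character list used as a pattern, on three invented input lines. The AWK helper is an assumption: it prefers gawk, and the fallback awk must itself support POSIX character lists for the example to behave as described.

```shell
# Prefer gawk; fall back to another awk (assumption: it supports [[:lower:]]).
AWK=$(command -v gawk || command -v awk)

# Print only the lines made up entirely of lowercase letters.
lower=$(printf 'abc\nABC\n123\n' | "$AWK" '/^[[:lower:]]+$/')

printf '%s\n' "$lower"
```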



13.3.2. Procedures

Procedures consist of one or more commands, functions, or variable assignments, separated by newlines or semicolons and contained within curly braces. Commands fall into four groups:

Variable or array assignments
Printing commands
Built-in functions
Control-flow commands



13.3.3. Simple Pattern-Procedure Examples

1. Print first field of each line (no pattern specified):

    { print $1 }

2. Print all lines that contain "Linux":

    /Linux/

3. Print first field of lines that contain "Linux":

    /Linux/{ print $1 }

4. Print records containing more than two fields:

    NF > 2

5. Interpret each group of lines up to a blank line as a single input record:

    BEGIN { FS = "\n"; RS = "" }

6. Print fields 2 and 3 in switched order but only on lines whose first field matches the string "URGENT":

    $1 ~ /URGENT/ { print $3, $2 }

7. Count and print the number of lines that contain "ERR" (adding 0 makes the result print as 0 rather than as an empty line when nothing matches):

    /ERR/ { ++x }; END { print x + 0 }

8. Add numbers in second column and print total:

    { total += $2 }; END { print "column total is", total }

9. Print lines that contain fewer than 20 characters:

    length($0) < 20

10. Print each line that begins with "Name:" and that contains exactly seven fields:

    NF == 7 && /^Name:/

11. Print the fields in reverse order, one per line:

    { for (i = NF; i >= 1; i--) print $i }
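
To show how such one-line programs are run from the shell, here is a sketch that applies the column-total example to two invented records, plus a variation on the field-reversal example that keeps the reversed fields on one line. The sample data and the AWK helper (gawk if present, otherwise any POSIX awk) are assumptions.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# Column total over the second field of two made-up records.
total=$(printf 'alice 10\nbob 32\n' |
  "$AWK" '{ total += $2 } END { print "column total is", total }')

# Reverse the fields, separating them with OFS and ending the line with ORS.
reversed=$(printf 'a b c\n' |
  "$AWK" '{ for (i = NF; i >= 1; i--) printf "%s%s", $i, (i > 1 ? OFS : ORS) }')

printf '%s\n%s\n' "$total" "$reversed"
```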






13.4. gawk System Variables

Variable       Description
$n             nth field in current record; fields are separated by FS
$0             Entire input record
ARGC           Number of arguments on command line
ARGIND         Current file's place in command line (starting with 0)
ARGV           An array containing the command-line arguments
CONVFMT        Conversion format for numbers (default is %.6g)
ENVIRON        An associative array of environment variables
ERRNO          Description of last system error
FIELDWIDTHS    List of field widths (whitespace-separated)
FILENAME       Current filename
FNR            Like NR, but relative to the current file
FS             Field separator (default is any whitespace; null string separates into individual characters)
IGNORECASE     If true, make case-insensitive matches
NF             Number of fields in current record
NR             Number of the current record
OFMT           Output format for numbers (default is %.6g)
OFS            Output field separator (default is a blank)
ORS            Output record separator (default is a newline)
RLENGTH        Length of the string matched by match function
RS             Record separator (default is a newline)
RSTART         First position in the string matched by match function
SUBSEP         Separator character for array subscripts (default is \034)
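
A few of these variables in use, as a sketch on invented data: NR, NF, and OFS in a per-record summary, OFMT controlling how a non-integer number prints, and ENVIRON reading an environment variable. The name GREETING is hypothetical, and the AWK helper (gawk if present, otherwise any POSIX awk) is an assumption.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# Record number, field count, and last field, joined by a custom OFS.
counts=$(printf 'a b c\nd e\n' |
  "$AWK" 'BEGIN { OFS = "-" } { print NR, NF, $NF }')

# OFMT sets the format print uses for non-integer numbers.
rounded=$("$AWK" 'BEGIN { OFMT = "%.2f"; print 3.14159 }')

# ENVIRON exposes the environment; GREETING is a made-up variable.
greeting=$(GREETING=hello "$AWK" 'BEGIN { print ENVIRON["GREETING"] }')

printf '%s\n%s\n%s\n' "$counts" "$rounded" "$greeting"
```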






13.5. Operators

The following table lists the operators, in order of increasing precedence, that are available in gawk.

Symbol                      Meaning
= += -= *= /= %= ^= **=     Assignment
?:                          C conditional expression
||                          Logical OR
&&                          Logical AND
in                          Array membership (see for command)
~ !~                        Match regular expression and negation
< <= > >= != ==             Relational operators
(blank)                     Concatenation
+ -                         Addition, subtraction
* / %                       Multiplication, division, and modulus
+ - !                       Unary plus and minus and logical negation
^ **                        Exponentiation
++ --                       Increment and decrement, either prefix or postfix
$                           Field reference
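
A brief sketch of how precedence affects results, using four expressions chosen for illustration: multiplication before addition, exponentiation before unary minus, right-to-left grouping of repeated exponentiation, and addition before concatenation. The AWK helper (gawk if present, otherwise any POSIX awk) is an assumption.

```shell
# Prefer gawk; fall back to any POSIX awk (assumption: one of them is installed).
AWK=$(command -v gawk || command -v awk)

# 2 + 3 * 4   -> 2 + (3 * 4)
# -2 ^ 2      -> -(2 ^ 2)
# 2 ^ 3 ^ 2   -> 2 ^ (3 ^ 2)
# 1 " " 2 + 3 -> 1 " " (2 + 3), since concatenation binds less tightly than +
prec=$("$AWK" 'BEGIN { print 2 + 3 * 4, -2 ^ 2, 2 ^ 3 ^ 2, 1 " " 2 + 3 }')

printf '%s\n' "$prec"
```

When in doubt, parentheses make the intended grouping explicit.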


