1.13 Indexing vectors, matrices and lists


INTRODUCTION TO R

1.13.1 Vector indexing

It is possible to print or refer to a subset of a vector by appending an index vector (enclosed in square brackets, []) to the vector name. There are four common forms of vector indexing used to extract a subset of a vector:

(i) Vector of positive integers. A set of integers that indicate which elements of the vector are to be selected. Selected elements are concatenated in the specified order.

– Select the nth element

> TEMPERATURE[2]
  Q2
30.6

– Select elements n through m

> TEMPERATURE[2:5]
  Q2   Q3   Q4   Q5
30.6 31.0 36.3 39.9

– Select a specific set of elements

> TEMPERATURE[c(1, 5, 6, 9)]
  Q1   Q5   Q6   Q9
36.1 39.9  6.5  9.7

(ii) Vector of negative integers. A set of integers that indicate which elements of the vector are to be excluded from concatenation.

– Select all but the nth element

> TEMPERATURE[-2]
  Q1   Q3   Q4   Q5   Q6   Q7   Q8   Q9  Q10
36.1 31.0 36.3 39.9  6.5 11.2 12.8  9.7 15.9

(iii) Vector of character strings. This form of vector indexing is only possible for vectors whose elements have been named. A vector of element names can be used to select elements for concatenation.

– Select the named element

> TEMPERATURE["Q1"]
  Q1
36.1

– Select the named elements

> TEMPERATURE[c("Q1", "Q4")]
  Q1   Q4
36.1 36.3


CHAPTER 1

(iv) Vector of logical values. The vector of logical values must be the same length as the vector being subsetted and is usually the result of an evaluated condition. Logical values of T (TRUE) and F (FALSE) indicate, respectively, that the corresponding elements of the main vector are to be included in or excluded from concatenation.

– Select elements for which the logical condition is true

> TEMPERATURE[TEMPERATURE < 15]
  Q6   Q7   Q8   Q9
 6.5 11.2 12.8  9.7

> TEMPERATURE[SHADE == "no"]
  Q1   Q3   Q5   Q7   Q9
36.1 31.0 39.9 11.2  9.7

– Select elements for which multiple logical conditions are true

> TEMPERATURE[TEMPERATURE < 34 & SHADE == "no"]
  Q3   Q7   Q9
31.0 11.2  9.7

– Select elements for which one or other of the logical conditions is true

> TEMPERATURE[TEMPERATURE < 10 | SHADE == "no"]
  Q1   Q3   Q5   Q6   Q7   Q9
36.1 31.0 39.9  6.5 11.2  9.7
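The four indexing forms can be tried without the earlier sections by rebuilding the example vectors first. The following is only an illustrative sketch: the values are copied from the output shown above, SHADE is assumed here to be a plain character vector that alternates "no" and "full" (in the original session it may be a factor), and the use of which() is an addition that the text does not cover.

```r
# Rebuild the example vectors from the values printed above.
TEMPERATURE <- c(36.1, 30.6, 31.0, 36.3, 39.9, 6.5, 11.2, 12.8, 9.7, 15.9)
names(TEMPERATURE) <- paste("Q", 1:10, sep = "")
# Assumption: SHADE alternates "no"/"full", as the output above implies.
SHADE <- rep(c("no", "full"), times = 5)

pos <- TEMPERATURE[c(1, 5, 6, 9)]      # (i)   positive integers
neg <- TEMPERATURE[-2]                 # (ii)  negative integers
chr <- TEMPERATURE[c("Q1", "Q4")]      # (iii) element names
lgl <- TEMPERATURE[TEMPERATURE < 15]   # (iv)  logical values

# The condition on its own is a logical vector with one value per element ...
cold <- TEMPERATURE < 15
print(cold)
# ... and which() converts it into the equivalent positive integer index.
cold.idx <- which(cold)
print(cold.idx)
print(TEMPERATURE[cold.idx])
```

Printing the condition separately, as done with cold, is a convenient way to check what a logical index will select before it is used.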

1.13.2 Matrix indexing

Like vectors, matrices can be indexed using vectors of positive integers, negative integers, character strings and logical values. However, whereas vectors have only a single dimension (length), so that each element can be indexed by a single number, matrices have two dimensions (height and width) and therefore require a set of two numbers for indexing. Consequently, matrix indexing takes the form [row.indices, col.indices], where row.indices and col.indices respectively represent sequences of row and column indices of the form described for vectors in section 1.13.1.

Before proceeding, re-examine the XY matrix generated in section 1.11.1:

> XY
      X     Y
A 16.92  8.37
B 24.03 12.93
C  7.61 16.65
D 15.49 12.20
E 11.77 13.12
attr(,"description")
[1] "coordinates of quadrats"


The following examples will illustrate the variety of matrix indexing possibilities:

> XY[3, 2]   # select the element at row 3, column 2
[1] 16.65

> XY[3, ]   # select the entire 3rd row
    X     Y
 7.61 16.65

> XY[, 2]   # select the entire 2nd column
    A     B     C     D     E
 8.37 12.93 16.65 12.20 13.12

> XY[, -2]   # select all columns except the 2nd
    A     B     C     D     E
16.92 24.03  7.61 15.49 11.77

> XY["A", 1:2]   # select columns 1 through 2 for row A
    X     Y
16.92  8.37

> XY[, "X"]   # select the column named 'X'
    A     B     C     D     E
16.92 24.03  7.61 15.49 11.77

> XY[XY[, "X"] > 12, ]   # select all rows for which the value of column X is greater than 12
      X     Y
A 16.92  8.37
B 24.03 12.93
D 15.49 12.20
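Several of the examples above return a plain named vector rather than a matrix, because R drops dimensions of length one by default. The sketch below rebuilds XY from the printed values (without the "description" attribute) and shows the drop = FALSE argument, which is not discussed in the text but is a standard argument of matrix indexing. The object names row3, row3.m and sub are hypothetical.

```r
# Rebuild the XY matrix from the values printed above.
XY <- cbind(X = c(16.92, 24.03, 7.61, 15.49, 11.77),
            Y = c(8.37, 12.93, 16.65, 12.20, 13.12))
rownames(XY) <- LETTERS[1:5]

# Selecting a single row returns a named vector, not a matrix ...
row3 <- XY[3, ]
print(is.matrix(row3))

# ... unless drop = FALSE is supplied, which keeps the matrix structure.
row3.m <- XY[3, , drop = FALSE]
print(dim(row3.m))

# Different index forms can be mixed: a logical row index (X greater
# than 12) combined with a character column index.
sub <- XY[XY[, "X"] > 12, "Y"]
print(sub)
```

Keeping the matrix structure can matter when the result is passed on to code that expects two dimensions, for example a further [row.indices, col.indices] selection.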

1.13.3 List indexing

Lists consist of collections of objects that need not be of the same size or type. The objects within a list are indexed by appending an index vector (enclosed in double square brackets, [[]]) to the list name. A single object within a list can also be referred to by appending a dollar sign ($) followed by the name of the object to the list name (e.g. list$object). The elements of objects within a list are indexed according to the object type. Vector indices to objects within other objects (lists) are placed within their own square brackets outside the list square brackets.

Recall the EXPERIMENT list generated in section 1.11.2:

> EXPERIMENT
$SITE
[1] "A1" "A2" "B1" "B2" "C1" "C2" "D1" "D2" "E1" "E2"

$COORDINATES
[1] "16.92,8.37"  "24.03,12.93" "7.61,16.65"  "15.49,12.2"
[5] "11.77,13.12"

$TEMPERATURE
  Q1   Q2   Q3   Q4   Q5   Q6   Q7   Q8   Q9  Q10
36.1 30.6 31.0 36.3 39.9  6.5 11.2 12.8  9.7 15.9

$SHADE
[1] no   full no   full no   full no   full no   full
Levels: no full

The following examples illustrate a variety of list indexing possibilities:

> #select the first object in the list
> EXPERIMENT[[1]]
[1] "A1" "A2" "B1" "B2" "C1" "C2" "D1" "D2" "E1" "E2"

> #select the object named 'TEMPERATURE' within the list
> EXPERIMENT[['TEMPERATURE']]
  Q1   Q2   Q3   Q4   Q5   Q6   Q7   Q8   Q9  Q10
36.1 30.6 31.0 36.3 39.9  6.5 11.2 12.8  9.7 15.9

> #select the first 3 elements of 'TEMPERATURE' within
> #'EXPERIMENT'
> EXPERIMENT[['TEMPERATURE']][1:3]
  Q1   Q2   Q3
36.1 30.6 31.0

> #select only those 'TEMPERATURE' values which correspond
> #to SITEs with a '1' as the second character in their name
> EXPERIMENT$TEMPERATURE[substr(EXPERIMENT$SITE,2,2) == '1']
  Q1   Q3   Q5   Q7   Q9
36.1 31.0 39.9 11.2  9.7
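The difference between double brackets, the dollar sign and single brackets is easiest to see by checking the class of what each returns. The following sketch uses a cut-down stand-in for the EXPERIMENT list, holding only the SITE and TEMPERATURE objects with values copied from above. The single-bracket form is an addition that the text does not describe, and the names a, b, d and warm are hypothetical.

```r
# A cut-down stand-in for the EXPERIMENT list (values copied from above).
EXPERIMENT <- list(
  SITE = c("A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2", "E1", "E2"),
  TEMPERATURE = c(Q1 = 36.1, Q2 = 30.6, Q3 = 31.0, Q4 = 36.3, Q5 = 39.9,
                  Q6 = 6.5, Q7 = 11.2, Q8 = 12.8, Q9 = 9.7, Q10 = 15.9)
)

# Double brackets and $ both return the object stored in the list ...
a <- EXPERIMENT[["SITE"]]
b <- EXPERIMENT$SITE
print(identical(a, b))
print(class(a))

# ... whereas single brackets return a list containing that object.
d <- EXPERIMENT["SITE"]
print(class(d))

# Indices for the extracted object are placed after the list index.
warm <- EXPERIMENT$TEMPERATURE[EXPERIMENT$TEMPERATURE > 35]
print(warm)
```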

1.14 Pattern matching and replacement (character search and replace)

It is often desirable to select a subset of data on the basis of character entries that match more general patterns. Furthermore, the ability to search and replace character strings within a character vector can be very useful.

1.14.1 grep - pattern searching

The grep() function searches within a vector for matches to a pattern and returns the indices of all matching entries.


# select only those 'SITE' values that contain an 'A'
> grep("A", EXPERIMENT$SITE)
[1] 1 2
> EXPERIMENT$SITE[grep("A", EXPERIMENT$SITE)]
[1] "A1" "A2"

By default, the pattern can be any valid regular expression (h), which provides great pattern searching flexibility.

# convert the EXPERIMENT list into a data frame
> EXP <- as.data.frame(EXPERIMENT)

# select only those rows that correspond to a 'SITE'
# value of either an A, B or C followed by a '1'
> grep("[A-C]1", EXP$SITE)
[1] 1 3 5
> EXP[grep("[A-C]1", EXP$SITE), ]
   SITE COORDINATES TEMPERATURE SHADE
Q1   A1  16.92,8.37        36.1    no
Q3   B1  7.61,16.65        31.0    no
Q5   C1 11.77,13.12        39.9    no
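Besides returning positions, the same pattern search can return the matching strings themselves or a logical vector suitable for the logical indexing described in section 1.13.1. The sketch below is illustrative only. It uses a standalone SITE vector copied from the list above; the value argument and grepl() are standard parts of base R but are not covered in the text.

```r
# Standalone copy of the SITE values shown above.
SITE <- c("A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2", "E1", "E2")

idx <- grep("[A-C]1", SITE)                # positions of matching entries
val <- grep("[A-C]1", SITE, value = TRUE)  # the matching entries themselves
hit <- grepl("[A-C]1", SITE)               # one TRUE/FALSE per entry

print(idx)
print(val)
print(hit)

# The logical form combines directly with other conditions.
print(SITE[hit | SITE == "E2"])
```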

1.14.2 regexpr - position and length of match

Rather than return the indices of matching entries, the regexpr() function returns the position of the first match within each string as well as the length of that match (-1 values correspond to entries in which the pattern is not found).

# recall the AUST character vector that lists the Australian
# capital cities
> AUST
[1] "Adelaide"  "Brisbane"  "Canberra"  "Darwin"
[5] "Hobart"    "Melbourne" "Perth"     "Sydney"

# get the position and length of the string of characters containing
# an 'a' and an 'e' separated by any number of characters
> regexpr("a.*e", AUST)
[1]  5  6  2 -1 -1 -1 -1 -1
attr(,"match.length")
[1]  4  3  4 -1 -1 -1 -1 -1
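The positions and lengths returned by regexpr() are enough to cut the matched text out of each string. The sketch below does this by hand with substring() and then with regmatches(), a base R helper that the text does not mention. The AUST vector is rebuilt from the output above, and the names m, len, start, found, matched and matched2 are hypothetical.

```r
# Rebuild the AUST vector from the output above.
AUST <- c("Adelaide", "Brisbane", "Canberra", "Darwin",
          "Hobart", "Melbourne", "Perth", "Sydney")

m <- regexpr("a.*e", AUST)
len <- attr(m, "match.length")   # length of each match (-1 if none)
start <- as.vector(m)            # start positions without the attributes
found <- start != -1             # TRUE where the pattern was found

# Extract the matched text by hand from start position and length ...
matched <- substring(AUST[found], start[found],
                     start[found] + len[found] - 1)
print(matched)

# ... or let regmatches() do the same extraction in one step.
matched2 <- regmatches(AUST, m)
print(matched2)
```

Both approaches skip the entries without a match, so the result has one element per matching city rather than one per element of AUST.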

(h) A regular expression is a formal computer language consisting of normal printing characters and special metacharacters (which represent wildcards and other features) that together provide a concise yet flexible way of matching strings.
