
# 2.6 Application: Causal Learning Augmented Problem Solving Search Process with Learning of Heuristics



2 Rapid Unsupervised Effective Causal Learning

(Figure: panel (a) shows the Agent at START, the GOAL, and RA = Relative Angle; panel (b) shows the corresponding search tree, e.g., 360 search nodes, each 1° apart from the neighboring ones.)

Fig. 2.14 (a) The spatial-movement-to-goal (SMG) problem. (b) A search process for solving the problem

problem – will be treated later (in Chap. 6). One way to solve this problem is to rely on a built-in heuristic rule that says the best solution is to move in a straight line continuously from the START to the GOAL position. However, that would be too contrived. In fact, it raises the question of how this heuristic rule could be learned or emerge from a problem solving process in the first place. Below, we show that such a heuristic can in fact be learned from a search process enhanced with effective causal learning, the general learning framework discussed above. Subsequently, an agent can always apply the learned heuristic, obviating any further need to carry out search.

Figure 2.14b shows how a typical search process can be used to solve the problem. In the worst case, a blind search process can proceed as follows. First, all possible directions of movement of the Agent from the START location are considered. In our framework, the fact that a force, either internally generated within the Agent or externally applied to the Agent, can be used to move the Agent across space is learned from experience through the causal learning process (such as that illustrated in Fig. 2.12), whereas in a traditional AI search problem this piece of knowledge is built-in. Suppose the system considers only 360 possible directions – 1° apart – as shown in Fig. 2.14b. Suppose also that the system considers an elemental move (i.e., an elemental spatial displacement) each time. This would take the Agent to 360 elementally displaced locations from the START location, only two of which are shown in the search tree in Fig. 2.14b for clarity. Then, from each of these 360 locations, each of which is called a “node” in the search process, a further 360 possible locations are “expanded” out from the current node. Out of these 360 possible elemental moves, one would bring the Agent back to the original node from which the current node was expanded. The search process can ignore this move and in a subsequent expansion consider only the other 359 nodes. This process continues until one of the moves reaches the GOAL location. If a record of all the elemental moves is kept in the search process, then when the expansion finally reaches the GOAL location, the path made up of all the elemental moves from the START to the GOAL locations is the solution to the problem, and it would turn out to be more or less a straight line from the START to the GOAL. (It would be “more or less” a straight line and not an exact straight line due to the discrete number of angles considered – 360 discrete angles – for the elemental movement expansion from each node.)
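The blind expansion just described can be sketched in code. This is our own minimal breadth-first illustration, not the book's mechanism; the function name, the reduced direction count (8 instead of 360), and the rounding used for visited-state bookkeeping are all our assumptions:

```python
import math
from collections import deque

def blind_search(start, goal, step=1.0, n_dirs=8, tol=0.5):
    """Breadth-first blind search over elemental moves.

    n_dirs stands in for the 360 one-degree directions in the text;
    even 8 directions shows how quickly the frontier grows.
    """
    # Each frontier entry is (location, path-of-moves-so-far).
    frontier = deque([(start, [])])
    visited = {(round(start[0], 1), round(start[1], 1))}
    while frontier:
        (x, y), path = frontier.popleft()
        if math.dist((x, y), goal) <= tol:
            return path  # sequence of elemental moves START -> GOAL
        for k in range(n_dirs):
            a = 2 * math.pi * k / n_dirs
            nxt = (x + step * math.cos(a), y + step * math.sin(a))
            key = (round(nxt[0], 1), round(nxt[1], 1))
            if key not in visited:  # skip, e.g., the move back to the parent node
                visited.add(key)
                frontier.append((nxt, path + [nxt]))
    return None

path = blind_search((0.0, 0.0), (4.0, 0.0), n_dirs=8)
```

Even in this toy setting the frontier grows roughly as n_dirs per level, which is the explosion the text refers to when n_dirs is 360.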

The blind search suffers from a major problem – the search space is incredibly big. Therefore, typically an A* search algorithm (Hart et al. 1968) is used, which vastly reduces the search space. This algorithm uses a basic search framework augmented by a “heuristic” measure at any given point in the search. The heuristic is typically a general domain-independent heuristic such as “pick the next move that is closest to the GOAL.” (In the SMG problem, this “closeness” translates into a distance measure from the current location under consideration to the GOAL. In other problems, there might be other kinds of measures of closeness to the GOAL.) The A* search algorithm assumes that there is a way, for each domain encountered, to generate such a heuristic measure.

Our causal learning augmented search algorithm will take advantage of this heuristic measure feature of A* search, but it is able to further reduce the entire search space drastically through the general causal learning mechanism described above in Sect. 2.4. Let us first consider the typical amount of computation needed in an A* search for this SMG problem, built upon the basic search framework depicted in Fig. 2.14b. In A*, firstly, similar to that discussed above, at each node 360 nodes are expanded (strictly speaking 359 nodes as explained above, but the difference is immaterial), each of which corresponds to an elemental movement in one of the directions. Then, having expanded the node, a “best” node out of the 360 expanded nodes is selected for further expansion. Since the purpose is finally to reach the GOAL, a reasonable heuristic for a current node would be the distance from the location corresponding to the current node to the GOAL – h(node). This selected best node – the one with the smallest heuristic value – is then expanded and the same process is repeated. A* typically also includes the distance from the START node to the current node as part of the heuristic measure – g(node). The heuristic measure used is then f(node) = g(node) + h(node). Because not every node at every level of expansion is selected for further expansion, as would be the case in a totally blind search as discussed above, and only the heuristically selected node is expanded further, this A* algorithm reduces the search space drastically compared to that of the blind search. This is a good baseline against which to compare any further improvement to the search process. Even though the A* search represents a great reduction in the search space compared to the blind search, it still has to expand the heuristically selected node into 360 nodes at each level of the search. This could still represent a lot of computational effort, especially if the goal is far from the starting location.
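The A* procedure just outlined can be sketched as follows. This is a minimal illustration under our own assumptions (36 directions instead of 360, unit elemental steps, coordinate rounding), with h(node) the straight-line distance to the GOAL and g(node) the path length from START, so f(node) = g(node) + h(node):

```python
import heapq
import math

def a_star_smg(start, goal, step=1.0, n_dirs=36, tol=0.5):
    """A* on the SMG search space: f(node) = g(node) + h(node)."""
    start = (round(start[0], 1), round(start[1], 1))
    # Heap entries: (f, g, node, path); the smallest f is expanded next.
    open_heap = [(math.dist(start, goal), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if math.dist(node, goal) <= tol:
            return path
        for k in range(n_dirs):  # one elemental move per direction
            a = 2 * math.pi * k / n_dirs
            nxt = (round(node[0] + step * math.cos(a), 1),
                   round(node[1] + step * math.sin(a), 1))
            ng = g + step
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                h = math.dist(nxt, goal)  # straight-line distance to GOAL
                heapq.heappush(open_heap, (ng + h, ng, nxt, path + [nxt]))
    return None

path = a_star_smg((0.0, 0.0), (10.0, 0.0))
```

Because the straight-line heuristic is admissible, the expansions concentrate along the near-straight path, but every expanded node still spawns n_dirs children, which is the residual cost the text points out.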

Now, consider the process of the causal learning augmented search depicted in Fig. 2.15. The top part of the figure shows a physical scenario: the process of an Agent searching for a path to go from a START position to a GOAL position. In line with our general causal learning framework, through causal learning from


(Figure: a Physical Scenario panel showing the Agent moving from START through L1 and L4 to LX toward the GOAL, and a Search Process panel showing the search tree with 360-fold expansions at NODE1–NODE4 and the obviated search effort, together with per-node tabular records of the parameters at times t1–t3; legend: AL = Absolute Location, RD = Relative Distance, AA = Absolute Angle = A3, RA = Relative Angle, GL = Goal’s Location = Lg; parameters obtained through computer vision, etc.; the synchronic “enabling” conditions cause the minimum distance to goal.)

Fig. 2.15 Causal learning augmented search process with learning of heuristics applied to the SMG problem. See text for explanation. * represents undefined value (With kind permission from Springer Science + Business Media: Proceedings of the 7th International Conference on Artificial General Intelligence, “On Effective Causal Learning,” 2014, Page 49, Seng-Beng Ho, Fig. 3)

experience (e.g., such as that shown in Fig. 2.12), it is learned that a “force” – “F” – can physically push the Agent toward an elemental next location in the process of finding a good path to the GOAL. This could be the situation of an agent actuating a force within itself to propel itself, or an agent applying a force externally to push another agent. Throughout the discussion below, we will use the case of the Agent propelling itself to represent both situations.

Suppose the sensory system (computer vision, natural vision, proprioceptive sensors, etc.) is able to supply the various parameters associated with the Agent and the force. These would be parameters such as the absolute location of the Agent, AL; the relative distance of the Agent to the GOAL, RD; the absolute angle (with respect to the entire frame of reference) of the applied force, AA; and the relative angle of the applied force, RA, with respect to a straight line joining the START and GOAL positions. The parameters supplied by the sensory system(s) would also include the START and GOAL locations, L1 and GL, respectively.

The bottom left corner of the figure has a search tree that represents the basic search framework. The picture depicts four nodes expanded at the first level – NODE1 – NODE4 – which respectively correspond to the locations L2, L3, L4, and L5 to which the object is moved by F from the START location, L1. Similar to the A* process described above in Fig. 2.14, we also assume that there are 360 possible directions to move the Agent in, separated by 1°. There is also a temporal aspect to the process – at t1 the object stays stationary at L1, at t2 F is applied, and at t3 the object is moved to one of the 360 possible locations.

On the bottom right of the figure is a tabular representation of the search process, much in the spirit of the temporal representations of Figs. 2.6 and 2.8, etc., showing the causalities involved. In this representation, each node and the parameters associated with it occupy a small segment of the temporal space. The physical process associated with each node is a force, F, applied between t1 and t2, which is a change of state, or an event, labeled with a short yellow vertical bar. The parameters associated with the force are AA and RA, which are defined only at t2 when the force is in effect. The starting parameters AL and RD of the Agent are indicated in the t1 column, and after the application of the force, AL and RD change to other values at t3. These changes are also indicated with a short yellow vertical bar. In the parlance of our effective causal learning framework described in Sects. 2.2, 2.3, and 2.4, it is established that F causes the particular movement, or the changes in the associated parameters AL and RD of the Agent, and is a diachronic cause. The synchronic “enabling” causal conditions are the values of all the parameters observed in the environment at t2, which include AL, RD, AA, RA, and GL.
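The tabular, per-node temporal record can be sketched as a small data structure. The encoding below is our own hypothetical rendering, not the book's notation; the literal values mirror NODE1 of Fig. 2.15, with “*” marking parameters undefined at a given time step:

```python
# One record per search node: parameter values at t1, t2, and t3.
# "*" marks an undefined value (e.g., AA and RA exist only while F acts at t2).
NODE1 = {
    "t1": {"F": 0,   "AL": "L1", "RD": "D1", "AA": "*",  "RA": "*", "GL": "Lg"},
    "t2": {"F": "F", "AL": "L1", "RD": "D1", "AA": "A1", "RA": 30,  "GL": "Lg"},
    "t3": {"F": 0,   "AL": "L2", "RD": "D2", "AA": "*",  "RA": "*", "GL": "Lg"},
}

def diachronic_cause(node):
    """F is the diachronic cause: an event at t2 followed by parameter
    changes observed at t3 relative to t1."""
    changed = [p for p in ("AL", "RD", "GL") if node["t3"][p] != node["t1"][p]]
    # The synchronic "enabling" conditions are all parameter values at t2.
    synchronic = {p: node["t2"][p] for p in ("AL", "RD", "AA", "RA", "GL")}
    return changed, synchronic

changed, enabling = diachronic_cause(NODE1)
# changed lists the parameters F altered; enabling holds the t2 conditions
```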

Thus, at NODE1, say, the force was applied with its associated synchronic parameter values F(L1, D1, A1, 30, Lg), and this causes a movement in which the Agent moves from (L1, D1) to (L2, D2) – Move(Agent, L1, D1, L2, D2). (“30” is the value of the relative angle RA, which is 30°, that corresponds to NODE1.) Applying the nearest-distance-to-goal heuristic, NODE3 is chosen for further expansion, in which the Agent is at (L4, D4), the closest location to the GOAL among all the locations (L2, D2), (L3, D3), (L5, D5), . . . corresponding to the other nodes NODE1, NODE2, NODE4, . . . respectively. Physically, this node corresponds to the Agent at location L4, which lies on a straight line connecting START to GOAL as shown in the physical scenario. This is a movement in the absolute direction AA = A3, say. The associated force for NODE3 with its synchronic parameters is F(L1, D1, A3, 0, Lg). We use Fm(N) to represent the force that causes the resultant location to be selected by the minimum distance heuristic at level N of the node expansion, and hence Fm(1) = F(L1, D1, A3, 0, Lg).

Next, consider that this NODE3 has been selected for expansion and the process creates a next level of 360 nodes for evaluation by the minimum distance heuristic. Suppose now it is the node at this expanded level that corresponds to the Agent at location LX, shown in the physical scenario, that satisfies the heuristic of minimum distance from the GOAL. This location LX is arrived at by applying F in the direction of A3 from location L4. (This direction actually lies on the straight line connecting the START and the GOAL locations, and that is why it is the one selected by the heuristic, but we want the system to discover this for itself.) The full synchronic parametric specification of the force for this movement to LX is Fm(2) = F(L4, D4, A3, 0, Lg). Now, comparing this force and the one above, Fm(1) = F(L1, D1, A3, 0, Lg), which also satisfies the heuristic but at a different level of node expansion and absolute location, one can see that some of the synchronic parameters are different – L1 vs L4 and D1 vs D4 – while others stay the same – A3, 0, and Lg. Both these forces cause the selection of the resultant node by the minimum distance heuristic. Using the rapid effective causal learning process described above (Sects. 2.2, 2.3, and 2.4), through dual-instance generalization (at a medium level of desperation, say), we derive a more general force rule that causes the selection of the resultant node by the minimum distance heuristic, namely Fm(*) = F(*, *, A3, 0, Lg). The “*” indicates that the corresponding parameters, namely N (the level of search expansion), AL (absolute location), and RD (relative distance to the GOAL from the Agent’s current location), are not relevant as synchronic enabling conditions for the force that causes the resultant node to be selected by the heuristic. The “*” in Fm(*) implies that the generalization holds no matter at what level N of search expansion.
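The dual-instance generalization step can be sketched as a simple comparison of the two synchronic parameter tuples; the function name and tuple encoding here are our own:

```python
# Order of the synchronic parameters in each tuple: (AL, RD, AA, RA, GL).
def generalize(f1, f2):
    """Dual-instance generalization: keep parameters shared by both
    instances as enabling conditions; replace any parameter that differs
    with the wildcard '*'."""
    return tuple(a if a == b else "*" for a, b in zip(f1, f2))

Fm1 = ("L1", "D1", "A3", 0, "Lg")  # force selected at level 1 of expansion
Fm2 = ("L4", "D4", "A3", 0, "Lg")  # force selected at level 2 of expansion
rule = generalize(Fm1, Fm2)
# rule is ("*", "*", "A3", 0, "Lg"), i.e., Fm(*) = F(*, *, A3, 0, Lg)
```

The same operation, applied later to the rules from two whole situations (Figs. 2.15 and 2.16), yields the still more general F(*, *, *, 0, *) discussed further below.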

Therefore, at this point, after two levels of expansion, the general causal rule learned is that no matter the level of expansion, the force needed to satisfy the minimum distance heuristic is F(*, *, A3, 0, Lg), i.e., Fm(*) = F(*, *, A3, 0, Lg). Hence, from the next level onward, no further node expansion is needed. One just needs to keep applying F(*, *, A3, 0, Lg) – i.e., whatever location (represented by the parameters AL and RD that are now labeled “*”) is reached at whatever level of search expansion, one just keeps applying the force in the A3 direction – namely the direction pointing toward the current GOAL at Lg – with a relative angle RA = 0° subtended between the direction of the force and the straight line connecting START and GOAL. The last enabling synchronic condition, Lg, is always satisfied as the GOAL is stationary. This indirectly implies that the A3 direction is the correct one provided the GOAL does not move in the process of the search; otherwise the desired AA will keep changing as the Agent continues its movement in the direction of wherever the GOAL moves to. Hence there is just one node that needs to be expanded/considered at all levels of movement all the way to the GOAL. This obviates the need for continued expansion of 360 nodes at every level of node expansion all the way to the GOAL, which is needed in A* (shown with a big red cross in the search tree in the figure), and drastically reduces the search space compared to A* search – only two levels of 360 node expansions each are needed in total.
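Once Fm(*) = F(*, *, A3, 0, Lg) has been learned, the “search” collapses into repeatedly applying the force toward the GOAL. A sketch, assuming a unit elemental step and a small arrival tolerance (both our choices):

```python
import math

def solve_with_learned_rule(start, goal, step=1.0, tol=0.5):
    """With the learned rule, no node expansion is needed: at every level
    just apply the force in the direction of the (stationary) GOAL."""
    pos, path = start, [start]
    while math.dist(pos, goal) > tol:
        # Direction A3: always straight toward the GOAL (RA = 0 degrees).
        a = math.atan2(goal[1] - pos[1], goal[0] - pos[0])
        pos = (pos[0] + step * math.cos(a), pos[1] + step * math.sin(a))
        path.append(pos)
    return path

path = solve_with_learned_rule((0.0, 0.0), (10.0, 3.0))
```

Each iteration considers exactly one move instead of 360, which is the drastic reduction relative to A* described in the text.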

Now, consider that the entire process is repeated at a different START (L11) and GOAL (Lg1) location, as shown in Fig. 2.16, with a different absolute angle, A31, for the straight line joining START and GOAL, and a different starting relative distance between START and GOAL. Let us define this as a different situation. Now, in this situation, Fm(1) = F(L11, D11, A31, 0, Lg1) and Fm(2) = F(L41, D41, A31, 0, Lg1), assuming that the first node selected by the minimum distance heuristic corresponds to location L41, and the corresponding relative distance to the GOAL is D41. This results in a generalization to Fm(*) = F(*, *, A31, 0, Lg1), and it obviates any expansion beyond the second level, just like the situation above in Fig. 2.15. A further generalization can be made after comparing the above two situations of Figs. 2.15 and 2.16 (let us call them Situation(1) and Situation(2)). Fm(*)[Situation(1)] = F(*, *, A3, 0, Lg) and Fm(*)[Situation(2)] = F(*, *, A31, 0, Lg1) imply that Fm(*)[Situation(*)] = F(*, *, *, 0, *). What this means is that, in general, in any situation

(Figure: the Agent at START, L11, moves through L41 at absolute angle A31 toward the GOAL at Lg1.)

Fig. 2.16 A different START and GOAL location (a different situation) for the SMG problem

with any START and any GOAL locations, the solution is to move straight in the direction of the GOAL.

In a sense, F(*, *, *, 0, *) is a very general domain-specific heuristic (in the domain of spatial movement) that is derived from the general causal learning augmented search process. After these two situations and a total of two 360-node expansions, this extracted heuristic obviates all further search in any SMG problem without obstacles. This heuristic is thus learnable through the causal learning process to expedite all future problem solving processes concerning spatial movement to goal, and it does not have to be built in in a contrived manner. As for basic A*, not only is much more search required in each situation, but the entire A* search has to be repeated in a different situation when the START and GOAL locations are changed. There is no transfer of any kind of information or learning from one situation to the other. (We can think of the learning of heuristics as a kind of transfer learning.)

Thus, in this section, we have demonstrated not only that it is possible to drastically reduce the amount of search through rapid effective causal learning, which results in a kind of search process that is more similar to a typical human intelligent search process, but also that it is possible to learn the heuristic(s) involved, which further reduces the need for search in future similar situations (and, in this case of the SMG problem, obviates the need for search entirely).

In the above discussion of using either the conventional A* search or our causal learning augmented search for the SMG problem, we have assumed that the system has already encoded the knowledge of using a force to cause an elemental movement, and the search process uses this elemental movement rule to try various possibilities to arrive at the solution. As mentioned above, the learning of the elemental movement rule can, in fact, also be achieved using causal learning as discussed in Sect. 2.4 in connection with Fig. 2.12.

## 2.6.2 Encoding Problem Solution in a Script: Chunking

In the solution for the case of the SMG problem as discussed above, or for that matter for the case of any general and typical problem, there is usually a series of actions to be performed. An efficient strategy for a noological system would be to encode this solution so that no repeated future effort in looking for the solution is needed should the same problem situation be encountered again. Moreover, this sequence of actions forms a knowledge “chunk” that can participate as part of the solution to more complex problems that require even longer sequences of actions to solve. That way, the solutions to increasingly complex problems can be discovered faster and more efficiently. This “incremental” chunking method will be described in more detail in Chap. 3. In this section, we focus on describing the structure of a “script” that encodes this chunked piece of knowledge.

There are two ways a script can be learned/created. One is through the problem solving process of an agent, as discussed above. The other is through observation of activities that may be performed by other agents in their problem solving processes, using basically the same effective causal learning process as discussed above in Sects. 2.2, 2.3, and 2.4, except that it may need to be applied over an extended period of time, resulting in a longer sequence of activities. This will be described in detail in Chap. 7.

There is some similarity between our idea of scripts and that proposed by Schank and Abelson in their work in the 1970s (Schank and Abelson 1977). The main difference is that the scripts here are learned from an agent’s problem solving processes or from observations of activities taking place in the environment, which could in turn be the executed solutions of other agents’ problem solving processes. In Schank and Abelson (1977), no detailed computational process was described for the learning of scripts. In addition, the focus of Schank and Abelson’s application of scripts is on question answering, while ours is on problem solving. As a result, there are other elements in our version of the script, such as counterfactual information, which will be discussed in Sect. 2.6.3 and which are not present in Schank and Abelson’s script.

The basic structure of a script here consists of four main portions – SCENARIO, START, ACTIONS, and OUTCOME – as shown in Fig. 2.17a, which is based on the movement scenario of the SMG problem.

For comparison, Fig. 2.17b shows a part of a script structure devised by Schank and Abelson (Schank and Abelson 1977) – in this particular instance, a Restaurant Script, encoding knowledge about going to a restaurant. We use blue rectangles to indicate the portions of their script that correspond to the various parts of our script in Fig. 2.17a. The Restaurant Script consists of an Entry conditions portion, which corresponds to our START condition, that specifies the start state of the main role – the person S visiting the restaurant – and in this case it specifies that the person is hungry and has money. Though not specified in Fig. 2.17b, the START state could conceivably also include the state of the restaurant – such as that it is located within a reachable location and that it is open. The Results portion of the script corresponds to our OUTCOME portion. In this case, the person S ends up having less money, not being hungry, and so on. The Scene 1 portion and the rest of the script in Fig. 2.17b that are not shown correspond to our ACTIONS portion, which consists of a causal chain of actions and events specifying, in this case, the activities


(Figure: panel (a) shows MOVEMENT-SCRIPT (SPECIFIC) with its SCENARIO, START, ACTIONS, and OUTCOME portions; panel (b) shows part of the Restaurant Script with the corresponding START, ACTIONS, and OUTCOME portions marked.)

Fig. 2.17 (a) A specific SMG Script – MOVEMENT-SCRIPT (SPECIFIC). All parameters AL, AA, RD, RA, and GL have the same definitions as those in Fig. 2.15. (b) Part of Schank’s Restaurant Script (Republished with permission of Taylor and Francis Group LLC Books, from Scripts Plans Goals and Understanding, Roger Schank and Robert Abelson, 1977; permission conveyed through Copyright Clearance Center, Inc.) The blue rectangles and words show correspondences with our version of the script. See Fig. 10.1 in Chap. 10 for a complete version of the Restaurant Script

that take place in a restaurant. The header portion of the script corresponds approximately to our SCENARIO portion.

In our process, a script is first created from a problem solving process by encoding the specific values of the parameters involved. For example, when the problem situation of an SMG problem is presented to/encountered by an Agent for the first time, the Agent has a specific START location, say AL1, and a specific relative distance to the GOAL, say RD1. The GOAL also has a specific location, say GL1, as defined by the problem. Together these constitute the START state (Fig. 2.17a). At the end of the movement process, the Agent’s final location is the same as the GOAL location, GL1. In the OUTCOME portion we also encode the changes, or absence of changes, in the parameters involved. The ACTIONS portion of the script encodes the specific actions taken to solve the problem – i.e., go from the START to the GOAL location. Figure 2.17a shows the specific script encoded in the above script creation process. Each elemental action is a force, F(AA, RA), that takes two arguments, AA = Absolute Angle and RA = Relative Angle, just like in Fig. 2.15. (There are other synchronic causal parameters, such as the specific value of GL, that were included with F in the process described in Fig. 2.15 but are omitted here for clarity.)

Note that at any given time, when an agent is exploring and interacting with the environment, there could be many other entities, with associated parameters, that can potentially participate in any script that the agent is attempting to encode. However, in the current example, the reason why only the Agent’s and its GOAL’s associated parameters are encoded in the script is that in a problem solving process, there is always a focus on certain aspects of the environment. For the case of the current SMG problem, the Agent begins with a need to be satisfied. This need is for it to change its current location to a GOAL location, and this can be achieved by movement. It knows that for itself, as far as movement is concerned, the relevant parameters are the AL, RD, AA, RA, etc. mentioned above, as it has earlier learned the causal connections between the force applied to itself and the changes in these parameters. The GOAL’s parameters are also relevant, as reaching the GOAL is a need to be satisfied in this problem solving process. The fact that any problem solving process is a purposeful act, together with some earlier knowledge learned (say, in the form of heuristics to be discussed further in Sect. 6.4 of Chap. 6), allows the script, with the relevant entities and their associated parameters, to be carved out of the environment. If more than enough entities/parameters are included initially, they will be weeded out in subsequent learning processes. If insufficient entities/parameters are included, subsequent learning processes will discover this and include them. In Chap. 6, notably Sect. 6.4, there will be more discussion on the learning of knowledge and heuristics that allows the system to discover relevant or irrelevant causal entities and parameters in the environment that may influence a problem solving process and hence the encoding of scripts.

In Fig. 2.17a, it is shown that the absolute location, AL, of the agent has changed (increased) from AL1 to GL1, and the change is indicated as +ΔAL = GL1 − AL1. The relative distance from the agent to the GOAL, RD, has decreased by an amount −ΔRD = RD1, and the final value is RD = 0. GL remains unchanged. The actions in the ACTIONS portion consist of a sequence of force applications in the absolute direction of A3, say, and a relative direction of 0° (zero degrees) with respect to the direction connecting the START and GOAL locations (as in Fig. 2.15). The script is named MOVEMENT-SCRIPT, and it is a SPECIFIC script because it captures movement starting from a specific location – AL1 – and ending in a specific location – GL1.

The SCENARIO portion of the script basically describes the scenario involved – that there is an Agent that is currently at a START location and that intends to move to a GOAL location. This portion also contains the specification of the parameters involved. The representation of this portion could be in analogical/pictorial or other forms.

Now, suppose another similar problem situation of an SMG problem is presented to/encountered by the agent with different START and GOAL locations, and, based on the causal learning augmented search process discussed above, the agent finds the same solution – i.e., at each elemental step of movement, always elect to move directly toward the GOAL. Based on a moderate level of desperation and dual-instance generalization as depicted in Fig. 2.13, a general movement script can be created as shown in Fig. 2.18.

The “*”s in front of the values AL1, RD1, and GL1 indicate that these can take on any value. The forces in the ACTIONS portion are now F(*, 0°), indicating that they can be pointing at any absolute angle but must be pointing toward the GOAL (RA = 0°). This general script in Fig. 2.18 basically encodes the result of the problem solving process as discussed in Sect. 2.6.1, with the general learned heuristic of always moving in the direction of the GOAL if the minimum distance requirement is to be satisfied.
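The four script portions and the role of the “*” wildcard can be sketched as a small data structure. This encoding is hypothetical (ours, not the book's notation); it also illustrates why a SPECIFIC script fails to match a new problem while the GENERAL one applies:

```python
from dataclasses import dataclass

WILD = "*"  # parameter that may take any value in a generalized script

@dataclass
class Script:
    name: str
    scenario: str   # SCENARIO: description of the situation
    start: dict     # START: parameter values required initially
    actions: list   # ACTIONS: e.g., repeated F(AA, RA) applications
    outcome: dict   # OUTCOME: final parameter values/changes

    def matches(self, problem_start):
        """A script applies to a problem if every non-wildcard START
        parameter agrees with the problem's start state."""
        return all(v == WILD or problem_start.get(k) == v
                   for k, v in self.start.items())

specific = Script("MOVEMENT-SCRIPT (SPECIFIC)", "spatial movement to goal",
                  start={"AL": "AL1", "RD": "RD1", "GL": "GL1"},
                  actions=[("F", "A3", 0)],
                  outcome={"AL": "GL1", "RD": 0})

general = Script("MOVEMENT-SCRIPT (GENERAL)", "spatial movement to goal",
                 start={"AL": WILD, "RD": WILD, "GL": WILD},
                 actions=[("F", WILD, 0)],
                 outcome={"RD": 0})

# A new SMG problem with different START/GOAL locations:
new_problem = {"AL": "AL2", "RD": "RD2", "GL": "GL2"}
applicable = [s.name for s in (specific, general) if s.matches(new_problem)]
```

This matching behavior anticipates the SCRIPT QUERY process discussed at the end of the section, where a specific, ungeneralized script can serve only a problem with identical START values.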

Figure 2.19 shows the basic process by which a script is constructed. In the case of a known problem solving process initiated by the system, the START and GOAL states and the SCENARIO (e.g., a spatial movement problem) are known (these are at a meta-level to the problem solving process itself). These are then used to construct the SCENARIO section of the script. In the case of Fig. 2.17, the SCENARIO description may contain an analogical representation of the problem itself, as this is a problem of a spatial nature, as with most of the problems that we will encounter in this book. A vector representation can still be used to store the relationships between different entities in the scenario for the sake of storage efficiency, but later, when the need arises to match a newly encountered SCENARIO to the scripts stored in the Script Base (assuming this is the place where earlier learned scripts are stored), an analogical/spatial matching may be carried out for the SCENARIO portion. After the problem solving process has completed, the ACTIONS and OUTCOME portions of the script are then constructed accordingly. As indicated in the SCRIPT CONSTRUCTION process in Fig. 2.19, a script can be constructed either as a consequence of a problem solving process or through observation – i.e., observing activities in the environment, such as, in this case, the spatial movement of an agent, or observing the spatial movements or other activities of other agents, including interactions between agents, etc., and constructing the script accordingly.

In the SCRIPT QUERY process (Fig. 2.20), presumably there is a problem at hand, and the SCENARIO and START of the problem are matched to the SCENARIO, START, and OUTCOME portions of the scripts in the Script Base (where earlier learned scripts are stored). The matched script will then be retrieved and the


(Figure: MOVEMENT-SCRIPT (GENERAL). START: AL = *AL1, RD = *RD1, GL = *GL1. ACTIONS: F(*, 0°) applied repeatedly. OUTCOME: AL = GL1, +ΔAL = GL1 − AL1; RD = 0, −ΔRD = RD1; GL = GL1, ΔGL = 0.)

Fig. 2.18 A general MOVEMENT-SCRIPT. “*” indicates that the corresponding parameter can be of any value

(Figure: SCRIPT CONSTRUCTION flow – the Intention TO-REACH(GOAL STATE) (e.g., Goal Location = GL in the Spatial Movement Scenario) and the Given CURRENT STATE (e.g., Start Location = AL) feed the SCENARIO; Problem Solving or Observation Derives the SOLUTION (e.g., move along a path from AL to GL); the PARAMETER CHANGES ASSOCIATED WITH THE SOLUTION (e.g., changes of AL and RD) are Recorded; and the SCRIPT is then Constructed.)

Fig. 2.19 The script construction process

solution retrieved accordingly, as shown in Fig. 2.20. In the case of the MOVEMENT-SCRIPT (SPECIFIC) of Fig. 2.17, it can only be used to solve a spatial movement problem that matches its AL = AL1 and GL = GL1 values exactly, because it is a specific script that has not been generalized yet. For the
