
3.1 Spatial Movement, Effort Optimization, and Need Satisfaction






minimum path length priority. But if these other priorities are not at work and only

the basic effort one is active, then the minimum path length behavior is the priority.

(If one is a lumberjack cutting wood in a forest, one would definitely choose
the shortest path from point to point to carry the cut lumber.) Sometimes, this
built-in effort conservation priority is manifested or characterized as “laziness.”

A deeper question concerning the priority to conserve energy as much as

possible is whether this tendency can be learned. In Chap. 1 we mentioned that

the priority to look for food or energy in a biological system has to be built-in, so
that when the food level has fallen to a certain critical level the system takes
the necessary actions. Otherwise the organism dies and no further learning is
possible; the learning must therefore have taken place at the evolutionary level and
been built in at the bio-noo boundary. (As mentioned above, this priority is not
absolute, especially in higher organisms such as human beings, for whom a
hunger strike, even one leading to death, is possible.) We have also
noted that for an artificial system, if we want it to continue to function normally and
do not want the interruption of “death from lacking in energy,” we would also have
to put in place a built-in high priority for it to look for energy when the level is low.¹

Therefore for both biological and artificial agents, the recharging priority should be

built-in. However, the energy conservation priority is tied to the recharging priority.

It is possible for an agent to learn that if it exhausts itself in one task unnecessarily

(such as going along a longer path to a desired destination unnecessarily), then it

will have to carry out recharging more often, and that would affect the satisfaction

of other internal needs, leading to an overall suboptimal situation. It will then adjust

its behavior accordingly.

Each biological agent such as a human being probably has some components of

“built-in energy conservation” and “learning to conserve energy.” As discussed

above, for the travel from start to goal situation, typically there exists a built-in

priority to conserve energy unless it is overridden by other priorities. There could be

other task situations in which a person may know initially that there are ways to

conserve energy but does not pursue them. Later, some associated negative consequences may ensue, and then she would learn from this and change her

behavior accordingly.



¹ As discussed in Chap. 1, learning to give priority to “recharging” is in principle possible for an
artificial system, since memory of previous experiences can be carried “across death” for an
artificial system. If the artificial agent can remember that it did not give priority to its energy level
maintenance and that this led to its death in its “previous life” – and the establishment of this causality
can be made through our causal learning method – it can adjust its priority accordingly in the “new
life.” But since the recharging priority is not difficult to build in, it is probably better for the creator
of the artificial being to have it built in rather than have the agent go through the trouble of
learning it.



3 A General Noological Framework

3.1.1 MOVEMENT-SCRIPT with Counterfactual Information



Figure 3.1 shows a knowledge structure, a MOVEMENT-SCRIPT, that contains a

COUNTERFACTUAL INFORMATION (CFI) portion that encapsulates the concept of a movement event in which an entity (inanimate object or animate Agent)

moves from one location to another (much like the counterfactual script of Fig. 2.23

of Chap. 2). On the leftmost part of the figure is a pictorial representation of the
various paths that allow the movement from one specific location to another specific
location to be executed. In the middle of the figure are basic MOVEMENT-SCRIPTS such as the one in Fig. 2.18 with a detailed specification of the START

state, ACTIONS sequence and OUTCOME of the movement event. These are

alternative MOVEMENT-SCRIPTS to capture the alternative movements from

START to GOAL. In addition to the parameters in the MOVEMENT-SCRIPT of

Fig. 2.18 for the START and OUTCOME portions, an energy parameter, E, is

included that represents the energy level of the Agent involved (this is indicated in

red). The action sequence specifications contain different action sequences to

encode the various possible paths. On the rightmost side of the script is a graph

showing “counterfactual information” in which the alternative paths and their

consequences in terms of energy consumption are captured. The graph shows that

there is an optimal path – namely the straight line path from the starting location to

the ending location – which corresponds to the minimum amount of energy/effort

consumed. (One way to interpret the CFI portion would be to read each point on the

graph as corresponding to the description – “had this path been taken, the energy

consumed would have been. . .,” etc.)
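The idea of the CFI portion can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not part of the original figure: each alternative path is modeled as two straight segments through an apex displaced sideways by a hypothetical detour_height, and the energy consumed is assumed to be proportional to the path length (cost_per_unit is an illustrative parameter). The names MovementScript, path_length, energy_consumed, and build_energy_cfi are introduced here for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class MovementScript:
    """One alternative script: a path from START to GOAL via a detour apex."""
    label: str
    detour_height: float  # 0.0 means the straight-line path (hypothetical parameter)


def path_length(start_to_goal: float, detour_height: float) -> float:
    """Length of a two-segment path whose apex is offset sideways by detour_height."""
    half = start_to_goal / 2.0
    return 2.0 * math.hypot(half, detour_height)


def energy_consumed(length: float, cost_per_unit: float = 1.0) -> float:
    """Assumed energy model: consumption proportional to the distance moved."""
    return cost_per_unit * length


def build_energy_cfi(scripts, start_to_goal):
    """CFI graph as a mapping: script label -> energy it would have consumed."""
    return {
        s.label: energy_consumed(path_length(start_to_goal, s.detour_height))
        for s in scripts
    }


scripts = [
    MovementScript("straight", 0.0),
    MovementScript("small detour", 2.0),
    MovementScript("large detour", 5.0),
]
cfi = build_energy_cfi(scripts, start_to_goal=10.0)

# "Had this path been taken, the energy consumed would have been cfi[label]."
best = min(cfi, key=cfi.get)
```

Under these assumptions the straight-line script is the one with the minimum entry in the mapping, which mirrors the minimum of the ΔE vs Trajectory graph described above.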

Much like the MOVEMENT-SCRIPT of Fig. 2.18, the MOVEMENT-SCRIPT

of Fig. 3.1 can be constructed through a problem solving process or learned through

causal learning from an agent’s experience with the environment including consideration of its internal states’ changes (e.g., the changes in the energy, E). Figure 3.1

contains the knowledge structure that fully specifies the concept of movements and

their consequences, and from it one can see that the reason to select the minimum

distance path is to achieve an optimal consumption of energy. This MOVEMENT-SCRIPT or conceptual specification is one of the simplest examples we will

encounter in this book that embodies the noological principle of including the

motivations and purposes as part of the consideration whenever we address the

issues of concepts. Minimum distance could in itself be a mathematical concept.

However, noological systems carry out actions for a purpose. The ground level

reason why a noological system prefers a minimum distance when moving from

point to point has to do with its internal needs and priorities. A noological system’s

actions are noological in nature, not mathematical. Therefore, there is always a

“why” associated with a noological system’s behavior and this has to be adequately

addressed before a proper understanding of noological systems can be achieved.

[Figure 3.1. Left: SCENARIO panel showing the Agent at its START location (AL) and the GOAL location (GL). Middle: alternative MOVEMENT-SCRIPTS that share the same START state (AL = *AL1, RD = *RD1, GL = *GL1, E = *E1). One script has ACTIONS F(*, 0°), F(*, 0°), F(*, 0°), ...; another has ACTIONS F(*, 45°), F(*, 40°), F(*, 35°), ... Their OUTCOME states are identical (AL = GL1, +ΔAL = GL1–AL1, RD = 0, –ΔRD = RD1, GL = GL1, ΔGL = 0) except for the energy change, which is –ΔE = E1–E2 in one and –ΔE = E1–E3 in the other. Right: COUNTERFACTUAL INFORMATION (CFI) graph of ΔE against Trajectory.]

Fig. 3.1 MOVEMENT-SCRIPT with energy (E, in red) and COUNTERFACTUAL INFORMATION (CFI)

Similar to Fig. 2.24, Fig. 3.2 shows how the MOVEMENT-SCRIPT of Fig. 3.1
containing counterfactual information is constructed from either a problem solving
process or from observation. The differences between Figs. 2.24 and 3.2 are

highlighted in dark red. The script can also be queried as shown in Fig. 3.3 in a

similar manner as in Fig. 2.25. It can be seen that the processes shown in Figs. 2.24

and 2.25 on the one hand and Figs. 3.2 and 3.3 on the other are very similar, and

hence they are general and would be applicable to other situations (the main

difference is, whereas for Fig. 2.24 the focus is on “actions,” in Figs. 3.2 and 3.3

the focus is on “solutions.”)

In the script of Fig. 3.1, other than the information on the change in effort or

energy for each of the paths going from a START location to a GOAL location,

there can also be other parameters of interest. For example, Fig. 3.4 shows, in the

CFI portion, the graph of another parameter, ΔAD, that is the total absolute distance

traveled from the START to the GOAL location, that varies with the different paths/

trajectories. (ΔAD is derived from ΔAX and ΔAY, the change in the absolute X

and Y locations of the Agent or object undergoing the movement.) In this case, it

happens to have the same shape as the ΔE vs trajectory graph. This graph conveys
the information that if the Agent needs to find the shortest distance to the
goal, it should use the script whose elemental movements are all in a
direction pointing at the GOAL location (the topmost script in Fig. 3.4). In general,

the CFI portion of the script can contain many graphs capturing the counterfactual

information of various parameters’ relationships to the alternative movement paths/

trajectories. These can be generated automatically in anticipation of their potential

uses (method outlined below) or from other processes that require some relevant

information for their purposes.
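As a small worked illustration of how ΔAD could be derived from ΔAX and ΔAY, the Python sketch below sums the Euclidean lengths of the elemental movements along each trajectory and stores the result as one graph in a CFI mapping. The trajectory data, the key name "dAD", and the function total_absolute_distance are hypothetical and chosen only for this example; the summation over elemental steps is our reading of “total absolute distance traveled.”

```python
import math


def total_absolute_distance(steps):
    """Sum the lengths of elemental movements given as (dAX, dAY) pairs."""
    return sum(math.hypot(dx, dy) for dx, dy in steps)


# Hypothetical trajectories from the same START to the same GOAL, 6 units apart.
trajectories = {
    "direct": [(2.0, 0.0), (2.0, 0.0), (2.0, 0.0)],
    "detour": [(3.0, 4.0), (3.0, -4.0)],
}

# One CFI graph per parameter: trajectory label -> value of that parameter.
cfi = {
    "dAD": {name: total_absolute_distance(steps) for name, steps in trajectories.items()}
}

# The trajectory with the shortest total distance traveled.
shortest = min(cfi["dAD"], key=cfi["dAD"].get)
```

Further parameters (such as ΔE) could be added as extra keys of the same mapping, which corresponds to a CFI portion containing several graphs.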

[Figure 3.2. Flowchart, COUNTERFACTUAL SCRIPT CONSTRUCTION:
Intention: TO-REACH(GOAL STATE) (e.g., Goal Location = GL in the Spatial Movement Scenario)
Given: CURRENT STATE (e.g., Start Location = AL in the Spatial Movement Scenario)
Problem Solving or Observation: SOLUTIONS(1, 2, 3, … N) (e.g., move along various paths from AL to GL in the Spatial Movement Scenario)
Record: PARAMETER CONSEQUENCE ASSOCIATED WITH SOLUTIONS (e.g., energy expended in path in the Spatial Movement Scenario)
Construct: SCRIPT with COUNTERFACTUAL INFORMATION]

Fig. 3.2 Counterfactual script construction process for scripts containing solutions to problems. Contrast this with Fig. 2.24 – the differences are highlighted in dark red

In addition to the energy and distance traveled counterfactual information stored
with a script as shown in Fig. 3.4, there could be other “instance statistics” stored as
well. This information could include any pattern or context observed in the usage of

the script. For example, the time and location at which the script is usually

triggered, the other scripts that are usually triggered together with this script in

some earlier problem solving processes, the frequency with which the script is used in the

course of a certain typical time interval (such as in the course of a day), etc.

Below we outline a method by which the counterfactual information is

constructed. Basically, parameter changes (or lack of changes) across instances

(e.g., different trajectories from start to goal states) are kept track of, whether it be

something related to “internal needs,” such as energy ΔE, or something that is a

physical parameter, such as ΔAD – the total distance traveled. In a simple situation

such as that of Fig. 3.1, there are not many parameters of interest and the system can

simply list all possible parameters that are available (supplied by the visual and

other sensory systems) and that either change or do not change across the instances.

When the situation scales up to something more complex like the real world

situation (such as in the case of the Restaurant Script to be described below shortly),

extra knowledge will be needed to narrow down the relevant parameters to be

considered, otherwise there will be an explosion in the number of parameters being

listed in the CFI portion. This will be part of a “building knowledge from ground
up” process that includes learning of meta-level heuristics that we will discuss later
in this book. Suffice it to note that for the current purpose, we assume that this
information is available for problem solving and other noological tasks.

[Figure 3.3. Flowchart, COUNTERFACTUAL SCRIPT QUERY:
Intention: TO-REACH(GOAL STATE) (e.g., Goal Location = GL in the Spatial Movement Scenario)
Given: CURRENT STATE (e.g., Start Location = AL in the Spatial Movement Scenario)
Enquire: RELEVANT SCRIPT (e.g., the MOVEMENT-SCRIPT in the Spatial Movement Scenario)
Value Enquiry: For a given solution, what is the consequence on a certain value, or what solution is needed to achieve a desired value? (e.g., which of the available trajectories gives rise to ΔE = ΔEX in the Spatial Movement Scenario)
Optimality Enquiry: Which solution gives rise to the optimal consequence on a certain value? (e.g., which of the available trajectories gives rise to a maximum or minimum ΔE in the Spatial Movement Scenario)
Inset: the ΔE vs Trajectory graph from Fig. 3.1, with the level ΔEX marked.]

Fig. 3.3 Counterfactual script query process for scripts containing solutions to problems. Contrast this with Fig. 2.25 – the differences are highlighted in dark red
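The construction and query processes of Figs. 3.2 and 3.3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used in this book: solutions are reduced to a single number (a hypothetical path length), the energy model of 0.5 units per unit distance is invented for the example, and the names construct_cfi, value_enquiry, and optimality_enquiry are ours.

```python
from typing import Callable, Dict, Hashable, List

# CFI portion: parameter name -> {solution label -> recorded consequence}
CFI = Dict[str, Dict[Hashable, float]]


def construct_cfi(solutions: Dict[Hashable, object],
                  recorders: Dict[str, Callable[[object], float]]) -> CFI:
    """Record, for every parameter, its consequence under every solution."""
    return {
        param: {name: record(sol) for name, sol in solutions.items()}
        for param, record in recorders.items()
    }


def value_enquiry(cfi: CFI, param: str, target: float, tol: float = 1e-9) -> List[Hashable]:
    """Which solutions give rise to (approximately) the target value of param?"""
    return [name for name, value in cfi[param].items() if abs(value - target) <= tol]


def optimality_enquiry(cfi: CFI, param: str, mode: str = "min") -> Hashable:
    """Which solution gives rise to the minimum (or maximum) value of param?"""
    pick = min if mode == "min" else max
    return pick(cfi[param], key=cfi[param].get)


# Hypothetical solutions: each trajectory is summarized by its path length.
solutions = {"T1": 10.0, "T2": 12.5, "T3": 17.0}
recorders = {
    "dAD": lambda length: length,        # distance traveled
    "dE": lambda length: 0.5 * length,   # assumed energy model (illustrative only)
}
cfi = construct_cfi(solutions, recorders)

matches = value_enquiry(cfi, "dE", 6.25)       # trajectories with dE equal to 6.25
cheapest = optimality_enquiry(cfi, "dE")       # trajectory with minimum dE
costliest = optimality_enquiry(cfi, "dE", "max")
```

Listing every available parameter in recorders corresponds to the simple situation of Fig. 3.1; in a more complex situation the choice of which recorders to include is exactly the selection problem discussed above.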

As mentioned in Chap. 2, in traditional AI, the concept of script has been

investigated by Schank and Abelson (1977) and as was also pointed out in

Chap. 2, one major difference between our script and the earlier script is that we

investigate the process that allows scripts to be learned, either through a problem

solving process or through observation/experience. Another extension to the script

idea compared to that of Schank and Abelson (1977) is that we add the CFI portion

to the script structure as shown in Figs. 3.1 and 3.3. In Fig. 3.5, therefore, we show

how a CFI portion can be added to Schank and Abelson’s script.

In Fig. 3.5 it is shown that a number of counterfactual information graphs have
been added to the traditional RESTAURANT-SCRIPT. These include the change
of amount of money on the part of both the customer and owner, and the change of
the level of hunger and the level of being pleased (by the service or the quality of
the food) on the part of the customer.

[Figure 3.4. The MOVEMENT-SCRIPT of Fig. 3.1 (SCENARIO panel; START states AL = *AL1, RD = *RD1, GL = *GL1, E = *E1; ACTIONS F(*, 0°), F(*, 0°), F(*, 0°), ... and F(*, 45°), F(*, 40°), F(*, 35°), ...; OUTCOME states with –ΔE = E1–E2 and –ΔE = E1–E3), shown with two COUNTERFACTUAL INFORMATION (CFI) graphs: ΔE against Trajectory and ΔAD against Trajectory.]

Fig. 3.4 MOVEMENT-SCRIPT with distance traveled counterfactual information, ΔAD

[Figure 3.5. RESTAURANT-SCRIPT with START, ACTIONS, and OUTCOME portions, plus COUNTERFACTUAL INFORMATION graphs of ΔMoney(S), ΔMoney(O), ΔHungry(S), and ΔPleased(S), each plotted against the restaurant kinds F, A, and E. Legend: S = Customer of restaurant; O = Owner of restaurant; F = Fast food restaurant; A = Average restaurant; E = Expensive restaurant.]

Fig. 3.5 Schank and Abelson’s (1977) RESTAURANT-SCRIPT enhanced with counterfactual information (RESTAURANT Script portion republished with permission of Taylor and Francis Group LLC Books, from Scripts Plans Goals and Understanding, Roger Schank and Robert Abelson, 1977; permission conveyed through Copyright Clearance Center, Inc.)

As will be seen later in the book, notably in Chap. 7, the CFI portion of the script,
which provides more information than just the OUTCOME portion,
contains very important information that allows the problem solving process to select
the right/desired script in a backward chaining process.

As mentioned above, we would like to emphasize that the example given in

Fig. 3.5 is just to illustrate how the basic idea of counterfactual information is

applicable in a general situation. While the automatic extraction of the






counterfactual information such as the change of energy and distance traveled with

respect to the various trajectories in Figs. 3.1 and 3.4 is relatively simple as there

are not many variables involved in those situations, and the system can simply

record the changes over a few “built-in” designated variables of relevance, the

restaurant situation, on the other hand, involves many variables and the selection

of the relevant ones to create the CFI portion of the script may require higher level

reasoning. Our current paradigm dictates that higher level knowledge is built on

grounded level knowledge. The issues of ground level semantics are discussed in

the next chapter.

Another issue concerns the competition of needs alluded to in Sect. 1.5, Chap. 1.

One good example would be none other than the restaurant-going situation as

encoded by the RESTAURANT-SCRIPT of Fig. 3.5. Suppose one needs to make

a decision on which or what kind of restaurant to have a meal at. By examining the

CFI portion, one can see that the three kinds of restaurants – the fast food, average,

and expensive restaurants – all provide the same degree of satisfaction of hunger.

However, the expensive restaurant provides a more “pleased” outcome (desired)

whereas it also ends up with the most money spent (undesired). Hence a need

competition arises between the need for being pleased with the food consumed

versus the need to conserve as much money as possible. We will return to address

this issue in more detail in Sect. 3.3.
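To make the competition concrete, the Python sketch below reads a decision off a CFI table by scoring each kind of restaurant with a weighted sum of the competing needs. Both the numbers in the table and the weighted-sum rule are assumptions made purely for illustration; they are not taken from Fig. 3.5, and the weighted sum is a stand-in, not the treatment of need competition given in Sect. 3.3. Hunger is left out of the score because, as noted above, all three kinds satisfy it equally.

```python
# Hypothetical CFI values for the three restaurant kinds (F, A, E).
cfi = {
    "d_hungry":  {"F": -1.0, "A": -1.0,  "E": -1.0},   # same hunger satisfaction
    "d_pleased": {"F": 0.2,  "A": 0.5,   "E": 0.9},    # E pleases the customer most
    "d_money":   {"F": -5.0, "A": -20.0, "E": -80.0},  # E costs the customer most
}


def choose(cfi, w_pleased, w_money):
    """Pick the kind that maximizes a weighted sum of the two competing needs."""
    def score(kind):
        return w_pleased * cfi["d_pleased"][kind] + w_money * cfi["d_money"][kind]
    return max(cfi["d_pleased"], key=score)


# An agent that weights money heavily versus one that barely weights it.
frugal = choose(cfi, w_pleased=1.0, w_money=0.05)
indulgent = choose(cfi, w_pleased=1.0, w_money=0.001)
```

With these illustrative weights the first agent selects the fast food restaurant and the second the expensive one, showing how the same CFI portion can support different choices depending on the relative priority of the needs.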



3.2 Causal Connection Between Movement and Energy Depletion



In Fig. 3.1 we have extracted the connection between energy depletion on the part

of an agent and its movement through a certain trajectory/path. This is done

basically through recording the change (difference between final and initial values)

of the energy of the system across different instances of trajectory/path execution.

However, in a general situation, it is important to the system also to be able to

understand and encode the knowledge that it is the movement itself that causes the

depletion in the energy and not because of other parameters, such as the agent

singing all the way from the beginning to the end of the trajectory. (In the case of

energy, of course, almost every activity executed by the agent including singing

will cause a depletion of energy of some degree, but there are activities that deplete

energy more extensively than others). There are two ways that the connection

between energy depletion and movement can be established. One way is, since

energy depletion as a result of movement is very fundamental to the “survival” of a

system, whether it be a natural or an artificial system, we can simply build in the

knowledge of this connection. The other is, through our causal learning process as

outlined in Chap. 2, the connection between movement and energy depletion can be

established as shown in Fig. 3.6.



[Figure 3.6. Panel (a): a stationary Agent, drawn with a “face,” shown at time slices t = 1, t = 2, and t = 3, beside a graph with axes labeled “Energy Increase ΔE” and “Time” on which ΔE1 is marked. Panel (b): the Agent moves as a result of an internal force, shown at time slices t = 1, t = 2, and t = 3, beside the same kind of graph on which ΔE1 and ΔE1 + ΔE2 are marked.]

Fig. 3.6 (a) Agent is stationary but there is energy depletion due to internal processes. (b) Agent is being moved by an internal force that consumes extra energy. Agent is depicted to have a “face” that defines a direction of movement



Firstly, let us consider a situation in which the Agent does not move and yet there
is an energy depletion due to some other internal processes that consume the energy
as shown in Fig. 3.6a.² In the figure, it is shown that for every elemental time
change from, say, t = 1 to t = 2, there is a decrease in energy by an amount ΔE1
and that amount is registered at the end of the time frame – at t = 3 (the first ΔE1
shown at t = 1 is due to the process between t = −1 and t = 0). For natural living
agents, internal biological processes are always ongoing and consuming energy.
For an artificial agent, even when there is no movement, the internal processor could
still be operating (perhaps it is still “thinking” and solving problems). This
situation could be encoded as

(Stationary(Agent, t = 1 to 2) ∧ Thinking/Living(Agent, t = 1 to 2)) → Energy-Change(Agent, ΔE1, t = 2 to 3).

The “action”
that causes the Agent’s change of energy is the thinking/living process. However,
this causal rule could only be learned using the causal learning process described
in Chap. 2 provided there was a situation earlier in which there were no thinking
processes and no energy depletion, and when the thinking processes
were switched on, an energy depletion followed.



² The Agent in Fig. 3.6a is shown with a “face” indicating a “forward” direction. This feature will
become more useful in subsequent discussions in this chapter.






Note that the “thinking” or “ongoing activities of an agent’s internal processor” is

a meta-level process that has to be observed and characterized at a meta-level in order

for the system to be able to identify its causal connection to the energy depletion

process. If that is not done, then a simpler characterization of the process would be

Stationary(Agent, t = 1 to 2) → Energy-Change(Agent, ΔE1, t = 2 to 3).

Next, we consider the situation in which the Agent moves. In Fig. 3.6b the

movement of the Agent is shown to take place between time slices t = 1 and t = 2
due to an internally generated force that consumes energy. Over the elemental time
change from t = 1 to t = 2, the internal energy of the agent decreases by an amount
ΔE1 + ΔE2 and this is registered at t = 3. Hence an event – a change in position of the
Agent – between t = 1 and t = 2 is followed by another event – a further change in the
energy level – between t = 2 and t = 3. A causal connection can be established and
causal rule encoded based on the effective causal learning method described in Chap. 2
such as Move(Agent, t = 1 to 2) → Energy-Change(Agent, ΔE2, t = 2 to 3). (ΔE2

is shown to be larger than ΔE1.) This illustrates the learning of the causal connection

between movement and energy depletion.
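The contrast between the stationary and the moving case can be illustrated with a short Python sketch. It does not reproduce the causal learning method of Chap. 2; it merely assumes a hypothetical log in which each entry pairs what the Agent did during one elemental time step with the energy drop registered in the following step, and it estimates ΔE1 and ΔE2 by comparing averages. The data and the function name learn_energy_rules are invented for the example.

```python
from statistics import mean

# Hypothetical log: (what the Agent did during t..t+1,
#                    energy drop registered during t+1..t+2).
observations = [
    ("stationary", 1.0),
    ("move", 4.0),
    ("stationary", 1.0),
    ("move", 4.0),
    ("move", 4.0),
]


def learn_energy_rules(observations):
    """Estimate the baseline drop (dE1) and the extra drop that follows movement (dE2)."""
    baseline = mean(drop for action, drop in observations if action == "stationary")
    moving = mean(drop for action, drop in observations if action == "move")
    # When moving, the total drop is dE1 + dE2, so the extra part is the difference.
    return {"dE1": baseline, "dE2": moving - baseline}


rules = learn_energy_rules(observations)
```

In this toy log the extra drop attributed to movement is larger than the baseline drop, in line with the remark that ΔE2 is shown to be larger than ΔE1.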



3.3 Need and Anxiousness Competition



As mentioned in Sect. 3.1, typically a noological system has multiple internal needs

that compete for priority of being satisfied. The example given earlier was based on

the RESTAURANT-SCRIPT (e.g., to please oneself by going to a more expensive

restaurant or to save money by going to a cheaper one?). In this section we use a

scenario from the StarCraft battle game environment (StarCraft II 2015) to illustrate

the idea of need competition. In Chap. 7, there is more detailed discussion on

learning and problem solving using the StarCraft environment.

In Fig. 3.7 we present some re-represented screenshots from the StarCraft game

environment. Figure 3.7a shows three agents, a Self agent, an Enemy agent, and a

Medic agent, that are placed at some distance away from each other in the

environment. There are many kinds of agents in the StarCraft environment and

there are also many kinds of activity that can take place. Typically, in a game, the

human player would control the agents on the “Self” side and the system will react

by controlling the agents on the “Enemy” side and the battle will ensue with many

shooting and tactical movement events, etc. that will result in one side winning over

the other side. The system also keeps track of many parameters associated with each

agent such as its location, orientation, state of attack, and energy level.

In our current discussion, however, we will focus on a small number of agents

and a small number of relevant parameters associated with the agents. One parameter of particular interest is what is called Health Point (HP) which reflects the state

of health of the agent involved and if HP reaches zero (0), the agent “dies.” Thus,

maintaining a high/minimum level of HP is a priority for the agent.






Fig. 3.7 (a) Three kinds of StarCraft agents: Self, Enemy, and Medic agents. (For copyright
reasons, the exact screenshots are not reproduced here. Instead, the same visual
contents were redrawn and re-represented. To get an idea of what the original visual output
looks like, please follow some of the hyperlinks provided in Chap. 7.) (b) Self agent moves to
Enemy agent and both are then engaged in a shooting event. (c) Self agent moves to Medic agent to
recharge its Health Point (HP). Some objects in the StarCraft environment are
“facilities,” which we will ignore in the current discussion



In Fig. 3.7b it is shown that the Self agent has moved close to the Enemy agent

and they are engaged in a shooting event in which they shoot at each other and if

this is continued all the way it will lead to one of them being “killed,” unless they

disengage from each other before that happens. In Fig. 3.7c it is shown that the Self

agent has disengaged itself from the shooting with the Enemy agent and approached

the Medic agent to charge up its HP. (The Medic agent here only charges the HP of

the Self agent, not that of the Enemy agent.)

The causal rules that are associated with these two situations – Self agent

shooting with Enemy agent and Self agent’s HP being charged by the Medic

agent – both triggered by certain causal preconditions, are learnable and this has

been described in Ho and Liausvia (2014) and will also be discussed in detail in

Chap. 7. For this section, we are assuming that these rules are already learned and

the focus will be on need/affective competition.



[Figure 3.8. Graph of Anxiousness of Low HP (ALH), from 0% to 100%, against HP (Health Point), up to 100%. The HP Charging Response Threshold (HCRT) is marked on the graph. Annotations read “Increasing ALH” and “Change of shape enables Affective Learning.”]

Fig. 3.8 Anxiousness of Low HP (ALH) vs Health Point (HP) graph



Figure 3.8 introduces a concept called “Anxiousness of Low HP”³ (ALH) that
captures the “emotional state” of the agent with regard to its HP level.

Generally, the relationship is an inverse one – the lower the HP level, the higher the

level of ALH. The relationship is also typically a non-linear one. For example, there

is a range of HP reduction from the maximum level of 100 % in which ALH does

not increase very much, but as HP nears 0, ALH would shoot up rapidly.

As the HP of an agent decreases, there is also a threshold of ALH above which

the agent will respond to the situation and typically that means it would seek ways

to recharge its HP. We term this HP Charging Response Threshold (HCRT).

The shape of the ALH vs HP curve is changeable with experience. For example,

the agent may learn that it was not anxious enough at a given HP level and that this led
to some disastrous consequences (such as losing a battle or being killed); it would
then “up-adjust” the curve. The HCRT is also learnable with experience – if

response is taken, say, at too high an anxiousness level and it leads to undesirable

consequences, the threshold would be up-adjusted.

There is also the situation in which the agent may be “over anxious” – i.e.,
assigning too high an ALH level for a given HP level or setting too low an HCRT.
This leads to a different kind of undesirable consequence – over attending to the task
of recharging the HP and foregoing other priorities, which may in turn lead to other
kinds of disastrous consequences (e.g., abandoning the assigned task of engaging the
enemy and repeatedly going to recharge the HP level, leading to the inability to win the
battle). In this case, the agent needs to “down-adjust” the curve or the HCRT.
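The ALH vs HP relationship and the adjustment of the curve can be sketched in Python as follows. The power-law form of the curve, the steepness parameter, and the multiplicative adjustment factors are all assumptions made for illustration; the text only requires that ALH rise slowly as HP falls from 100 % and steeply as HP nears 0. HP and ALH are expressed as fractions in [0, 1], and the function names are ours. Only the curve is adjusted here; the HCRT is held fixed.

```python
def alh(hp: float, steepness: float = 3.0) -> float:
    """Anxiousness of Low HP in [0, 1] for hp in [0, 1]; an assumed power-law shape."""
    hp = min(max(hp, 0.0), 1.0)  # clamp HP to the valid range
    return (1.0 - hp) ** steepness


def should_recharge(hp: float, hcrt: float, steepness: float = 3.0) -> bool:
    """Respond when anxiousness reaches the HP Charging Response Threshold."""
    return alh(hp, steepness) >= hcrt


def up_adjust(steepness: float, factor: float = 0.8) -> float:
    """Raise the curve: a smaller exponent gives more anxiousness at intermediate HP."""
    return max(1.0, steepness * factor)


def down_adjust(steepness: float, factor: float = 1.25) -> float:
    """Lower the curve: a larger exponent gives less anxiousness at intermediate HP."""
    return steepness * factor


# At 40 % HP the original curve stays below a threshold of 0.25 ...
before = should_recharge(0.4, hcrt=0.25)
# ... but after one up-adjustment the same HP level triggers a recharge response.
after = should_recharge(0.4, hcrt=0.25, steepness=up_adjust(3.0))
```

An agent that learned it had been insufficiently anxious would call up_adjust after the disastrous episode, while an over anxious agent would call down_adjust; the same effect on recharging behavior could instead be obtained by changing hcrt and leaving the curve alone.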

In principle, if the purpose of the ALH-HP curve and the HCRT is
deciding when to recharge the agent’s HP, then we could either change the HCRT
as a result of learning while keeping the ALH-HP curve constant, or vice versa. The

reason why we advocate having two adjustable aspects – the HCRT and the



³ Fear and anxiousness are used synonymously here and in the rest of the book. There are subtle
differences which we will not particularly address in this book. E.g., typically it is “fear of the
snake causes him to run” and “I am anxious that I cannot finish my homework by tonight.” In these
examples, it seems to be a matter of degree.


