AIMA - Part I - Artificial Intelligence

Part I – Artificial Intelligence

In this first part of the book, Russell and Norvig give an overview of the history of the field and adjacent areas of research, and describe the idea of intelligent agents. They claim that we can think of “intelligent agents that do the right thing” - a view focused more on the actions taken and how we judge them than on how the agent works internally - as the “standard model” of AI research.

Chapter 1 – Introduction

One thing that stood out to me from this chapter was that - while the terms are used almost interchangeably today - the authors make a distinction between Artificial Intelligence and Machine Learning.

  • Artificial Intelligence is defined as being “focused on the study and construction of intelligent agents that do the right thing”, even in uncertain situations.
  • Machine Learning is a subfield of AI that focuses on “improving performance based on experience”.

So, based on these definitions, we can probably think of a rule-based system built by a human expert as an AI system but not an ML system, while modern approaches like deep learning or generative adversarial networks would fall into the ML category.

The fact that most modern ML networks only adjust their weights during a specific training phase creates an interesting grey area: a (base) model used for inference only, that is, with unchanging weights, might not strictly be considered machine learning. But I guess we’d still call this ML, as there is usually a broader feedback loop around the model, where data, examples, etc. from runtime inference are gathered and fed into future training runs of the model.

Chapter 2 – Rational Agents

Chapter 2 focuses on the idea of rational agents, which the authors claim is pretty much the “standard model” of AI these days.

  • A rational agent acts to achieve the best possible outcome or - if there is uncertainty - the best expected outcome, as defined by the agent’s objective. Or, more formally: a rational agent maps any given percept sequence to the action that it expects to maximize its performance measure, given the evidence provided by the percept sequence and any (built-in) knowledge the agent has.
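
Loosely, and just as my own pseudo-Python (not the book’s notation), that definition boils down to an argmax over actions:

    # My own sketch of the rational-agent definition above; the name
    # expected_performance is hypothetical and stands in for the agent's
    # expectations given its percepts and built-in knowledge.
    def rational_action(percept_sequence, actions, expected_performance):
        return max(actions, key=lambda a: expected_performance(a, percept_sequence))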

Judging an agent’s actions rather than how it is built or “thinks” makes for a common framework that allows comparing and optimizing agents without constraining us to a particular way of building them. But - unless we carefully design the agent’s objective - we may face a value alignment problem, where the agent’s objective does not actually align with the objective we wanted it to have. A funny example I remember hearing about was an attempt to make mechanical “birds” fly with AI by giving them the objective of maximizing the body’s height above ground, which led to one very “good” solution that just learned to push up the body by pressing the wings into the ground. But this is a serious problem, which gave rise to a whole swath of articles debating self-driving cars and the trolley problem.

Agents

A more formal definition of an agent is a system that senses its environment through sensors and can act on the environment through actuators. We can think of an agent as an agent function that maps the agent’s percept sequence (i.e., the sequence of percepts it has received from its sensors since the agent started) to an action. The book gives an example of a very simple cleaning robot, with two percept inputs and a small set of actions, to illustrate this. However, in practice we will usually specify an agent’s behaviour through some agent program instead of actually spelling out the agent function. Sometimes agents need to perform information gathering, that is, to perform actions to modify future percepts. The book gives an example of looking left and right before crossing the street.
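
As a rough illustration of the agent-function idea, here is my own sketch of a vacuum-style agent (the exact percepts and actions are approximated from memory and may differ from the book’s figures):

    # Sketch of an agent function for a tiny two-square vacuum world
    # (my own approximation, not the book's code).
    def vacuum_agent(percept_sequence):
        # This simple version only looks at the most recent percept.
        location, status = percept_sequence[-1]   # e.g. ("A", "Dirty")
        if status == "Dirty":
            return "Suck"
        return "Right" if location == "A" else "Left"

    # The agent has perceived two time steps so far:
    print(vacuum_agent([("A", "Clean"), ("B", "Dirty")]))  # -> "Suck"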

The book next talks about performance measures, which judge an agent’s actions, and suggests that these should be defined based on the consequences of those actions. That is, a

  • performance measure assigns scores to a sequence of environment states (created by the agent’s actions, or lack thereof), as sketched below.
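
For the vacuum example, the book proposes (if I remember correctly) something along the lines of one point per clean square per time step. A minimal sketch of such a consequence-based measure:

    # Score a sequence of environment states, not the agent's internals.
    def performance(state_sequence):
        # One point for every clean square at every time step.
        return sum(
            sum(1 for status in state.values() if status == "Clean")
            for state in state_sequence
        )

    # Each state maps square -> status at one time step.
    print(performance([{"A": "Dirty", "B": "Clean"}, {"A": "Clean", "B": "Clean"}]))  # -> 3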

The book notes that performance measures can be hard to define (the value alignment problem from earlier), but suggests it usually works better to focus on the actual results we want in the environment rather than on some idea of how we expect the agent to work. And indeed, AI sometimes finds ways of surprising us: I remember reading articles where AI produced unexpected designs for mechanical or electronics problems that worked better than the human solutions designed from known parts.

Task Environments

Given that we judge an agent by its actions, the task environment of the agent matters. The book suggests this is defined by PEAS, that is, the performance measure, environment, actuators, and sensors, and gives an example of a taxi driver agent (sketched below).
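
Roughly, and paraphrased from memory (so treat the details as approximate rather than quoted), the taxi agent’s PEAS description looks something like this:

    # PEAS description of the taxi-driver agent, paraphrased from memory;
    # the book's figure is the authoritative version.
    taxi_peas = {
        "performance_measure": ["safe", "fast", "legal", "comfortable trip", "maximize profits"],
        "environment": ["roads", "other traffic", "pedestrians", "customers"],
        "actuators": ["steering", "accelerator", "brake", "signal", "horn", "display"],
        "sensors": ["cameras", "GPS", "speedometer", "sonar", "accelerometer", "engine sensors"],
    }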

The book suggests a couple of dimensions along which task environments can differ, including whether an environment is fully or partially observable, whether it is single- or multi-agent, whether it is static or dynamic (i.e., willing to wait for the agent to think, or moving on while it does), etc.

Agent Taxonomy

The book suggests that agents come in four broad architectures of increasing complexity:

  • Reflex agents have no memory or internal state and just react to the most recent percept, often through some simple set of rules.
  • Model-based reflex agents start to remember some of their earlier percepts by keeping internal state that is updated over time and forms the basis of decisions (again, often in terms of condition-action rules); see the sketch after this list.
  • Goal-based agents have an explicit (single) goal they work towards, which requires a model of what effects actions will have (i.e., whether they take the agent closer to the goal) and a way of planning.
  • Utility-based agents may have multiple or conflicting goals, and thus judge actions by their expected utility (i.e., how “happy” would I be in the future state I expect my actions to have?).
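
For contrast with the reflex-style vacuum function sketched earlier, a model-based reflex variant might keep a crude world model around (again my own illustration, with made-up names):

    # Sketch of a model-based reflex agent for the two-square vacuum world.
    class ModelBasedVacuumAgent:
        def __init__(self):
            self.believed_status = {}                 # internal state built from past percepts

        def act(self, percept):
            location, status = percept
            self.believed_status[location] = status   # update the world model
            if status == "Dirty":
                return "Suck"
            other = "B" if location == "A" else "A"
            if self.believed_status.get(other) != "Clean":
                # we have not yet seen the other square clean, so go check it
                return "Right" if location == "A" else "Left"
            return "NoOp"                             # everything we know about is clean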

We can apply learning to all of these agent types by adding three components around the core agent. First, we need a critic that judges the agent’s actions. Second, we need a learning element that is allowed to change the agent’s knowledge (e.g., the rules, or the models predicting the effect of actions or the expected utility). Most interestingly, we need a problem generator that can push the agent out of its “comfort zone” (i.e., a local maximum of performance) by suggesting actions that are suboptimal in the short term but might help find better solutions in the long run.
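
A very rough sketch of how these three components could wrap an existing core agent (my own illustration; all names are hypothetical):

    import random

    # The critic, learning element and problem generator wrap the core agent.
    class LearningAgent:
        def __init__(self, core_agent, critic, learning_element, problem_generator,
                     exploration_rate=0.1):
            self.core_agent = core_agent                  # e.g. one of the four agent types above
            self.critic = critic                          # judges percepts against a performance standard
            self.learning_element = learning_element      # updates the core agent's knowledge/rules
            self.problem_generator = problem_generator    # proposes exploratory, possibly suboptimal actions
            self.exploration_rate = exploration_rate

        def step(self, percept):
            feedback = self.critic(percept)               # how well are we doing so far?
            self.learning_element(self.core_agent, feedback)
            if random.random() < self.exploration_rate:
                return self.problem_generator(percept)    # occasionally leave the comfort zone
            return self.core_agent(percept)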

State representations

The chapter concludes with different ways (world) state can be represented in agents, with increasing levels of expressiveness (but also complexity):

  • An atomic representation simply assigns unique values to possible states with no internal
    structure. The only thing we can say about two states is that they are different. Simple state machines come to mind here.
  • A factored representation assigns a vector of variables or attributes to states, so two states may be the same in some dimensions, but different in others. I was reminded of embeddings here, which seem to be an opaque / implicit version of this.
  • A structured representation uses objects, their attributes, and relationships to represent state. An example of this could be the RDF graphs used in the Semantic Web.

More expressive representations can usually produce shorter descriptions of state, but at the cost of more complex rules for reading and processing them.
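
To make the three levels concrete, here is a minimal sketch of my own (not the book’s notation):

    # Atomic: a state is just an opaque identifier; two states can only be
    # compared for equality, there is no internal structure.
    atomic_state = "S42"

    # Factored: a state is a record of variables, so two states can agree on
    # some attributes while differing in others.
    factored_state = {"fuel": 0.7, "gps": (52.52, 13.40), "oil_warning_light": False}

    # Structured: objects and relationships between them, e.g. a tiny set of
    # (subject, predicate, object) triples in the spirit of RDF.
    structured_state = {
        ("truck1", "is_a", "Vehicle"),
        ("car3", "is_a", "Vehicle"),
        ("truck1", "is_in_front_of", "car3"),
    }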