Tuesday, October 25, 2011

BAYES RULE





REVISED: Wednesday, February 4, 2015



You will learn Bayes Rule.

I.  BAYES RULE

Bayes' Rule was invented by the 18th-century British mathematician Rev. Thomas Bayes, a Presbyterian minister.

P( A | B ) = ( P( B | A ) * P(A) ) / P( B )

P( A | B ) is the posterior, probability of event A given event B occurred.

P( B | A ) is the likelihood, probability of event B given event A occurred.

P( A ) is the prior.

P( B ) is the marginal likelihood.

Bayes' Rule links a conditional probability to its inverse.

posterior = (likelihood*prior)/(marginal likelihood)

Bayesian reasoning is very counterintuitive.

Bayes' theorem is particularly useful for inferring causes from their effects.

Initial Beliefs + Recent Objective Data = A New and Improved Belief.

Bayesian reasoning brings clarity when information is scarce and outcomes uncertain.

By expressing all information in terms of probability distributions, Bayes can produce reliable estimates from scant and uncertain evidence.

Probability is the fraction of times an event is expected to occur if the experiment is repeated for a large number of trials.

Joint probability is the probability of two events occurring together at the same point in time.

The probability of two events, A and B, both occurring is expressed as:

P(A,B)

Joint probability can also be expressed as:


P(A ∩ B)

and is read as the probability of the intersection of A and B.


A.  Complementary Event of Bayes Rule

"A" is not observable.

"B" is our test, it is observable.

We know the prior probability for "A" which is P(A).

We know the conditional probability for "B" given "A" which is P(B|A).

What we care about is the Diagnostic Reasoning, which is the inverse of the Causal Reasoning:

P( A | B )

P( A | ¬B )

It takes three parameters to describe the entire Bayes network:

One parameter for P( A ), from which we can derive P( ¬A ).

Two parameters, P( B | A ) and P( B | ¬A ), from which we can derive P( ¬B | A ) and P( ¬B | ¬A ).

Therefore it takes a total of three parameters to describe the Bayes network.

B.  Bayes Rule

P( A | B ) = ( P( B | A ) * P(A) ) / P( B )

The complementary event is ¬A.

P( ¬A | B ) = ( P( B | ¬A ) * P( ¬A ) ) / P( B )

The normalizer is P(B).

We know:

P( A | B ) + P(  ¬ A | B ) = 1

We can compute Bayes' Rule while deferring the normalizer:

P'( A | B ) = P( B | A ) * P( A )

P'( ¬A | B ) = P( B | ¬A ) * P( ¬A )

P( A | B ) = ζ P'( A | B )

P( ¬A | B ) = ζ P'( ¬A | B )

ζ = ( P'( A | B ) + P'( ¬A | B ) )⁻¹
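
A minimal sketch of this normalization trick in Python (the function name and parameter names are mine, for illustration only):

def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Posterior P(A|B) computed with the normalization trick above."""
    p_not_a = 1.0 - p_a
    unnorm_a = p_b_given_a * p_a              # P'(A|B) = P(B|A) * P(A)
    unnorm_not_a = p_b_given_not_a * p_not_a  # P'(¬A|B) = P(B|¬A) * P(¬A)
    zeta = 1.0 / (unnorm_a + unnorm_not_a)    # ζ, the inverse of the normalizer P(B)
    return unnorm_a * zeta                    # P(A|B) = ζ * P'(A|B)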

C.  Examples

The two-test cancer example.

It takes three parameters to describe the entire Bayes network:

P( c ) = 0.01 is the probability of a patient having cancer.

P( + | c ) = 0.9 is the probability of receiving a positive test given the patient has cancer.

P( - | ¬ c ) = 0.8 is the probability of receiving a negative test given the patient is cancer free.

========================================

P( ¬c ) = 0.99 is the probability of a patient not having cancer.

P( - | c ) = 0.1 is the probability of receiving a negative test given the patient has cancer.

P( + | ¬ c ) = 0.2 is the probability of receiving a positive test given the patient is cancer free.
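
Feeding these three parameters into the bayes_posterior sketch from section B gives the probability of cancer given a single positive test (hypothetical usage of the sketch above):

# P(c|+): probability of cancer given one positive test.
p_c_given_pos = bayes_posterior(p_a=0.01, p_b_given_a=0.9, p_b_given_not_a=0.2)
print(p_c_given_pos)  # 0.009 / (0.009 + 0.198) = 0.009 / 0.207 ≈ 0.043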


You have learned Bayes Rule.

Elcric Otto Circle









Sunday, October 23, 2011

PROBABILITY IN ARTIFICIAL INTELLIGENCE







REVISED: Wednesday, February 4, 2015






I.  PROBABILITIES

We will start off by using a coin flip with H for heads and T for tails.

Probability of heads is P(H) = 1/2 = 0.5
Probability of tails is P(T) = 1/2 = 0.5

What is the probability of getting three H out of three coin flips?

P( { H, H, H } ) = 1/2 * 1/2 * 1/2 = 1/8 = 0.125

A.  Symbols

The " | " reads "provided that, condition on, or given."
The " : "  reads "such that."
The " ¬ "  reads "not."
The " ⊥ "  reads "independent."
The " ⇒ "  reads "implies."
The " ∑ "  reads "summation."
The " ∩ "  reads "intersection of."

B.  Complementary Probability

If probability of an event is:

P( A ) = p;

then the complementary probability of the event is:

P( ¬A ) = ( 1 - p ).

You can say X and Y are independent events, or:

X ⊥ Y

Independence means the joint probability factors into the product of the individual probabilities:

X ⊥ Y : P(X) P(Y) = P(X,Y)

C.  Dependence

P( X₁ = H ) = 1/2
H : P( X₂ = H | X₁ = H ) = 0.9
T : P( X₂ = T | X₁ = T ) = 0.8

What is the probability of the second coin flip coming up heads?

P( X₂ = H ) = 0.55

P( X₂ = H ) = P( X₂ = H | X₁ = H ) * P( X₁ = H ) +
P( X₂ = H | X₁ = T ) * P( X₁ = T ) =
( 0.9 * 1/2 ) + ( ( 1 - 0.8 ) * 1/2 ) =
0.45 + 0.1 = 0.55
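
The same total-probability computation as a short Python sketch (variable names are mine):

p_x1_h = 0.5       # P(X₁ = H)
p_h_given_h = 0.9  # P(X₂ = H | X₁ = H)
p_t_given_t = 0.8  # P(X₂ = T | X₁ = T)

# Sum over both outcomes of the first flip (total probability).
p_x2_h = p_h_given_h * p_x1_h + (1.0 - p_t_given_t) * (1.0 - p_x1_h)
print(p_x2_h)  # 0.45 + 0.1 = 0.55 (up to floating-point rounding)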

Xᵢ = result of the i-th coin flip.

Xᵢ ∈ { H, T }

P(H) = 1/2

P(T) = 1/2

What is the probability of all four flips being H, or all four flips being T?

For all four flips being H we have:

P( X₁ = X₂ = X₃ = X₄ = H ) =
1/2 * 1/2 * 1/2 * 1/2 = 1/16

For all four flips being T we would have the same result, 1/16.

The probability of either 4 H or 4 T = 1/16 + 1/16 = 2/16 = 1/8 = 0.125.

What is the probability of at least three out of four flips being H?

P( { X₁, X₂, X₃, X₄ } contains ≥ 3 H ) = 5 * 1/16 = 5/16 = 0.3125

HHHH = 1/2 * 1/2 * 1/2 * 1/2 = 1/16
HHHT = 1/2 * 1/2 * 1/2 * 1/2 = 1/16
HHTH = 1/2 * 1/2 * 1/2 * 1/2 = 1/16
HTHH = 1/2 * 1/2 * 1/2 * 1/2 = 1/16
THHH = 1/2 * 1/2 * 1/2 * 1/2 = 1/16
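
This enumeration can be checked by brute force over all 16 equally likely sequences (a sketch using the standard library):

from itertools import product

outcomes = list(product("HT", repeat=4))  # all 16 equally likely sequences
favorable = [o for o in outcomes if o.count("H") >= 3]
print(len(favorable), len(favorable) / len(outcomes))  # 5 0.3125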

D. Total Probability 

P( Y ) = ∑ᵢ P( Y | X = i ) P( X = i )

P( ¬X | Y ) = 1 - P( X | Y )


Example 1

P( D₁ = sunny ) = 0.9
P( D₂ = sunny | D₁ = sunny ) = 0.8

P( D₂ = rainy | D₁ = sunny ) = 1 - 0.8 = 0.2


Example 1.1

P( D₂ = sunny | D₁ = rainy ) = 0.6

P( D₂ = rainy | D₁ = rainy ) = 1 - 0.6 = 0.4


Example 1.2

P( D₂ = sunny ) = ( 0.9 * 0.8 ) + ( ( 1 - 0.9 ) * 0.6 ) = 0.78

P( D₃ = sunny ) = ( 0.78 * 0.8 ) + ( ( 1 - 0.78 ) * 0.6 ) = 0.756
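
The recursion is easy to see in code; a minimal sketch (function and variable names are mine) using the transition probabilities from Examples 1 and 1.1:

def p_sunny_next(p_sunny_today):
    # Total probability: sunny-after-sunny plus sunny-after-rainy.
    return p_sunny_today * 0.8 + (1.0 - p_sunny_today) * 0.6

p_d1 = 0.9
p_d2 = p_sunny_next(p_d1)  # 0.72 + 0.06 = 0.78
p_d3 = p_sunny_next(p_d2)  # 0.624 + 0.132 = 0.756
print(p_d2, p_d3)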


Example 2

P( C ) = 0.01

P( ¬C ) = ( 1.0 - 0.01 ) = 0.99


Example 2.1

P( + | C ) = 0.9

P( - | C ) = ( 1.0 - 0.9 ) = 0.1

P( + | ¬C ) = 0.2

P( - | ¬C ) = ( 1.0 - 0.2 ) = 0.8


Example 2.2

Joint Probabilities

P( +, C ) = ( 0.01 * 0.9 ) = 0.009

P( -, C ) = ( 0.01 * 0.1 ) = 0.001

P( +, ¬C ) = ( 0.99 * 0.2 ) = 0.198

P( -, ¬C ) = ( 0.99 * 0.8 ) = 0.792


Example 2.3

P( C | + ) = 0.009 / ( 0.009 + 0.198 ) = 0.009 / 0.207 ≈ 0.043

Elcric Otto Circle





Monday, October 17, 2011

INTELLIGENT AGENTS







REVISED: Wednesday, February 4, 2015



I.  AGENT

Instead of calling AI "non-human" intelligence, and having to deal with the negative connotation of "non-human," we will use the term intelligent "agent," and define an "intelligent agent" to be anything that has sensors to perceive its environment and actuators to act upon its environment based on its perceptions.

An AI program is called an intelligent agent.
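
As a minimal Python sketch of this sense/decide/act structure (the class and the environment's observe/apply methods are hypothetical, purely for illustration):

class Agent:
    """Anything with sensors to perceive its environment and
    actuators to act upon it based on those perceptions."""

    def sense(self, environment):
        # Read the (possibly partial) state of the environment via sensors.
        return environment.observe()

    def decide(self, percept):
        # Map the percept to an action; subclasses supply the intelligence.
        raise NotImplementedError

    def act(self, environment, action):
        # Apply the chosen action to the environment via actuators.
        environment.apply(action)

    def run(self, environment, steps):
        # The perception-action cycle.
        for _ in range(steps):
            self.act(environment, self.decide(self.sense(environment)))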

A. There are four key environment attributes an intelligent agent must cope with:
1. Partial observability
2. Stochasticity
3. Continuous spaces
4. Adversarial natures


II.  RATIONALITY

When we describe a human as rational we imply their actions are appropriate considering the circumstances occurring prior to their actions. We say they are behaving rationally and not inappropriately.

Accordingly, we will define a rational agent as an agent that does the right thing, considering the consequences of doing the wrong thing.

III.  ENVIRONMENT

By environment we mean the task environment.

Before we begin the design of a solution, an agent, we want a complete description of the problem, the task environment.

IV.  TERMINOLOGY

A. Fully versus partially observable.

During an agent's perception-action cycle:

Fully observable means the agent can see the entire state of the environment.

Partially observable means the agent can only see a fraction of the state of the environment.

B. Deterministic versus Stochastic.

Deterministic means the outcome is not random, it is predetermined.

Stochastic means the outcome is random, it is not predetermined.

C. Discrete versus Continuous.

Discrete means your choices are finite.

Continuous means your choices are infinite.

D. Benign versus Adversarial.

A benign environment is just business; it's not personal.

An adversarial environment is personal; it's out to get you.

V.  AI as Uncertainty Management

When you are uncertain about making a choice, AI helps you make the choice.

VI.  Reasons for Uncertainty

A. Sensor Limits.

B. Adversaries.

C. Stochastic Environment.

D. Laziness.

E. Ignorance.



VII.  Examples

A. Checkers
1. Fully Observable
2. Deterministic
3. Discrete
4. Adversarial

B. Poker
1. Partially Observable
2. Stochastic
3. Adversarial

C. Robot Car
1. Partially Observable
2. Stochastic
3. Continuous

Elcric Otto Circle





PROBLEM SOLVING







REVISED: Wednesday, February 4, 2015


I. What is a Problem?

A. An Initial State

B. A Function Actions

Actions takes a state as input and returns the set of possible actions the agent can execute when it is in that state.

C. A Function Result

Result takes as input a state and an action, and delivers as its output, a new state.

D. A Function Goal Test

Goal Test takes as input a state, and delivers as its output, a Boolean value, true or false, telling us this state is a goal or not.

E. A Path Cost Function

The Path Cost Function takes a path, a sequence of state/action transitions, and returns a number which is the cost of that path. The path cost function is additive: it is the sum of the costs of the individual steps. The step cost function takes a state, an action, and the resulting state of that action, and returns a number n which is the cost of that action.
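
These five components map directly onto code; a minimal sketch (class and method names are mine, modeled on the description above):

class Problem:
    """A search problem: initial state, actions, result, goal test, step cost."""

    def __init__(self, initial_state):
        self.initial_state = initial_state

    def actions(self, state):
        """Set of actions executable in this state."""
        raise NotImplementedError

    def result(self, state, action):
        """The new state reached by doing `action` in `state`."""
        raise NotImplementedError

    def goal_test(self, state):
        """True if `state` is a goal."""
        raise NotImplementedError

    def step_cost(self, state, action, result_state):
        """Cost of taking `action` in `state`; path cost is the sum of steps."""
        return 1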

II. Search Algorithms

A. Breadth-First

We always expand first the shortest paths.

Breadth-First will find the optimal shortest path.

Even if the tree is infinite, Breadth-First will find a goal placed at any finite level.

B. Cheapest-First

We always expand first the path with the lowest total cost.

Cheapest-First will find the optimal cheapest path.

Even if the tree is infinite, Cheapest-First will find a goal placed at any finite level.

C. Depth-First

Depth-First is the opposite of Breadth-First search: we always expand first the longest path.

Depth-First is not guaranteed to find the optimal shortest path.

If the tree is infinite, Depth-First may never find a goal, even one placed at a finite level.

D. Greedy Best-First

Our search is directed toward the goal: we always expand first the path whose end state is estimated to be closest to the goal.

E. A*

A* works by always expanding the path that has the minimum value of the function f, defined as the sum f = g + h. The function g(path) is the path cost. The function h(path) = h(s), where s is the final state of the path and h(s) is the estimated distance from s to the goal.

The result is the best possible search strategy. It finds the shortest length path while expanding the minimum number of paths possible.

It could be called "Best Estimated Total Path Cost First."

It finds the lowest cost path if h(s), the heuristic value of a state, is less than or equal to the true cost to the goal. In other words, we want h to never overestimate the distance to the goal: h is optimistic, and such an h is admissible, so we can use it to find the lowest cost path.
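
A compact A* sketch over the Problem class from section I (illustrative, not a production implementation; it assumes states are hashable and h is a heuristic function on states):

import heapq

def a_star(problem, h):
    """Always expand the frontier path with minimum f = g + h.
    Returns the list of actions to a goal, or None."""
    counter = 0  # unique tiebreaker so the heap never compares states
    start = problem.initial_state
    frontier = [(h(start), 0, counter, start, [])]  # (f, g, tiebreak, state, actions)
    explored = set()
    while frontier:
        f, g, _, state, path = heapq.heappop(frontier)
        if problem.goal_test(state):
            return path
        if state in explored:
            continue
        explored.add(state)
        for action in problem.actions(state):
            child = problem.result(state, action)
            g2 = g + problem.step_cost(state, action, child)
            counter += 1
            heapq.heappush(frontier, (g2 + h(child), g2, counter, child, path + [action]))
    return None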

III. Heuristics

h1 equals the number of misplaced blocks.

h2 equals the sum of the distances each block would have to move to get to the right position.

h2 is always greater than or equal to h1.  An A* search using h2 will never expand more paths than one using h1.

IV. Problem-Solving Technology

Problem-solving technology is guaranteed to work when the following set of conditions is true:

A. Observable Domain

The domain must be fully observable. We must be able to see what initial state we start out with.

B. Known Domain

The domain must be known. We have to know the set of available actions to us.

C. Discrete Domain

The domain must be discrete. There must be a finite number of actions to choose from.

D. Deterministic Domain

The domain must be deterministic. We have to know the result of taking an action.

E. Static Domain

The domain must be static. There must be nothing else in the world that can change the world except our own actions.

V. Linked List Node

Each node in the linked list is a data structure with four fields.

A. State Field

State field indicates the state at the end of the path.

B. Action Field

Action field indicates the action it took to get there.

C. Cost Field

Cost field indicates the total cost.

D. Parent Field

Parent field is a pointer to another node.
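
A node with these four fields, plus path reconstruction by following parent pointers back to the root (a minimal sketch; the class name is mine):

class Node:
    """A search-tree node: state, action taken, total path cost, parent pointer."""

    def __init__(self, state, action=None, cost=0, parent=None):
        self.state = state    # state at the end of the path
        self.action = action  # action taken to get here
        self.cost = cost      # total path cost so far
        self.parent = parent  # pointer to the previous node

    def path(self):
        """Walk parent pointers back to the root to recover the path of states."""
        node, states = self, []
        while node is not None:
            states.append(node.state)
            node = node.parent
        return list(reversed(states))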

Elcric Otto Circle







Sunday, October 16, 2011

INTRODUCTION







REVISED: Sunday, June 7, 2015




I. What is Artificial Intelligence (AI)?

When you are asked this question you are immediately confident you know the answer.  However, when you are asked to express your thoughts in words, you suddenly realize the words are not there.

The problem comes from the word "artificial."  Artificial could mean many things; however, in regards to AI, "non-human" springs to mind.

The question becomes, "What is 'non-human' intelligence?"  As a species, humans have historically and vehemently denied the existence of "non-human" intelligence.  We base our beliefs on how we perceive reality.  Our perceptions tend to be self-centered.  Many of us feel threatened when confronted with the possibility of "non-human" intelligence because "non-human" intelligence is inherently difficult for humans to grasp with our built-in mental faculties.

However, by observing nature we can define non-human intelligence as divide-and-conquer problem solving.  By watching animals we see them solve problems by breaking their problems down into incremental, sub-optimal solutions.  They try a solution; if it does not work, they try another, learning from each mistake until they solve the problem.  Their solution does not have to be a perfect solution; it just has to be a solution that works.  Animals in nature do this all the time, and it's not really any different from the way humans solve problems.  "Mother Nature" gave each species exactly the amount of "intelligence" it needed to survive.

Humans are changing the earth, our home, our environment.  We have changed the earth to such an extent that we need more "intelligence" than "Mother Nature" gave us to survive.  To make up for what we are lacking we have created AI.  We are using computer science, mathematics, electronics, logic, and any straw we can grasp to survive.  We have named this survival technique Artificial Intelligence because the problems we have created are beyond our flesh-and-blood human intelligence to solve.

AI is the discipline that deals with uncertainty and manages it in decision making.


II.  What is the State of the Art?

Firstly, the YASKAWA BUSHIDO PROJECT, which pitted an industrial robot against a sword master.

Secondly, NASA's Mars Exploration Rovers were in the news and spoke well of the state of the art of AI.

Thirdly, IBM's Deep Blue defeated the world chess champion Garry Kasparov.

These three examples show the state of the art of AI.

III.  SUMMARY

We are still learning how to survive, and AI will help us survive.

Elcric Otto Circle



