Connectionist learning of belief networks.

*(English)* Zbl 0761.68081

Summary: Connectionist learning procedures are presented for “sigmoid” and “noisy-OR” varieties of probabilistic belief networks. These networks have previously been seen primarily as a means of representing knowledge derived from experts. Here it is shown that the “Gibbs sampling” simulation procedure for such networks can support maximum-likelihood learning from empirical data through local gradient ascent. This learning procedure resembles that used for “Boltzmann machines”, and like it, allows the use of “hidden” variables to model correlations between visible variables. Due to the directed nature of the connections in a belief network, however, the “negative phase” of Boltzmann machine learning is unnecessary. Experimental results show that, as a result, learning in a sigmoid belief network can be faster than in a Boltzmann machine. These networks have other advantages over Boltzmann machines in pattern classification and decision making applications, are naturally applicable to unsupervised learning problems, and provide a link between work on connectionist learning and work on the representation of expert knowledge.
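The two unit types and the local gradient mentioned in the summary can be illustrated with a minimal Python sketch. It is not taken from the paper: the 0/1 state coding, the optional `leak` term, and all function names are assumptions chosen for illustration. For a sigmoid unit, the gradient of the log conditional probability with respect to each incoming weight depends only on the unit's own state, its predicted probability, and its parents' states. In learning with hidden units, such terms would presumably be averaged over Gibbs samples of the hidden states with the visible units clamped to the data.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def p_on_sigmoid(weights, bias, parents):
    """P(unit = 1 | parents) for a sigmoid unit (assumed 0/1 parent states)."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, parents)))


def p_on_noisy_or(q, parents, leak=0.0):
    """P(unit = 1 | parents) for a noisy-OR unit.

    Each active parent j independently turns the unit on with
    probability q[j]; `leak` is an assumed background probability.
    """
    p_off = 1.0 - leak
    for qj, s in zip(q, parents):
        if s:
            p_off *= 1.0 - qj
    return 1.0 - p_off


def sigmoid_log_lik_grad(weights, bias, parents, state):
    """Gradient of log P(state | parents) w.r.t. (weights, bias).

    Purely local: (state - predicted probability) times each parent state.
    """
    err = state - p_on_sigmoid(weights, bias, parents)
    return [err * s for s in parents], err
```

With all weights at zero, a sigmoid unit is on with probability 0.5, so observing it on with one active parent gives a gradient of 0.5 on that parent's weight and 0 on the weight of the inactive parent.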

##### Keywords:

gradient ascent learning; probabilistic knowledge; sigmoid and noisy-OR varieties; Boltzmann machines; probabilistic belief networks

##### Software:

AutoClass

##### References:

[1] | Ackley, D.H.; Hinton, G.E.; Sejnowski, T.J., A learning algorithm for Boltzmann machines, Cogn. sci., 9, 147-169, (1985) |

[2] | Bridle, J.S., Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition, () |

[3] | Cheeseman, P.; Kelly, J.; Self, M.; Stutz, J.; Taylor, W.; Freeman, D., Autoclass: a Bayesian classification system, () |

[4] | Dempster, A.P.; Laird, N.M.; Rubin, D.B., Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. roy. stat. soc. B, 39, 1-38, (1977) · Zbl 0364.62022 |

[5] | Derthick, M., Variations on the Boltzmann machine learning algorithm, () |

[6] | Gelfand, A.E.; Smith, A.F.M., Sampling-based approaches to calculating marginal densities, J. am. stat. assoc., 85, 398-409, (1990) · Zbl 0702.62020 |

[7] | Henrion, M., Towards efficient probabilistic diagnosis in multiply connected belief networks, () |

[8] | Hinton, G.E.; Sejnowski, T.J., Learning and relearning in Boltzmann machines, (), 282-317 |

[9] | Lauritzen, S.L.; Spiegelhalter, D.J., Local computations with probabilities on graphical structures and their application to expert systems (with discussion), J. roy. stat. soc. B, 50, 2, 157-224, (1988) · Zbl 0684.68106 |

[10] | Levinson, S.E.; Rabiner, L.R.; Sondhi, M.M., An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition, Bell syst. tech. J., 62, 4, (1983) · Zbl 0507.68058 |

[11] | Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E., Equation of state calculations by fast computing machines, J. chem. phys., 21, 6, 1087-1092, (1953) |

[12] | Neal, R.M., Learning stochastic feedforward networks, () |

[13] | Oliver, R.M.; Smith, J.Q., () |

[14] | Pearl, J., Evidential reasoning using stochastic simulation of causal models, Artif. intell., 32, 2, 245-257, (1987) · Zbl 0642.68177 |

[15] | Pearl, J., () |

[16] | Rumelhart, D.E.; Hinton, G.E.; Williams, R.J., Learning representations by back-propagating errors, Nature, 323, 533-536, (1986) · Zbl 1369.68284 |

[17] | Shachter, R.D., Probabilistic inference and influence diagrams, Oper. res., 36, 4, 589-604, (1988) · Zbl 0651.90043 |

[18] | Spiegelhalter, D.J.; Lauritzen, S.L., Sequential updating of conditional probabilities on directed graphical structures, Networks, 20, 579-605, (1990) · Zbl 0697.90045 |

[19] | Stornetta, W.S.; Huberman, B.A., An improved three-layer back propagation algorithm, () |

[20] | Stone, M., Cross-validatory choice and assessment of statistical predictions (with discussion), J. roy. stat. soc. B, 36, 111-147, (1974) · Zbl 0308.62063 |

[21] | Titterington, D.M.; Smith, A.F.M.; Makov, U.E., () |

[22] | Williams, C.K.I.; Hinton, G.E., Mean field networks that learn to discriminate temporally distorted strings, () |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.