Kolmogorov–Arnold Network

A Kolmogorov–Arnold Network (or KAN) is a neural network that is trained by learning the activation function on each edge, rather than the weight on each edge as in a traditional multilayer perceptron. This is a practical application of the Kolmogorov–Arnold representation theorem.

Description

By the Kolmogorov–Arnold representation theorem, any multivariate continuous function $f$ can be written as

$$f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right)$$

with suitable choices of the functions $\Phi_q$ and $\phi_{q,p}$. To create a deep network, the KAN extends this to a graph of such representations, in which layer $l$ maps its inputs $x_{l,i}$ to outputs

$$x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}),$$

where each $\phi_{l,j,i}$ is a learnable B-spline. Since this representation is differentiable, the spline coefficients can be learned by any standard backpropagation technique.
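
The sketch below illustrates this layer structure (a hypothetical NumPy implementation for exposition, not the authors' pykan code); for simplicity, each edge function is a piecewise-linear interpolant between learnable knot values, i.e. an order-1 B-spline rather than the higher-order splines used in practice:

```python
# Minimal sketch of a single KAN layer (an illustration, not the authors'
# pykan implementation). Assumption: splines are simplified to piecewise-
# linear interpolation between knot values (order-1 B-splines).
import numpy as np

class KANLayer:
    def __init__(self, n_in, n_out, grid_size=5, x_min=-1.0, x_max=1.0):
        self.grid = np.linspace(x_min, x_max, grid_size + 1)  # shared knots
        # coef[j, i, k]: learnable value of edge function phi_{j,i} at knot k
        self.coef = 0.1 * np.random.randn(n_out, n_in, grid_size + 1)

    def __call__(self, x):
        # x has shape (n_in,); inputs are clipped onto the spline's domain
        x = np.clip(x, self.grid[0], self.grid[-1])
        n_out, n_in, _ = self.coef.shape
        out = np.zeros(n_out)
        for j in range(n_out):
            for i in range(n_in):
                # evaluate phi_{j,i}(x_i) and sum over the incoming edges
                out[j] += np.interp(x[i], self.grid, self.coef[j, i])
        return out

# Stacking layers gives the deep network: y = Phi_1(Phi_0(x)).
layer0, layer1 = KANLayer(3, 5), KANLayer(5, 1)
y = layer1(layer0(np.array([0.2, -0.4, 0.7])))
```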

For regularization, the L1 norm of the activation functions is not sufficient by itself.[1] To keep the network sparse and prevent overfitting, an additional entropy term is introduced for each layer $\Phi_l$:

$$S(\Phi_l) = -\sum_{j,i} \frac{\lVert \phi_{l,j,i} \rVert_1}{\lVert \Phi_l \rVert_1} \log \frac{\lVert \phi_{l,j,i} \rVert_1}{\lVert \Phi_l \rVert_1}$$

A linear combination of these terms and the L1 norm over all layers produces an effective regularization penalty. This sparse representation helps the deep network to overcome the curse of dimensionality.[2]
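
A hedged sketch of that penalty follows, assuming each $\lVert \phi_{l,j,i} \rVert_1$ has already been approximated by the mean absolute activation of its edge over a training batch; the weights lam and mu on the two terms are hypothetical names:

```python
# Sketch of the sparsity regularizer described above (illustrative only).
# Each array in per_layer_act_l1 holds the per-edge L1 magnitudes
# |phi_{l,j,i}| for one layer, e.g. mean |phi(x)| over a batch.
import numpy as np

def layer_penalty(act_l1, eps=1e-12):
    total = act_l1.sum()                     # layer L1 norm |Phi_l|_1
    p = act_l1 / (total + eps)               # normalized edge magnitudes
    entropy = -(p * np.log(p + eps)).sum()   # entropy term S(Phi_l)
    return total, entropy

def kan_penalty(per_layer_act_l1, lam=1.0, mu=1.0):
    # linear combination of the L1 and entropy terms over all layers
    return sum(lam * l1 + mu * ent
               for l1, ent in map(layer_penalty, per_layer_act_l1))

# Example: a 2-layer network with random activation magnitudes.
penalty = kan_penalty([np.random.rand(5, 3), np.random.rand(1, 5)])
```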

Properties

The number of parameters in such a representation is $O(N^2 L G)$, where $N$ is the output dimension, $L$ is the depth of the network, and $G$ is the number of intervals over which each spline is defined. This might appear to be greater than the $O(N^2 L)$ parameters needed to train a multilayer perceptron of depth $L$ and output dimension $N$; however, Liu et al. argue that in scientific domains the KAN can achieve equivalent performance with fewer parameters, since many natural functions can be decomposed efficiently into splines.[1]
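
A back-of-the-envelope comparison (illustrative numbers, not figures from the paper) makes the scaling concrete:

```python
# Rough parameter counts for width-N, depth-L networks: a KAN stores about
# G spline coefficients per edge, while an MLP stores one weight per edge.
def kan_params(N, L, G):
    return N * N * L * G        # O(N^2 L G)

def mlp_params(N, L):
    return N * N * L            # O(N^2 L)

print(kan_params(N=64, L=3, G=5))  # 61440
print(mlp_params(N=64, L=3))       # 12288
```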

KANs have been shown to perform well on problems from knot theory and physics (such as Anderson localization), although they have not yet been scaled to language models.[1]

History

Computing the optimal Kolmogorov–Arnold representation for a given function has been a research challenge since at least 1993.[3] Early work on applying this representation to a network used a fixed depth of two and did not appear to be a promising alternative to perceptrons for image processing.[4][5]

More recently, a deep learning algorithm was proposed for constructing these representations using Urysohn operators.[6] In 2021, the representation was successfully applied to a deeper network.[7] In 2022, ExSpliNet brought together the Kolmogorov–Arnold representation with B-splines and probabilistic trees, showed good performance on the Iris flower, MNIST, and FMNIST datasets, and argued that this representation is more interpretable than perceptrons.[8]

The term "Kolmogorov–Arnold Network" was introduced by Liu et al. in 2024; their work generalizes these networks to arbitrary width and depth and demonstrates strong performance on a realistic class of multivariate functions, although training remains inefficient.[1]

References

  1. ^ a b c d Liu, Ziming; Wang, Yixuan; Vaidya, Sachin; Ruehle, Fabian; Halverson, James; Soljačić, Marin; Hou, Thomas Y.; Tegmark, Max (2024). "KAN: Kolmogorov-Arnold Networks". arXiv:2404.19756 [cs.LG].
  2. ^ Poggio, Tomaso (2022). "How deep sparse networks avoid the curse of dimensionality: Efficiently computable functions are compositionally sparse". Center for Brains, Minds and Machines Memo. 10.
  3. ^ Lin, Ji-Nan; Unbehauen, Rolf (January 1993). "On the Realization of a Kolmogorov Network". Neural Computation. 5 (1): 18–20. doi:10.1162/neco.1993.5.1.18.
  4. ^ Köppen, Mario (2002). "On the Training of a Kolmogorov Network". Artificial Neural Networks – ICANN 2002. Lecture Notes in Computer Science. Vol. 2415. pp. 474–479. doi:10.1007/3-540-46084-5_77. ISBN 978-3-540-44074-1.
  5. ^ Leni, Pierre-Emmanuel; Fougerolle, Yohan D.; Truchetet, Frédéric (2013). "The Kolmogorov Spline Network for Image Processing". Image Processing: Concepts, Methodologies, Tools, and Applications. IGI Global. ISBN 9781466639942.
  6. ^ Polar, Andrew; Poluektov, M. (30 December 2020). "A deep machine learning algorithm for construction of the Kolmogorov–Arnold representation". Engineering Applications of Artificial Intelligence. 99: 104137. arXiv:2001.04652. doi:10.1016/j.engappai.2020.104137.
  7. ^ Schmidt-Hieber, Johannes (May 2021). "The Kolmogorov–Arnold representation theorem revisited". Neural Networks. 137: 119–126. arXiv:2007.15884. doi:10.1016/j.neunet.2021.01.020. PMID 33592434.
  8. ^ Fakhoury, Daniele; Fakhoury, Emanuele; Speleers, Hendrik (August 2022). "ExSpliNet: An interpretable and expressive spline-based neural network". Neural Networks. 152: 332–346. arXiv:2205.01510. doi:10.1016/j.neunet.2022.04.029. PMID 35598402.