To install click the Add extension button. That's it.

The source code for the WIKI 2 extension is being checked by specialists of the Mozilla Foundation, Google, and Apple. You could also do it yourself at any point in time.

4,5
Kelly Slayton
Congratulations on this excellent venture… what a great idea!
Alexander Grigorievskiy
I use WIKI 2 every day and almost forgot how the original Wikipedia looks like.
Live Statistics
English Articles
Improved in 24 Hours
Added in 24 Hours
Languages
Recent
Show all languages
What we do. Every page goes through several hundred of perfecting techniques; in live mode. Quite the same Wikipedia. Just better.
.
Leo
Newton
Brights
Milds

Deterministic acyclic finite state automaton

From Wikipedia, the free encyclopedia

The strings "tap", "taps", "top", and "tops" stored in a trie (left) and a DAFSA (right), EOW stands for End-of-word.

In computer science, a deterministic acyclic finite state automaton (DAFSA),[1] also called a directed acyclic word graph (DAWG; though that name also refers to a related data structure that functions as a suffix index[2]) is a data structure that represents a set of strings, and allows for a query operation that tests whether a given string belongs to the set in time proportional to its length. Algorithms exist to construct and maintain such automata,[1] while keeping them minimal.

A DAFSA is a special case of a finite state recognizer that takes the form of a directed acyclic graph with a single source vertex (a vertex with no incoming edges), in which each edge of the graph is labeled by a letter or symbol, and in which each vertex has at most one outgoing edge for each possible letter or symbol. The strings represented by the DAFSA are formed by the symbols on paths in the graph from the source vertex to any sink vertex (a vertex with no outgoing edges). In fact, a deterministic finite state automaton is acyclic if and only if it recognizes a finite set of strings.[1]

YouTube Encyclopedic

  • 1/5
    Views:
    708 336
    43 818
    10 266
    95 725
    838
  • Deterministic Finite Automata (Example 1)
  • String Matching with Finite Automata
  • (ML 13.3) Directed graphical models - formalism (part 1)
  • Trie Data Structure (EXPLAINED)
  • Reactive Synthesis

Transcription

Comparison to tries

By allowing the same vertices to be reached by multiple paths, a DAFSA may use significantly fewer vertices than the strongly related trie data structure. Consider, for example, the four English words "tap", "taps", "top", and "tops". A trie for those four words would have 12 vertices, one for each of the strings formed as a prefix of one of these words, or for one of the words followed by the end-of-string marker. However, a DAFSA can represent these same four words using only six vertices vi for 0 ≤ i ≤ 5, and the following edges: an edge from v0 to v1 labeled "t", two edges from v1 to v2 labeled "a" and "o", an edge from v2 to v3 labeled "p", an edge v3 to v4 labeled "s", and edges from v3 and v4 to v5 labeled with the end-of-string marker. There is a tradeoff between memory and functionality, because a standard DAFSA can tell you if a word exists within it, but it cannot point you to auxiliary information about that word, whereas a trie can.

The primary difference between DAFSA and trie is the elimination of suffix and infix redundancy in storing strings. The trie eliminates prefix redundancy since all common prefixes are shared between strings, such as between doctors and doctorate the doctor prefix is shared. In a DAFSA common suffixes are also shared, for words that have the same set of possible suffixes as each other. For dictionary sets of common English words, this translates into major memory usage reduction.

Because the terminal nodes of a DAFSA can be reached by multiple paths, a DAFSA cannot directly store auxiliary information relating to each path, e.g. a word's frequency in the English language. However, if for each node we store the number of unique paths through that point in the structure, we can use it to retrieve the index of a word, or a word given its index.[3] The auxiliary information can then be stored in an array.

References

  1. ^ a b c Jan Daciuk, Stoyan Mihov, Bruce Watson and Richard Watson (2000). Incremental construction of minimal acyclic finite state automata. Computational Linguistics 26(1):3-16.
  2. ^ Public Domain This article incorporates public domain material from Paul E. Black. "directed acyclic word graph". Dictionary of Algorithms and Data Structures. NIST.
  3. ^ Kowaltowski, T.; CL Lucchesi (1993). "Applications of finite automata representing large vocabularies". Software-Practice and Experience. 1993: 15–30. CiteSeerX 10.1.1.56.5272.

External links

This page was last edited on 3 January 2024, at 01:02
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.