From Wikipedia, the free encyclopedia

spaCy
Original author(s): Matthew Honnibal
Developer(s): Explosion AI, various
Initial release: February 2015[1]
Stable release: 3.7.2[2] / 16 October 2023
Repository:
Written in: Python, Cython
Operating system: Linux, Windows, macOS, OS X
Platform: Cross-platform
Type: Natural language processing
License: MIT License
Website: spacy.io

spaCy (/speɪˈsiː/ spay-SEE) is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython.[3][4] The library is published under the MIT license and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

Unlike NLTK, which is widely used for teaching and research, spaCy focuses on providing software for production use.[5][6] spaCy also supports deep learning workflows: statistical models trained with popular machine learning libraries such as TensorFlow, PyTorch or MXNet can be connected through its own machine learning library, Thinc.[7][8] Using Thinc as its backend, spaCy features convolutional neural network models for part-of-speech tagging, dependency parsing, text categorization and named entity recognition (NER). Prebuilt statistical neural network models for these tasks are available for 23 languages, including English, Portuguese, Spanish, Russian and Chinese, and there is also a multi-language NER model. Tokenization support for more than 65 languages additionally allows users to train custom models on their own datasets.[9]
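
As a rough illustration of the tokenization support described above, the following minimal sketch builds a blank English pipeline, which contains only the tokenizer and needs no downloaded model. It assumes spaCy 3.x is installed; the example sentence and the variable names are chosen here for illustration and do not come from the spaCy documentation.

```python
import spacy

# A blank English pipeline: tokenizer only, no trained components,
# so nothing has to be downloaded beyond the library itself.
nlp = spacy.blank("en")

doc = nlp("spaCy is written in Python and Cython.")

# Each element of a Doc is a Token; .text gives its verbatim string.
tokens = [token.text for token in doc]
print(tokens)

# Lexical attributes such as is_punct work without a trained model.
words = [token.text for token in doc if not token.is_punct]
print(words)
```

Part-of-speech tagging, dependency parsing and NER additionally require a trained pipeline, which is installed separately and loaded with `spacy.load`; that step is not shown here.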

History

  • Version 1.0 was released on October 19, 2016, and included preliminary support for deep learning workflows by supporting custom processing pipelines.[10] It further included a rule matcher that supported entity annotations, and an officially documented training API.
  • Version 2.0 was released on November 7, 2017, and introduced convolutional neural network models for 7 different languages.[11] It also supported custom processing pipeline components and extension attributes, and featured a built-in trainable text classification component.
  • Version 3.0 was released on February 1, 2021, and introduced state-of-the-art transformer-based pipelines.[12] It also introduced a new configuration system and training workflow, as well as type hints and project templates. This version dropped support for Python 2.
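
The rule matcher, custom pipeline components and extension attributes mentioned in the release notes above can be combined roughly as follows. This is a sketch under assumptions rather than an excerpt from the spaCy documentation: it uses the spaCy 3.x API, and the component name `version_finder`, the extension attribute `versions` and the pattern key `VERSION` are hypothetical names invented for this example.

```python
import spacy
from spacy.language import Language
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")

# Rule matcher: the word "version" (any casing) followed by a number-like token.
matcher = Matcher(nlp.vocab)
matcher.add("VERSION", [[{"LOWER": "version"}, {"LIKE_NUM": True}]])

# Extension attribute (hypothetical name), reachable as doc._.versions.
Doc.set_extension("versions", default=None, force=True)

# Custom pipeline component (hypothetical name) that stores the matched spans.
@Language.component("version_finder")
def version_finder(doc):
    doc._.versions = [doc[start:end].text for _, start, end in matcher(doc)]
    return doc

nlp.add_pipe("version_finder")

doc = nlp("Version 3.0 dropped support for Python 2.")
print(doc._.versions)
```

Keeping the matcher inside a registered component means it runs automatically on every text passed to `nlp`, and the results travel with the `Doc` instead of living in a separate data structure.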

Main features

Extensions and visualizers

Dependency parse tree visualization generated with the displaCy visualizer

spaCy comes with several extensions and visualizations that are available as free, open-source libraries:

References

  1. "Introducing spaCy". explosion.ai. Retrieved 2016-12-18.
  2. "Release 3.7.2". 16 October 2023. Retrieved 20 October 2023.
  3. Choi et al. (2015). It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool.
  4. "Google's new artificial intelligence can't understand these sentences. Can you?". Washington Post. Retrieved 2016-12-18.
  5. "Facts & Figures - spaCy". spacy.io. Retrieved 2020-04-04.
  6. Bird, Steven; Klein, Ewan; Loper, Edward; Baldridge, Jason (2008). "Multidisciplinary instruction with the Natural Language Toolkit" (PDF). Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics, ACL: 62. doi:10.3115/1627306.1627317. ISBN 9781932432145. S2CID 16932735.
  7. "PyTorch, TensorFlow & MXNet". thinc.ai. Retrieved 2020-04-04.
  8. "explosion/thinc". GitHub. Retrieved 2016-12-30.
  9. "Models & Languages | spaCy Usage Documentation". spacy.io. Retrieved 2020-03-10.
  10. "explosion/spaCy". GitHub. Retrieved 2021-02-08.
  11. "explosion/spaCy". GitHub. Retrieved 2021-02-08.
  12. "explosion/spaCy". GitHub. Retrieved 2021-02-08.
  13. "Models & Languages - spaCy". spacy.io. Retrieved 2021-02-08.
  14. "Models & Languages | spaCy Usage Documentation". spacy.io. Retrieved 2021-02-08.
  15. "Benchmarks | spaCy Usage Documentation". spacy.io. Retrieved 2021-02-08.
  16. Trask et al. (2015). sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings.

This page was last edited on 18 November 2023, at 22:02
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.