To install click the Add extension button. That's it.

The source code for the WIKI 2 extension is being checked by specialists of the Mozilla Foundation, Google, and Apple. You could also do it yourself at any point in time.

4,5
Kelly Slayton
Congratulations on this excellent venture… what a great idea!
Alexander Grigorievskiy
I use WIKI 2 every day and almost forgot how the original Wikipedia looks like.
Live Statistics
English Articles
Improved in 24 Hours
Added in 24 Hours
Languages
Recent
Show all languages
What we do. Every page goes through several hundred of perfecting techniques; in live mode. Quite the same Wikipedia. Just better.
.
Leo
Newton
Brights
Milds

BLOOM (language model)

From Wikipedia, the free encyclopedia

BigScience Large Open-science Open-access Multilingual Language Model (BLOOM)[1][2] is a 176-billion-parameter transformer-based autoregressive large language model (LLM). The model, as well as the code base and the data used to train it, are distributed under free licences.[3] BLOOM was trained on approximately 366 billion (1.6TB) tokens from March to July 2022.[4][5]

BLOOM is the main outcome of the BigScience collaborative initiative,[6] a one-year-long research workshop that took place between May 2021 and May 2022. BigScience was led by HuggingFace and involved several hundreds of researchers and engineers from France and abroad representing both the academia and the private sector. BigScience was supported by a large-scale public compute grant on the French public supercomputer Jean Zay, managed by GENCI and IDRIS (CNRS), on which it was trained.

BLOOM's training corpus, named ROOTS, combines data extracted from the then-latest version of the web-based OSCAR corpus (38% of ROOTS) and newly collected data extracted from a manually selected and documented list of language data sources. It encompasses 46 natural languages (in amounts ranging from 30% of the whole dataset for English to 0.00002% for Chi Tumbuka) and 13 programming languages.[7]

YouTube Encyclopedic

  • 1/3
    Views:
    12 609
    17 338
    2 244
  • Checking out the WORLD's LARGEST Open Multilingual Language Model - BLOOM
  • First look - BLOOM by BigScience (176B) - Launched Jul/2022 - Part 1 (multilingual, open source)
  • BigScience BLOOM Language Model: Live DEMO w/ LLM on Policy, Tech, Science and Economy.

Transcription

References

  1. ^ "BigScience Large Open-science Open-access Multilingual Language Model". Retrieved 2022-10-01.
  2. ^ Le Scao T, Fan A, Akiki C, Pavlick E, Ilić S, Hesslow D, Castagné R, Luccioni A, Yvon F, Gallé M, Tow J, Rush AM, Biderman S, Webson A, Sasanka Ammanamanchi P, Wang T, Sagot B, Muennighoff N, Villanova del Moral A, Ruwase O, Bawden R, Bekman S, McMillan-Major A, Beltagy I, Nguyen H, Saulnier L, Tan S, Ortiz Suarez P, Sanh V, Laurençon H, Jernite Y, Launay J, Mitchell M, Raffel C, et al. (2022). "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model". arXiv:2211.05100.
  3. ^ "The BigScience RAIL license". Retrieved 2024-01-10.
  4. ^ Heikkilä, Melissa (2022-07-12). "BLOOM: Inside the radical new project to democratize AI". MIT Technology Review. Retrieved 2023-12-26.
  5. ^ "Release of largest trained open-science multilingual language model ever". French National Centre for Scientific Research. 2022-07-12. Retrieved 2023-12-26.
  6. ^ "BigScience". Retrieved 2024-01-10.
  7. ^ Laurençon H, Saulnier L, Wang T, Akiki C, Villanova del Moral A, Le Scao T, Von Werra L, Mou C, González Ponferrada C, Nguyen H, Frohberg J, Šaško M, Lhoest Q, McMillan-Major A, Dupont G, Biderman S, Rogers A, Ben allal L, De Toni F, Pistilli G, Nguyen O, Nikpoor S, Masoud M, Colombo P, de la Rosa J, Villegas P, Thrush T, Longpre S, Nagel S, Weber L, Muñoz M, Zhu J, Van Strien D, Alyafeai Z, Almubarak K, Vu MC, Gonzalez-Dios I, Soroa A, Lo K, Dey M, Ortiz Suarez P, Gokaslan A, Bose S, Adelani D, Phan L, Tran H, Yu I, Pai S, Chim J, Lepercq V, Ilic S, Mitchell M, Luccioni S, Jernite Y (2022). "The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset". arXiv:2303.03915.


This page was last edited on 17 February 2024, at 23:03
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.