Galaxy (computational biology)

From Wikipedia, the free encyclopedia

Galaxy[2] is a scientific workflow, data integration,[3][4] and data and analysis persistence and publishing platform that aims to make computational biology accessible to research scientists who do not have computer programming or systems administration experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.[5]

Functionality

Galaxy is a scientific workflow system. These systems provide a means to build multi-step computational analyses akin to a recipe. They typically provide a graphical user interface[6] for specifying what data to operate on, what steps to take, and what order to do them in.
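
As a rough illustration of the recipe idea (a sketch, not Galaxy's actual implementation), a workflow can be modeled as named steps with declared dependencies, from which the execution order is derived. The step names, functions, and sample data below are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each receives the results so far and returns a new value.
def upload(results):
    return ["chr1\t10\t50", "chr1\t40\t90", "chr2\t5\t20"]

def filter_chr1(results):
    return [line for line in results["upload"] if line.startswith("chr1\t")]

def count_lines(results):
    return len(results["filter_chr1"])

# The "recipe": which steps exist, and which steps each one depends on.
STEPS = {"upload": upload, "filter_chr1": filter_chr1, "count_lines": count_lines}
DEPENDS_ON = {
    "upload": set(),
    "filter_chr1": {"upload"},
    "count_lines": {"filter_chr1"},
}

def run_workflow(steps, depends_on):
    """Run every step once, in an order that respects the dependencies."""
    results = {}
    for name in TopologicalSorter(depends_on).static_order():
        results[name] = steps[name](results)
    return results

results = run_workflow(STEPS, DEPENDS_ON)
print(results["count_lines"])
```

Declaring dependencies rather than a fixed sequence means the order is computed from the recipe itself, which is roughly what a graphical workflow editor does when steps are wired together.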

Galaxy is also a data integration platform for biological data. It supports data uploads from the user's computer, by URL, and directly from many online resources (such as the UCSC Genome Browser, BioMart and InterMine). Galaxy supports a range of widely used biological data formats, and translation between those formats. Galaxy provides a web interface to many text manipulation utilities, enabling researchers to do their own custom reformatting and manipulation without having to do any programming. Galaxy includes interval manipulation utilities for doing set theoretic operations (e.g. intersection, union, ...) on intervals. Many biological file formats include genomic interval data (a frame of reference, e.g., chromosome or contig name, and start and stop positions), allowing these data to be integrated.
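
To make the set-theoretic interval operations concrete, here is a minimal sketch of union and intersection on genomic intervals. It assumes each interval is a (chromosome, start, end) tuple with a half-open end coordinate; the function names and sample data are illustrative and are not taken from Galaxy's own tools.

```python
from collections import defaultdict

def union(intervals):
    """Merge overlapping or touching intervals on each chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    merged = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start <= current_end:
                # Overlaps or touches the current run: extend it.
                current_end = max(current_end, end)
            else:
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged

def intersection(a, b):
    """Return the regions covered by both interval sets."""
    b_by_chrom = defaultdict(list)
    for chrom, start, end in union(b):
        b_by_chrom[chrom].append((start, end))
    overlaps = []
    for chrom, start, end in union(a):
        for b_start, b_end in b_by_chrom[chrom]:
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:  # non-empty overlap
                overlaps.append((chrom, lo, hi))
    return overlaps

genes = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
peaks = [("chr1", 180, 250), ("chr2", 90, 120)]

print(union(genes))
print(intersection(genes, peaks))
```

The intersection here compares every pair of merged intervals on a chromosome, which is simple but quadratic; a sweep over sorted intervals would scale better for large datasets.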

Galaxy was originally written for biological data analysis, particularly genomics. The set of available tools has been greatly expanded over the years and Galaxy is now also used for gene expression, genome assembly, proteomics, epigenomics, transcriptomics and a host of other disciplines in the life sciences. The platform itself is domain agnostic and can be applied, in theory, to any scientific domain. For example, Galaxy servers exist for image analysis,[7] computational chemistry[8] and drug design,[9] cosmology, climate modeling, social science,[10] and linguistics.

Finally, Galaxy also supports data and analysis persistence and publishing. See Reproducibility and Transparency below.

Project Goals

Galaxy is "an open, web-based platform for performing accessible, reproducible, and transparent genomic science."[11]

Accessibility

Computational biology is a specialized domain that often requires knowledge of computer programming. Galaxy aims to give biomedical researchers access to computational biology without also requiring them to understand computer programming.[12][13] Galaxy does this by stressing a simple user interface[14] over the ability to build complex workflows. This design choice makes it relatively easy to build typical analyses, but more difficult to build complex workflows that include, for example, looping constructs. (See Apache Taverna for an example of a data-driven workflow system that supports looping.[15])

Reproducibility

Reproducibility is a key goal of science: when scientific results are published, the publications should include enough information for others to repeat the experiment and get the same results. There have been many recent efforts to extend this goal from the bench (the "wet lab") to computational experiments (the "dry lab") as well. This has proved to be a more difficult task than initially expected.[16]

Galaxy supports reproducibility by capturing sufficient information about every step in a computational analysis, so that the analysis can be repeated, exactly, at any point in the future. This includes keeping track of all input, intermediate, and final datasets, as well as the parameters provided to each step and the order in which the steps were run.
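
The kind of bookkeeping described above can be sketched as follows. This is an illustrative model only, not Galaxy's internal design: the record layout, the `run_step` and `replay` helpers, and the toy tools are all assumptions, and the sketch stores content hashes where Galaxy keeps the datasets themselves.

```python
import hashlib
import json

def fingerprint(data):
    """Content hash of a dataset (here: any JSON-serializable value)."""
    encoded = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def run_step(record, name, func, inputs, params):
    """Run one step and append what is needed to repeat it to the record."""
    output = func(inputs, **params)
    record.append({
        "step": name,
        "order": len(record) + 1,
        "params": params,
        "input_hash": fingerprint(inputs),
        "output_hash": fingerprint(output),
    })
    return output

# Hypothetical tools standing in for real analysis steps.
def keep_longer_than(seqs, min_length):
    return [s for s in seqs if len(s) > min_length]

def to_upper(seqs):
    return [s.upper() for s in seqs]

TOOLS = {"keep_longer_than": keep_longer_than, "to_upper": to_upper}

def replay(record, tools, inputs):
    """Repeat a recorded analysis and check that each output matches."""
    data = inputs
    for entry in sorted(record, key=lambda e: e["order"]):
        data = tools[entry["step"]](data, **entry["params"])
        if fingerprint(data) != entry["output_hash"]:
            raise ValueError(f"step {entry['step']} produced a different result")
    return data

record = []
raw = ["acgt", "ac", "ttgaca"]
filtered = run_step(record, "keep_longer_than", keep_longer_than, raw, {"min_length": 3})
final = run_step(record, "to_upper", to_upper, filtered, {})

print(replay(record, TOOLS, raw) == final)
```

Recording the parameters and step order alongside hashes of the inputs and outputs is what lets a later rerun be checked, not just repeated.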

Transparency

Galaxy supports transparency in scientific research by enabling researchers to share any of their Galaxy Objects either publicly or with specific individuals. Shared items can be examined in detail, rerun at will, and copied and modified to test hypotheses.

Galaxy Objects: Histories, Workflows, Datasets and Pages

Galaxy objects are anything that can be saved, persisted, and shared in Galaxy:

Histories
Histories are computational analyses (recipes) run with specified input datasets, computational steps and parameters. Histories include all intermediate and output datasets as well.
Workflows
Workflows are computational analyses that specify all the steps (and parameters) in the analysis, but none of the data. Workflows are used to run the same analysis against multiple sets of input data.
Datasets
Datasets include any input, intermediate, or output dataset used or produced in an analysis.
Pages
Histories, workflows and datasets can include user-provided annotation. Galaxy Pages enables the creation of a virtual paper that describes the how and why of the overall experiment. Tight integration of Pages with Histories, Workflows, and Datasets supports this goal.
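
The distinction between histories and workflows can be sketched in a few lines: a workflow holds steps and parameters but no data, while each run against a particular input yields a history containing every dataset produced. The class names mirror the terms above, but the structure and the sample steps are assumptions made for illustration, not Galaxy's data model.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Workflow:
    """Steps and parameters only; no data."""
    steps: tuple  # pairs of (function, params dict)

@dataclass
class History:
    """One run of an analysis: the input plus every dataset produced."""
    datasets: list = field(default_factory=list)

def run(workflow, input_dataset):
    history = History(datasets=[input_dataset])
    current = input_dataset
    for func, params in workflow.steps:
        current = func(current, **params)
        history.datasets.append(current)  # intermediate datasets are kept too
    return history

# Hypothetical analysis steps.
def drop_short(reads, min_length):
    return [r for r in reads if len(r) >= min_length]

def count(reads):
    return len(reads)

workflow = Workflow(steps=((drop_short, {"min_length": 4}), (count, {})))

# The same workflow applied to two different inputs gives two histories.
history_a = run(workflow, ["ACGT", "AC", "GGCCTT"])
history_b = run(workflow, ["A", "TT"])
print(history_a.datasets[-1], history_b.datasets[-1])
```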

Availability

Galaxy is available:

  1. As a free public web server,[17] supported by the Galaxy Project.[18] This server includes many bioinformatics tools that are widely useful in many areas of genomics research. Users can create logins, and save histories, workflows, and datasets on the server. These saved items can also be shared with others.
  2. As open-source software that can be downloaded, installed and customized to address specific needs.[19] Galaxy can be installed locally or using a computing cloud.[20]
  3. As public web servers hosted by other organizations.[21] Several organizations with their own Galaxy installation have also opted to make those servers available to others.
  4. As part of the GenomeSpace initiative.

Implementation

Galaxy is open-source software implemented using the Python programming language. It is developed by the Galaxy team[22] at Penn State, Johns Hopkins University, Oregon Health & Science University, and the Galaxy Community.[23]

Galaxy is extensible, as new command line tools can be integrated and shared within the Galaxy ToolShed.[24]

An example of extending Galaxy is Galaxy-P from the University of Minnesota Supercomputing Institute, which is customized as a data analysis platform for mass spectrometry-based proteomics.[25]

Community

Galaxy is an open source project and the community includes users, organizations that install their own instance, Galaxy developers, and bioinformatics tool developers. The Galaxy project has mailing lists,[26] a community hub,[27] and annual meetings.[28]

References

  1. ^ https://galaxyproject.org/admin/license/
  2. ^ Afgan, E.; Baker, D.; van den Beek, M.; Blankenberg, D.; Bouvier, D.; Čech, M.; Chilton, J.; Clements, D.; Coraor, N.; Eberhard, C.; Grüning, B.; Guerler, A.; Hillman-Jackson, J.; Von Kuster, G.; Rasche, E.; Soranzo, N.; Turaga, N.; Taylor, J.; Nekrutenko, A.; Goecks, J. (8 July 2016). "The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update". Nucleic Acids Research. 44 (W1): W3–W10. doi:10.1093/nar/gkw343. PMC 4987906. PMID 27137889.
  3. ^ Blankenberg, D.; Coraor, N.; Von Kuster, G.; Taylor, J.; Nekrutenko, A.; Galaxy, T. (2011). "Integrating diverse databases into an unified analysis framework: A Galaxy approach". Database. 2011: bar011. doi:10.1093/database/bar011. PMC 3092608. PMID 21531983.
  4. ^ Blankenberg, D.; Gordon, A.; Von Kuster, G.; Coraor, N.; Taylor, J.; Nekrutenko, A.; Galaxy, T. (2010). "Manipulation of FASTQ data with Galaxy". Bioinformatics. 26 (14): 1783–1785. doi:10.1093/bioinformatics/btq281. PMC 2894519. PMID 20562416.
  5. ^ https://galaxyproject.org/public-galaxy-servers
  6. ^ Schatz, M. C. (2010). "The missing graphical user interface for genomics". Genome Biology. 11 (8): 128–201. doi:10.1186/gb-2010-11-8-128. PMC 2945776. PMID 20804568.
  7. ^ http://cloudimaging.net.au/
  8. ^ Hildebrandt, A. K.; Stöckel, D.; Fischer, N. M.; de la Garza, L.; Krüger, J.; Nickels, S.; Röttig, M.; Schärfe, C.; Schumann, M.; Thiel, P.; Lenhof, H. P.; Kohlbacher, O.; Hildebrandt, A. (2014). "Ballaxy: Web services for structural bioinformatics". Bioinformatics. 31: 121–2. doi:10.1093/bioinformatics/btu574. PMID 25183489.
  9. ^ http://osddlinux.osdd.net:8001/
  10. ^ http://socscicompute.ss.uci.edu/
  11. ^ Goecks, J.; Nekrutenko, A.; Taylor, J.; Galaxy Team, T. (2010). "Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences". Genome Biology. 11 (8): R86. doi:10.1186/gb-2010-11-8-r86. PMC 2945788. PMID 20738864.
  12. ^ Blankenberg, D.; Taylor, J.; Nekrutenko, A.; The Galaxy, T. (2011). "Making whole genome multiple alignments usable for biologists". Bioinformatics. 27 (17): 2426–8. doi:10.1093/bioinformatics/btr398. PMC 3157923. PMID 21775304.
  13. ^ Blankenberg, D.; Taylor, J.; Schenck, I.; He, J.; Zhang, Y.; Ghent, M.; Veeraraghavan, N.; Albert, I.; Miller, W.; Makova, K. D.; Hardison, R. C.; Nekrutenko, A. (2007). "A framework for collaborative analysis of ENCODE data: Making large-scale analyses biologist-friendly". Genome Research. 17 (6): 960–964. doi:10.1101/gr.5578007. PMC 1891355. PMID 17568012.
  14. ^ Schatz, M. C. (2010). "The missing graphical user interface for genomics". Genome Biology. 11 (8): 128–201. doi:10.1186/gb-2010-11-8-128. PMC 2945776. PMID 20804568.
  15. ^ Soiland-Reyes, S (2010-12-13). "Looping". The Taverna Knowledge Blog. knowledgeblog.org. Retrieved 28 January 2015. 
  16. ^ Ioannidis, J. P. A.; Allison, D. B.; Ball, C. A.; Coulibaly, I.; Cui, X.; Culhane, A. N. C.; Falchi, M.; Furlanello, C.; Game, L.; Jurman, G.; Mangion, J.; Mehta, T.; Nitzberg, M.; Page, G. P.; Petretto, E.; Van Noort, V. (2008). "Repeatability of published microarray gene expression analyses". Nature Genetics. 41 (2): 149–155. doi:10.1038/ng.295. PMID 19174838. 
  17. ^ https://usegalaxy.org/
  18. ^ http://galaxyproject.org/
  19. ^ http://getgalaxy.org/
  20. ^ Afgan, E.; Baker, D.; Coraor, N.; Chapman, B.; Nekrutenko, A.; Taylor, J. (2010). "Galaxy CloudMan: Delivering cloud compute clusters". BMC Bioinformatics. 11: S4. doi:10.1186/1471-2105-11-S12-S4. PMC 3040530. PMID 21210983.
  21. ^ https://galaxyproject.org/public-galaxy-servers
  22. ^ https://galaxyproject.org/galaxy-team
  23. ^ Lazarus, R.; Taylor, J.; Qiu, W.; Nekrutenko, A. (2008). "Toward the commoditization of translational genomic research: Design and implementation features of the Galaxy genomic workbench". Summit on translational bioinformatics. 2008: 56–60. PMC 3041519. PMID 21347127.
  24. ^ Blankenberg, Daniel; Von Kuster, Gregory; Bouvier, Emil; Baker, Dannon; Afgan, Enis; Stoler, Nicholas; Taylor, James; Nekrutenko, Anton (2014). "Dissemination of scientific software with Galaxy ToolShed". Genome Biology. 15 (2): 403. doi:10.1186/gb4161. PMC 4038738. PMID 25001293.
  25. ^ Sheynkman, GM; Johnson, JE; Jagtap, PD; Shortreed, MR; Onsongo, G; Frey, BL; Griffin, TJ; Smith, LM (22 August 2014). "Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations". BMC Genomics. 15 (703). doi:10.1186/1471-2164-15-703. PMC 4158061. PMID 25149441.
  26. ^ https://galaxyproject.org/mailing-lists
  27. ^ https://galaxyproject.org/
  28. ^ https://galaxyproject.org/events
This page was last edited on 30 April 2017, at 08:56.
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.