Mass spectrometry data format

Mass spectrometry is an analytical technique that measures the mass-to-charge ratio of ions. It is often coupled to chromatographic techniques such as gas or liquid chromatography, and it is widely used in analytical chemistry and biochemistry to identify and characterize small molecules and proteins (proteomics). A typical mass spectrometry experiment produces a large volume of data, so computers are required for data storage and processing. Over the years, manufacturers of mass spectrometers have developed various proprietary data formats for handling such data, which makes it difficult for academic scientists to manipulate their data directly. To address this limitation, several open, XML-based data formats have been developed by groups such as the Trans-Proteomic Pipeline at the Institute for Systems Biology and the Human Proteome Organization to facilitate data manipulation and innovation in the public sector. These data formats, together with older open formats and common proprietary formats, are described here.

Open formats

JCAMP-DX

This format was one of the earliest attempts to supply a standardized file format for data exchange in mass spectrometry. JCAMP-DX was initially developed for infrared spectrometry. It is an ASCII-based format and therefore not very compact, even though it includes standards for file compression. JCAMP-DX was officially released in 1988.[1] It has proved impractical for today's large MS data sets, but it is still used for exchanging moderate numbers of spectra. IUPAC is currently in charge of the format, and the latest protocol dates from 2005.[2]
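Because JCAMP-DX is plain ASCII, a file can be read with ordinary string handling. The following Python sketch is illustrative only: it assumes the common layout in which each labeled data record starts with `##LABEL=value`, inline comments start with `$$`, and a peak table lists comma-separated x,y pairs. The sample text, the function names, and the simplified peak-table handling are hypothetical and do not cover the compressed data forms defined by the standard.

```python
def parse_jcamp_ldrs(text):
    """Collect JCAMP-DX labeled data records (##LABEL=value) into a dict."""
    records = {}
    label = None
    for line in text.splitlines():
        # Drop inline comments, which start with "$$".
        line = line.split("$$", 1)[0].rstrip()
        if line.startswith("##"):
            label, _, value = line[2:].partition("=")
            label = label.strip().upper()
            records[label] = value.strip()
        elif label is not None and line.strip():
            # Continuation lines (for example peak data) belong to the last label.
            records[label] += "\n" + line.strip()
    return records


def parse_peak_table(block):
    """Turn a simple '(XY..XY)' peak table body into (m/z, intensity) pairs."""
    peaks = []
    for line in block.splitlines()[1:]:  # the first line holds the form, e.g. (XY..XY)
        for pair in line.replace(";", " ").split():
            x, y = pair.split(",")
            peaks.append((float(x), float(y)))
    return peaks


# Hypothetical sample data, not taken from a real instrument.
SAMPLE = """\
##TITLE=Hypothetical example spectrum
##JCAMP-DX=5.00
##DATA TYPE=MASS SPECTRUM
##NPOINTS=3
##PEAK TABLE=(XY..XY)
41,25 43,100 58,40
##END=
"""

records = parse_jcamp_ldrs(SAMPLE)
peaks = parse_peak_table(records["PEAK TABLE"])
print(records["DATA TYPE"], peaks)
```

Every number is stored as text, which illustrates why the format is readable but not compact.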

ANDI-MS or netCDF

The ANalytical Data Interchange format for Mass Spectrometry (ANDI-MS) is a format for exchanging data. Many mass spectrometry software packages can read or write ANDI files. ANDI is specified in the ASTM E1947 standard. It is based on netCDF, a software library for writing and reading data files. ANDI was initially developed for chromatography-MS data and was therefore not adopted during the proteomics gold rush, when new XML-based formats were developed instead.

mzXML

mzXML is an XML (eXtensible Markup Language) based common file format for proteomics mass spectrometric data.[3][4] Most mass spectrometers do not directly produce mzXML data, but several tools are available that generate mzXML files from native acquisition files.

mzData

The Human Proteome Organization (HUPO) has developed a common file format called mzData which offers similar functionality to mzXML.[5]

mzML

Having two competing standard formats for proteomics data is undesirable. The developers of mzData and mzXML therefore created a joint format called mzML.[5][6][7] mzML 1.0.0 was ready as of 2008-06-01 and was officially released at the 2008 American Society for Mass Spectrometry meeting.[8]

On 2009-06-01, mzML 1.1.0 was released.[9] As of mid-2010, no further changes were planned.
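In mzML, the peak arrays of a spectrum are stored as base64-encoded binary inside the XML, optionally zlib-compressed, with `cvParam` elements describing the precision and compression. The Python sketch below shows how such an array might be decoded using only the standard library. The XML fragment is a simplified, hypothetical stand-in: real files declare an XML namespace and carry full controlled-vocabulary attributes, both of which are omitted here, and the helper names are illustrative.

```python
import base64
import struct
import zlib
import xml.etree.ElementTree as ET


def decode_binary_array(b64_text, bits=64, compressed=False):
    """Decode a base64 payload of little-endian floats into a Python list."""
    raw = base64.b64decode(b64_text)
    if compressed:
        raw = zlib.decompress(raw)
    code = "d" if bits == 64 else "f"
    count = len(raw) // (bits // 8)
    return list(struct.unpack("<%d%s" % (count, code), raw))


def make_fragment(values):
    """Build a simplified binaryDataArray fragment (no namespace, names only)."""
    packed = struct.pack("<%dd" % len(values), *values)
    payload = base64.b64encode(packed).decode("ascii")
    return (
        "<binaryDataArray>"
        '<cvParam name="64-bit float"/>'
        '<cvParam name="no compression"/>'
        '<cvParam name="m/z array"/>'
        "<binary>%s</binary>"
        "</binaryDataArray>" % payload
    )


def read_array(fragment):
    """Read precision and compression from the cvParams, then decode."""
    node = ET.fromstring(fragment)
    names = {param.get("name") for param in node.iter("cvParam")}
    bits = 32 if "32-bit float" in names else 64
    compressed = "zlib compression" in names
    return decode_binary_array(node.findtext("binary"), bits, compressed)


mz_values = read_array(make_fragment([100.0, 200.5, 300.25]))
print(mz_values)
```

For real files, an established reader such as those provided by ProteoWizard or TOPP is a safer choice than hand-written parsing.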

Proprietary formats

The following file extensions are used by proprietary formats:

  • .BAF : Bruker instrument data format
  • .D : Agilent QTOF instrument data format
  • .FID : Bruker instrument data format
  • .PKL : MassLynx associated format
  • .RAW :
    • Thermo Xcalibur file format
    • Micromass MassLynx directory format
    • PerkinElmer TurboMass file format
  • .WIFF : ABI/Sciex (QSTAR and QTRAP instrument) file format
  • .YEP : Bruker instrument data format
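The list above shows that a file extension alone does not always identify the vendor: .RAW is shared by three different formats. A small Python lookup, sketched here with the entries from the list, makes that ambiguity explicit. The table and function names are hypothetical, and a real tool would need to inspect the file or directory contents to choose among the candidates.

```python
import os

# Extension table copied from the list above; keys are lowercased.
VENDOR_FORMATS = {
    ".baf": ["Bruker instrument data format"],
    ".d": ["Agilent QTOF instrument data format"],
    ".fid": ["Bruker instrument data format"],
    ".pkl": ["MassLynx associated format"],
    ".raw": [
        "Thermo Xcalibur file format",
        "Micromass MassLynx directory format",
        "PerkinElmer TurboMass file format",
    ],
    ".wiff": ["ABI/Sciex (QSTAR and QTRAP instrument) file format"],
    ".yep": ["Bruker instrument data format"],
}


def candidate_formats(path):
    """Return every proprietary format that matches the path's extension."""
    # Strip trailing separators so directory-style names also work.
    ext = os.path.splitext(path.rstrip("/\\"))[1].lower()
    return VENDOR_FORMATS.get(ext, [])


for name in ("sample.RAW", "run01.d/", "spectrum.mzML"):
    print(name, candidate_formats(name))
```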

Software

Viewers

There are several viewers for mzXML and mzData: PEAKS,[10] Insilicos,[11] MS-Spectre,[12] TOPPView,[13] Spectra Viewer,[14] SeeMS,[15] msInspect,[16] and Mascot Distiller.[17] There is one viewer for mzML: jmzML.[18]

Converters

Known converters between mzData and mzXML:

  • Hermes: a Java converter between mzData, mzXML and mzML in all directions; publicly available, with a graphical user interface, from the Institute of Molecular Systems Biology, ETH Zurich[19][20]
  • FileConverter: a command line tool that converts to and from various mass spectrometry formats,[21] part of TOPP[22]

Known converters for mzXML:

  • The Institute for Systems Biology maintains a list of converters.[23]

Known converters for mzML:

  • msConvert: a command line tool that converts to and from various mass spectrometry formats. The reference implementation of mzML has been provided by the ProteoWizard project.[24]
  • ReAdW:[25] the Institute for Systems Biology command line converter for Thermo RAW files, part of the TransProteomicPipeline[26]
  • FileConverter: a command line tool that converts to and from various mass spectrometry formats,[27] part of TOPP[28]

Converters for proprietary formats:

  • CompassXport: Bruker's free tool that generates mzXML (and now mzData)[citation needed] files from many of their native file formats (.baf)
  • MASSTransit: software for converting data between proprietary formats, by Palisade Corporation, distributed by Scientific Instrument Services, Inc.[29] and PerkinElmer[30]
  • Sashimi: an open source project[31] offering a collection of converter programs for some common mass spectrometric file formats. Currently available converters are:
    • MassWolf, for Micromass MassLynx .Raw format
    • mzStar, for SCIEX/ABI Analyst format
    • ReAdW, for ThermoFinnigan Xcalibur format
    • Wiff2dta, for converting SCIEX/ABI Analyst format to mzXML, DTA, MGF and PMF

Compressors

mzSquash: Command line utilities and Java API to compress and uncompress mzML files.[32]

References

  1. ^ R.S. McDonald and P.A. Wilks; "JCAMP-DX: A Standard Form for Exchange of Infrared Spectra in Computer-Readable Form"; Applied Spectroscopy, Vol. 42, No. 1, January 1988, pp 151-162.
  2. ^ JCAMP-DX V.6.00 for CHROMATOGRAPHY and MASS SPECTROMETRY HYPHENATED METHODS (IUPAC Technical Note 2005); J. Hau, P. Lampen, R.J. Lancashire, R.S. McDonald, P.S. McIntyre, D.N. Rutledge, W. Schrader, A.N. Davies
  3. ^ Pedrioli PG, Eng JK, Hubley R, Vogelzang M, Deutsch EW, Raught B, Pratt B, Nilsson E, Angeletti RH, Apweiler R, Cheung K, Costello CE, Hermjakob H, Huang S, Julian RK, Kapp E, McComb ME, Oliver SG, Omenn G, Paton NW, Simpson R, Smith R, Taylor CF, Zhu W, Aebersold R (2004). "A common open representation of mass spectrometry data and its application to proteomics research". Nat. Biotechnol. 22 (11): 1459–66. doi:10.1038/nbt1031. PMID 15529173. 
  4. ^ Lin SM, Zhu L, Winter AQ, Sasinowski M, Kibbe WA (2005). "What is mzXML good for?". Expert Review of Proteomics 2 (6): 839–45. doi:10.1586/14789450.2.6.839. PMID 17342793.
  5. ^ a b Orchard S, Montechi-Palazzi L, Deutsch EW, Binz PA, Jones AR, Paton N, Pizarro A, Creasy DM, Wojcik J, Hermjakob H (2007). "Five years of progress in the Standardization of Proteomics Data 4(th) Annual Spring Workshop of the HUPO-Proteomics Standards Initiative April 23-25, 2007 Ecole Nationale Supérieure (ENS), Lyon, France". Proteomics 7 (19): 3436–40. doi:10.1002/pmic.200700658. PMID 17907277. 
  6. ^ "mzML". http://www.psidev.info/index.php?q=node/257. Retrieved 2007-10-11. 
  7. ^ Deutsch EW (2008). "mzML: A single, unifying data format for mass spectrometer output". Proteomics 8 (14): 2776–7. doi:10.1002/pmic.200890049. PMID 18655045. 
  8. ^ HUPO-PSI
  9. ^ [1]
  10. ^ BSI: PEAKS website
  11. ^ Insilicos website
  12. ^ MS-Spectre website
  13. ^ OpenMS and TOPP website
  14. ^ An open source viewer developed under academic projects
  15. ^ An open source viewer developed by Matt Chambers at Vanderbilt
  16. ^ An open source viewer developed at the Fred Hutchinson Cancer Center
  17. ^ Commercial software with free viewer mode for mzXML and many proprietary formats
  18. ^ jmzML
  19. ^ Hermes
  20. ^ Hermes website
  21. ^ FileConverter
  22. ^ TOPP
  23. ^ "mzXML". http://tools.proteomecenter.org/wiki/index.php?title=Formats:mzXML. Retrieved 2008-06-30. 
  24. ^ "ProteoWizard". http://proteowizard.sourceforge.net. Retrieved 2008-06-30. 
  25. ^ ReAdW
  26. ^ TransProteomicPipeline
  27. ^ FileConverter
  28. ^ TOPP
  29. ^ http://www.sisweb.com/software/masstransit.htm
  30. ^ http://www.perkinelmer.com/gc
  31. ^ "Sashimi". http://sashimi.sourceforge.net. Retrieved 2007-10-11. 
  32. ^ mzSquash

Wikimedia Foundation. 2010.
