promc is hosted by Hepforge, IPPP Durham

ProMC

ProMC is a library for file input and output of Monte Carlo event records or any HEP structural data. The main features are:

  • Content-dependent "compression". Event records are encoded using Google's Protocol Buffers with a variable number of bytes ("varints"). This leads to file sizes that are 30% (or more) smaller than with any known compression based on a fixed-length representation of numeric information.
  • Multiplatform. Data records can be read and written natively in C++, Java and Python.
  • Self-describing data format. Analysis source code can be generated from ProMC files with unknown data layouts.
  • Forwards-compatible and backwards-compatible binary wire format.
  • Random access. Events can be read starting at any index.
  • Fast. No CPU overhead on decompression of events.
  • Simplicity. No external dependencies. The library is small and self-contained. The library has been deployed on BlueGene/Q.
ProMC ("ProtocolBuffers" MC) is based on Google's Protocol Buffers, a language-neutral, platform-neutral and extensible mechanism for serializing structured data. It uses "varints" to store integers in a variable number of bytes (one or more). Smaller numbers take fewer bytes. This means that low-energy particles can be represented by fewer bytes, since the values needed to describe them are smaller than those for high-energy particles. This is an important concept for "smart" compression of events with a large number of soft particles ("pileup" events), since such events use less disk storage than with methods based on a fixed-length representation of numbers.
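
The varint idea can be illustrated with a minimal Python sketch of the base-128 encoding that Protocol Buffers uses for unsigned integers: each byte carries 7 bits of the value, least-significant group first, and the top bit signals that another byte follows. The function names `encode_varint` and `decode_varint` are hypothetical and written for illustration only; they are not part of the ProMC or Protocol Buffers API.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (illustrative helper)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of `data` (illustrative helper)."""
    result = 0
    for position, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * position)
        if not byte & 0x80:          # continuation bit clear: this was the last byte
            break
    return result


if __name__ == "__main__":
    # Print how many bytes each value occupies.
    for value in (1, 127, 128, 300, 16384, 2_000_000):
        encoded = encode_varint(value)
        print(value, len(encoded), encoded.hex())
```

Values below 128 fit in a single byte, while a fixed-length 32-bit integer always takes four. This is why records dominated by small numbers shrink the most.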

The current ProMC implementation already indicates that files created using this library are 30-50% smaller than those produced with the standard ROOT (gzip) compression using either Float_t or Double32_t for storage, or with HepMC after gzip compression. These file-size reduction factors depend on the shape of the pT distributions and a few other factors (read the manual).
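
The dependence on the pT shape can be seen in a rough Python sketch that compares the average varint size for a soft and a hard spectrum. Everything specific here is an assumption made for illustration: the exponential spectra, the `SCALE` factor that turns GeV into integer MeV, and the helper names. It ignores field tags, other record fields and any real ROOT or HepMC comparison, so it does not reproduce the figures quoted above.

```python
import random

SCALE = 1000       # hypothetical: store pT as an integer number of MeV
FIXED_BYTES = 4    # size of a fixed-length 32-bit number, for comparison


def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def mean_varint_bytes(mean_pt_gev: float, n: int = 10_000, seed: int = 1) -> float:
    """Average varint size for pT values drawn from an exponential spectrum."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    total = 0
    for _ in range(n):
        pt_gev = rng.expovariate(1.0 / mean_pt_gev)
        total += varint_size(int(round(pt_gev * SCALE)))
    return total / n


soft = mean_varint_bytes(1.0)    # soft spectrum, mean pT of 1 GeV (assumed)
hard = mean_varint_bytes(50.0)   # hard spectrum, mean pT of 50 GeV (assumed)

if __name__ == "__main__":
    print(f"soft spectrum: {soft:.2f} bytes per value")
    print(f"hard spectrum: {hard:.2f} bytes per value")
    print(f"fixed length:  {FIXED_BYTES} bytes per value")
```

Under these assumptions the softer spectrum needs fewer bytes per value on average than the harder one, and both stay below the fixed four bytes.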

The library can also be used for storing any object data in a multiplatform, self-describing and highly compact form.

ProMC is used by the HepSim public database, which holds theoretical predictions from Monte Carlo generators for HEP.

Other resources

ProMC downloads
ProMC downloads (ANL mirror)
ProMC SVN
ProMC manual and tutorial
HepSim database with events from Monte Carlo generators

Created by S.V.Chekanov (ANL).
Contact: chekanov [AT] anl.gov