DMTCP version 2.1 has now been released.

As before, it runs on most Linux distros, and supports x86 and x86_64
(Intel/AMD, 32- and 64-bit) as well as 32-bit ARM (ARMv7).  In addition, the
older DMTCP version 1.2.x (currently 1.2.8) continues to be maintained, but on
a bug-fix basis only.

* CHANGE NEEDED FOR ALL PLUGINS:
  - If you have plugins that include "dmtcpplugin.h", they will now have to be
    changed to include "dmtcp.h".  This is to reflect that "dmtcp.h" has more
    uses than just for plugins.
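
A minimal sketch of how the include change might be applied across a plugin
source tree.  The function name migrate_dmtcp_include and the choice of file
extensions (*.c, *.cpp, *.h) are assumptions made for illustration; they are
not part of DMTCP.  Only '#include' lines are rewritten, so other mentions of
the old header name (for example in comments) are left alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMTCP): rewrite the old header name
# in every C/C++ source or header file under a directory.
migrate_dmtcp_include() {
    dir=${1:-.}
    # List only the files that still mention the old header.
    find "$dir" -type f \( -name '*.c' -o -name '*.cpp' -o -name '*.h' \) \
        -exec grep -l '"dmtcpplugin\.h"' {} + |
    while IFS= read -r f; do
        # Only touch #include lines; write via a temp file, then copy back
        # so the original file keeps its permissions.
        sed '/^[[:space:]]*#[[:space:]]*include/ s/"dmtcpplugin\.h"/"dmtcp.h"/' \
            "$f" > "$f.tmp" && cat "$f.tmp" > "$f" && rm -f "$f.tmp"
    done
}
```

Usage: 'migrate_dmtcp_include path/to/my-plugin', then rebuild the plugin.
It is worth reviewing the result with a diff before committing it.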

* This new release includes:
  - some newly stable plugins - batch-queue, modify-env, ptrace (see below)
  - full support for 32-/64-bit multilib architecture. (see below)
  - other enhancements to the core feature set (see below)
  - adapting DMTCP to application requirements:  removal of the old dmtcpaware
    interface in favor of the newer interface:  test/plugin/applic-*ckpt/
    (see below)
  - attempt to restore current working directory on restart (may be impossible
    if restart host has different filesystem)
  - 'dmtcp_coordinator --port-file <FILE>' causes coordinator to write the port
    number on which it listens into FILE.  This is useful in
    conjunction with 'dmtcp_coordinator --port 0', which starts a coordinator
    at a random unused port.
  - 'dmtcp_restart --ckptdir <DIR>' and 'dmtcp_restart_script.sh --ckptdir <DIR>'
    will change to a new directory to hold checkpoint images on restart.
  - 'dmtcp_restart --no-strict-uid-checking'
    or 'dmtcp_coordinator --no-strict-uid-checking'
    [ allows a user with a different uid to restart a checkpoint image;
    process uid will be changed to that of the new user ]
  - './configure --enable-run-as-root'  [ self-explanatory; normally running
    as root is bad practice ]
  - a new internal plugin to handle 'ssh' uniformly; some corner cases
    in checkpointing MPI could have been affected by this.
  - some bug fixes related to the new plugin software architecture initiated
    with DMTCP 2.0
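
The '--port 0' and '--port-file' options listed above are meant to be used
together: the coordinator picks an unused port, and a script then reads it
back from the file.  The following is a sketch only.  The helper name
wait_for_port_file, the retry count, and the assumption that the file holds
nothing but the port number are illustrative and not taken from DMTCP.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMTCP): wait until FILE is non-empty,
# then print the port number it contains.
wait_for_port_file() {
    file=$1
    tries=${2:-10}      # one-second polls; 10 by default
    while [ "$tries" -gt 0 ]; do
        if [ -s "$file" ]; then
            # Assumption: the file holds only the port number.
            # Strip everything except digits (e.g. a trailing newline).
            tr -cd '0-9' < "$file"
            echo
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    echo "timed out waiting for $file" >&2
    return 1
}

# Usage sketch (requires DMTCP to be installed; not executed here):
#   dmtcp_coordinator --port 0 --port-file /tmp/dmtcp.port &
#   port=$(wait_for_port_file /tmp/dmtcp.port) || exit 1
#   echo "coordinator is listening on port $port"
```

Polling is used because the coordinator is started in the background, so the
script has no other signal for when the port file has been written.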

* SOME NEWLY STABLE PLUGINS:
  This release continues to emphasize the use of DMTCP plugins.
  The plugins are now organized into two top-level subdirectories:
  - plugin - plugin is built by './configure; make', but must be invoked,
             typically through a command-line option of 'dmtcp_launch'
  - contrib - plugin not built; user must cd to the subdirectory of the plugin,
              build it, and invoke it with 'dmtcp_launch --with-plugin ...'

  - Plugins in the top-level plugin directory:
    + ptrace :  'dmtcp_launch --ptrace'
        a plugin to support checkpointing ptrace-based applications,
        notably including GDB.
    + batch-queue :  'dmtcp_launch --batch-queue'
        a resource manager plugin that supports the Torque/PBS and SLURM
        batch queue systems.  (This plugin is now mature, and was renamed
        from 'rm' in DMTCP-2.0 to 'batch-queue' to better reflect its use.)
        [ improved in DMTCP 2.1 ]
    + modify-env :  'dmtcp_launch --modify-env'
        Normally, on dmtcp_restart, a process can see only the original
        environment variables in effect during dmtcp_launch or set by the
        process itself.  It is common to wish to update these environment
        variables based on the environment on the restart host
        (e.g., DISPLAY=$DISPLAY).  This can be set in a file, dmtcp_env.txt.
        [ new in DMTCP 2.1 ]

  - The contrib plugins include:
    + condor : support for HTCondor, a framework for high throughput computing
    + kvm : checkpointing of a KVM virtual machine
    + tun : support for tun networking (as in Tun/Tap) between a virtual
        machine and the host machine
    + python : support for checkpoint/restart within a Python session
    + infiniband : checkpointing over InfiniBand networks; supports the OFED
        InfiniBand API.
        (Note: If you are using a newer release of OFED, you may wish to use
         the rewrite of this plugin, to be available from the svn in late
         January, 2014.)
      [ improved in DMTCP 2.1 ]
    + ib2tcp : support for checkpointing computation over InfiniBand and
        restarting over TCP.
      [ new in DMTCP 2.1 ]
    + ckptfile : example/template for a plugin to change the default directory
        to receive checkpoint images.  This can be important when restarting on
        a new host.
      [ new in DMTCP 2.1 ]
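
For the modify-env plugin described above, a sketch of how a dmtcp_env.txt
file might be prepared before restart.  The one-NAME=VALUE-per-line layout is
inferred from the DISPLAY=$DISPLAY example and should be treated as an
assumption, as should the directory in which the plugin looks for the file.
The helper name write_dmtcp_env is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMTCP): write a dmtcp_env.txt into
# the given directory (default: current directory).
write_dmtcp_env() {
    dir=${1:-.}
    # The quoted heredoc delimiter keeps $DISPLAY literal, so the value
    # is not expanded by this shell when the file is written.
    cat > "$dir/dmtcp_env.txt" <<'EOF'
DISPLAY=$DISPLAY
EOF
}
```

The process would have been started with 'dmtcp_launch --modify-env'; the
file is written on the restart host before running dmtcp_restart.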

* FULL SUPPORT FOR 32-/64-bit MULTILIB ARCHITECTURE:

  The standard binary, dmtcp_launch, now supports both 32- and 64-bit programs.
  Further, a 64-bit program may invoke a 32-bit program and vice versa, as part
  of a single computation under DMTCP control.

* OTHER ENHANCEMENTS TO THE CORE FEATURE SET:

  - For extremely malloc-intensive programs, run-time overhead ranging from
      several percent to 20% has been observed.  This is due to DMTCP deadlock
      avoidance.  (The glibc implementation of malloc uses a global lock,
      which can result in deadlock if a user invokes malloc inside a plugin
      during checkpoint or restart.)  If a user program is not using malloc
      in a plugin during checkpoint, then the user can disable this
      DMTCP deadlock avoidance scheme with a flag:
        dmtcp_launch --disable-alloc-plugin
      A future modification to DMTCP may remove this issue entirely.

* ADAPTING DMTCP TO APPLICATION REQUIREMENTS AND TO EXTERNAL ENVIRONMENTS:

  The old 'dmtcpaware' API is being removed in favor of:
    test/plugin/applic-*ckpt/

  For details on this newer API, please read the QUICK-START file with this
  same heading:  ADAPTING DMTCP TO ...
Source: README, updated 2014-01-12