This project has retired. For details please refer to its Attic page.
Joshua Documentation | Installation


Released November 5, 2015





Installation

Download and install

To use Joshua as a standalone decoder (with language packs), you only need to download and install the runtime version of the decoder. If you also wish to build translation models from your own data, you will want to install the full version. See the instructions below.

  1. Set up some basic environment variables. You need to define $JAVA_HOME:

    export JAVA_HOME=/path/to/java
    
    # JAVA_HOME is not very standardized. Here are some places to look:
    # OS X:  export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home
    # Linux: export JAVA_HOME=/usr/java/default
    
  2. If you are installing the full version of Joshua, you also need to define $HADOOP to point to your Hadoop installation. (Joshua looks for the Hadoop executable in $HADOOP/bin/hadoop.)

    export HADOOP=/usr
    

    If you don’t have a Hadoop installation, Joshua’s pipeline can install a standalone version for you.

  3. To install just the runtime version of Joshua, type

    wget -q http://cs.jhu.edu/~post/files/joshua-runtime-6.0.5.tgz
    

    Then unpack the archive and build everything:

    tar xzf joshua-runtime-6.0.5.tgz
    cd joshua-runtime-6.0.5
    
    # Set JOSHUA for this session; in your init files, use this directory's absolute path instead of $(pwd)
    export JOSHUA=$(pwd)
       
    # build everything
    ant
    
  4. To instead install the full version, type

    wget -q http://cs.jhu.edu/~post/files/joshua-6.0.5.tgz
    
    tar xzf joshua-6.0.5.tgz
    cd joshua-6.0.5
    
    # Set JOSHUA for this session; in your init files, use this directory's absolute path instead of $(pwd)
    export JOSHUA=$(pwd)
       
    # build everything
    ant
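
Before running ant, it can help to confirm that the variables from the steps above are actually set. The following is a minimal sketch, not part of Joshua: the function name `check_joshua_env` is hypothetical, and it assumes that a usable $JAVA_HOME contains an executable `bin/java` and that $JOSHUA points at the unpacked distribution directory.

```shell
# Sketch: sanity-check the variables defined in the steps above.
# check_joshua_env is a hypothetical helper name, not a Joshua command.
check_joshua_env() {
    rc=0

    # Assumption: a valid JAVA_HOME contains an executable bin/java
    if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME is unset or has no executable bin/java" >&2
        rc=1
    fi

    # JOSHUA should point at the unpacked distribution directory
    if [ -z "${JOSHUA:-}" ] || [ ! -d "$JOSHUA" ]; then
        echo "JOSHUA is unset or is not a directory" >&2
        rc=1
    fi

    # 0 if both variables look usable, 1 otherwise
    return "$rc"
}
```

Once the function is defined in your shell, `check_joshua_env && echo "environment looks ok"` prints the confirmation only when both checks pass.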
    

Building new models

If you wish to build models for new language pairs from existing data (such as the WMT data), you need to install some additional dependencies.

  1. For learning hierarchical models, Joshua includes a tool called Thrax, which is built on Hadoop. If you have a Hadoop installation, make sure that the environment variable $HADOOP is set and points to it. If you don’t, Joshua will roll one out for you in standalone mode. Hadoop is only needed if you plan to build new models with Joshua.

  2. You will need to install Moses if either of the following applies to you:

    • You wish to build phrase-based models. (Joshua 6 includes a phrase-based decoder, but not the tools for building such a model.)

    • You are building your own models (phrase- or syntax-based) and wish to use Cherry & Foster’s batch MIRA tuner instead of the included MERT implementation, Z-MERT.

    Follow the Moses installation instructions, and then define the $MOSES environment variable to point to the root of the Moses installation.
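
The dependency rules above can be summarized as a small check. This is only a sketch under stated assumptions: `check_model_building_env` is a hypothetical helper, not a Joshua tool, and it treats an unset $HADOOP or $MOSES as acceptable (since both are optional in some setups) while flagging a variable that is set but points at the wrong place.

```shell
# Sketch: report which model-building dependencies are configured.
# check_model_building_env is a hypothetical helper, not part of Joshua.
check_model_building_env() {
    rc=0

    if [ -z "${HADOOP:-}" ]; then
        # Per the text above, Joshua rolls out a standalone Hadoop here
        echo "HADOOP not set: Joshua will use a standalone Hadoop"
    elif [ -x "$HADOOP/bin/hadoop" ]; then
        echo "HADOOP ok: $HADOOP/bin/hadoop"
    else
        echo "HADOOP set, but $HADOOP/bin/hadoop is not executable" >&2
        rc=1
    fi

    if [ -z "${MOSES:-}" ]; then
        # Only matters for phrase-based models or batch MIRA tuning
        echo "MOSES not set: needed for phrase-based models or batch MIRA"
    elif [ -d "$MOSES" ]; then
        echo "MOSES ok: $MOSES"
    else
        echo "MOSES set, but $MOSES is not a directory" >&2
        rc=1
    fi

    # Non-zero only when a variable is set but points somewhere invalid
    return "$rc"
}
```

Run `check_model_building_env` after exporting the variables; an unset variable produces an informational note rather than a failure.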

More information

For more detail on the decoder itself, including its command-line options, see the Joshua decoder page. You can also learn more about other steps of the Joshua MT pipeline, including grammar extraction with Thrax and Joshua’s efficient grammar representation.

If you have problems or issues, you might find some help on our answers page or in the mailing list archives.

A bundled configuration (a minimal set of configuration, resource, and script files) can be created and then easily transferred and shared.