distributed rendering using Apple Qmaster 3 – success!!!

Wed 27, February 2008 12:26 am computer stuff, Extremely Cool, osx, python, VFX

Thanks to a fair amount of googling and a few days of on-and-off testing, I seem to have a working setup for distributed processing using Apple Qmaster – for Shake, Maya and Compressor.

Here is my effort at explaining what I did to get it working – big apologies, it’s poorly written. I’ll revisit this post soon.

some notes on our workflow

  • Our current workflow for dealing with files/assets revolves around a directory structure that breaks each project up into shot numbers, the various departments we have, and so on.
  • Our Xserve is the centralised area for our assets and the Qmaster render manager.
  • Each artist’s local machine has a directory somewhere that mirrors the Xserve’s project directory structure for working locally (to keep network traffic down over gigabit ethernet).
  • I set up $JOBS_LOCAL and $JOBS_SERVER env variables on each machine and the Xserve – these variables point to the relevant local project directory on the artist’s Mac and the project directory on the server.
  • I created a Python script that does a find and replace of the two variables and writes out a new Shake script renamed “*_SERVER.shk” or “*_LOCAL.shk”.
  • (See further down for setting up the ENV variables.)
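The find-and-replace step can be sketched roughly like this (a minimal sketch, not our actual script – the function name and the example paths are illustrative, and it assumes $JOBS_LOCAL and $JOBS_SERVER are already set as described below):

```python
import os

def retarget_shake_script(script_path, to_server=True):
    """Swap local/server path prefixes in a Shake script and write a renamed copy.

    Assumes the $JOBS_LOCAL and $JOBS_SERVER env variables are set.
    """
    local = os.environ["JOBS_LOCAL"]    # e.g. /Volumes/otherdrive/jobs
    server = os.environ["JOBS_SERVER"]  # e.g. /Volumes/ourserversharepoint/jobs
    src, dst, suffix = (local, server, "_SERVER") if to_server else (server, local, "_LOCAL")

    with open(script_path) as f:
        text = f.read()

    base, ext = os.path.splitext(script_path)
    out_path = base + suffix + ext      # e.g. comp_v001_SERVER.shk
    with open(out_path, "w") as f:
        f.write(text.replace(src, dst))
    return out_path
```

The artist works on the *_LOCAL.shk copy, and the *_SERVER.shk copy is the one submitted to Qmaster, so every render node resolves the same server-side paths.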

    Centralised $NR_INCLUDE_PATH
    I set up an env variable $NR_INCLUDE_PATH on all Shake machines and the Xserve, pointing at a sharepoint (the nreal folder) on the Xserve that mounts automatically – so all the Shake machines use the same macros, plugins and settings. I also created a new user on the Xserve, “shake”, that can only mount the nreal directory.

    After some googling around I found a way to automount volumes:

    OS X 10.5 (fstab)
    /etc/fstab – to automount sharepoint on xServe
    LINK: http://blogs.sun.com/lowbit/entry/easy_afp_autmount_on_os

    # -------------------------------------------------------------
    # Mount the nreal share from the Xserve via AFP
    # -------------------------------------------------------------
    ourserver.local:/nreal /Network/nreal url auto,url==afp://nreal:password@ourserver.local/nreal 0 0
    # -------------------------------------------------------------

    How to refresh the automount (as root, or with sudo):

    sudo automount -cv

    OS X 10.4 (netinfo manager)
    LINK: http://ask.metafilter.com/54223/Help-me-automount-my-smb-share-in-Apple-OS-X-reward-inside
    in terminal.app:

    nidump fstab . > /tmp/fstab
    echo "ourserver:/nreal /Network/nreal url url==cifs://nreal:password@ourserver/nreal 0 0" >> /tmp/fstab
    sudo niload fstab . < /tmp/fstab
    sudo killall -HUP automount

    How to refresh the automount (as root, or with sudo):

    sudo killall -HUP automount

    QMASTER
    I had tried for some time to get Shake scripts to render over our network using Qmaster, but it just wouldn’t work. The Qmaster logs were where I found all my errors: ‘frame 0021 could not be found’, ‘UNIX error 3 file could not be found’.

    things to check

  • Shake / Maya / Qmaster 3 node is installed on all render machines
  • the location of all your media can be accessed by all machines
  • What seemed strange was that if I logged into the Xserve and executed a script via Terminal with Shake (to render just on the Xserve), the render would complete successfully. Then it clicked: maybe the environment variables I was using in my scripts ($xxx) weren’t being recognised by Qmaster, or by the way Qmaster launches Shake.
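A quick way to sanity-check the second point from each render node is a few lines of Python (the paths below are examples from this post, not universal):

```python
import os

def check_paths(paths):
    """Report whether each media/plugin location is reachable from this machine."""
    return {p: os.path.exists(p) for p in paths}

# Example locations from our setup – substitute your own.
for path, ok in check_paths([
        "/Volumes/ourserversharepoint/jobs",
        "/Network/nreal/include",
]).items():
    print(path, "OK" if ok else "MISSING")
```

Run it on every node in the cluster; a single MISSING is enough to make Qmaster renders fail with file-not-found errors.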

    The big tip-off
    I googled for the errors I kept seeing and luckily enough this forum post popped up:
    http://www.highend3d.com/boards/index.php?showtopic=204342

    “I have pinned this down to at least one reason – that the shake qmaster isn’t picking up the NR_INCLUDE_PATH environment variable. Does anyone know where you need to set this up on a cluster node (I can get the qmasterd to pick it up but that doesn’t solve the problem!) “

    If you are trying to use Qmaster, and need to set environment variables, then you need to create a wrapper script that sets the variables and then calls the appropriate version of shake.

    For example (this one was from Apple):

    #!/bin/sh
    NR_INCLUDE_PATH="/Network_Applications/Shake Plugins/nreal/include"; export NR_INCLUDE_PATH
    NR_ICON_PATH="/Network_Applications/Shake Plugins/nreal/icons"; export NR_ICON_PATH

    umask 000

    /Applications/Shake3.50/shake.app/Contents/MacOS/shake "$@"
    status=$?
    if [ $status -ne 0 ]
    then
        exit $status
    fi

    Then when using Qmaster, you run the application using this script (saved for example as /usr/bin/shakeWrapper), which must be installed on all nodes in the cluster.

    Regards
    Nell

    Cheers Nell!

    I took Nell’s script, added a few lines to it, made it executable (chmod +x) and stuck it in /usr/bin/

    /usr/bin/shakeWrapper – a file that launches Shake while respecting the ENV variables (later we alias ‘shake’ to this file)

    #!/bin/sh
    echo
    echo "Shake 4 running through a wrapper script – /usr/bin/shakeWrapper"
    echo

    umask 000

    /Applications/Shake/shake.app/Contents/MacOS/shake "$@"
    status=$?
    if [ $status -ne 0 ]
    then
        exit $status
    fi

    I added the first few echo lines so that when I later aliased ‘shake’ to this script, the user would have some idea of what is going on when launching Shake via the terminal.

    Setting up the ENV (environment) variables…

    /etc/profile – to declare system-wide Environment variables / aliases
    (we alias shake to the wrapper script so that it launches with the env variables recognised)

    # System-wide .profile for sh(1)

    if [ -x /usr/libexec/path_helper ]; then
        eval `/usr/libexec/path_helper -s`
    fi

    if [ "${BASH-no}" != "no" ]; then
        [ -r /etc/bashrc ] && . /etc/bashrc
    fi

    JOBS_LOCAL="/Volumes/otherdrive/jobs"
    export JOBS_LOCAL

    JOBS_SERVER="/Volumes/ourserversharepoint/jobs"
    export JOBS_SERVER

    NR_INCLUDE_PATH="$HOME/nreal/include:/Network/nreal/include/"
    export NR_INCLUDE_PATH

    alias shake="/usr/bin/shakeWrapper"

    * Remember to apply the changes by running in Terminal:

    source /etc/profile
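Once the profile has been sourced, it’s worth checking that a fresh process actually sees the variables – a quick sketch in Python (the same language as our pipeline script; the variable names match the setup above):

```python
import os

def report_env(names):
    """Return {name: value or None} for each expected pipeline variable."""
    return {n: os.environ.get(n) for n in names}

for name, value in report_env(["JOBS_LOCAL", "JOBS_SERVER", "NR_INCLUDE_PATH"]).items():
    print(name, "=", value if value else "<not set>")
```

If any of these print “&lt;not set&gt;” under the account Qmaster runs as, you’re back to the wrapper-script problem described earlier.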