Partek Flow Documentation

...

For single-node installations, refer to Installing Partek Flow on Linux.

Prior to installation, make sure you have the license key generated for the host ID of the compute cluster on which the software will be installed. Contact licensing@partek.com for key generation.

...

  1. Log in to the flow account and change to the flow home directory

    cd /home/flow

  2. Download Partek Flow and the remote worker package

    wget --content-disposition http://packages.partek.com/linux/flow-release
    wget --content-disposition http://packages.partek.com/linux/flow-worker-release

     

  3. Unzip these files into the flow home directory /home/flow. This yields two directories: partek_flow and PartekFlowRemoteWorker


  4. Partek Flow can generate large amounts of data, so it needs to be configured to store the bulk of this data in the largest shared data store available. For this guide we assume that the directory is located at /shared. Adjust this path accordingly.

  5. The Partek Flow server (which runs on the head node) and the remote workers (which run on the compute nodes) must see identical file system paths for any directory Partek Flow has read or write access to. Thus /shared and /home/flow must be mounted on the Flow server and on all compute nodes. Create the directory /shared/FlowData and give the flow Linux account write access to it.

  6. It is assumed that the head node is attached to at least two separate networks: (1) a public network that allows users to log in to the head node and (2) a private backend network that is used for communication between the compute nodes and the head node. Clients connect to the Flow web server on port 8080, so adjust the firewall to allow inbound connections to port 8080 over the public network of the head node. Partek Flow connects to remote workers over your private network on port 2552, so make sure that port is also open.

  7. Partek Flow needs to be told which private network to use for communication between the server and the workers. Several private networks may be available (gigabit, InfiniBand, etc.), so select one to use. We recommend using the fastest network available. For this guide, assume that the private network is 10.1.0.0/16. Locate the head node hostname that resolves to an address on the 10.1.0.0/16 network. This hostname must resolve to the same address on all compute nodes.

  8. For example:
    host head-node.local
    yields
    10.1.1.200 

    Open /home/flow/.bashrc and add the following at the end of the file:
    export CATALINA_OPTS="$CATALINA_OPTS -Djava.awt.headless=true \
    -DflowDispatcher.flow.command.hostname=head-node.local \
    -DflowDispatcher.akka.remote.netty.tcp.hostname=head-node.local"

    Source .bashrc so that the CATALINA_OPTS environment variable is set in the current shell.

    NOTE: If workers are unable to connect (below), then replace all hostnames with their respective IPs.

  9. Start Partek Flow
    ~/partek_flow/start_flow.sh

  10. You can monitor progress by tailing the log file partek_flow/logs/catalina.out. After a few minutes, the server should be up.

  11. Make sure the correct ports are bound
    netstat -tulpn

  12. You should see 10.1.1.200:2552 and :::8080 in the LISTEN state. If either is missing, inspect catalina.out for error messages.

  13. Open a browser and go to http://localhost:8080 on the head node to configure the Partek Flow server.

     

  14. Enter the license key provided (Figure 1)

    Figure 1. Setting up the Partek Flow license during installation

  15. If there appears to be an issue with the license, or if there is a message about 'no workers attached', restart Partek Flow. It may take 30 seconds for the process to shut down. Make sure the process has terminated before starting the server back up:
    ~/partek_flow/stop_flow.sh 
    Then run: 
    ~/partek_flow/start_flow.sh

  16. You will now be prompted to set up the Partek Flow admin user (Figure 2). Specify the username (admin), password, and email address for the administrator account and click Next.

    Figure 2. Setting up the Partek Flow 'admin' account during installation


  17. Select a directory to store the library files that will be downloaded or generated by Partek Flow (Figure 3). All Partek Flow users share library files, and the library directory can grow significantly. We recommend allocating at least 100 GB of free space for library files. The free space in the selected library file directory is shown. Click Next to proceed. You can change this directory after installation by changing system preferences. For more information, see Library file management.

    Figure 3. Selecting the library file directory

  18. To set up the Partek Flow data paths, click Settings at the top right of the Flow server webpage. On the left, click Directory permissions, then Permit access to a new directory. Add /shared/PartekFlow and allow all users access.

  19. Next, click System preferences in the left menu and change the data download directory and the default project output directory to /shared/PartekFlow/downloads and /shared/PartekFlow/project_output, respectively.

    Note: If you do not see the /shared folder listed, click the Refresh folder list link toward the bottom of the download directory dialog.

  20. Since you do not want to run any work on the head node, go to Settings > System preferences > Task queue settings and uncheck Start internal worker at Partek Flow server startup.

  21. Restart the Flow server:
    ~/partek_flow/stop_flow.sh 
    After 30 seconds, run: 
    ~/partek_flow/start_flow.sh
    This is needed to disable the internal worker.

  22. Test that remote workers can connect to the Flow server

  23. Log in as the flow user to one of your compute nodes. Assume the hostname is compute-0. Since your home directory is exported to all compute nodes, you should be able to go to /home/flow/PartekFlowRemoteWorker/

  24. To start the remote worker:
    ./partekFlowRemoteWorker.sh head-node.local compute-0

  25. Both hostnames should resolve to addresses in the 10.1.0.0/16 address space. The remote worker writes to stdout when you run it. Scan the output for errors. You should see the message "woot! I'm online."

  26. A successfully connected worker will show up on the Resource management page on the Partek Flow server. This can be reached from the main homepage or by clicking Resource management from the Settings page. Once you have confirmed the worker can connect, kill the remote worker (CTRL-C) from the terminal in which you started it.

  27. Once everything is working, return to library file management and add the genomes/indices required by your research team. If Partek hosts these genomes/indices, Partek Flow will download them automatically.
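
Step 5 asks for a data directory that the flow account can write to. The following is a minimal sketch of that step, not an official Partek script. It assumes the account is named flow, as elsewhere in this guide, and that it is run either by root or by the flow user itself. The function name prepare_flow_data_dir is illustrative only.

```shell
# Create a data directory and make sure its owner can write to it.
# Assumptions: $1 is the directory (for example /shared/FlowData) and
# $2 is the owning account (defaults to "flow", as used in this guide).
prepare_flow_data_dir() {
    pfd_dir="$1"
    pfd_owner="${2:-flow}"
    mkdir -p "$pfd_dir" || return 1
    # Changing ownership needs root; skip it when run as an ordinary user.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$pfd_owner" "$pfd_dir" || return 1
    fi
    # Give the owner read, write, and search permission on the directory.
    chmod u+rwx "$pfd_dir"
}
```

For example, prepare_flow_data_dir /shared/FlowData flow could be run once on a node where /shared is mounted read-write.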
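
Steps 7 and 8 require the head node hostname to resolve to an address on the private network from every node. One way to check this might look like the sketch below. It assumes a glibc-based Linux system where getent is available, and it uses the example values from this guide (head-node.local and the prefix 10.1. for 10.1.0.0/16). The function names are hypothetical, and the simple prefix match only works for networks that end on an octet boundary (/8, /16, /24).

```shell
# Return 0 when an IPv4 address ($1) starts with the dotted prefix ($2).
# "10.1." corresponds to the 10.1.0.0/16 network assumed in this guide.
ip_has_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Print the first IPv4 address the system resolver returns for a hostname.
resolve_ipv4() {
    getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Report whether a hostname ($1) resolves into the expected prefix ($2).
check_head_node() {
    chn_addr=$(resolve_ipv4 "$1")
    if [ -z "$chn_addr" ]; then
        echo "$1 does not resolve" >&2
        return 1
    fi
    if ip_has_prefix "$chn_addr" "$2"; then
        echo "$1 -> $chn_addr (ok)"
    else
        echo "$1 -> $chn_addr (outside $2*)" >&2
        return 1
    fi
}
```

Running check_head_node head-node.local 10.1. on the head node and on each compute node should print the same address everywhere. If it does not, the NOTE in step 8 about replacing hostnames with IP addresses applies.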
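
The port check in steps 11 and 12 can also be scripted rather than read by eye. The sketch below filters netstat -tulpn style output for a TCP socket in the LISTEN state on a given port. It assumes the usual Linux netstat column layout (local address in the fourth column, state in the sixth). The function name has_listener is illustrative only.

```shell
# Succeed when stdin (netstat -tulpn style output) contains a socket in
# the LISTEN state whose local address ends in ":<port>" ($1).
has_listener() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # The port is the last colon-separated piece of the local
            # address, which also handles IPv6 forms such as :::8080.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

For example, netstat -tulpn 2>/dev/null | has_listener 2552 followed by the same command for 8080 returns a non-zero status for whichever port is not yet bound.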

...