Partek Flow Documentation



Create 


Log in to your AWS management console. The login URL may depend on your organization, but it is likely https://console.aws.amazon.com

Click on EC2

Switch to the region where you will deploy Flow. For this tutorial, we will be using US East (N. Virginia)

On the left menu, click on Instances, then click the Launch Instance button. You will be brought to the first step, denoted "Choose an Amazon Machine Image (AMI)"

Click the Select button next to "Ubuntu Server 16.04 LTS (HVM), SSD Volume Type - ami-f4cc1de2"

For the next step "Choose an Instance Type" the selection depends on your budget and the size of the Flow deployment. We recommend m4.large for testing or cluster front-end operation, m4.xlarge for standard deployments, and m4.2xlarge for alignment-heavy workloads with a large user-base. See (AWS instance type resources and costs) for assistance with choosing the right instance. In most cases, the instance type and associated resources can be changed after deployment, so one is not locked in to the choices made for this step.  

For "Configure Instance Details", make the following selections:

Number of instances = 1. You do not need to create an autoscaling group for single-node deployments.

Purchasing option: Leave "Request Spot instances" unchecked. Spot instances are only relevant for cost-minimization of Flow cluster deployments.

Network: If you do not have a VPC already created for Flow, click "Create new VPC". This will open a new browser tab for VPC management.

Use the following settings for the VPC:

Name tag: Flow-VPC

IPv4 CIDR block: 10.0.0.0/16

Select "No IPv6 CIDR Block"

Tenancy: Default

Click "Yes, Create". You may be asked to select a DHCP option set. If so, then make sure the DHCP option set you select or create has the following options:

Options: domain-name = ec2.internal;domain-name-servers = AmazonProvidedDNS;

DNS resolution: leave the defaults set to yes

DNS hostname: change this to yes, since you may need internal DNS resolution depending on the Flow deployment.

Once created, the new Flow-VPC will appear in the list of available VPCs. The VPC needs additional configuration for external access. To continue, right click on Flow-VPC and select "Edit DNS Resolution", then select "Yes" and then Save. Next, right click the Flow-VPC and select "Edit DNS Hostnames", select Yes, then Save. 

Make sure the DHCP option set is set to the one created above. If it is not, right click on the Flow-VPC, select "Edit DHCP option sets", and select the correct DHCP option set.

Now, close the VPC management tab and go back to the EC2 management console. Click the refresh arrow next to "Create new VPC" and select the Flow-VPC VPC.

Next, click "Create new subnet" and a new browser tab will open with a list of existing subnets. Click "Create Subnet" and set the following options:

Name tag: Flow-Subnet

VPC : Flow-VPC

VPC CIDRs: This should be automatically populated with the information from Flow-VPC

Availability Zone: It is OK to let Amazon choose for you if you do not have a preference

IPv4 CIDR block: 10.0.1.0/24
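The subnet block must fall inside the VPC block chosen earlier. As a rough sanity check, the following sketch tests whether one IPv4 CIDR block lies within another. The function names ip_to_int and cidr_contains are hypothetical helpers written for this illustration; they are not part of AWS or Flow.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address on "." into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if CIDR block $2 lies entirely inside CIDR block $1.
cidr_contains() {
  outer_ip=${1%/*}; outer_len=${1#*/}
  inner_ip=${2%/*}; inner_len=${2#*/}
  # The inner block cannot be larger than the outer block.
  [ "$inner_len" -ge "$outer_len" ] || return 1
  mask=$(( (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
  # Both network addresses must agree under the outer block's mask.
  [ $(( $(ip_to_int "$outer_ip") & mask )) -eq $(( $(ip_to_int "$inner_ip") & mask )) ]
}
```

For example, cidr_contains 10.0.0.0/16 10.0.1.0/24 succeeds for the Flow-VPC and Flow-Subnet values used in this tutorial.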

Stay on the VPC dashboard tab and on the left navigation menu, click "Internet gateways", then click "Create Internet Gateway" and use the following options:

Name tag = Flow-IGW, then click "Yes, create".

The new gateway will be displayed as "detached". Right click on the Flow-IGW gateway and select "Attach to VPC", then select Flow-VPC and click "Yes, Attach"

Click on "Route Tables" on the left navigation menu. 

If it exists, select the route table already associated with Flow-VPC. If not, make a new route table and associate it with Flow-VPC. Click on the new route table, then click the "Routes" tab toward the bottom of the page. The route Destination = 10.0.0.0/16 Target = local should already be present. Click Edit, then Click "Add another route" and set the following parameters:

Destination 0.0.0.0/0

Target set to Flow-IGW (the internet gateway just created)

Click "Save"
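If you prefer the AWS command line interface over the console, the default route added above could be sketched as follows. This assumes the AWS CLI is installed and configured, which this tutorial does not otherwise require. The function name add_default_route is a hypothetical helper, and the IDs passed to it are placeholders for the IDs of your own route table and of Flow-IGW.

```shell
# Hypothetical helper: send all non-local traffic through an internet gateway.
# $1 = route table ID, $2 = internet gateway ID (both are placeholders
# that you must look up for your own Flow-VPC and Flow-IGW).
add_default_route() {
  aws ec2 create-route \
    --route-table-id "$1" \
    --destination-cidr-block 0.0.0.0/0 \
    --gateway-id "$2"
}
```

A call would look like add_default_route rtb-EXAMPLE igw-EXAMPLE, with the two example IDs replaced by real ones.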

Close the VPC Dashboard browser tab and go back to the EC2 Management Console tab. We are still on "Step 3: Configure Instance Details." Click the refresh arrow next to "Create new subnet" and select Flow-Subnet.

Auto-assign public ip: Use subnet setting (Disable)

Placement group: No placement group

IAM role : None. NOTE: For multi-node Flow deployments or instances where you would like Partek to manage AWS resources on your behalf, please see (Partek AWS support) and set up an IAM role as needed.

Shutdown behavior: Stop

Enable termination protection: select "Protect against accidental termination"

Monitoring: leave "Enable CloudWatch detailed monitoring" disabled

EBS-optimized instance: Make sure "Launch as EBS-optimized instance" is enabled and non-selectable. Given the recommended choice of an m4 instance type, EBS optimization should be enabled at no extra cost.

Tenancy: Shared - Run a shared hardware instance

Network interfaces: leave as-is

Advanced details: leave as-is

Click "Next: Add Storage". You should now be on "Step 4: Add Storage".

For the existing root volume, set the following options:

Size: 8GB

Volume type: Magnetic

Select "Delete on termination"

Click "Add New Volume" and set the following options:

Volume Type: EBS

Device: /dev/sdb (take the default)

Do not define a snapshot

Size (GiB): 500. This is the minimum for st1 volumes (See: Notes about EBS volumes)

Volume Type: Throughput optimized HDD (ST1)

Do not select "Delete on termination" or encryption

Click "Next: Add Tags"

You do not need to define any tags for this new EC2 instance, but you can if you would like.

Click "Next: Configure Security Group" 

For "Assign a security group" select "Create a new security group"

Security group name: Flow-SG

Description: Security group for Flow server

Add the following rules:

1) SSH set Source to My IP (or the address range of your company or institution)

2) Click Add Rule, Type is Custom TCP Rule, Port Range = 8080, Source is anywhere (0.0.0.0/0, ::/0). It is recommended to restrict this to just those that need access to Flow.
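As a command line alternative to rule 2, the port 8080 rule could be sketched with the AWS CLI as below. This assumes the AWS CLI is installed and configured. The function name open_flow_port is a hypothetical helper, and the security group ID and source range are placeholders that you supply.

```shell
# Hypothetical helper: allow inbound TCP 8080 (the Flow web interface)
# from a given source range.
# $1 = security group ID of Flow-SG, $2 = source CIDR block
open_flow_port() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 8080 \
    --cidr "$2"
}
```

Passing your institution's address range as the second argument, rather than 0.0.0.0/0, follows the recommendation to restrict access to those who need Flow.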

Click "Review and Launch"

AWS will suggest this server not be booted from a magnetic volume. Since there is not a lot of IO on the root partition and reboots will be rare, choosing "Continue with Magnetic" will reduce costs. Choosing an SSD volume will not provide substantial benefit, but it is OK if one wishes to use an SSD volume.

Click Launch

Create a new keypair: Name it Flow-Key, download it, then run chmod 600 on the downloaded key so it can be used. Back up this key, as you may lose access to the Flow instance without it.
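The chmod step matters because ssh refuses to use a private key that other users can read. A minimal sketch, assuming the key was saved under a name such as Flow-Key.pem (the exact file name and location depend on your browser), is shown below. The function name secure_key is a hypothetical helper.

```shell
# Hypothetical helper: restrict a private key to its owner and print the
# resulting permission string so the change can be confirmed.
# $1 = path to the downloaded key file
secure_key() {
  chmod 600 "$1" && ls -l "$1" | cut -c1-10
}
```

Running secure_key Flow-Key.pem should print -rw-------, meaning only the owner can read or write the key.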

The new instance will now boot. Use the left navigation bar and click on Instances. Click the pencil icon and assign the instance the name “Flow Server”

Next, we will need to make the server accessible at a fixed address. To do this, click on "Elastic IPs" on the left. 

Click "Allocate new address"

Assign Scope to VPC

Click Allocate

On the table containing the newly allocated elastic IP, right click and select "Associate Address"

For Instance, select the instance named “Flow Server”

For Private IP, select the one private IP available, then click Associate

 

------------------------------------

Notes about EBS volumes:

Size: 500 GiB (this is the minimum for st1 volumes). Volume type: Throughput optimized HDD. Throughput (MB/s): 20 / 123 (can’t change). Baseline: 40 MB/s per TiB. Do not select delete on termination or encryption.
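The 20 / 123 throughput figures can be reproduced from the volume size. The sketch below uses the 40 MB/s per TiB baseline stated above, together with two assumptions that are not from this page: a burst rate of 250 MB/s per TiB, and rounding up to the next whole number, which is chosen here only because it matches the figures shown. The 500 cap reflects the maximum st1 throughput listed under EBS types. The function name st1_throughput is a hypothetical helper.

```shell
# Hypothetical helper: print "baseline burst" throughput in MB/s for an
# st1 volume of the given size.
# $1 = volume size in GiB
st1_throughput() {
  awk -v gib="$1" '
    function ceil(x) { return (x == int(x)) ? x : int(x) + 1 }
    BEGIN {
      tib = gib / 1024
      base = tib * 40      # baseline rate, from the notes above
      burst = tib * 250    # burst rate, an assumption (see lead-in)
      if (base > 500) base = 500
      if (burst > 500) burst = 500
      printf "%d %d\n", ceil(base), ceil(burst)
    }'
}
```

For the 500 GiB volume in this tutorial, st1_throughput 500 prints 20 123.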

 

Set the frontend domain name (Cluster only)

 

https://console.aws.amazon.com/route53/

Click "Get started" under DNS management

Go to Hosted zones, then click "Create hosted zone"

Domain name = flowcluster

Type = Private hosted zone for VPC

VPC = Flow-Testing VPC

 

In Hosted zones, click "Create record set"

Name = frontend, Type = A - IPv4 address

Value = 10.0.1.116 (the Flow instance IP)

Routing policy = Simple

 

Connect to instance:

Run chmod 600 on the key, then connect:

ssh -i ~/aws-keys/Flow-Testing.pem ubuntu@awstest.partek.com

sudo su

Edit /etc/hostname so that it contains frontend.flowcluster, then reboot.

 

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html

 

sudo su

mkfs -t ext4 /dev/xvdb

Test the mount and copy /home to the new volume:

mount -t ext4 /dev/xvdb /mnt/

rsync -avr /home/ /mnt/

umount /mnt/

 

Edit /etc/fstab (for example with vi /etc/fstab) and add the following line, using the UUID of the new volume:

UUID=24311e67-c70a-4b4c-9d2a-e9c016dbce29 /home   ext4    defaults,nofail        0       2

mount -a

Log out, then log back in.
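The UUID in the fstab entry above is specific to one volume, so yours will differ. The sketch below builds the same kind of line for any UUID. The function name fstab_line is a hypothetical helper, and the commented blkid call assumes the new volume appears as /dev/xvdb, as it does in the commands above.

```shell
# Hypothetical helper: build an /etc/fstab line that mounts an ext4
# filesystem by UUID with the same options used in this tutorial.
# $1 = filesystem UUID, $2 = mount point
fstab_line() {
  printf 'UUID=%s %s ext4 defaults,nofail 0 2\n' "$1" "$2"
}

# On the instance, the UUID could be read with blkid (needs root), e.g.:
#   uuid=$(blkid -s UUID -o value /dev/xvdb)
#   fstab_line "$uuid" /home
```

Mounting by UUID rather than by device name keeps the entry valid if the device name changes between boots, and nofail lets the instance boot even if the volume is missing.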

 

Install Flow (we assume you have a license). We keep everything in /home/ubuntu, so we need to do a zip install:

 

Install the following packages beforehand:

sudo apt-get update

sudo apt-get install  python perl make gcc g++ zlib1g libbz2-1.0 libstdc++6 libgcc1 libncurses5 libsqlite3-0 libfreetype6 libpng12-0 zip unzip libgomp1 libxrender1 libxtst6 libxi6 debconf

 

Now install Flow:

Exit the root shell to return to the ubuntu user, and change to the home directory.

 

wget --content-disposition packages.partek.com/linux/flow-release

unzip PartekFlow-LINUX-6.0.17.0319.zip

 

Edit ~/.bashrc (for example with vi ~/.bashrc) and add the following line at the end:

export CATALINA_OPTS="-DflowDispatcher.flow.command.hostname=frontend.flowcluster -DflowDispatcher.akka.remote.netty.tcp.hostname=frontend.flowcluster"

 

source ~/.bashrc
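If this setup is scripted rather than done by hand, it helps to avoid adding the same export line twice on repeated runs. A minimal sketch follows; the function name append_once is a hypothetical helper and not part of Flow.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
# $1 = file to edit, $2 = exact line to ensure is present
append_once() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

It could be called with ~/.bashrc as the first argument and the full export CATALINA_OPTS="..." line, in single quotes, as the second.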

 

./partek_flow/start_flow.sh

Go to: http://awstest.partek.com:8080/

Enter license key

Set up admin account

Leave library file directory where it is and check that free space matches what you expect (~500 GB)

 

Support and multinode:

IAM Role:

role name = Flow-Testing

role type : add

 

Amazon EC2

 

leave policies alone



+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

Background info and rationale for choices:

Candidate nodes, cheapest to most expensive. Cost is dynamic cost (see reserved pricing).

We want network speed to be 1 Gb/s or greater for multi-node setups.

We want HVM.

No EBS-optimized surcharge.

We want placement group support.

No T-class servers, as we don’t want to slow responsiveness.

We don’t use instance store, since all data is lost after an instance stop. Too risky.

 

EBS-only options:

Type         Mem        Cores     EBS throughput   Netw rate     Monthly cost
m4.large     8.0 GB     2 vCPUs   56.25 MB/s       M             $78.840
r4.large     15.25 GB   2 vCPUs   50 MB/s          H (10G int)   $97.09
*m4.xlarge   16.0 GB    4 vCPUs   93.75 MB/s       H             $156.950
r4.xlarge    30.5 GB    4 vCPUs   100 MB/s         H             $194.180
*m4.2xlarge  32.0 GB    8 vCPUs   125 MB/s         H             $314.630
r4.2xlarge   61.0 GB    8 vCPUs   200 MB/s         H (10G int)   $388.360

 

Single server recommendation: m4.2xlarge

Cluster head node recommendation: m4.xlarge

Soma “test” server with worst case hardware: m4.large

 

Performance may suffer if one chooses smaller nodes.

One can always shut down and change the instance type if a particular choice is insufficient. EBS volumes can be grown or have their performance changed.

 

Network speed (netw rate) US-EAST-1 internal and external.

L = 50Mb/s

M = 300Mb/s

H = 1Gb/s.

 

See network benchmarks: http://epamcloud.blogspot.com.br/2013/03/testing-amazon-ec2-network-speed.html

 

EBS types:

Use throughput optimized HDD for Flow data.

Max throughput 500 MiB/s

$0.045 per GB-month of provisioned storage

500 GB min provisioned storage so min cost = $22.5 per month for a 500 GB drive.
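The minimum cost above is just size times rate, and the same arithmetic applies to larger volumes. The sketch below uses the $0.045 per GB-month rate quoted in these notes as its default; current pricing may differ. The function name st1_monthly_cost is a hypothetical helper.

```shell
# Hypothetical helper: monthly cost in dollars of provisioned st1 storage.
# $1 = provisioned size in GB
# $2 = price per GB-month (optional; defaults to the 0.045 quoted above)
st1_monthly_cost() {
  awk -v gb="$1" -v rate="${2:-0.045}" 'BEGIN { printf "%.2f\n", gb * rate }'
}
```

For the 500 GB minimum, st1_monthly_cost 500 prints 22.50.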




Single-Node install : change pref to beefy server

Multi-node: less-beefy head node

 

Other Notes:

 

Pricing and resource table:

http://www.ec2instances.info/

 

ECU Vs. vCPU:

each vCPU is a hyperthread of an Intel Xeon core

1 ECU is the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor

 

ADD: placement group

 

Paravirtualization: faster because it is not fully virtualized, but it has a major drawback: you need a region-specific kernel object for each Linux instance. So just stick with HVM for ease of deployment. HVM performance has also improved.

 

spreadsheet: removed anything that was not EBS optimized. removed GPU nodes, remove micro, nano, small instances

don't care about gpu support

remove < 4 GB memory

 

Linux grow filesystem:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html#recognize-expanded-volume-linux



 

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
