Partek Flow Documentation



Create 


Log in to your AWS management console. The login URL may depend on your organization, but it is likely https://console.aws.amazon.com

Click on EC2

Switch to the region where you will deploy Flow. For this tutorial, we will be using US East (N. Virginia)

On the left menu, click on Instances, then click the Launch Instance button. You will be brought to the first step, denoted "Choose an Amazon Machine Image (AMI)"

Click the Select button next to "Ubuntu Server 16.04 LTS (HVM), SSD Volume Type - ami-f4cc1de2"

For the next step "Choose an Instance Type" the selection depends on your budget and the size of the Flow deployment. We recommend m4.large for testing or cluster front-end operation, m4.xlarge for standard deployments, and m4.2xlarge for alignment-heavy workloads with a large user-base. See (AWS instance type resources and costs) for assistance with choosing the right instance. In most cases, the instance type and associated resources can be changed after deployment, so one is not locked in to the choices made for this step.  

For "Configure Instance Details", make the following selections:

Number of instances = 1. You do not need to create an autoscaling group for single-node deployments.

Purchasing option: Leave "Request Spot instances" unchecked. Spot instances are only relevant for minimizing the cost of Flow cluster deployments.

Network: If you do not have a VPC already created for Flow, click "Create new VPC"

Use the following settings for the VPC:

 Name tag: Flow-VPC

IPv4 CIDR block: 10.0.0.0/16

Select "No IPv6 CIDR Block"

Tenancy: Default

 

1) DHCP options set (ec2.internal is the domain name for us-east-1):

domain-name = ec2.internal; domain-name-servers = AmazonProvidedDNS;

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html

2) DNS resolution: defaults to yes

3) DNS hostnames: defaults to no. Set this to yes; it is needed for internal DNS resolution.


Create subnet:

Name tag: Flow-Testing-Subnet

VPC: the one you just made

Availability zone: us-east-1d

IPv4 CIDR block: 10.0.1.0/24
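
As a quick sanity check on the addressing above, the following Python sketch uses the standard ipaddress module to confirm that the subnet block falls inside the VPC block and to count its addresses. The figure of five reserved addresses per subnet comes from AWS's VPC documentation rather than from this guide, and the variable names are illustrative only.

```python
import ipaddress

# CIDR blocks taken from the VPC and subnet settings above.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnet = ipaddress.ip_network("10.0.1.0/24")

# The subnet must be fully contained in the VPC block.
subnet_fits = subnet.subnet_of(vpc)

# Assumption (from AWS VPC documentation, not this guide): AWS reserves
# 5 addresses in every subnet, so fewer are usable than the block holds.
AWS_RESERVED_PER_SUBNET = 5
usable_hosts = subnet.num_addresses - AWS_RESERVED_PER_SUBNET

print(f"subnet inside VPC: {subnet_fits}")
print(f"addresses in subnet: {subnet.num_addresses}, usable: {usable_hosts}")
```

A /24 is far more than a single Flow server needs, but it leaves room for worker nodes if the deployment later grows into a cluster.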


Create internet gateway:

Name tag = Flow-Testing

Attach to VPC: select the VPC created above


Go to route tables:

select the one associated with your VPC

Add another route: Destination = 0.0.0.0/0, Target = the internet gateway created above


Auto-assign public ip: use subnet setting (Disable)

Placement group (none)


IAM role: create a new role

Role name = Flow-Testing

Role type: select Amazon EC2

Leave the policies alone


----

shutdown behavior Stop

enable termination protection

do not need cloudwatch

make sure EBS optimized is on

Tenancy = shared

leave network interfaces alone

leave advanced details alone


Add storage:

Root: /dev/sda1, -snap-, 8 GB, volume type = Magnetic, delete on termination, not encrypted.

Add another volume:

Type = EBS, device = /dev/sdb, no snapshot, size = 500 GB (the minimum for st1), volume type = Throughput Optimized HDD, throughput = 20 / 123 (determined by the volume size and cannot be changed; the baseline is 40 MB/s per TiB), do not delete on termination, not encrypted.
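
The "20 / 123" throughput figures follow from the volume size. The sketch below is one way to reproduce them, assuming the console size is in GiB and that the second figure is the burst rate. The baseline of 40 MB/s per TiB is stated above; the burst rate of 250 MB/s per TiB is an assumption taken from AWS's st1 documentation, and the rounding-up behavior is inferred from the numbers rather than documented here.

```python
import math

GIB_PER_TIB = 1024

def st1_throughput_mb_s(size_gib, rate_per_tib):
    """Throughput in MB/s for an st1 volume of size_gib at a per-TiB rate."""
    return size_gib / GIB_PER_TIB * rate_per_tib

BASELINE_PER_TIB = 40   # MB/s per TiB, as stated above
BURST_PER_TIB = 250     # MB/s per TiB; assumption from AWS docs, not this guide

baseline = st1_throughput_mb_s(500, BASELINE_PER_TIB)
burst = st1_throughput_mb_s(500, BURST_PER_TIB)

# Rounding each value up gives the pair of figures shown in the console.
print(math.ceil(baseline), "/", math.ceil(burst))
```

Because both rates scale with size, provisioning a larger volume is the only way to raise st1 throughput.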


no tags


create new security group

name = Flow-Testing

desc = Default Flow SG for testing

1) SSH (defaults), source = My IP (for example, 97.84.41.194/32)

2) add rule, custom, port range = 8080, source anywhere (0.0.0.0/0, ::/0)
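
The two rules above can be written down as data, which makes it easy to check what is exposed. This is a minimal sketch: the dictionaries follow the shape of the EC2 API's IpPermissions structure, but nothing is sent to AWS here. The helper names and the admin address are hypothetical placeholders, and the IPv6 "anywhere" range is left out for brevity.

```python
def ingress_rule(port, cidr, description):
    """Build one TCP ingress rule in the shape of the EC2 API's IpPermissions."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr, "Description": description}],
    }

ADMIN_CIDR = "203.0.113.10/32"  # hypothetical placeholder for your own IP

rules = [
    ingress_rule(22, ADMIN_CIDR, "SSH from admin workstation"),
    ingress_rule(8080, "0.0.0.0/0", "Flow web interface"),
]

def open_to_world(rule):
    """True if any IPv4 range in the rule allows every address."""
    return any(r["CidrIp"] == "0.0.0.0/0" for r in rule["IpRanges"])

exposed_ports = [r["FromPort"] for r in rules if open_to_world(r)]
print("ports open to anywhere:", exposed_ports)
```

Only the Flow web port should appear in the output; SSH stays restricted to a single /32 address.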


Boot from…

Continue with Magnetic


Create a new key pair named Flow-Testing and download it. Restrict its permissions with chmod 600.


The instance now boots. Go to Instances and give it a name: “Flow Test Server”


Allocate a new Elastic IP, scope = VPC

Associate the new address: resource type = instance, pick “Flow Test Server”, select the one private IP available, and check Reassociation


------------------------------------


If you cannot connect, go to Services, VPC, Your VPCs and check the settings.

(This is solved by the subnet and route settings above.)


Set the frontend domain name (Cluster only)


https://console.aws.amazon.com/route53/

get started under DNS management

hosted zones, create hosted zone

domain name = flowcluster

type = private for VPC

set to the VPC created above


hosted zones => create record set

name = frontend, type = A - IPv4 address

value = 10.0.1.116 (the private IP of the Flow instance)

routing policy simple
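
The record name and hosted zone together form the hostname that is later written to /etc/hostname and passed to Flow in CATALINA_OPTS, so the two must agree. The following sketch, using only the Python standard library, composes that name and checks that the record value lies inside the subnet created earlier. The constant names are illustrative, and the IP is the example value from this guide; yours will differ.

```python
import ipaddress

ZONE = "flowcluster"        # private hosted zone from the steps above
RECORD = "frontend"         # record set name
PRIVATE_IP = "10.0.1.116"   # example private IP from this guide
SUBNET = "10.0.1.0/24"      # subnet created earlier

# Fully qualified name that the A record resolves inside the VPC.
fqdn = f"{RECORD}.{ZONE}"

# The record must point at an address that actually lives in the subnet.
ip_in_subnet = ipaddress.ip_address(PRIVATE_IP) in ipaddress.ip_network(SUBNET)

status = "inside subnet" if ip_in_subnet else "OUTSIDE subnet"
print(f"{fqdn} -> {PRIVATE_IP} ({status})")
```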


Connect to instance:

chmod the key

ssh -i ~/aws-keys/Flow-Testing.pem ubuntu@awstest.partek.com

sudo su

Set the contents of /etc/hostname to frontend.flowcluster, then reboot.


http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html


sudo su

mkfs -t ext4 /dev/xvdb

test and move:

mount -t ext4 /dev/xvdb /mnt/

rsync -avr /home/ /mnt/

umount /mnt/


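vi /etc/fstab and add a line for the new volume (use the UUID reported by blkid /dev/xvdb in place of the one shown):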

UUID=24311e67-c70a-4b4c-9d2a-e9c016dbce29 /home   ext4    defaults,nofail        0       2

mount -a

logout then login
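
An /etc/fstab line has six whitespace-separated fields: device, mount point, filesystem type, options, dump flag, and fsck pass. As an illustration, the sketch below assembles the entry used above from its parts; the function name and defaults are hypothetical and simply mirror the values in this guide.

```python
def fstab_entry(uuid, mount_point, fs_type="ext4",
                options="defaults,nofail", dump=0, fsck_pass=2):
    """Return one /etc/fstab line made of the six standard fields."""
    fields = [f"UUID={uuid}", mount_point, fs_type, options,
              str(dump), str(fsck_pass)]
    return "\t".join(fields)

# UUID copied from the example above; substitute the one for your own volume.
line = fstab_entry("24311e67-c70a-4b4c-9d2a-e9c016dbce29", "/home")
print(line)

# Splitting on whitespace recovers the six fields.
parsed = line.split()
```

The nofail option lets the instance finish booting even if the volume is missing, and a pass value of 2 means the filesystem is checked after the root filesystem.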


Install Flow (we assume you have a license). We keep everything in /home/ubuntu, so use the zip install:


Install the following packages beforehand:

sudo apt-get update

sudo apt-get install  python perl make gcc g++ zlib1g libbz2-1.0 libstdc++6 libgcc1 libncurses5 libsqlite3-0 libfreetype6 libpng12-0 zip unzip libgomp1 libxrender1 libxtst6 libxi6 debconf


Now Flow:

Exit the root shell to return to the ubuntu user, and change to the home directory.


wget --content-disposition packages.partek.com/linux/flow-release

unzip PartekFlow-LINUX-6.0.17.0319.zip


Edit ~/.bashrc (vi ~/.bashrc) and add the following at the end:

export CATALINA_OPTS="-DflowDispatcher.flow.command.hostname=frontend.flowcluster -DflowDispatcher.akka.remote.netty.tcp.hostname=frontend.flowcluster"


source ~/.bashrc
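
Both properties in CATALINA_OPTS carry the same hostname, and it should match the name set in /etc/hostname and in Route 53. A small sketch of generating the export line from a single hostname value follows; the function name is hypothetical, and the two property names are copied from the line above.

```python
def catalina_opts(hostname):
    """Build the CATALINA_OPTS value with both hostname properties set alike."""
    properties = [
        "flowDispatcher.flow.command.hostname",
        "flowDispatcher.akka.remote.netty.tcp.hostname",
    ]
    return " ".join(f"-D{name}={hostname}" for name in properties)

opts = catalina_opts("frontend.flowcluster")

# Line to append to ~/.bashrc.
print(f'export CATALINA_OPTS="{opts}"')
```

Deriving both properties from one value avoids the two hostnames drifting apart if the frontend name is ever changed.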


./partek_flow/start_flow.sh

Go to: http://awstest.partek.com:8080/

Enter license key

Set up admin account

Leave library file directory where it is and check that free space matches what you expect (~500 GB)



+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Background info and rationale for choices:

Candidate nodes are listed from cheapest to most expensive. Costs are on-demand ("dynamic") prices; see reserved pricing for lower rates.

Want network speed to be 1 Gb/s or greater for multi-node setups.

Want HVM

No EBS optimized surcharge

Want placement group support

No T-class instances, as we do not want to slow responsiveness

We don’t use instance store since all data is lost after instance stop. Too risky.


EBS-only options:

Type         Mem        Cores     EBS throughput   Netw rate     Monthly cost
m4.large     8.0 GB     2 vCPUs   56.25 MB/s       M             $78.840
r4.large     15.25 GB   2 vCPUs   50 MB/s          H (10G int)   $97.09
*m4.xlarge   16.0 GB    4 vCPUs   93.75 MB/s       H             $156.950
r4.xlarge    30.5 GB    4 vCPUs   100 MB/s         H             $194.180
*m4.2xlarge  32.0 GB    8 vCPUs   125 MB/s         H             $314.630
r4.2xlarge   61.0 GB    8 vCPUs   200 MB/s         H (10G int)   $388.360


Single server recommendation: m4.2xlarge

Cluster head node recommendation: m4.xlarge

A “test” server with worst case hardware: m4.large
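
The recommendations above can be read as "the cheapest candidate that meets a memory and core minimum". The sketch below encodes the EBS-only table and applies that rule. The thresholds in the two calls are illustrative assumptions chosen to match the recommendations, not requirements published for Flow.

```python
# (type, memory_gb, vcpus, ebs_mb_s, monthly_usd) copied from the table above.
CANDIDATES = [
    ("m4.large", 8.0, 2, 56.25, 78.84),
    ("r4.large", 15.25, 2, 50.0, 97.09),
    ("m4.xlarge", 16.0, 4, 93.75, 156.95),
    ("r4.xlarge", 30.5, 4, 100.0, 194.18),
    ("m4.2xlarge", 32.0, 8, 125.0, 314.63),
    ("r4.2xlarge", 61.0, 8, 200.0, 388.36),
]

def cheapest(min_memory_gb, min_vcpus):
    """Return the lowest-cost candidate meeting both minimums, or None."""
    fits = [c for c in CANDIDATES
            if c[1] >= min_memory_gb and c[2] >= min_vcpus]
    return min(fits, key=lambda c: c[4]) if fits else None

# Hypothetical thresholds for a cluster head node and a single server.
print("head node:", cheapest(16, 4)[0])
print("single server:", cheapest(32, 8)[0])
```

Raising the memory threshold instead of the core count steers the choice toward the r4 family, which offers roughly twice the memory per vCPU.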


Performance may suffer if one chooses smaller nodes.

One can always shut down and change the instance type if a particular choice is insufficient. EBS volumes can be grown or have their performance changed.


Network speed (netw rate) US-EAST-1 internal and external.

L = 50Mb/s

M = 300Mb/s

H = 1Gb/s.
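
The network tiers are given in bits per second, while the EBS throughput column uses bytes per second, so a conversion helps when comparing them. The sketch below converts each tier and estimates an ideal transfer time. The 100 GB dataset size is a hypothetical example, and protocol overhead is ignored.

```python
BITS_PER_BYTE = 8

# Network tiers from the list above, in megabits per second.
TIERS_MBIT = {"L": 50, "M": 300, "H": 1000}

def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / BITS_PER_BYTE

def transfer_seconds(size_gb, mbit_per_s):
    """Ideal time to move size_gb (decimal GB), ignoring protocol overhead."""
    return size_gb * 1000 / mbit_to_mbyte(mbit_per_s)

for tier, rate in TIERS_MBIT.items():
    minutes = transfer_seconds(100, rate) / 60
    print(f"{tier}: {mbit_to_mbyte(rate):g} MB/s, 100 GB in about {minutes:.0f} min")
```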


See network benchmarks: http://epamcloud.blogspot.com.br/2013/03/testing-amazon-ec2-network-speed.html


EBS types:

Use Throughput Optimized HDD (st1) for Flow data.

Max throughput 500 MiB/s

$0.045 per GB-month of provisioned storage

500 GB min provisioned storage so min cost = $22.5 per month for a 500 GB drive.
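
The minimum monthly cost above is simply size times price. A short sketch of the same arithmetic for other volume sizes follows, using only the price and minimum size quoted above; the function name and the larger sizes are illustrative.

```python
ST1_PRICE_PER_GB_MONTH = 0.045   # USD, from the figures above
ST1_MIN_SIZE_GB = 500            # minimum provisioned size, from the figures above

def st1_monthly_cost(size_gb):
    """Monthly cost in USD of an st1 volume of the given provisioned size."""
    if size_gb < ST1_MIN_SIZE_GB:
        raise ValueError(f"st1 volumes must be at least {ST1_MIN_SIZE_GB} GB")
    return size_gb * ST1_PRICE_PER_GB_MONTH

for size in (500, 1000, 2000):
    print(f"{size} GB: ${st1_monthly_cost(size):.2f} per month")
```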




Single-Node install : change pref to beefy server

Multi-node: less-beefy head node


Other Notes:


Pricing and resource table:

http://www.ec2instances.info/


ECU Vs. vCPU:

each vCPU is a hyperthread of an Intel Xeon core

1 ECU is the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor


ADD: placement group


Paravirtualization: faster because it is not fully virtualized, but it has a major drawback: you need a region-specific kernel object for each Linux instance. So just stick with HVM for ease of deployment. HVM performance has improved.


spreadsheet: removed anything that was not EBS optimized. removed GPU nodes, remove micro, nano, small instances

don't care about gpu support

remove < 4 GB memory


Linux: grow the filesystem:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html#recognize-expanded-volume-linux



 

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
