Create
Log in to your AWS management console. The login URL may depend on your organization, but it is likely https://console.aws.amazon.com
Click on EC2
Switch to the region where you will deploy Flow. For this tutorial, we will be using US East (N. Virginia)
On the left menu, click on Instances, then click the Launch Instance button. You will be brought to the first step, denoted "Choose an Amazon Machine Image (AMI)"
Click the Select button next to "Ubuntu Server 16.04 LTS (HVM), SSD Volume Type - ami-f4cc1de2"
For the next step "Choose an Instance Type" the selection depends on your budget and the size of the Flow deployment. We recommend m4.large for testing or cluster front-end operation, m4.xlarge for standard deployments, and m4.2xlarge for alignment-heavy workloads with a large user-base. See (AWS instance type resources and costs) for assistance with choosing the right instance. In most cases, the instance type and associated resources can be changed after deployment, so one is not locked in to the choices made for this step.
For "Configure Instance Details", make the following selections:
Number of instances = 1. You do not need to create an autoscaling group for single-node deployments.
Purchasing option: Leave "Request Spot instances" unchecked. Spot instances are relevant only for minimizing the cost of Flow cluster deployments.
Network: If you do not have a VPC already created for Flow, click "Create new VPC". This will open a new browser tab for VPC management.
Use the following settings for the VPC:
Name tag: Flow-VPC
IPv4 CIDR block: 10.0.0.0/16
Select "No IPv6 CIDR Block"
Tenancy: Default
Click "Yes, Create". You may be asked to select a DHCP option set. If so, then make sure the DHCP option set you select or create has the following options:
Options: domain-name = ec2.internal;domain-name-servers = AmazonProvidedDNS;
DNS resolution: leave the defaults set to yes
DNS hostname: change this to yes, as you may need internal DNS resolution depending on the Flow deployment.
Once created, the new Flow-VPC will appear in the list of available VPCs. The VPC needs additional configuration for external access. To continue, right click on Flow-VPC and select "Edit DNS Resolution", then select "Yes" and then Save. Next, right click the Flow-VPC and select "Edit DNS Hostnames", select Yes, then Save.
Make sure the DHCP option set is set to the one created above. If it is not, right click on the Flow-VPC, select "Edit DHCP option sets", and select the correct DHCP option set.
Now, close the VPC management tab and go back to the EC2 management console. Click the refresh arrow next to "Create new VPC" and select the Flow-VPC VPC.
Next, click "Create new subnet" and a new browser tab will open with a list of existing subnets. Click "Create Subnet" and set the following options:
Name tag: Flow-Subnet
VPC : Flow-VPC
VPC CIDRs: This should be automatically populated with the information from Flow-VPC
Availability Zone: It is OK to let Amazon choose for you if you do not have a preference
IPv4 CIDR block: 10.0.1.0/24
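As a quick sanity check on the addressing used above, the subnet block (10.0.1.0/24) has to fall inside the VPC block (10.0.0.0/16). The following is a minimal bash sketch of that check; the helper names ip_to_int and cidr_within are hypothetical and not part of any AWS tool.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if CIDR block $1 lies entirely inside CIDR block $2.
cidr_within() {
  local inner_ip=${1%/*} inner_len=${1#*/}
  local outer_ip=${2%/*} outer_len=${2#*/}
  # A shorter (less specific) prefix cannot fit inside a longer one.
  (( inner_len >= outer_len )) || return 1
  local mask=$(( (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
  # Both blocks must share the same network bits under the outer mask.
  (( ( $(ip_to_int "$inner_ip") & mask ) == ( $(ip_to_int "$outer_ip") & mask ) ))
}

if cidr_within 10.0.1.0/24 10.0.0.0/16; then
  echo "Flow-Subnet fits inside Flow-VPC"
else
  echo "Flow-Subnet is outside Flow-VPC"
fi
```

If you choose different address ranges for your own VPC, the same check applies with your values substituted.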
Stay on the VPC dashboard tab and on the left navigation menu, click "Internet gateways", then click "Create Internet Gateway" and use the following options:
Name tag = Flow-IGW, then click "Yes, create".
The new gateway will be displayed as "detached". Right click on the Flow-IGW gateway and select "Attach to VPC", then select Flow-VPC and click "Yes, Attach"
Click on "Route Tables" on the left navigation menu.
If it exists, select the route table already associated with Flow-VPC. If not, make a new route table and associate it with Flow-VPC. Click on the new route table, then click the "Routes" tab toward the bottom of the page. The route Destination = 10.0.0.0/16 Target = local should already be present. Click Edit, then Click "Add another route" and set the following parameters:
Destination 0.0.0.0/0
Target set to Flow-IGW (the internet gateway just created)
Click "Save"
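For readers who prefer scripting, the console steps for the VPC, subnet, internet gateway, and route can be approximated with the AWS CLI. The sketch below is a dry run that only prints the commands. The resource IDs are hypothetical placeholders (real IDs are returned by the create calls), and Name tags and the DHCP option set are left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder IDs; substitute the IDs returned by the create-* calls.
VPC_ID="vpc-PLACEHOLDER"
IGW_ID="igw-PLACEHOLDER"
RTB_ID="rtb-PLACEHOLDER"

# Print a command instead of executing it (dry run).
show() { printf '%s\n' "$*"; }

network_plan() {
  show aws ec2 create-vpc --cidr-block 10.0.0.0/16
  show aws ec2 modify-vpc-attribute --vpc-id "$VPC_ID" --enable-dns-support Value=true
  show aws ec2 modify-vpc-attribute --vpc-id "$VPC_ID" --enable-dns-hostnames Value=true
  show aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 10.0.1.0/24
  show aws ec2 create-internet-gateway
  show aws ec2 attach-internet-gateway --internet-gateway-id "$IGW_ID" --vpc-id "$VPC_ID"
  # Default route: send all non-local traffic through the internet gateway.
  show aws ec2 create-route --route-table-id "$RTB_ID" --destination-cidr-block 0.0.0.0/0 --gateway-id "$IGW_ID"
}

network_plan
```

Printing the plan first makes it easy to compare each call against the console settings before running anything against a real account.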
Close the VPC Dashboard browser tab and go back to the EC2 Management Console tab. We are still on "Step 3: Configure Instance Details." Click the refresh arrow next to "Create new subnet" and select Flow-Subnet.
Auto-assign public ip: Use subnet setting (Disable)
Placement group: No placement group
IAM role : None. NOTE: For multi-node Flow deployments or instances where you would like Partek to manage AWS resources on your behalf, please see (Partek AWS support) and set up an IAM role as needed.
Shutdown behavior: Stop
Enable termination protection: select "Protect against accidental termination"
Monitoring: leave "Enable CloudWatch detailed monitoring" disabled
EBS-optimized instance: Make sure "Launch as EBS-optimized instance" is enabled and non-selectable. Given the recommended choice of an m4 instance type, EBS optimization should be enabled at no extra cost.
Tenancy: Shared - Run a shared hardware instance
Network interfaces: leave as-is
Advanced details: leave as-is
Click "Next: Add storage". You should be on the "Step 4: Add Storage"
For the existing root volume, set the following options:
Size: 8GB
Volume type: Magnetic
Select "Delete on termination"
Click "Add New Volume" and set the following options:
Volume Type: EBS
Device: /dev/sdb (take the default)
Do not define a snapshot
Size (GiB): 500. This is the minimum for st1 volumes (see: Notes about EBS volumes)
Volume Type: Throughput optimized HDD (ST1)
Do not delete on terminate or encrypt
Click "Next: Add Tags"
You do not need to define any tags for this new EC2 instance, but you can if you would like.
Click "Next: Configure Security Group"
For "Assign a security group" select "Create a new security group"
Security group name: Flow-SG
Description: Security group for Flow server
Add the following rules:
1) SSH set Source to My IP (or the address range of your company or institution)
2) Click Add Rule, Type is Custom TCP Rule, Port Range = 8080, Source is anywhere (0.0.0.0/0, ::/0). It is recommended to restrict this to just those that need access to Flow.
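The two rules above can also be expressed with the AWS CLI. The sketch below is a dry run that prints shell-quoted commands rather than executing them. The group and VPC IDs are hypothetical placeholders, and the 203.0.113.x ranges are documentation-only example addresses standing in for your own network; the sketch restricts port 8080 rather than opening it to 0.0.0.0/0, following the recommendation above.

```shell
#!/usr/bin/env bash
# Print a command, shell-quoted, instead of executing it (dry run).
show() { printf '%q ' "$@"; printf '\n'; }

# $1: security group ID (placeholder here; returned by create-security-group)
# $2: CIDR allowed to use SSH
# $3: CIDR allowed to reach Flow on port 8080
sg_plan() {
  local sg_id=$1 ssh_cidr=$2 flow_cidr=$3
  show aws ec2 create-security-group --group-name Flow-SG \
    --description "Security group for Flow server" --vpc-id vpc-PLACEHOLDER
  show aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol tcp --port 22 --cidr "$ssh_cidr"
  show aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol tcp --port 8080 --cidr "$flow_cidr"
}

sg_plan sg-PLACEHOLDER 203.0.113.10/32 203.0.113.0/24
```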
Click "Review and Launch"
AWS will suggest this server not be booted from a magnetic volume. Since there is not a lot of IO on the root partition and reboots will be rare, choosing "Continue with Magnetic" will reduce costs. Choosing an SSD volume will not provide substantial benefit, but it is OK if one wishes to use an SSD volume.
Click Launch
Create a new key pair: name it Flow-Key, download it, then run chmod 600 on the downloaded key so it can be used. Back up this key, as you may lose access to the Flow instance without it.
The new instance will now boot. Use the left navigation bar and click on Instances. Click the pencil icon and assign the instance the name “Flow Server”
Next, we will need to make the server accessible at a fixed address. To do this, click on "Elastic IPs" on the left.
Click "Allocate new address"
Assign Scope to VPC
Click Allocate
On the table containing the newly allocated elastic IP, right click and select "Associate Address"
For Instance select the instance name “Flow Server”
For Private IP, select the one private IP available, then click Associate
For the remaining steps, we refer to the elastic ip as "elastic.ip"
SSH to the new Flow Server instance:
chmod 600 Flow-Key.pem
ssh -i Flow-Key.pem ubuntu@elastic.ip
Attach, format, and move the ubuntu home directory into the large ST1 EBS volume. All Flow data will live in this volume.
sudo su
mkfs -t ext4 /dev/xvdb (Under Volumes in the EC2 console, inspect "Attachment information". It will likely list this volume as attached to /dev/sdb. Replace "s" with "xv" to find the device name to use for this mkfs command)
Make a note of the newly created UUID for this volume
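The device-name substitution described above (replace "s" with "xv") can be written as a small helper. This is a sketch that assumes the xvd naming seen on this instance type; the function name console_to_xen_device is hypothetical.

```shell
#!/usr/bin/env bash
# Map the device name shown in the EC2 console (/dev/sdX) to the name the
# instance sees (/dev/xvdX). Any other name is returned unchanged.
console_to_xen_device() {
  case "$1" in
    /dev/sd*) echo "/dev/xvd${1#/dev/sd}" ;;
    *)        echo "$1" ;;
  esac
}

console_to_xen_device /dev/sdb

# On the instance itself, the UUID can be read back at any time (as root):
#   blkid -s UUID -o value /dev/xvdb
```

Reading the UUID back with blkid avoids having to copy it from the mkfs output by hand.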
Copy the ubuntu home directory onto the EBS volume using a temporary mount point:
mount -t ext4 /dev/xvdb /mnt/
rsync -avr /home/ /mnt/
umount /mnt/
Make the EBS volume mount at system boot:
Add the following to /etc/fstab
UUID=(the UUID from the mkfs command above) /home ext4 defaults,nofail 0 2
mount -a
Log out, then log back in
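The /etc/fstab line described above can be generated rather than typed by hand, which avoids copy errors in the UUID. A minimal sketch follows; the function name fstab_entry is hypothetical and the UUID shown is a made-up example value.

```shell
#!/usr/bin/env bash
# Build the fstab line for an ext4 volume identified by UUID.
# $1: filesystem UUID, $2: mount point (defaults to /home)
fstab_entry() {
  local uuid=$1 mount_point=${2:-/home}
  printf 'UUID=%s %s ext4 defaults,nofail 0 2\n' "$uuid" "$mount_point"
}

# Example with a made-up UUID:
fstab_entry 0a1b2c3d-1111-2222-3333-444455556666

# On the instance, as root, the real entry could be appended with:
#   fstab_entry "$(blkid -s UUID -o value /dev/xvdb)" >> /etc/fstab
# and the result checked after "mount -a" with:
#   findmnt /home
```

The nofail option lets the instance finish booting even if the volume is missing, which keeps SSH access available for troubleshooting.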
Install Flow (we assume you have a license). Because we keep everything in /home/ubuntu, use the zip install:
Install the following packages beforehand:
sudo apt-get update
sudo apt-get install python perl make gcc g++ zlib1g libbz2-1.0 libstdc++6 libgcc1 libncurses5 libsqlite3-0 libfreetype6 libpng12-0 zip unzip libgomp1 libxrender1 libxtst6 libxi6 debconf
Now Flow:
Exit back to the ubuntu user and change to the home directory
wget --content-disposition packages.partek.com/linux/flow-release
unzip PartekFlow-LINUX-6.0.17.0319.zip
Edit ~/.bashrc (for example, vi ~/.bashrc) and add the following at the end:
export CATALINA_OPTS="-DflowDispatcher.flow.command.hostname=frontend.flowcluster -DflowDispatcher.akka.remote.netty.tcp.hostname=frontend.flowcluster"
source ~/.bashrc
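If you script this step instead of editing the file by hand, it helps to append the export line only when it is not already present, so that re-running the script does not duplicate it. The following sketch demonstrates the idea on a temporary file; the helper name append_once is hypothetical, and the export line is the one given above.

```shell
#!/usr/bin/env bash
# Append a line to a file only if the exact line is not already there.
append_once() {
  local line=$1 file=$2
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

flow_opts='export CATALINA_OPTS="-DflowDispatcher.flow.command.hostname=frontend.flowcluster -DflowDispatcher.akka.remote.netty.tcp.hostname=frontend.flowcluster"'

# Demonstrate on a scratch file; on the server the target would be ~/.bashrc.
demo_rc=$(mktemp)
append_once "$flow_opts" "$demo_rc"
append_once "$flow_opts" "$demo_rc"   # second call is a no-op
grep -c 'CATALINA_OPTS' "$demo_rc"
rm -f "$demo_rc"
```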
./partek_flow/start_flow.sh
In a browser, go to http://awstest.partek.com:8080/ (substitute the address of your own server, for example http://elastic.ip:8080/)
Enter license key
Set up admin account
Leave library file directory where it is and check that free space matches what you expect (~500 GB)
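The free-space check can also be done from the SSH session. The sketch below reports free space in whole GiB for the filesystem holding a given path; the function names and the 400 GiB threshold are illustrative assumptions (a 500 GB volume shows somewhat less than 500 once formatted).

```shell
#!/usr/bin/env bash
# Print the free space, in whole GiB, of the filesystem that holds path $1.
free_gib() {
  # df -P gives POSIX-format output; -k reports sizes in 1024-byte blocks.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Return 0 if the filesystem holding $1 has at least $2 GiB free.
has_free_gib() {
  [ "$(free_gib "$1")" -ge "$2" ]
}

if has_free_gib "$HOME" 400; then
  echo "free space looks as expected"
else
  echo "less free space than expected; check that the EBS volume is mounted on /home"
fi
```

A much smaller number than expected usually means the home directory is still on the 8 GB root volume.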
Notes about EBS volumes:
Size: 500 GiB (this is the minimum for st1 volumes). Volume type: Throughput Optimized HDD. Throughput is shown as 20 / 123 and cannot be changed; the baseline is 40 MB/s per TiB. Do not select delete on termination or encryption.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html
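The throughput figures in the note above scale with volume size. As a worked example, the sketch below computes throughput for a 500 GiB volume from the 40 MB/s per TiB baseline stated above. The 250 MB/s per TiB burst rate is an assumption taken from AWS's published st1 characteristics, not from this tutorial, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Throughput in MB/s for a volume of $1 GiB at a rate of $2 MB/s per TiB.
st1_throughput() {
  awk -v gib="$1" -v per_tib="$2" 'BEGIN { printf "%.1f\n", gib / 1024 * per_tib }'
}

st1_throughput 500 40    # baseline rate from the note above
st1_throughput 500 250   # assumed burst rate
```

The results, about 19.5 and 122.1 MB/s, are consistent with the "20 / 123" shown in the console if those figures are baseline and burst values rounded up. Doubling the volume size doubles both numbers.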
Support and multinode:
IAM Role:
role name = Flow-Testing
role type : add
Amazon EC2
leave policies alone
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Background info and rationale for choices:
Candidate nodes, cheapest to most expensive. Cost is the on-demand cost (see reserved pricing).
Want network speed to be 1 Gb/s or greater for multi-node setups.
Want HVM
No EBS optimized surcharge
Want placement group support
No T class servers as don’t want to slow responsiveness
We don’t use instance store since all data is lost after instance stop. Too risky.
EBS-only options:
Type | Mem | Cores | EBS throughput | Network rate | Monthly cost |
---|---|---|---|---|---|
m4.large | 8.0 GB | 2 vCPUs | 56.25 MB/s | M | $78.84 |
r4.large | 15.25 GB | 2 vCPUs | 50 MB/s | H (10G int) | $97.09 |
*m4.xlarge | 16.0 GB | 4 vCPUs | 93.75 MB/s | H | $156.95 |
r4.xlarge | 30.5 GB | 4 vCPUs | 100 MB/s | H | $194.18 |
*m4.2xlarge | 32.0 GB | 8 vCPUs | 125 MB/s | H | $314.63 |
r4.2xlarge | 61.0 GB | 8 vCPUs | 200 MB/s | H (10G int) | $388.36 |
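To compare these monthly figures with hourly price lists, they can be converted back to an hourly rate. The sketch below assumes a 730-hour month (8760 hours per year divided by 12); that convention is an assumption, not something stated in the table, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Implied hourly rate in dollars for a monthly cost, assuming 730 hours per month.
hourly_rate() {
  awk -v monthly="$1" 'BEGIN { printf "%.3f\n", monthly / 730 }'
}

hourly_rate 78.84    # m4.large
hourly_rate 156.95   # m4.xlarge
hourly_rate 314.63   # m4.2xlarge
```

Under that assumption the three m4 rows work out to roughly $0.108, $0.215, and $0.431 per hour. Prices change over time, so check current AWS pricing before budgeting.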
Single server recommendation: m4.2xlarge
Cluster head node recommendation: m4.xlarge
Some “test” server with worst-case hardware: m4.large
Performance may suffer if one chooses smaller nodes.
One can always shut down and change the instance type if a particular choice is insufficient. EBS volumes can be grown or their performance changed.
Network speed (netw rate) US-EAST-1 internal and external.
L = 50Mb/s
M = 300Mb/s
H = 1Gb/s.
See network benchmarks: http://epamcloud.blogspot.com.br/2013/03/testing-amazon-ec2-network-speed.html
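The network classes above are given in megabits per second, while the EBS throughput column in the table is in megabytes per second. Dividing by 8 puts both on the same scale. A small sketch, assuming 1 Gb/s = 1000 Mb/s and using a hypothetical function name:

```shell
#!/usr/bin/env bash
# Convert a rate in megabits per second to megabytes per second.
mbit_to_mbyte() {
  awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit / 8 }'
}

mbit_to_mbyte 50     # L
mbit_to_mbyte 300    # M
mbit_to_mbyte 1000   # H
```

On this scale the H class corresponds to 125 MB/s, the same figure the table lists as the EBS throughput of m4.2xlarge.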
EBS types:
Use Throughput Optimized HDD (st1) for Flow data.
Max throughput 500 MiB/s
$0.045 per GB-month of provisioned storage
500 GB min provisioned storage so min cost = $22.5 per month for a 500 GB drive.
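The storage cost arithmetic above generalizes to other volume sizes. The sketch below uses the $0.045 per GB-month rate quoted above as its default; the function name is hypothetical, and the rate should be replaced with current pricing for your region.

```shell
#!/usr/bin/env bash
# Monthly cost in dollars for $1 GB of provisioned st1 storage.
# $2: price per GB-month (defaults to the rate quoted above)
st1_monthly_cost() {
  awk -v gb="$1" -v rate="${2:-0.045}" 'BEGIN { printf "%.2f\n", gb * rate }'
}

st1_monthly_cost 500    # the minimum-size volume used in this tutorial
st1_monthly_cost 1000   # a volume twice that size
```

Because st1 is billed on provisioned size rather than used size, growing the volume later raises the cost in direct proportion.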
Single-Node install : change pref to beefy server
Multi-node: less-beefy head node
Other Notes:
Pricing and resource table:
ECU Vs. vCPU:
each vCPU is a hyperthread of an Intel Xeon core
1 ECU is the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
ADD: placement group
paravirtualization: faster because not fully virtualized, but major drawback: You need a region-specific kernel object for each Linux instance. So just stick with hvm for ease of deployment. Hvm has increased in performance.
spreadsheet: removed anything that was not EBS optimized. removed GPU nodes, remove micro, nano, small instances
don't care about gpu support
remove < 4 GB memory
Linux grow filesystem:
Additional Assistance
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.