...
Attach, format, and move the ubuntu home directory into the large ST1 EBS volume. All Flow data will live in this volume. Consult the AWS EC2 documentation for further information about attaching EBS volumes to your instance.
sudo su
mkfs -t ext4 /dev/xvdb
(Under Volumes in the EC2 console, inspect "Attachment information". It will likely list this volume as attached to /dev/sdb. Replace "s" with "xv" to find the device name to use for this mkfs command.)
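The device-name translation described above can be sketched as a small shell helper. This is only a minimal sketch: the function name to_xen_device is hypothetical, and it assumes the instance exposes a volume attached as /dev/sdX under the name /dev/xvdX, as the note above describes. On other instance types the device may be named differently, so check with lsblk before formatting anything.

```shell
# Hypothetical helper: translate the attachment name shown in the EC2 console
# (e.g. /dev/sdb) into the device name seen inside the instance (/dev/xvdb).
to_xen_device() {
    printf '%s\n' "$1" | sed 's|^/dev/sd|/dev/xvd|'
}

to_xen_device /dev/sdb    # prints /dev/xvdb

# Confirm the device exists and is the empty volume before running mkfs, e.g.:
#   lsblk
#   mkfs -t ext4 "$(to_xen_device /dev/sdb)"
```

The mkfs call is left commented out on purpose, since formatting the wrong device destroys its data.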
...
With newer EC2 instance types, it is possible to change the instance type of an already deployed Flow EC2 server. We recommend doing several rounds of benchmarks with production-sized workloads and evaluating whether the resources allocated to your Flow server are sufficient. You may find that reducing the resources allocated to the Flow server brings significant cost savings, but it can also cause UI responsiveness and job run-times to reach unacceptable levels. Once you have found an instance type that works, you may wish to use "reserved instance" pricing, which is significantly cheaper than dynamic instance pricing. Reserved instances come with 1 or 3 year usage terms. Please see the EC2 Reserved Instance Marketplace to sell or purchase existing reserved instances at reduced rates.
The network performance of the EC2 instance type becomes an important factor if your primary usage of Flow is for alignment. For this use case, one will have to move copious amounts of data back (input fastq files) and forth (output bam files) between the Flow server and the end users, so it is important to have what AWS refers to as "High" network performance, which in most cases is around 1 Gb/s. If the focus is primarily on downstream analysis and visualization (e.g. the primary input files are ADAT), then network performance is less of a concern.
We recommend HVM virtualization, as we have not seen any performance impact from using it and non-HVM instance types can come with significant deployment barriers.
Make sure your instance is "EBS optimized" by default and that you are not charged a surcharge for EBS optimization.
"T-class" servers, although cheap, may slow responsiveness for the Flow server and generally do not provide sufficient resources.
We do not recommend placing any data on "instance store" volumes, since all data on those volumes is lost after an instance stops. This is too risky, as there are cases where user tasks can take up unexpected amounts of memory, forcing a server stop/reboot.
...
Latest (April 2017) EC2 instance costs:
The latest pricing and EC2 resource offerings can be found at http://www.ec2instances.info/
Instance Type | Memory | Cores | EBS throughput | Network Performance | Monthly cost |
---|---|---|---|---|---|
m4.large | 8.0 GB | 2 vCPUs | 56.25 MB/s | Medium | $78.84 |
r4.large | 15.25 GB | 2 vCPUs | 50 MB/s | High (+10G interface) | $97.09 |
*m4.xlarge | 16.0 GB | 4 vCPUs | 93.75 MB/s | High | $156.95 |
r4.xlarge | 30.5 GB | 4 vCPUs | 100 MB/s | High | $194.18 |
*m4.2xlarge | 32.0 GB | 8 vCPUs | 125 MB/s | High | $314.63 |
r4.2xlarge | 61.0 GB | 8 vCPUs | 200 MB/s | High (+10G interface) | $388.36 |
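When weighing the rows above against a workload's needs, it can help to filter the table mechanically. The following is a minimal sketch, not part of the Flow tooling: the function names instance_table and cheapest_instance are hypothetical, and the figures are simply copied from the April 2017 table, so they will drift from current pricing.

```shell
# Rows copied from the table above: type, memory (GB), vCPUs, monthly cost (USD).
instance_table() {
    cat <<'EOF'
m4.large 8.0 2 78.84
r4.large 15.25 2 97.09
m4.xlarge 16.0 4 156.95
r4.xlarge 30.5 4 194.18
m4.2xlarge 32.0 8 314.63
r4.2xlarge 61.0 8 388.36
EOF
}

# Hypothetical helper: cheapest_instance MIN_MEMORY_GB MIN_VCPUS
# Prints the lowest-cost row that meets both minimums, or nothing if none does.
cheapest_instance() {
    instance_table | awk -v mem="$1" -v cpu="$2" '
        ($2 + 0) >= (mem + 0) && ($3 + 0) >= (cpu + 0) && (best == "" || ($4 + 0) < cost) {
            best = $1
            cost = $4 + 0
        }
        END { if (best != "") printf "%s $%.2f/month\n", best, cost }'
}

cheapest_instance 16 4    # prints: m4.xlarge $156.95/month
cheapest_instance 30 8    # prints: m4.2xlarge $314.63/month
```

Cost is only one criterion. Network performance and EBS throughput from the same table should be checked before settling on a type.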
Single server recommendation: m4.2xlarge
Cluster head node recommendation: m4.xlarge
Soma “test” server with worst case hardware: m4.large
Performance may suffer if one chooses smaller nodes.
One can always shut down and change the instance type if a particular choice is insufficient. EBS volumes can be grown or their performance changed.
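As a rough illustration of the shut-down-and-change procedure, the sketch below prints the AWS CLI calls one would typically run, without executing them. The helper name print_resize_plan and the instance ID are placeholders, and the sketch assumes an EBS-backed instance and a configured AWS CLI. Review the plan against the AWS documentation before running any of it.

```shell
# Hypothetical helper: print (do not run) the AWS CLI calls for changing the
# type of an EBS-backed instance. The instance must be stopped first.
print_resize_plan() {
    instance_id="$1"
    new_type="$2"
    echo "aws ec2 stop-instances --instance-ids $instance_id"
    echo "aws ec2 wait instance-stopped --instance-ids $instance_id"
    echo "aws ec2 modify-instance-attribute --instance-id $instance_id --instance-type Value=$new_type"
    echo "aws ec2 start-instances --instance-ids $instance_id"
}

# i-0123456789abcdef0 is a placeholder instance ID.
print_resize_plan i-0123456789abcdef0 m4.xlarge
```

Printing the plan first keeps the sketch safe to run anywhere. Stopping the instance interrupts any running Flow jobs, so schedule the change accordingly.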
Network speed:
Network performance values for US-EAST-1 (internal and external) correspond to: Low ~ 50 Mb/s, Medium ~ 300 Mb/s, High ~ 1 Gb/s.
See network benchmarks: http://epamcloud.blogspot.com.br/2013/03/testing-amazon-ec2-network-speed.html
EBS types:
Use throughput optimized HDD for flow data.
EBS volumes:
Volume type:
This is dependent on the type of workload. For most users, the Flow server tasks will be alignment-heavy, so we recommend a throughput optimized HDD (ST1) EBS volume, since most aligner operations are sequential in nature. For workloads that focus primarily on downstream analysis, a general purpose SSD volume will suffice, but the costs are greater. For those who focus on alignment or host several users, the storage requirements can be high. ST1 EBS volumes have the following characteristics:
Max throughput 500 MiB/s
$0.045 per GB-month of provisioned storage ... ($22.50 per month for a 500 GB ... of storage).
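The storage cost arithmetic above can be reproduced for other volume sizes with a short helper. This is a minimal sketch using the $0.045 per GB-month rate quoted above; the function name st1_monthly_cost is hypothetical, and the rate should be checked against current AWS pricing.

```shell
# Hypothetical helper: st1_monthly_cost SIZE_GB
# Multiplies the provisioned size by the $0.045 per GB-month rate quoted above.
st1_monthly_cost() {
    awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb * 0.045 }'
}

st1_monthly_cost 500     # prints 22.50
st1_monthly_cost 2000    # prints 90.00
```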
Notes about EBS volumes:
Size: 500 GB (this is the minimum for st1 volumes)
Volume type: Throughput optimized HDD
Throughput: 20 / 123 MB/s (can’t change); baseline is 40 MB/s per TiB
Delete on termination: no
Encryption: no
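The baseline figure follows from the 40 MB/s per TiB rate noted above. As a minimal sketch, assuming the volume size is given in GiB and 1 TiB = 1024 GiB (the helper name st1_baseline_mbps is hypothetical):

```shell
# Hypothetical helper: st1_baseline_mbps SIZE_GIB
# Baseline throughput scales linearly at 40 MB/s per TiB of provisioned storage.
st1_baseline_mbps() {
    awk -v gib="$1" 'BEGIN { printf "%.0f\n", gib / 1024 * 40 }'
}

st1_baseline_mbps 500     # prints 20
st1_baseline_mbps 2048    # prints 80
```

A 500 GiB volume works out to roughly 19.5 MB/s, which rounds to the 20 MB/s baseline listed above. Larger volumes get proportionally more baseline throughput.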
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html
Single-Node install : change pref to beefy server
Multi-node: less-beefy head node
Other Notes:
ECU Vs. vCPU:
each vCPU is a hyperthread of an Intel Xeon core
1 ECU is the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
ADD: placement group
Paravirtualization: faster because it is not fully virtualized, but it has a major drawback: you need a region-specific kernel object for each Linux instance. So just stick with HVM for ease of deployment. HVM performance has improved.
spreadsheet: removed anything that was not EBS optimized. removed GPU nodes, remove micro, nano, small instances
don't care about gpu support
remove < 4 GB memory
Linux grow filesystem:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html#recognize-expanded-volume-linux
Note that EBS volumes can be grown or have their performance characteristics changed. To minimize costs, start with a smaller EBS volume allocation of 0.5 - 2 TB, as most mature Flow installations generate roughly this amount of data. When necessary, the EBS volume and the underlying file system can be grown on-line (making ext4 a good choice). Shrinking is also possible, but may require the Flow server to be off-line.
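Growing a volume and its file system might look like the sequence below. This is a sketch under stated assumptions rather than a tested procedure: the helper name print_grow_plan and the volume ID are placeholders, and it assumes the ext4 file system was created directly on the device (no partition table), as in the mkfs step earlier on this page. The commands are printed, not executed.

```shell
# Hypothetical helper: print_grow_plan VOLUME_ID NEW_SIZE_GIB DEVICE
# Prints the calls for growing an EBS volume and then the ext4 file system on it.
print_grow_plan() {
    volume_id="$1"
    new_size_gib="$2"
    device="$3"
    # Request the larger size, then check the progress of the modification.
    echo "aws ec2 modify-volume --volume-id $volume_id --size $new_size_gib"
    echo "aws ec2 describe-volumes-modifications --volume-ids $volume_id"
    # Run on the Flow server itself once the new size is visible to the instance.
    echo "sudo resize2fs $device"
}

# vol-0123456789abcdef0 is a placeholder volume ID.
print_grow_plan vol-0123456789abcdef0 1000 /dev/xvdb
```

Taking an EBS snapshot before resizing is a sensible precaution. If the volume carries a partition table, the partition has to be grown before resize2fs, and that step is not shown here.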