OK, I'm bored on a plane so here goes. Disclaimer: these are my opinions, and best practices are W-I-P ;-)
1. IP connections between your instances can be protected by firewalls and tunneling VPNs, but DBs are almost always bottlenecks in systems and network latency would be a killer. Replicate and sync your data across internal and cloud locations and work with it locally.
2. The names and IPs of instances are not volatile per se, and can be more predictable with Elastic IPs remapping to new/recovered instances. The issue is how to do this under auto-scaling conditions?
3. Within the same account container, private addressing should cover it? Public is mostly for remote admin and mapping to Elastic IPs for front-end client interactions.
4. HAProxy on a small Linux instance is the most common solution for load balancing in AWS. Part of the scripting and start-up of your instances is to update grids, app clusters or load balancing groups.
5. You can't start an AMI at home, yet ;-) Availability zones are distinct, isolated locations within a region - logical DCs. If performance optimization is your lead goal, keep all instances in the same availability zone. For resilience, start services across multiple zones, possibly even across the NA and Euro regions now. What are you trying to achieve? Process an HPC job ASAP, run a 100% uptime website, or build a follow-the-sun active-active banking-type architecture?
6. Multicast. Hmm.
7. Account structures. Traditionally 3 containers (accounts) mirrored the enterprise view of 3 partitioned environments. You can use one account with security groups to logically partition and segment them. In RightScale, you can now have nested hierarchies of accounts and containers; they're working with AWS to iron out portability across boundaries (e.g. private AMIs).
8. Data encryption. Trust no-one and assume there's always some wise-ass snooping quietly around. No matter how good the defenses are, the burden is on the defender in any hostile situation. Assume you're doing business in Grand Central Station. Personally identifiable information, customer data and competitive knowledge are the big risks. Encrypt in transit (at minimum outside your container) and encrypt data at rest, both for confidentiality (can't read) and integrity (can't amend). Workloads on queues and transient data I wouldn't protect en masse; there would be too much of a performance hit. Database-level content encryption is a big quick win.
9. Keystore distribution. Would need to know more.
10. Target architecture model. I personally believe in staying away from proprietary APIs to get the maximum openness and portability, given the market is still new. Migrating traditionally architected apps still has value. Having said that, if you're building a new app (especially a transient one, for an event say), why not use what's available? Apps using SQS as the state machine work really well for massive scaling, from what I've seen architecturally. Simple web apps using name/value pair storage could use SimpleDB to avoid performance tuning, but could also use XML files?
Would I build a 5yr ROI enterprise app with them? No. But I also wouldn't use vendor library extensions in JEE app servers or SOA stacks, again because of the switching headache. As always, it's a productivity vs. portability trade-off.
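On point 4, the "update the load balancing group at start-up" step could look roughly like the sketch below. It is only an illustration in Python: the function name `render_haproxy_backend`, the backend name, port 8080 and the sample private IPs are all made up, and how you obtain the list of live instances (and reload HAProxy afterwards) is deliberately left out.

```python
from typing import Iterable


def render_haproxy_backend(name: str, private_ips: Iterable[str], port: int = 8080) -> str:
    """Render an HAProxy 'backend' section with one 'server' line per instance.

    Hypothetical helper: the caller supplies the private IPs of the
    instances that are currently running.
    """
    lines = [f"backend {name}", "    balance roundrobin"]
    # De-duplicate and sort so the generated section is stable between runs.
    for index, ip in enumerate(sorted(set(private_ips)), start=1):
        # 'check' turns on HAProxy health checks for this server.
        lines.append(f"    server {name}{index} {ip}:{port} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample (made-up) private addresses of two app instances.
    print(render_haproxy_backend("app", ["10.0.0.12", "10.0.0.7"]))
```

A start-up script would write this section into the HAProxy config and trigger a reload whenever instances come or go. Using the private addresses here follows point 3.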
HTH,
Simon Plant
-----Original Message-----
From: Ricky Ho <rickyphyllis@yahoo.com>
Date: Thu, 18 Dec 2008 10:45:20
To: <cloud-computing@googlegroups.com>
Subject: [ Cloud Computing ] Designing "Cloud-aware" Apps
Love to see the active participation in this group.
I want to see if someone can share their best practices in developing
cloud-aware applications or migrating existing apps to the cloud. What
are the architectural considerations that are specific to the cloud?
Which parts of your existing application need to be changed in order to run in
the cloud? What are the outstanding issues? ...
I have encountered some of these issues in the context of Amazon's web
services. I'll list them below; let me know if you have encountered
similar issues and how you addressed them ...
1) Network Configuration Changes
Let's say that before the change, your AppServer talks directly to the database within
your intranet. Now, after you move the AppServer to the cloud, how does it talk
back to your database? What kind of network configuration changes do
you need to make (firewall, VPN, etc.)? What security considerations
do you need to go through?
2) Endpoint discovery
Most distributed applications are written to look up their peers' endpoints from a
configuration file which contains the node name, and then make a DNS lookup for
the IP address. However, both the machine name and IP address of an EC2 instance
are volatile in the cloud. How do the peers discover each other if
their name/IP address will change after a restart?
3) Two-addresses scenario
There is a "public" and a "private" address attached to an EC2 instance. Which
one should I be using to communicate with my peers? The discovery mechanism above needs to be
aware of the location of the asker to give appropriate answers.
4) Load balancer setup
Since Amazon has no specific support for load balancing, do you use your
in-house load balancer or run a software load balancer on a dedicated EC2
instance? In both cases, how do
you notify the load balancer of the new members after you spawn more EC2
instances to deal with increased load?
5) VM Placement
Where should an EC2 instance be started? Within your data center (private cloud), or in which
availability zone in the case of a public cloud? What are the cost
considerations, given that the charges for communication across different types of
boundaries will be different? And what
are the fault resiliency considerations?
6) No Multicast
Amazon doesn't route IP multicast traffic, so applications using multicast
sockets won't work. How do you work
around this problem?
7) Developer Account management
Do you use one AWS account for your company, or does each app have its own AWS
account?
8) Data encryption
What kind of data do you need to encrypt when it is stored inside the cloud? Is the cloud secure enough to pass PCI
compliance?
9) Keystore Distribution
Crypto algorithms typically require your keystore file. How do you securely distribute your keystore
to the running EC2 instance?
10) Eventual Consistency Model
Do you change your application to store data in S3 and SimpleDB? Where is the line drawn between a traditional
ACID consistency model and the much more relaxed eventual consistency model? Do you need to put more integrity checking
logic in your application to compensate for the loss of guarantees at the DB level?
I'd love to hear about your experience in dealing with the above
issues, and let me know if you have encountered other issues that I haven't
covered.
Rgds, Ricky
http://horicky.blogspot.com
This group's posts are licensed under a Creative Commons Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/