Thoughts on Software Development

Saturday, April 20, 2013

Currently, it seems that most business activity in the cloud is focused on Infrastructure as a Service (IAAS) rather than Platform as a Service (PAAS).

Despite the obvious advantages of PAAS over IAAS, very few enterprises would see economic benefit in redesigning their applications for a PAAS platform. IAAS provides a means to allow them to use their already virtualized environments on someone else’s hardware. Ultimately, cloud computing is about economics. Even those moving to PAAS, for the most part, are treating it as a hosting platform. They are looking for a cheaper way to host, which as I have argued elsewhere is usually not the case.

In both cases, what people are doing is outsourcing the datacenter.

To get the true benefits of cloud computing, whether through IAAS or PAAS, you have to design explicitly for that platform. I have discussed this in many past blog posts. You have to design explicitly for failure, and you have to recognize that you are building a distributed system. Over the long term this approach will bring sustained success.

IAAS as currently applied is a dying niche. Of course any evolutionary extinction takes time. The long time span allows it to be unnoticed, or even if noticed, most people feel it will not affect them. Of course sometimes dynamic changes occur. Ask the dinosaurs, or even the railroads or minicomputer companies. See how the companies that compose the Dow Jones averages have changed over the past 100, 50, 30, or even 10 years.

Companies continue to do things the way they have always done until a successful company creates a new fad in corporate philosophy. Companies then run to become blind followers. One classic example is Deming’s work on modern manufacturing. Although he lived in the US, it was not until the Japanese had success with it that US industry started to adopt his ideas.

Someday, somewhere, somebody is going to build a widely successful company using PAAS. Then, everyone, like lemmings, will run to emulate them. The irony is that some will run to do it even if it makes no economic sense for them.

4/20/2013 11:38:27 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Sunday, March 17, 2013

Public and Private Cloud are Contranyms

People get confused and try to reconcile what "cloud" means in the terms public cloud and private cloud. To me they are contranyms. Those are words that are spelled the same but have two entirely different meanings.

For example, the word "fast". It can mean to move rapidly. The runner ran fast. It can also mean to be held tightly. The mast held fast to the ship during the storm. Another example is "clip". He clipped the papers together. He clipped the coupons from the newspaper.

The world cloud is being used in two different ways. In public cloud, the term cloud refers to utility computing which is effectively massively scalable. In private cloud, the term cloud refers to a virtualized computing environment which is not massively scalable.

3/17/2013 9:47:10 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | All | Cloud Computing | SOA

Wednesday, July 25, 2012

Client Side Failure Handling

In order to build robust cloud applications, a client that calls a service has to handle four scenarios:

Complete Success
Partial Success (Success with conditions)
Transient Failure
Resource Failure

A partial success occurs when a service only accomplishes part of a requested task. This might be a query where you ask for the last 100 transactions, and only the last 50 are returned. Or the service only creates an order entry, but does not submit the order. Usually a reason is supplied with the partial success. Based on that reason the client has to decide what to do next.

Transient failures occur when some resource (like a network connection) is temporarily unavailable. You might see this as a timeout, or some sort of error information indicating what occurred. As discussed in a previous post , continually retrying to connect to a transient resource impedes scalability because resources a being held on to while the retries are occurring. Better to retry a few times and if you cannot access the resource, treat it as a complete failure.

With failure, you might try another strategy before you treat the resource access as a failure. You might relax some conditions, and then achieve partial success. You might access another resource that might be able to accomplish the same task (say obtain a credit rating) albeit at greater cost. In any case all failures should be reported to the client. You can summarize this responsibility in this diagram:

Bill Wilder helped me formulate these thoughts.

Friday, June 08, 2012

Cloud Outages Do Occur

We have been talking about cloud failures. How likely are they?

Some outages occur all the time. Even with the excellent reliability of hardware, a cloud data center has an enormous number of components. Something is always failing. Cloud data centers are built to detect failure, and move applications to working hardware, and restart them. They have failure zones so that instances of the same service are kept on different sets of hardware.

What about other kinds of outages?

Amazon and Windows Azure have had interruptions in service, some longer than others. There could be major power outages such as in the Northeast United States in 1965 that left people without power for up to 12 hours. In 2003 there was a major power outage in the Northeastern and North central US as well as Ontario, Canada. The Japanese tsunami had a similar effect. Many smaller outages occur after storms. Even if a data center remains, external Internet connections might be interrupted. In 2009 Google had a major outage in Asia caused by a configuration error that caused problems even in the United States and Europe. The problem was analogous to the cause of the 1965 Northeast US power failure.

As we have discussed in previous posts, any software or service that you depend on is a possible source of an outrage, including the Internet itself. You don’t even need an outage; all you need is for a service to become less responsive. Remember when Michael Jackson died? It was difficult to get to any web site, because there was not enough bandwidth to accommodate everybody’s surfing to find out the news. It was the largest self-inflicted denial of service attack ever.

Are these outages really rare?

As we found out with the financial sector, black swan events can happen. Random events do occur. Small probability events happen. Kahneman and Tversky demonstrated that people reason about probability poorly. While this can be acceptable behavior with regard to personal decisions, it is very questionable when it comes to estimating the probabilities of rare engineering events

You cannot assume that any connection to a distributed service will always be available.

Netflix’s continual availability during the April 2011 Amazon outage is now legendary. The reason for their success was because they assumed failure was possible. They had stateless services. They restricted the use of relational data to where it was really necessary so they could switch to a hot standby. They degraded gracefully, only keeping alive services that were really necessary. You might not have been able to get your personalized movie list, but you could still find and play movies. They had enough excess capacity to deal with transient failures, and shifting loads.

Assume the rare can occur, because it will.

6/8/2012 3:55:55 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Thursday, May 24, 2012

Avoiding Single Points of Failure

To avoid a single point of failure you needed to have redundancy.

There are several power lines between a power station and a customer. When an overload is detected by a circuit breaker, power can be rerouted along another line. Each individual line is normally run at reduced capacity so it can handle the increased load.

Redundancy, however, introduces problems. Business applications have state. One electron is as good as another and can be easily rerouted; application state is tied to particular users, sessions, or transactions. I need to access parts of my current order. If the network connection I am on gets broken, substituting someone else's order does not work. Parallel processing of mathematical algorithms (such as MapReduce, or multi-threaded versions of Linq) cannot solve this problem.

You need to remember information about your users, or what products are on order. Some of this data changes rarely if at all; some of this data changes dynamically. Some of this data is very valuable, some of it is not. In a non-cloud application you might use data base mirroring. But you pay a performance penalty for a database mirror because the transaction must commit on both the original and the mirror. Imagine the cost (latency and throughput) of trying to keep redundant databases in different geographic areas in sync. Hence, you must deal with state explicitly.

You have to reduce the parts of your application that handle state to a minimum. Loss of a component means the loss of application state that it holds.

This is classic advice for scaling systems, but is more critical in the cloud. In the past we created stateless middle tier components for easier scalability, and relied on the data tier to handle scale. You then rely on clusters or technology built into the database (transaction logs, etc.). While stateless components are still an excellent idea, we cannot always rely on the database to scale, and the possibility of database failure is a large risk.

The business layer and the domain layer should be stateless, and to the greatest extent possible the clients should hold whatever state they can. Stateless services can be added or removed easily to handle changes in demand or the failure of a service instance, since you have decoupled functionality. Try to build services that are as atomic as possible because that makes it easier to scale or recover from failure by using redundancy. Here, atomic does not mean small.

Suppose you put customer information in the same service as your catalog service. If the customer information service goes down, so does the catalog. You should be able to check in people to your flight even if you cannot give the prices for tomorrow's flights.

There are other techniques you can use. For example, if the bill paying service is not available, you can usually use a transactional queue, and just retry the payment until the service is available. That is one of the reasons why banks, for example, say that they need 24 hours for an electronic payment that could be processed immediately. The odds that the payment service will be down for 24 hours are miniscule.

Also don't forget infrastructure pieces as to where logs are stored. You should probably back them up frequently.

So what about where state has to be stored, say in the data tier? This is a complicated problem which I have discussed this before in both a blog post and a presentation . It basically boils down to the question: what is the acceptable level of data loss? As it turns out, in many applications you do not have to be absolutely consistent in all places, and you can relax consistency constraints to get scalability and reliability. Now you may think this is ridiculous, data can never be lost. But think about how business is actually done.

Airline reservation systems separate out the flight query database, from the transactional database where flights are booked. As a result, occasionally a flight or a price you thought was available is not. But if they did not do that, the performance of making reservations would be very poor. More business would be lost under strict consistency than under relaxed consistency constraints. Why does Amazon use an email system for notifying you of your book order? To scale the user interface and to avoid performance problems due to abandoned shopping carts. You have to ask the question: What is the cost of an apology for the data loss?

What kind of availability are your customers actually willing to pay for, as opposed to what they say they want? Given a choice, do they really want absolute consistency in all cases? If you used replication over a short period of time to another data center, and a data center was destroyed by a hurricane, and a certain amount of data was lost, how terrible would that be?

Consider grouping your components in units of failure. If you have components of type α, β, and γ, you could put all α on a single host, all β on a separate host, or all γ on a separate host. Or you could put groups of α, β, γ on separate hosts. The latter would cause a complete failure on a loss assuming affinity to a given machine. The former would only result in a failure of α functionality.

What do you do if your components fail? Do you have to reroute traffic to another data store, or data center? Do you need to add more instances of a component? You have to monitor your components and understand why they fail. Under certain circumstances, you can degrade performance. Caching might help in keeping your application running under failure.

Degrade gracefully and predictably. Know what you can live without.

5/24/2012 7:44:21 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Sunday, May 20, 2012

Designing for Failure

Designing for Failure has been around a lot longer than cloud computing. As we have discussed in several other blog posts, cloud computing, as opposed to hosting in the cloud, is about the ability to acquire or release computing resources as necessary. Acquiring more resources allows you to keep up with demand, or to compensate for a failed instance of a resource.

You must examine every source of dependency in your application: third party libraries, hardware, software interfaces between parts of your own application, TCPIP ports, DNS servers, message queues, database drivers, database size, latencies, to name just a few. These include third party services such as credit card processors, fraud detection services, and geocoding services.

You also have to examine your queries because small queries can become large overnight as you scale. This is why search providers limit the result set that they return. See what kind of joins your ORM is producing when it handles inheritance. Look at the number of objects coming back from a DCOM or RMI call.

Any one of these could fail, or cause latency. As we discussed in a previous post, any potential long latency has to be treated as a potential failure. You need to avoid single points of failure because they are potential bottlenecks or failure points.

Acquiring more resources costs more money. So every strategy is a tradeoff between keeping the application responding (available and scalable) and how much it costs. This is driven of course, by what customers are willingness to pay. Every strategy has to undergo a cost benefit analysis.

The more you approach 100% availability, however, the more the law of diminishing return sets in. I cannot here tell you which problems you should solve, and which you can safely ignore. It depends on your application and your customers. I can tell you that every component is a potential source of failure.

Avoid single points of failure. Accept the fact that you have to build a distributed system.

5/20/2012 9:14:43 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Sunday, May 13, 2012

Margin of Safety

You have to make sure your cloud application is not brittle. Make your components more resistant to failure. Bridges can withstand more traffic than their largest anticipated load. Since you can add and remove resources in a cloud computing environment, your margin of safety can expand or contract as your load expands or contracts. Nonetheless, adding and removing resources is not instantaneous. You have to make sure that your system can handle a "normal load".

How do you determine your margin of safety?

Look at every resource you use in the system: database sizes, bandwidth, virtual memory, CPU, network latencies, and the response times of your software and your third party components. See how they respond under various types of commands, reports, and queries over time. Because of the economic costs, and possible performance hits with handling failure, you want to ensure your application in its normal state of operations can handle the load. You might want to factor in some likely scenarios, for instance, and make the resources required larger than might be ordinarily needed. Make sure all errors are handled, even unlikely ones. Return clear error codes that indicate what the problem is to the best of your ability. When problems occur, you might degrade performance rather than eliminate functionality. Determine what functionality is essential and what is not. During the Amazon outage last year Netflix turned off personalized movie lists, but you could still get lists of movies and play them.

Make reasonable SLA promises to your customers. So the UI can scale properly, Amazon sends confirmation emails for book orders.

A chain is as strong as its weakest link. If your web front end has limited capacity, or you run out of TCP/IP ports, it does not matter how strong your database server is.

Use a Margin of Safety when determining the resources needed for your application.

5/13/2012 12:59:10 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Sunday, March 18, 2012

Transient Failures Don't Exist

In the simple example we have been discussing, the consequences of a failure appear immediately to the user. In a more complicated architecture there are many more tiers and many more dependencies. With more dependencies, more problems can possibly result from poor decisions on how to handle failure. Those dependencies include other applications in your own shop, third party libraries that you don't control, the internet, etc. For example, if your order queues fail, you cannot do orders. If your customer service app fails, you cannot retrieve member information. Unhandled failures propagate (like cracks) throughout your application.

Failures Cascade - a unhandled failure in one part of your system becomes a failure of your application.

In deciding on how to respond to failure we have to distinguish between two types of failure: transient failures and resource failures. Transient failures are not due to component failure, but a resource temporarily under load that cannot respond as fast as you had assumed. With resource failures you have to have an alternative strategy because a component is not available.

Transient failures occur for short periods of time. The typical response is to retry the operation after a short period of time. But questions still remain. How often do you retry? What is a short period of time? What do you with the data during the retry? On the other hand, remember that just as failures cascade, so do delays. While you are waiting or retrying, scarce resources are being used (threads, memory, TCPIP ports, database connections) that cannot be used for other requests.

Go back to our WCF example and look at the try/catch block. If we had to do a retry how would we change the logic? We will have to adopt a whole different strategy to handle failure, a retry loop inside the catch handler would not be enough because other failures could reoccur, and not every error would allow a retry. You have to design the entire routine around expecting failure, because as we discussed in our last post, failures happen.

Since slow responses usually come from resource bottlenecks, you have to treat them as failures if you are going to have reasonable availability. Which means that transient failure can soon look like resource failures. So what do you do? Retry for a limited amount time and then give up. In addition, never block on an I/O, timeout and assume failure.

From the point of view of architecture and design, there is really no such thing as a transient failure. If you have a transient failure, fail fast and treat it as a resource failure.

Friday, March 09, 2012

Failures Are a Fact of Software Life

Why is failure endemic to distributed systems?

In the past two blog posts we talked about a hypothetical ASP.NET application. Let's add a second tier to this app where we make a call to a web service. We will then have some version of the following code fragment which resembles something everybody has written:

ClientProxy client = new ClientProxy();
int result = client.Do(a, b, c);

What's wrong with this?

We have assumed that the call would succeed. Why would it not succeed? At the very minimum you could have a network timeout. You are assuming you have control over a resource that your really do not. The fundamental concept in designing for failure is to understand that any interface between two components can fail. So we rewrite the code as follows:

try
{
   ClientProxy client = new ClientProxy();
   int result = client.Do (a, b, c);
}
   catch (Exception ex)
{
   ????
}

But now what do you do in the exception handler?

In this simple example, how many times do you retry? When you give up do you cache the input, or do you make the user enter it over again? Suppose the service on the other side stopped working? What happens when the underlying hardware crashes, and your application has to be restarted. Where is the user data then?

What about total failure conditions? Do you "go to" out of the exception handler? Where do you go to?

You cannot program your way out of a failure condition in code that is based on the assumption that everything works properly. You have to architect and design for failure conditions from the start. The critical issue is how you respond to that failure.

Here is the fundamental principle of designing for failure:

Assume failure will occur. The question is how will the application respond to that failure. You cannot depend on the underlying infrastructure to achieve availability because it cannot make that guarantee.

Monday, February 27, 2012

Cloud Computing is Designing For Failure

Continuing out examination of hosting options from the last post, perhaps, the cheaper options have lower reliability. Here are the availability numbers for our providers:

Provider	Compute SLA (%)
Go Daddy	99.9
ORCS Web	99.9
Host Gator	99.9
Rackspace	Repair within one hour
Amazon	99.95
Azure	99.95

If Amazon and Azure are more expensive with a similar SLA, why use them?

Availability numbers do not tell the whole story.

The rate for cloud computing infrastructure is more expensive because it allows you to pay only for what you use. Not only does this allow you to use computing resources more economically, it allows you to design around outages. With a cloud computing infrastructure, you can reach, if you wish to pay for it, very close to 100% availability.

This gets to the essence of the matter. You are doing cloud computing when you are interested in one of two things:

Paying for only the computing resources you use so you do not have to buy enough hardware for peak scenarios that happen infrequently.
You want to achieve very high reliability, with almost no downtime. You have to design for failure.

We often refer to these goals as scalability and availability. Scalability is making sure your application can handle increase load with reasonable performance. Availability is making sure your application has reasonable performance for a reasonable amount of time. What is reasonable, depends on the economics of your business environment.

Technically they are different problems. Depending on your system load, you could have reasonable availability for the vast majority of the times your customer wants, and just be very unresponsive under very heavy loads. Or, you could handle very large loads for small periods of time, and be very unresponsive the remainder of the time.

But for most applications they are closely related. To have high scalability requires you not only to be able to acquire more computing resources, but it requires you to be able to detect and handle failures of existing computing resources. High availability requires the same thing.

Designing for failure is cloud computing.

Thursday, January 12, 2012

Hosting is Not Cloud Computing

Lincoln said that calling a tail a leg does not make a horse a five legged animal. Hosting an application in a cloud computing environment does not mean that you are doing cloud computing.

One of the reasons that people are attracted to cloud computing is because they do not have to host their own infrastructure or run a data center. But that is true of traditional hosting. Now the feature sets of a cloud computing environment may make you want to host there, but that is only an enlightened hosting decision.

Let's make this clear with an example.
Suppose I decide that I do not want to run my traditional ASP.NET application in my data center. I have a web front end, and a back-end database.

Potentially, I have to pay for:

   Hosting the application (pay for the machines virtual or otherwise that I need)
   Local File storage
   Database (relational or otherwise)
   Bandwidth (in and out of the data center)

Let's assume this translates to the following configuration:

   1 instance of a Web Site
   Virtual Machine or Equivalent Hardware
   One 1 GB SQL Server
   Inboud and Outbound Bandwidth Costs
   File System Costs

Researching some of the Service Providers I found the following costs for this configuration:

Provider	$/ Monthly Cost
Go Daddy	10
Orcs Web	69
Host Gator	70
Rackspace	117
Amazon	86 + Bring Your Own SQL Server
Windows Azure	102

None of these providers are traditional hosters (provide your own machines, and rent bandwidth, cooling, or electricity). Nor are ORCS Web's and Rackspace's offerings to build a colocated data center considered.

Virtualizing your data center is not considered here as cloud computing because you still have to build out to maximum capacity. Nontheless, it might be considered cloud computing from the point of view of your internal users if they can get resources elastically.

So we see there are cheaper options if you just want to host. Now if you want to some feature that one hosting company has that the other does not, say blob storage, you could use that in conjunction with your host (say Amazon or Azure blobs).

Hosting is hosting no matter where you do it. Hosting an application in a cloud computing environment does not mean you are doing cloud computing.

1/12/2012 11:19:47 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Wednesday, September 15, 2010

Imagineering for Data: What is Microsoft Dallas?

"Government, without popular information, or the means of acquiring it, is but a Prologue to a Farce or a Tragedy; or, perhaps both. Knowledge will forever govern ignorance." James Madison

What is it?

Control over information is a societal danger similar to control over economic resources or political power. Representative government will not survive without the information to help us create meaningful policies. Otherwise, advocates will too easily lead us to the conclusion they want us to support.

How does one get access to this data?

Right now, it is not easy to get access to authoritative data. If you have money you search for it, purchase it, or do the research to obtain it. Often, you have to negotiate licensing and payment terms. Why can’t we shop for data the same way we find food, clothing, shelter, or leisure activities? None of these activities requires extensive searches or complex legal negotiations.

Why can’t we have a marketplace for data?

Microsoft Dallas is a marketplace for data. It provides a standard way to purchase, license, and download data. Currently it is a CTP, and no doubt will undergo a name change, but the idea will not.

The data providers could be commercial or private. Right now, they range from government agencies such as NASA or the UN to private concerns such as Info USA and NAVTEQ. You can easily find out their reputations so you know how authoritative they are.

As a CTP there is no charge, but the product offering will have either transaction/query or subscription based pricing. Microsoft has promised “easy to understand licensing”.

What are the opportunities?

There is one billing relationship in the marketplace because Microsoft will handle the payment mechanisms. Content Providers will not have to bill individual users. They will not have to write a licensing agreement for each user. Large provider organizations can deal with businesses or individuals that in other circumstances would not have provided a reasonable economic return. Small data providers can offer their data where it would have previously been economically unfeasible. Content Users would then be able to easily find data that would have been difficult to find or otherwise unavailable. The licensing terms will be very clear, avoiding another potential legal headache. Small businesses can create new business opportunities.

The marketplace itself is scalable because it runs on Microsoft Azure.

For application developers, Dallas is about your imagination. What kind of business combinations can you imagine?

How do you access the data?

Dallas will use the standard OData API. Hence Dallas data can be used from Java, PHP, or on an IPhone. The data itself can be structured or unstructured.

An example of unstructured data is the Mars rover pictures. The Associated Press uses both structured and unstructured data. The news articles are just text, but there are relationships between various story categories.

Dallas can integrate with the Azure AppFabric Access Control Service.

Your imagination is the limit.

The standard API is very simple. The only real limit is your imagining the possibilities for combining data together.

What kind of combinations can you think of?

Sunday, July 11, 2010

Scaling Up vs. Scaling Out

Commodity hardware has gotten very cheap. Hence it often makes more economic sense to spread the load in the cloud over several cheap, commodity servers, rather than one large expensive server.

Microsoft's Azure data pricing makes this very clear. One Gigabyte of SQL Azure costs about $10 per month. Azure table storage costs $0.15 per GB per month.

The data transfer costs are the same for both. With Azure table storage you pay $0.01 for each 10,000 storage transactions.

To break even with the SQL Azure price you can get about 9,850,000 storage transactions per month. That is a lot of bandwidth!

Another way to look at the cost is to suppose you need only 2,600,000 storage transactions a month (1 a second assuming an equal time distribution over the day). That would cost you only $2.60. That means you could store almost 50 GB worth of data. To store 50 GB worth of data in SQL Azure would cost about $500 / month.

If you don't need the relational model, it is a lot cheaper to use table or blob storage.

Sunday, December 27, 2009

Why should you consider Cloud Computing? : Part 3

One way to approach the different architectural implications is to look at the various vendor offering and see how they match to the various application types.

You can divide the cloud vendors into four categories, although one vendor might have offerings in more than one category:

Platform as a Service providers
Software as a Service providers
Application as a Service providers
Cloud Appliance Vendors

The Platform as a Service providers attempt to provide a cloud operating system for users to build an application on. An operating system serves two basic functions: it abstracts the underlying hardware and manages the platform resources for each process or user. Google App Engine, Amazon EC2, Microsoft Azure, and Force.com are examples of platform providers.

The most restrictive platform is the Google App Engine because you program to the Google API which makes it difficult to port to another platform. One the other hand, precisely because you program to a specific API, Google can scale your application and provide recovery for a failed application.

At the other extreme is Amazon. Amazon gives you a virtual machine which with you can program directly against the native OS installed on the virtual machine. This freedom comes with a price. Since the Amazon virtual machine has no knowledge about the application you are running, it cannot provide recovery or scaling for you. You are responsible for doing that. You can use third party software, but that is just a means of fulfilling your responsibility.

Microsoft tries to achieve a balance between these two approaches. By using .NET you have a greater degree of portability than the Google API. You could move your application to an Amazon VM or even your own servers. By using metadata to describe your application to the cloud fabric, the Azure infrastructure can provide recovery and scalability.

The first architectural dimension (ignoring for a moment the relative econonmics which will be explored in another post) then is how much responsibility you want to take for scaling and recovery vs. the degrees of programming freedom you want to have. Of course the choice between the Google API and Microsoft Azure might come down to the skill set of your developers, but in my opinion, for any significant application, the architectural implications of the platform choice should be the more important factor.

Thursday, December 24, 2009

Cloud Computing is For Small Companies Too

ARCast.TV Special - Michael Stiefel on Cloud Computing is for Small Companies Too

Here is an interview that I did at last year's TechEd where I discuss why cloud computing is for small companies and divisions of large companies as well as the enterprise. In fact, there may be more immediate opportunites for cloud computing in small companies than in the enterprise.

Technorati Tags: ARCast,Architects,Architecture,Cloud Architecture,Cloud Computing,Cloud Patterns,Cloud Services,Windows Azure,Thought Leadership Leadership

ARCast.TV - Cloud Computing is for Small Companies Too"

12/24/2009 9:22:24 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Thursday, November 05, 2009

The Road to Azure

One of my clients, ITNAmerica has become a Microsoft case study for the idea of software + services. The idea behind software + services is that software should run where ever it makes sense: in the cloud, on the desktop, or on a mobile device, not just in a thin client such as a browser.

Latency, bandwidth limits, and the need for software to work if the connection to the cloud disappears makes this a logical approach. Anybody who has tried to get a cell phone signal should understand the issues about continual connectivity.

Curt Devlin, a Microsoft evangelist, demonstrates another reason why this approach makes sense. It makes the transition to a cloud provider such as Azure much simpler.

If you want some further ideas on how to take a software + services application to a cloud platform. Check out my recent ARCast on "Software + Services in the Cloud."

ARCast: Software as a Service in the Cloud

ARCast.TV Special - Michael Stiefel on Software as a Service in the Cloud

The Architecture Innovation Cafe presents my discussion of Software as a Service in the Cloud, I discuss how architecting and building a software as a service application requires solving a series of problems that are independent of a particular software platform. I focus on three areas of designing and building the application that you can leverage on new platforms such as Microsoft Azure.

Tags
ARCast, Architects, Architecture, Cloud Architecture, Cloud Computing, Cloud Patterns, Cloud Services, Software + Services, software as a service, Windows Azure

ARCast.TV Special - Michael Stiefel on Software as a Service in the Cloud

11/5/2009 11:05:17 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | All | Cloud Computing | SOA

Wednesday, November 04, 2009

Why should you consider Cloud Computing? : Part 2

The application delivery scenarios focus around software as a service. Software as a service applications fall into three varieties: pure service, and software + service, hosted application.

The hosted application scenario is similar to hosted application delivery. Examples are SalesForce, or Hosted Microsoft Exchange. People provide or buy an application that runs in the cloud. At the other extreme is the pure service play. Providers create web services (SOAP or REST based) that provide services used by other applications. Examples are credit card approvals, or certain loan applications. Applications written by third parties use this software to compose their applications in conjunction with their own software. Then there is the mixed play. Providers create both web applications and web services to be used by third parties. These applications consume the same web services that are available to others to build their own applications. This is often done to allow the provider to share the web services among various offerings, or because they need to boot strap the application marketplace. The need for rich clients does not necessarily disappear here. If applications (such as emergency services) have to run with loss of internet connectivity, stand alone apps may be necessary with synchronization software used when connectivity is re-established. Transactional queuing is not enough here because substantive work has to be done by the rich client app when connectivity is absent.

Internet scale is the last class of application. The first scaling factor is number of users. In order to achieve such scale you may to use cloud features such as tables (Google Big Table, Azure Tables, Amazon Simple DB) instead of or in addition to relational databases. Note that transactional guarantees are often impossible to make here. The second scaling factor is geographic distance. If your clients are geographically separated by enough distance, the latency caused by the speed of light in fiber optic cable actually matters. You may have to use the cloud features mentioned previously to achieve the responsiveness for writeable data because transactions, especially distributed transactions are not feasible to achieve scalability.

The next post in the series will start to discuss the architectural implications of these different types of applications.

11/4/2009 9:50:26 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | SOA | Software Development

Tuesday, October 06, 2009

Architect's Cafe Webcast

Here is a webcast on Software as a Service that I did for Microsoft's Architect's Cafe:

https://www112.livemeeting.com/cc/microsoft/view?id=M7GT9D

Tuesday, August 11, 2009

How Does Your Utility's SLA Compare with Windows Azure?

Microsoft has yet to release all the details of its Azure SLA, but it has said that you will have a 99.95 per cent up-time for compute and 99.9 per cent up-time for SQL Azure.

How does this compare with my electric utility?

With my latest electric bill, my local utility listed its 2008 average number of service interruptions per customer as 1.051, and the average number of minutes without power for a customer at 78.55 minutes. So my electric utility has an up-time of .9998. I guess they don't get 4 or 5 "9"s either.

I presume these numbers include outages due to winter storms, but I do not know what the utility regulators allow them to exclude. Microsoft, to my knowledge, has not stated whether the SLA percentages include planned downtime for upgrades.

How many outage minutes per year could we expect with Azure under the SLA? That comes to about 262.8 compute minutes per year, or about 4.36 hours. Of course when those outages occur matters, and whether they are concentrated in one or many interruptions.

For SQL Azure that SLA is on a per month basis. So for data you could loose access to it for 43.8 minutes per month.

Is 4 hours a long time? Could you live without data access for 45 minutes a month?

For Facebook probably, for emergency services you would need some sort of fallback just like they have backup generators now.

I wonder what a cloud computing brownout looks like?

Monday, August 10, 2009

Microsoft Cloud Security Survey

Want to provide some input to Microsoft?

Here is a blog entry that contains a link to a survey that would help them decide on priorities for providing cloud security guidance: http://blogs.msdn.com/jmeier/archive/2009/08/04/cloud-security-survey.aspx.

The post explains the rationale behind the survey.

Sunday, July 05, 2009

. NET Rocks Interview

I just did an interview on .NET rocks about cloud computing.

We covered a whole bunch of topics including:
                     what is cloud computing
                     comparing the various offering of Google, Force.com, Amazon, and Microsoft
                     the social and economic environment required for cloud computing
                     the implications for transactional computing and the relational model
                     the importance of price and SLA for Microsoft whose offerring is different from Amazon and Google
                     the need for rich clients even in the world of cloud computing.

Wednesday, June 24, 2009

Cloud Computing, Utility Computing and Data

One of the big advantages of cloud computing is its utility computing model. Customers can use as much compute power or as little as they want without paying for what they do not need. Normally, most data centers have to be built for peak demand, with the servers unused when they are not needed.

Utility computing is based on the electric utility model. While this comparison has a lot of merit, there is one particular part of the analogy that really does not work.

Data are not electrons.

If someone steals some of your electric power by diverting it, you can get replacement power. If one part of the country's electric demand exceeds its generating ability, it can get power from another part of the grid. One electron is as good as another.

Data has identity, latency, and relationships to other pieces of data.

If someone steals your data, another piece of data cannot take its place. if your data is stolen, or even delayed it, can aversely affect you. Depending on your resolution of the CAP Theorem dilemma, your replication strategy might leave you with a window of vulnerability for data loss.

Curiously, the argument has been made that the utlity computing model makes denial of service attacks unfeasible because the economics of trying to get enough bot driven computers to assualt a hugh data center is prohibitive. Sooner or later, somebody is going to try to get the servers of one data center to attack the servers of another data center. Hopefully, the software that monitors the transactions would realize that somebody is exceeding their credit limit.

Tuesday, June 23, 2009

My interview on .NET Rocks about Cloud Computing

It's time for me to be interviewed on .NET Rocks again!

Carl and Richard will interview me about Cloud Computing. The interview will be published on June 30 at http://www.dotnetrocks.com/.

Based on my previous show (and related DNR TV segments) it will be a lot of fun to do and to listen to.

Wednesday, May 27, 2009

Misconceptions about Cloud Computing

Many people have misconceptions about cloud computing. For example, applications do not have to be built so they are all in the cloud. You can put the application in the cloud (to handle parallel computation), and have the database in your enterprise. I was interviewed at TechEd about some of the misconceptions about computing in the cloud. Other misconceptions discussed include what size business is right for the cloud, the role of the browser, guaranteed connectivity, and cloud security.

Thursday, May 21, 2009

TechEd Tech Talk on Cloud Computing and Small Business

Here is my Tech Ed podcast about how small businesses and small business units can benefit from Cloud Computing: http://www.msteched.com/online/view.aspx?tid=a4377dcf-ed90-4872-8d45-ec5108be118e.

I cover some of the material discussed in yesterday's post, but there is also some new content.

Wednesday, May 20, 2009

Why would a Small Business Put an App in the Cloud?

Small or medium sized companies can have the advantages of being able to act as a big company while maintaining the advantages of being small.

A hosted solution has many advantages.

You no longer need the staff, or have to spend money on installing and upgrading software on your clients' machines. Your customers and clients can use your application anywhere, not just on their office computers. If you provide services as well as an application, third parties can easily use your solution as part of their offering. Sometimes these services can be used in your own applications such as portals, or future applications. Perhaps your customers can extend your application making it more valuable to them. Having your application in the cloud means that your intellectual property (your secret sauce) is better protected because it is not in the hands of your users.

All these arguments also apply to small business units within a large enterprise.

Nonetheless, small businesses very often do not have the financial ability to economically run, or even rent a significant hosted application solution beyond a small scale web application.

Cloud computing offers a way out of the dilemma.

Cloud computing offers businesses a utility model for computation. Host your application on a cloud platform and you pay only for what you use. With minimal initial investment, you can scale up or down as your customers use more or less of your application or services.

With many cloud vendors (Amazon being a major exception) you do not even know on what infrastructure your machine runs on. Scaling and failover happen in those environments with minimal work on the client's part.

Clearly the cost and reliability of the cloud provider is crucial. Google's most recent outage shows that this is not a unreasonable fear. Private IT centers also have had their outages, but they are not made public.

Microsoft, Amazon, Google and others are spending huge amounts of money to build cloud data centers. Clearly they see the opportunity.

Right now many large companies already have data centers that can offer cheaper compute power than the current generation of cloud providers. This will eventually change.

But right now, small companies, start-ups, and other similar organizations should think about cloud computing for their hardware infrastructure.

Sunday, January 25, 2009

Lap Around Windows Azure and the Azure Platform

I have uploaded the slides and code for my talk on Windows Azure at the Microsoft MSDN day in Boston on January 22.

The talk was a combination of slides from several PDC talks with some of my own additions. I went through the fundamental architecture of the Azure cloud operating system and the basic elements of the Azure cloud services (i.e. Identity, Workflow, Live, SQL Data Services).

I did two demos. The first was a simple thumbnail generator. It illustrated a simple, scalable architecture using web roles and worker roles that used the primitive Azure operating system storage for blobs and queues. It also demonstrated how you model and configure a cloud application. The second, using the SQL Data Services, demonstrated how to integrate a non-cloud application (on a desktop or server) with the cloud. The app used a variety of industry standard mechanisms (WS*, REST, HTTP Get) to create and query data.

1/25/2009 11:20:05 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | Cloud Computing | Microsoft .NET | SOA

Wednesday, October 29, 2008

Microsoft Cloud Computing: The Empire Strikes Back

At the PDC Microsoft announced its answer to Amazon and Google's cloud computing services.

This answer has two parts: the Azure platform and hosted applications. Unfortunately people confuse these two aspects of cloud computing although they do have some features in common.

The idea behind Azure is to have a hosted operating systems platform. Companies and individuals will be able to build applications that run on infrastructure inside one of Microsoft's data centers. Hosted services are applications that companies and individuals will use instead of running them on their own computers.

For example, a company wants to build a document approval system. It can outsource the infrastructure on which it runs by building the application on top of a cloud computing platform such as Azure. My web site and blog do not run on my own servers, I use a hosting company. That is an example of using a hosted application.

As people get more sophisticated about cloud computing we will see these two types as endpoints on a continuum. Right now as you start to think about cloud computing and where it makes sense, it is easier to treat these as distinct approaches.

The economics of outsourcing your computing infrastructure and certain applications is compelling as Nicholas Carr has argued.

Companies will be able to vary capacity as needed. They can focus scarce economic resources on building the software the organization needs, as opposed to the specialized skills needed to run computing infrastructure. Many small and mid-sized companies already using hosting companies to run their applications. The next logical step is for hosting on an operating system in the cloud.

Salesforce.com has already proven the viability of hosted CRM applications. If I am a small business and I need Microsoft Exchange, I have several choices. I can hire somebody who knows how to run an Exchange server. I can take one my already overburdened computer people and hope they can become expert enough on Exchange to run it without problems. Or I can outsource to a company that knows about Exchange, the appropriate patches, security issues, and how to get it to scale. The choice seems pretty clear to most businesses.

We are at the beginning of the cloud computing wave, and there are many legitimate concerns. What about service outages as Amazon and Salesforce.com have had that prevent us from accessing our critical applications and data? What about privacy issues? I have discussed the cloud privacy issue in a podcast. People are concerned about the ownership of information in the cloud.

All these are legitimate concerns. But we have faced these issues before. Think of the electric power industry. We produce and consume all kinds of products and services using electric power. Electric power is reliable enough that nobody produces their own power any more. Even survivalists still get their usual power from the grid.

This did not happen over night. Their were bitter arguments over the AC and DC standards for electric power transmission. Thomas Edison (the champion of DC power) built an alternating current electric chair for executing prisoners to demonstrate the "horrors" of Nikola Tesla's approach. There were bitter financial struggles between competing companies. Read Thomas Parke Hughes' classic work "Networks of power: Electrification in Western society 1880-1930". Yet in the end we have reliable electric power.

Large scale computing utilities could provide computation much more efficiently than individual business. Compare the energy and pollution efficiency of large scale electric utilities with individual automobiles.

Large companies with the ability to hire and retain infrastructure professionals might decide to build rather than outsource. Some companies may decide to do their own hosting for their own individual reasons.

You probably already have information in the cloud if you have ever used Amazon.com. You have already given plenty of information to banks, credit card companies, and other companies you have dealt with. This information surely already resides on a computer somewhere. Life is full of trust decisions that you make without realizing it.

Very few people grow their own food, sew their own clothes, build their own houses, or (even in these tenuous financial times) keep their money in their mattresses any more. We have learnt to trust in an economic system to provide these things. This too did not happen overnight.

I personally believe that Internet connectivity will never be 100% reliable, but how much reliability will be needed depends on the mission criticality of an application. That is why there will always be a role for rich clients and synchronization services.

Hosting companies will have to be large to have the financial stability to handle law suits and survive for the long term. We will have to develop the institutional and legal infrastructure to handle what happens to data and applications when a hosting company fails. We learned how to do this with bank failures and we will learn how to do this with hosting companies.

This could easily take 50 years with many false starts. People tend to overestimate what will happen in 5 years, and underestimate what will happen in 10-15 years.

Azure, the color Microsoft picked for the name of its platform, is the color of a bright, cloudless day. Interesting metaphor for a cloud computing platform. Is the future of clouds clear?

Monday, September 22, 2008

What is "Software + Services?

"Software + Services" is Microsoft's representation of what a large part of the future of computing is going to be. Microsoft, however, has not done a great job of explaining what "Software + Services" is.

Based on what I have read and heard, let me try to explain it as I see it.

The fundamental question that one has to ask is "Where does computation happen?"

The obvious answer to everyone today is: "Everywhere".

We compute on mobile devices, appliances, desktops and laptops, and remote computers. We communicate with text and voice.

Everybody understand this. The key question is: "Why?"

I think the answer is because "Hardware is cheap, and data is expensive to move."

The late Jim Gray did an analysis¹of the economics of distributed computing. His analysis came to two conclusions:

1. Put the computation near the data. Unless you have something that is very compute intensive, it is much cheaper to not move the data.
2. If you need data from multiple sites, push the processing closer to the data source by filtering the data early.

The assumption here is that telecommunication prices drop slower than Moore's Law. So far this has always been the case.

The natural conclusion is to do the computation where the data naturally resides. In other words: Do what makes sense. Some things will be in the cloud, some things will still be on the desktop. As long as Internet connectivity is not ubiquitous, and not always connected, you may have to cache data somewhere. Depending on the mission criticality of your application, a few seconds could be a long time.

As Ray Ozzie put it in his MIX Keynote, we live in a "World of small pieces loosely joined."

Software + Services means some things will be services in the cloud, others will be software as we know it today. That includes mobile devices and appliances that we are learning to love and hate, just as we have always done with traditional software.

1. MSR-TR-2003-24 "Distributed Computing Economics"

Tuesday, September 09, 2008

Developing with X509 Certificates

To further simplify the example, let us assume that the we want to use the certificate to encrypt a message from the client to the service. It is easy to apply what we discuss here to other scenarios.

As we discussed in the previous post, we need to generate two certificates, the root certificate that represents the Certificate Authority, and the certificate that represents the identity of the client or service. We will also create a Certificate Revocation List (CRL).

We will use a tool called makeCert to generate our certificates. Makecert, which ships with the .NET platform, allows you to build an X509 certificate that can be used for development and testing. It does three things:

Generates a public and private key
It associates the key pair with a name
It binds the name with the public key.

Many of the published examples use makecert to both create and install the certificate. We will do the installation in a separate step because this approach is closer to the use of real certificates. Separating the certificates also allows the certificates to be installed on many machines instead of just one. This makes distributing certificates to developer machines much easier.

First we will create the Root Certificate with the following command:

makecert -sv RootCATest.pvk -r -n "CN=RootCATest" RootCATest.cer

-n specifies the name for the root certificate authority. The convention is to prefix the name with "CN=" where CN stands for "Common Name"

-r indicates that the certificate will be a root certificate because it is self-signed.

-sv specifies the file that contains the private key. The private key will be used for signing certificates issued by this certificate authority. Makecert will ask you for a password to protect the private key in the file.

The file RootCATest.cer will just have the public key. It is in the Canonical Encoding Rules (CER) format. This is the file that will be installed on machines as the root of the trust chain.

Next we will create a certificate revocation list.

makecert -crl -n "CN=RootCATest" -r -sv RootCATest.pvk RootCATest.crl

-crl indicates we are creating a revocation list

-n is the name of the root certificate authority

-r indicates that this is the CRL for the root certificate, it is self-signed

-sv indicates the file that contains the private key

RootCATest.crl is the name of the CRL file.

At this point we could install the root certificate, but we will wait until we finish with the certificate we will use in our scenario. Here we need two files. We will need a CER file for the client machine so that we can install the public key associated with the service. Then we will create a PKCS12 format file that will be used to install the public and private key in the service.

The initial step is :

makecert -ic RootCATest.cer -iv RootCATest.pvk -n "CN=TempCert" -sv TempCert.pvk -pe -sky exchange TempCert.cer

-n specifies the name for the certificate

-sv specifies the file for the certificate. This must be unique for each certificate created. If you try to reuse a name, you will get an error message .

-iv specifies the name of the container file for the private key of the root certificate created in the first step.

-ic specifies the name of the root certificate file created in the first step

-sky specifies what kind of key we are creating. Using the exchange option enables the certificate to be used for signing and encrypting the message.

-pe specifies that the private key is exportable and is included with the certificate. For message security is this required because you need the corresponding private key.

The name of the CER file for the certificate is specified at the end of the command.

Now we need to create the PKCS12 file. We will use a the Software Publisher Certificate Test Tool to create a Software Publisher's Certificate. You use this format to create the PKCS12 file using the pvkimprt tool.

cert2spc TempCert.cer TempCert.spc

pvkimprt -pfx TempCert.spc TempCert.pvk

We now have four files:

RootCATest.cer

RootCATest.crl

TempCert.cer

TempCert.pvk

The next step is to install these on the appropriate machines. I could not get certmgr to work properly to do an automated install. The Winhttpcertcfg tool works for PKCS12 format files, but not CER format files. We will use the MMC snap-in for this.

Run the mmc snapin tool (type mmc in the Run menu). First we will open the Certificates snap-in. Choose: Add/Remove Snap-In.

Then Add the Certficate Snap-In.

When you add the snap-in, choose local computer account for the computer you want to install the certificate (usually the local one).

We want to install the root certificate on both the client and service machines in the Trusted Root Certificate Store.

Select that store, right mouse click and install both the RootCATest.cer and RootCATest.crl files. On the client side you want to install only the public key in the TempCert.cer file. On the service side only you want to install the PKCS12 format file (TempCert.pvk) which has the private key for the certificate. Install that in the Personal store. For private key installation you will have to provide the password for the PKCS12 file.

On the service side, we need to give the identity of the running process (NETWORK SERVICE) the rights to read the private key. We use two tools FindPrivateKey and cacls to do this. Run the following command:

for /F "delims=" %%i in ('FindPrivateKey.exe My LocalMachine -n "CN=TempITNCert" -a') do (cacls.exe "%%i" /E /G "NT AUTHORITY\NETWORK SERVICE":R)

Remember to delete these certificates when you are finished with them.

9/9/2008 8:07:45 PM (Eastern Standard Time, UTC-05:00) | Comments [6] | Microsoft .NET | SOA | Software Development

Sunday, August 24, 2008

X509 Certificates for Developers

Working with X509 certificates can be very frustrating for WCF developers.

This is the first of two posts. In this post I will explain just enough of the background for X509 certificates so that I can explain in the next post how to create and use certificates during .NET development with WCF. The second post is here.

I do not know any good books for a developer that explains how to use certificates. Even the excellent books on WCF just give you the certificates you need to get the sample code to work. They do not really explain to you why you are installing different certificates into different stores, or how to generate the certificates you need to get your software to work. Very often the examples run on one machine with the client and service sharing the same store. This is not a realistic scenario.

Obviously I cannot explain all about certificates in one blog post. I just wish to share some knowledge. Hopefully it will spare you some grief.

Here is the problem I want to solve.

Suppose you have a set of web services that is accessed by either an ASP.NET or rich client. The service requires the client application to use an X509 certificate to access the service. This could be to encrypt the data, to identify the client, to sign the data to avoid repudiation, or for a number of other reasons. How do you install the certificates on the client and service machines?

Certificate technology is based on asymmetric encryption.

In the encryption scenario, the client would use the public key of the service to encrypt the traffic. The service would use its private key to decrypt the message. In the identification scenario the service would use the public key of the client to identify a message signed with the client's private key.

One of the key issues is how you can be sure that the public key is associated with a given identity. Perhaps somebody substituted their key for the one you should be using. Perhaps somebody is hijacking calls to the service, or you made a mistake in the address of the service. A classic example of these types of vulnerabilities is the "man in the middle attack". Another key issue is that the private key cannot be read or modified by unauthorized parties.

Public Key Infrastructure (PKI) is the name for a technology that uses a certificate authority (CA) to bind the public key to an identity. This identity is unique to the certificate authority. X509 is a standard for implementing a PKI. An X509 certificate represents an association between an identity and a public key.

An X509 certificate is issued by a given Certificate Authority to represent its guarantee that a public key is associated with a particular identity. Depending on how much you trust the CA, and the amount of identity verification the CA did, would determine how much trust you have in the certificate. For example VeriSign issues different types of certificates depending on how much verification was done. Sometimes organizations will be their own certificate authorities and issues certificates because they want the maximum amount of control.

This relationship between a CA and its issued certificates is represented in the "chain of trust". Each X509 certificate is signed with the private key of the CA. In order to verify the chain of trust you need the CA's public key. If you are your own CA authority you can distribute the X509 certificate representing this "root certificate". Some browsers and operating systems install root certificates as part of their setup. So the manufacturer of the browser or operating system is part of the chain of trust.

The X509 standard also includes a certificate revocation list (CRL) which is a mechanism for checking whether a certificate has been revoked by the CA. The standard does not specify how often this checking is done. By default, Internet Explorer and Firefox do not check for certificate revocation. Certificates also contain an expiration date.

Another approach to trust is called "peer to peer" trust, or "web of trust". Given the difficulties of peer trust it is not practical for most Internet applications. It can, however, make development scenarios simpler. Your development environment, however, should mimic your deployment environment. Hence I do not recommend using peer to peer trust unless that is practical for your deployed solution.

There are various protocols for transmitting certificates. We will be interested in two of them.

The Canonical Encoding Rules (CER) protocol will be used to digitally transmit the public key of a given identity. The PKCS12 protocol will be used to transmit the public and private keys. The private key will be password protected.

The next post will describe the mechanisms for creating and installing certificates in a .NET development environment.

8/24/2008 9:02:20 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | Microsoft .NET | SOA | Software Development

Sunday, June 01, 2008

Software + Services Is For Small Companies Too

On Friday, June 6 of Microsoft's Tech-Ed I will be hosting a Birds of a Feather Session on the topic "Software + Services is For Small Companies Too". It will be held in Room S330 E at noon.

To continue the conversation, please add your comments and opinions to this blog post. If you are unable to attend feel free to add your thoughts as well here.

Here are some questions to get you started thinking about the topic:

What is Software + Services?

Are small companies afraid of software + services? Are they afraid of cloud computing? Why?

Doesn't cloud computing leverage the efforts of small companies? If cloud computing makes IT a commodity, doesn't this allow small companies to be even more nimble in their development efforts?

What are the real advantages that large companies have over small companies? What about the innovators dillemma? How do large companies keep their current customers happy and assure future growth through innovation? Doesn't this help small companies. Doesn't cloud computing help small companies innovate even more?

6/1/2008 9:47:05 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | Microsoft .NET | SOA | Software Development

Thursday, April 03, 2008

Workflow Services Using WCF and WF Uploaded

I have put my VSLive! talk, explaining how to use Windows Comunication Foundation and Windows Workflow Foundation together to create distributed applications in the Presentations section of my web site.

Thursday, March 06, 2008

SaaS Podcast

I did a short podcast for Consortio Services about Software as a Service as part of their weekly techcast.

I very briefly cover what SaaS is about and some of the critical issues facing organizations looking at delivering services using the SaaS model.

3/6/2008 12:42:48 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | All | SOA | Software Development

Tuesday, March 04, 2008

Speaking at VSLive! in San Francisco April 1-3.

I am going to be giving two talks and a workshop at VS Live! in San Francisco.

The first talk is an "Introduction to Windows Workflow Foundation" where I explain both the business reasons why Microsoft developed Workflow Foundation as well as the technical fundamentals. This talk will help you understand not only how to build workflows, but when it makes sense to do so and when to use some other technology.

The second is "Workflow Services Using WCF and WWF". WCF allows you to encapsulate business functionality into a service. Windows Workflow Foundation allows you to integrate these services into long running business processes. The latest version of the .NET Framework (3.5) makes it much easier to use these technologies together to build some very powerful business applications.

On Thursday I will give a whole day tutorial on Workflow Foundation where will dive into the details of how to use this technology to build business applications.

Other speakers will talk about VSTS, ALM, Silverlight, AJAX, .NET Framework 3.0 and 3.5, Sharepoint 2007, Windows WF, Visual Studio 2008, SQL Server 2008, and much more.

If you have not already registered for VSLive San Francisco, you can receive a $695 discount on the Gold Passport if you register using priority code SPSTI. More at www.vslive.com/sf

.

Thursday, November 22, 2007

Using the WF Rules Engine Outside of a Workflow

The Windows Workflow Foundation (WF) ships with a Policy Activity that allows you to execute a set of rules against your workflow. This activity contains a design time rules editor that allows you to create a set of rules. At run time, the Policy Activity runs these rules using the WF Rules engine.

Among other features, the rules engine allows you to prioritize rules and to set a chaining policy to govern rules evaluation. The rules engine uses a set of Code DOM expressions to represent the rules. These rules can be run against any managed object, not just a workflow. Hence, the mechanisms of the rules engine have nothing to do with workflow. You can actually instantiate and use this rules engine without having to embed it inside of a workflow. You can use this rules engine to build rules-driven .NET applications.

I gave a talk at the last Las Vegas VSLive! that demonstrates how to do this. The first sample in the talk uses a workflow to demonstrate the power of the rules engine. The second and third samples use a very simple example to demonstrate how to use the engine outside of a workflow.

Two problems have to be solved. You have to create a set of Code DOM expressions for the rules. You have to host the engine and supply it the rules and the object to run the rules against.

While the details are in the slides and the examples, here is the gist of the solution.

To use the rules engine at runtime, you pull the workflow rules out of some storage mechanism. The first sample uses a file. A WorkflowMarkupSerializer instance deserializes the stored rules to an instance of the RuleSet class. A RuleValidation instance validates the rules against the type of the business object against which you will run the rules against. The Execute method on the RuleExecution class is used to invoke the rules engine and run the rules.

How do you create the rules? Ideally you would use some domain language, or domain based application, that would generate the rules as Code DOM expressions. If you were masochistic enough, you could create those expressions by hand.

As an alternative, the second sample hosts the Workflow rules editor dialog (RuleSetDialog class) to let you create the rules. Unfortunately, like the workflow designer, this is a programmer's tool, not a business analyst's tool. A WorkflowMarkupSerializer instance is used to serialize the rules to the appropriate storage.

I would be interested in hearing about how people use this engine to build rules driven applications.

Wednesday, October 31, 2007

Developer Meditations

Meditation is supposed to develop awareness, help focus your attention, and relax while increasing your focus. At one of my current clients we are developing a Software as a Service (SaaS) application. We have developed the following "meditative principles":

1. It's not done until the tests are done.
2. If it's broke, fix it first.
3. If it's not in a script or code, it doesn't exist.
4. Don't explain, do it (but ask questions if you don't understand).

And finally (with apologies to Bobby McFerrin),

"Don't worry, be agile".

Here is a little song I wrote
You might want to sing it note for note
Don't worry be agile
In every software we have some trouble
When you worry you make it double
Don't worry, be agile

Ain't got no place to lay your head
Somebody came and took your machine
Don't worry, be agile
The manager say your code is late
He may have to litigate
Don't worry, be agile
Look at me I refactor
Don't worry, be agile
Here I give you my url
When you worry call me
I make you agile
Don't worry, be agile
Ain't got no time ain't got no style
Ain't got not money to make you smile
But don't worry self organize
Cause when you worry
Your face will frown
And that will bring everybody down
So don't worry, be agile (now)

There is this little song I wrote
I hope you learn it note for note
Like good little developers
Don't worry, be agile
Listen to what I say
In your software expect some trouble
But when you worry
You make it double
Don't worry, be agile
Don't worry don't do it, be agile
Put a smile on your face
Don't bring everybody down like this
Don't worry, it will soon pass
Whatever it is
Don't worry, be agile

10/31/2007 12:32:21 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | All | SOA | Software Development

Friday, September 29, 2006

Software Reuse and SOA

David Chappell (http://www.davidchappell.com/HTML_email/Opinari_No16_8_06.html) argues that SOA may not foster the service reuse that everyone has been hoping for. I think his analysis is correct, but I think with business services we at least have a reasonable hope of achieving reuse. Here we are least dealing with the way things actually happen in the world as opposed to programmer abstractions such as objects or components. That combined with the looser coupling of services gives me some hope.

The reason why frameworks like .NET are successful is they reflect years and years of experience with programming problems. Many examples of reuse (such as file systems and compilers) are so embedded in our experience that we no longer see them for what they are.

Reuse may fail here as well for all the reasons mentioned in David Chappell's analysis. At least now I feel we are on the right track.

9/29/2006 5:00:37 PM (Eastern Standard Time, UTC-05:00) | Comments [0] | All | SOA

Friday, September 01, 2006

SOA Reference Model Committee Standard

The Reference Model for Service Oriented Architecture defines a vocabulary for building service-oriented systems. Put together by a technical committee operating under the auspices of the OASIS standards organization, it is the result of individuals and organizations representing vendors, users, governments, consulting organizations, and academic institutions.

The Reference Model (RM) sees SOA as a means for organizing and using distributed capabilities that may be under the control of different ownership domains. The RM is not an architecture. It does not attempt to make any architecture normative. It does not try to make any standard or set of standards normative.

It does provide a common set of semantics that can be used across different implementations. This does sound rather fancy. Nonetheless, just like Moliere's bourgeois gentlemen that found out he was speaking prose all his life, many industries have been using reference models all along. They just never had to define them explicitly.

An architect for a residential dwelling knows that if they use the term door or window, the builder will understand what is meant. There are widely varied implementations of doors and windows depending, for example, if you are building a space station or an igloo. Nonetheless, everyone knows what the terms mean. Many of these terms are codified in building codes, and by standards bodies, and have evolved over the years. The software architecture community moves too quickly for such evolution; this is where standards organizations can help.

Software architectures, for sure, can have views and viewpoints, but the terms in which they are discussed have to be understood.

The core concepts that the RM discusses are service, visibility, execution context, service description, real world effect, interaction, and contract and policy.

I will discuss these core concepts over the next few posts.

None of this work is going on in isolation, or is it intended to denigrate other work such as the WS* specifications, or organizations such as the ISO, IEEE, IETF, the Ontolog Forum or other groups. The reference model just supplies standard definitions so that it becomes easier for each group to communicate with the others.

9/1/2006 11:32:59 AM (Eastern Standard Time, UTC-05:00) | Comments [0] | All | SOA

Subscribe:

newtelligence dasBlog 1.8.5223.1