Large-Scale Federated Processes

A presentation that I gave at the Stevens BPM day covered the subject of large-scale federated processes. What is a federated process?

It is a distributed process that spans many servers. Distributed process support might be designed and implemented in a very centralized way: for example, a single process application with parts of the application deployed to different machines. This allows the process to be much larger than it might be if limited to a single server, but that really is not the point of federation.

A federated process is a distributed process where the different parts of the process are controlled by different people. Not only is the running of the process decentralized, but the control of the design of the process is decentralized as well. Why does this matter?

One of the biggest challenges of automating processes is “process discovery,” where you determine what, for a given organization, is the right process to use. It is common for people in different parts of the organization to disagree, and you spend a lot of time converging on a single process that everyone can accept. Federated processes avoid this: they allow each organizational unit to have the process that is right for them. You only need to agree on a common interface for invoking the process.
Another challenge is privacy. Not every organizational unit wants to expose how it accomplishes its processes. Usually they won’t admit this, and instead they will make up a process that they want everyone to believe they follow. A federated process does not have this problem, since each organizational unit controls its own part of the process.
Another challenge is keeping a large process up to date when so many different organizational units are involved and have to sign off on every change. Federated processes can be more agile, especially in large organizations. With a federated process, when one part of the organization needs to make an adjustment to its part of the process, it can do so without having to get all the other parts to review and approve the change.
Another challenge is appropriate, fine-grained access control. As the size of an organization gets larger, there is an exponential increase in the effort needed to accurately map access to the right resources to the right people. By federating the process, access to the individual process fragments can be controlled by people within that organizational unit, who are much closer to the problem. It is even possible that each process fragment is accessible to completely disjoint sets of people.
By decentralizing the design of the process, an organization can delegate responsibility for the process far more easily than if the process had to be designed as a completely centralized, unified representation. It is not surprising that delegation is best for supporting large organizations; it is only surprising that so many technical approaches attempt to centralize the automation (based, in my opinion, on the naive assumption that everything would be best if everyone did things exactly the same way).
Federated processes can be built dynamically, and the really interesting part of this is that individuals can decide at run time which remote subprocess to bind in, according to the individual case. It is also possible for them to use business rules to determine this. The result is an adaptive federated process which might be far more efficient than would otherwise be possible.
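
To make the last point concrete, here is a minimal Python sketch of run-time binding behind a common interface. Everything in it is an assumption made for illustration: the `ProcessFragment` interface, the `LocalApproval` fragment, the `Binding` rule pairs, and the case fields are hypothetical names, not part of Wf-XML or any product. The only thing the units share is the `start` call; how each fragment does its work stays private to its owner.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class ProcessFragment(Protocol):
    """The only thing the federation agrees on: how to invoke a fragment."""

    def start(self, case: dict) -> str:
        """Start the fragment for a case and return a status string."""
        ...


@dataclass
class LocalApproval:
    """One unit's private process; its internal steps are not exposed."""
    owner: str

    def start(self, case: dict) -> str:
        return f"{self.owner} approval started for case {case['id']}"


@dataclass
class Binding:
    """A business rule paired with the fragment to use when it matches."""
    rule: Callable[[dict], bool]
    fragment: ProcessFragment


def bind_fragment(case: dict, bindings: list[Binding]) -> ProcessFragment:
    """Pick the first fragment whose rule matches this case, at run time."""
    for binding in bindings:
        if binding.rule(case):
            return binding.fragment
    raise LookupError(f"no fragment bound for case {case['id']}")


# Hypothetical rules: each organizational unit could own and change its entry.
bindings = [
    Binding(lambda c: c["region"] == "EU", LocalApproval("EU finance")),
    Binding(lambda c: c["amount"] > 10_000, LocalApproval("Head office")),
    Binding(lambda c: True, LocalApproval("Local branch")),
]

case = {"id": 42, "region": "US", "amount": 25_000}
result = bind_fragment(case, bindings).start(case)
print(result)
```

In this sketch a unit can replace its fragment, or the rule that selects it, without the other units reviewing the change, as long as `start` keeps the same shape.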

Federated processes are discussed in the literature under many names. For example, the BPMN specification talks about “collaboration processes,” which describe the interaction between two processes. B2B interactions can be seen as interactions between fragments within a single federated process.

I have heard analysts recently use the term “Process Networks,” which is the same concept: a collection of individual process fragments connected by information exchange.

In the extreme, we have what I call the “Ephemeral Network”: a collection of systems which are brought together to solve a single problem, connected by a single federated process. An Ephemeral Network can be seen as a kind of logical connection which is built as the process executes, lasts as long as the process instance does, and is essentially thrown away when the process is complete. While connected, the people involved can communicate to help achieve the goal of the process.

Federated processes work in a non-homogeneous infrastructure. A federated process can employ the latest software for one part of the process, while other parts run on 20-year-old mainframes. Keep in mind that there is no reason your smartphone cannot host part of the federated process as well.

To allow federated processes to work in a non-homogeneous infrastructure, a standard protocol is needed, and that is Wf-XML. More on this in future posts, but in the meantime, check out the SlideCast.


3 thoughts on “Large-Scale Federated Processes”

Michael Van Wesenbeeck on November 9, 2009 at 7:31 am said:

Hi Keith,

This is very interesting indeed.
Have you thought about how transactions could come into play as well?
For example, when a subprocess in the federation fails or is stopped, could the preceding processes be notified to roll back their work? I see this as similar to how it’s done in relational DB systems.

kswenson on November 9, 2009 at 7:45 am said:

I have thought about that, having been involved in the WS-DistributedTransaction spec at one point. There are two levels to the answer:

(1) The transaction that creates/starts the subprocess could be seen as atomic. For example, the node in the parent process that starts the subprocess might be tied atomically to the starting of the subprocess. If, for example, the remote server is offline, then the parent node goes back to the “unstarted” state so that it can try again later.

(2) Once the subprocess has started, the idea of keeping global state consistent like a distributed transaction is exactly the mental blind spot that causes people to go off track here. It should NOT behave atomically.
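
As a rough illustration of point (1), here is a small Python sketch under stated assumptions: `ParentNode`, `NodeState`, `RemoteOffline`, and `flaky_remote` are hypothetical names invented for this example, and the remote call is simulated in memory. It only shows the "stay unstarted and retry later" behavior; a real implementation would also have to handle a start request whose reply is lost.

```python
from enum import Enum


class NodeState(Enum):
    UNSTARTED = "unstarted"
    WAITING = "waiting"  # subprocess started; parent waits for its result


class RemoteOffline(Exception):
    """Raised by the hypothetical transport when the remote server is down."""


class ParentNode:
    def __init__(self, start_remote):
        self.start_remote = start_remote  # callable that starts the subprocess
        self.state = NodeState.UNSTARTED
        self.subprocess_id = None

    def try_start(self, case: dict) -> bool:
        """Start the subprocess; the node only advances if the start succeeds."""
        try:
            subprocess_id = self.start_remote(case)
        except RemoteOffline:
            # Nothing was started remotely, so stay "unstarted" and retry later.
            self.state = NodeState.UNSTARTED
            return False
        self.subprocess_id = subprocess_id
        self.state = NodeState.WAITING
        return True


attempts = []


def flaky_remote(case: dict) -> str:
    """Simulated remote server: offline on the first call, up afterwards."""
    attempts.append(case["id"])
    if len(attempts) == 1:
        raise RemoteOffline("remote server is offline")
    return f"sub-{case['id']}"


node = ParentNode(flaky_remote)
first = node.try_start({"id": 7})
state_after_first = node.state
second = node.try_start({"id": 7})
print(first, state_after_first.value, second, node.state.value)
```

Note that the atomic scope here ends as soon as the subprocess exists. Nothing after that point is held open waiting for the subprocess to finish.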

Comparing such a system to a distributed database is a mistake. What you will find is that the amount of resources needed increases exponentially with the size of the distributed network. Remember that a subprocess might be running for several months, and have additional subprocesses underneath it. Holding the resources necessary for that length of time would certainly cause deadlocks.

The real problem, though, is that while the subprocess is running, real live people are taking real live actions based on what is there. If you simply “rolled back” to an earlier state, not only would these people lose work, they would also be very confused. For example, if a company ships a package based on the information in the process, you cannot simply roll the process back to an earlier state, because it would look like those people had no justification for sending the package.

The connection between the process fragments needs to be seen more like a contract between a customer and a supplier. Indeed, the customer can cancel the order at any time, and the supplier will want to find out about it (but there still might be a charge). Alternatively, the supplier may find that the article in question is discontinued, but you would never simply roll back the process to an earlier state.
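
The contract view can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `SupplierFragment`, its states, and the fee amount are invented for the example. The point it tries to show is that a cancellation is a new event the supplier reacts to, with work already done left on record, rather than a rollback.

```python
from dataclasses import dataclass, field


@dataclass
class SupplierFragment:
    """Supplier's side of the contract; it alone decides how to react."""
    state: str = "accepted"  # accepted -> shipped, and either -> cancelled
    history: list = field(default_factory=list)

    def ship(self) -> None:
        self.state = "shipped"
        self.history.append("shipped package")

    def request_cancel(self) -> dict:
        """Handle a cancellation request without undoing earlier work."""
        self.history.append("cancel requested")
        # An invented fee: work already done is billed, not erased.
        charge = 15.0 if self.state == "shipped" else 0.0
        self.state = "cancelled"
        return {"cancelled": True, "charge": charge}


supplier = SupplierFragment()
supplier.ship()
outcome = supplier.request_cancel()
print(outcome, supplier.history)
```

The history still shows that the package was shipped, so the people who shipped it keep their justification, and the customer simply learns what the cancellation cost.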

Pingback: An Introduction to (some) BPM standards | blog.ricston.com

Thinking Matters

Empowering knowledge workers to be more efficient, adaptive, and effective.
