Introduction
An important step toward writing Clean Code is separating “commands” from “queries” by following the Command Query Separation (CQS) Principle. Bertrand Meyer first described it in his book Object-Oriented Software Construction, so the idea is not new. In its basic form, it says we should separate the things that read data from the system from the things that write data to the system.
The Command/Query Separation Principle means that there should be a clear separation between updating information and state in your program and reading information from it. Commands and queries should be declared separately (though commands can call queries, but not vice versa). It can be as complex as ensuring that reading and writing take place in completely separate objects, or as simple as ensuring that commands and queries live in different methods of your objects.
Of course, the first question you’ll have is “What do you mean by ‘command’ and ‘query’?” Well, I’ll tell you.
Queries
A query is an operation that returns a result without affecting the state of the class or application. In Object Pascal, this is typically a `function`. Queries should not mutate the state of a class. They should be idempotent: calling one repeatedly has the same effect on the system as calling it once, which for a query means no effect at all. (The dictionary definition is “denoting an element of a set that is unchanged in value when multiplied or otherwise operated on by itself.” I had to look that up, by the way….)
As a general rule, queries should return a single value, and “asking a question should not change the answer”. They should be referentially transparent; that is, they should be perfectly replaceable with their literal result without changing the meaning of the system.
What this means practically is that your functions shouldn’t change the status of the system you are working on, whether that be a class, a framework, or an application. You should be able to run a query, i.e. a function, a hundred times in a row and get the same answer back each time. The function, because it doesn’t change the status of the system, can be safely called at any time without repercussions.
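As a minimal sketch of such a query, assuming Free Pascal in `objfpc` mode or a Delphi console application: the `TThermostat` class and its members are hypothetical names made up for this illustration. The function computes its answer from a private field and never writes to it.

```pascal
program QueryExample;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical class, used only for illustration.
  TThermostat = class
  private
    FTargetCelsius: Double;
  public
    constructor Create(aTargetCelsius: Double);
    // Query: returns a value and leaves FTargetCelsius untouched.
    function TargetFahrenheit: Double;
  end;

constructor TThermostat.Create(aTargetCelsius: Double);
begin
  inherited Create;
  FTargetCelsius := aTargetCelsius;
end;

function TThermostat.TargetFahrenheit: Double;
begin
  Result := FTargetCelsius * 9 / 5 + 32;
end;

var
  Thermostat: TThermostat;
  i: Integer;
begin
  Thermostat := TThermostat.Create(20);
  try
    // Asking the question repeatedly does not change the answer.
    for i := 1 to 3 do
      WriteLn(Thermostat.TargetFahrenheit:0:1);
  finally
    Thermostat.Free;
  end;
end.
```

Each of the three calls prints 68.0, because nothing in `TargetFahrenheit` alters the object.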
Now this should be a general rule — there are certainly cases where your query will change the state of the system (a dataset’s `Next` call comes to mind). But generally, it’s a good idea to have your queries not change state.
Commands
A command is any operation that has an observable side-effect. It is any code that changes something in your class or application. Typically, in Object Pascal, a command will be a `procedure` — that is, code that takes actions without returning a value. Commands can call queries (but queries should never call commands, because commands change the status of the system).
Commands should not, in general, return values. Thus, the use of `var` parameters should be discouraged, if not downright banned.
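Here is one way a command might look, including a command that calls a query. This is only a sketch under the same assumptions (Free Pascal `objfpc` mode or a Delphi console application); `TAccount`, `Deposit`, `Withdraw`, `CanWithdraw`, and `Balance` are hypothetical names, not part of any library.

```pascal
program CommandExample;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical class, used only for illustration.
  TAccount = class
  private
    FBalance: Integer;
  public
    // Commands: change state, return nothing.
    procedure Deposit(aAmount: Integer);
    procedure Withdraw(aAmount: Integer);
    // Queries: report on state, change nothing.
    function CanWithdraw(aAmount: Integer): Boolean;
    function Balance: Integer;
  end;

procedure TAccount.Deposit(aAmount: Integer);
begin
  FBalance := FBalance + aAmount;
end;

procedure TAccount.Withdraw(aAmount: Integer);
begin
  // A command may call a query to decide what to do.
  if CanWithdraw(aAmount) then
    FBalance := FBalance - aAmount;
end;

function TAccount.CanWithdraw(aAmount: Integer): Boolean;
begin
  Result := (aAmount > 0) and (aAmount <= FBalance);
end;

function TAccount.Balance: Integer;
begin
  Result := FBalance;
end;

var
  Account: TAccount;
begin
  Account := TAccount.Create;
  try
    Account.Deposit(100);
    Account.Withdraw(30);
    Account.Withdraw(500); // refused by the query, so the balance is unchanged
    WriteLn(Account.Balance);
  finally
    Account.Free;
  end;
end.
```

The program prints 70. Silently ignoring the oversized withdrawal keeps the sketch short; real code would more likely raise an exception, which still keeps `Withdraw` a pure command.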
Don’t Mix the Two
All your methods and routines should be easily identifiable as either a command or a query. Commands and queries should be separate entities in your code — with the exception that a command can call a query if need be. The use of `var` or `out` parameters in a `procedure` will confuse this issue, and thus should be discouraged. If you follow this rule, your code should be more “reasonable” — it should be easier to understand and easier to modify.
CQS also encourages you not to violate what I consider to be a bedrock of sound development technique: Don’t try to make one thing do two things. For instance, here is some code that does exactly that:
```pascal
procedure ProcessWidgets(aCollectionOfWidgets: TWidgetCollection; var aNumberOfProcessedWidgets: integer);
```
This method is a clear violation of CQS [NOTE: I originally had this as CQRS]: it is obviously trying to be a command, altering the state of the system by processing widgets, but it also tries to be a query by “returning” the number of processed widgets through a `var` parameter. Instead, the system should have a simple command to process widgets and a separate query to return the number of widgets that were processed. The procedure is trying to be two things at once, and all kinds of mischief come from making one thing do two things.
Following the CQS principle in the design of your code will help to ensure the proper separation of concerns, resulting in cleaner, easier-to-read, and easier-to-maintain code.
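The split described above could be sketched roughly like this. The article never defines `TWidgetCollection`, so this example assumes a plain open array of widget names instead, and `TWidgetProcessor` and `ProcessedCount` are hypothetical names. The “processing” here merely counts non-empty names; it stands in for whatever real work the command would do.

```pascal
program WidgetExample;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical class; widgets are modelled as plain strings for brevity.
  TWidgetProcessor = class
  private
    FProcessedCount: Integer;
  public
    // Command: processes the widgets and remembers how many were handled.
    procedure ProcessWidgets(const aWidgets: array of string);
    // Query: reports the count from the last run without changing anything.
    function ProcessedCount: Integer;
  end;

procedure TWidgetProcessor.ProcessWidgets(const aWidgets: array of string);
var
  i: Integer;
begin
  FProcessedCount := 0;
  for i := Low(aWidgets) to High(aWidgets) do
  begin
    // Placeholder for real work: only widgets with a name count as processed.
    if aWidgets[i] <> '' then
      Inc(FProcessedCount);
  end;
end;

function TWidgetProcessor.ProcessedCount: Integer;
begin
  Result := FProcessedCount;
end;

var
  Processor: TWidgetProcessor;
begin
  Processor := TWidgetProcessor.Create;
  try
    Processor.ProcessWidgets(['gear', '', 'sprocket']); // command
    WriteLn(Processor.ProcessedCount);                  // query
  finally
    Processor.Free;
  end;
end.
```

The program prints 2. Note that the count now lives on the object between the two calls, so code that shares one processor across threads would have to synchronise the pair.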
I disagree with your example. Applying CQS here does not make the code any cleaner, but it does break any possibility of executing it in multiple threads without having to create critical sections around these two calls.
CQS clearly is something that you should not blindly apply to all your classes and methods, but only where it makes sense or is an inherent part of your architecture.
Exactly. As long as we’re throwing around principles with fancy names, the first thing that came to mind when I read that example was “temporal coupling.” It’s something you really want to avoid, but following this principle too closely requires it instead!
“You should be able to run a query, i.e. a function, a hundred times in a row and get the same answer back each time.”
As long as the system hasn’t changed state between calls?
Rick —
Yes — I guess that should be clarified: System changes (like Time, Date, etc) will change the result. But if you query a database, or run the function on your closed system, the answer should remain the same.
Thanks Nick. An excellent article. With respect to the on-going maintenance of the code base, the separation of commands and queries can help a maintainer to clearly understand the intentions of the code they are managing. As a convention, knowing that data will not be mutated in a function (query) further helps in ensuring a robust system. Too much time has been spent debugging code to find that objects have changed as a side effect of calling a function – such a bad practice.
Just to be specific, CQS and CQRS are not exactly the same thing. CQRS (Command Query Responsibility Segregation) is the separation of commands and queries at the object level. So your read requests and write requests are handled by completely different objects as opposed to CQS which is separation at the method level (and usually in the same object).
For want of a better word, CQRS is more of an architectural model and usually used in conjunction with event-sourcing where instead of storing data in the normal CRUD fashion, data is stored additively as events (state changes) by the command object and data is restored by replaying the events to rebuild your entities by the read objects.
In any case, you should update your article to refer to CQS as it may confuse people should they google CQRS instead.
CQRS is not only used within a local process; it is also used at the client-server level.
This explains why you made some small mistakes/shortcuts about the assumed stateless nature of CQRS, or when you claim that it should be used everywhere. Like any pattern, CQRS is useful in some places, but not in others (see Martin Fowler’s article).
All this does make sense in the context of SOA, but not for in-process code.
When defining SOA interfaces, Query / Command segregation has several advantages, especially in two directions:
1. For (big data) scaling – e.g. for some master/slave replication patterns in which slaves are used to read the content, but .
2. Used in conjunction with an Event Driven architecture – in which Queries would use a storage local to each event node.
I suggest you read and link to Martin’s article at http://martinfowler.com/bliki/CQRS.html, e.g. the “When to use it” section.
CQS only works in “pure” form for single user, purely-sequential, isolated sub-systems.
Once you throw in multiple users (be they humans, data acquisition devices, code running in threads, real-world databases, networks…), you introduce the need for atomicity and/or transactions to ensure consistent state, and these are exactly about packing command and queries together.
An atomic increment is the minimal example of something that just cannot follow CQS.
Another example is a database query. Even a pure “select”, which is not thought of as a “command”, is not a pure query under the CQS definition either: it can have locking or retaining side effects, and those are desirable or even required to ensure consistent results.
In a multi-user database, there are no true queries, only commands to query a state, and those “query commands” will affect, and be affected by, a myriad of states. This is probably the most important thing to understand when doing databases, be they SQL or not, and the reason why throwing random “select” queries on a production server can be problematic beyond having to wait a long time for a result.
So CQS is not a “principle” that can be blindly applied to all situations. It’s more of a guide or tool that applies to specific conditions and should be applied in specific ways, with an understanding of its purposes and limits.
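To make the atomic-increment point concrete, here is a small sketch assuming Free Pascal, whose System unit provides `InterlockedIncrement` (Delphi offers the similar `AtomicIncrement`). The `Counter` and `NewValue` variables are hypothetical. The demo runs on a single thread and only shows the shape of the call: one operation that both mutates and reports.

```pascal
program AtomicIncrementExample;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

var
  Counter: LongInt = 0;
  NewValue: LongInt;
begin
  // One indivisible call changes Counter (a command) and returns the
  // new value (a query). Splitting it into "Inc(Counter)" followed by a
  // separate read would let another thread slip in between the two steps.
  NewValue := InterlockedIncrement(Counter);
  WriteLn(NewValue);
  NewValue := InterlockedIncrement(Counter);
  WriteLn(NewValue);
end.
```

The program prints 1 and then 2. With several threads calling it, each caller would still receive a distinct value, which is exactly the guarantee a separated command and query cannot give without extra locking.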
I have a code sample at: visualdelphi.wordpress.com
It’s simple but it shows why this concept is a good one to keep in mind.
When we talk about CQRS we are talking mainly about a system that depends on a publish-subscribe model. In other words, an enterprise-scale system. One point that is most often overlooked in performance analysis is the load time to dial up an instance. CQRS done right will mean super-light read components that load instantly. By leaving the big, heavy domain logic in its own space you can make read components much more nimble.

On the back end of CQRS is an event store. This is part of the idea not mentioned so far. An event is the past tense of a command. It is the summary of what the command allowed. As such it can have several flavors depending on the command itself and the outcome. These events can be transmitted via MQs to just about anywhere that has a need for that information.

AMQP communication, usually using RabbitMQ, is the key to distributed, high-performance, cross-platform dependency injection. If you know what that means you instantly recognize it to be a huge power-up for enterprise architecture. If you haven’t seen the AMQP protocol before you are likely ready to kill me for suggesting it works. It works. It comes from the financial services sector. They don’t do bugs and they do a lot of high-speed, high-throughput processing. That is the lineage of our little Rabbit.

Just one more little aside: there is a perfect database just waiting to give you all this event stuff on a platter. SQL Server and other traditional databases could do it if you analyze the log files and do a lot of work, but InterBase just does it. With the new features in the latest release you don’t need to write one line of code or do anything special to get the full event history.
Temporal coupling is not an issue for one simple reason: events never fail, and they must be run in order. That means eventual consistency is enforced in the queue. When the required process doesn’t have the required consistency, it waits. That is the difference. Accept delays, but guarantee consistent results.
It is a bit ironic that MQ has come full circle. From the IBM original, which everyone thought was too clunky and complicated, to MSMQ that tried to make it simple, failed, and got updated at least internally at Microsoft to actually work in SQL Server, to RabbitMQ, an Erlang implementation of AMQP that looks suspiciously like MQ all over again. But being written in Erlang is a great thing. It means garbage collection just works. Speed, scaling, adaptive threading, and the kitchen sink all in one neat package. Don’t ask me to write it or debug it personally; I leave that to others, but RabbitMQ is a great implementation.
With all those toys we get the possibility of a Distributed Domain-Driven Design. The pipeline that regulates the flow of data can be put in the hands of administrators. Coders can just write to the AMQP interface and forget it was ever a problem. The concerns of routing are handled and out of your hair. It requires us to plan for and embrace a publish-subscribe model for those larger bits of communication. That isn’t anything new; we used to have managers sitting at desks to do it. What is new is a distribution process that is configurable, standardized, and implemented across platforms and languages. There is just enough standardization in AMQP 0-9-1 to make it universal. Since that standard the 1.0 version has come out, but it diverges significantly from 0-9-1 and I am not convinced that the direction is right. The thing is, all clients and all servers speak the same language. It is email for databases.
As for Stefan’s argument, well, this is inter-process and network-level stuff. Interestingly, there is a ZeroMQ implementation that is not AMQP proper because it makes no guarantees about delivery, but it is pretty smokin’ fast. Worrying about critical sections in threads generically is not very useful. We need to be precise and look at the fact that all threads have their critical sections. It only matters when communication is essential. In fact, because most AMQP implementations are optimized for thread-level communication, they usually offer excellent speed despite having to share some memory space and interlocks. In a lot of applications the thread data is collected at the base and then transmitted later, possibly with new threads. Good AMQP can handle lots of threads, each with its own connection. The end point size problem is well managed.
The biggest advantage offered by this architecture is total throughput. You get a pretty optimal installation that partially self-tunes. You only need to intervene when one or more parts of the system bottleneck. I think that is self-explanatory to most experienced architects. It is definitely a big hammer, and you should actively decide that you need such a tool before using it.
It also works for you when you are faced with an incremental management that knows just enough to get some business done next week, but not the week after. Plug-and-play architecture lets you fiddle around and configure stuff almost ad hoc to satisfy a lot of demands. You might need to fight to keep data systems in proper order. You might have warehouse issues with all that config activity, but it is certainly an upgrade over your day-to-day business headaches. After you have had it for a few years, the SOA/SaaS flexibility it offers is pretty hard to argue against. You slow down on developing new widgets and mostly just reconfigure the ones you already own.
Stay tuned, I offered to David I to talk about this in the next Code Rage sessions. We will see if there is enough interest to put together a talk.