Monday, September 1, 2014

Java Concurrency Tutorial - Locking: Intrinsic locks

In previous posts we reviewed some of the main risks of sharing data between different threads (like atomicity and visibility) and how to design classes in order to be shared safely (thread-safe designs). In many situations, though, we will need to share mutable data, where some threads will write and others will act as readers. It may be the case that you only have one field, independent of the others, that needs to be shared between different threads. In this case, you may go with atomic variables. For more complex situations you will need synchronization.


1   The coffee store example


Let’s start with a simple example like a CoffeeStore. This class implements a store where clients can buy coffee. When a client buys coffee, a counter is increased in order to keep track of the number of units sold. The store also registers who was the last client to come to the store.
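
A minimal sketch of what the CoffeeStore class might look like (field names follow the rest of the post; the three-second someLongRunningProcess delay is an assumption):

public class CoffeeStore {
    private int soldCoffees;
    private String lastClient;

    private void someLongRunningProcess() throws InterruptedException {
        Thread.sleep(3000); // simulates work unrelated to the sale itself
    }

    public void buyCoffee(String client) throws InterruptedException {
        someLongRunningProcess();
        soldCoffees++;          // compound action: read-modify-write
        lastClient = client;
        System.out.println(client + " bought some coffee");
    }

    public int countSoldCoffees() {
        return soldCoffees;
    }

    public String getLastClient() {
        return lastClient;
    }
}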

In the following program, four clients decide to come to the store to get their coffee:
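
A sketch of that program, consistent with the output below (the Client runnable is an assumption):

public class CoffeeStoreProgram {

    public static void main(String[] args) throws InterruptedException {
        CoffeeStore store = new CoffeeStore();
        long startTime = System.currentTimeMillis();
        Thread[] clients = {
                new Thread(new Client(store, "Mike")),
                new Thread(new Client(store, "John")),
                new Thread(new Client(store, "Anna")),
                new Thread(new Client(store, "Steve"))
        };
        for (Thread client : clients) {
            client.start();
        }
        for (Thread client : clients) {
            client.join(); // wait for all four clients to finish
        }
        System.out.println("Sold coffee: " + store.countSoldCoffees());
        System.out.println("Last client: " + store.getLastClient());
        System.out.println("Total time: " + (System.currentTimeMillis() - startTime) + " ms");
    }

    private static class Client implements Runnable {
        private final CoffeeStore store;
        private final String name;

        Client(CoffeeStore store, String name) {
            this.store = store;
            this.name = name;
        }

        @Override
        public void run() {
            try {
                store.buyCoffee(name);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}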

The main thread will wait for all four client threads to finish, using Thread.join(). Once the clients have left, we should obviously count four coffees sold in our store, but you may get unexpected results like the one below:

Mike bought some coffee
Steve bought some coffee
Anna bought some coffee
John bought some coffee
Sold coffee: 3
Last client: Anna
Total time: 3001 ms

We lost one unit of coffee, and the last client displayed (Anna) is not the actual last client (John). The reason is that, since our code is not synchronized, the threads interleaved. Our buyCoffee operation should be made atomic.


2   How synchronization works


A synchronized block is an area of code which is guarded by a lock. When a thread enters a synchronized block, it needs to acquire its lock; once acquired, it won’t release it until exiting the block or throwing an exception. When another thread tries to enter the synchronized block, it won’t be able to acquire the lock until the owner thread releases it. This is the Java mechanism to ensure that only one thread at a given time is executing a synchronized block of code, ensuring the atomicity of all actions within that block.

Ok, so you use a lock to guard a synchronized block, but what is a lock? The answer is that any Java object can be used as a lock; this is called an intrinsic lock. We will now see some examples of these locks when using synchronization.


3   Synchronized methods


Synchronized methods are guarded by two types of locks:

  • Synchronized instance methods: The implicit lock is ‘this’, which is the object used to invoke the method. Each instance of the class will use its own lock.
  • Synchronized static methods: The lock is the Class object. All instances of this class will use the same lock.

As usual, this is better seen with some code.

First, we are going to synchronize an instance method. This works as follows: We have one instance of the class shared by two threads (Thread-1 and Thread-2), and another instance used by a third thread (Thread-3):
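
A sketch of this example (class name, timing and log format are assumptions based on the output below):

public class InstanceMethodExample {
    private static final long startMillis = System.currentTimeMillis();

    public synchronized void doSomeTask() {
        System.out.println(Thread.currentThread().getName() + " | Entering method. Current Time: "
                + (System.currentTimeMillis() - startMillis) + " ms");
        try {
            Thread.sleep(3000); // hold the lock for three seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println(Thread.currentThread().getName() + " | Exiting method");
    }

    public static void main(String[] args) {
        InstanceMethodExample sharedInstance = new InstanceMethodExample();
        InstanceMethodExample anotherInstance = new InstanceMethodExample();
        new Thread(sharedInstance::doSomeTask, "Thread-1").start();
        new Thread(sharedInstance::doSomeTask, "Thread-2").start();
        new Thread(anotherInstance::doSomeTask, "Thread-3").start();
    }
}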

Since the doSomeTask method is synchronized, you might expect that only one thread will execute its code at a given time. But that’s wrong: since it is an instance method, different instances will use different locks, as the output demonstrates:

Thread-1 | Entering method. Current Time: 0 ms
Thread-3 | Entering method. Current Time: 1 ms
Thread-3 | Exiting method
Thread-1 | Exiting method
Thread-2 | Entering method. Current Time: 3001 ms
Thread-2 | Exiting method

Since Thread-1 and Thread-3 use a different instance (and hence, a different lock), they both enter the block at the same time. On the other hand, Thread-2 uses the same instance (and lock) as Thread-1. Therefore, it has to wait until Thread-1 releases the lock.

Now let’s change the method signature and use a static method. StaticMethodExample has the same code, except for the following line:
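
Presumably the only change is in the method declaration:

public static synchronized void doSomeTask() {
    // same body as the instance version; the implicit lock is now the
    // StaticMethodExample.class object, shared by all instances
}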

If we execute the main method we will get the following output:

Thread-1 | Entering method. Current Time: 0 ms
Thread-1 | Exiting method
Thread-3 | Entering method. Current Time: 3001 ms
Thread-3 | Exiting method
Thread-2 | Entering method. Current Time: 6001 ms
Thread-2 | Exiting method

Since the synchronized method is static, it is guarded by the Class object lock. Despite using different instances, all threads will need to acquire the same lock. Hence, any thread will have to wait for the previous thread to release the lock.


4   Back to the coffee store example


I have now modified the coffee store example in order to synchronize its methods. The result is as follows:
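
A sketch of the synchronized store (the class name is an assumption; every method is now guarded by the instance's intrinsic lock):

public class SynchronizedCoffeeStore {
    private int soldCoffees;
    private String lastClient;

    private void someLongRunningProcess() throws InterruptedException {
        Thread.sleep(3000);
    }

    public synchronized void buyCoffee(String client) throws InterruptedException {
        someLongRunningProcess(); // executed while holding the lock
        soldCoffees++;
        lastClient = client;
        System.out.println(client + " bought some coffee");
    }

    public synchronized int countSoldCoffees() {
        return soldCoffees;
    }

    public synchronized String getLastClient() {
        return lastClient;
    }
}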

Now, if we execute the program, we won’t lose any sale:

Mike bought some coffee
Steve bought some coffee
Anna bought some coffee
John bought some coffee
Sold coffee: 4
Last client: John
Total time: 12005 ms

Perfect! Well, is it really? The program’s execution time is now 12 seconds. You have surely noticed a someLongRunningProcess method executing during each sale. It may be an operation which has nothing to do with the sale itself, but since we synchronized the whole method, each thread now has to wait for it to execute. Could we leave this code out of the synchronized block? Sure! Have a look at synchronized blocks in the next section.


5   Synchronized blocks


The previous section showed us that we may not always need to synchronize the whole method. Since synchronized code forces a serialization of thread executions, we should minimize the length of the synchronized block. In our coffee store example, we could leave the long-running process out of it. In this section’s example, we are going to use synchronized blocks:

In SynchronizedBlockCoffeeStore, we modify the buyCoffee method to leave the long-running process outside of the synchronized block:
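
A sketch of the modified method (field and helper names follow the earlier sketch):

public void buyCoffee(String client) throws InterruptedException {
    someLongRunningProcess(); // outside the synchronized block: can run concurrently
    synchronized (this) {
        soldCoffees++;
        lastClient = client;
        System.out.println(client + " bought some coffee");
    }
}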

In the previous synchronized block, we use ‘this’ as the lock. It’s the same lock used by synchronized instance methods. Beware of using another lock, since this lock is also used by other methods of the class (countSoldCoffees and getLastClient).

Let’s see the result of executing the modified program:

Mike bought some coffee
John bought some coffee
Anna bought some coffee
Steve bought some coffee
Sold coffee: 4
Last client: Steve
Total time: 3015 ms

We have significantly reduced the duration of the program while keeping the code synchronized.


6   Using private locks


The previous section used a lock on the instance object, but you can use any object as its lock. In this section we are going to use a private lock and see what the risk is of using it.

In PrivateLockExample, we have a synchronized block guarded by a private lock (myLock):
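
A sketch of the class (timing and log format are assumptions based on the output below):

public class PrivateLockExample {
    private final Object myLock = new Object();

    public void executeTask() throws InterruptedException {
        synchronized (myLock) {
            System.out.println("executeTask - Entering...");
            Thread.sleep(3000);
            System.out.println("executeTask - Exiting...");
        }
    }
}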

If one thread enters the executeTask method, it will acquire the myLock lock. Any other thread entering another method of this class guarded by the same myLock lock will have to wait until it is released.

But now, let’s imagine that someone wants to extend this class in order to add their own methods, and these methods also need to be synchronized because they use the same shared data. Since the lock is private to the base class, the extended class won’t have access to it. If the extended class synchronizes its methods, they will be guarded by ‘this’; in other words, it will use another lock.

MyPrivateLockExample extends the previous class and adds its own synchronized method executeAnotherTask:
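
A sketch of the subclass and the program that drives it; note that executeAnotherTask is guarded by 'this', not by myLock:

public class MyPrivateLockExample extends PrivateLockExample {

    public synchronized void executeAnotherTask() throws InterruptedException {
        System.out.println("executeAnotherTask - Entering...");
        Thread.sleep(3000);
        System.out.println("executeAnotherTask - Exiting...");
    }

    public static void main(String[] args) {
        MyPrivateLockExample example = new MyPrivateLockExample();
        new Thread(() -> {
            try {
                example.executeTask();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();
        new Thread(() -> {
            try {
                example.executeAnotherTask();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();
    }
}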

The program uses two worker threads that will execute executeTask and executeAnotherTask respectively. The output shows how threads are interleaved since they are not using the same lock:

executeTask - Entering...
executeAnotherTask - Entering...
executeAnotherTask - Exiting...
executeTask - Exiting...


7   Conclusion


We have reviewed the use of intrinsic locks, Java's built-in locking mechanism. The main concern here is that synchronized blocks that access the same shared data have to use the same lock.

This post is part of the Java Concurrency Tutorial series. Check here to read the rest of the tutorial.

You can find the source code at Github.

I'm publishing my new posts on Google plus and Twitter. Follow me if you want to be updated with new content.



Java Concurrency Tutorial

This tutorial consists of several posts that explain the main concepts of concurrency in Java. It starts with the basics, with posts about the main concerns and risks of using non-synchronized programs, and then continues with more specific features.


Basics


Atomicity and race conditions
Atomicity is one of the main concerns in concurrent programs. This post shows the effects of executing compound actions in non-synchronized code.

Visibility between threads
Another of the risks of executing code concurrently is how values written by one thread can become visible to other threads accessing the same data.

Thread-safe designs
After looking at the main risks of sharing data, this post describes several class designs that can be shared safely between different threads.


Synchronization


Locking - Intrinsic locks
Intrinsic locks are Java's built-in mechanism for locking in order to ensure that compound actions within a synchronized block are atomic and create a happens-before relationship.

Monday, August 25, 2014

Java Concurrency Tutorial - Thread-safe designs

After reviewing what the main risks are when dealing with concurrent programs (like atomicity or visibility), we will go through some class designs that will help us prevent the aforementioned bugs. Some of these designs result in the construction of thread-safe objects, allowing us to share them safely between threads. As an example, we will consider immutable and stateless objects. Other designs will prevent different threads from modifying the same data, like thread-local variables.

You can see all the source code at github.


1   Immutable objects


Immutable objects have state (data which represents the object's state), but it is built upon construction, and once the object is instantiated, the state cannot be modified.

Although threads may interleave, the object has only one possible state. Since all fields are read-only, no thread will be able to change the object's data. For this reason, an immutable object is inherently thread-safe.

Product shows an example of an immutable class. It builds all its data during construction and none of its fields are modifiable:
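
A sketch of such a class (the specific fields are assumptions; what matters is that they are final, either primitive or immutable, and only set in the constructor):

public final class Product {
    private final String id;
    private final String name;
    private final double price;

    public Product(String id, String name, double price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }

    public String getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public double getPrice() {
        return price;
    }
}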

In some cases, making a field final won't be sufficient. For example, the MutableProduct class is not immutable even though all its fields are final:
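
A sketch of the class (the categories A, B and C follow the output below):

import java.util.ArrayList;
import java.util.List;

public final class MutableProduct {
    private final String name;
    private final double price;
    private final List<String> categories = new ArrayList<>();

    public MutableProduct(String name, double price) {
        this.name = name;
        this.price = price;
        this.categories.add("A");
        this.categories.add("B");
        this.categories.add("C");
    }

    public String getName() {
        return name;
    }

    public double getPrice() {
        return price;
    }

    public List<String> getCategories() {
        return categories; // the mutable list escapes the class
    }
}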

Why is the above class not immutable? The reason is that we let a reference escape the scope of its class. The field 'categories' is a reference to a mutable object, so after returning it, the client can modify it. In order to show this, consider the following program:
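
A sketch of a program that demonstrates it:

public static void main(String[] args) {
    MutableProduct product = new MutableProduct("a product", 1.0);

    System.out.println("Product categories");
    product.getCategories().forEach(System.out::println);

    product.getCategories().remove(0); // the client mutates the escaped list

    System.out.println("\nModified Product categories");
    product.getCategories().forEach(System.out::println);
}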

And the console output:

Product categories
A
B
C

Modified Product categories
B
C

Since the categories field is mutable and it escaped the object's scope, the client has modified the categories list. The product, which was supposed to be immutable, has been modified, leading to a new state.

If you want to expose the content of the list, you could use an unmodifiable view of the list:
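
For example, the getter could return a read-only wrapper (using java.util.Collections):

public List<String> getCategories() {
    // callers get a view they cannot modify; mutation attempts throw UnsupportedOperationException
    return Collections.unmodifiableList(categories);
}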


2   Stateless objects


Stateless objects are similar to immutable objects, but in this case they have no state at all. When an object is stateless, it does not have to remember any data between invocations.

Since there is no state to modify, one thread will not be able to affect the result of another thread invoking the object's operations. For this reason, a stateless class is inherently thread-safe.

ProductHandler is an example of this type of object. It contains several operations over Product objects and does not store any data between invocations. The result of an operation does not depend on previous invocations or any stored data:
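
A sketch of such a handler (only the sumCart method mentioned below is shown):

import java.util.List;

public class ProductHandler {

    public double sumCart(List<Product> cart) {
        double total = 0.0;
        // iterate over a snapshot array instead of the live list's iterator
        for (Product product : cart.toArray(new Product[0])) {
            total += product.getPrice();
        }
        return total;
    }
}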

In its sumCart method, the ProductHandler converts the product list to an array, since the for-each loop uses an iterator internally to iterate through its elements. List iterators are not thread-safe and could throw a ConcurrentModificationException if the list is modified during iteration. Depending on your needs, you might choose a different strategy.


3  Thread-local variables


Thread-local variables are those variables defined within the scope of a thread. No other thread will see or modify them.

The first type is local variables. In the below example, the total variable is stored in the thread's stack:
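
This is presumably the same sumCart method shown above; the relevant detail is that total never leaves the calling thread's stack:

public double sumCart(List<Product> cart) {
    double total = 0.0; // local variable: confined to the calling thread
    for (Product product : cart.toArray(new Product[0])) {
        total += product.getPrice();
    }
    return total;
}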

Just take into account that if you define a reference instead of a primitive and return it, it will escape its scope. You may not know where the returned reference will be stored. The code that calls the sumCart method could store it in a static field and allow it to be shared between different threads.

The second type is the ThreadLocal class. This class provides storage that is independent for each thread. Values stored in an instance of ThreadLocal are accessible from any code within the same thread.

The ClientRequestId class shows an example of ThreadLocal usage:
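
A sketch of the class (generating a random UUID per thread is an assumption based on the output below):

import java.util.UUID;

public class ClientRequestId {

    private static final ThreadLocal<String> id =
            ThreadLocal.withInitial(() -> UUID.randomUUID().toString());

    public static String get() {
        return id.get();
    }
}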

The ProductHandlerThreadLocal class uses ClientRequestId to return the same generated id within the same thread:
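
A sketch of how it might be used (the method name and log format are assumptions):

public class ProductHandlerThreadLocal {

    public void process() {
        // every call made from the same thread prints the same generated id
        System.out.println(Thread.currentThread().getName() + " - " + ClientRequestId.get());
    }
}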

If you execute the main method, the console output will show different ids for each thread. As an example:

T1 - 23dccaa2-8f34-43ec-bbfa-01cec5df3258
T2 - 936d0d9d-b507-46c0-a264-4b51ac3f527d
T2 - 936d0d9d-b507-46c0-a264-4b51ac3f527d
T3 - 126b8359-3bcc-46b9-859a-d305aff22c7e
...

If you are going to use ThreadLocal, you should be aware of some of the risks of using it when threads are pooled (as in application servers). You could end up with memory leaks or information leaking between requests. I won't go into detail on this subject, since the post How to shoot yourself in foot with ThreadLocals explains well how this can happen.


4   Using synchronization


Another way of providing thread-safe access to objects is through synchronization. If we synchronize all accesses to a reference, only a single thread will access it at a given time. We will discuss this in further posts.


5   Conclusion


We have seen several techniques that help us build simpler objects that can be shared safely between threads. It is much harder to prevent concurrent bugs if an object can have multiple states. On the other hand, if an object can have only one state or none, we won't have to worry about different threads accessing it at the same time.

This post is part of the Java Concurrency Tutorial series. Check here to read the rest of the tutorial.

I'm publishing my new posts on Google plus and Twitter. Follow me if you want to be updated with new content.

Thursday, August 14, 2014

Java Concurrency Tutorial - Visibility between threads

When sharing an object’s state between different threads, other issues besides atomicity come into play. One of them is visibility.

The key fact is that without synchronization, instructions are not guaranteed to be executed in the order in which they appear in your source code. This won’t affect the result in a single-threaded program but, in a multi-threaded program, it is possible that if one thread updates a value, another thread doesn’t see the update when it needs it or doesn’t see it at all.

In a multi-threaded environment, it is the program’s responsibility to identify when data is shared between different threads and act accordingly (using synchronization).

The example in NoVisibility consists of two threads that share a flag. The writer thread updates the flag and the reader thread waits until the flag is set:
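
A sketch of the example (here the main thread plays the writer role; the exact structure is an assumption):

public class NoVisibility {
    private static boolean ready;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the flag change becomes visible (it may never do so)
            }
            System.out.println("Reader Thread - Flag change received. Finishing thread.");
        });
        reader.start();

        Thread.sleep(1000);
        System.out.println("Writer thread - Changing flag...");
        ready = true;
        reader.join();
    }
}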

This program might result in an infinite loop, since the reader thread may not see the updated flag and wait forever.


With synchronization we can guarantee that this reordering doesn’t take place, avoiding the infinite loop. To ensure visibility we have two options:
  • Locking: Guarantees visibility and atomicity (as long as it uses the same lock).
  • Volatile field: Guarantees visibility.

The volatile keyword acts like some sort of synchronized block. Each time the field is accessed, it will be like entering a synchronized block. The main difference is that it doesn’t use locks. For this reason, it may be suitable for examples like the above one (updating a shared flag) but not when using compound actions.

We will now modify the previous example by adding the volatile keyword to the ready field.
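
The change is a single line:

private static volatile boolean ready;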

The program will no longer result in an infinite loop. Updates made by the writer thread will be visible to the reader thread:

Writer thread - Changing flag...
Reader Thread - Flag change received. Finishing thread.


Conclusion


We learned about another risk when sharing data in multi-threaded programs. For a simple example like the one shown here, we can simply use a volatile field. Other situations will require us to use atomic variables or locking.

This post is part of the Java Concurrency Tutorial series. Check here to read the rest of the tutorial.

You can take a look at the source code at github.

I'm publishing my new posts on Google plus and Twitter. Follow me if you want to be updated with new content.

Monday, August 11, 2014

Java Concurrency Tutorial - Atomicity and race conditions

Atomicity is one of the key concepts in multi-threaded programs. We say a set of actions is atomic if they all execute as a single operation, in an indivisible manner. Taking for granted that a set of actions in a multi-threaded program will be executed serially may lead to incorrect results. The reason is thread interference, which means that if two threads execute several steps on the same data, they may overlap.

The following Interleaving example shows two threads executing several actions (prints in a loop) and how they are overlapped:
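
A sketch of the example (loop bounds and log format are assumptions based on the output below):

public class Interleaving {

    private void printNumbers() {
        for (int i = 0; i < 5; i++) {
            System.out.println(Thread.currentThread().getName() + " - Number: " + i);
        }
    }

    public static void main(String[] args) {
        Interleaving example = new Interleaving();
        new Thread(example::printNumbers, "Thread 1").start();
        new Thread(example::printNumbers, "Thread 2").start();
    }
}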

When executed, it will produce unpredictable results. As an example:

Thread 2 - Number: 0
Thread 2 - Number: 1
Thread 2 - Number: 2
Thread 1 - Number: 0
Thread 1 - Number: 1
Thread 1 - Number: 2
Thread 1 - Number: 3
Thread 1 - Number: 4
Thread 2 - Number: 3
Thread 2 - Number: 4

In this case, nothing bad happens, since they are just printing numbers. However, when you need to share the state of an object (its data) without synchronization, this leads to race conditions.


Race condition


Your code will have a race condition if there’s a possibility of producing incorrect results due to thread interleaving. This section describes two types of race conditions:

  1. Check-then-act
  2. Read-modify-write

To remove race conditions and enforce thread safety, we must make these actions atomic by using synchronization. Examples in the following sections will show what the effects of these race conditions are.


Check-then-act race condition


This race condition appears when you have a shared field and expect to serially execute the following steps:

  1. Get a value from a field.
  2. Do something based on the result of the previous check.


The problem here is that when the first thread is going to act after the previous check, another thread may have interleaved and changed the value of the field. Now, the first thread will act based on a value that is no longer valid. This is easier to see with an example.

UnsafeCheckThenAct is expected to change the field number only once. Subsequent calls to the changeNumber method should result in the execution of the else branch:
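
A sketch of the class and a driver that starts many competing threads (the number of threads is an assumption):

public class UnsafeCheckThenAct {
    private int number;

    public void changeNumber(int changedNumber) {
        if (number == 0) { // check
            number = changedNumber; // act
            System.out.println(Thread.currentThread().getName() + " | Changed");
        } else {
            System.out.println(Thread.currentThread().getName() + " | Not changed");
        }
    }

    public static void main(String[] args) {
        UnsafeCheckThenAct checkAct = new UnsafeCheckThenAct();
        for (int i = 0; i < 70; i++) {
            final int value = i + 1;
            new Thread(() -> checkAct.changeNumber(value), "T" + i).start();
        }
    }
}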

But since this code is not synchronized, it may (there's no guarantee) result in several modifications of the field:

T13 | Changed
T17 | Changed
T35 | Not changed
T10 | Changed
T48 | Not changed
T14 | Changed
T60 | Not changed
T6 | Changed
T5 | Changed
T63 | Not changed
T18 | Not changed

Another example of this race condition is lazy initialization.

A simple way to correct this is to use synchronization.

SafeCheckThenAct is thread-safe because it has removed the race condition by synchronizing all accesses to the shared field.
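
A sketch of the corrected class:

public class SafeCheckThenAct {
    private int number;

    public synchronized void changeNumber(int changedNumber) {
        if (number == 0) {
            number = changedNumber;
            System.out.println(Thread.currentThread().getName() + " | Changed");
        } else {
            System.out.println(Thread.currentThread().getName() + " | Not changed");
        }
    }
}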

Now, executing this code will always produce the same expected result; only a single thread will change the field:

T0 | Changed
T54 | Not changed
T53 | Not changed
T62 | Not changed
T52 | Not changed
T51 | Not changed
...

In some cases, there will be other mechanisms which perform better than synchronizing the whole method but I won’t discuss them in this post.


Read-modify-write race condition


Here we have another type of race condition which appears when executing the following set of actions:

  1. Fetch a value from a field.
  2. Modify the value.
  3. Store the new value to the field.


In this case, there’s another dangerous possibility, which consists of losing some updates to the field. One possible outcome is:

Field’s value is 1.
Thread 1 gets the value from the field (1).
Thread 1 modifies the value (5).
Thread 2 reads the value from the field (1).
Thread 2 modifies the value (7).
Thread 1 stores the value to the field (5).
Thread 2 stores the value to the field (7).

As you can see, the update with the value 5 has been lost.

Let’s see a code sample. UnsafeReadModifyWrite shares a numeric field which is incremented each time:
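
A sketch of the class (a shared counter incremented by multiple threads without synchronization):

public class UnsafeReadModifyWrite {
    private int number;

    public void incrementNumber() {
        number++;
    }

    public int getNumber() {
        return number;
    }
}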

Can you spot the compound action which causes the race condition?

I’m sure you did, but for completeness, I will explain it anyway. The problem is in the increment (number++). This may appear to be a single action but in fact, it is a sequence of three actions (get-increment-write).

When executing this code, we may see that we have lost some updates:

2014-08-08 09:59:18,859|UnsafeReadModifyWrite|Final number (should be 10_000): 9996

Depending on your computer, it may be very difficult to reproduce this update loss, since there’s no guarantee on how threads will interleave. If you can’t reproduce the above example, try UnsafeReadModifyWriteWithLatch, which uses a CountDownLatch to synchronize the threads’ start and repeats the test a hundred times. You should probably see some invalid values among all the results:

Final number (should be 1_000): 1000
Final number (should be 1_000): 1000
Final number (should be 1_000): 1000
Final number (should be 1_000): 997
Final number (should be 1_000): 999
Final number (should be 1_000): 1000
Final number (should be 1_000): 1000
Final number (should be 1_000): 1000
Final number (should be 1_000): 1000
Final number (should be 1_000): 1000
Final number (should be 1_000): 1000

This example can be solved by making all three actions atomic.

SafeReadModifyWriteSynchronized uses synchronization in all accesses to the shared field:
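
A sketch of the synchronized version; both the increment and the read use the same intrinsic lock:

public class SafeReadModifyWriteSynchronized {
    private int number;

    public synchronized void incrementNumber() {
        number++;
    }

    public synchronized int getNumber() {
        return number;
    }
}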

Let’s see another way to remove this race condition. In this specific case, and since the field number is independent of other variables, we can make use of atomic variables. SafeReadModifyWriteAtomic uses an atomic variable to store the value of the field:
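
A sketch of the atomic version:

import java.util.concurrent.atomic.AtomicInteger;

public class SafeReadModifyWriteAtomic {
    private final AtomicInteger number = new AtomicInteger();

    public void incrementNumber() {
        number.getAndIncrement(); // atomic read-modify-write
    }

    public int getNumber() {
        return number.get();
    }
}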

Following posts will further explain mechanisms like locking or atomic variables.


Conclusion


This post explained some of the risks implied when executing compound actions in non-synchronized multi-threaded programs. To enforce atomicity and prevent thread interleaving, one must use some type of synchronization.

This post is part of the Java Concurrency Tutorial series. Check here to read the rest of the tutorial.

You can take a look at the source code at github.

I'm publishing my new posts on Google plus and Twitter. Follow me if you want to be updated with new content.


Monday, May 5, 2014

Spring Integration 4.0: A complete XML-free example

1   Introduction


Spring Integration 4.0 is finally here, and this release comes with very nice features. The one covered in this article is the possibility of configuring an integration flow without using XML at all. People who don’t like XML will be able to develop an integration application using just JavaConfig.

This article is divided in the following sections:
  1. Introduction.
  2. An overview of the flow.
  3. Spring configuration.
  4. Detail of the endpoints.
  5. Testing the entire flow.
  6. Conclusion.

The source code can be found at github.

The source code of the web service invoked in this example can be found at the spring-samples repository at github.


2   An overview of the flow


The example application shows how to configure several messaging and integration endpoints. The user asks for a course by specifying the course Id. The flow will invoke a web service and return the response to the user. Additionally, some types of courses will be stored in a database.

The flow is as follows:
  • An integration gateway (course service) serves as the entry to the messaging system.
  • A transformer builds the request message from the user specified course Id.
  • A web service outbound gateway sends the request to a web service and waits for a response.
  • A service activator is subscribed to the response channel in order to return the course name to the user.
  • A filter is also subscribed to the response channel. This filter will send some types of courses to a mongoDB channel adapter in order to store the response in a database.

The following diagram better shows how the flow is structured:




3   Spring configuration


As discussed in the introduction section, the entire configuration is defined with JavaConfig. This configuration is split into three files: infrastructure, web service and database configuration. Let’s check it out:

3.1   Infrastructure configuration


This configuration file only contains the definition of message channels. The messaging endpoints (transformer, filter, etc...) are configured with annotations.

InfrastructureConfiguration.java
The @ComponentScan annotation searches for @Component annotated classes, which are our defined messaging endpoints: the filter, the transformer and the service activator.

The @IntegrationComponentScan annotation searches for specific integration annotations. In our example, it will scan the entry gateway which is annotated with @MessagingGateway.

The @EnableIntegration annotation enables integration configuration, for example the processing of method-level annotations like @Transformer or @Filter.
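
Putting these annotations together, a minimal sketch of InfrastructureConfiguration could look like the following (the scanned packages and channel names are assumptions, not the post's exact code):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.IntegrationComponentScan;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.integration.channel.PublishSubscribeChannel;
import org.springframework.integration.config.EnableIntegration;
import org.springframework.messaging.MessageChannel;

@Configuration
@ComponentScan("xpadro.spring.integration.endpoint")           // hypothetical package
@IntegrationComponentScan("xpadro.spring.integration.gateway") // hypothetical package
@EnableIntegration
public class InfrastructureConfiguration {

    @Bean
    public MessageChannel requestChannel() {
        return new DirectChannel();
    }

    @Bean
    public MessageChannel invocationChannel() {
        return new DirectChannel();
    }

    @Bean
    public MessageChannel responseChannel() {
        // both the service activator and the filter subscribe to the web service response
        return new PublishSubscribeChannel();
    }

    @Bean
    public MessageChannel storedCoursesChannel() {
        return new DirectChannel();
    }
}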

3.2   Web service configuration


This configuration file configures the web service outbound gateway and its required marshaller.

WebServiceConfiguration.java
The gateway allows us to define its output channel but not the input channel. We need to annotate the gateway bean with @ServiceActivator in order to subscribe it to the invocation channel and avoid having to autowire it in the message channel bean definition.

3.3   Database configuration


This configuration file defines all necessary beans to set up mongoDB. It also defines the mongoDB outbound channel adapter.

MongoDBConfiguration.java
As with the web service gateway, we can’t set the input channel on the adapter. Again, I have done that by specifying the input channel in the @ServiceActivator annotation.


4   Detail of the endpoints


The first endpoint of the flow is the integration gateway, which will put the argument (courseId) into the payload of a message and send it to the request channel.
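
A sketch of the gateway interface (the interface name, method name and channel name are assumptions):

import org.springframework.integration.annotation.Gateway;
import org.springframework.integration.annotation.MessagingGateway;

@MessagingGateway
public interface CourseService {

    @Gateway(requestChannel = "requestChannel")
    String findCourse(String courseId);
}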

The message containing the course id will reach the transformer. This endpoint will build the request object that the web service is expecting:
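
A sketch of the transformer (the class name matches the console output further below; GetCourseRequest stands in for the request type generated from the web service schema, and the channel names are assumptions):

import org.springframework.integration.annotation.Transformer;
import org.springframework.messaging.Message;
import org.springframework.stereotype.Component;

@Component
public class CourseRequestBuilder {

    @Transformer(inputChannel = "requestChannel", outputChannel = "invocationChannel")
    public GetCourseRequest buildRequest(Message<String> msg) {
        System.out.println("CourseRequestBuilder|Building request for course [" + msg.getPayload() + "]");
        GetCourseRequest request = new GetCourseRequest();
        request.setCourseId(msg.getPayload());
        return request;
    }
}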

Subscribed to the response channel, which is the channel where the web service reply will be sent, there’s a service activator that will receive the response message and deliver the course name to the client:
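
A sketch of the service activator (GetCourseResponse and its getter stand in for the generated response type):

import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.stereotype.Component;

@Component
public class CourseResponseHandler {

    @ServiceActivator(inputChannel = "responseChannel")
    public String getResponse(GetCourseResponse response) {
        // the returned course name travels back to the gateway's caller
        return response.getCourseName();
    }
}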

Also subscribed to the response channel, a filter will decide, based on the course type, whether the course needs to be stored in the database:
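
A sketch of the filter (the acceptance rule follows the BC/DF behaviour seen in the test output; names are assumptions):

import org.springframework.integration.annotation.Filter;
import org.springframework.stereotype.Component;

@Component
public class StoredCoursesFilter {

    @Filter(inputChannel = "responseChannel", outputChannel = "storedCoursesChannel")
    public boolean shouldStoreCourse(GetCourseResponse response) {
        // only BC-type courses are passed on to the mongoDB channel adapter
        return response.getCourseId().startsWith("BC");
    }
}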


5   Testing the entire flow


The following client will send two requests: a BC-type course request, which will be stored in the database, and a DF-type course request, which will be filtered out:

This will result in the following console output:

CourseRequestBuilder|Building request for course [BC-45]
CourseResponseHandler|Course with ID [BC-45] received: Introduction to Java
StoredCoursesFilter|Course [BC-45] validated. Storing to database
CourseRequestBuilder|Building request for course [DF-21]
CourseResponseHandler|Course with ID [DF-21] received: Functional Programming Principles in Scala
StoredCoursesFilter|Course [DF-21] filtered. Not a BF course



6   Conclusion


We have learnt how to set up and test an application powered by Spring Integration, using no XML configuration. Stay tuned, because Spring Integration Java DSL with Spring Integration extensions is on its way!

I'm publishing my new posts on Google plus and Twitter. Follow me if you want to be updated with new content.

Monday, April 28, 2014

Spring Integration - Configure web service client timeout

1   Introduction


With the support of Spring Integration, your application can invoke a web service by using an outbound web service gateway. The invocation is handled by this gateway, thus you just need to worry about building the request message and handling the response. However, with this approach it is not obvious how to configure additional options like setting timeouts or caching of operations. This article will show how to set a client timeout and integrate it with the gateway.

This article is divided in the following sections:
  1. Introduction.
  2. Web service invocation overview.
  3. Configuring a message sender.
  4. The sample application.
  5. Conclusion.

The source code can be found at github.


2   Web service invocation overview


The web service outbound gateway delegates the web service invocation to the Spring Web Services WebServiceTemplate. When a message arrives at the outbound gateway, this template uses a message sender in order to create a new connection. The diagram below shows an overview of the flow:


By default, the web service template sets an HttpUrlConnectionMessageSender as its message sender, which is a basic implementation without support for configuration options. This behavior, though, can be overridden by setting a more advanced message sender with the capability of setting both read and connection timeouts.

We are going to configure the message sender in the next section.


3   Configuring a message sender


We are going to configure a message sender on the outbound gateway. This way, the gateway will set the template’s message sender to the one provided.

The implementation we are providing in the example is the HttpComponentsMessageSender class, also from the Spring Web Services project. This message sender allows us to define the following timeouts:
  • connectionTimeout: Sets the timeout until the connection is established.
  • readTimeout: Sets the socket timeout for the underlying HttpClient. This is the maximum time to wait for the service to reply.

Configuration:
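
As a Java-config sketch (the enclosing configuration class is hypothetical, and the timeout values are hard-coded here instead of being injected from the properties file shown below):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.ws.transport.http.HttpComponentsMessageSender;

@Configuration
public class WsClientConfiguration {

    @Bean
    public HttpComponentsMessageSender messageSender() {
        HttpComponentsMessageSender messageSender = new HttpComponentsMessageSender();
        messageSender.setConnectionTimeout(2000); // timeout.connection
        messageSender.setReadTimeout(2000);       // timeout.read
        return messageSender;
    }
}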

The properties file contains the values, which are both set to two seconds:

timeout.connection=2000
timeout.read=2000

Once configured, we add it to the web service outbound gateway configuration:

To use this message sender, you will need to add the following dependency:

And that’s it; the next section will show the sample application to see how it works.


4   The sample application


The flow is simple; it consists of an application that sends a request to a web service and receives a response. The web service source code can be found at github.

The gateway contains the method through which we will enter the messaging system:

Finally, the test:


5   Conclusion


We have learnt how to set additional options to the web service outbound gateway in order to establish a timeout.

I'm publishing my new posts on Google plus and Twitter. Follow me if you want to be updated with new content.