Concurrency And Server-Side Networking APIs — Part 3


Introduction

This is part three of an exploration of how concurrency in server-side networking APIs works on Unix-like operating systems and in Java. This installment looks at the Java equivalents of the C-based server implementations presented earlier in the series. If you haven't read the series from the beginning, note that the examples presented here are not necessarily in line with best practices; they are designed to illustrate a point.

Accepting Incoming Connections in a JVM: Simple Java Case

The last example concluded the tour of C server designs; the earlier parts of this series demonstrated how servers can be architected on a Unix-like OS such as Linux or Solaris.

The C networking APIs are cumbersome. In the following example, you will see a Java program that is equivalent in functionality to our first C example: the simple server program. You will notice immediately that the roughly 90 lines of C code are compressed into about 20 lines of Java.

A very simple, single-threaded Java server program, Server.java, is presented here. The program can be compiled with a simple invocation of javac.
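A minimal sketch of what such a single-threaded server looks like is shown below; the request handling here (reading one line from the client and echoing it back) is an assumption made for illustration, not the original Server.java listing.

import java.io.*;
import java.net.*;

// Sketch of a single-threaded server. The echo behavior is an
// illustrative assumption, not the original Server.java source.
public class Server {
    public static void main(String[] args) throws IOException {
        ServerSocket listener = new ServerSocket(Integer.parseInt(args[0]));
        while (true) {
            Socket client = listener.accept();       // block until a connection arrives
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream()));
            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
            out.println("echo: " + in.readLine());   // handle the request
            client.close();                          // then loop back for the next client
        }
    }
}

Once compiled, it can be started with a command such as java Server 5001.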

Again, this is a single-threaded application that can only process one request at a time. Any backlog of requests is queued up in the kernel's connection queue for the endpoint.

If you were to run strace against the java process running the Server program, you would see multiple threads, but one would be sleeping in an accept() system call waiting for the next incoming connection.
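For example, attaching strace to the JVM process and following all of its threads can be done with a command along these lines (the PID placeholder is the process ID of the running java process):

strace -f -p <java-pid>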

32725 restart_syscall(<... resuming interrupted call ...>
32724 futex(0x955611c, FUTEX_WAIT_PRIVATE, 1, NULL
32723 futex(0x9554c64, FUTEX_WAIT_PRIVATE, 7, NULL
32722 futex(0xb7b2f880, FUTEX_WAIT_PRIVATE, 0, NULL
32721 futex(0x954e00c, FUTEX_WAIT_PRIVATE, 3, NULL
32720 futex(0x954d12c, FUTEX_WAIT_PRIVATE, 3, NULL
32719 restart_syscall(<... resuming interrupted call ...>
32718 accept(3,
32725 <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
32725 futex(0xb2098288, FUTEX_WAKE_PRIVATE, 1) = 0
32725 gettimeofday({1225521817, 930252}, NULL) = 0
32725 gettimeofday({1225521817, 930335}, NULL) = 0
32725 clock_gettime(CLOCK_REALTIME, {1225521817, 930413459}) = 0
32725 futex(0x95574e4, FUTEX_WAIT_PRIVATE, 1, {0, 49921541}) = -1 ETIMEDOUT (Connection timed out)
32725 futex(0xb2098288, FUTEX_WAKE_PRIVATE, 1) = 0

Notice that LWP 32718 is blocked in an accept() system call. Setting aside the noise from the JVM's own subsystems, this Server program is doing essentially the same thing as the C server program.

Ideally, we want to build a Java-based, multithreaded server program that functions the same way as the multithreaded C server example. The next section attempts to do just that, but there is a catch.

A Multithreaded Java Server Program

Not only is the Java networking API greatly simplified compared to its C counterpart, the threading API built into the Java language is also a significant improvement over the POSIX threads API. The TServer.java program provides a straightforward Java implementation of a pool of N threads handling incoming requests.
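A minimal sketch of this design is shown below. It reuses the AcceptorThread class name that appears in the thread dump later in this section, but the pool size and the per-connection handling are assumptions made for illustration, not the original TServer.java source.

import java.io.*;
import java.net.*;

// Sketch of an N-thread accept pool: every AcceptorThread calls accept()
// on the same shared ServerSocket. The response written to each client is
// an illustrative assumption.
class AcceptorThread extends Thread {
    private final ServerSocket listener;

    AcceptorThread(ServerSocket listener) {
        this.listener = listener;
    }

    public void run() {
        while (true) {
            try {
                Socket client = listener.accept();   // all pool threads block here
                PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                out.println("handled by " + getName());
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

public class TServer {
    public static void main(String[] args) throws IOException {
        ServerSocket listener = new ServerSocket(Integer.parseInt(args[0]));
        for (int i = 0; i < 10; i++) {               // pool of ten acceptor threads
            new AcceptorThread(listener).start();
        }
    }
}

Note that all ten threads share a single ServerSocket bound to the port given on the command line.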

TServer can be launched with the following command:

java TServer 5001

A Java thread dump of the JVM running TServer (obtained, for example, with jstack <pid> or by sending the process a QUIT signal) will show ten threads waiting in calls to ServerSocket.accept():

Full thread dump Java HotSpot(TM) Client VM (1.5.0_16-b02 mixed mode, sharing):

"DestroyJavaVM" prio=1 tid=0x0994e870 nid=0x1be waiting on condition [0x00000000..0xbfb51540]

"Thread-9" prio=1 tid=0x09bdecf8 nid=0x1cf waiting for monitor entry [0xb1a64000..0xb1a64e20]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-8" prio=1 tid=0x09bddd60 nid=0x1ce waiting for monitor entry [0xb1ae5000..0xb1ae60a0]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-7" prio=1 tid=0x09bdcdc8 nid=0x1cd waiting for monitor entry [0xb1b66000..0xb1b67120]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-6" prio=1 tid=0x09bdbc18 nid=0x1cc waiting for monitor entry [0xb1be7000..0xb1be7fa0]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-5" prio=1 tid=0x09bdac80 nid=0x1cb waiting for monitor entry [0xb1c68000..0xb1c69020]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-4" prio=1 tid=0x09bd9d48 nid=0x1ca waiting for monitor entry [0xb1ce9000..0xb1ce9ea0]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-3" prio=1 tid=0x09ba87d8 nid=0x1c9 waiting for monitor entry [0xb1d6a000..0xb1d6af20]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-2" prio=1 tid=0x09ba78e0 nid=0x1c8 waiting for monitor entry [0xb1deb000..0xb1debda0]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-1" prio=1 tid=0x09ba6a08 nid=0x1c7 waiting for monitor entry [0xb1e6c000..0xb1e6ce20]
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:382)
    - waiting to lock <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Thread-0" prio=1 tid=0x09ba8f00 nid=0x1c6 runnable [0xb1eed000..0xb1eee0a0]
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
    - locked <0x88b5cfd8> (a java.net.SocksSocketImpl)
    at java.net.ServerSocket.implAccept(ServerSocket.java:450)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at AcceptorThread.run(TServer.java:36)

"Low Memory Detector" daemon prio=1 tid=0x099972e8 nid=0x1c4 runnable [0x00000000..0x00000000]

"CompilerThread0" daemon prio=1 tid=0x09995e30 nid=0x1c3 waiting on condition [0x00000000..0xb219ea08]

"Signal Dispatcher" daemon prio=1 tid=0x09994dd0 nid=0x1c2 waiting on condition [0x00000000..0x00000000]

"Finalizer" daemon prio=1 tid=0x0998f168 nid=0x1c1 in Object.wait() [0xb2585000..0xb2585f20]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x88b30ae8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:120)
    - locked <0x88b30ae8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:136)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=1 tid=0x0998d338 nid=0x1c0 in Object.wait() [0xb2606000..0xb2606da0]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x88b309f0> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:474)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0x88b309f0> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=1 tid=0x0998bef8 nid=0x1bf runnable

"VM Periodic Task Thread" prio=1 tid=0x09998858 nid=0x1c5 waiting on condition

This thread dump clearly shows ten threads waiting in ServerSocket.accept() (or, more precisely, in its underlying implementation class). So, we have a pool of ten threads waiting for requests coming into port 5001, which is the desired effect. But digging further under the covers will show that this isn't exactly the same as the multithreaded C server.

Once again, we will use strace to look at the system calls being made by the native threads that back the Java threads. Running the following command:

strace -o /tmp/strace1.out -f -p <pid>

gives the following output:

463 futex(0x9bdfc5c, FUTEX_WAIT_PRIVATE, 1, NULL
462 futex(0x9bdecc4, FUTEX_WAIT_PRIVATE, 1, NULL
461 futex(0x9bddd2c, FUTEX_WAIT_PRIVATE, 1, NULL
460 futex(0x9bdcd94, FUTEX_WAIT_PRIVATE, 1, NULL
459 futex(0x9bdcb7c, FUTEX_WAIT_PRIVATE, 1, NULL
458 futex(0x9bdbbac, FUTEX_WAIT_PRIVATE, 1, NULL
457 futex(0x9bdac14, FUTEX_WAIT_PRIVATE, 1, NULL
456 futex(0x9bd9cdc, FUTEX_WAIT_PRIVATE, 1, NULL
455 futex(0x9ba876c, FUTEX_WAIT_PRIVATE, 1, NULL
454 accept(3,
453 restart_syscall(<... resuming interrupted call ...>
452 futex(0x9998124, FUTEX_WAIT_PRIVATE, 1, NULL
451 futex(0x9996c6c, FUTEX_WAIT_PRIVATE, 7, NULL
450 futex(0xb7b34880, FUTEX_WAIT_PRIVATE, 0, NULL
449 futex(0x9990014, FUTEX_WAIT_PRIVATE, 3, NULL
448 futex(0x998f134, FUTEX_WAIT_PRIVATE, 3, NULL
447 restart_syscall(<... resuming interrupted call ...>
446 futex(0x9bdfcec, FUTEX_WAIT_PRIVATE, 1, NULL
453 <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
453 futex(0xb209d288, FUTEX_WAKE_PRIVATE, 1) = 0
453 gettimeofday({1225523178, 524393}, NULL) = 0

From the thread dump, we can see how the client handler threads map to the LWPs in the strace output (the nid field in the thread dump is the native LWP ID in hexadecimal):

0x1cf=463
0x1ce=462
0x1cd=461
0x1cc=460
0x1cb=459
0x1ca=458
0x1c9=457
0x1c8=456
0x1c7=455
0x1c6=454

Notice that nine of the ten client handler Java threads are in the "waiting for monitor entry" state, while one is runnable. Now look at the strace output: of the first ten LWP entries, which correspond to the Java thread nids, nine LWPs are blocked in futex() calls and only one is in an accept() call.

So, only ONE Java thread at a time is allowed to actually enter the under-the-hood logic that accepts an incoming connection! This is an important architectural characteristic of the JVM's networking implementation, done to ensure homogeneous behavior across target platforms. While Solaris and other Unix systems can have N threads blocked in accept() on the same endpoint, this is not true on all systems. To avoid differences in behavior across platforms, Sun's JVM designers appear to have dictated that only one thread at a time can accept an incoming connection.

This is an example of the Leader/Followers pattern being applied to ensure that the networking API behaves homogeneously across JVM target platforms. The concept is discussed in detail in section 4.3.1 of [1].

Even though the JVM implements this behavior, it is a best practice to implement the Leader/Followers pattern in your own application code (or another pattern discussed in [1] and [2]) so that you do not rely on the JVM's internal synchronization in this scenario. The same holds for the C server examples provided earlier. That said, it is probably more efficient to have a small pool of threads accepting connections (which the kernel can hand requests to directly, as demonstrated in this article), and then have those accepting threads hand requests off to a larger pool of handler threads.
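As a rough illustration of that hand-off design, the sketch below uses a small pool of acceptor threads and a java.util.concurrent thread pool for the handlers. The pool sizes and the handler logic are arbitrary choices for illustration, not part of the original examples.

import java.io.*;
import java.net.*;
import java.util.concurrent.*;

// Sketch of the hand-off design: a small pool of acceptor threads takes
// connections off the listening socket and passes them to a larger pool
// of handler threads. Pool sizes and handler behavior are illustrative.
public class HandoffServer {
    public static void main(String[] args) throws IOException {
        final ServerSocket listener = new ServerSocket(Integer.parseInt(args[0]));
        final ExecutorService handlers = Executors.newFixedThreadPool(50);

        Runnable acceptLoop = new Runnable() {
            public void run() {
                while (true) {
                    try {
                        final Socket client = listener.accept();
                        handlers.submit(new Runnable() {      // hand off to a handler thread
                            public void run() { handle(client); }
                        });
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
        };

        for (int i = 0; i < 2; i++) {                         // small accept pool
            new Thread(acceptLoop).start();
        }
    }

    static void handle(Socket client) {
        try {
            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
            out.println("handled by " + Thread.currentThread().getName());
            client.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}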

References:

[1] http://www.cs.vu.nl/~mathijs/publications/2002-designpatternsnetworking.ps.gz
[2] http://www.cs.wustl.edu/~schmidt/PDF/lf-PLOPD.pdf
[3] http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#solaris