Fastest (low-latency) method for inter-process communication between Java and C/C++

隐瞒了意图╮ 2020-11-29 14:45

I have a Java app that connects through a TCP socket to a "server" developed in C/C++.

Both the app and the server run on the same machine, a Solaris box (but we're

10 answers
  •  [愿得一人]
    2020-11-29 15:22

    I just tested latency from Java on my Core i5 2.8GHz, sending and receiving a single byte between 2 freshly spawned Java processes, without pinning them to specific CPU cores with taskset:

    TCP         - 25 microseconds
    Named pipes - 15 microseconds
    

    Now explicitly specifying core masks, like taskset 1 java Srv or taskset 2 java Cli:

    TCP, same core:                       30 microseconds
    TCP, explicit different cores:        22 microseconds
    Named pipes, same core:               4-5 microseconds !!!!
    Named pipes, taskset different cores: 7-8 microseconds !!!!
    

    So TCP overhead is visible, and scheduling overhead (or core cache effects?) is also a culprit.
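
A single-byte TCP ping-pong like the one measured above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `TcpPingPong` and the method `averageRoundTripNanos` are hypothetical, the echo side runs as a thread inside the same JVM rather than as a separate process, and it reports round-trip time, so the numbers will not match the figures above exactly.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpPingPong {
    /** Average round-trip time in nanoseconds for a single byte over loopback TCP. */
    public static double averageRoundTripNanos(int rounds) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Echo side: assumption, a thread stands in for the second process.
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept()) {
                    s.setTcpNoDelay(true);
                    InputStream in = s.getInputStream();
                    OutputStream out = s.getOutputStream();
                    int b;
                    while ((b = in.read()) != -1) {
                        out.write(b); // send the byte straight back
                    }
                } catch (Exception e) {
                    // connection closed: measurement is over
                }
            });
            echo.setDaemon(true);
            echo.start();

            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                client.setTcpNoDelay(true); // avoid Nagle delays for tiny writes
                InputStream in = client.getInputStream();
                OutputStream out = client.getOutputStream();
                // Warm-up so the JIT compiles the hot path before timing.
                for (int i = 0; i < 1000; i++) {
                    out.write(1);
                    in.read();
                }
                long start = System.nanoTime();
                for (int i = 0; i < rounds; i++) {
                    out.write(1);
                    if (in.read() != 1) {
                        throw new IllegalStateException("unexpected reply");
                    }
                }
                return (System.nanoTime() - start) / (double) rounds;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        double nanos = averageRoundTripNanos(100_000);
        System.out.printf("TCP loopback round trip: %.1f microseconds%n", nanos / 1000.0);
    }
}
```

To approximate the pinned-core runs, the echo side would need to live in its own process so that each side can be started under taskset.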
    

    At the same time, Thread.sleep(0) (which, as strace shows, results in a single sched_yield() Linux kernel call) takes 0.3 microseconds, so named pipes still carry considerable overhead even when both processes are scheduled on a single core.

    A shared memory measurement for comparison: on September 14, 2009, Solace Systems announced that its Unified Messaging Platform API can achieve an average latency of less than 700 nanoseconds using a shared memory transport. http://solacesystems.com/news/fastest-ipc-messaging/

    P.S. - I tried shared memory the next day in the form of memory-mapped files. If busy waiting is acceptable, the latency for passing a single byte drops to 0.3 microseconds with code like this:

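    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Server side; place inside a method declared "throws Exception".
    // Maps a 1-byte file that the client process maps as well.
    MappedByteBuffer mem =
      new RandomAccessFile("/tmp/mapped.txt", "rw").getChannel()
      .map(FileChannel.MapMode.READ_WRITE, 0, 1);

    while (true) {
      while (mem.get(0) != 5) Thread.sleep(0); // waiting for client request
      mem.put(0, (byte) 10); // sending the reply
    }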
    

    Notes: Thread.sleep(0) is needed so that the 2 processes can see each other's changes (I don't know of another way yet). If the 2 processes are forced onto the same core with taskset, the latency becomes 1.5 microseconds; that is the context switch delay.
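
The snippet above shows only the replying side. A minimal sketch of the matching requesting side, with the timing loop around it, might look like the following. The class name `MappedPingPong`, the helper `map`, and the use of a temporary file are assumptions made for illustration, and both sides run as threads in one JVM here (each with its own mapping of the file) instead of as two processes, so the result is only indicative.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedPingPong {
    // Same marker values as in the snippet above.
    static final byte REQUEST = 5;
    static final byte REPLY = 10;

    /** Maps the first byte of the file; the mapping stays valid after the channel is closed. */
    static MappedByteBuffer map(File f) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 1);
        }
    }

    /** Average request/reply round trip in nanoseconds. */
    public static double averageRoundTripNanos(int rounds) throws Exception {
        File f = File.createTempFile("mapped", ".bin"); // hypothetical file location
        f.deleteOnExit();
        MappedByteBuffer serverMem = map(f);
        MappedByteBuffer clientMem = map(f);

        // Replying side: same loop as in the answer, stoppable via interrupt.
        Thread server = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    while (serverMem.get(0) != REQUEST) Thread.sleep(0);
                    serverMem.put(0, REPLY);
                }
            } catch (InterruptedException e) {
                // asked to stop
            }
        });
        server.setDaemon(true);
        server.start();

        // Requesting side: write the request byte, then wait for the reply byte.
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            clientMem.put(0, REQUEST);
            while (clientMem.get(0) != REPLY) Thread.sleep(0);
        }
        double average = (System.nanoTime() - start) / (double) rounds;
        server.interrupt();
        return average;
    }

    public static void main(String[] args) throws Exception {
        System.out.printf("mapped file round trip: %.0f ns%n", averageRoundTripNanos(1_000_000));
    }
}
```

The single byte acts as a token: each side writes only after it has seen the other side's marker, so no further locking is needed for this one-request-at-a-time exchange.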

    P.P.S. - and 0.3 microseconds is a good number! For comparison, the following code, which does nothing but a primitive string concatenation, takes exactly 0.1 microseconds:

    int j=123456789;
    String ret = "my-record-key-" + j  + "-in-db";
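
For anyone who wants to check a figure like this on their own machine, a rough timing loop could look like the sketch below. The class name `ConcatTiming` and its method are hypothetical, and a plain System.nanoTime() loop is only a coarse estimate (a benchmark harness such as JMH would be more trustworthy), so treat the output as a ballpark number.

```java
public class ConcatTiming {
    /** Average time in nanoseconds for one concatenation, measured with a plain loop. */
    public static double averageNanos(int iterations) {
        long sink = 0; // accumulates lengths so the JIT cannot discard the strings

        // Warm-up so the loop body is compiled before timing starts.
        for (int i = 0; i < 100_000; i++) {
            sink += ("my-record-key-" + i + "-in-db").length();
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            int j = 123456789 + i; // vary the number so the result is not a constant
            String ret = "my-record-key-" + j + "-in-db";
            sink += ret.length();
        }
        long elapsed = System.nanoTime() - start;

        // Use the sink so the work above counts as observable.
        if (sink == 42) {
            System.out.println(sink);
        }
        return elapsed / (double) iterations;
    }

    public static void main(String[] args) {
        System.out.printf("concatenation: %.1f ns%n", averageNanos(10_000_000));
    }
}
```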
    

    P.P.P.S. - I hope this is not too far off-topic, but finally I tried replacing Thread.sleep(0) with incrementing a static volatile int variable (the volatile access acts as a memory barrier, which is presumably why the other process's changes become visible) and obtained a record: 72 nanoseconds latency for Java-to-Java process communication!

    When forced onto the same CPU core, however, the volatile-incrementing JVMs never yield control to each other, producing exactly 10 milliseconds of latency (the Linux time quantum seems to be 5ms). So this should be used only if there is a spare core; otherwise sleep(0) is safer.
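
The volatile-increment wait described above might be written as a small helper like the following sketch. The names `VolatileSpin`, `spinCounter`, and `awaitByte` are hypothetical, and the timeout is an addition for illustration (not part of the original measurement) so that a caller can fall back to something like sleep(0) when the other side never gets to run.

```java
import java.nio.ByteBuffer;

public class VolatileSpin {
    // Incremented on every spin; the volatile access keeps the buffer read inside the loop.
    static volatile int spinCounter;

    /**
     * Busy-waits until byte 0 of the buffer equals the expected value.
     * Returns false if the timeout elapses first.
     */
    public static boolean awaitByte(ByteBuffer mem, byte expected, long timeoutNanos) {
        long deadline = System.nanoTime() + timeoutNanos;
        int spins = 0;
        while (mem.get(0) != expected) {
            spinCounter++; // volatile read and write instead of Thread.sleep(0)
            // Check the clock only every 65536 spins to keep the hot loop cheap.
            if ((++spins & 0xFFFF) == 0 && System.nanoTime() - deadline > 0) {
                return false;
            }
        }
        return true;
    }
}
```

In the mapped-file loop, a call such as `awaitByte(mem, (byte) 5, 1_000_000L)` would take the place of the `while (mem.get(0) != 5) Thread.sleep(0);` line, with the caller deciding what to do on a timeout.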
