Java Solutions for Distributed Transactions and/or Data Shared in Cluster


I think I found a great Java clustering/distributed platform and wanted to reopen this:

Check out http://www.hazelcast.com

I ran the test programs; it is very cool, very lightweight, and simple to use. It automatically detects the cluster members in a peer-to-peer configuration. The opportunities are limitless.
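
For what it's worth, here is a minimal sketch of that "shared data" experience, assuming a reasonably recent Hazelcast release (the map name "counters" and the class name are mine, just for illustration). Run the same program in two JVMs on the same network and, with the default multicast discovery, the nodes find each other and share the map automatically:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class HazelcastDemo {
    public static void main(String[] args) {
        // Starting a node joins (or forms) the cluster automatically;
        // with the default configuration, members discover each other over multicast.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A distributed map, shared by every member of the cluster.
        Map<String, Integer> counters = hz.getMap("counters");
        counters.put("visits", 1);

        System.out.println("Cluster members: " + hz.getCluster().getMembers());
    }
}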

Thanks for nicely summarizing all possibilities in one place.

One technique is missing here, though: MapReduce (Hadoop). If the problem can be made to fit the MapReduce paradigm, it is perhaps the most widely available solution. I also wonder whether the actor-framework pattern (JetLang, Kilim, etc.) can be extended to a cluster.
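
For readers unfamiliar with the paradigm, here is the canonical word-count shape of a Hadoop job, purely as an illustrative sketch of what "fitting a problem into MapReduce" means (the class names are mine, not from the original post):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map step: emit (token, 1) for each token in an input line.
public class TokenCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

// Reduce step: sum the counts emitted for each token.
class TokenCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}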

Don't forget Erlang's Mnesia.

Mnesia gives you stuff like the transactions you're used to in a normal DB, but it provides real-time operations and fault tolerance. Plus, you can reconfigure things without downtime. The downside is that it's a memory-resident database, so you have to fragment really large tables. The largest table size is 4 GB.

While Oracle Coherence and a lot of the other solutions suggested are good for sharing data, you only cited locking and STM as ways to manage state mutation in a distributed environment; those are both generally pretty poor ways to scale state management. On a different site, I recently posted the following about how to implement (for example) sequence counters:

If you're looking at a counter, then using something like a Coherence EntryProcessor will easily achieve "once-and-only-once" behavior and HA for any number of monotonically increasing sequences; here's the entire implementation:

import com.tangosol.util.InvocableMap;
import com.tangosol.util.processor.AbstractProcessor;

public class SequenceCounterProcessor
        extends AbstractProcessor
    {
    public Object process(InvocableMap.Entry entry)
        {
        // Increment the stored value, or start at zero if the entry is new
        long l = entry.isPresent() ? (Long) entry.getValue() + 1 : 0;
        entry.setValue(l);
        return l;
        }
    }

Yup. That's it. Automatic and seamless HA, dynamic scale-out elasticity, once-and-only-once behavior, etc. Done.

The EntryProcessor is a type of distributed closure that we introduced in 2005.

As an aside, in Java 8 (not yet released), Project Lambda introduces official closure support in the language and the standard libraries.
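
As a tiny illustration of that (not tied to Coherence in any way), a Java 8 closure would look something like this:

import java.util.function.Function;

public class ClosureExample {
    public static void main(String[] args) {
        long offset = 41L;                       // local variable captured by the lambda
        Function<Long, Long> addOffset = n -> n + offset;
        System.out.println(addOffset.apply(1L)); // prints 42
    }
}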

Basically, the idea is to deliver the closure to the location of the "owner" of the data in a distributed environment. Coherence dynamically manages data ownership by using dynamic partitioning, allowing the distributed system to load balance data across the various machines and nodes that are running. In fact, by default all of this is 100% automated, so you never actually tell it where to put the data, or how much data goes where. Additionally, there are secondary (and perhaps tertiary etc.) copies of the data managed on other nodes and other physical servers, to provide high availability in case a process fails or a server dies. Again, the management of these backup copies is completely automatic and completely synchronous by default, meaning that the system is 100% HA by default (i.e. with no configuration).

When the closure arrives at the data owner, it is executed in a transactional workspace, and if the operation completes successfully then it is shipped to the backup for safe keeping. The data mutation (e.g. the result of the operation) is only made visible to the remainder of the system once the backup has been successfully made.

A few optimizations to the above include adding the ExternalizableLite & PortableObject interfaces for optimized serialization, and avoiding the serialization of the boxed long by going after the "network ready" form of the data directly:

public Object process(InvocableMap.Entry entry)
    {
    try
        {
        // Work against the serialized ("network ready") form of the entry
        // (BinaryEntry and BinaryWriteBuffer are in com.tangosol.util,
        // IOException in java.io)
        BinaryEntry binentry = (BinaryEntry) entry;

        // Read the current value directly from its binary form, or start at 0
        long l = entry.isPresent() ? binentry.getBinaryValue()
                .getBufferInput().readLong() + 1 : 0L;

        // Write the new value back as an 8-byte binary, avoiding the boxed Long
        BinaryWriteBuffer buf = new BinaryWriteBuffer(8);
        buf.getBufferOutput().writeLong(l);
        binentry.updateBinaryValue(buf.toBinary());
        return l;
        }
    catch (IOException e)
        {
        throw new RuntimeException(e);
        }
    }
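
For illustration only, here is roughly what adding those two interfaces to the stateless processor could look like; this is a sketch under my own assumptions, not the author's exact code. Since the processor carries no fields, the serialization methods have nothing to read or write:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import com.tangosol.io.ExternalizableLite;
import com.tangosol.io.pof.PofReader;
import com.tangosol.io.pof.PofWriter;
import com.tangosol.io.pof.PortableObject;
import com.tangosol.util.InvocableMap;
import com.tangosol.util.processor.AbstractProcessor;

public class SequenceCounterProcessor
        extends AbstractProcessor
        implements ExternalizableLite, PortableObject {

    public Object process(InvocableMap.Entry entry) {
        // Same logic as the simple version shown earlier.
        long l = entry.isPresent() ? (Long) entry.getValue() + 1 : 0;
        entry.setValue(l);
        return l;
    }

    // The processor is stateless, so there is nothing to serialize.

    public void readExternal(DataInput in) throws IOException {
    }

    public void writeExternal(DataOutput out) throws IOException {
    }

    public void readExternal(PofReader in) throws IOException {
    }

    public void writeExternal(PofWriter out) throws IOException {
    }
}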

And since it's stateless, why not have a singleton instance ready to go?

public static final SequenceCounterProcessor INSTANCE =
        new SequenceCounterProcessor();

Using it from anywhere on the network is as simple as a single line of code:

long l = (Long) sequences.invoke(x, SequenceCounterProcessor.INSTANCE);

Where "x" is any object or name that identifies the particular sequence counter you want to use. For more info, see the Coherence knowledge base at: http://coherence.oracle.com/

Oracle Coherence is a distributed system. Whenever you start a Coherence node, it joins with other Coherence nodes that are already running, and dynamically forms an elastic cluster. That cluster hosts data in a partitioned, highly available (HA), and transactionally consistent manner, and hosts operations (like the one I showed above) that operate on that data in a "once and only once" manner.

Furthermore, in addition to the ability to invoke any of that logic or access any of that data transparently from any Coherence node, you can also invoke any of that logic or access any of that data transparently from any process on the network (subject to authentication and authorization, of course). So this code would work from any Coherence cluster node or from any (Java / C / C++ / C# / .NET) client.
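
For illustration, a minimal Java sketch of such a client-side call might look like the following; the cache name "sequences" and the counter key are assumptions for the example, not something from the original post:

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class SequenceClient {
    public static void main(String[] args) {
        // Join the cluster (or connect via Coherence*Extend) and look up
        // the named cache that holds the counters.
        NamedCache sequences = CacheFactory.getCache("sequences");

        // Atomically increment the counter identified by the key.
        long next = (Long) sequences.invoke("orderId", SequenceCounterProcessor.INSTANCE);
        System.out.println("next sequence value: " + next);

        CacheFactory.shutdown();
    }
}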

For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.

Maybe those slides will be helpful. From our experience, I would recommend Oracle (Tangosol) Coherence and GigaSpaces as the most powerful data- and processing-distribution frameworks out there. Depending on the exact nature of the problem, one of them may shine. Terracotta is also quite applicable to some of these problems.
