distributed

how to rapidly increment counters in Cassandra w/o staleness

这一生的挚爱 posted on 2019-11-30 03:28:23
I have a Cassandra question. Do you know how Cassandra handles updates/increments of counters? I want to use a Storm bolt (CassandraCounterBatchingBolt from the storm-contrib repo on GitHub) which writes into Cassandra. However, I'm not sure how parts of the implementation of the incrementCounterColumn() method work, and there are also the limitations of Cassandra counters (from: http://wiki.apache.org/cassandra/Counters) which make them useless for my scenario, IMHO: If a write fails unexpectedly (timeout or loss of connection to the coordinator node) the client will not know if the operation

Microservices: What are smart endpoints and dumb pipes?

半城伤御伤魂 posted on 2019-11-30 02:23:06
I have read the article "Microservices" by Martin Fowler and find it difficult to understand smart endpoints and dumb pipes. Please explain these terms; examples are welcome. I didn’t read the article, so I can only speculate about what exactly he means, but as he gives ESB as an example against microservices and ZeroMQ as an example for microservices, I hope my speculation will be fairly accurate: One of the ideas of Unix (and Linux) is to build small independent applications and connect them via pipes. Probably the most common pair of commands I use is ps and grep, like this: ps aux
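The "small programs connected by pipes" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `list_processes`, `grep`, and `run_pipeline` are hypothetical names, and a `queue.Queue` stands in for the pipe. The queue only moves lines and knows nothing about them (the dumb pipe), while all decisions live in the two functions at either end (the smart endpoints).

```python
import queue


def list_processes(pipe, processes):
    """Endpoint 1: decides what to emit, then signals end of stream."""
    for line in processes:
        pipe.put(line)
    pipe.put(None)  # sentinel: nothing more to read


def grep(pipe, pattern):
    """Endpoint 2: all filtering logic lives here, not in the pipe."""
    matches = []
    while (line := pipe.get()) is not None:
        if pattern in line:
            matches.append(line)
    return matches


def run_pipeline(processes, pattern):
    """Connect the two endpoints through a queue that only transports lines."""
    pipe = queue.Queue()  # the "dumb pipe": no routing, no transformation
    list_processes(pipe, processes)
    return grep(pipe, pattern)


sample = ["root 1 init", "alice 42 python train.py", "bob 7 bash"]
print(run_pipeline(sample, "python"))
```

Swapping the queue for a socket or a message broker would not require changing either endpoint's logic, which is the property the excerpt seems to be describing.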

Distributed Tensorflow: CreateSession still waiting

喜夏-厌秋 posted on 2019-11-29 23:44:42
Question: The simple script below is launched with the args shown in its header. Its behavior varies, but often one of the workers hangs and prints these "CreateSession still waiting for some other task" messages. Why does a new MonitoredTrainingSession need others? And why don't the others wait for it to start? # #!/bin/bash # python train.py --job master --task 0 & # python train.py --job worker --task 0 & # python train.py --job worker --task 1 & # python train.py --job worker --task 2 & import argparse

Hadoop Web Authentication using Kerberos

生来就可爱ヽ(ⅴ<●) posted on 2019-11-29 23:15:47
Question: I configured Hadoop with Kerberos and everything works fine: I can browse HDFS, submit jobs, etc. But HTTP web authentication fails. I use hadoop-0.20.2 in cdh3u2, which supports HTTP SPNEGO. The HTTP authentication related configuration in core-site.xml is as follows: <!-- HTTP web-consoles Authentication --> <property> <name>hadoop.http.filter.initializers</name> <value>org.apache.hadoop.security.AuthenticationFilterInitializer</value> </property> <property> <name>hadoop.http.authentication

Distributed Computing Framework (.NET) - Specifically for CPU Intensive operations [closed]

匆匆过客 posted on 2019-11-29 22:23:46
I am currently researching the options that are available (both open source and commercial) for developing a distributed application. "A distributed system consists of multiple autonomous computers that communicate through a computer network." (Wikipedia) The application is focused on distributing highly CPU-intensive operations (as opposed to data-intensive ones), so I'm sure MapReduce solutions don't fit the bill. Any framework that you can recommend (plus a brief summary of any experience or comparison to other frameworks) would be greatly appreciated. StevenH MPAPI Framework - http://www

In C#, if 2 processes are reading and writing to the same file, what is the best way to avoid process locking exceptions?

人盡茶涼 posted on 2019-11-29 20:59:11
With the following file reading code: using (FileStream fileStream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.None)) { using (TextReader tr = new StreamReader(fileStream)) { string fileContents = tr.ReadToEnd(); } } And the following file write code: using (TextWriter tw = new StreamWriter(fileName)) { tw.Write(fileContents); tw.Close(); } The following exception details are seen: The process cannot access the file 'c:\temp\myfile.txt' because it is being used by another process. What is the best way of avoiding this? Does the reader need to retry upon receipt of the
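The retry idea raised in the question can be sketched as follows. This is a Python illustration rather than C#, and it rests on assumptions: `read_with_retry` and its parameters are hypothetical names, and it assumes a sharing violation surfaces as `PermissionError` (which is how Python typically reports it on Windows). It retries a bounded number of times with an increasing delay and re-raises the last error if the file never becomes available.

```python
import time


def read_with_retry(path, attempts=5, delay=0.1, opener=open):
    """Read a text file, retrying while another process holds it locked.

    `opener` is injectable so the behaviour can be exercised without a
    real lock; by default the built-in open() is used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(path, "r") as handle:
                return handle.read()
        except PermissionError as exc:  # assumed form of the "file in use" error
            last_error = exc
            time.sleep(delay * (2 ** attempt))  # exponential backoff
    raise last_error
```

Retrying only hides the contention. Whether the reader should instead open the file with a more permissive share mode depends on details the excerpt does not show.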

Spread vs MPI vs zeromq?

血红的双手。 posted on 2019-11-29 20:15:14
In one of the answers to "Broadcast like UDP with the Reliability of TCP", a user mentions the Spread messaging API. I've also run across one called ØMQ. I also have some familiarity with MPI. So, my main question is: why would I choose one over the other? More specifically, why would I choose to use Spread or ØMQ when there are mature implementations of MPI to be had? MPI was designed for tightly coupled compute clusters with fast, reliable networks. Spread and ØMQ are designed for large distributed systems. If you're designing a parallel scientific application, go with MPI, but if you are

How scalable is distributed Erlang?

霸气de小男生 posted on 2019-11-29 19:44:59
Part A: Erlang has a lot of success stories about running concurrent agents e.g. the millions of simultaneous Facebook chats. That's millions of agents, but of course it's not millions of CPUs across a network. I'm having trouble finding metrics on how well Erlang scales when scaling is "horizontal" across a LAN/WAN. Let's assume that I have many (tens of thousands) physical nodes (running Erlang on Linux) that need to communicate and synchronize small infrequent amounts of data across the LAN/WAN. At what point will I have communications bottlenecks, not between agents, but between physical

SQL-Server DB design time scenario (distributed or centralized)

陌路散爱 posted on 2019-11-29 12:46:59
We have a SQL Server DB design-time scenario: we have to store data about different organizations in our database (e.g. Customer, Vendor, Distributor, ...). All the different organizations share (almost) the same type of information, such as address details. They will be referenced in other tables (i.e. linked via OrgId, and we have to look up OrgName in many different places). I see two options: We create a table for each organization like OrgCustomer, OrgDistributor, OrgVendor, etc. All the tables will have a similar structure and some tables will have extra special fields like the customer

calculate object delta

时光怂恿深爱的人放手 posted on 2019-11-29 11:47:11
I am working on an application where client and server share an object model, and the object graphs can become rather big. To save an object from client to server, ideally I would like to send only the difference over the wire, to minimize network traffic. I can pull the original object graph on the server and apply the delta to it. I'm wondering if there are any tools or projects out there, or if anyone has had any experience with doing such a thing. Many thanks. At a previous job, we had large 3-D models that we wanted to share between clients. To save actual model changes would have been
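The "send only the difference, apply it on the server" idea can be sketched for the simple case where the object graph is a tree of nested dictionaries. This is a minimal illustration under that assumption, not a general tool: `diff` and `apply_delta` are hypothetical names, the delta layout (`set`, `del`, `sub`) is invented for this example, and cyclic graphs or object identity are not handled.

```python
def diff(old, new):
    """Describe how to turn `old` into `new`, touching only changed keys."""
    delta = {"set": {}, "del": [], "sub": {}}
    for key in old:
        if key not in new:
            delta["del"].append(key)  # key was removed
    for key, value in new.items():
        if key not in old:
            delta["set"][key] = value  # key was added
        elif isinstance(value, dict) and isinstance(old[key], dict):
            child = diff(old[key], value)  # recurse into nested objects
            if child["set"] or child["del"] or child["sub"]:
                delta["sub"][key] = child
        elif old[key] != value:
            delta["set"][key] = value  # value was changed
    return delta


def apply_delta(old, delta):
    """Rebuild the new object from the original plus a delta from diff()."""
    result = dict(old)
    for key in delta["del"]:
        del result[key]
    result.update(delta["set"])
    for key, child in delta["sub"].items():
        result[key] = apply_delta(old[key], child)
    return result


before = {"name": "cube", "pos": {"x": 1, "y": 2}, "tmp": 1}
after = {"name": "cube", "pos": {"x": 1, "y": 5}, "color": "red"}
delta = diff(before, after)  # only "pos.y", "color" and "tmp" appear in it
assert apply_delta(before, delta) == after
```

Unchanged values such as `name` and `pos.x` never appear in the delta, so the payload grows with the size of the change rather than the size of the graph.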