distributed-system

Setting up Hyperledger Fabric on 2 different PCs

心已入冬 submitted on 2019-12-22 13:02:58

Question: I need to run Hyperledger Fabric instances on 4 different machines. PC-1 should contain the CA and peers of ORG-1 in containers, PC-2 should contain the CA and peers of ORG-2, PC-3 should contain the orderer (solo), and PC-4 should run the Node API. Is my approach missing something? If not, how can I achieve this?

Answer 1: I would recommend that you look at the Ansible driver in the Hyperledger Cello project to manage deployment across multiple hosts/VMs. In short, you need to establish network visibility across the set of…
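One common way to give containers on different hosts visibility of each other, as the answer starts to describe, is a Docker overlay network created in swarm mode. This is only a sketch: the IP addresses, the network name fabric-net, and which PC initialises the swarm are all assumptions for illustration, not part of the original answer.

```shell
# PC-3 (orderer host, address assumed to be 192.168.1.3) initialises the swarm:
docker swarm init --advertise-addr 192.168.1.3

# PC-1, PC-2 and PC-4 join, using the token printed by the command above:
docker swarm join --token <token-from-pc3> 192.168.1.3:2377

# On any node: create an attachable overlay network for the Fabric containers:
docker network create --driver overlay --attachable fabric-net

# Each CA/peer/orderer container is then started with: --network fabric-net
```

With every container attached to the same overlay network, peers on PC-1 can resolve and reach the orderer on PC-3 by container name, which is the cross-host visibility the answer refers to.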

Generating monotonically increasing integers (max 64bit)

穿精又带淫゛_ submitted on 2019-12-22 10:23:16

Question: As part of a new project we need a service which can generate monotonically increasing integers. Requirements for the service:

- The service does not need to produce contiguous integers; as long as they are monotonically increasing it is fine.
- It should produce 64-bit integers.
- The service should be highly available.
- The service should be resilient to failure (or restarts).

I was planning to use Redis (INCR) as a back-end store with replication enabled, but the issue is if the master…
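Since contiguity is not required, one classic way to get restart-resilience is block allocation: persist only a high watermark, and on restart jump to the next unused block, skipping some ids but never going backwards. The sketch below is a single-node toy under that assumption; the class name, BLOCK size, and file-based persistence are illustrative, and a highly available version would keep the watermark in a replicated store instead of a local file.

```python
import os
import struct

BLOCK = 1000                 # ids reserved per durable write (tuning knob)
MAX_64 = (1 << 63) - 1       # stay within a signed 64-bit integer

class MonotonicIdService:
    """Hands out monotonically increasing (not contiguous) 64-bit ids.

    Durability trick: persist only a high watermark every BLOCK ids.
    After a crash or restart we resume from the persisted watermark,
    so up to BLOCK ids are skipped but monotonicity never breaks.
    """

    def __init__(self, state_path):
        self.state_path = state_path
        persisted = self._read_watermark()
        self.next_id = persisted     # first id to hand out
        self.limit = persisted       # ids below limit are safe to issue
        self._reserve()              # reserve the first block up front

    def _read_watermark(self):
        if not os.path.exists(self.state_path):
            return 0
        with open(self.state_path, "rb") as f:
            return struct.unpack(">q", f.read(8))[0]

    def _reserve(self):
        self.limit = min(self.limit + BLOCK, MAX_64)
        tmp = self.state_path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(struct.pack(">q", self.limit))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.state_path)   # atomic rename on POSIX

    def next(self):
        if self.next_id >= self.limit:
            self._reserve()
        nid = self.next_id
        self.next_id += 1
        return nid
```

A restart simply constructs a new instance over the same state file: the first id it hands out is at least one full block above anything issued before the crash, which is exactly the "monotonic but not contiguous" trade-off the question allows.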

What's the difference between ZooKeeper and any distributed Key-Value stores?

旧巷老猫 submitted on 2019-12-21 09:15:10

Question: I am new to ZooKeeper and distributed systems, and am learning them on my own. From what I understand so far, ZooKeeper seems to be simply a key-value store whose keys are paths and whose values are strings, which is no different from, say, Redis. (And apparently we can use slash-separated paths as keys in Redis as well.) So my question is: what is the essential difference between ZooKeeper and other distributed KV stores? Why does ZooKeeper use so-called "paths" as keys, instead of simple…
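Part of the answer is that ZooKeeper's paths are real tree structure, not just string keys: you can list a node's children, and the server can append an auto-incremented suffix to a child's name (sequential znodes), which is what coordination recipes like leader election are built on. The toy model below illustrates just those two features in plain Python; it is a sketch, not the ZooKeeper API, and it deliberately omits ephemeral nodes, watches, and ZooKeeper's ordered, quorum-replicated updates.

```python
class ToyZnodeTree:
    """Toy model of two ZooKeeper features a flat KV store lacks:
    hierarchical children listing and server-numbered sequential nodes."""

    def __init__(self):
        self.nodes = {"/": b""}
        self.seq = 0

    def create(self, path, data=b"", sequential=False):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError("no parent: " + parent)   # paths are real structure
        if sequential:
            path = "%s%010d" % (path, self.seq)      # server-assigned suffix
            self.seq += 1
        self.nodes[path] = data
        return path

    def children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self.nodes
                      if p.startswith(prefix) and "/" not in p[len(prefix):])
```

In the classic leader-election recipe, each client creates a sequential child under an election node, lists the children, and the client owning the lowest sequence number is the leader: behaviour that a plain GET/SET store does not give you out of the box.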

Error handling in Hadoop MapReduce

狂风中的少年 submitted on 2019-12-21 05:34:10

Question: Based on the documentation, there are a few ways in which error handling can be performed in MapReduce. Among them:

a. Custom counters using an enum: increment for every failed record.
b. Log the error and analyze later.

Counters give the number of failed records. However, to get the identifier of a failed record (maybe its unique key), the details of the exception that occurred, and the node on which the error occurred, we need to perform centralized log analysis, and there are many nodes running.
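Option (a) above is the pattern where a mapper skips a bad record and bumps a per-error-type counter instead of failing the whole task; in Java this is `context.getCounter(SomeEnum.VALUE).increment(1)` inside the mapper. The Python sketch below simulates that pattern with an enum and a Counter so the bookkeeping is visible; the enum member names and the tab-separated record format are invented for illustration.

```python
from collections import Counter
from enum import Enum

class RecordError(Enum):
    """Stands in for the Hadoop counter enum (names are hypothetical)."""
    PARSE_FAILED = 1
    MISSING_KEY = 2

def run_mapper(records, counters):
    """Toy mapper: emits (key, value) pairs and counts bad records
    instead of failing the whole task - the counter pattern above."""
    out = []
    for rec in records:
        if not rec.strip():
            counters[RecordError.PARSE_FAILED] += 1   # blank/garbage line
            continue
        parts = rec.split("\t", 1)
        if len(parts) != 2:
            counters[RecordError.MISSING_KEY] += 1    # no key/value separator
            continue
        key, value = parts
        out.append((key, value))
    return out
```

As the question notes, the counters only tell you *how many* records failed per category; recovering *which* record failed on *which* node still requires logging the record id at the point of failure and aggregating logs centrally.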

Are PHP sessions hard to scale across a distributed system?

自闭症网瘾萝莉.ら submitted on 2019-12-21 04:04:08

Question: At work we do almost everything in Java and Perl, but I wanted to build out a feature using PHP and sessions. Some people thought it was a bad idea to try to do PHP sessions on our system, because it's distributed across many servers. What would the specific problem be?

Answer 1: The answer to your specific question of what the problem would be lies in the fact that, by default, PHP stores its sessions in files on the filesystem. For a single webserver serving requests, this is not a problem, because…
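The failure mode is easy to simulate: with PHP's default `files` session handler, each server's session data is local, so a request load-balanced to a different server does not see the session created on the first one. The sketch below models that in Python (the class and method names are invented); the usual fixes are a shared session backend (e.g. pointing `session.save_handler` at Redis or memcached) or sticky sessions at the load balancer.

```python
class WebServer:
    """Toy server whose sessions live in local storage by default,
    like PHP's 'files' session handler. Passing a shared dict models
    pointing all servers at one external session store."""

    def __init__(self, shared_store=None):
        self.local = {}
        self.store = shared_store if shared_store is not None else self.local

    def login(self, session_id, user):
        self.store[session_id] = user      # "write the session"

    def whoami(self, session_id):
        return self.store.get(session_id)  # "read the session"
```

With local stores, a login handled by server 1 is invisible to server 2; with one shared store, any server can serve any request: which is exactly why distributed PHP deployments move sessions out of the local filesystem.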

How to get Filename/File Contents as key/value input for MAP when running a Hadoop MapReduce Job?

故事扮演 submitted on 2019-12-18 05:05:22

Question: I am creating a program to analyze PDF, DOC and DOCX files. These files are stored in HDFS. When I start my MapReduce job, I want the map function to have the filename as key and the binary contents as value. I then want to create a stream reader which I can pass to the PDF parser library. How can I achieve a filename/filecontents key/value pair for the map phase? I am using Hadoop 0.20.2.

This is older code that starts a job:

public static void main(String[] args) throws Exception {…
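The usual Hadoop answer is a custom whole-file input format: subclass `FileInputFormat`, return `false` from `isSplitable()`, and have the `RecordReader` read the entire split into the value. The Python sketch below only mimics the resulting mapper input contract, one (filename, raw bytes) pair per file, so the shape is clear; the function names are invented and this is not Hadoop code.

```python
import os

def whole_file_records(root):
    """Yield (filename, file_bytes) pairs, mimicking what a custom
    non-splittable whole-file InputFormat would feed each map() call."""
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isfile(path):
            with open(path, "rb") as f:
                yield name, f.read()

def map_pdf(filename, contents):
    """Toy mapper: a real one would wrap `contents` in a BytesIO-style
    stream and hand it to the PDF parser library."""
    return filename, len(contents)
```

Returning `false` from `isSplitable()` is the key step: it guarantees one map call sees one complete file, which is required because a PDF parser cannot work on an arbitrary byte-range split of the file.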

Paxos vs Raft for leader election

柔情痞子 submitted on 2019-12-17 23:25:28

Question: After reading the Paxos and Raft papers, I have the following confusion: the Paxos paper only describes consensus on a single log entry, which is equivalent to the leader-election part of the Raft algorithm. What is the advantage of Paxos's approach over the simple randomized-timeout approach in Raft's leader election?

Answer 1: It is a common misconception that the original Paxos papers don't use a stable leader. In "Paxos Made Simple", on page 6, in the section entitled "The Implementation", Lamport wrote: The algorithm…
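For reference, the "simple random timeout approach" in Raft means each follower draws an election timeout at random (the paper suggests 150-300 ms); the node whose timeout fires first becomes a candidate and, if no competitor has started an election yet, wins. The sketch below models only that draw-and-win step; it deliberately ignores terms, split votes, and Raft's log-up-to-date voting check, and the function name is invented.

```python
import random

def elect(node_ids, rng):
    """Toy model of Raft's randomized election timeouts: each follower
    draws a timeout from [150, 300) ms; the first to expire becomes a
    candidate and (with no competing candidate) wins the election."""
    timeouts = {n: 150 + rng.random() * 150 for n in node_ids}
    return min(timeouts, key=timeouts.get)
```

Randomization matters because it makes simultaneous candidacies (and therefore split votes) unlikely; when they do happen, Raft simply retries with fresh random timeouts in the next term.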

What is a distributed queue?

旧城冷巷雨未停 submitted on 2019-12-12 03:56:06

Question: My understanding: a distributed destination is a single logical (not physical) destination to a client which internally contains a set of physical destinations (queues or topics). It helps build scalable applications in terms of high availability (HA) and load balancing (LB). So when I do distributedQueue.put(someObject), the distributed queue will put the object on one of the physical queues and also maintain some metadata recording which object lies on which queue. Now when I do…
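The model in the question can be sketched directly: a logical queue that load-balances `put()` round-robin across its physical member queues and keeps a metadata map from item to member so it can find the item again. This is a toy illustration of the questioner's mental model, with invented names; real brokers (e.g. JMS distributed destinations) differ in how placement and lookup are actually implemented.

```python
from collections import deque
from itertools import cycle

class DistributedQueue:
    """Toy 'one logical queue over N physical queues' model:
    put() load-balances round-robin and records which physical
    queue holds each item, as described in the question."""

    def __init__(self, n_physical):
        self.physical = [deque() for _ in range(n_physical)]
        self._rr = cycle(range(n_physical))
        self.placement = {}            # item id -> physical queue index

    def put(self, item_id, payload):
        idx = next(self._rr)           # pick the next member queue
        self.physical[idx].append((item_id, payload))
        self.placement[item_id] = idx

    def get(self, item_id):
        idx = self.placement.pop(item_id)     # metadata lookup
        q = self.physical[idx]
        for i, (iid, payload) in enumerate(q):
            if iid == item_id:
                del q[i]
                return payload
        raise KeyError(item_id)
```

The metadata map is what lets the logical facade route a retrieval to the right physical queue; HA comes from the fact that losing one member only loses that member's items, and the rest keep serving.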

What problem does the Redis distributed lock solve?

你离开我真会死。 submitted on 2019-12-11 16:47:15

Question: So I just read about Redlock. What I understood is that it needs 3 independent machines to work. By independent they mean that all the machines are masters and there is no replication amongst them, which means they serve different data. So why would I need to lock a key present in three independent Redis instances acting as masters? What are the use cases where I would need to use Redlock?

Answer 1: So why would I need to lock a key present in three independent Redis instances…
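A point the question's premise misses: the key being locked is not application data living on those instances; it is lock metadata, written to all N independent instances so that the lock survives the failure of a minority of them. A Redlock-style acquire succeeds only when a majority of the instances grant it. The sketch below models that quorum step in plain Python with invented class names; it omits the real algorithm's clock-drift safety margin and its release-on-partial-failure step.

```python
import uuid

class LockStore:
    """One independent lock holder (stands in for one Redis master).
    What it stores is lock metadata, not application data."""

    def __init__(self):
        self.held = {}   # resource -> (owner token, expiry time)

    def try_lock(self, resource, token, ttl, now):
        cur = self.held.get(resource)
        if cur is None or cur[1] <= now:          # free, or previous lock expired
            self.held[resource] = (token, now + ttl)
            return True
        return False

def redlock_acquire(stores, resource, ttl, now=0.0):
    """Redlock-style acquire: succeed only if a majority of the
    independent stores grant the lock (sketch of the quorum idea)."""
    token = str(uuid.uuid4())
    granted = sum(s.try_lock(resource, token, ttl, now) for s in stores)
    return token if granted > len(stores) // 2 else None
```

The use case is mutual exclusion that does not hinge on a single Redis node: with one master, a crash (or a failover to a replica that missed the lock write) can hand the same lock to two clients; a majority of independent masters is meant to make that much harder.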

Kafka Connect internals - how connectors and tasks get deployed across the Connect cluster

橙三吉。 submitted on 2019-12-11 15:27:59

Question: I use Kafka Connect for different purposes and it's working fine. This is more of a curiosity question. Trying to figure it out by reading the code might take some time, so I'm asking here (but I'll try to read the Kafka code anyway). I know a Connector is the one responsible for giving/updating configurations for the tasks, but what is it exactly? Is it some piece of code that runs on the Connect cluster? If yes, I imagine a worker initiates it, but does it run arbitrarily on one worker JVM?
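The division of labour can be sketched in two steps: the Connector's `taskConfigs(maxTasks)` method splits the work into at most `maxTasks` task configurations, and the cluster then spreads those tasks (and the Connector instance itself, which does run inside one worker JVM) across the workers. The Python below is a toy model of both steps; the table list, the config keys, and plain round-robin assignment are assumptions for illustration: the real rebalance protocol between workers is considerably more involved.

```python
def task_configs(table_list, max_tasks):
    """Toy version of Connector.taskConfigs(maxTasks): split the work
    (here, a hypothetical list of tables) into at most max_tasks
    task configurations, as a source connector might."""
    n = min(max_tasks, len(table_list))
    buckets = [[] for _ in range(n)]
    for i, table in enumerate(table_list):
        buckets[i % n].append(table)
    return [{"tables": ",".join(b)} for b in buckets]

def assign_round_robin(workers, task_cfgs):
    """Toy stand-in for the cluster's assignment step: spread the
    connector's tasks across the available worker JVMs."""
    assignment = {w: [] for w in workers}
    for i, cfg in enumerate(task_cfgs):
        assignment[workers[i % len(workers)]].append(cfg)
    return assignment
```

So a Connector is ordinary code instantiated on one worker; it does no data copying itself, only planning: the Tasks it configures are the units that actually run, and they can land on any worker in the cluster.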