cluster-computing | 易学教程

Determine asymmetric latencies in a network

Imagine you have many clustered servers, across many hosts, in a heterogeneous network environment, such that the connections between servers may have wildly varying latencies and bandwidth. You want to build a map of the connections between servers by transferring data between them. Of course, this map may become stale over time as the network topology changes, but let's ignore those complexities for now and assume the network is relatively static. Given the latencies between nodes in this host graph, calculating the bandwidth is a relatively simple timing exercise. I'm having more difficulty
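The timing exercise mentioned in the question could look roughly like the sketch below, assuming the latency of the link is already known. The function name `estimate_bandwidth` and its parameters are hypothetical, and the model used (elapsed time = latency + payload size / bandwidth) is a simplification that ignores TCP slow start, congestion, and retransmissions.

```python
def estimate_bandwidth(payload_bytes, elapsed_s, latency_s):
    """Estimate link bandwidth in bytes per second.

    Assumes elapsed_s = latency_s + payload_bytes / bandwidth, which is a
    simplified model and not a measurement method from the original post.
    """
    # Time spent actually moving the payload, with the latency removed.
    transfer_s = elapsed_s - latency_s
    if transfer_s <= 0:
        raise ValueError("elapsed time must exceed the latency")
    return payload_bytes / transfer_s
```

Larger payloads make the latency term matter less, so the estimate tends to be less sensitive to latency errors when the transfer is long relative to the round trip.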

Run a hadoop cluster on docker containers

I want to run a multi-node Hadoop cluster, with each node inside a Docker container on a different host. This image, https://github.com/sequenceiq/hadoop-docker, works well to start Hadoop in pseudo-distributed mode. What is the easiest way to modify it so that each node runs in a different container on a separate EC2 host? I did this with two containers running master and slave nodes on two different Ubuntu hosts. I did the networking between containers using weave. I have added the images of the containers to the Docker Hub account div4. I installed Hadoop in the same way as it's installed on

My Spark Worker cannot connect to Master. Something wrong with Akka?

I want to install Spark in Standalone mode on a cluster of my two virtual machines. With spark-0.9.1-bin-hadoop1, I can run spark-shell successfully on each VM. I followed the official documentation to make one VM (ip: xx.xx.xx.223) both Master and Worker, and the other (ip: xx.xx.xx.224) a Worker only. But the 224 VM cannot connect to the 223 VM. The following is the master log on 223 (Master): [@tc-52-223 logs]# tail -100f spark-root-org.apache.spark.deploy.master.Master-1-tc-52-223.out Spark Command: /usr/local/jdk/bin/java -cp :/data/test/spark-0.9.1-bin-hadoop1/conf:/data/test/spark-0

Is there a way to add nodes to a running Hadoop cluster?

Question: I have been playing with Cloudera, and I define the number of clusters before I start my job, then use Cloudera Manager to make sure everything is running. I'm working on a new project that, instead of using Hadoop, uses message queues to distribute the work, but the results of the work are stored in HBase. I might launch 10 servers to process the job and store to HBase, but I'm wondering: if I later decide to add a few more worker nodes, can I easily (read: programmatically) make them

Running TensorFlow on a Slurm Cluster?

I could get access to a computing cluster, specifically one node with two 12-core CPUs, which is running the Slurm Workload Manager. I would like to run TensorFlow on that system, but unfortunately I was not able to find any information about how to do this or whether it is even possible. I am new to this, but as far as I understand it, I would have to run TensorFlow by creating a Slurm job and cannot directly execute python/tensorflow via ssh. Does anyone have an idea, tutorial or any kind of source on this topic? It's relatively simple. Under the simplifying assumptions that you request one process

Python “FileExists” error when making directory

I have several threads running in parallel from Python on a cluster system. Each Python thread outputs to a directory mydir. Each script, before outputting, checks whether mydir exists and, if not, creates it: if not os.path.isdir(mydir): os.makedirs(mydir) but this yields the error: os.makedirs(self.log_dir) File "/usr/lib/python2.6/os.py", line 157, in makedirs mkdir(name,mode) OSError: [Errno 17] File exists I suspect it might be due to a race condition, where one job creates the dir before the other gets to it. Is this possible? If so, how can this error be avoided? I'm not sure it's a race
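If the error does come from two jobs creating the directory at nearly the same moment, one common way to tolerate it is to attempt the creation unconditionally and ignore only the "already exists" case. The sketch below is an illustration under that assumption, not the answer from the original thread; the helper name `ensure_dir` is hypothetical.

```python
import errno
import os


def ensure_dir(path):
    """Create path if needed, tolerating concurrent creation elsewhere."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Another thread or job may have created the directory between an
        # existence check and this call; only that case is swallowed.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

This form also runs on the Python 2.6 shown in the traceback. On Python 3.2 and later, `os.makedirs(path, exist_ok=True)` covers the same case.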

Tomcat's Clustering / Session Replication not replicating properly

I'm setting up clustering/replication on Tomcat 7 on my local machine, to evaluate it for use with my environment/codebase. Setup: I have two identical Tomcat servers in sibling directories running on different ports. I have httpd listening on two other ports and connecting to the two Tomcat instances as VirtualHosts. I can access and interact with both environments on the configured ports; everything is working as expected. The Tomcat servers have clustering enabled like this, in server.xml: <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="8"> <Manager

How does weblogic clustering work?

Question: I'm new to WebLogic. I've read http://download.oracle.com/docs/cd/E11035_01/wls100/cluster/overview.html and searched this topic on the internet, but I still have a hard time understanding some of WebLogic's clustering concepts. Can anybody confirm/correct my understanding below? A cluster contains one or more logical servers, which can reside on one or many physical servers; when deploying a J2EE app to a cluster, it is tied to one server in that cluster; external users of the deployed app aren't

How do I use Node.js clusters with my simple Express app?

Question: I built a simple app that pulls in data (50 items) from a Redis DB and throws it up at localhost. I ran ApacheBench (c = 100, n = 50000) and I'm getting a semi-decent 150 requests/sec on a dual-core T2080 @ 1.73GHz (my 6-year-old laptop), but the processor usage is very disappointing, as shown: Only one core is used, which is by design in Node, but I think I can nearly double my requests/sec to ~300, maybe even more, if I can use Node.js clusters. I fiddled around quite a bit but I haven't been

Data-driven cluster colour with mapboxgl

I am trying to draw circles whose colour depends on a "group" attribute in my GeoJSON. I followed a simple example with these colours: map.addSource("data", { type: "geojson", data: url, cluster: true, clusterMaxZoom: 12, // Max zoom to cluster points on clusterRadius: 20 // Radius of each cluster when clustering points (defaults to 50) }); map.addLayer({ 'id': 'population', 'type': 'circle', 'source': 'data', 'paint': { // make circles larger as the user zooms from z12 to z22 'circle-radius': { 'base': 1.75, 'stops': [[12, 2], [22, 180]] }, // color circles by ethnicity, using a match