cluster-computing

Run @Scheduled task only on one WebLogic cluster node?

淺唱寂寞╮ submitted on 2019-12-09 08:33:34
Question: We are running a Spring 3.0.x web application (.war) with a nightly @Scheduled job in a clustered WebLogic 10.3.4 environment. However, because the application is deployed to each node (using the deployment wizard in the AdminServer's web console), the job starts on every node each night and therefore runs multiple times concurrently. How can we prevent this from happening? I know that libraries like Quartz allow coordinating jobs inside a clustered environment by means of a database lock table or I

Erlang clusters

大憨熊 submitted on 2019-12-09 04:42:42
Question: I'm trying to implement a cluster using Erlang as the glue that holds it all together. I like the idea that it creates a fully connected graph of nodes, but from reading various articles online it seems this doesn't scale well (topping out at roughly 50–100 nodes). Did the developers of OTP impose this limitation on purpose? I do know that you can set up nodes to have explicit connections only, as well as have hidden nodes, etc. But it seems as though the default out-of-the-box setup

Map Job Performance on cluster

冷暖自知 submitted on 2019-12-08 12:24:36
Question: Suppose I have 15 blocks of data and two clusters. The first cluster has 5 nodes and a replication factor of 1, while the second has a replication factor of 3. If I run my map job, should I expect any change in the performance or execution time of the map job? In other words, how does replication affect the performance of the mapper on a cluster? Answer 1: When the JobTracker assigns a job to a TaskTracker on HDFS, a job is assigned to a particular node based upon locality of data

How can Python see 12 CPUs on a cluster where I was allocated 4 cores by LSF?

你。 submitted on 2019-12-08 07:53:20
Question: I access a Linux cluster where resources are allocated using LSF, which I think is a common tool and comes from Scali (http://www.scali.com/workload-management/high-performance-computing). In an interactive queue, I asked for and got the maximum number of cores: 4. But if I check how many CPUs Python's multiprocessing module sees, the number is 12: the number of physical cores on the node I was allocated to. It looks like the multiprocessing module has problems respecting the bounds that
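A common workaround (a sketch, not from the original post): multiprocessing's default of cpu_count() reports the node's physical cores, while LSF typically exports the size of the actual allocation in the LSB_DJOB_NUMPROC environment variable, so the pool should be bounded explicitly. The variable name is an assumption here; verify it on your cluster with `env | grep LSB`.

```shell
# Sketch: bound the Python worker pool by the LSF allocation rather
# than letting multiprocessing default to cpu_count() (12 here).
# LSB_DJOB_NUMPROC is the variable LSF commonly sets to the number of
# allocated slots; check the exact name on your cluster.
ncores="${LSB_DJOB_NUMPROC:-4}"    # fall back to 4 outside LSF
result=$(python3 - "$ncores" <<'EOF'
import sys
from multiprocessing import Pool

def square(x):
    return x * x

# Pool size comes from the LSF allocation, not os.cpu_count()
with Pool(processes=int(sys.argv[1])) as pool:
    print(pool.map(square, range(4)))
EOF
)
echo "$result"    # [0, 1, 4, 9]
```

The same idea applies to any library that sizes its worker count from the visible CPUs: read the scheduler's allocation from the environment and pass it in explicitly.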

Hadoop, MapReduce: how to add a second node to MapReduce?

柔情痞子 submitted on 2019-12-08 07:23:07
Question: I have a Hadoop 0.2.2 cluster of 2 nodes. On the first machine I start: namenode, datanode, NodeManager, ResourceManager, and JobHistoryServer. On the second I start all of those as well, except for the namenode: datanode, NodeManager, ResourceManager, and JobHistoryServer. My mapred-site.xml on both machines contains: <property> <name>mapred.job.tracker</name> <value>firstMachine:54311</value> </property> My core-site.xml on both machines contains: <property> <name>fs.default.name</name> <value>hdfs://firstMachine
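One observation worth adding: the NodeManager/ResourceManager daemons listed above belong to YARN (Hadoop 2.x), where job submission is routed by the ResourceManager address rather than the Hadoop 1 property mapred.job.tracker. A sketch of the relevant yarn-site.xml entry, assuming firstMachine is meant to run the single ResourceManager (a second ResourceManager on the other node would be unnecessary):

```xml
<!-- yarn-site.xml on both machines (sketch; assumes YARN on Hadoop 2.x) -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>firstMachine</value>
</property>
```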

Wordcount C++ Hadoop pipes does not work

我们两清 submitted on 2019-12-08 06:46:09
Question: I am trying to run the C++ wordcount example as described at this link: Running the WordCount program in C++. The compilation works fine, but when I try to run my program, an error appears: bin/hadoop pipes -conf ../dev/word.xml -input testtile.txt -output wordcount-out 11/06/06 14:23:40 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 11/06/06 14:23:40 INFO mapred.FileInputFormat: Total input
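For context, C++ Pipes jobs are usually launched with the Java record reader/writer flags and a -program argument naming the executable previously uploaded to HDFS. A sketch of the usual invocation (the paths and binary name are illustrative, not from the original post); the command is assembled into a variable and echoed so the snippet can be inspected without a running cluster:

```shell
# Sketch: typical Hadoop Pipes invocation for a C++ wordcount binary.
# The -D flags tell Pipes to use the Java record reader/writer, and
# -program names the executable uploaded to HDFS beforehand, e.g.:
#   bin/hadoop fs -put wordcount bin/wordcount
pipes_cmd="bin/hadoop pipes \
  -D hadoop.pipes.java.recordreader=true \
  -D hadoop.pipes.java.recordwriter=true \
  -input testtile.txt \
  -output wordcount-out \
  -program bin/wordcount"
echo "$pipes_cmd"    # run this from the Hadoop installation root
```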

How to set up qsub to run job2 five seconds (or any desired delay) after job1 is finished?

别等时光非礼了梦想. submitted on 2019-12-08 03:57:57
Question: Currently what I do is estimate when job1 will finish, and then, using the "#PBS -a [myEstimatedTime+5]" directive, I run qsub for job2. But I'm not happy with this approach, since sometimes the estimate is too high or too low. Is there a better solution? Answer 1: Add a time-killing job that runs for five minutes between job1 and job2. The cluster's running order will then be job1 -> job (waiting 5 minutes) -> job2. Answer 2: The best way to do this is through job dependencies. You can submit the jobs: job1id=`qsub
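Answer 2's dependency approach can be sketched as follows, assuming a TORQUE/PBS qsub that prints the new job's id on stdout. The job id and script names are placeholders so the snippet stays runnable without a batch system:

```shell
# Sketch: chain job2 to job1 with -W depend=afterok, so job2 starts
# only after job1 exits successfully -- no start-time guessing needed.
job1id="12345.server"                       # in practice: job1id=$(qsub job1.sh)
depend_arg="-W depend=afterok:${job1id}"    # dependency spec for job2
echo qsub "$depend_arg" job2.sh
# If a literal 5-second gap is required, put `sleep 5` at the top of job2.sh.
```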

How to wait for a torque job array to complete

六眼飞鱼酱① submitted on 2019-12-08 03:12:23
Question: I have a script that splits a data structure into chunks. The chunks are processed using a TORQUE job array and then merged back into a single structure. The merge operation depends on the job array completing. How do I make the merge operation wait for the TORQUE job array to complete? $ qsub --version Version: 4.1.6 My script is as follows: # Splits the data structure and processes the chunks qsub -t 1-100 -l nodes=1:ppn=40,walltime=48:00:00,vmem=120G ./job.sh # Merges the processed
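With TORQUE, the merge job can be chained to the whole array using the afterokarray dependency type. A sketch with a placeholder array id, so it runs without a batch system; on a real cluster the id would come from the `qsub -t` call above:

```shell
# Sketch: -W depend=afterokarray holds the merge job until every task
# in the job array has exited successfully.
arrayid="4567[].server"                          # in practice: arrayid=$(qsub -t 1-100 ... ./job.sh)
depend_arg="-W depend=afterokarray:${arrayid}"   # dependency spec for the merge job
echo qsub "$depend_arg" ./merge.sh
```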

Socket.io 'Handshake' failing with cluster and sticky-session

爷，独闯天下 submitted on 2019-12-08 02:28:22
Question: I am having problems getting the sticky-session socket.io module to work properly with even a simple example. Following the very minimal example given in the readme (https://github.com/indutny/sticky-session), I am just trying to get this example to work: var cluster = require('cluster'); var sticky = require('sticky-session'); var http = require('http'); if (cluster.isMaster) { for (var i = 0; i < 4; i++) { cluster.fork(); } Object.keys(cluster.workers).forEach(function(id) { console.log(

Is it possible to start a multi-physical-node Hadoop cluster using Docker?

萝らか妹 submitted on 2019-12-08 02:16:43
Question: I've been searching for a way to start Docker on multiple physical machines and connect them into a Hadoop cluster; so far I have only found ways to start a cluster locally on one machine. Is there a way to do this? Answer 1: You can very well provision a multi-node Hadoop cluster with Docker. The posts below give some insight into doing it: http://blog.sequenceiq.com/blog/2014/06/19/multinode-hadoop-cluster-on-docker/ Run a hadoop cluster on docker containers Source: https:/