cluster-computing

Run @Scheduled task only on one WebLogic cluster node?

淺唱寂寞╮ submitted on 2019-12-09 08:33:34
Question: We are running a Spring 3.0.x web application (.war) with a nightly @Scheduled job in a clustered WebLogic 10.3.4 environment. However, because the application is deployed to each node (using the deployment wizard in the AdminServer's web console), the job starts on every node each night and therefore runs multiple times concurrently. How can we prevent this from happening? I know that libraries like Quartz allow coordinating jobs inside a clustered environment by means of a database lock table or I

Erlang clusters

大憨熊 submitted on 2019-12-09 04:42:42
Question: I'm trying to implement a cluster using Erlang as the glue that holds it all together. I like the idea that it creates a fully connected graph of nodes, but from reading various articles online it seems this doesn't scale well (topping out at roughly 50–100 nodes). Did the developers of OTP impose this limitation on purpose? I do know that you can set up nodes to have explicit connections only, as well as have hidden nodes, etc. But it seems as though the default out-of-the-box setup

Map Job Performance on cluster

冷暖自知 submitted on 2019-12-08 12:24:36
Question: Suppose I have 15 blocks of data and two clusters. The first cluster has 5 nodes and a replication factor of 1, while the second has a replication factor of 3. If I run my map job, should I expect any change in the performance or execution time of the map job? In other words, how does replication affect the performance of the mapper on a cluster? Answer 1: When the JobTracker assigns a job to a TaskTracker on HDFS, a job is assigned to a particular node based upon locality of data

How can Python see 12 CPUs on a cluster where I was allocated 4 cores by LSF?

你。 submitted on 2019-12-08 07:53:20
Question: I access a Linux cluster where resources are allocated using LSF, which I think is a common tool and comes from Scali (http://www.scali.com/workload-management/high-performance-computing). In an interactive queue, I asked for and got the maximum number of cores: 4. But if I check how many CPUs Python's multiprocessing module sees, the number is 12: the number of physical cores on the node I was allocated to. It looks like the multiprocessing module has problems respecting the bounds that
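A common workaround (a sketch, not from the original post): multiprocessing's default of cpu_count() reports the node's physical cores, while LSF typically exports the size of the actual allocation in the LSB_DJOB_NUMPROC environment variable, so the pool should be bounded explicitly. The variable name is an assumption here; verify it on your cluster with `env | grep LSB`.

```shell
# Sketch: bound the Python worker pool by the LSF allocation rather
# than letting multiprocessing default to cpu_count() (12 here).
# LSB_DJOB_NUMPROC is the variable LSF commonly sets to the number of
# allocated slots; check the exact name on your cluster.
ncores="${LSB_DJOB_NUMPROC:-4}"    # fall back to 4 outside LSF
result=$(python3 - "$ncores" <<'EOF'
import sys
from multiprocessing import Pool

def square(x):
    return x * x

# Pool size comes from the LSF allocation, not os.cpu_count()
with Pool(processes=int(sys.argv[1])) as pool:
    print(pool.map(square, range(4)))
EOF
)
echo "$result"    # [0, 1, 4, 9]
```

The same idea applies to any library that sizes its worker count from the visible CPUs: read the scheduler's allocation from the environment and pass it in explicitly.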

Hadoop, MapReduce: how to add a second node to MapReduce?

柔情痞子 submitted on 2019-12-08 07:23:07
Question: I have a Hadoop 0.2.2 cluster of 2 nodes. On the first machine I start: namenode, datanode, NodeManager, ResourceManager, and JobHistoryServer. On the second I start all of those as well, except for the namenode: datanode, NodeManager, ResourceManager, and JobHistoryServer. My mapred-site.xml on both machines contains: <property> <name>mapred.job.tracker</name> <value>firstMachine:54311</value> </property> My core-site.xml on both machines contains: <property> <name>fs.default.name</name> <value>hdfs://firstMachine
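One observation worth adding: the NodeManager/ResourceManager daemons listed above belong to YARN (Hadoop 2.x), where job submission is routed by the ResourceManager address rather than the Hadoop 1 property mapred.job.tracker. A sketch of the relevant yarn-site.xml entry, assuming firstMachine is meant to run the single ResourceManager (a second ResourceManager on the other node would be unnecessary):

```xml
<!-- yarn-site.xml on both machines (sketch; assumes YARN on Hadoop 2.x) -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>firstMachine</value>
</property>
```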

Wordcount C++ Hadoop pipes does not work

我们两清 submitted on 2019-12-08 06:46:09
Question: I am trying to run the C++ wordcount example as described at this link: Running the WordCount program in C++. The compilation works fine, but when I try to run my program, an error appears: bin/hadoop pipes -conf ../dev/word.xml -input testtile.txt -output wordcount-out 11/06/06 14:23:40 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 11/06/06 14:23:40 INFO mapred.FileInputFormat: Total input
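For context, C++ Pipes jobs are usually launched with the Java record reader/writer flags and a -program argument naming the executable previously uploaded to HDFS. A sketch of the usual invocation (the paths and binary name are illustrative, not from the original post); the command is assembled into a variable and echoed so the snippet can be inspected without a running cluster:

```shell
# Sketch: typical Hadoop Pipes invocation for a C++ wordcount binary.
# The -D flags tell Pipes to use the Java record reader/writer, and
# -program names the executable uploaded to HDFS beforehand, e.g.:
#   bin/hadoop fs -put wordcount bin/wordcount
pipes_cmd="bin/hadoop pipes \
  -D hadoop.pipes.java.recordreader=true \
  -D hadoop.pipes.java.recordwriter=true \
  -input testtile.txt \
  -output wordcount-out \
  -program bin/wordcount"
echo "$pipes_cmd"    # run this from the Hadoop installation root
```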

How to set up qsub to run job2 five seconds (or any desired delay) after job1 is finished?

别等时光非礼了梦想. submitted on 2019-12-08 03:57:57
Question: Currently what I do is estimate when job1 will finish, and then, using the "#PBS -a [myEstimatedTime+5]" directive, I run qsub for job2. But I'm not happy with this approach, since sometimes the estimate is too high or too low. Is there a better solution? Answer 1: Add a time-killing job that runs for five minutes between job1 and job2. The cluster's running order will then be job1 -> job (waiting 5 minutes) -> job2. Answer 2: The best way to do this is through job dependencies. You can submit the jobs: job1id=`qsub
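Answer 2's dependency approach can be sketched as follows, assuming a TORQUE/PBS qsub that prints the new job's id on stdout. The job id and script names are placeholders so the snippet stays runnable without a batch system:

```shell
# Sketch: chain job2 to job1 with -W depend=afterok, so job2 starts
# only after job1 exits successfully -- no start-time guessing needed.
job1id="12345.server"                       # in practice: job1id=$(qsub job1.sh)
depend_arg="-W depend=afterok:${job1id}"    # dependency spec for job2
echo qsub "$depend_arg" job2.sh
# If a literal 5-second gap is required, put `sleep 5` at the top of job2.sh.
```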

How to wait for a torque job array to complete

六眼飞鱼酱① submitted on 2019-12-08 03:12:23
Question: I have a script that splits a data structure into chunks. The chunks are processed using a TORQUE job array and then merged back into a single structure. The merge operation depends on the job array completing. How do I make the merge operation wait for the TORQUE job array to complete? $ qsub --version Version: 4.1.6 My script is as follows: # Splits the data structure and processes the chunks qsub -t 1-100 -l nodes=1:ppn=40,walltime=48:00:00,vmem=120G ./job.sh # Merges the processed
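With TORQUE, the merge job can be chained to the whole array using the afterokarray dependency type. A sketch with a placeholder array id, so it runs without a batch system; on a real cluster the id would come from the `qsub -t` call above:

```shell
# Sketch: -W depend=afterokarray holds the merge job until every task
# in the job array has exited successfully.
arrayid="4567[].server"                          # in practice: arrayid=$(qsub -t 1-100 ... ./job.sh)
depend_arg="-W depend=afterokarray:${arrayid}"   # dependency spec for the merge job
echo qsub "$depend_arg" ./merge.sh
```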

Socket.io 'Handshake' failing with cluster and sticky-session

爷，独闯天下 submitted on 2019-12-08 02:28:22
Question: I am having problems getting the sticky-session socket.io module to work properly with even a simple example. Following the very minimal example given in the readme (https://github.com/indutny/sticky-session), I am just trying to get this example to work: var cluster = require('cluster'); var sticky = require('sticky-session'); var http = require('http'); if (cluster.isMaster) { for (var i = 0; i < 4; i++) { cluster.fork(); } Object.keys(cluster.workers).forEach(function(id) { console.log(

Is it possible to start a multi-physical-node Hadoop cluster using Docker?

萝らか妹 submitted on 2019-12-08 02:16:43
Question: I've been searching for a way to start Docker on multiple physical machines and connect them into a Hadoop cluster; so far I have only found ways to start a cluster locally on one machine. Is there a way to do this? Answer 1: You can very well provision a multi-node Hadoop cluster with Docker. The posts below give some insight into doing it: http://blog.sequenceiq.com/blog/2014/06/19/multinode-hadoop-cluster-on-docker/ Run a hadoop cluster on docker containers Source: https:/