cluster-computing | 易学教程

Stacktrace does not print in Glassfish 4.1 Cluster log

Question: We are doing our first cluster setup on Glassfish (4.1). Application (EAR) level logs (e.g. a printed stacktrace) don't seem to reach the server.log in (GF-dir)/domains//logs/server.log or (GF-dir)/nodes/(node-name)/(instance-name)/server.log. (There is no cluster.log as stated in the documentation.) We didn't change any of the default logging options in logging.properties. The current logs only show cluster- and instance-related information.

Answer 1: I had a similar problem. server.log is not output after MQJMSRA

How multiple executors are managed on the worker nodes with a Spark standalone cluster?

Question: Until now, I have only used Spark on a Hadoop cluster with YARN as the resource manager. In that type of cluster, I know exactly how many executors to run and how the resource management works. However, now that I am trying to use a standalone Spark cluster, I have gotten a little confused. Correct me where I am wrong. From this article, by default, a worker node uses all the memory of the node minus 1 GB. But I understand that by setting SPARK_WORKER_MEMORY, we can use less memory. For

Hadoop client and cluster separation

Question: I am a newbie to Hadoop, and to Linux as well. My professor asked us to separate the Hadoop client and cluster using port mapping or a VPN. I don't understand the meaning of such a separation. Can anybody give me a hint? Now I get the idea of cluster/client separation. I think Hadoop must also be installed on the client machine. When the client submits a Hadoop job, it is submitted to the masters of the cluster. And I have some naive ideas: 1. Create a client machine and install Hadoop. 2. set

Adding a generic service to cluster from powershell

Question: I'm a newbie to clustering and I'm trying to add a generic service to a cluster using PowerShell. I can add it without any issues using the GUI, but for some reason I cannot add it from PowerShell. Following the first example from the documentation for Add-ClusterGenericServiceRole, I've tried the following command:

    Add-ClusterGenericServiceRole -ServiceName "MyService"

This throws the following error: Static network was [network range] was not configured. Please use -StaticAddress to use

Variable sized message in MPI

Question: Is there a library call that would allow sending and receiving variable-sized messages using MPI? A workaround would be to send the data size in the first message and follow it with the actual payload, but I was wondering if there is a convention for combining these two separate messages.

Answer 1: The count provided to MPI_Recv is only an upper bound. MPI_Get_count can be used to find the exact number of items received. Kind of like sockets, I guess.

Answer 2: You could also use MPI_Probe or MPI
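The workaround described in the question (send the size first, then the payload) can be sketched with ordinary sockets, which Answer 1 already draws a comparison to. This is a Python illustration of the framing pattern only, not MPI code; the names `send_message`, `recv_message`, `recv_exact`, and `demo` are hypothetical, and the 4-byte header is an assumed choice.

```python
import socket
import struct

# Assumed header format: one 4-byte unsigned length in network byte order.
HEADER = struct.Struct("!I")


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send the payload size first, then the payload itself."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Read the size header, then read exactly that many payload bytes."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size)


def demo() -> list:
    """Send two messages of different lengths over a local socket pair."""
    sender, receiver = socket.socketpair()
    with sender, receiver:
        send_message(sender, b"short")
        send_message(sender, b"a somewhat longer payload")
        return [recv_message(receiver), recv_message(receiver)]
```

The receiver never needs to know the payload size in advance, which is the property the question asks for; in MPI the answers point to MPI_Get_count and MPI_Probe for the same purpose.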

Compute dissimilarity matrix for large data

Question: I'm trying to compute a dissimilarity matrix based on a big data frame with both numerical and categorical features. When I run the daisy function from the cluster package, I get the error message: Error: cannot allocate vector of size X. In my case X is about 800 GB. Any idea how I can deal with this problem? Additionally, it would be great if someone could help me run the function on parallel cores. Below you can find the function that computes the dissimilarity matrix on the iris
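To see why the allocation grows so quickly, here is a rough Python estimate of the storage a pairwise dissimilarity object needs. It assumes only the lower triangle is stored, one 8-byte double per pair of rows, and ignores any temporary copies made during computation; the function names are hypothetical.

```python
def dist_entries(n_rows: int) -> int:
    """Number of distinct row pairs, i.e. lower-triangle entries: n(n-1)/2."""
    return n_rows * (n_rows - 1) // 2


def dist_bytes(n_rows: int, bytes_per_value: int = 8) -> int:
    """Approximate storage in bytes, assuming one double per row pair."""
    return dist_entries(n_rows) * bytes_per_value


if __name__ == "__main__":
    # iris has 150 rows; larger row counts show the quadratic growth.
    for n in (150, 10_000, 100_000):
        print(n, dist_entries(n), dist_bytes(n))
```

Because the size grows with the square of the row count, multiplying the rows by 10 multiplies the memory by roughly 100, so parallelism alone does not remove the allocation problem.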

Allow foreach workers to register and distribute sub-tasks to other workers

Question: I have R code that uses several foreach workers to perform some tasks in parallel. I am using foreach and doMC for this purpose. I want to let each of the foreach workers recruit some new workers and distribute to them the parts of its code that are parallelizable. The current code looks like:

    require(doMC)
    require(foreach)
    registerDoMC(cores = 8)
    foreach (i = (1:8)) %dopar% {
      <<some code here>>
      for (j in c(1:4)) {
        <<some other code here>>
      }
    }

I am looking for an ideal code that

Using socket.io with Cluster?

Question: I'm curious whether I can use socket.io and Cluster together. I know that cluster lets node.js use multiple cores by running multiple workers. Does that mean that if I use cluster with socket.io, two users connected to two different socket.io instances might be unable to communicate with each other? So would the answer be to not use cluster with socket.io?

Answer 1: Check out dshaw's talk and sample app regarding scaling Socket.IO: https://github.com/dshaw/talks/tree/master/2011-10-jsclub/sample-app Also

R, issue with a Hierarchical clustering after a Multiple correspondence analysis

Question: I want to cluster a dataset (600000 observations), and for each cluster I want to get the principal components. My vectors are composed of one email and 30 qualitative variables. Each qualitative variable has 4 classes: 0, 1, 2 and 3. So the first thing I'm doing is loading the FactoMineR library and my data:

    library(FactoMineR)
    mydata = read.csv("/home/tom/Desktop/ACM/acm.csv")

Then I'm setting my variables as qualitative (I'm excluding the variable 'email' though): for(n in 1:length