distributed

Job queue with job affinity

Submitted by 故事扮演 on 2021-02-18 10:08:31
Question: I am currently facing a problem for which I am pretty sure there is an official name, but I don't know what to search the web for. I hope that if I describe the problem and the solution I have in mind, somebody can tell me the name of the design pattern (if there is one that matches what I am going to describe). Basically, what I want is a job queue: I have multiple clients that create jobs (publishers), and a number of workers that process these jobs (consumers). Now I want to
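The "jobs with the same key always go to the same worker" requirement is usually called job affinity or sticky routing, and a common way to get it is to hash an affinity key onto the worker set. A minimal Python sketch (the key and worker names are made up for illustration):

```python
import hashlib

def pick_worker(affinity_key: str, workers: list) -> str:
    """Map a job's affinity key to a worker deterministically.

    Jobs sharing an affinity key always land on the same worker,
    which is the essence of queue-level job affinity.
    """
    digest = hashlib.sha256(affinity_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    return workers[index]

workers = ["worker-a", "worker-b", "worker-c"]
# Same key, same worker, every time:
assert pick_worker("customer-42", workers) == pick_worker("customer-42", workers)
```

A production queue would typically layer consistent hashing on top of this so that adding or removing a worker only remaps a fraction of the keys.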

Distributed Transaction on mysql

Submitted by 扶醉桌前 on 2021-02-11 17:31:48
Question: I'm working on a distributed system that uses distributed transactions, which means I may have a transaction that needs to edit multiple databases (on multiple servers) at the same time. In my system there is a controller to manage the distribution. The scenario I want to satisfy is: server A wants to initiate a distributed transaction. The participants are server A and server B. So server A sends a request to the controller to initiate a distributed transaction. The controller
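What the question describes is essentially two-phase commit, which MySQL exposes through the XA statements (XA START / XA PREPARE / XA COMMIT / XA ROLLBACK). A self-contained sketch of the coordinator logic, with in-memory objects standing in for the database servers (no real MySQL involved):

```python
class Participant:
    """In-memory stand-in for one database server in a 2PC round."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. On a real MySQL node this would be XA END + XA PREPARE.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2: XA COMMIT on a real node.
        self.state = "committed"

    def rollback(self):
        # Phase 2 on failure: XA ROLLBACK on a real node.
        self.state = "rolled_back"

def two_phase_commit(participants):
    """Coordinator: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # collect all votes first
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

The controller in the question plays the coordinator role: it must persist its commit/abort decision before phase 2 so it can recover if it crashes between the two phases.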

create index using openquery

Submitted by 一世执手 on 2021-02-10 18:44:24
Question: How do I create an index on a table that exists in a remote SQL Server database using the openquery syntax? Answer 1: You can't on your side. An index can only be added to a local object. You can't use an indexed view either. You can ask the other party to add an index to their table for you... Edit: Expanding John's answer... You could try: SELECT * FROM OPENQUERY(LinkedServer, 'CREATE INDEX etc;SELECT 0 AS foobar') Answer 2: I'm not certain, however I suspect that this cannot be done. OPENQUERY is

how to store worker-local variables in dask/distributed

Submitted by 不羁的心 on 2021-02-07 13:12:23
Question: Using dask 0.15.0, distributed 1.17.1. I want to memoize some things per worker, like a client to access Google Cloud Storage, because instantiating it is expensive. I'd rather store this in some kind of worker attribute. What is the canonical way to do this? Or are globals the way to go? Answer 1: On the worker: you can get access to the local worker with the get_worker function. A slightly cleaner approach than globals is to attach state to the worker: from dask.distributed import get_worker

Java framework/tool for simple distributed computing problem

Submitted by 北城以北 on 2021-02-05 20:12:35
Question: We generate PDF files with data regarding the monthly financial balance of tens of thousands of clients. At its peak (100,000 files at the end of the year), the process may take as long as 5 days to complete, even distributing the load between 5 servers. The distribution of workload is a manual process (e.g. server 1 generates PDFs for clients 1 to 20,000, server 2 from 20,001 to 40,000, and so on). We use Java, so we would like to use a Java tool or framework in a fashion similar to BOINC (BOINC is not
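The manual split described above is plain contiguous range partitioning. A short sketch of that arithmetic (in Python for brevity, though the question is about Java) makes clear what the operators are currently computing by hand; a framework with a shared work queue would replace this static split with dynamic pulling:

```python
def partition(n_clients, n_servers):
    """Split client ids 1..n_clients into n_servers contiguous ranges,
    mirroring the manual scheme 'server 1 gets 1-20,000, ...'.
    Earlier servers absorb any remainder."""
    base, extra = divmod(n_clients, n_servers)
    ranges, start = [], 1
    for i in range(n_servers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

# The setup from the question: 100,000 clients over 5 servers.
assert partition(100_000, 5)[0] == (1, 20_000)
```

The weakness of the static split is exactly what the question hints at: if one server is slower or one range is heavier, the others sit idle, which a pull-based job queue avoids.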

What is the reason to use parameter server in distributed tensorflow learning?

Submitted by 可紊 on 2021-02-05 13:20:27
Question: Short version: can't we store variables in one of the workers and not use parameter servers? Long version: I want to implement synchronous distributed learning of a neural network in TensorFlow. I want each worker to have a full copy of the model during training. I've read the distributed TensorFlow tutorial and the code for distributed ImageNet training, and didn't get why we need parameter servers. I see that they are used for storing the values of variables, and replica_device_setter takes care that
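Conceptually, a parameter server is just a process that owns the shared variables: workers push gradients to it and pull updated values from it. A toy, framework-free sketch of one synchronous training step (plain SGD with made-up numbers, not TensorFlow's actual API) shows the division of labor the question is asking about:

```python
class ParameterServer:
    """Owns the shared model parameters; workers push gradients
    and pull fresh values. In TF1 this role is played by ps tasks
    that replica_device_setter places variables on."""

    def __init__(self, init):
        self.params = list(init)

    def apply_gradients(self, grads, lr=0.1):
        # One SGD update: p <- p - lr * g
        self.params = [p - lr * g for p, g in zip(self.params, grads)]

    def pull(self):
        return list(self.params)

def synchronous_step(ps, worker_grads, lr=0.1):
    """Synchronous training: average the gradients from all workers,
    apply the update once, and hand every worker the same new params."""
    n = len(worker_grads)
    avg = [sum(gs) / n for gs in zip(*worker_grads)]
    ps.apply_gradients(avg, lr)
    return ps.pull()
```

Nothing stops one of the workers from playing this role (that is essentially the all-reduce alternative); the PS split mainly helps with memory and network fan-in when variables are large and workers are many.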

Solr Search Across Multiple Cores

Submitted by 一曲冷凌霜 on 2021-01-29 02:21:50
Question: I have two Solr cores. Core0 imports data from an Oracle table called items. Each item has a unique id (item_id) and is either a video item or an audio item (item_type). Other fields contain searchable text (description, comments, etc.). Core1 imports data from two tables (from a different database) called video_item_dates and audio_item_dates which record occurrence dates of an item in a specific market. The fields are item_id, item_market and dates. A single row would look like (item_001,
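One mechanism Solr offers for querying several cores in a single request is distributed search via the shards parameter, which assumes the cores share a compatible schema. A sketch that only builds the request URL (the host and core names are placeholders, and no Solr server is contacted):

```python
from urllib.parse import urlencode

def shards_query(base, cores, q):
    """Build a Solr distributed-search URL fanning out over cores.
    'base' is host:port/solr; Solr's shards parameter requires the
    cores to have compatible schemas."""
    shards = ",".join(f"{base}/{c}" for c in cores)
    params = urlencode({"q": q, "shards": shards})
    # Any core can receive the request; it fans out to all shards.
    return f"http://{base}/{cores[0]}/select?{params}"

url = shards_query("localhost:8983/solr", ["core0", "core1"], "item_id:item_001")
```

When the schemas differ (as they likely do here, with items versus date records), a cross-core join with Solr's join query parser, or flattening both sources into one core at index time, is usually the more practical route.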