reliability

How do I detect unexpected worker role failures and reprocess data in those cases?

对着背影说爱祢 posted on 2019-12-20 03:36:07
Question: I want to create a web service hosted in Windows Azure. Clients will upload files for processing; the cloud will process those files and produce result files, which the clients will then download. I guess I'll use web roles for handling HTTP requests, worker roles for the actual processing, and something like Azure Queue or Azure Table Storage for tracking requests. Let's pretend it'll be Azure Table Storage, with one "request" record per user-uploaded file. A major design problem is processing a

Adding Custom WCF header to Endpoint Programmatically for Reliable Sessions

吃可爱长大的小学妹 posted on 2019-12-13 14:01:03
Question: I'm building a WCF router and my client uses Reliable Sessions. In this scenario, when the client opens a channel, a message is sent (establishing a Reliable Session?). Its contents are as follows: <s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing"> <s:Header> <a:Action s:mustUnderstand="1">http://docs.oasis-open.org/ws-rx/wsrm/200702/CreateSequence</a:Action> <a:MessageID>urn:uuid:1758f794-c5d3-4573-b252-7a07344cc257</a:MessageID> <a:To

Is it possible to trash an Azure role host and get it started on the same host without cleanup?

拟墨画扇 posted on 2019-12-13 03:35:17
Question: Suppose my Azure role creates a lot of temporary files in the Windows temporary folder and forgets to delete them. At some point it will receive a "can't create temporary file" error. Suppose that once that happens my role code throws an exception out of RoleEntryPoint.Run() and the role is restarted. I'm not talking about perfect Azure-aware code here. My role might use third-party black-box code that would know nothing about Azure and "local storage" and would just call System.IO.Path.GetTempPath(

What assumptions can I make about global time on Azure?

蓝咒 posted on 2019-12-11 20:39:01
Question: I want my Azure role to reprocess data in case of sudden failures. I am considering the following option. For every block of data to process I have a database table row, and I could add a column meaning "time of last ping from a processing node". When a node grabs a data block for processing, it sets the state to "processing" and that time to "current time", and it is then the node's responsibility to update that time, say, every minute. Then periodically some node will ask for "all blocks that have
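The snippet above is cut off, but the "last ping" idea it describes can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the in-memory dictionary stands in for the database table, and the names `find_stale_blocks`, `requeue`, and the five-minute `STALE_AFTER` threshold are all assumptions. The current time is passed in by the caller, so it can come from a single source (for example the database server) rather than from each node's own clock.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: a block is stale after several missed one-minute pings.
STALE_AFTER = timedelta(minutes=5)


def find_stale_blocks(blocks, now, stale_after=STALE_AFTER):
    """Return ids of blocks in 'processing' state whose last ping is too old.

    `blocks` is a hypothetical stand-in for the table: id -> row dict with
    'state' and 'last_ping' (a timezone-aware datetime).
    """
    return [
        block_id
        for block_id, row in blocks.items()
        if row["state"] == "processing" and now - row["last_ping"] > stale_after
    ]


def requeue(blocks, block_ids):
    """Mark the given blocks as pending so another node can pick them up."""
    for block_id in block_ids:
        blocks[block_id]["state"] = "pending"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    table = {
        "block-1": {"state": "processing", "last_ping": now - timedelta(minutes=10)},
        "block-2": {"state": "processing", "last_ping": now - timedelta(seconds=30)},
    }
    stale = find_stale_blocks(table, now)
    requeue(table, stale)
    print(stale)
```

In a real table the check and the state change would need to happen atomically (for instance in one conditional update), otherwise two nodes could reclaim the same block.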

Is libmcrypt not reliable?

爷,独闯天下 posted on 2019-12-11 13:54:04
Question: A few days ago I posted a question on SO and got no meaningful answer. Here it is in short: I have a client-server program in C that encrypts/decrypts data with the mcrypt C library. The client encrypts the string it wants to send to the server and sends it; the server reads it and then decrypts it. Below are my encrypt and decrypt functions: encrypt function: void encrypt(char *es, char *key, char *civ, size_t length) { MCRYPT td; int n; td = mcrypt_module_open(MCRYPT_TWOFISH, NULL, MCRYPT_CFB,

McDonald's omega: warnings in R

自闭症网瘾萝莉.ら posted on 2019-12-10 08:40:39
Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. I'm computing omega for several different scales, and I get different warning messages for different scales with different omega functions in R. My questions are how to interpret these warnings and whether it is safe to report the retrieved omega statistics. When I'm using the following function from the article "From alpha to omega: A practical solution to the pervasive

Is ActiveMQ reliable?

僤鯓⒐⒋嵵緔 posted on 2019-12-07 23:44:32
Question: We have put ActiveMQ on a fresh server, configured it to use 'kahadb' (the preferred option, from what we read), and set it to allow the file to expand to 2 GB. When we then put load on the queue (roughly 500/sec), ActiveMQ crashes within a few minutes. When ActiveMQ tries to restart, it can't because the db is corrupt: 2010-11-29 13:00:50,359 | ERROR | Failed to start ActiveMQ JMS Message Broker. Reason: java.io.EOFException | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain java.io.EOFException at

How to keep a script running all the time in Linux?

旧城冷巷雨未停 posted on 2019-12-07 20:19:21
Question: I'm trying to run a simple Python script all the time. I want it to start automatically on boot and be able to recover from failures. That is, if a failure causes the script to stop, I don't really care why and just want it to start running again. I'm OK if the whole device restarts. I just tested a working script using init.d, but I am not sure how to recover from a failure. Have a cron job check for the existence of a pid? I'd also like to check for integrity. That is, I'd like to
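One way to picture the "just start it again" requirement is a small supervisor loop. The sketch below is only an illustration of that idea, not a recommendation from the thread: `supervise`, `max_runs`, and `delay` are hypothetical names, and the child command in the usage example is a placeholder. On machines that use systemd, a service unit with `Restart=always` is a common alternative that also covers starting at boot.

```python
import subprocess
import sys
import time


def supervise(cmd, max_runs=None, delay=1.0):
    """Run `cmd` repeatedly, starting it again each time it exits.

    `max_runs` (an assumption added for testing) limits the number of runs;
    None means keep restarting forever. Returns the number of runs performed.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        runs += 1
        # subprocess.run blocks until the child process exits.
        result = subprocess.run(cmd)
        print(f"run {runs} exited with code {result.returncode}", file=sys.stderr)
        # Pause before the next start to avoid a tight crash loop.
        time.sleep(delay)
    return runs


if __name__ == "__main__":
    # Placeholder child: a Python one-liner that fails immediately.
    supervise([sys.executable, "-c", "raise SystemExit(1)"], max_runs=3, delay=0.5)
```

The supervisor itself still has to be started at boot by something else (init.d, cron `@reboot`, or a service manager), so this only addresses the restart-after-failure half of the question.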

How to test reliability of my own (small) embedded operating system?

与世无争的帅哥 posted on 2019-12-07 03:38:50
Question: I've written a small operating system for an embedded project running on small to medium targets. I added automated unit tests with high code coverage (>95%), but their scope is only the static part. I collected some code metrics such as complexity and readability. I check my code with a rule checker with MISRA support and, of course, fixed all warnings. I also run the code through a static analyzer and again fixed all warnings. What can I do now to test - and improve - the reliability of my OS?

Are sockets reliable?

笑着哭i posted on 2019-12-07 01:09:12
Question: Is it a good idea to use sockets to send data between two servers, or should I use something like MQ for moving data? My questions: are sockets reliable if I need once-only/assured delivery of the data? Are there any other solutions? Thanks. Answer 1: Sockets are an application-level API for performing network communication. The reliability of sockets depends on the network protocol that you select when you create the socket. If you select TCP/IP, you will get "reliable" transfer ... up to a
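The answer is cut off here, so the following is only a sketch of one commonly discussed gap: a stream socket by itself does not tell the sender that the receiving application actually got a complete message. The example assumes a simple made-up framing (4-byte length prefix, then a 3-byte `ACK` reply); the function names are hypothetical, and `socket.socketpair()` stands in for a real connection between two servers.

```python
import socket
import struct
import threading


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf += chunk
    return buf


def send_with_ack(sock, payload):
    """Send a length-prefixed message and wait for the receiver's ACK."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)
    return recv_exact(sock, 3) == b"ACK"


def receive_and_ack(sock):
    """Read one length-prefixed message, then acknowledge it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    payload = recv_exact(sock, length)
    sock.sendall(b"ACK")
    return payload


def demo():
    """Exchange one message over a connected socket pair."""
    a, b = socket.socketpair()
    received = []
    t = threading.Thread(target=lambda: received.append(receive_and_ack(b)))
    t.start()
    acked = send_with_ack(a, b"hello")
    t.join()
    a.close()
    b.close()
    return acked, received


if __name__ == "__main__":
    print(demo())
```

An acknowledgement like this gives at-least-once behaviour at best; once-only delivery would additionally need message ids and duplicate detection on the receiver, which is part of what message queue products provide.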