reduction

Lambda Calculus reduction

不问归期 submitted on 2019-11-30 00:27:06
All, below is a lambda expression that I am finding difficult to reduce, i.e. I am not able to understand how to go about this problem: (λm λn λa λb. m (n a b) b) (λf x. x) (λf x. f x). This is what I tried, but I am stuck. Considering the above expression as (λm.E) M equates to E = (λn λa λb. m (n a b) b), M = (λf x. x)(λf x. f x) => (λn λa λb. (λf x. x) (λf x. f x) (n a b) b). Considering that expression as (λn.E) M equates to E = (λa λb. (λf x. x) (λf x. f x) (n a b) b), M = ?? ... and I am lost!! Can anyone please help me understand, for ANY lambda calculus expression, what…
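For reference, here is a worked normal-order β-reduction of the expression (a sketch; each step substitutes one argument for the outermost bound variable — note the first argument replaces only m, not m together with n):

    (λm λn λa λb. m (n a b) b) (λf x. x) (λf x. f x)
    → (λn λa λb. (λf x. x) (n a b) b) (λf x. f x)    [m := λf x. x]
    → λa λb. (λf x. x) ((λf x. f x) a b) b           [n := λf x. f x]
    → λa λb. (λx. x) b                               [λf x. x discards its first argument]
    → λa λb. b                                       [the identity returns b]

The result λa λb. b is the Church encoding of false. That is consistent with reading the combinator λm λn λa λb. m (n a b) b as boolean AND: its first argument here, λf x. x, is Church false, and AND of false with anything is false.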

General rules for simplifying SQL statements

假如想象 submitted on 2019-11-29 18:42:10
I'm looking for some "inference rules" (similar to set-operation rules or logic rules) that I can use to reduce a SQL query in complexity or size. Does anything like that exist? Any papers, any tools? Any equivalences that you found on your own? It's somewhat similar to query optimization, but not in terms of performance. To state it differently: given a (complex) query with JOINs, SUBSELECTs, and UNIONs, is it possible (or not) to reduce it to a simpler, equivalent SQL statement that produces the same result, by applying transformation rules? So, I'm looking for equivalent…
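One example of such an equivalence rule is predicate pushdown through UNION ALL: a filter applied outside a union can be moved into each branch without changing the result (the table and column names below are illustrative):

    -- filter outside the union ...
    SELECT id, amount
    FROM (SELECT id, amount FROM orders_2022
          UNION ALL
          SELECT id, amount FROM orders_2023) o
    WHERE o.amount > 100;

    -- ... is equivalent to filtering each branch
    SELECT id, amount FROM orders_2022 WHERE amount > 100
    UNION ALL
    SELECT id, amount FROM orders_2023 WHERE amount > 100;

Rules of this kind (predicate pushdown, subquery unnesting, join elimination over foreign keys) are what cost-based optimizers apply internally; the relational-algebra equivalences behind them are a good search term for papers.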

Is it possible to do a reduction on an array with openmp?

ぐ巨炮叔叔 submitted on 2019-11-29 11:15:55
Question: Does OpenMP natively support reduction of a variable that represents an array? This would work something like the following... float* a = (float*) calloc(4, sizeof(float)); omp_set_num_threads(13); #pragma omp parallel reduction(+:a) for(i=0;i<4;i++){ a[i] += 1; // thread-local copy of a incremented by something interesting } // a now contains [13 13 13 13] Ideally, there would be something similar for an omp parallel for, and if you have a large enough number of threads for it to make sense,…
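Array reductions are in fact supported natively since OpenMP 4.5, via array sections. A minimal sketch of the scenario above (C; compile with -fopenmp on a compiler with OpenMP 4.5 support, e.g. GCC 6+):

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    int main(void) {
        float *a = calloc(4, sizeof(float));
        omp_set_num_threads(13);
        /* OpenMP 4.5+ array-section reduction: each thread gets a
           zero-initialized private copy of a[0..3]; the copies are
           combined elementwise with + when the region ends. */
        #pragma omp parallel reduction(+:a[:4])
        {
            for (int i = 0; i < 4; i++)
                a[i] += 1.0f;
        }
        for (int i = 0; i < 4; i++)
            printf("%.0f ", a[i]);   /* prints: 13 13 13 13 */
        printf("\n");
        free(a);
        return 0;
    }

The same reduction(+:a[:4]) clause works on #pragma omp parallel for as well; with many threads and a large array, the per-thread private copies become the main cost to watch.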

CUDA: In warp reduction and volatile keyword

早过忘川 submitted on 2019-11-28 12:49:30
After reading the question and its answer from the following LINK I still have a question remaining in my mind. From my background in C/C++, I understand that using volatile has its demerits. The answers there also point out that in CUDA, optimizations can replace a shared array with registers to hold the data if the volatile keyword is not used. I want to know what performance issues can be encountered when calculating a (sum) reduction, e.g. __device__ void sum(volatile int *s_data, int tid) { if (tid < 16) { s_data[tid] += s_data[tid + 16]; s_data[tid] += s_data[tid…
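For context, the complete classic unrolled warp reduction that this snippet is building toward looks like the sketch below (it assumes blockDim.x >= 32 and relies on the pre-Volta warp-synchronous execution model):

    __device__ void warp_reduce(volatile int *s_data, int tid) {
        /* volatile forces every read and write to actually touch shared
           memory, so each thread observes its neighbours' latest stores
           without an intervening __syncthreads() */
        if (tid < 16) {
            s_data[tid] += s_data[tid + 16];
            s_data[tid] += s_data[tid + 8];
            s_data[tid] += s_data[tid + 4];
            s_data[tid] += s_data[tid + 2];
            s_data[tid] += s_data[tid + 1];
        }
    }

The performance cost of volatile is exactly that extra shared-memory traffic: without it, the compiler may cache s_data[tid] in a register across the five statements, which is faster but breaks correctness here because other threads' updates would go unseen. On Volta and newer architectures, independent thread scheduling makes the volatile-only pattern unsafe regardless; current code should use __syncwarp() between steps or warp shuffles (__shfl_down_sync) instead.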

CUDA Thrust: reduce_by_key on only some values in an array, based off values in a “key” array

岁酱吖の submitted on 2019-11-28 08:50:22
Let's say I have two device_vector<byte> arrays, d_keys and d_data. If d_data is, for example, a flattened 2D 3x5 array (e.g. { 1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 7, 6, 5, 4, 3 }) and d_keys is a 1D array of size 5 (e.g. { 1, 0, 0, 1, 1 }), how can I do a reduction such that I end up adding values on a per-row basis only if the corresponding d_keys value is one (e.g. ending up with a result of { 10, 23, 14 })? The sum_rows.cu example allows me to add every value in d_data, but that's not quite right. Alternatively, I can, on a per-row basis, use a zip_iterator and combine d_keys with one…
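One way to get there without per-row passes: mask the data through a transform_iterator, so each element contributes either its value or 0 depending on its column's key, and segment by row index with reduce_by_key. A sketch under those assumptions, using int instead of byte for clarity (the functor names are mine, not Thrust's):

    #include <thrust/device_vector.h>
    #include <thrust/iterator/counting_iterator.h>
    #include <thrust/iterator/transform_iterator.h>
    #include <thrust/reduce.h>
    #include <cstdio>

    struct masked_fetch {   // flat index -> data[i] if its column is keyed, else 0
        const int *data, *keys;
        int cols;
        __host__ __device__ int operator()(int i) const {
            return keys[i % cols] ? data[i] : 0;
        }
    };

    struct row_of {         // flat index -> row number
        int cols;
        __host__ __device__ int operator()(int i) const { return i / cols; }
    };

    int main() {
        const int rows = 3, cols = 5;
        const int h_data[] = {1,2,3,4,5, 6,7,8,9,8, 7,6,5,4,3};
        const int h_keys[] = {1,0,0,1,1};
        thrust::device_vector<int> d_data(h_data, h_data + rows * cols);
        thrust::device_vector<int> d_keys(h_keys, h_keys + cols);
        thrust::device_vector<int> out_rows(rows), out_sums(rows);

        masked_fetch fetch{thrust::raw_pointer_cast(d_data.data()),
                           thrust::raw_pointer_cast(d_keys.data()), cols};
        row_of row{cols};
        thrust::counting_iterator<int> idx(0);

        thrust::reduce_by_key(
            thrust::make_transform_iterator(idx, row),               // segment by row
            thrust::make_transform_iterator(idx + rows * cols, row),
            thrust::make_transform_iterator(idx, fetch),             // masked values
            out_rows.begin(), out_sums.begin());

        for (int r = 0; r < rows; ++r)
            printf("%d ", (int)out_sums[r]);   // prints: 10 23 14
        return 0;
    }

Because the row-index key sequence is 0,0,0,0,0,1,1,... equal keys sit in consecutive positions, forming exactly one segment per row — which is the precondition reduce_by_key needs.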

Class Scheduling to Boolean satisfiability [Polynomial-time reduction]

為{幸葍}努か submitted on 2019-11-28 03:12:04
I have a theoretical/practical problem and I have no clue, for now, on how to manage it. Here it is: I created a SAT solver in C, using genetic algorithms, able to find a model when one exists and to prove the contradiction when that is not the case, on CNF problems. A SAT problem basically looks like this kind of problem: My goal is to use this solver to find solutions to a lot of different NP-complete problems. Basically, I translate different problems into SAT, solve SAT with my solver, and then transform the solution into a solution acceptable for the original problem. I already succeeded…
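For the scheduling-to-SAT direction, the standard gadget is an "exactly one" constraint. As an illustrative sketch (the variable names are mine): if x1, x2, x3 mean "class C is placed in timeslot 1/2/3", then "C gets exactly one slot" becomes one at-least-one clause plus pairwise at-most-one clauses:

    (x1 ∨ x2 ∨ x3)        at least one slot
    (¬x1 ∨ ¬x2)           at most one slot (pairwise)
    (¬x1 ∨ ¬x3)
    (¬x2 ∨ ¬x3)

A conflict such as "classes C and D share a teacher" then adds (¬xC,k ∨ ¬xD,k) for every timeslot k, and a satisfying assignment maps directly back to a timetable. Note that the pairwise encoding grows quadratically in the number of slots; sequential-counter or commander encodings are the usual remedy for larger instances.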

How to perform reduction on a huge 2D matrix along the row direction using cuda? (max value and max value's index for each row)

ぐ巨炮叔叔 submitted on 2019-11-28 00:36:14
I'm trying to implement a reduction along the row direction of a 2D matrix. I'm starting from a code I found on Stack Overflow (thanks a lot Robert!): thrust::max_element slow in comparison cublasIsamax - More efficient implementation? The above link shows a custom kernel that performs a reduction on a single row. It divides the input row into many rows, and each row has 1024 threads. It works very well. For the 2D case, everything is the same except that there is now a y grid dimension, so each block's y dimension is still 1. The problem is that when I try to write data onto the shared memory within…
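A common layout for the 2D case is one block per matrix row, selected by blockIdx.y; the pitfall is that blockIdx.y must appear only in the global-memory address, never in the shared-memory subscript, because each block has its own shared memory. A sketch (fixed 256-thread blocks; the names are illustrative, not from the linked code):

    #include <cfloat>

    __global__ void row_max_kernel(const float *A, int cols,
                                   float *d_max, int *d_argmax) {
        __shared__ float svals[256];
        __shared__ int   sidx[256];
        const int row = blockIdx.y;   /* which matrix row this block owns */
        const int tid = threadIdx.x;

        /* Strided pass over the row: per-thread running max and its column. */
        float best = -FLT_MAX;
        int   besti = -1;
        for (int c = tid; c < cols; c += blockDim.x) {
            float v = A[row * cols + c];
            if (v > best) { best = v; besti = c; }
        }
        svals[tid] = best;
        sidx[tid]  = besti;
        __syncthreads();

        /* Tree reduction; the shared arrays are indexed by tid alone. */
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (tid < s && svals[tid + s] > svals[tid]) {
                svals[tid] = svals[tid + s];
                sidx[tid]  = sidx[tid + s];
            }
            __syncthreads();
        }
        if (tid == 0) { d_max[row] = svals[0]; d_argmax[row] = sidx[0]; }
    }
    /* launch: row_max_kernel<<<dim3(1, rows), 256>>>(d_A, cols, d_max, d_argmax); */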

Openmp and reduction on std::vector?

拈花ヽ惹草 submitted on 2019-11-28 00:15:15
I want to make this code parallel: std::vector<float> res(n,0); std::vector<float> vals(m); std::vector<size_t> indexes(m); // fill indexes with values in range [0,n) // fill vals and indexes for(size_t i=0; i<m; i++){ res[indexes[i]] += /* something using vals[i] */; } In this article it's suggested to use: #pragma omp parallel for reduction(+:myArray[:6]) In this question the same approach is proposed in the comments section. I have two questions: I don't know n at compile time, and from these two examples it seems that's required. Is it so? Or if I can use it for this case, what do I have to…
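The section length does not have to be a compile-time constant: since OpenMP 4.5 it may be a runtime value. A sketch for the code above (the reduction clause needs an array or pointer name, so it goes through res.data(); compile with -fopenmp, GCC 6+ or similar):

    #include <cstddef>
    #include <vector>

    void accumulate(std::vector<float> &res,
                    const std::vector<float> &vals,
                    const std::vector<std::size_t> &indexes) {
        const std::size_t n = res.size();
        float *r = res.data();
        /* Each thread reduces into a private, zero-initialized copy of
           r[0..n); the copies are summed elementwise when the loop ends. */
        #pragma omp parallel for reduction(+:r[:n])
        for (std::ptrdiff_t i = 0; i < (std::ptrdiff_t)vals.size(); ++i)
            r[indexes[i]] += vals[i];
    }

Note the memory tradeoff: every thread allocates its own n-element copy, so for large n it can be cheaper to accumulate into per-thread local vectors and merge them once at the end.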