numerical-methods

Fast algorithm to calculate Pi in parallel

隐身守侯 submitted on 2019-12-18 11:45:59
Question: I am starting to learn CUDA, and I think calculating many digits of pi would be a nice introductory project. I have already implemented the simple Monte Carlo method, which is easily parallelizable. Each thread randomly generates points in the unit square and counts how many lie within the unit circle, and the counts are tallied with a reduction operation. But that is certainly not the fastest algorithm for calculating the constant. Before, when I did this exercise on a single
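The Monte Carlo scheme described in the question can be sketched on the CPU as follows. This is only an illustration in Python, not the asker's CUDA code: `count_hits` and `estimate_pi` are hypothetical names, each "thread" is simulated by a separately seeded generator, and a plain `sum` stands in for the parallel reduction.

```python
import random


def count_hits(samples, seed):
    """Work of one simulated thread: count random points that land inside the quarter circle."""
    rng = random.Random(seed)  # assumption: one independent stream per thread
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_threads=8, samples_per_thread=50_000):
    """Combine the per-thread counts and scale the hit ratio to an estimate of pi."""
    per_thread = [count_hits(samples_per_thread, seed) for seed in range(num_threads)]
    total_hits = sum(per_thread)  # stands in for the parallel reduction
    return 4.0 * total_hits / (num_threads * samples_per_thread)


print(estimate_pi())
```

The statistical error shrinks only like one over the square root of the sample count, which is why the method yields few correct digits however many threads are used.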

Overflow issues when implementing math formulas

戏子无情 submitted on 2019-12-18 08:39:19
Question: I heard that, when computing a mean value, start+(end-start)/2 differs from (start+end)/2 because the latter can overflow. I do not quite understand why the second form can overflow while the first cannot. What is the general rule for implementing a math formula so that it avoids overflow? Answer 1: Suppose you are using a computer where the maximum integer value is 10 and you want to compute the average of 5 and 7. The first method (begin + (end-begin)/2) gives 5 + (7-5)/2 == 5 + 2/2 ==
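The same effect can be reproduced with 32-bit limits. Python integers do not overflow, so the sketch below simulates two's-complement wraparound with a hypothetical helper, `wrap_int32`. Wraparound itself is an assumption: in C and C++ signed overflow is undefined behaviour, and wrapping is merely a commonly observed outcome.

```python
INT_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce a Python int to the value a 32-bit two's-complement register would hold."""
    return (value + 2**31) % 2**32 - 2**31


def midpoint_naive(start, end):
    # The intermediate sum start + end is wrapped before the division.
    return wrap_int32(start + end) // 2


def midpoint_safe(start, end):
    # For 0 <= start <= end, the difference end - start never exceeds INT_MAX.
    return start + wrap_int32(end - start) // 2


start, end = INT_MAX - 10, INT_MAX - 2
print(midpoint_naive(start, end))  # negative: the sum wrapped around
print(midpoint_safe(start, end))   # lies between start and end
```

The second form works because every intermediate value stays within the range spanned by the inputs; checking the range of each intermediate result is the general habit the question asks about.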

Solving the Lorentz model using Runge Kutta 4th Order in Python without a package

孤者浪人 submitted on 2019-12-18 07:11:50
Question: I wish to solve the Lorentz model in Python without the help of a package, but my code does not work as I expect. I do not know why I am not getting the expected results and the Lorentz attractor. I suspect the main problem is how I store the successive values of the solution for x, y and z. Below is my code for the Runge-Kutta 45 for the Lorentz model, with a 3D plot of the solutions:
    import numpy as np
    import matplotlib.pyplot as plt
    #from scipy.integrate import odeint
    #a
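One way to organise the storage of x, y and z is to treat the state as a single tuple and append each new state to a list. The sketch below is a classical fixed-step RK4 for the Lorenz system (written "Lorentz" in the question) in pure Python. It is not the asker's code: the parameter values, initial condition, step size and function names are assumptions chosen for illustration.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system (classic parameter values, assumed here)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(f, initial, dt, steps):
    """Return the list of states, starting with the initial one."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory


trajectory = integrate(lorenz, (1.0, 1.0, 1.0), 0.01, 2000)
xs, ys, zs = zip(*trajectory)  # three sequences ready for a 3D plot
```

Unzipping the trajectory at the end gives one sequence per coordinate, which can be passed directly to a 3D plotting call.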

Newton-Raphson Method in Matlab

荒凉一梦 submitted on 2019-12-17 20:54:50
Question: I am new to MATLAB and I need to create a function that does n iterations of the Newton-Raphson method with starting approximation x = a. This starting approximation does not count as an iteration, and a further requirement is that a for loop must be used. I have looked at other similar questions, but in my case I do not want to use a while loop. This is what my inputs are supposed to be: mynewton(f,a,n), which takes three inputs: f: A function handle for a function of x. a: A real number.
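The loop structure being asked for can be sketched as below, in Python rather than MATLAB. The excerpt does not say how the derivative is obtained, so the extra argument `df` is an assumption and not part of the required `mynewton(f,a,n)` signature.

```python
def mynewton(f, df, a, n):
    """Run exactly n Newton-Raphson iterations starting from x = a.

    df (the derivative of f) is a hypothetical extra argument; the original
    assignment only lists f, a and n.
    """
    x = a  # the starting approximation does not count as an iteration
    for _ in range(n):
        x = x - f(x) / df(x)
    return x


# Usage: approximate sqrt(2) as a root of x^2 - 2.
root = mynewton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0, 6)
print(root)
```

With n = 0 the function returns a unchanged, which matches the requirement that the starting value is not counted.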

How do I determine the coefficients for a linear regression line in MATLAB? [closed]

早过忘川 submitted on 2019-12-17 13:50:53
Question: I'm going to write a program where the input is a data set of 2D points and the output is the regression coefficients of the line of best fit, found by minimizing the mean squared error (MSE). I have some sample points that I would like to process:
    X     Y
    1.00  1.00
    2.00  2.00
    3.00  1.30
    4.00  3.75
    5
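For a straight line, minimizing the MSE has a closed-form solution: the slope is the covariance of X and Y divided by the variance of X, and the intercept follows from the means. The sketch below is in Python rather than MATLAB, uses only the four complete sample points visible in the excerpt, and `fit_line` is a hypothetical name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Only the four complete rows from the excerpt; the fifth row is truncated there.
xs = [1.00, 2.00, 3.00, 4.00]
ys = [1.00, 2.00, 1.30, 3.75]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```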

How do I determine the coefficients for a linear regression line in MATLAB? [closed]

空扰寡人 submitted on 2019-12-17 13:50:12
Question: I'm going to write a program where the input is a data set of 2D points and the output is the regression coefficients of the line of best fit, found by minimizing the mean squared error (MSE). I have some sample points that I would like to process:
    X     Y
    1.00  1.00
    2.00  2.00
    3.00  1.30
    4.00  3.75
    5

__builtin_prefetch, How much does it read?

廉价感情. submitted on 2019-12-17 02:38:18
Question: I'm trying to optimize some C++ (RK4) by using __builtin_prefetch, but I can't figure out how to prefetch a whole structure. I don't understand how much of the const void *addr is read. I want to have the next values of from and to loaded.
    for (int i = from; i < to; i++) {
        double kv = myLinks[i].kv;
        particle* from = con[i].Pfrom;
        particle* to = con[i].Pto;
        //Prefetch values at con[i++].Pfrom & con[i].Pto;
        double pos = to->px- from->px;
        double delta = from->r + to->r - pos;
        double k1 = axcel(kv,

Cumulative summation in CUDA

谁说我不能喝 submitted on 2019-12-13 13:32:53
Question: Can someone please point me in the right direction on how to do this type of calculation in parallel, or tell me what the general name of this method is? I don't think these will return the same result.
C++:
    for (int i = 1; i < width; i++)
        x[i] = x[i] + x[i-1];
CUDA:
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if ((i > 0) && (i < (width)))
        X[i] = X[i] + X[i-1];
Answer 1: This looks like a cumulative sum operation, in which the final value of x[i] is the sum of all values x[0]...x[i] in the
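A cumulative sum is also known as a prefix sum or scan. The sketch below, in Python, compares the sequential loop with one possible outcome of the one-line kernel, and with a Hillis-Steele style scan that reaches the sequential result in about log2(n) data-parallel passes. The Hillis-Steele scheme is a technique swapped in for illustration and is not taken from the truncated answer; the real kernel has a data race, so "every thread reads the original values" is only an assumed outcome.

```python
from itertools import accumulate


def naive_parallel_step(x):
    """One possible result of the one-line kernel: every thread reads the original values."""
    return [x[0]] + [x[i] + x[i - 1] for i in range(1, len(x))]


def hillis_steele_scan(x):
    """Inclusive prefix sum built from passes in which all elements update independently."""
    result = list(x)
    offset = 1
    while offset < len(result):
        # Build the next buffer only from the previous one, like a double-buffered kernel.
        result = [
            result[i] + result[i - offset] if i >= offset else result[i]
            for i in range(len(result))
        ]
        offset *= 2
    return result


data = [3, 1, 4, 1, 5, 9, 2, 6]
sequential = list(accumulate(data))  # same result as the C++ loop
print(sequential)
print(naive_parallel_step(data))     # differs from the sequential result
print(hillis_steele_scan(data))      # matches the sequential result
```

Each pass doubles the distance over which partial sums have been combined, which is why the dependency chain of the sequential loop can be broken.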

What is a fast simple solver for a large Laplacian matrix?

[亡魂溺海] submitted on 2019-12-13 13:22:38
Question: I need to solve some large (N~1e6) Laplacian matrices that arise in the study of resistor networks. The rest of the network analysis is being handled with boost graph and I would like to stay in C++ if possible. I know there are lots and lots of C++ matrix libraries, but none seems to be a clear leader in speed or usability. Also, the many questions on the subject, here and elsewhere, seem to rapidly devolve into laundry lists which are of limited utility. In an attempt to help myself and

efficiently determining if a polynomial has a root in the interval [0,T]

大城市里の小女人 submitted on 2019-12-13 11:54:02
Question: I have polynomials of nontrivial degree (4+) and need to robustly and efficiently determine whether or not they have a root in the interval [0,T]. The precise locations and number of roots don't concern me; I just need to know if there is at least one. Right now I'm using interval arithmetic as a quick check to see if I can prove that no roots can exist. If I can't, I'm using Jenkins-Traub to solve for all of the polynomial roots. This is obviously inefficient since it's checking for all real
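The interval-arithmetic quick check mentioned in the question can be sketched as follows, in Python. The function names are hypothetical, coefficients are assumed to be ordered from highest degree to constant term, and the sketch ignores floating-point rounding (a rigorous version would round interval bounds outward).

```python
def interval_mul(a, b):
    """Product of two intervals given as (low, high) pairs."""
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))


def poly_range_bound(coeffs, lo, hi):
    """Enclose p([lo, hi]) using Horner's rule evaluated in interval arithmetic."""
    acc = (coeffs[0], coeffs[0])
    for c in coeffs[1:]:
        acc = interval_mul(acc, (lo, hi))
        acc = (acc[0] + c, acc[1] + c)
    return acc


def may_have_root(coeffs, lo, hi):
    """False proves there is no root in [lo, hi]; True is inconclusive."""
    low, high = poly_range_bound(coeffs, lo, hi)
    return low <= 0 <= high


print(may_have_root([1, 0, 0, 0, 1], 0, 2))   # x^4 + 1: enclosure excludes zero
print(may_have_root([1, 0, 0, 0, -1], 0, 2))  # x^4 - 1: has the root x = 1
print(may_have_root([1, -2, 2], 0, 2))        # (x - 1)^2 + 1: no root, yet inconclusive
```

The last case shows why the check is only one-sided: interval evaluation overestimates the range, so an enclosure containing zero does not prove that a root exists, and a second method is still needed for those cases.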