scientific-computing

Python (scikit learn) lda collapsing to single dimension

依然范特西╮ posted on 2019-12-04 16:05:43
I'm very new to scikit-learn and machine learning in general. I am currently designing an SVM to predict whether a specific amino acid sequence will be cut by a protease. So far the SVM method seems to be working quite well. I'd like to visualize the distance between the two categories (cut and uncut), so I'm trying to use linear discriminant analysis, which is similar to principal component analysis, using the following code:
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    lda = LinearDiscriminantAnalysis(n_components=2)
    targs = np.array([1 if _ else 0 for _ in
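A possible reading of the "single dimension" in the title, offered as a hedged note and not as the thread's answer: LDA can return at most min(n_features, n_classes - 1) discriminant axes, so two classes (cut and uncut) leave only one axis even when n_components=2 is requested. The helper name and the feature count of 20 are hypothetical, chosen only for illustration.

```python
def max_lda_components(n_features: int, n_classes: int) -> int:
    """Upper bound on the number of discriminant axes LDA can produce."""
    return min(n_features, n_classes - 1)


# Two classes (cut / uncut); 20 features is an assumed, illustrative count.
limit = max_lda_components(n_features=20, n_classes=2)
print(limit)
```

Under this reading, a two-dimensional picture of a two-class problem would need a different projection such as PCA, since LDA itself has only one axis to offer.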

Composing VTK file from multiple MPI outputs

陌路散爱 posted on 2019-12-04 13:12:03
Question: For a lattice Boltzmann simulation of a lid-driven cavity (CFD), I'm decomposing my cubic domain into 8 (also cubic) subdomains, which are computed independently by 8 ranks. Each MPI rank produces a VTK file for each timestep, and since I'm using ParaView I want to visualize the whole thing as one cube. To be more specific about what I am trying to achieve: I have a cube with length 8 (number of elements in each direction), so 8x8x8 = 512 elements. Each dimension is distributed to 2 ranks,
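The decomposition described above can be sketched as a mapping from rank to element ranges. This is a minimal illustration under stated assumptions: the rank numbering (x varying fastest) and the function name are hypothetical, and only the sizes (8 elements per direction, 2 ranks per dimension) come from the question.

```python
from itertools import product

GLOBAL_N = 8        # elements per direction, from the question
RANKS_PER_DIM = 2   # each dimension is split across 2 ranks
LOCAL_N = GLOBAL_N // RANKS_PER_DIM


def subdomain_extents():
    """Map rank -> ((x0, x1), (y0, y1), (z0, z1)) as half-open element ranges.

    Assumption: ranks are numbered with x varying fastest, then y, then z.
    """
    extents = {}
    for rank, (k, j, i) in enumerate(product(range(RANKS_PER_DIM), repeat=3)):
        extents[rank] = tuple(
            (c * LOCAL_N, (c + 1) * LOCAL_N) for c in (i, j, k)
        )
    return extents


extents = subdomain_extents()
for rank, ext in extents.items():
    print(rank, ext)
```

Each rank ends up owning a 4x4x4 block, and the eight blocks together cover all 512 elements, which is the bookkeeping needed to place each per-rank file inside the full cube.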

Using scipy.signal.spectral.lombscargle for period discovery

孤人 posted on 2019-12-04 09:59:37
The new SciPy v0.11 offers a package for spectral analysis. Unfortunately the documentation is sparse and there aren't many examples available. As a baby example, I'm trying to do period discovery on a sine wave. Unfortunately it predicts a period of 1 instead of the expected 2pi. Any ideas?
    # imports the numerical array and scientific computing packages
    import numpy as np
    import scipy as sp
    from scipy.signal import spectral
    # generates 100 evenly spaced points between 1 and 1000
    time = np.linspace(1, 1000, 100)
    # computes the sine value of each of those points
    mags = np.sin(time)
    # scales
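One tentative reading of the result: SciPy's lombscargle documents its frequency argument as angular frequencies, so a peak at 1 would correspond to a period of 2π/1 = 2π. The sketch below is a from-scratch, pure-Python classical Lomb-Scargle periodogram, not SciPy's implementation. The function name, the sample times, and the frequency grid are all assumptions made for illustration.

```python
import math


def lomb_scargle(t, y, omegas):
    """Classical Lomb-Scargle power at each angular frequency in omegas."""
    powers = []
    for w in omegas:
        # Time offset tau that decouples the sine and cosine terms.
        s2 = sum(math.sin(2 * w * ti) for ti in t)
        c2 = sum(math.cos(2 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum(yi * ci for yi, ci in zip(y, c))
        ys = sum(yi * si for yi, si in zip(y, s))
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        powers.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return powers


# Hypothetical sampling: 200 slightly uneven times covering about three periods.
t = [0.1 * i + 0.03 * math.sin(i) for i in range(200)]
y = [math.sin(ti) for ti in t]
omegas = [k / 100 for k in range(10, 501)]  # angular frequencies, rad per unit time

powers = lomb_scargle(t, y, omegas)
best_omega = omegas[powers.index(max(powers))]
best_period = 2 * math.pi / best_omega
print(best_omega, best_period)
```

The peak sits at an angular frequency, so the period has to be recovered by dividing 2π by it; reading the peak location directly as a period would give the wrong number.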

Performance comparison of FPU with software emulation

老子叫甜甜 posted on 2019-12-04 07:58:26
While I know (or so I have been told) that floating-point coprocessors work faster than any software implementation of floating-point arithmetic, I totally lack a gut feeling for how large this difference is, in orders of magnitude. The answer probably depends on the application and where you work, between microprocessors and supercomputers. I am particularly interested in computer simulations. Can you point out articles or papers on this question?
A general answer will obviously be very vague, because performance depends on so many factors. However, based on my understanding, in processors that do

what changes when your input is giga/terabyte sized?

对着背影说爱祢 posted on 2019-12-04 07:42:10
Question: I took my first baby step into real scientific computing today when I was shown a data set where the smallest file is 48000 fields by 1600 rows (haplotypes for several people, for chromosome 22). And this is considered tiny. I write Python, so I've spent the last few hours reading about HDF5, NumPy, and PyTables, but I still feel like I'm not really grokking what a terabyte-sized data set actually means for me as a programmer. For example, someone pointed out that with larger

HDF5 Storage Overhead

删除回忆录丶 posted on 2019-12-04 06:33:35
I'm writing a large number of small datasets to an HDF5 file, and the resulting file size is about 10x what I would expect from a naive tabulation of the data I'm putting in. My data is organized hierarchically as follows:
    group 0
        subgroup 0
            dataset (dimensions: 100 x 4, datatype: float)
            dataset (dimensions: 100, datatype: float)
        subgroup 1
            dataset (dimensions: 100 x 4, datatype: float)
            dataset (dimensions: 100, datatype: float)
        ...
    group 1
    ...
Each subgroup should take up 500 * 4 bytes = 2000 bytes, ignoring overhead. I don't store any attributes alongside the data. Yet, in
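The naive tabulation in the question can be written out as a quick check. This is only the arithmetic, under the assumption of 4-byte floats implied by "500 * 4 bytes". The helper name is hypothetical, and the factor of 10 is the rough figure reported above, not a measurement.

```python
BYTES_PER_FLOAT = 4  # assumption: 32-bit floats, matching "500 * 4 bytes"


def raw_bytes(shape, itemsize=BYTES_PER_FLOAT):
    """Payload size of a dense dataset with the given shape, ignoring overhead."""
    n = 1
    for dim in shape:
        n *= dim
    return n * itemsize


subgroup_shapes = [(100, 4), (100,)]  # the two datasets in each subgroup
subgroup_bytes = sum(raw_bytes(s) for s in subgroup_shapes)

reported_factor = 10  # "about 10x", as stated in the question
implied_overhead = subgroup_bytes * (reported_factor - 1)

print(subgroup_bytes, implied_overhead)
```

Read this way, the reported blow-up would correspond to roughly 18000 bytes of non-payload storage per subgroup, which gives a sense of how large the per-object overhead would have to be.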

multinomial pmf in python scipy/numpy

孤街浪徒 posted on 2019-12-04 00:25:21
Is there a built-in function in scipy/numpy for getting the PMF of a multinomial? I'm not sure if binom generalizes in the correct way, e.g.
    # Attempt to define multinomial with n = 10, p = [0.1, 0.1, 0.8]
    rv = scipy.stats.binom(10, [0.1, 0.1, 0.8])
    # Score the outcome 4, 4, 2
    rv.pmf([4, 4, 2])
What is the correct way to do this? Thanks.
There's no built-in function that I know of, and the binomial probabilities do not generalize: you need to normalise over a different set of possible outcomes, since the sum of all the counts must be n, which independent binomials won't take care of.
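A minimal sketch of the multinomial PMF written directly from its definition, n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k, using only the standard library. The function name is hypothetical. Later SciPy releases also ship scipy.stats.multinomial, which may be preferable where it is available.

```python
from math import factorial, prod  # math.prod needs Python 3.8+


def multinomial_pmf(counts, probs):
    """Probability of observing `counts` in sum(counts) draws with cell probabilities `probs`."""
    n = sum(counts)
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)  # exact at every step: the running quotient stays an integer
    return coeff * prod(p ** c for p, c in zip(probs, counts))


# The outcome from the question: n = 10, p = [0.1, 0.1, 0.8], counts 4, 4, 2.
value = multinomial_pmf([4, 4, 2], [0.1, 0.1, 0.8])
print(value)
```

For the example in the question the coefficient is 10! / (4! 4! 2!) = 3150, so the probability is 3150 * 0.1^4 * 0.1^4 * 0.8^2, about 2.016e-5.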

Integrating a multidimensional integral in scipy

孤街浪徒 posted on 2019-12-03 18:57:10
Question: Motivation: I have a multidimensional integral, which for completeness I have reproduced below. It comes from the computation of the second virial coefficient when there is significant anisotropy: Here W is a function of all the variables. It is a known function, one for which I can define a Python function. Programming question: How do I get scipy to integrate this expression? I was thinking of chaining two triple quads (scipy.integrate.tplquad) together, but I'm worried about performance

best lib for vector array in c++

女生的网名这么多〃 posted on 2019-12-03 16:46:52
I have to do calculations on arrays of 1-, 2-, 3-, ..., 9-dimensional vectors, and the number of those vectors varies significantly (say, from 100 up to a couple of million). Of course, it would be great if the data container could be easily decomposed to enable parallel algorithms. I came across Blitz++ (almost impossible for me to compile), but are there any other fast libraries that manipulate arrays of vector data? Is boost::fusion worth a look? Furthermore, VTK's vtkDoubleArray seems nice, but VTK is a library used only for visualization. I must admit that having an array of tuples is a tempting idea, but I didn't

Logarithm Calculation with Windows 7 Calculator [closed]

帅比萌擦擦* posted on 2019-12-03 14:58:52
Question (closed 7 years ago as off-topic): I would like to use the Windows Calculator in Scientific mode to solve a very basic logarithm equation but, unfortunately, I couldn't do that. Here is the problem: log_5 125 = ? Thank you very much for your help... Well, I know it equals 3, but how can I use the Windows Calculator to get computed that
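A calculator that offers only ln and log10 can still evaluate a logarithm to any base through the change-of-base identity log_b(x) = ln(x) / ln(b). The snippet below is a small check of that identity for the problem above; the function name is hypothetical.

```python
import math


def log_base(x, base):
    """Logarithm of x to an arbitrary base via the change-of-base identity."""
    return math.log(x) / math.log(base)


result = log_base(125, 5)
print(round(result, 10))
```

On a calculator, the same steps would be: take ln of 125, divide by ln of 5, and read off the quotient.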