scientific-computing | 易学教程

Python (scikit learn) lda collapsing to single dimension

I'm very new to scikit-learn and machine learning in general. I am currently designing an SVM to predict whether a specific amino acid sequence will be cut by a protease. So far the SVM method seems to be working quite well. I'd like to visualize the distance between the two categories (cut and uncut), so I'm trying to use linear discriminant analysis, which is similar to principal component analysis, using the following code:
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    lda = LinearDiscriminantAnalysis(n_components=2)
    targs = np.array([1 if _ else 0 for _ in
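
A minimal sketch of why a two-class LDA projection ends up one-dimensional, assuming synthetic data (the arrays below are hypothetical stand-ins, not the asker's sequences). With C classes the between-class scatter matrix has rank at most C - 1, so for cut vs. uncut only one discriminant axis is available, whatever value is requested for n_components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 "uncut" and 50 "cut" samples with 5 features each.
n_features = 5
X_uncut = rng.normal(0.0, 1.0, size=(50, n_features))
X_cut = rng.normal(1.5, 1.0, size=(50, n_features))
X = np.vstack([X_uncut, X_cut])
overall_mean = X.mean(axis=0)

# Between-class scatter: sum over classes of n_c * (mu_c - mu)(mu_c - mu)^T.
S_b = np.zeros((n_features, n_features))
for X_class in (X_uncut, X_cut):
    diff = (X_class.mean(axis=0) - overall_mean).reshape(-1, 1)
    S_b += X_class.shape[0] * (diff @ diff.T)

# The rank of S_b bounds the number of useful discriminant directions.
n_classes = 2
rank = np.linalg.matrix_rank(S_b)
max_components = min(n_features, n_classes - 1)
print(rank, max_components)
```

To get a two-dimensional picture of two classes, a different projection (for example PCA) would be needed for the second axis; that choice is a suggestion, not something stated in the excerpt.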

Composing VTK file from multiple MPI outputs

Question: For a lattice Boltzmann simulation of a lid-driven cavity (CFD), I'm decomposing my cubic domain into 8 (also cubic) subdomains, which are computed independently by 8 ranks. Each MPI rank produces a VTK file for each timestep, and since I'm using ParaView I want to visualize the whole thing as one cube. To be more specific about what I am trying to achieve: I have a cube with length 8 (number of elements in each direction), so 8x8x8 = 512 elements. Each dimension is distributed to 2 ranks,
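
The decomposition described above can be sketched as follows. The cube size and the 2 ranks per direction come from the question; the rank ordering (x varies fastest) and the function name `subdomain_extent` are assumptions made for illustration, and the extents are element indices, not VTK point extents.

```python
N = 8                 # elements per direction in the global cube (from the question)
P = 2                 # ranks per direction (from the question)
n_local = N // P      # elements per direction owned by one rank

def subdomain_extent(rank):
    """Return ((x0, x1), (y0, y1), (z0, z1)) inclusive element indices.

    Assumed (hypothetical) rank ordering: x varies fastest, then y, then z.
    """
    ix = rank % P
    iy = (rank // P) % P
    iz = rank // (P * P)
    return tuple((i * n_local, (i + 1) * n_local - 1) for i in (ix, iy, iz))

extents = {rank: subdomain_extent(rank) for rank in range(P ** 3)}

# Sanity check: the subdomains together cover all 512 elements.
total_elements = sum(
    (x1 - x0 + 1) * (y1 - y0 + 1) * (z1 - z0 + 1)
    for (x0, x1), (y0, y1), (z0, z1) in extents.values()
)
for rank, extent in extents.items():
    print(rank, extent)
```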

Using scipy.signal.spectral.lombscargle for period discovery

The new SciPy v0.11 offers a package for spectral analysis. Unfortunately the documentation is sparse and there aren't many examples available. As a baby example, I'm trying to do period discovery of a sine wave. Unfortunately it predicts a period of 1 instead of the expected 2pi. Any ideas?
    # imports the numerical array and scientific computing packages
    import numpy as np
    import scipy as sp
    from scipy.signal import spectral
    # generates 100 evenly spaced points between 1 and 1000
    time = np.linspace(1, 1000, 100)
    # computes the sine value of each of those points
    mags = np.sin(time)
    # scales
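
A hedged sketch of the unit issue, assuming `scipy.signal.lombscargle` is available: it takes angular frequencies, so for sin(t) the peak sits near an angular frequency of 1, which corresponds to a period of 2*pi. The sampling grid and frequency grid below are illustrative choices, denser than the ones in the question, and are not the asker's values.

```python
import numpy as np
from scipy.signal import lombscargle

# Illustrative sampling: 1000 points over [0, 100], dense enough to resolve sin(t).
t = np.linspace(0.0, 100.0, 1000)
y = np.sin(t)
y = y - y.mean()  # the periodogram assumes a zero-mean signal

# Candidate ANGULAR frequencies (radians per unit time), not periods.
ang_freqs = np.linspace(0.1, 3.0, 300)
power = lombscargle(t, y, ang_freqs)

peak_ang_freq = ang_freqs[np.argmax(power)]
period = 2.0 * np.pi / peak_ang_freq  # convert angular frequency to period
print(peak_ang_freq, period)
```

Reading the peak location as a period rather than an angular frequency would give a value near 1 instead of near 2*pi, which matches the symptom described in the question.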

Performance comparison of FPU with software emulation

While I know (or so I have been told) that floating-point coprocessors work faster than any software implementation of floating-point arithmetic, I have no gut feeling for how large this difference is, in orders of magnitude. The answer probably depends on the application and where you work, between microprocessors and supercomputers. I am particularly interested in computer simulations. Can you point out articles or papers on this question?
A general answer will obviously be very vague, because performance depends on so many factors. However, based on my understanding, in processors that do

what changes when your input is giga/terabyte sized?

Question: I took my first baby step into real scientific computing today, when I was shown a data set where the smallest file is 48000 fields by 1600 rows (haplotypes for several people, for chromosome 22). And this is considered tiny. I write Python, so I've spent the last few hours reading about HDF5, NumPy, and PyTables, but I still feel like I'm not really grokking what a terabyte-sized data set actually means for me as a programmer. For example, someone pointed out that with larger

HDF5 Storage Overhead

I'm writing a large number of small datasets to an HDF5 file, and the resulting file size is about 10x what I would expect from a naive tabulation of the data I'm putting in. My data is organized hierarchically as follows:
    group 0
      -> subgroup 0
        -> dataset (dimensions: 100 x 4, datatype: float)
        -> dataset (dimensions: 100, datatype: float)
      -> subgroup 1
        -> dataset (dimensions: 100 x 4, datatype: float)
        -> dataset (dimensions: 100, datatype: float)
      ...
    group 1
    ...
Each subgroup should take up 500 * 4 Bytes = 2000 Bytes, ignoring overhead. I don't store any attributes alongside the data. Yet, in
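
The naive tabulation in the question can be reproduced with a few lines of arithmetic. This is only a sketch: the 4-byte float size is the one implied by the question's own figure, and `overhead_factor` is a hypothetical helper for comparing an observed file size against the raw payload.

```python
FLOAT_BYTES = 4  # assumption: 4-byte floats, as implied by "500 * 4 Bytes"

# Dataset shapes inside one subgroup, taken from the hierarchy in the question.
dataset_shapes = [(100, 4), (100,)]

def n_elements(shape):
    """Number of elements in an array of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count

payload_per_subgroup = sum(n_elements(s) for s in dataset_shapes) * FLOAT_BYTES

def overhead_factor(file_size_bytes, n_subgroups):
    """Ratio of observed file size to raw payload (hypothetical helper)."""
    return file_size_bytes / (n_subgroups * payload_per_subgroup)

print(payload_per_subgroup)
```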

multinomial pmf in python scipy/numpy

Is there a built-in function in scipy/numpy for getting the PMF of a multinomial? I'm not sure if binom generalizes in the correct way, e.g.
    # Attempt to define multinomial with n = 10, p = [0.1, 0.1, 0.8]
    rv = scipy.stats.binom(10, [0.1, 0.1, 0.8])
    # Score the outcome 4, 4, 2
    rv.pmf([4, 4, 2])
What is the correct way to do this? Thanks.
There's no built-in function that I know of, and the binomial probabilities do not generalize (you need to normalise over a different set of possible outcomes, since the sum of all the counts must be n, which won't be taken care of by independent binomials).
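
A minimal standard-library sketch of the multinomial PMF, n! / (k_1! ... k_m!) * p_1^k_1 ... p_m^k_m, applied to the outcome from the question. The function name `multinomial_pmf` is hypothetical, and the sketch assumes the probabilities sum to 1 and the counts are non-negative integers.

```python
import math

def multinomial_pmf(counts, probs):
    """Probability of observing `counts` in sum(counts) draws with `probs`."""
    n = sum(counts)
    # Multinomial coefficient n! / (k_1! * k_2! * ... * k_m!), kept as an exact integer.
    coeff = math.factorial(n)
    for k in counts:
        coeff //= math.factorial(k)
    # Multiply by p_i ** k_i for each category.
    prob = float(coeff)
    for k, p in zip(counts, probs):
        prob *= p ** k
    return prob

# Outcome (4, 4, 2) with n = 10 and p = [0.1, 0.1, 0.8], as in the question.
value = multinomial_pmf([4, 4, 2], [0.1, 0.1, 0.8])
print(value)  # approximately 2.016e-05
```

Unlike independent binomials, this enforces that the counts sum to n, which is the normalisation point made in the answer. Newer SciPy releases also provide `scipy.stats.multinomial`, which was not available when the answer was written.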

Integrating a multidimensional integral in scipy

Question: Motivation: I have a multidimensional integral, which for completeness I have reproduced below. It comes from the computation of the second virial coefficient when there is significant anisotropy: Here W is a function of all the variables. It is a known function, one for which I can define a Python function. Programming question: How do I get SciPy to integrate this expression? I was thinking of chaining two triple quads (scipy.integrate.tplquad) together, but I'm worried about performance

best lib for vector array in c++

I have to do calculations on arrays of 1-, 2-, 3-, ..., 9-dimensional vectors, and the number of those vectors varies significantly (say, from 100 up to a couple of million). Of course, it would be great if the data container could be easily decomposed to enable parallel algorithms. I came across Blitz++ (almost impossible for me to compile), but are there any other fast libraries that manipulate arrays of vector data? Is boost::fusion worth a look? Furthermore, VTK's vtkDoubleArray seems nice, but VTK is a library used only for visualization. I must admit that having an array of tuples is a tempting idea, but I didn't

Logarithm Calculation with Windows 7 Calculator [closed]

Question: Closed. This question is off-topic and is not currently accepting answers. Closed 7 years ago. I would like to use the Windows Calculator in Scientific mode to solve a very basic logarithm equation but, unfortunately, I couldn't. Here is the problem: log_5 125 = ? Thank you very much for your help... Well, I know it equals 3, but how can I use the Windows Calculator to compute that
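
The usual route on a calculator that only offers log (base 10) and ln is the change-of-base identity, log_b(x) = log(x) / log(b). A small sketch of that identity in Python follows; the function name `log_base` is hypothetical.

```python
import math

def log_base(x, base):
    """Logarithm of x in an arbitrary base via the change-of-base identity."""
    return math.log(x) / math.log(base)

result = log_base(125, 5)
# Floating-point rounding may leave the result a hair away from exactly 3.
print(round(result, 10))
```

The same two-step division (log of 125 divided by log of 5) works with either the log or the ln key, as long as the same one is used for both numbers.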