statistics

How should the interquartile range be calculated in Python?

时间秒杀一切 submitted on 2019-12-21 05:04:11
Question: I have a list of numbers [1, 2, 3, 4, 5, 6, 7] and I want a function that returns the interquartile range of this list. The interquartile range is the difference between the upper and lower quartiles. I have attempted to calculate it manually, with NumPy functions, and with Wolfram Alpha, and the answers, from my manual one to the NumPy one to the Wolfram Alpha one, are all different. I do not know why this is. My attempt in Python is as follows: >>> a = numpy
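The question excerpt is cut off above, but a minimal sketch of one common way to compute the interquartile range with NumPy is shown below (the example list comes from the question; linear interpolation between data points is NumPy's default and only one of several quartile conventions, which is a likely reason the different tools disagree):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])

# Upper and lower quartiles. The interpolation rule used here is what
# typically differs between NumPy, Wolfram Alpha, and a by-hand calculation.
q75, q25 = np.percentile(a, [75, 25])
iqr = q75 - q25
print(iqr)  # 3.0 with NumPy's default linear interpolation
```

Different quartile conventions (Tukey's hinges, exclusive vs. inclusive medians, linear interpolation) legitimately produce different values for small samples, so disagreement between tools does not necessarily indicate a bug.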

How to find if the numbers are continuous in R?

做~自己de王妃 submitted on 2019-12-21 04:58:24
Question: I have a range of values c(1,2,3,4,5,8,9,10,13,14,15) and I want to find the points where the numbers become discontinuous. All I want as output is: (1,5) (8,10) (13,15) I need to find the break points, and I need to do it in R. Answer 1: Something like this? x <- c(1:5, 8:10, 13:15) # example data unname(tapply(x, cumsum(c(1, diff(x)) != 1), range)) # [[1]] # [1] 1 5 # # [[2]] # [1] 8 10 # # [[3]] # [1] 13 15 Another example: x <- c(1, 5, 10, 11:14, 20:21, 23) unname(tapply(x, cumsum(c(1, diff(x)) !=
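The excerpt above ends mid-expression. For readers who want the same gap-splitting idea in Python (the language used by most of the other excerpts on this page), a minimal sketch assuming a sorted vector of integers, as in the question:

```python
from itertools import groupby

x = [1, 2, 3, 4, 5, 8, 9, 10, 13, 14, 15]

# Within a run of consecutive integers, value - index stays constant,
# so grouping on that difference splits the list at every gap.
ranges = []
for _, grp in groupby(enumerate(x), key=lambda pair: pair[1] - pair[0]):
    run = [value for _, value in grp]
    ranges.append((run[0], run[-1]))

print(ranges)  # [(1, 5), (8, 10), (13, 15)]
```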

Getting statistical history from TeamCity API

佐手、 submitted on 2019-12-21 04:38:12
Question: According to the TeamCity REST API documentation, the request for statistical data is: http://teamcity:8111/httpAuth/app/rest/builds/<buildLocator>/statistics/ This works; however, it only gives statistics for the current build (tests passed, code coverage, number of duplicates, etc.). I am looking to build a graph for my build radiator showing trends, so I want the historical data for the past month. Is there a way to get this historical statistics data from the TeamCity API? Answer 1:
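The answer excerpt is cut off, but one common approach is to request the list of recent builds and then call the statistics endpoint shown in the question once per build. A minimal Python sketch, assuming the `requests` library, HTTP basic auth, and a `builds?locator=buildType:<id>,count:<n>` listing endpoint (the exact locator syntax and JSON field names may differ across TeamCity versions):

```python
import requests

BASE = "http://teamcity:8111/httpAuth/app/rest"
AUTH = ("user", "password")          # placeholder credentials
HEADERS = {"Accept": "application/json"}

# Fetch recent builds for one build configuration (locator syntax is an assumption).
resp = requests.get(
    f"{BASE}/builds?locator=buildType:MyBuildConfigId,count:100",
    auth=AUTH, headers=HEADERS,
)
builds = resp.json().get("build", [])

history = []
for build in builds:
    # Same statistics endpoint as in the question, one call per historical build.
    stats = requests.get(
        f"{BASE}/builds/id:{build['id']}/statistics",
        auth=AUTH, headers=HEADERS,
    ).json()
    values = {p["name"]: p["value"] for p in stats.get("property", [])}
    history.append((build.get("number"), values))
```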

In Python, how can I calculate correlation and statistical significance between two arrays of data?

大兔子大兔子 submitted on 2019-12-21 04:12:59
Question: I have data sets consisting of two equally long arrays (or, equivalently, one array of two-item entries), and I would like to calculate the correlation and the statistical significance represented by the data (which may be tightly correlated, or may have no statistically significant correlation). I am programming in Python and have SciPy and NumPy installed. I looked and found Calculating Pearson correlation and significance in Python, but that seems to want the data to be manipulated so it falls
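Since the asker already has SciPy installed, a minimal sketch using `scipy.stats.pearsonr`, which returns both the correlation coefficient and the two-sided p-value (the arrays below are made-up illustrative data):

```python
import numpy as np
from scipy import stats

# Two equally long arrays of paired observations (illustrative values only).
x = np.array([1.0, 2.1, 2.9, 4.2, 5.1, 6.0])
y = np.array([2.0, 4.1, 6.2, 8.0, 9.9, 12.1])

r, p_value = stats.pearsonr(x, y)
print(f"Pearson r = {r:.3f}, p = {p_value:.4g}")

# If the data are not roughly normal, a rank-based alternative:
rho, p_rank = stats.spearmanr(x, y)
```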

Is my Python implementation of the Davies-Bouldin Index correct?

给你一囗甜甜゛ submitted on 2019-12-21 02:45:37
Question: I'm trying to calculate the Davies-Bouldin Index in Python. Here are the five steps the code below tries to reproduce: For each cluster, compute the Euclidean distances between each point and the centroid. For each cluster, compute the average of these distances. For each pair of clusters, compute the Euclidean distance between their centroids. Then, for each pair of clusters, take the sum of their average distances to their respective centroids (computed at step 2) and divide it by the distance
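The poster's own code is not included in this excerpt. A minimal sketch that follows the listed steps, assuming `X` is an (n_samples, n_features) array, `labels` holds each point's cluster index, and `centroids` is a (k, n_features) array:

```python
import numpy as np

def davies_bouldin(X, labels, centroids):
    k = len(centroids)
    # Steps 1-2: average Euclidean distance from each cluster's points to its centroid.
    avg_dist = np.array([
        np.mean(np.linalg.norm(X[labels == i] - centroids[i], axis=1))
        for i in range(k)
    ])
    # Steps 3-4: for each pair, (S_i + S_j) / d(c_i, c_j); then take the worst
    # ratio per cluster and average over clusters to get the index.
    ratios = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i != j:
                d = np.linalg.norm(centroids[i] - centroids[j])
                ratios[i, j] = (avg_dist[i] + avg_dist[j]) / d
    return np.mean(ratios.max(axis=1))
```

A lower value indicates more compact, better-separated clusters; scikit-learn also ships `sklearn.metrics.davies_bouldin_score`, which is useful for cross-checking a hand-rolled implementation.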

Open source or free financial analysis programs/libraries

戏子无情 submitted on 2019-12-21 02:37:17
Question: I'm looking for something containing functions similar to MATLAB's Financial and Financial Derivatives toolboxes, but I don't have the cash to spend on MATLAB. I would appreciate any info on free or open-source libraries or programs that will let me easily calculate interest rates, risk, etc. Answer 1: How about JQuantLib or QuantLib? Answer 2: How about the Octave financial functions? http://www.gnu.org/software/octave/doc/interpreter/Financial-Functions.html#Financial-Functions I'm not familiar with the

Scaling and fitting to a log-normal distribution using a logarithmic axis in python

夙愿已清 submitted on 2019-12-20 18:32:24
Question: I have a log-normally distributed set of samples. I can visualize the samples using a histogram with either a linear or a logarithmic x-axis. I can perform a fit to the histogram to get the PDF and then scale it to the histogram in the plot with the linear x-axis; see also this previously posted question. I am, however, not able to properly plot the PDF in the plot with the logarithmic x-axis. Unfortunately, it is not only a problem with the scaling of the area of the PDF to the histogram but
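The original code is not part of this excerpt. A minimal sketch of one way to make a fitted log-normal PDF line up with a histogram drawn on a logarithmic x-axis, assuming log-spaced bins and a density-normalized histogram (so the bar areas already integrate to one and the PDF needs no extra area scaling):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Illustrative log-normal samples.
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=1.0, sigma=0.6, size=5000)

# Fit a log-normal; fixing loc=0 keeps the standard two-parameter form.
shape, loc, scale = stats.lognorm.fit(samples, floc=0)

# Log-spaced bins plus density=True normalize each bar by its own width,
# so the fitted PDF can be drawn directly on top of the histogram.
bins = np.logspace(np.log10(samples.min()), np.log10(samples.max()), 50)
plt.hist(samples, bins=bins, density=True, alpha=0.5)

x = np.logspace(np.log10(samples.min()), np.log10(samples.max()), 400)
plt.plot(x, stats.lognorm.pdf(x, shape, loc, scale))
plt.xscale("log")
plt.show()
```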

How do I display database query statistics on Wordpress site?

瘦欲@ submitted on 2019-12-20 17:39:58
Question: I've noticed that a few WordPress blogs have query statistics in their footer that simply state the number of queries and the total time required to process them for the particular page, reading something like: 23 queries. 0.448 seconds I was wondering how this is accomplished. Is it through the use of a particular WordPress plug-in, or perhaps some particular PHP function in the page's code? Answer 1: Try adding this to the bottom of the footer in your template: <?php echo $wpdb

Python: easy way to do geometric mean in Python?

冷暖自知 submitted on 2019-12-20 17:34:12
Question: I wonder whether there is any easy way to compute a geometric mean in Python without using a package. If there is not, is there a simple package that will do it? Answer 1: The formula for the geometric mean is the n-th root of the product of the values, so you can easily write an algorithm like: import numpy as np def geo_mean(iterable): a = np.array(iterable) return a.prod() ** (1.0 / len(a)) You do not have to use numpy for that, but it tends to perform operations on arrays faster than plain Python (since there is less "overhead" with casting)
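Since the question specifically asks for a way without an external package, a minimal pure-standard-library sketch (summing logarithms rather than multiplying, which avoids overflow on long lists of large values):

```python
import math

def geo_mean(iterable):
    values = list(iterable)
    # exp(mean of logs) equals the n-th root of the product,
    # but never builds a huge intermediate product.
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(geo_mean([1, 3, 9, 27]))  # 5.196... (i.e. 3**1.5)
```

On Python 3.8 and later, `statistics.geometric_mean` in the standard library does the same job directly.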

weight data with R Part II

泪湿孤枕 submitted on 2019-12-20 16:42:43
Question: Given is the following data frame: structure(list(UH6401 = c(1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1), UH6402 = c(1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0