mean | 易学教程

Computing the mean and standard deviation of a dataset in Python


    import os

    import pandas as pd
    import numpy as np
    from md import read_image
    from tqdm import tqdm
    import cv2

    mean = 0.49999999997525235
    std = 0.2092156204202305

    filepath = r'/data/dataset/xx.xx/naru_data_mean/naru_data/0'
    pathDir = os.listdir(filepath)

    img_mean_list = []
    img_std_list = []
    for idx in range(len(pathDir)):
        filename = pathDir[idx]
        img = read_image(os.path.join(filepath, filename)).to_numpy()
        # My images have gray values from 1 to 1000; this min-max scaling normalizes them to 0-255
        img = (img - img.min()) / (img.max() - img.min()) * 255
        img_mean = np.mean(img)
        img_std = np.std(img)
        img_mean_list.append(img_mean)
        img_std_list.append(img_std)
    data_mean = np
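
The excerpt is cut off before it shows how the per-image values are combined. As a minimal sketch of one common approach, the snippet below accumulates a running sum and sum of squares so that the result is the mean and standard deviation over all pixels of all images. The function name `dataset_mean_std`, the scaling to [0, 1], and the synthetic test images are assumptions made for illustration; they are not taken from the original post.

```python
import numpy as np


def dataset_mean_std(images):
    """Return (mean, std) over all pixels of all images.

    Each image is min-max scaled to [0, 1] first (an assumption of this sketch).
    """
    total = 0.0      # running sum of pixel values
    total_sq = 0.0   # running sum of squared pixel values
    count = 0        # number of pixels seen so far
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        span = img.max() - img.min()
        # Guard against constant images, where min-max scaling would divide by zero.
        scaled = np.zeros_like(img) if span == 0 else (img - img.min()) / span
        total += scaled.sum()
        total_sq += np.square(scaled).sum()
        count += scaled.size
    mean = total / count
    std = np.sqrt(total_sq / count - mean ** 2)  # population std, like np.std
    return mean, std


# Synthetic stand-ins for real images with gray values between 1 and 1000.
rng = np.random.default_rng(0)
images = [rng.integers(1, 1001, size=(8, 8)) for _ in range(5)]
mean, std = dataset_mean_std(images)
```

Note that averaging per-image standard deviations generally gives a different number from the pooled standard deviation computed here, so it is worth deciding which of the two a downstream normalization step expects.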

How to add a line in boxplot?


Question: I would like to add lines connecting the "mean" points in my boxplot. My code:

    library(ggplot2)
    library(ggthemes)
    Gp=factor(c(rep("G1",80),rep("G2",80)))
    Fc=factor(c(rep(c(rep("FC1",40),rep("FC2",40)),2)))
    Z <-factor(c(rep(c(rep("50",20),rep("100",20)),4)))
    Y <- c(0.19 , 0.22 , 0.23 , 0.17 , 0.36 , 0.33 , 0.30 , 0.39 , 0.35 , 0.27 , 0.20 , 0.22 , 0.24 , 0.16 , 0.36 , 0.30 , 0.31 , 0.39 , 0.33 , 0.25 , 0.23 , 0.13 , 0.16 , 0.18 , 0.20 , 0.16 , 0.15 , 0.09 , 0.18 , 0.21 , 0.20 , 0.14 , 0.17 , 0.18 , 0.22 , 0

R: common statistical tests


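Statistical testing is the work of reaching a judgment by comparing a sample result against the sampling distribution. It has five main steps: state the hypotheses, determine the sampling distribution, choose the significance level and rejection region, compute the test statistic, and make the decision. (Source: Baidu Baike)

Hypothesis testing, also called significance testing, is another important part of statistical inference. Its purpose is to determine whether population parameters differ. In essence, a hypothesis test judges whether an observed "difference" is caused by sampling error or by a real difference between populations, with the aim of assessing how strong the evidence is that two treatments produce different effects. The strength of that evidence is measured and expressed by the probability P. Besides the t distribution, other test statistics and distributions exist for different kinds of data, such as the F distribution and the chi-square (χ²) distribution. The steps of the hypothesis test are the same for all of them; only the test statistic that has to be computed differs.

Hypothesis tests for the mean of a normal population

t-test: t.test() => Student's t-Test

    require(graphics)
    t.test(1:10, y = c(7:20))      # P = .00001855
    t.test(1:10, y = c(7:20, 200)) # P = .1245 -- no longer significant
    ## Classic example: Student's sleep data
    plot(extra ~ group, data = sleep)
    ## Traditional interface
    with(sleep, t.test(extra[group == 1], extra[group == 2]))

    Welch Two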
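
To make the test statistic behind the first `t.test(1:10, y = c(7:20))` call concrete, here is a minimal Python sketch of the Welch two-sample statistic and its Welch-Satterthwaite degrees of freedom, using only the standard library. The function name `welch_t` is hypothetical, and the sketch stops short of the p-value, which would additionally need the t distribution's CDF.

```python
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    nx, ny = len(x), len(y)
    # Squared standard errors of the two sample means (sample variance / n).
    vx = variance(x) / nx
    vy = variance(y) / ny
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Same data as the R call t.test(1:10, y = c(7:20)).
t, df = welch_t(list(range(1, 11)), list(range(7, 21)))
print(f"t = {t:.4f}, df = {df:.3f}")
```

Only the statistic changes between tests; the surrounding procedure (hypotheses, significance level, decision) stays the same, as the text above describes.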

How to handle more than multiple sets of data in R programming?


Question:

    cuts <- cut(data$Time, breaks=seq(0, max(data$Time)+400, 400))
    by(data$Oxytocin, cuts, mean)

but this would only work for one person's data. I have ten people, each with their own Time and Oxytocin data. How would I get their averages simultaneously? Also, instead of output like this:

    cuts: (0,400]
    [1] 0.7
    ------------------------------------------------------------
    cuts: (400,800]
    [1] 0.805

is there a way I can get a list of those cuts?

Answer 1: Here's a solution using IRanges
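
The answer is cut off in this excerpt, so the following is not the IRanges solution. It is a rough standard-library Python sketch of the underlying idea: key every observation by both person and 400-unit time bin, then average within each key. The record layout `(person, time, value)`, the function name `binned_means`, and the sample numbers are assumptions for illustration.

```python
import math
from collections import defaultdict


def binned_means(records, width=400):
    """Average values per (person, bin), where each bin is (upper - width, upper].

    Assumes every time is > 0, matching right-closed bins that start at 0.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (person, upper) -> [running sum, count]
    for person, time, value in records:
        upper = math.ceil(time / width) * width  # right edge of the bin
        acc = sums[(person, upper)]
        acc[0] += value
        acc[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Hypothetical measurements for two people.
records = [
    ("A", 100, 0.6), ("A", 300, 0.8), ("A", 500, 0.805),
    ("B", 400, 1.0), ("B", 401, 2.0),
]
means = binned_means(records)
for (person, upper), avg in sorted(means.items()):
    print(f"{person} ({upper - 400},{upper}]: {avg:.3f}")
```

Because the result is a plain dictionary, the per-bin averages come back as a list-like structure rather than the printed `by()` report shown in the question.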

means and SD for columns in a dataframe with NA values


Question: I'm trying to calculate the mean and standard deviation of several columns (all except the first) in a data.frame with NA values. I've tried colMeans, sapply, etc., to create a loop that runs through the data.frame and then stores the means and standard deviations in a separate table, but I keep getting a "FUN" error. Any help would be great. Thanks a

Answer 1:

    sapply(df, function(cl) list(means=mean(cl,na.rm=TRUE), sds=sd(cl,na.rm=TRUE)))

          col1 col2 col3 col4  col5
    means 3    8    12.5 18.25 22.5
    sds   1
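
For comparison, the same idea (skip missing values, then take column-wise mean and standard deviation) can be sketched in Python with NumPy's NaN-aware reductions. The array below is made-up data, not the data frame from the question, and `ddof=1` is chosen so the result matches the sample standard deviation that R's `sd()` reports.

```python
import numpy as np

# Hypothetical table: rows are observations, columns are variables, NaN marks missing values.
data = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [3.0, 14.0],
    [np.nan, 18.0],
])

means = np.nanmean(data, axis=0)         # column means, ignoring NaN
sds = np.nanstd(data, axis=0, ddof=1)    # sample standard deviations, ignoring NaN

print(means)
print(sds)
```

To leave out the first column, as the question asks, the same calls could be applied to a slice such as `data[:, 1:]`.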

How to ignore values when using numpy.sum and numpy.mean in matrices


Question: Is there a way to avoid using specific values when applying sum and mean in numpy? I'd like to exclude, for instance, the -999 value when calculating the result.

    In [14]: c = np.matrix([[4., 2.],[4., 1.]])
    In [15]: d = np.matrix([[3., 2.],[4., -999.]])
    In [16]: np.sum([c, d], axis=0)
    Out[16]:
    array([[   7.,    4.],
           [   8., -998.]])
    In [17]: np.mean([c, d], axis=0)
    Out[17]:
    array([[   3.5,    2. ],
           [   4. , -499. ]])

Answer 1: Use a masked array:

    >>> c = np.ma.array([[4., 2.], [4., 1.]])
    >>> d = np.ma.masked
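
The answer is truncated here, so the exact call it used is not visible. As one possible sketch of the masked-array approach, the snippet below stacks the two matrices and masks the sentinel with `np.ma.masked_equal`; that particular function is my choice for illustration and may differ from what the original answer used.

```python
import numpy as np

c = np.array([[4., 2.], [4., 1.]])
d = np.array([[3., 2.], [4., -999.]])

# Stack into shape (2, 2, 2) and hide every entry equal to the sentinel value.
stacked = np.ma.masked_equal(np.stack([c, d]), -999.)

total = stacked.sum(axis=0)     # masked entries contribute nothing to the sum
average = stacked.mean(axis=0)  # masked entries are excluded from the count too

print(total)
print(average)
```

In the bottom-right cell only the value 1 from `c` remains, so both the sum and the mean there are 1 instead of -998 and -499.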

PyTorch BCELoss and BCEWithLogitsLoss


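BCELoss

CLASS torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')

Creates a criterion that measures the binary cross entropy between the target and the output.

The unreduced loss (i.e. with reduction set to 'none') can be described as:

    \ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = -w_n \left[ y_n \log x_n + (1 - y_n) \log (1 - x_n) \right]

where N is the batch size. If reduction is not 'none' (the default is 'mean'), then:

    \ell(x, y) = \operatorname{mean}(L) if reduction = 'mean'; \quad \ell(x, y) = \operatorname{sum}(L) if reduction = 'sum'

That is, the per-sample losses in the batch are averaged or summed. This can be used to measure reconstruction error, for example in an autoencoder. Note that the targets y should be numbers between 0 and 1.

Parameters:

weight (Tensor, optional): a manual rescaling weight given to the loss of each batch element. If given, has to be a Tensor of size nbatch.

size_average (bool, optional): Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that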
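
The element-wise binary cross entropy and its 'none', 'mean' and 'sum' reductions described above can be sketched in plain NumPy. This is an illustrative re-implementation, not PyTorch's code: the function name `bce_loss` is hypothetical, and the sketch assumes every output lies strictly between 0 and 1 so that the logarithms stay finite.

```python
import numpy as np


def bce_loss(x, y, weight=None, reduction="mean"):
    """Binary cross entropy between outputs x in (0, 1) and targets y in [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    w = 1.0 if weight is None else np.asarray(weight, dtype=np.float64)
    # Per-element loss: l_n = -w_n * [y_n * log(x_n) + (1 - y_n) * log(1 - x_n)]
    losses = -w * (y * np.log(x) + (1.0 - y) * np.log(1.0 - x))
    if reduction == "none":
        return losses
    if reduction == "mean":
        return losses.mean()
    if reduction == "sum":
        return losses.sum()
    raise ValueError(f"unknown reduction: {reduction!r}")


outputs = [0.5, 0.5]
targets = [1.0, 0.0]
per_element = bce_loss(outputs, targets, reduction="none")
averaged = bce_loss(outputs, targets)                 # default: mean over the batch
summed = bce_loss(outputs, targets, reduction="sum")
```

With both outputs at 0.5, each element contributes log(2), which makes the difference between the mean and sum reductions easy to check by hand.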

Common PyTorch loss functions


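Basic usage of a loss function:

    criterion = LossCriterion()  # the constructor takes its own parameters
    loss = criterion(x, y)       # calling the criterion also takes arguments

The resulting loss has already been averaged over the mini-batch.

1. BCELoss (binary classification)

CLASS torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')

Creates a criterion that measures the binary cross entropy between the target and the output.

The unreduced loss (i.e. with the reduction parameter set to 'none') is:

    \ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = -w_n \left[ y_n \log x_n + (1 - y_n) \log (1 - x_n) \right]

where N is the batch size, x_n is the output, and y_n is the target. If reduction is not 'none' (the default is 'mean'), then:

    \ell(x, y) = \operatorname{mean}(L) if reduction = 'mean'; \quad \ell(x, y) = \operatorname{sum}(L) if reduction = 'sum'

That is, by default the loss is averaged over the elements; if size_average=False, the losses are summed instead. This is used to measure reconstruction error, for example in an autoencoder. Note that 0 <= target[i] <= 1.

Parameters:

weight (Tensor, optional): a manual rescaling weight for the loss of each batch element. If given, it must be a Tensor of size nbatch.

size_average (bool, optional): deprecated (see the reduction parameter). By default, set to True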

harmonic mean in python


Question: The harmonic mean function in Python (scipy.stats.hmean) requires that the input be positive numbers. For example:

    from scipy import stats
    print(stats.hmean([-50.2, 100.5]))

results in:

    ValueError: Harmonic mean only defined if all elements greater than zero

I don't see mathematically why this should be the case, except for the rare instance where you would end up dividing by zero. Instead of checking for a divide by zero, hmean() throws an error upon inputting any negative number,
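
To see what the bare formula n / Σ(1/xᵢ) produces on the question's input, here is a small sketch that applies it without any positivity check. The function name `naive_harmonic_mean` is hypothetical; it is not how scipy implements `hmean`.

```python
def naive_harmonic_mean(values):
    """Apply n / sum(1/x) directly, with no check that the inputs are positive."""
    return len(values) / sum(1.0 / v for v in values)


positive_case = naive_harmonic_mean([1.0, 4.0])       # 2 / (1 + 0.25)
mixed_case = naive_harmonic_mean([-50.2, 100.5])      # the question's input

print(positive_case)
print(mixed_case)
```

For the mixed-sign input the formula still evaluates, but the result (about -200.6) lies outside the range spanned by the two inputs, so it no longer behaves like an average. That is one plausible reason a library would restrict the function to positive values rather than only guarding against division by zero.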

Add a row with means of columns to pandas DataFrame


Question: I have a pandas DataFrame consisting of some sensor readings taken over time, like this:

           diode1  diode2  diode3  diode4
    Time
    0.530       7       0      10      16
    1.218      17       7      14      19
    1.895      13       8      16      17
    2.570       8       2      16      17
    3.240      14       8      17      19
    3.910      13       6      17      18
    4.594      13       5      16      19
    5.265       9       0      12      16
    5.948      12       3      16      17
    6.632      10       2      15      17

I have written code to add another row with the means of each column:

    # List of the averages for the test.
    averages = [df[key].describe()['mean'] for key in df]
    indexes = df.index.tolist()
    indexes.append('mean
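
The question's own code is cut off above. As a shorter alternative sketch (not the original author's solution), the column means can be computed in one call with `DataFrame.mean()` and appended as a row labelled 'mean' using `pd.concat`. Only the first three rows of the table are reproduced here to keep the example small.

```python
import pandas as pd

# First three rows of the sensor table from the question.
df = pd.DataFrame(
    {
        "diode1": [7, 17, 13],
        "diode2": [0, 7, 8],
        "diode3": [10, 14, 16],
        "diode4": [16, 19, 17],
    },
    index=pd.Index([0.530, 1.218, 1.895], name="Time"),
)

# df.mean() returns one value per column; turn it into a single row labelled 'mean'.
mean_row = df.mean().to_frame("mean").T
with_mean = pd.concat([df, mean_row])

print(with_mean)
```

One trade-off of this layout is that the index now mixes float timestamps with the string 'mean', and the integer columns become floats, so it suits display better than further numeric processing.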