mean

Computing the mean and standard deviation of a dataset in Python

自闭症网瘾萝莉.ら submitted on 2019-12-21 01:21:30
import os

import pandas as pd
import numpy as np
from md import read_image
from tqdm import tqdm
import cv2

mean = 0.49999999997525235
std = 0.2092156204202305
filepath = r'/data/dataset/xx.xx/naru_data_mean/naru_data/0'
pathDir = os.listdir(filepath)
img_mean_list = []
img_std_list = []
for idx in range(len(pathDir)):
    filename = pathDir[idx]
    img = read_image(os.path.join(filepath, filename)).to_numpy()
    # My images have gray values in the range 1-1000; this step normalizes them.
    img = (img - img.min()) / (img.max() - img.min()) * 255
    img_mean = np.mean(img)
    img_std = np.std(img)
    img_mean_list.append(img_mean)
    img_std_list.append(img_std)
data_mean = np
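The loop above collects one mean and one standard deviation per image. As a hedged aside: averaging per-image standard deviations generally does not give the standard deviation over all pixels. A minimal sketch of the pooled computation follows, assuming each image is available as a NumPy array; `dataset_mean_std` and the random stand-in images are hypothetical and not part of the original post.

```python
import numpy as np


def dataset_mean_std(images):
    """Return the mean and std over all pixels of all images (pooled)."""
    n = 0           # total number of pixels seen so far
    total = 0.0     # running sum of pixel values
    total_sq = 0.0  # running sum of squared pixel values
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        n += img.size
        total += img.sum()
        total_sq += np.square(img).sum()
    mean = total / n
    # Population variance: E[x^2] - (E[x])^2, clipped to guard against
    # tiny negative values caused by floating-point rounding.
    var = max(total_sq / n - mean ** 2, 0.0)
    return mean, float(np.sqrt(var))


# Hypothetical stand-in data: two "images" of different sizes.
rng = np.random.default_rng(0)
images = [rng.uniform(0, 255, size=(4, 5)), rng.uniform(0, 50, size=(3, 3))]
pooled_mean, pooled_std = dataset_mean_std(images)
```

Accumulating the sum and the sum of squares keeps memory use constant, since only one image is held at a time. For very large pixel counts or values, a numerically safer update (such as Welford's algorithm) may be preferable.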

How to add a line in boxplot?

混江龙づ霸主 submitted on 2019-12-20 10:34:45
Question: I would like to add lines between the "mean" points in my boxplot. My code:

library(ggplot2)
library(ggthemes)
Gp=factor(c(rep("G1",80),rep("G2",80)))
Fc=factor(c(rep(c(rep("FC1",40),rep("FC2",40)),2)))
Z <-factor(c(rep(c(rep("50",20),rep("100",20)),4)))
Y <- c(0.19 , 0.22 , 0.23 , 0.17 , 0.36 , 0.33 , 0.30 , 0.39 , 0.35 , 0.27 , 0.20 , 0.22 , 0.24 , 0.16 , 0.36 , 0.30 , 0.31 , 0.39 , 0.33 , 0.25 , 0.23 , 0.13 , 0.16 , 0.18 , 0.20 , 0.16 , 0.15 , 0.09 , 0.18 , 0.21 , 0.20 , 0.14 , 0.17 , 0.18 , 0.22 , 0

R: Common statistical tests

ぐ巨炮叔叔 submitted on 2019-12-20 07:41:37
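A statistical test compares a sample result against the sampling distribution in order to reach a judgment. It has five main steps:

1. State the hypotheses
2. Derive the sampling distribution
3. Choose the significance level and the rejection region
4. Compute the test statistic
5. Make the decision

(Source: Baidu Baike)

Hypothesis testing, also called significance testing, is another major part of statistical inference; its purpose is to determine whether population parameters differ. In essence, a hypothesis test judges whether an observed "difference" is caused by sampling error or by a real difference between populations. The goal is to assess how strong the evidence is that two treatments have different effects, and the strength of that evidence is measured and expressed by the probability P. Besides the t distribution, other test statistics and distributions exist for different kinds of data, such as the F distribution and the χ² distribution. The testing procedure is the same for all of them; only the test statistic that has to be computed differs.

Hypothesis tests for the mean of a normal population

t-test

t.test() => Student's t-Test

require(graphics)
t.test(1:10, y = c(7:20))       # P = .00001855
t.test(1:10, y = c(7:20, 200))  # P = .1245 -- no longer significant

## Classic example: Student's sleep data
plot(extra ~ group, data = sleep)
## Traditional interface
with(sleep, t.test(extra[group == 1], extra[group == 2]))
Welch Two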
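By default, R's t.test() does not assume equal variances, so the two-sample calls above run Welch's test. As a rough illustration of what the test statistic is, here is a minimal sketch using only the Python standard library. The helper name `welch_t` is hypothetical, and the sketch computes only the t statistic and the Welch-Satterthwaite degrees of freedom, not the p-value.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    nx, ny = len(x), len(y)
    # Squared standard error contributed by each sample.
    vx = variance(x) / nx
    vy = variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Same inputs as t.test(1:10, y = c(7:20)) in the R snippet.
x = [float(v) for v in range(1, 11)]
y = [float(v) for v in range(7, 21)]
t_stat, dof = welch_t(x, y)
print(f"t = {t_stat:.4f}, df = {dof:.3f}")
```

Turning t and df into a p-value requires the CDF of the t distribution, which the standard library does not provide. A statistics package would be needed for that step.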

How to handle more than multiple sets of data in R programming?

允我心安 submitted on 2019-12-20 06:19:39
Question:

cuts <- cut(data$Time, breaks=seq(0, max(data$Time)+400, 400))
by(data$Oxytocin, cuts, mean)

But this only works for one person's data. I have ten people, each with their own Time and Oxytocin data. How would I get their averages simultaneously? Also, instead of output like this:

cuts: (0,400]
[1] 0.7
------------------------------------------------------------
cuts: (400,800]
[1] 0.805

is there a way I can get a list of those cuts?

Answer 1: Here's a solution using IRanges

means and SD for columns in a dataframe with NA values

余生颓废 submitted on 2019-12-20 02:28:13
Question: I'm trying to calculate the mean and standard deviation of several columns (all except the first) in a data.frame with NA values. I've tried colMeans, sapply, etc., to create a loop that runs through the data.frame and stores the means and standard deviations in a separate table, but I keep getting a "FUN" error. Any help would be great. Thanks.

Answer 1:

sapply(df, function(cl) list(means=mean(cl,na.rm=TRUE), sds=sd(cl,na.rm=TRUE)))

      col1 col2 col3 col4  col5
means 3    8    12.5 18.25 22.5
sds   1

How to ignore values when using numpy.sum and numpy.mean in matrices

自作多情 submitted on 2019-12-19 07:18:20
Question: Is there a way to avoid using specific values when applying sum and mean in numpy? I'd like to avoid, for instance, the -999 value when calculating the result.

In [14]: c = np.matrix([[4., 2.],[4., 1.]])
In [15]: d = np.matrix([[3., 2.],[4., -999.]])
In [16]: np.sum([c, d], axis=0)
Out[16]:
array([[   7.,    4.],
       [   8., -998.]])
In [17]: np.mean([c, d], axis=0)
Out[17]:
array([[   3.5,    2. ],
       [   4. , -499. ]])

Answer 1: Use a masked array:

>>> c = np.ma.array([[4., 2.], [4., 1.]])
>>> d = np.ma.masked
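The answer's snippet is cut off here, so the following is not the answerer's code. It is a minimal sketch of the masked-array idea, assuming plain `np.array` inputs instead of `np.matrix` and assuming -999 is the only sentinel value. It also shows a NaN-based alternative.

```python
import numpy as np

c = np.array([[4.0, 2.0], [4.0, 1.0]])
d = np.array([[3.0, 2.0], [4.0, -999.0]])

# Stack into shape (2, 2, 2) and mask every entry equal to the sentinel.
stacked = np.ma.masked_equal(np.stack([c, d]), -999.0)

# Reductions on a masked array skip the masked entries.
total = stacked.sum(axis=0)
avg = stacked.mean(axis=0)

# Alternative: replace the sentinel with NaN and use the NaN-aware reducers.
with_nan = np.where(np.stack([c, d]) == -999.0, np.nan, np.stack([c, d]))
nan_total = np.nansum(with_nan, axis=0)
nan_avg = np.nanmean(with_nan, axis=0)
```

In the bottom-right cell only the value 1 from `c` remains, so both the sum and the mean there are 1. If every value in a cell were masked, the masked result would stay masked, while `np.nanmean` would return NaN with a warning.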

PyTorch BCELoss and BCEWithLogitsLoss

一世执手 submitted on 2019-12-19 04:55:49
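BCELoss

CLASS torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')

Creates a criterion that measures the binary cross entropy between the target and the output. The unreduced loss (i.e. with reduction set to 'none') can be described as:

$$\ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = -w_n \left[ y_n \log x_n + (1 - y_n) \log(1 - x_n) \right]$$

where N is the batch size. If reduction is not 'none' (the default is 'mean'), then:

$$\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if reduction} = \text{'mean'} \\ \operatorname{sum}(L), & \text{if reduction} = \text{'sum'} \end{cases}$$

That is, the per-sample losses in the batch are either averaged or summed. This can be used to measure reconstruction error, for example in an auto-encoder. Note that the targets y should be numbers between 0 and 1.

Parameters:
weight (Tensor, optional) – a manual rescaling weight given to the loss of each batch element. If given, has to be a Tensor of size nbatch.
size_average (bool, optional) – Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that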
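To make the reduction behaviour concrete, here is a minimal pure-Python sketch of the binary cross entropy formula from the PyTorch documentation. It is not PyTorch's implementation: the function names `bce_loss`, `bce_with_logits` and `sigmoid` are hypothetical, inputs are plain lists, and outputs are assumed to lie strictly between 0 and 1 (PyTorch additionally clamps the log terms, which this sketch does not). The logits variant illustrates the idea behind BCEWithLogitsLoss, which applies the sigmoid inside the loss in a numerically stable form.

```python
import math


def _reduce(losses, reduction):
    """Apply the 'none' / 'sum' / 'mean' reduction to a list of losses."""
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unknown reduction: {reduction!r}")


def bce_loss(outputs, targets, weights=None, reduction="mean"):
    """Binary cross entropy on probabilities x_n in (0, 1)."""
    if weights is None:
        weights = [1.0] * len(outputs)
    losses = [
        -w * (y * math.log(x) + (1.0 - y) * math.log(1.0 - x))
        for x, y, w in zip(outputs, targets, weights)
    ]
    return _reduce(losses, reduction)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_with_logits(logits, targets, reduction="mean"):
    """Same loss computed directly from logits z_n, in a stable form.

    Uses max(z, 0) - z*y + log(1 + exp(-|z|)), which avoids evaluating
    log(sigmoid(z)) for large |z|.
    """
    losses = [
        max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        for z, y in zip(logits, targets)
    ]
    return _reduce(losses, reduction)


targets = [1.0, 0.0]
mean_loss = bce_loss([0.8, 0.3], targets)                  # averaged over the batch
sum_loss = bce_loss([0.8, 0.3], targets, reduction="sum")  # summed over the batch
```

With two samples, `sum_loss` is exactly twice `mean_loss`, which is the only difference between the two reductions.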

Common PyTorch loss functions

无人久伴 submitted on 2019-12-19 03:21:45
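Basic usage of a loss function:

criterion = LossCriterion()  # the constructor has its own parameters
loss = criterion(x, y)       # calling the criterion also takes arguments

The resulting loss has already been averaged over the mini-batch.

1. BCELoss (binary classification)

CLASS torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')

Creates a criterion that measures the binary cross entropy between the target and the output. The unreduced loss (i.e. with the reduction parameter set to 'none') is:

$$\ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = -w_n \left[ y_n \log x_n + (1 - y_n) \log(1 - x_n) \right]$$

where N is the batch size, x_n is the output, and y_n is the target. If reduction is not 'none' (the default is 'mean'), then:

$$\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if reduction} = \text{'mean'} \\ \operatorname{sum}(L), & \text{if reduction} = \text{'sum'} \end{cases}$$

That is, by default the loss is averaged over the elements; if size_average=False, the losses are summed instead. This is used to measure reconstruction error, for example in an auto-encoder. Note that 0 <= target[i] <= 1.

Parameters:
weight (Tensor, optional) – a manual rescaling weight for the loss of each batch element. If given, it must be a Tensor of size "nbatch".
size_average (bool, optional) – Deprecated (see the reduction parameter). By default, set to True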

harmonic mean in python

空扰寡人 submitted on 2019-12-18 19:06:14
Question: The harmonic mean function in Python (scipy.stats.hmean) requires that the input be positive numbers. For example:

from scipy import stats
print(stats.hmean([-50.2, 100.5]))

results in:

ValueError: Harmonic mean only defined if all elements greater than zero

I don't see mathematically why this should be the case, except for the rare instance where you would end up dividing by zero. Instead of checking for a divide by zero, hmean() throws an error upon inputting any negative number,
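For illustration only, the formula the question has in mind, n divided by the sum of reciprocals, can be evaluated directly. The sketch below is a hypothetical helper, not SciPy's implementation. It rejects only the cases where the formula itself breaks down: a zero element, or reciprocals that sum to zero.

```python
def harmonic_mean(values):
    """Return n / sum(1/x), failing only where the formula is undefined."""
    values = list(values)
    if not values:
        raise ValueError("harmonic mean of an empty sequence is undefined")
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when an element is zero")
    reciprocal_sum = sum(1.0 / v for v in values)
    if reciprocal_sum == 0:
        raise ValueError("reciprocals sum to zero; harmonic mean is undefined")
    return len(values) / reciprocal_sum


# The input from the question: the formula yields a (negative) number.
result = harmonic_mean([-50.2, 100.5])
print(result)
```

Whether such a value is a meaningful "average" for mixed-sign data is a separate question. The result can fall far outside the range of the inputs, which is one plausible reason libraries restrict the function to positive numbers.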

Add a row with means of columns to pandas DataFrame

て烟熏妆下的殇ゞ submitted on 2019-12-18 14:53:40
Question: I have a pandas DataFrame consisting of some sensor readings taken over time, like this:

       diode1  diode2  diode3  diode4
Time
0.530       7       0      10      16
1.218      17       7      14      19
1.895      13       8      16      17
2.570       8       2      16      17
3.240      14       8      17      19
3.910      13       6      17      18
4.594      13       5      16      19
5.265       9       0      12      16
5.948      12       3      16      17
6.632      10       2      15      17

I have written code to add another row with the means of each column:

# List of the averages for the test.
averages = [df[key].describe()['mean'] for key in df]
indexes = df.index.tolist()
indexes.append('mean
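The question's code is cut off, so the following is not the asker's approach. It is a minimal sketch of one common way to append a row of column means, assigning `df.mean()` to a new label with `.loc`. It uses only the first three rows of the table above, and the variable name `with_mean` is hypothetical.

```python
import pandas as pd

# First three rows of the sensor table from the question.
df = pd.DataFrame(
    {
        "diode1": [7, 17, 13],
        "diode2": [0, 7, 8],
        "diode3": [10, 14, 16],
        "diode4": [16, 19, 17],
    },
    index=pd.Index([0.530, 1.218, 1.895], name="Time"),
)

# Work on a copy so the original readings stay free of the summary row.
with_mean = df.copy()
# df.mean() returns one value per column; assigning it to a new index
# label appends it as an extra row.
with_mean.loc["mean"] = df.mean()
print(with_mean)
```

Mixing a summary row into the data has side effects: the integer columns become floating point and the index now holds both numbers and a string. Keeping the means in a separate Series is often easier to work with later.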