p-value

ggplot2: Add p-value to grouped box plots

人盡茶涼 posted on 2021-01-20 09:55:06
Question: I am trying to add p-values to my plot using the stat_signif function. The problem is that my box plots are grouped: I want to compare each pair of box plots within the same category, but stat_signif requires x-axis values to define a comparison. This is my code: p <- ggplot(plot.data, aes(x = Element, y = Value, fill = Group)) + #Define the elements for plotting - group by "strandness". geom_boxplot(outlier.shape = NA, colour = "black") + scale_fill_manual(values = c("goldenrod",
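For readers working in Python, the per-category comparison described in the question can be sketched as follows. This is only an illustration, not the asker's solution: the column names Element, Value and Group come from the question, while the data values, the group labels "A" and "B", and the choice of a Mann-Whitney U test are assumptions.

```python
from collections import defaultdict

from scipy.stats import mannwhitneyu

# Hypothetical long-format records: (Element, Group, Value).
records = [
    ("E1", "A", 1.2), ("E1", "A", 1.5), ("E1", "A", 1.1), ("E1", "A", 1.4),
    ("E1", "B", 2.9), ("E1", "B", 3.1), ("E1", "B", 2.7), ("E1", "B", 3.3),
    ("E2", "A", 5.0), ("E2", "A", 5.2), ("E2", "A", 4.9), ("E2", "A", 5.1),
    ("E2", "B", 5.05), ("E2", "B", 5.15), ("E2", "B", 4.95), ("E2", "B", 5.25),
]

# Collect the values that belong to each (Element, Group) cell.
cells = defaultdict(list)
for element, group, value in records:
    cells[(element, group)].append(value)

# Run one two-sided test per Element, comparing its two groups.
p_values = {}
for element in sorted({e for e, _ in cells}):
    result = mannwhitneyu(
        cells[(element, "A")], cells[(element, "B")], alternative="two-sided"
    )
    p_values[element] = result.pvalue

for element, p in p_values.items():
    print(f"{element}: p = {p:.4f}")
```

Each p-value then belongs to one x-axis category rather than to a pair of x-axis values, which is the mismatch the question describes.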

p-values from ridge regression in python

南笙酒味 posted on 2020-06-25 05:29:29
Question: I'm using ridge regression (RidgeCV), which I imported with: from sklearn.linear_model import LinearRegression, RidgeCV, LarsCV, Ridge, Lasso, LassoCV How do I extract the p-values? I checked, but the ridge object has nothing called summary. I couldn't find any page that discusses this for Python (I found one for R). alphas = np.linspace(.00001, 2, 1) rr_scaled = RidgeCV(alphas = alphas, cv =5, normalize = True) rr_scaled.fit(X_train, Y_train) Answer 1: You can use the regressors package to output p
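Ridge estimates are biased, so any p-value attached to them is an approximation. One rough way to obtain such values by hand is sketched below with NumPy and SciPy only. It is not what the regressors package does (that is not shown in the excerpt); the simulated data, the penalty value `lam`, and the use of effective degrees of freedom are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated data (assumption): only the first feature drives y.
n, p = 200, 3
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] + rng.normal(size=n)

# Center X and y so that no intercept term is needed.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

lam = 1.0  # hypothetical ridge penalty
gram = Xc.T @ Xc
A = np.linalg.inv(gram + lam * np.eye(p))
beta = A @ Xc.T @ yc  # closed-form ridge coefficients

# Effective degrees of freedom: trace of the ridge hat matrix.
df_model = np.trace(A @ gram)
df_resid = n - 1 - df_model  # the extra 1 accounts for the centering
resid = yc - Xc @ beta
sigma2 = resid @ resid / df_resid

# Approximate coefficient covariance and t-based two-sided p-values.
cov = sigma2 * A @ gram @ A
se = np.sqrt(np.diag(cov))
t_stat = beta / se
p_values = 2 * stats.t.sf(np.abs(t_stat), df_resid)
print(p_values)
```

With a very small penalty the result approaches ordinary least squares, where the t-test is exact under the usual assumptions; the larger the penalty, the more cautiously these numbers should be read.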

R Visualization Study Notes: Adding p-values and Significance Markers

夙愿已清 posted on 2020-03-10 04:21:26
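R Visualization Study Notes: Adding p-values and Significance Markers http://www.jianshu.com/p/b7274afff14f?from=timeline

The previous article briefly mentioned how to add p-values and significance markers to ggplot figures with the ggpubr package; this article covers the topic in detail, using the ToothGrowth dataset for demonstration.

# Load the package first
library(ggpubr)
# Load the ToothGrowth dataset
data("ToothGrowth")
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

Comparison methods

The comparison methods most commonly used in R are:

Method           R function          Description
T-test           t.test()            Compare two groups (parametric)
Wilcoxon test    wilcox.test()       Compare two groups (non-parametric)
ANOVA            aov() or anova()    Compare multiple groups (parametric)
Kruskal-Wallis   kruskal.test()      Compare multiple groups (non-parametric)

Each method will be explained in later posts as time permits.

Adding p-values

This mainly relies on two functions from the ggpubr package: compare_means(): performs comparisons between one or more groups stat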
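For comparison, the four tests in the table have close counterparts in SciPy. The sketch below assumes Python with scipy installed; `g1` reuses the six len values printed by head(ToothGrowth) above, while `g2` and `g3` are made-up numbers, so the resulting p-values mean nothing beyond illustration.

```python
from scipy import stats

g1 = [4.2, 11.5, 7.3, 5.8, 6.4, 10.0]      # values shown by head(ToothGrowth)
g2 = [15.2, 21.5, 17.6, 9.7, 14.5, 16.5]   # hypothetical second group
g3 = [22.4, 25.5, 26.4, 32.5, 26.7, 21.2]  # hypothetical third group

# Two groups, parametric. equal_var=False gives Welch's test,
# which is also what R's t.test() uses by default.
p_t = stats.ttest_ind(g1, g2, equal_var=False).pvalue

# Two groups, non-parametric. For independent samples, R's wilcox.test()
# corresponds to the Mann-Whitney U test.
p_wilcox = stats.mannwhitneyu(g1, g2, alternative="two-sided").pvalue

# More than two groups, parametric: one-way ANOVA.
p_anova = stats.f_oneway(g1, g2, g3).pvalue

# More than two groups, non-parametric: Kruskal-Wallis.
p_kruskal = stats.kruskal(g1, g2, g3).pvalue

for name, p in [("t-test", p_t), ("Wilcoxon", p_wilcox),
                ("ANOVA", p_anova), ("Kruskal-Wallis", p_kruskal)]:
    print(f"{name}: p = {p:.4g}")
```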

A Worked Example of Time Series Analysis in R

杀马特。学长 韩版系。学妹 posted on 2020-02-17 09:22:05
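1. Assignment requirements

Choose a time series and carry out the full modeling process; the series must have length >= 100. The report must contain the following parts:

- Data description: data source, period, definition of the data, series length.
- Plot the time series and comment on it briefly.
- Test the series for stationarity and state the conclusion; a non-stationary series must be transformed until it is stationary.
- Plot the autocorrelation and partial autocorrelation functions and determine the model order.
- Fit the time series model and obtain the parameter estimates.
- Check the model residuals, judge whether the model is adequate, and give the final estimation results.
- Use the fitted time series model to forecast.

2. Data description

Data source: National Bureau of Statistics > Statistical Data > Monthly Data > Transportation > Passenger Traffic. For the time range choose "2005-", which means 2005 to the present. Click "Download", choose the CSV format, and rename the file to "旅客运输量.csv". http://data.stats.gov.cn/easyquery.htm?cn=A01 The data used here is row 8 of the table: railway passenger volume, current-period value (10,000 persons).

Period: January 2005 to October 2019. The data from 2005 to 2017 is used to build the model; the data from 2018 and 2019 is compared with the forecasts.

Definition of the data: the data type is a time series (ts).

# Load the required R packages
library(fUnitRoots)
## Warning: package 'fUnitRoots' was built under R version 3.5.3
## Loading required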

Python sklearn - how to calculate p-values

梦想的初衷 posted on 2019-12-30 01:20:49
Question: This is probably a simple question, but I am trying to calculate the p-values for my features, using either classifiers for a classification problem or regressors for regression. Could someone suggest the best method for each case and provide sample code? I just want to see the p-value for each feature, rather than keeping the k best features or a percentile of features as explained in the documentation. Thank you. Answer 1: Just run the significance test on X, y directly. Example using 20news and chi2 :
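The answer's own example is cut off in this excerpt. As a stand-in, here is a minimal sketch of calling the scoring functions in sklearn.feature_selection directly; each returns a pair of arrays (statistics, p-values) with one entry per feature. The simulated data and variable names are assumptions, and the 20news dataset mentioned in the answer is not used.

```python
import numpy as np
from sklearn.feature_selection import chi2, f_classif, f_regression

rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 3))

# Simulated targets (assumption): feature 0 drives the class label,
# feature 1 drives the regression target.
y_class = (X[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(int)
y_reg = 2.0 * X[:, 1] + rng.normal(size=n)

# Classification: ANOVA F-test per feature.
_, p_classif = f_classif(X, y_class)

# Regression: univariate linear F-test per feature.
_, p_reg = f_regression(X, y_reg)

# chi2 requires non-negative features such as counts.
X_counts = rng.poisson(lam=3.0, size=(n, 3))
_, p_chi2 = chi2(X_counts, y_class)

print("f_classif p-values:   ", p_classif)
print("f_regression p-values:", p_reg)
print("chi2 p-values:        ", p_chi2)
```

These are univariate tests: each feature is scored on its own, so the p-values do not account for the other features the way coefficients in a fitted model would.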

R ggplot2 boxplots - ggpubr stat_compare_means not working properly

落爺英雄遲暮 posted on 2019-12-28 16:13:28
Question: I am trying to add significance levels to my box plots in the form of asterisks using ggplot2 and the ggpubr package, but I have many comparisons and only want to show the significant ones. I tried the option hide.ns=TRUE in stat_compare_means, but it clearly does not work; it might be a bug in the ggpubr package. Also, you can see that I leave the group "PGMC4" out of the pairwise wilcox.test comparisons; how can I also leave this group out of the kruskal.test? The last question I
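Independent of the plotting layer, the filtering the question asks for can be done on the test results themselves. The sketch below is a Python illustration, not a fix for ggpubr: the group name "PGMC4" comes from the question, while the other group names, all data values, and the 0.05 threshold are assumptions. No multiple-comparison correction is applied.

```python
from itertools import combinations

from scipy.stats import kruskal, mannwhitneyu

# Hypothetical measurements per group.
groups = {
    "G1": [1.0, 1.2, 0.9, 1.1, 1.3, 0.8],
    "G2": [1.05, 1.15, 0.95, 1.25, 0.85, 1.35],
    "G3": [3.0, 3.2, 2.9, 3.1, 3.3, 2.8],
    "PGMC4": [2.0, 2.1, 1.9, 2.2, 1.8, 2.3],
}

# Drop the excluded group once, so both tests see the same subset.
excluded = {"PGMC4"}
kept = {name: vals for name, vals in groups.items() if name not in excluded}

# Global test across the remaining groups.
overall_p = kruskal(*kept.values()).pvalue

# Pairwise tests; keep only the significant pairs.
alpha = 0.05
significant = {}
for a, b in combinations(kept, 2):
    p = mannwhitneyu(kept[a], kept[b], alternative="two-sided").pvalue
    if p < alpha:
        significant[(a, b)] = p

print(f"Kruskal-Wallis p = {overall_p:.4f}")
for pair, p in significant.items():
    print(pair, f"p = {p:.4f}")
```

Filtering the data before both tests keeps the global and pairwise results consistent with each other.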

How to increase the hatch density matplotlib?

与世无争的帅哥 posted on 2019-12-24 23:17:36
Question: I'm trying to increase the density of the hatch marks on a map. This is how I'm currently doing it, and nothing shows up on my map when it should. I want to contour values less than 0.05 (significant p-values). Repeating the hatch pattern I want doesn't help either. levels = [spnewdata[:,:,1].min(), 0.05] cs1 = plt.contour(x, y, spnewdata[:,:,1],levels=levels, colors='none', hatch='X', alpha=0) Edit: Here's a little more complete form of the code I'm using to make
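One approach that commonly works for this kind of significance hatching is sketched below. It uses filled contours (contourf) with the `hatches` list argument and repeats the pattern character to make the hatching denser. The random p-value field and the grid are assumptions standing in for `spnewdata`, and the sketch does not claim to diagnose the asker's exact setup.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical grid and p-value field.
x = np.linspace(0, 10, 30)
y = np.linspace(0, 5, 20)
pvals = rng.uniform(0, 1, size=(y.size, x.size))

fig, ax = plt.subplots()

# Hatch only the band 0 <= p < 0.05. colors="none" leaves the fill
# transparent, and "xxxx" is denser than "x" because the character repeats.
cs = ax.contourf(x, y, pvals, levels=[0.0, 0.05],
                 colors="none", hatches=["xxxx"])

# Render to an in-memory PNG to confirm that the figure draws.
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
```

Note that this sketch leaves alpha at its default rather than setting alpha=0 as in the question.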

R small pvalues

China☆狼群 posted on 2019-12-24 02:28:18
Question: I am calculating z-scores to see whether a value is far from the mean/median of the distribution. I originally did this using the mean, then converted the z-scores into two-sided p-values. Now that I am using the median, I noticed some NAs in the p-values. I determined that this occurs for values that are very far from the median, and it looks to be related to the pnorm calculation. " 'qnorm' is based on Wichura's algorithm AS 241 which provides precise results up to about 16 digits. " Does anyone know
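Whatever the cause of the NAs turns out to be, far-tail p-values are numerically delicate. The Python sketch below shows two typical failure modes and a log-scale workaround using scipy.stats.norm. The z values are arbitrary examples; in R, the corresponding tools are the lower.tail = FALSE and log.p = TRUE arguments of pnorm.

```python
import numpy as np
from scipy.stats import norm

z = 10.0

# Naive form: cdf(z) rounds to exactly 1.0 in double precision,
# so the subtraction cancels to 0.0.
naive = 2 * (1 - norm.cdf(z))

# The survival function evaluates the upper tail directly and keeps
# the tiny value.
stable = 2 * norm.sf(z)

z_far = 40.0

# Even the survival function underflows to 0.0 this far out.
underflow = 2 * norm.sf(z_far)

# Working on the log scale still gives a finite answer:
# log(p) = log(2) + log(sf(z)).
log_p = np.log(2) + norm.logsf(z_far)

print("naive:    ", naive)
print("stable:   ", stable)
print("underflow:", underflow)
print("log p:    ", log_p)
```

Reporting log p-values, or the z-scores themselves, avoids the loss of information for extreme observations.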