Correlation coefficient on gnuplot

Submitted by 安稳与你 on 2019-12-06 07:31:12

Question


I want to fit my data with the function f(x) = a+b*x**2 and plot the result. After fitting, I get this output:

correlation matrix of the fit parameters:

               m      n      
m               1.000 
n              -0.935  1.000 

My question is: how can I find the correlation coefficient in gnuplot?


Answer 1:


If you're looking for a way to calculate the correlation coefficient as defined on this page, you are out of luck using gnuplot as explained in this Google Groups thread.

There are lots of other tools for calculating correlation coefficients, e.g. numpy.
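As a rough illustration of what such a tool computes, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the data points are made up for this example; with NumPy installed, `numpy.corrcoef(x, y)[0, 1]` should give the same value. Because the model in the question is linear in x**2, the sketch correlates y with x**2 rather than with x.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, roughly y = 1 + 2*x**2 with small perturbations
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 9.2, 18.8, 33.1]

# f(x) = a + b*x**2 is a straight line in u = x**2, so correlate y with u
u = [xi ** 2 for xi in x]
r = pearson_r(u, y)
print("r = %.3f" % r)
```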




Answer 2:


You can use the stats command in gnuplot, whose syntax is similar to that of the plot command:

stats "file.dat" using 2:(f($2)) name "A"

The correlation coefficient will be stored in the variable A_correlation. You can use it later when plotting your data, or simply show it on the graph with the set label command:

set label 1 sprintf("r = %4.2f",A_correlation) at graph 0.1, graph 0.85

You can find more about the stats command in the gnuplot documentation.




Answer 3:


Although there is no direct solution to this problem, a workaround is possible. I'll illustrate it using python/numpy. First, the part of the gnuplot script that generates the fit and connects with a python script:

    file = "my_data.tsv"
    f(x)=a+b*(x)
    fit f(x) file using 2:3 via a,b
    r = system(sprintf("python correlation.py %s",file)) 
    ti = sprintf("y = %.2f + %.2fx (r = %s)", a, b, r)
    plot \
      file using 2:3 notitle,\
      f(x) title ti

This runs correlation.py to retrieve the correlation 'r' in string format. It uses 'r' to generate a title for the fit line. Then, correlation.py:

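    import sys
    from numpy import corrcoef, genfromtxt

    # Read the tab-separated file named on the command line
    data = genfromtxt(sys.argv[1], delimiter='\t')
    # Skip the header row; correlate zero-based columns 1 and 2
    r = corrcoef(data[1:, 1], data[1:, 2])[0, 1]
    # Print without the leading zero, e.g. ".592"
    print(("%.3f" % r).lstrip('0'))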

Here, the first row is assumed to be a header row. The columns used for the correlation are hardcoded as zero-based indices 1 and 2, which correspond to gnuplot columns 2 and 3 in the script above. Both settings can of course be changed or turned into command-line arguments.

The resulting title of the fit line is (for a personal example):

y = 2.15 + 1.58x (r = .592)
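One way the header and column settings could be turned into arguments is sketched below. This is only an illustration using the standard library: the function name `column_correlation`, its parameters, and the argument order on the command line are assumptions, not part of the original script.

```python
import csv
import math
import sys

def column_correlation(path, col_x=1, col_y=2, skip_header=1, delimiter="\t"):
    """Pearson r between two zero-based columns of a delimited text file."""
    xs, ys = [], []
    with open(path, newline="") as handle:
        for i, row in enumerate(csv.reader(handle, delimiter=delimiter)):
            if i < skip_header or not row:
                continue  # skip header rows and blank lines
            xs.append(float(row[col_x]))
            ys.append(float(row[col_y]))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical invocation: python correlation.py my_data.tsv 1 2
    col_x = int(sys.argv[2]) if len(sys.argv) > 2 else 1
    col_y = int(sys.argv[3]) if len(sys.argv) > 3 else 2
    print("%.3f" % column_correlation(sys.argv[1], col_x, col_y))
```

From gnuplot, the extra arguments would simply be appended to the string passed to system(), for example sprintf("python correlation.py %s 1 2", file).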



Answer 4:


Since you are probably using the fit command, you can first refer to this link to arrive at an R² value. The link uses variables that gnuplot sets after a fit, such as FIT_WSSR and FIT_NDF, to calculate R². The code for R² is given as:

SST = FIT_WSSR/(FIT_NDF+1)
SSE=FIT_WSSR/(FIT_NDF)
SSR=SST-SSE
R2=SSR/SST

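Note that, as written, these expressions reduce algebraically to R2 = -1/FIT_NDF, which does not depend on how well the curve fits. The conventional definition is R² = 1 - SSE/SST, where SSE is the residual sum of squares (FIT_WSSR for an unweighted fit) and SST is the total sum of squares of y about its mean. Whichever value you use, the next step is to show it on the graph, which can be done with: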

set label 1 sprintf("R2 = %f",R2) at graph 0.7, graph 0.7
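For comparison, the textbook coefficient of determination is R² = 1 - SSE/SST, with SSE the sum of squared residuals and SST the sum of squared deviations of y from its mean. The sketch below computes it in plain Python for the model f(x) = a + b*x**2 from the question. It is independent of gnuplot; the function names and the data are made up for illustration.

```python
def fit_a_plus_b_x2(xs, ys):
    """Least-squares fit of y = a + b*x**2, treated as a line in u = x**2."""
    us = [x ** 2 for x in xs]
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    b = (sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
         / sum((u - mean_u) ** 2 for u in us))
    a = mean_y - b * mean_u
    return a, b

def r_squared(xs, ys, model):
    """Coefficient of determination R^2 = 1 - SSE/SST for a fitted model."""
    mean_y = sum(ys) / len(ys)
    # SSE: squared residuals between the data and the fitted curve
    sse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    # SST: squared deviations of the data from their mean
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst

# Made-up data, roughly y = 1 + 2*x**2 with small perturbations
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 9.2, 18.8, 33.1]

a, b = fit_a_plus_b_x2(x, y)
r2 = r_squared(x, y, lambda t: a + b * t ** 2)
print("a = %.3f, b = %.3f, R2 = %.5f" % (a, b, r2))
```

For an unweighted fit, SSE here plays the role of gnuplot's FIT_WSSR; SST has to be computed from the data separately.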



Source: https://stackoverflow.com/questions/13957456/correlation-coefficient-on-gnuplot
