Rocksolid Light

Welcome to Rocksolid Light

mail  files  register  newsreader  groups  login

Message-ID:  

Do not meddle in the affairs of troff, for it is subtle and quick to anger.


devel / comp.graphics.apps.gnuplot / MSE of a linear regression line + prediction intervals

SubjectAuthor
o MSE of a linear regression line + prediction intervalsAlex van der Spek

1
MSE of a linear regression line + prediction intervals

<nnd$0cc4cb6a$6d5d137d@a08940df5950d1eb>

  copy mid

https://news.novabbs.org/devel/article-flat.php?id=160&group=comp.graphics.apps.gnuplot#160

  copy link   Newsgroups: comp.graphics.apps.gnuplot
From: amvds@xs4all.nl (Alex van der Spek)
Subject: MSE of a linear regression line + prediction intervals
Newsgroups: comp.graphics.apps.gnuplot
MIME-Version: 1.0
User-Agent: Pan/0.149 (Bellevue; 4c157ba)
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Message-ID: <nnd$0cc4cb6a$6d5d137d@a08940df5950d1eb>
Organization: KPN B.V.
Date: Tue, 29 Aug 2023 14:54:18 +0200
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!peer03.ams4!peer.am4.highwinds-media.com!news.highwinds-media.com!feed.abavia.com!abe005.abavia.com!abp002.abavia.com!news.kpn.nl!not-for-mail
Lines: 129
Injection-Date: Tue, 29 Aug 2023 14:54:18 +0200
Injection-Info: news.kpn.nl; mail-complaints-to="abuse@kpn.com"
X-Received-Bytes: 4493
 by: Alex van der Spek - Tue, 29 Aug 2023 12:54 UTC

I worked out a way to compute and plot the wide and narrow prediction
intervals of a linear regression line. A self contained working plot
script goes below.

I have two questions.

1. It seems as if the MSE could have been computed by the stats command.
It don't think it is so I put in a work around which is rather convoluted.
Am I overlooking something?

2. The shading between the prediction intervals doesn't clip correctly at
the plot boundaries. I believe this is a known issue for which no work
around exists? It would be a nice to have only, currently I just plot the
lines instead of a fill between curves (which is in the example code
below).

Thanks in advance,
Alex

+++++++++++++++++++++
#!/usr/bin/env gnuplot
#Make a Weibull plot of DC motor data taken from Philips databook C18
(1986)

#Clean slate
unset multiplot
unset out
reset
set term qt size 640,480
set encoding utf8

#Inline data N=20
$MDAT << EOF
#Benard Hours Exact
0.034 1750 0.0341
0.083 2000 0.0825
0.132 2525 0.1315
0.181 2600 0.1805
0.230 2815 0.2297
EOF

#Linear fit
stats $MDAT using (log10(column(2))):(log(-log(1-column(1))))
n = STATS_records
x_mean = STATS_mean_x; y_mean = STATS_mean_y
x_std = STATS_stddev_x; y_std = STATS_stddev_y
a = STATS_slope; b = STATS_intercept
a_err = STATS_slope_err; b_err = STATS_intercept_err

#Mean Square Error
set table $MSED
plot $MDAT using (($1)-(1-exp(-exp(a*log10($2)+b))))**2
unset table
stats $MSED using 2
MSE = STATS_sum / (n-2)

#Wide prediction intervals
set table $WPINT
set xrange [200:20000]
set samples 1000
t = 3.18
plot '+' using ($1):(1-exp(-exp(a*log10($1)+b))) + t * \
sqrt(MSE * (1+1/n+(log10($1)-x_mean)**2/(n*x_std**2))): \
(1-exp(-exp(a*log10($1)+b))) - t * \
sqrt(MSE * (1+1/n+(log10($1)-x_mean)**2/(n*x_std**2))) \
with filledcurves
unset table

#Narrow prediction intervals
set table $NPINT
set xrange [200:20000]
set samples 1000
t = 3.18
plot '+' using ($1):(1-exp(-exp(a*log10($1)+b))) + t * \
sqrt(MSE * (1/n+(log10($1)-x_mean)**2/(n*x_std**2))):\
(1-exp(-exp(a*log10($1)+b))) - t * \
sqrt(MSE * (1/n+(log10($1)-x_mean)**2/(n*x_std**2))) \
with filledcurves
unset table

#Compute B10, MTBF and FR via shape (k) and scale (l) parameters
log10exp1 = log10(exp(1))
k = a * log10(exp(1))
l = exp(-b / k)

#Quantile function for Weibull distribution
Q(p, k, l) = l * (-log(1-p))**(1.0/k)

#B10 hours, 90% lives longer
B10 = Q(0.1, k, l)

#Characteristic time and failure rate
CT = Q(1-exp(-1), k, l)
FR = 1 / CT

#Mean Time Between Failures
MTBF = l * gamma(1 + 1.0/k)

#Plot specifics
set samples 100
set grid x y mx
set tics out
set xlabel 'Operating hours'
set ylabel 'Median rank'
set key top left opaque
set style fill transparent solid 0.25

#Non linear, Weibull distributed y axis, x axis log
set yrange [0.01:0.99]
set xrange [1000:7000]
set nonlinear y via log(-log(1-y)) inverse 1-exp(-exp(y))
set logscale x
set ytics (0.01, 0.02, 0.05, 0.1, 0.2, 0.5, \
"1-e^{-1}" 0.632, 0.9, 0.99)
set xtics (1000, '' 1500, 2000, 3000, 4000, 5000, 6000, 7000)

set title sprintf("B_{10}=%.fh; MTBF=%.fh; FR = %.2f/1000h", B10, MTBF,
FR*1000)
plot $WPINT using 1:2:3 with filledcurves lc 2 notitle, \
$NPINT using 1:2:3 with filledcurves lc 4 notitle, \
$MDAT using 2:1 with points pt 7 ps 2 lc 8 title 'Data', \
$MDAT using 2:3 with points pt 6 ps 2 lc 8 title 'Data', \
1-exp(-1) with lines dt 2 title '{/Symbol t}', \
0.1 with lines dt 3 title 'B_{10}', \
(1-exp(-exp(a*log10(x)+b))) with lines lc 7 lw 2 title 'Fit Weibull',
\ (1-exp(-exp(log(x/l)))) with lines lc 7 dt 3 title 'Fit Exponential'

1
server_pubkey.txt

rocksolid light 0.9.81
clearnet tor