By László Györfi, Michael Kohler, Adam Krzyzak, Harro Walk

ISBN-10: 0387954414

ISBN-13: 9780387954417

This book offers a systematic, in-depth analysis of nonparametric regression with random design. It covers almost all known estimates. The emphasis is on distribution-free properties of the estimates.

[…] (…14) Usually this leads to overly optimistic estimates of the $L_2$ risk and is hence not useful. […] (…14) favors estimates which are too well adapted to the data and are not reasonable for new observations $(X, Y)$. This problem doesn't occur if one uses a new sample $(\bar{X}_1, \bar{Y}_1), \dots$ […] (…12), where $(\bar{X}_1, \bar{Y}_1), \dots, (\bar{X}_n, \bar{Y}_n), (X_1, Y_1), \dots$ […], if one minimizes

$$\frac{1}{n} \sum_{i=1}^{n} |m_{n,p}(\bar{X}_i) - \bar{Y}_i|^2. \tag{\ldots 15}$$

Of course, in the regression function estimation problem one doesn't have an additional sample. But this isn't a real problem, because we can simply split the sample $D_n$ into two parts: a learning sample $D_{n_1} = \{(X_1, Y_1), \dots$
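The optimism of choosing a parameter on the same data that built the estimate can be illustrated with a small simulation. This is a minimal sketch, not code from the book: it assumes a Gaussian-kernel (Nadaraya-Watson type) estimate whose bandwidth `h` plays the role of the parameter p, and it uses hypothetical simulated data (sine regression function plus Gaussian noise). The names `kernel_estimate`, `empirical_l2_risk`, and `simulate` are illustrative only.

```python
import math
import random

def kernel_estimate(x, xs, ys, h):
    """Kernel regression estimate at x (Gaussian kernel, bandwidth h)."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed: fall back to the mean of ys
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total

def empirical_l2_risk(xs_fit, ys_fit, xs_eval, ys_eval, h):
    """Mean squared error, on (xs_eval, ys_eval), of the estimate built
    from (xs_fit, ys_fit)."""
    return sum(
        (kernel_estimate(x, xs_fit, ys_fit, h) - y) ** 2
        for x, y in zip(xs_eval, ys_eval)
    ) / len(xs_eval)

def simulate(n, rng):
    """Assumed model: X uniform on [0, 1], Y = sin(2*pi*X) + N(0, 0.3^2)."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

rng = random.Random(0)
xs, ys = simulate(200, rng)          # data used to build the estimate
xs_new, ys_new = simulate(200, rng)  # independent new sample

bandwidths = [0.001, 0.01, 0.05, 0.1, 0.3]
# Risk evaluated on the same data the estimate was built from.
resub = {h: empirical_l2_risk(xs, ys, xs, ys, h) for h in bandwidths}
# Risk evaluated on the independent new sample.
fresh = {h: empirical_l2_risk(xs, ys, xs_new, ys_new, h) for h in bandwidths}

h_resub = min(bandwidths, key=resub.get)
h_fresh = min(bandwidths, key=fresh.get)
print("chosen on same data:", h_resub, " chosen on new sample:", h_fresh)
```

In a simulation of this kind, the same-data criterion tends to pick the smallest bandwidth, because a nearly interpolating estimate reproduces its own observations, while the new-sample criterion penalizes that behavior.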

The kernel estimate is due to Nadaraya (1964; 1970) and Watson (1964). Nearest neighbor estimates were introduced in pattern recognition by Fix and Hodges (1951) and also used in density estimation and regression estimation by Loftsgaarden and Quesenberry (1965), Royall (1966), Cover and Hart (1967), Cover (1968a), and Stone (1977), respectively. The principle of least squares, which is behind the global modeling estimates, is much older. It was independently proposed by A. M. Legendre in 1805 and by C.

$\dots, (X_{n_1}, Y_{n_1})\}$, which we use to construct estimates $m_{n_1,p}(\cdot) = m_{n_1,p}(\cdot, D_{n_1})$ depending on some parameter $p$, and a test sample $\{(X_{n_1+1}, Y_{n_1+1}), \dots, (X_n, Y_n)\}$, which we use to choose the parameter $p$ of the estimate by minimizing

$$\frac{1}{n - n_1} \sum_{i=n_1+1}^{n} |m_{n_1,p}(X_i) - Y_i|^2. \tag{\ldots 16}$$

In applications one often uses $n_1 = \frac{2}{3} n$ or $n_1 = \frac{n}{2}$. If $n$ is large, especially if $n$ is so large that it is computationally difficult to construct an estimate $m_n$ using all the data, then this is a very reasonable method (cf.
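The splitting procedure described in this excerpt can be sketched as follows. This is an illustration under stated assumptions rather than the book's implementation: the estimate is taken to be a k-nearest-neighbor average, so the parameter p is the number of neighbors k, and the data are hypothetical simulated pairs. The helper names `knn_estimate` and `select_by_splitting` are invented for this example.

```python
import math
import random

def knn_estimate(x, xs, ys, k):
    """k-nearest-neighbor estimate at x: average Y over the k closest X's."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / k

def select_by_splitting(xs, ys, params, n1):
    """Choose the parameter that minimizes the empirical L2 risk on the
    test sample. The first n1 pairs are the learning sample; the
    remaining n - n1 pairs are the test sample."""
    n = len(xs)
    learn_x, learn_y = xs[:n1], ys[:n1]
    test_x, test_y = xs[n1:], ys[n1:]
    risks = {}
    for p in params:
        risks[p] = sum(
            (knn_estimate(x, learn_x, learn_y, p) - y) ** 2
            for x, y in zip(test_x, test_y)
        ) / (n - n1)
    best = min(params, key=risks.get)
    return best, risks

# Hypothetical data: X uniform on [0, 1], Y = sin(2*pi*X) + N(0, 0.3^2).
rng = random.Random(1)
n = 300
xs = [rng.random() for _ in range(n)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]

n1 = (2 * n) // 3  # learning sample holds two thirds of the data
candidates = [1, 5, 15, 50, 150]
best_k, risks = select_by_splitting(xs, ys, candidates, n1)
print("selected k:", best_k)
```

Because each candidate is fitted only on the learning sample and scored only on the test sample, the extreme choices (a single neighbor, or nearly all of the learning sample) are typically rejected in favor of an intermediate value.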

