I believe, for n-th order evaluation of ranges, you need to be able to get the range of a polynomial of degree n-1 to at least n-th order. In turn, that job may be difficult. There is an article by Cornelius and Lohner, Computing, vol. 33, pp. 331-347, 1984, entitled "Computing the Range of Values of Real Functions with Accuracy Higher than Second Order" that gives insight into this question.
If you get any additional responses, please either post them to the alias or send them to me, as I am presently also interested in this question.
The Taylor model method we have been using recently (mostly for verified integration of ODEs and control of the dependency problem) provides such inclusions. Specifically, instead of intervals one uses (floating point) Taylor polynomials with remainder bound and intrinsic operations on these; the sharpness scales with order (n+1), where n is the order of the polynomial. In practice we often use multivariate polynomials of orders between 5 and 20, depending on the complexity of the problem. Some references are
K. Makino and M. Berz, Remainder Differential Algebras and their Applications, in: Computational Differentiation: Techniques, Applications, and Tools, M. Berz, C. Bischof, G. Corliss and A. Griewank (Eds), SIAM, p.63-74, 1996
M. Berz and G. Hoffstaetter, Computation and Application of Taylor Polynomials with Interval Remainder Bounds, Reliable Computing 4, p. 83-97, 1998
M. Berz, Differential Algebras with Remainder and Rigorous Proofs of Long-Term Stability, Fourth Computational Accelerator Physics Conference, AIP Conference Proceedings vol 391, p. 221, 1996 (a particularly difficult global optimization problem)
M. Berz and K. Makino, Verified Integration of ODEs and Flows with Differential Algebraic Methods on Taylor Models, Reliable Computing 4, p. 361-369, 1998
K. Makino and M. Berz, Efficient Control of the Dependency Problem based on Taylor Model Methods, Reliable Computing 5, p. 3-12, 1999
M. Berz and K. Makino, New Methods for High-Dimensional Verified Quadrature, Reliable Computing 5, p. 13-22, 1999
These papers are available for download at bt.nscl.msu.edu/pub. Hope this helps,
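To make the mechanics concrete for readers new to the approach, here is a deliberately tiny one-variable sketch. All names, the fixed domain, and the crude remainder handling are illustrative assumptions of this note, not COSY's actual design, which is multivariate, supports the intrinsic functions, and makes the floating point rigorous:

```python
# A model is (coeffs, rem): coeffs[i] multiplies x^i (expansion point 0),
# and rem = (lo, hi) encloses the truncation error over DOMAIN.

ORDER = 4
DOMAIN = (-0.1, 0.1)               # box on which the model is valid

def imul(a, b):                    # interval multiplication
    p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return min(p), max(p)

def iadd(a, b):                    # interval addition
    return a[0] + b[0], a[1] + b[1]

def ibound(coeffs):
    """Enclose sum(coeffs[i] * x^i) over DOMAIN by interval Horner."""
    acc = (0.0, 0.0)
    for c in reversed(coeffs):
        acc = iadd(imul(acc, DOMAIN), (c, c))
    return acc

def tm_mul(f, g):
    """Product of Taylor models f = (cf, rf), g = (cg, rg), truncated at ORDER."""
    (cf, rf), (cg, rg) = f, g
    prod = [0.0] * (len(cf) + len(cg) - 1)
    for i, a in enumerate(cf):
        for j, b in enumerate(cg):
            prod[i + j] += a * b
    keep = prod[:ORDER + 1]
    spill = [0.0] * (ORDER + 1) + prod[ORDER + 1:]      # truncated high part
    rem = ibound(spill)                                  # bound spilled terms
    rem = iadd(rem, imul(ibound(cf), rg))                # Pf * Ig
    rem = iadd(rem, imul(ibound(cg), rf))                # Pg * If
    rem = iadd(rem, imul(rf, rg))                        # If * Ig
    return keep, rem

def tm_range(f):
    """Range enclosure: bound of the polynomial part plus the remainder."""
    coeffs, rem = f
    return iadd(ibound(coeffs), rem)

# Example: enclose (1 + x)^2 over [-0.1, 0.1]; the true range is [0.81, 1.21].
one_plus_x = ([1.0, 1.0], (0.0, 0.0))
lo, hi = tm_range(tm_mul(one_plus_x, one_plus_x))        # about [0.79, 1.21]
```

Note that the last step here bounds the polynomial part by plain interval evaluation, which is itself only second order; how to bound the polynomial part to higher order is exactly the question debated later in this thread.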
I'm embarrassed that I was so blind to Berz' work. And so much of it is in Reliable Computing!! Yes, I'm more optimistic now that some of our hard problems can be solved that way. It may also allow simplification of the algorithms by obviating the need for, e.g., additional sophistication in the epsilon inflation.
I'm presently studying Berz' work.
You know I (like many other people) am a great admirer of your results. They lead to really great applications and interesting theoretical models.
However, with Sheela's question as I understand it, I am not sure that they can help.
It is my understanding (and I may be wrong) that she is asking about the following problem:
For example, this function f(x1,...,xn) can be a polynomial, e.g., a quadratic polynomial f(x1,...,xn) = a0 + a1*x1 + ... + an*xn + a11*x1*x1 + a12*x1*x2 + ... + ann*xn*xn with exactly known coefficients a0, ai, and aij. I do not see how your programs can help in computing the range of this function on a given box (I give the example of a quadratic polynomial because, as I mentioned, computing the range of a quadratic function with a given accuracy is an NP-hard problem).
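To make the dependency issue behind this hardness concrete, here is a small illustration (the example polynomial is mine, not from the correspondence):

```python
# Even for an exactly known quadratic, naive interval evaluation can badly
# overestimate the range, and the exact range problem becomes NP-hard as
# the number of variables grows. Take q(x) = x - x^2 on [0, 1]; the true
# range is [0, 0.25].

def q(x):
    return x - x * x

# Naive interval evaluation decouples the two occurrences of x:
# [0,1] - [0,1]*[0,1] = [0,1] - [0,1] = [-1, 1], eight times wider than
# the true range.
naive = (0.0 - 1.0, 1.0 - 0.0)

# Dense sampling recovers a close inner approximation of the true range.
vals = [q(i / 10000.0) for i in range(10001)]
sampled = (min(vals), max(vals))           # (0.0, 0.25)
```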
By the way, the fact that I do not know how does not mean that it is not possible, of course. If you know how to use it, that will be of great interest.
Is this NP-complete in the number of variables, in the length of the description of the function, in both, or in some other item? Can you post the reference for us? Or has Martin proven P = NP? My guess is "no" to the latter. However, his published results do look like an advance, in some sense.
I suggest you look at Berz' work. It may do the trick for you, depending on the widths of x1, x2, and x3. One downside: Unless you use Berz' software, it may take some time to implement.
I apologize for the confusion.
I sent the email yesterday, before my reaction to Martin's posting, with references and explanations; my reaction to Martin's posting was supposed to be a minor comment on my previous emails. It looks like my previous emails were not received, so I am re-sending them:
Such forms may exist for particular cases, but in general, it is impossible; the exact definitions and proofs are given in our paper
Hung T. Nguyen, Vladik Kreinovich, Vyacheslav Nesterov, and Mutsumi Nakamura, "On hardware support for interval computations and for soft computing: a theorem", IEEE Transactions on Fuzzy Systems, 1997, Vol. 5, No. 1, pp. 108-127.
If you have difficulty finding this paper, let me know your mailing address, I will send you a xerox copy by airmail.
P.S. Actually, even for quadratic polynomials, there is no feasible way of computing the range with a given accuracy (it is an NP-hard problem); this result (originally due to Vavasis) is described, among others, in our book
Vladik Kreinovich, Anatoly Lakeyev, Jiri Rohn, and Patrick Kahl, "Computational complexity and feasibility of data processing and interval computations", Kluwer, Dordrecht, 1997.
These negative results do not mean that the situation is all bad: in many cases, there are methods which work in almost all situations (see the same book), so practical bounds which are often good are still possible. Vladik
Today's comment, and answer to Baker's question: the problem of computing the range is NP-hard if we allow an arbitrary number of variables; when we fix the number of variables, the problem can be solved in polynomial time (see our above-cited book), but the degree of this polynomial grows very rapidly with the number of variables.
P.P.S. Baker, you seem to say that somewhere in Martin's papers he gives an algorithm for computing the exact (or good approximate) range of a given polynomial (to be more precise, he probably talks about a more general problem, but you seem to imply that it is applicable to a polynomial as well).
Definitely, if you have a function which is NOT a polynomial (as Sheela has), then Martin's method can help by reducing it to a polynomial of arbitrary degree (the higher the degree, the better the accuracy), but we will still have the problem of computing the range of a polynomial, a problem which is NP-hard even when the polynomial is known exactly (and even when it is quadratic).
It may be that Martin handles the part about computing the range of the polynomial as well, but I could not find that part. We looked rather attentively into his papers, because we understand that he does help to solve differential equations nicely, and we cite that result in our book.
Can you please tell me the exact place where he describes how to find the range of a given polynomial? I will be glad to look into it.
I do not doubt that it is possible to have an efficient algorithm for computing the range of a polynomial, since, as we have proven in the book (see the above-repeated email), there are polynomial-time algorithms which compute the range in almost all cases (in some reasonable sense); I just never looked at Martin's papers in this way.
I did NOT say that Martin gives an algorithm for the exact range of a polynomial. I am mainly speculating that the Taylor models he espouses may be more practical than traditional interval computations in many contexts requiring good bounds on the range. This is not necessarily in conflict with your complexity results. In particular, my understanding is that the theory surrounding the Taylor models is an asymptotic theory, as the widths of the intervals go to zero. However, like high-order floating-point approximations, the higher-order models may give better approximations to the range, within wider intervals, than lower-order models (including plain interval arithmetic).
I must say that I am just now devoting the time they deserve to study Berz' developments. I will also need to revisit your book to reconcile the success of the Taylor models with your complexity theory results. But I don't think there is any contradiction.
Thanks for the clarification.
I agree that there is no contradiction between Vavasis's and our complexity results and the practical fact that algorithms often seem to be reasonably efficient: after all, complexity results are about worst-case complexity, and in practice what we care about is average-case complexity. It is quite possible (and indeed the case for range computation) that the worst-case complexity is NP-hard, while the average-case complexity is polynomial-time.
I agree 100% that Taylor theory is an asymptotic theory, and, as I mentioned in yesterday's email, in addition to results about the NP-hardness of exact range computation, we also have results about the NP-hardness of approximate range computation with a given accuracy. In addition to these results, in our IEEE Transactions paper with Nesterov et al., which I also cited, we give a precise negative result: that no 3rd or 4th order method is possible for computing the range (meaning a method which gives this order of accuracy for an arbitrary polynomial of many variables). (Warning: this paper with Nesterov is one of the few things which we had to leave out of the book, because we were way above the original page limit anyway.)
I also agree that it may be possible that Taylor-type methods will help in computing the range (they are definitely helpful in solving differential equations).
I agree with you that asymptotic methods are likely to produce an approximate result rather than the exact one. All I am saying is that so far, in reading his papers, I have not found an algorithm which would compute the range of a given (exactly known) polynomial with 3rd or 4th order accuracy (which is what Sheela was asking in the first place).
It may be there. If, after studying Martin's papers, you are able to extract such an algorithm, please let me know; it will be very helpful. It was so far my understanding that our book and Martin are about two different problems: we mainly analyze the complexity of computing the range of a function, while Martin is solving differential equations. It could be that I overlooked some range-computing algorithms which are in his papers as well.
Any algorithms for solving (some instances of) an NP-hard problem are very helpful, because NP-hardness means, crudely speaking, that a given problem is as hard as problems can be; so any progress in solving one NP-hard problem can immediately lead to a solution of the others. E.g., in our complexity book, we show that known good heuristics for solving the range computation problem lead to new efficient heuristics for another NP-hard problem - propositional satisfiability. In this sense, any new algorithm for solving a problem which is, in general, NP-hard, is of great interest.
I was reading
M. Berz and G. Hoffstaetter, "Computation and Application of Taylor Polynomials with Interval Remainder Bounds," "Reliable Computing" 4:83-97 (1998). The last sentence of the abstract says "The method is used for guaranteed global optimization of blow-up prone functions." Also, in the introduction, it describes a typical interval global optimization algorithm that fails because of overestimation in the range of the function. This led me to believe that a better method of bounding the range would be presented.
Also, in Makino and Berz, "Efficient Control of The Dependency Problem Based on Taylor Model Methods", "Reliable Computing" 5:3-12, 1999, an example of a range enclosure using both a traditional interval method and using the Taylor model is given on pp. 6-7. There, the Taylor model gives an essentially sharp enclosure of the range from order 5 on.
However, examining the Berz and Hoffstaetter paper, I see that the Taylor model gives a bound on the function values at any POINT in the interval of approximation, whereas how the authors obtained the bounds on the function over the BOXES in the Makino/Berz paper was not described, except to say that Taylor models were used. (Clearly, plugging intervals into the argument of the Taylor model won't work in general, as it doesn't give higher than a second-order enclosure.) Also, in the Berz/Hoffstaetter paper, the global optimization problem solved was a special problem related to beam physics, and involving a "transfer function". Without digging back to previous work on this (Interval Computations 1994 no. 2), I suspect that special techniques were used that cannot be used in general to bound the range of any Taylor model.
Thus, I have questions about using Taylor models to bound the range of functions over entire intervals. Martin, can you clarify?
I think that this discussion is very useful because it attracts the participants' attention to Martin's very important algorithms. Thanks for the references. I have read the papers again, and I hope I now understand the situation better.
I want to present my understanding in the form that Martin and I are both right (actually, Martin is more right than I am).
As I understand it, Martin's idea of computing the range, as described in his paper with Makino (Vol. 5 of Reliable Computing), is as follows:
The difference between f and P can be made as small as desired: we can have 3rd, 4th, 12th order, etc., methods, depending on how many terms we preserve in the Taylor expansion.
At the estimate-the-polynomial's-range step, we get an NP-hard problem for which it is impossible to have an estimate which is always of 3rd order (see our paper with Slava). As a result, the total order of the two-step procedure cannot be 3rd order, so, formally speaking, my answer is right that there is no 3rd order procedure which can help.
In particular, when we start with a quadratic polynomial in the first place (and this problem is NP-hard for quadratic polynomials already), the Taylor approximation is exact, so there is no remainder at all (the approximation of f by P is accurate to 3rd, 4th, indeed any order). However, in this particular case, Martin's method does not give us any new estimate for the range: it simply says that we have to use the estimate for the original polynomial.
For most small sub-boxes of the original box, the function P will be monotonic in all the variables, so we can get the exact estimate. At a few points where the derivatives are close to 0, we may have an overestimation, and this overestimation may be of 2nd order. This explains in what sense I am right.
On the other hand, Martin was absolutely right because he understood Sheela's question not as
If, as I understand it, Sheela really wants to compute the range nicely, then we should accept the fact that it is theoretically impossible to get a 3rd order estimate of the range, and use the best possible known algorithm (Martin's may well be one) for which at least the first component of the error is 3rd order.
Yes, true. However, I'd still like to hear from Berz et al about the details of estimating the range of the polynomial. In particular, the numerical results on pp. 6-7 of Reliable Computing 5(1), February, 1999 are impressive, but details of the computation were not given.
Baker, Vladik, Sheela,
I took Saturday off, and found out that I apparently missed something interesting - sorry for keeping people waiting; let me try to address some of the points now. Actually, this is a good example of the nice job that Interval people, and in particular Baker and Vladik, are doing about getting to the bottom of many interval-related topics on this useful discussion server.
By the time I am re-entering the discussion, quite a few aspects have been touched upon, and I am basing my answer mostly on Vladik's last email, because it attempts to summarize many of the points and hence allows me to insert my answers in perhaps the most structured way. In addition, I will try to interweave other questions that seem to have been raised by Baker and Sheela; if I have missed something, please ask again. Please note, however, that because of a commitment this Sunday afternoon, I may not be able to reply before Monday (perhaps I have to re-think the amount of dedication to my work ;-)... ) To close the introduction: Vladik, can you send me a paper or .ps copy of your IEEE complexity paper?
Thanks a lot for the detailed answer. I am glad that we are very much in agreement, and that my understanding of your text is essentially correct.
Thanks for your interest in our results on complexity. I have just sent you a PS file with our paper with Nesterov, and if necessary, I will be glad to send you a copy of the corresponding part of our complexity book (if you do not have easy access to it).
The only point which requires clarification is computing the exact bounds for a quadratic polynomial. This problem is NP-hard (a result by Vavasis, mentioned and proven in our book). Of course, we can compute the bounds explicitly, but this computation requires time which grows exponentially with the number of variables, on the order of 2^n.
> 1) First, split the polynomial into a (low-order) part L for which we can
> determine the location of points that provide maxima exactly, and the
> high-order rest H for which we cannot. For example, if L is just the linear
> part, it can be bounded exactly, and the maximum always occurs in one of
> the corners in the multidimensional box, and it's even very cheap, with
> essentially no penalty for higher orders. Second order L (with fixed
> variable numbers - see Vladik's comment on complexity as function of number
> of variables) can be done reasonably easily for any number of variables; to
> this end, consider the edges, which gives second order of one degree less
> (and then iterate), and look on the inside, which gives a linear system via
> vanishing gradient. To higher orders of L, I will come back below, in the
> framework of complexity discussions.

A similar issue:
> Vladik, let me ask the following: suppose I have a fifth order polynomial P
> of one variable, to bound over the range [a,b]. I can do this by freshman
> math by calculating the zeros of the derivative (for which there is a
> closed formula; perhaps sophomore math, but nonetheless). Then I throw out
> those zeros that lie outside [a,b], and evaluate P at the at most 6 total
> points consisting of the surviving points and a and b to get the range.
> Altogether, the only part of the algorithm that depends on [a,b] at all is
> the exclusion of irrelevant critical points; but while the computation of
> bounds takes longer for big [a,b] because there may be more critical points
> inside, the bounds are exact (to floating point etc. etc.). Isn't this a
> scaling with the accuracy that is higher than second order?
Of course, for a single variable, we get exact bounds.
For one variable, the global maximum is attained either at the endpoints, or at a point where the derivative is equal to 0. So we get 2, maybe 3, maybe more points, and looking over all of them, we get the result.
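As a concrete sketch of this recipe (using a cubic rather than the quoted quintic, so the derivative's zeros have a simple closed form; the function and interval below are my own examples):

```python
import math

def cubic_range(a3, a2, a1, a0, lo, hi):
    """Exact range of a3*x^3 + a2*x^2 + a1*x + a0 over [lo, hi]:
    check the endpoints plus real critical points inside the interval."""
    p = lambda x: ((a3 * x + a2) * x + a1) * x + a0
    pts = [lo, hi]
    # critical points solve 3*a3*x^2 + 2*a2*x + a1 = 0
    A, B, C = 3.0 * a3, 2.0 * a2, a1
    disc = B * B - 4.0 * A * C
    if A != 0.0 and disc >= 0.0:
        for r in ((-B + math.sqrt(disc)) / (2.0 * A),
                  (-B - math.sqrt(disc)) / (2.0 * A)):
            if lo <= r <= hi:
                pts.append(r)
    vals = [p(x) for x in pts]
    return min(vals), max(vals)

# p(x) = x^3 - 3x on [-2, 2]: critical points at x = -1, 1; range is [-2, 2].
rng = cubic_range(1.0, 0.0, -3.0, 0.0, -2.0, 2.0)
```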
For several variables, a similar result is true for each of the n variables x1,...,xn: for each of the variables, the maximum is attained either at the left endpoint xi- of the corresponding interval, or at the right endpoint xi+, or at a point where the partial derivative of the objective function with respect to xi is equal to 0. For each of the n variables, we have 3 possibilities, so in total, for n variables, we have 3^n possible combinations. Even when we come up with a simple algorithm to solve each of the 3^n cases, we still have an exponential number of cases.
These calculations are in line with the fact that range estimation is NP-hard when we consider an arbitrary number of variables and is polynomial-time for each fixed number of variables n (then, this algorithm, in effect, is polynomial, because 3^n is simply a constant here, although for large n, it is a practically huge constant).
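A small sketch of this 3^n enumeration for n = 2 (the quadratic and box are hypothetical examples of mine; each "free" case is a linear stationarity subsystem, here solved by hand since the systems are at most 2x2):

```python
# f(x, y) = x^2 + y^2 - x*y - x on [0,1]^2. For each variable we try: left
# endpoint, right endpoint, or "stationary" (set that partial derivative to
# zero and solve the linear subsystem). Together the candidate points
# contain the global extrema.
from itertools import product

def f(x, y):
    return x * x + y * y - x * y - x

def candidates(lo=0.0, hi=1.0):
    pts = []
    for cx, cy in product(("lo", "hi", "free"), repeat=2):
        if cx != "free" and cy != "free":
            x = lo if cx == "lo" else hi
            y = lo if cy == "lo" else hi
        elif cx == "free" and cy != "free":
            y = lo if cy == "lo" else hi
            x = (y + 1.0) / 2.0            # from df/dx = 2x - y - 1 = 0
        elif cx != "free" and cy == "free":
            x = lo if cx == "lo" else hi
            y = x / 2.0                    # from df/dy = 2y - x = 0
        else:
            x, y = 2.0 / 3.0, 1.0 / 3.0    # 2x2 system, solved by hand
        if lo <= x <= hi and lo <= y <= hi:
            pts.append((x, y))
    return pts

vals = [f(x, y) for (x, y) in candidates()]
exact_range = (min(vals), max(vals))       # (-1/3, 1)
```

For n variables the same enumeration produces 3^n cases, which is exactly the exponential blow-up described above.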
If I understand the process correctly, then I suspect that the method will not handle the following problem better than a low-order extension. (This problem is called Gritton's second problem, from chemical engineering.) I give the problem in the form of a Fortran statement for GlobSol. It is an eighteenth degree polynomial of one variable, and I present it in Horner's form.
Am I right that a high-order Taylor model will not work better for this problem? (The function has 18 roots in the interval [-12,8].)
! Using Horner's scheme --
F(1) = -0.371936250000000D+03 + X(1)* ( &
       -0.791246565600000D+03 + X(1)* ( &
        0.404494414300000D+04 + X(1)* ( &
        0.978137516700000D+03 + X(1)* ( &
       -0.165478928000000D+05 + X(1)* ( &
        0.221407282700000D+05 + X(1)* ( &
       -0.932654935900000D+04 + X(1)* ( &
       -0.351853687200000D+04 + X(1)* ( &
        0.478253229600000D+04 + X(1)* ( &
       -0.128147944000000D+04 + X(1)* ( &
       -0.283443587500000D+03 + X(1)* ( &
        0.202627091500000D+03 + X(1)* ( &
       -0.161791345900000D+02 + X(1)* ( &
       -0.888303902000000D+01 + X(1)* ( &
        0.157558017300000D+01 + X(1)* ( &
        0.124599084800000D+00 + X(1)* ( &
       -0.358914862200000D-01 + X(1)* ( &
       -0.195109557600000D-03 + X(1)* ( &
        0.227468222900000D-03 ))))))))))))))))))
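As a rough check of how hard this problem is for low-order methods (this sketch is mine; the coefficients are transcribed from the Fortran statement above, lowest degree first), naive interval Horner over the whole box [-12, 8] gives an enclosure that is far wider than the actual range seen by point sampling:

```python
# Gritton's second problem: interval Horner over the full box blows up,
# so subdivision (or some higher-order scheme) is unavoidable here.

C = [-371.93625, -791.2465656, 4044.944143, 978.1375167, -16547.8928,
     22140.72827, -9326.549359, -3518.536872, 4782.532296, -1281.47944,
     -283.4435875, 202.6270915, -16.17913459, -8.88303902, 1.575580173,
     0.1245990848, -0.03589148622, -1.951095576e-4, 2.274682229e-4]

def horner(x):
    """Point evaluation in Horner form."""
    v = 0.0
    for c in reversed(C):
        v = v * x + c
    return v

def interval_horner(lo, hi):
    """Naive interval evaluation in Horner form over [lo, hi]."""
    ilo, ihi = 0.0, 0.0
    for c in reversed(C):
        p = (ilo * lo, ilo * hi, ihi * lo, ihi * hi)
        ilo, ihi = min(p) + c, max(p) + c
    return ilo, ihi

enclosure = interval_horner(-12.0, 8.0)                   # very wide
vals = [horner(-12.0 + 20.0 * i / 4000.0) for i in range(4001)]
sampled = (min(vals), max(vals))                          # much narrower
```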
But it won't solve ALL of our problems. In particular, I just posted a problem to Reliable Computing that I don't think it will solve. If the problem is already in polynomial form and the high-order terms are not small, I don't think it will help.
Also, there is a SIGNIFICANT amount of time that would need to be spent to implement it as well as Martin does in COSY. He appears to have been very clever in the implementation of the automatic differentiation aspects.
I'm still thinking about it, and I'm thinking about an implementation strategy.
If we can identify particular problems upon which we may wish to try it, we might be able to get an idea of whether it will solve them or not before we replicate Martin's work.
As you know, interpolation forms for functions of a single variable are given in the book "Interval Methods for Systems of Equations" by A. Neumaier, Sec. 2.4, pages 66-73.
Permit me to quote from the first paragraph of Section 2.4 cited above: "..... In particular, we derive the parabolic boundary value form and show that it has the cubic approximation property; often it even gives the range without overestimation", and to further quote from remark (vi), pg. 72: "The parabolic boundary value form applies to almost arbitrary functions.... f need not even be continuous ...".
Explicit formulae for constructing the parabolic boundary value form are given in Theorem 2.4.2 of this book. Moreover, by using the interpolation form determined by k+1 points, one can obtain convergence of order of k+2.
Also interesting is the remark (vii) on pg. 73 of this book : "... Another form with cubic approximation order can be constructed using a Taylor series .... ". The form is then given.
However, all these wonderful and useful results and forms as given by Prof. Neumaier are specific to functions of a single variable. Can these results and forms be extended to functions of several variables? Maybe it can be done using Taylor series, along the lines given by Neumaier for the single-variable case?
(Sheela and I are working on the same application problem ).
Thanking you in advance.
In the discussion, the question of determining the range of a polynomial over a box came up. I would like to point out that an approach proven to be very useful for this task is the expansion of the given polynomial into Bernstein polynomials, the so-called Bernstein form; for the univariate case see, e.g.,
H.C.Fisher, 'Range computations and applications', in 'Contributions to Computer Arithmetic and Self-Validating Numerical Methods', C.Ullrich, Ed., Amsterdam, J.C.Baltzer, 1990, pp.197-211
and for the multivariate case, J.Garloff, 'The Bernstein algorithm', Interval Computations vol.2, 1993, pp.154-168.
A nearly exhaustive bibliography can be found in R.Hungerbuehler and J.Garloff, Reliable Computing, vol.4, 1998, pp.3-13.
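For readers who want the flavor of the univariate case, here is a minimal sketch of the Bernstein form on [0, 1] (my own illustration; the papers cited above give the general statement, subdivision, and the multivariate version):

```python
# Convert power-basis coefficients a_0..a_n of p(x) = sum a_i x^i into
# Bernstein coefficients b_0..b_n on [0, 1]; min(b) and max(b) then
# enclose the range of p over [0, 1].
from math import comb

def bernstein_coeffs(a):
    n = len(a) - 1
    return [sum(comb(j, i) / comb(n, i) * a[i] for i in range(j + 1))
            for j in range(n + 1)]

# Example: p(x) = x - x^2, true range [0, 0.25] on [0, 1].
b = bernstein_coeffs([0.0, 1.0, -1.0])     # [0.0, 0.5, 0.0]
enclosure = (min(b), max(b))               # [0, 0.5]: valid; tightens under
                                           # subdivision or degree elevation
```

Note that b_0 = p(0) and b_n = p(1), so the enclosure is exact at those corners; the overestimation elsewhere shrinks under subdivision.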
Special features of the approach are:
Thanks a lot for your thoughtful comments. May I make one minor comment?
> Date: Mon, 15 Nov 1999 10:45:55 +0100 (MET)
> From: Arnold Neumaier
> To: email@example.com, firstname.lastname@example.org
> Subject: RE: methods with convergence order higher than two
>
> P.S.V. Nataraj wrote,
>
> >> interpolation forms are given for functions of a single variable in
> the book: "Interval methods for systems of equations" by A. Neumaier,
> sec 2.4, pages 66-73. [...] However, all these wonderful and useful
> results and forms [...] are specific to functions of a single variable.
> Can these results and forms be extended to functions of several
> variables? <<
>
> In principle, it seems possible to generalize these univariate results
> to higher dimension, in the sense that one can probably find analogues
> that replace the enclosure problem for a general function f(x) by that
> of a suitable interval quadratic function [q(x)] such that the
> overestimation of the range of f(x) by that of [q(x)] is of order
> O(rad(x)^3). And it is probably valuable to explore the various
> possibilities to do this. However, this only reduces the general problem
> to that of finding the range of a multivariate quadratic, and there is
> no polynomial way to do that in general (unless P=NP). On the other
> hand, asking for an asymptotic result is an easier problem, since it
> means that one only wants O(eps^3) accuracy and only for boxes with
> radius O(eps), and midpoint and coefficients essentially independent of
> eps. It seems that the known complexity results don't cover this case,

It depends, of course, on how we interpret this statement precisely, but as I interpret it, known complexity results do cover this case as well. Namely, Theorem 3.2 from our book proves, in effect, that for every delta>0 and epsilon>0, the problem of computing the range of a quadratic polynomial on intervals of width delta with accuracy epsilon is NP-hard. This is true, in particular, if we take epsilon=delta^3 or whatever.
The proof (this is actually one of the results by Patrick Kahl, one of the book's co-authors, who wrote his thesis on extending known NP-hardness results to narrow input intervals) is rather straightforward: we know that computing the range of a quadratic polynomial is NP-hard. In this proof, we consider quadratic polynomials f(x1,...,xn) with coefficients which take integer values from -3 to 3 and with the range of every variable xi the interval [0,1] (we can take [1,2] if we want; then the coefficients of the polynomial will slightly increase). To prove that the problem is NP-hard for narrow intervals, we can consider a new polynomial F(y1,...,yn)=f(y1/delta,...,yn/delta) (or with some similar linear functions). For this new polynomial, the range computation problem is NP-hard for ranges of yi of width delta (they correspond to ranges of xi of width 1).
The coefficients of the new polynomial are of order 1/delta^2, so we can consider a new polynomial delta^2*F(y1,...,yn); for this new polynomial, the coefficients are uniformly bounded, but computing its range with accuracy epsilon*delta^2 is NP-hard. So, it is NP-hard even to compute the range with quadratic accuracy ~ delta^2. (Therefore, the harder problem of computing the range with accuracy delta^3 is also NP-hard.)
On the other hand, there may be other possible formulations of this problem which may turn out to be polynomial-time; Arnold, if you have any intuition on what these formulations may be, please let us know.
> and if this is the case there is still some work left to be done, and
> perhaps a positive (polynomial) answer is possible for the quadratic
> case. Then it would be likely to extend it to the general case, too, by
> looking more closely at what happens to the eps'es.
>
> In any case, the problem of finding good O(n^3) bounds for the range of
> a multivariate quadratic is a very important one, and even if one cannot
> get the exact range it would be very useful to know what one  c a n  get
> with a reasonable amount of work, and whether the class of problems
> where the exponential effort is required is small or large.
>
> And finding the best ways of reducing a nonlinear problem to one that is
> quadratic is also a useful enterprise.
>
> I think in both areas the last word hasn't been said yet, and I'd like
> to encourage the interval community to investigate these topics more
> closely, both by means of algorithmic advances, comparative numerical
> studies, and theoretical analysis of the potential and the limits of
> various approaches.
>
> Arnold Neumaier
Ms. Sheela and I both are working under Prof. P. S. V. Nataraj doing research in applications of interval arithmetic.
I also have been reading the discussions on the reliable_computing list initiated by Sheela. Just now I was going through Martin Berz's papers and trying something, and I am tempted to make the following remarks based on my observations. Although the following remarks are not directly related to convergence orders higher than two as such, they _are_ related to the actual motivation for using higher order convergence forms: to increase computational efficiency and reduce over-estimation in range computation.
It is very useful to do a monotonicity check for the function f(X) over the given interval vector X=[X1 X2 .. Xn]. The function may be monotonic along each or some of the variables X1, X2, ..., in the given domain or in part of the domain. (Though this is well known, the monotonicity check might often be overlooked.)
In fact, for the 3-dimensional complicated function on page 6 of the following reference, the exact range can be computed just by evaluating the function at the 8 corner points of the 3-D box of input intervals [x1 x2 x3], because it is monotonic along each variable x1, x2 and x3. (Ref: K. Makino and M. Berz, "Efficient Control of the Dependency Problem based on Taylor Model Methods", Reliable Computing 5, p. 3-12, 1999.)
It is found by gradient evaluation that the function is monotonic along each variable. (I computed gradients using the 'gradientinit' tool of the INTLAB toolbox for MATLAB of Rump et al.) The gradients of f(x1,x2,x3) are found to be
intval gradient derivative(s) yg.dx =
Column 1
[ -4.890465660975971e-001, -2.093516938447799e-001]
Column 2
[  9.255280249313442e+000,  2.113569763264646e+001]
Column 3
[ -4.569281010721819e+001, -7.059391352010891e+000]

This clearly shows monotonicity of the function along each variable, for the given input intervals x1, x2 and x3.
Although this may not happen for every function, a monotonicity check of f(X) over the given interval X (or maybe over parts of X) can clearly lead to more accurate and efficient computation of the range.
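A sketch of how such a check pays off (the function, box, and signs below are hypothetical examples of mine; in practice the monotonicity signs would come from verified gradient enclosures like the INTLAB output quoted above):

```python
# If verified gradient enclosures show f is increasing (+1) or decreasing
# (-1) in every variable over the box, the exact range comes from just two
# corner evaluations instead of any interval arithmetic on f itself.

def monotone_range(f, box, signs):
    """box: list of (lo, hi); signs[i]: +1 increasing, -1 decreasing."""
    at_min = [lo if s > 0 else hi for (lo, hi), s in zip(box, signs)]
    at_max = [hi if s > 0 else lo for (lo, hi), s in zip(box, signs)]
    return f(*at_min), f(*at_max)

# Example with a function that is monotone on the chosen box:
# increasing in x1 and x2, decreasing in x3 (since x3 > 0 there).
f = lambda x1, x2, x3: x1 + 2 * x2 - x3 * x3
rng = monotone_range(f, [(0, 1), (0, 1), (1, 2)], [+1, +1, -1])
# rng == (f(0, 0, 2), f(1, 1, 1)) == (-4, 2)
```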
I apologize if I am wrong somewhere in my remarks, being a new user of interval arithmetic. I welcome corrections, if any.
Dear Dr. Garloff,
Thank you for the info on Bernstein forms.
The application Sheela and I are working on involves:
Dear Dr. Berz,
I have made a first reading of your papers on the Taylor model computation for multivariate functions.
The Taylor model method appears to be quite promising in engineering applications such as ours, where there are several (up to 7-8) variables and functions can involve any of the built-in functions of FORTRAN.
As I understand from your papers, the Taylor model computation can be done automatically even for high orders and a moderately large number of variables (both up to 10) using software available at your university.
However, to be of general use in practical problems having thicker domain intervals (the examples in your papers have fairly thin domain intervals), it is necessary to combine the Taylor models with other schemes (at least, with sub-division) for range computation with higher order ( > 2) convergence.
Please tell me whether your university's software for automatic Taylor model computation supports user-defined routines containing interval arithmetic operations.
The idea is to be able to combine the Taylor model with sub-division, or use it along with more sophisticated interval global optimization techniques for efficient range computation over large boxes.
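The combination with sub-division mentioned here can be built around any local enclosure method. A minimal one-variable sketch (my own illustration, not COSY or GlobSol code; outward rounding is ignored): bisect until the local enclosure is tight enough, then take the union of the pieces.

```python
def subdivide_range(bound, lo, hi, tol, depth=0, max_depth=20):
    """Enclose the range of a function over [lo, hi] by recursive bisection.
    `bound(lo, hi)` must return a guaranteed enclosure over [lo, hi]."""
    blo, bhi = bound(lo, hi)
    if bhi - blo <= tol or depth >= max_depth:
        return blo, bhi
    mid = 0.5 * (lo + hi)
    l1, h1 = subdivide_range(bound, lo, mid, tol, depth + 1, max_depth)
    l2, h2 = subdivide_range(bound, mid, hi, tol, depth + 1, max_depth)
    return min(l1, l2), max(h1, h2)

# Naive interval enclosure of f(x) = x*x - x; x appears twice, so a single
# wide box overestimates badly (the dependency problem).
def naive_bound(lo, hi):
    sq = [lo * lo, lo * hi, hi * hi]
    sq_lo = 0.0 if lo <= 0.0 <= hi else min(sq)
    return sq_lo - hi, max(sq) - lo

wide = naive_bound(0.0, 1.0)   # (-1.0, 1.0), though the true range is [-0.25, 0]
tight = subdivide_range(naive_bound, 0.0, 1.0, 0.05)
```

Plugging a Taylor model in place of `naive_bound` is exactly the combination proposed above: the sharper the local bounder, the fewer boxes the subdivision needs.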
Thanking you in advance,
With all of the discussion about generalized interval methods, I thought I would throw in my comments:
Caution: better bounds may well come at the expense of using more basic operations to implement the interval operations, and with two distinct bounding functions, more information must be kept in memory for each interval.
Dear Mr. Barve,
> 2) I also have been reading the discussions on the reliable_computing
> list initiated by Sheela. Just now I was going through Martin Berz's
> papers and trying something. I am tempted to make the following remarks
> based on my observations. Though the following remarks are not directly
> related to higher convergence order etc., they are related to it through
> the actual motivation of using higher order convergence forms etc. to
> increase computational efficiency and reduce over-estimation for range
> computation.
>
> 3) It is very useful to do a monotonicity check for the function f(X)
> over the given interval vector X=[X1 X2 .. Xn]. It may be monotonic
> along each or some variables X1, X2, ... in the given domain or in part
> of the domain. (Though this is well known, the monotonicity check might
> often be overlooked.)

In principle such checks may be useful in some cases and make the computation of the bound very easy when they apply. In the beginning we had in fact thought about including this check in the procedure. However, one would still need to find range bounds for the derivatives, and if they do contain zero, not much is gained. In general, range bounding of the derivatives is the same type of problem that we originally set out to solve, only one order lower. Perhaps it can be done with lower accuracy; but if it does not allow the exclusion of zero very quickly (say, by mere interval evaluation), then it is tough to decide how far to continue this thread, since the end result may be that we do not have monotonicity anyway.
The algorithm for polynomial bounding that we are currently using in fact has built in most of the effect you may want to achieve with the monotonicity test. As illustrated in the description of the algorithm that we circulated a few days ago, we use the slope of the linear part of the polynomial (which can of course be bounded exactly and easily) and the convergence of the higher-order terms to estimate sub-regions where the maxima have to occur, and iterate a few times.
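A toy one-variable version of this idea (my own reconstruction under stated assumptions, not the actual COSY algorithm) might look as follows: bound the linear part exactly, bound the higher orders crudely via a supplied enclosure routine, and then cut away the part of the domain where the maximum cannot occur.

```python
def linear_dominated_upper_bound(c0, c1, higher, lo, hi, iters=5):
    """Upper bound for p(x) = c0 + c1*x + h(x) on [lo, hi], where
    higher(lo, hi) returns an interval enclosure (hlo, hhi) of h."""
    best_lower = -float("inf")
    for _ in range(iters):
        hlo, hhi = higher(lo, hi)
        lin_hi = c0 + (c1 * hi if c1 >= 0.0 else c1 * lo)  # exact linear max
        upper = lin_hi + hhi
        # The endpoint maximizing the linear part attains at least
        # lin_hi + hlo, a guaranteed lower bound on max p.
        best_lower = max(best_lower, lin_hi + hlo)
        if upper - best_lower <= 1e-12 or c1 == 0.0:
            break
        # The maximum can only occur where c0 + c1*x + hhi >= best_lower;
        # cut away the rest of the domain and iterate.
        cut = (best_lower - c0 - hhi) / c1
        if c1 > 0.0:
            lo = min(max(lo, cut), hi)
        else:
            hi = max(min(hi, cut), lo)
    return upper

# Example: p(x) = 1 + 2x + 0.1*x**2 on [-1, 1]; h(x) = 0.1*x**2.
def h_bound(lo, hi):
    sq = [lo * lo, hi * hi]
    s_lo = 0.0 if lo <= 0.0 <= hi else min(sq)
    return 0.1 * s_lo, 0.1 * max(sq)

ub = linear_dominated_upper_bound(1.0, 2.0, h_bound, -1.0, 1.0)  # true max 3.1
```

The cut rule and iteration count here are illustrative choices; the point is simply that the slope of the linear part localizes the maximizer, which subsumes much of what a separate monotonicity test would buy.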
> Observation:
>
> In fact, for the 3-dimensional complicated function on page 6 of the
> following reference, the exact range can be computed simply by
> evaluating the function at the 8 corner points of the 3-D box of the
> input interval [x1 x2 x3], because it is monotonic along each variable
> x1, x2 and x3.
>
> (Ref: K. Makino and M. Berz, "Efficient Control of the Dependency
> Problem based on Taylor Model Methods", Reliable Computing 5, 3-12,
> 1999)
>
> It is found by gradient evaluation that the function is monotonic along
> each variable. (I computed the gradients using the 'gradientinit' tool
> of the INTLAB toolbox for MATLAB by Rump et al.)
> The gradients of f(x1,x2,x3) are found to be:
>
> intval gradient derivative(s) yg.dx =
>   Column 1: [ -4.890465660975971e-001, -2.093516938447799e-001 ]
>   Column 2: [  9.255280249313442e+000,  2.113569763264646e+001 ]
>   Column 3: [ -4.569281010721819e+001, -7.059391352010891e+000 ]
>
> This clearly shows the monotonicity of the function along each
> variable, for the given input intervals x1, x2 and x3.
>
> 4) Though this may not happen for each function, the monotonicity check
> of the function f(X) for the given interval X (or maybe parts of X)
> may obviously lead to a more accurate and efficient computation of the
> range.
>
> 5) I apologize if I am wrong somewhere in my remarks, being a new user
> of interval arithmetic. I welcome corrections, if any.

Of course in principle you are right to point this out. It also serves as an example of one of the really powerful aspects of interval methods: if you have several ways of testing things, you can pursue them all and use the most favorable result you get. The black magic lies in choosing the routes most likely to get you what you want, and most quickly.
> I have made a first reading of your papers on the Taylor model
> computation for multivariate functions.
>
> The Taylor model method appears to be quite promising in engineering
> applications such as ours, where there are several (up to 7-8)
> variables and functions can involve any of the built-in functions of
> FORTRAN.
>
> As I understand from your papers, the Taylor model computation can be
> done automatically even for high orders and a moderately large number
> of variables (both up to 10) using software available at your
> university.

Yes, indeed something like 10 independent variables is not a problem.
> However, to be of general use in practical problems having wider domain
> intervals (the examples in your papers have fairly thin domain
> intervals), it is necessary to combine the Taylor models with other
> schemes (at least, with sub-division) for range computation with higher
> order ( > 2) convergence.

As with any range bounding tool that works locally, for large domains this method has to be applied repeatedly. In the case of the Taylor models, there is usually a very sharp transition beyond which they start to give very good results, so the "smallest" boxes one needs to manage can usually be much larger than with intervals. The "inflation" of boxes is thus usually much less severe, and often one can just use a multidimensional grid and scan. For the very difficult functions with lots of local maxima that we were treating, this is usually what we did.
At one point we were actually starting to discuss with Baker utilizing the Taylor models as a sharp local range bounding tool in his sophisticated global optimizer, in which case one could use the whole machinery that manages boxes etc.; but so far this is only a thought.
> Please tell if your univ's software for automatic Taylor model
> computation does support user defined routines having interval
> arithmetic operations?

Yes, the COSY environment very easily allows user-defined functions within the COSY language (or by linking to FORTRAN, which is a little more difficult and system dependent), and we also support plain interval evaluation.
> The idea is to be able to combine the Taylor model with sub-division,
> or use it along with more sophisticated interval global optimization
> techniques for efficient range computation over large boxes.

By separate message I was in contact with your student Mr. Barve, and I suggest we try together to implement the functions you are interested in and see what happens.
Baker, Vladik, Sheela et al.,
Kyoko Makino has performed some runs with Taylor models for Baker's problematic function; sorry that it took a while, I was almost out of commission with an allergic reaction in the last two days. As we will show, the Taylor model approach works very favorably even for this case. In fact, we will use this example to illustrate the performance of the Taylor model approach even for cases that are apparently nontrivial for conventional interval methods.
> If I understand the process correctly, then I suspect that the
> method will not handle the following problem better than a low-order
> extension. (This problem is called Gritton's second problem, from
> chemical engineering.) I give the problem in the form of a Fortran
> statement for GlobSol. It is an eighteenth degree polynomial of one
> variable, and I present it in Horner's form.
>
> Am I right that a high-order Taylor model will not work better for
> this problem? (The function has 18 roots in the interval [-12,8].)

We study the behavior of Baker's function around the sample expansion point 1.0. The floating point Taylor expansion to order 5, as computed by COSY, is as follows.
RDA evaluation of the function
RDA VARIABLE:   NO= 5, NV= 1
  I  COEFFICIENT           ORDER  EXPONENTS
  1    4.486589620246718     0      0
  2   -8.799797859981823     1      1
  3   -129.5969983611221     2      2
  4    622.0452345235221     3      3
  5   -914.5460853304157     4      4
  6   -40.40649503718055     5      5

(Baker, to make sure we have the right function, if you want you can see if you get the same function value - we also ran the function in FORTRAN and it checked out ok)
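Recentering a polynomial at an expansion point, as COSY's RDA output above does for Baker's polynomial at 1.0, amounts to a Taylor shift, which can be computed by repeated synthetic division. This sketch (my own illustration) uses a made-up cubic; Gritton's actual 18th-degree coefficients are not reproduced here.

```python
def taylor_shift(coeffs, x0):
    """coeffs = [a0, a1, ..., an] of p(x) = sum a_k x^k.
    Returns [b0, ..., bn] with p(x0 + t) = sum b_k t^k."""
    b = list(coeffs)
    n = len(b) - 1
    for k in range(n):                      # repeated division by (x - x0)
        for i in range(n - 1, k - 1, -1):
            b[i] += x0 * b[i + 1]
    return b

# p(x) = x^3 - 2x + 1 about x0 = 1:  p(1+t) = t + 3t^2 + t^3
shifted = taylor_shift([1.0, -2.0, 0.0, 1.0], 1.0)   # [0.0, 1.0, 3.0, 1.0]
```

In floating point these shifted coefficients carry rounding error, which is why the Taylor model approach pairs them with a rigorous remainder bound.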
Kyoko computed the Taylor model remainder bound, a total bound predicted by the Taylor model, a bound obtained by naive interval arithmetic, and, to put things into perspective as to what can be expected, an inner inclusion of the range obtained by scanning the function over 1000 regularly spaced points. All interval operations were performed using COSY's interval tools based on software outward rounding, and the results are shown below for reference.
We use the example to illustrate the following points.
1) Taylor models can give inclusions that are many orders of magnitude sharper than those obtained by naive interval evaluation, even for examples such as this that are considered complicated.
For example, for a width of 0.4, the width of the fifth order Taylor model remainder bound is about 0.7, while the width of the naive interval bound is around 20,000. As expected, with smaller widths the results become even more favorable for Taylor models; at a width of 0.0125, the width of the fifth order Taylor model remainder bound is about 5E-10, while the width of the naive interval bound is around 400. Tenth order Taylor models behave much more favorably still.
2) The sharpness of the prediction with Taylor models of order n scales with order (n+1), the demand posed in the original question that led to this thread.
Indeed, in the example below using n=5, it can be observed that dividing the width by two yields a gain of nearly two orders of magnitude, uniformly over the entire range of widths we have studied. Because n=5, as illustrated in my reply to Vladik's summary of the method, all occurring polynomials can be bounded fully sharply by mere arithmetic (which of course has to be done with interval representations of floating point numbers, and thus introduces a very small but for our purposes irrelevant overestimation), so the sharpness of the remainder bound is a true measure of the sharpness of the entire method.
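The (n+1)-st order scaling can be checked on a simple model function; this is my own illustration using exp rather than Baker's polynomial. The Lagrange form bounds the remainder of the order-n Taylor polynomial of exp about 0 on [-w, w] by w^(n+1)/(n+1)! * e^w, so halving w shrinks the bound by roughly a factor of 2^(n+1).

```python
import math

def remainder_width(n, w):
    """Worst-case Taylor remainder of exp on [-w, w] (Lagrange form)."""
    return w ** (n + 1) / math.factorial(n + 1) * math.exp(w)

gain5 = remainder_width(5, 0.4) / remainder_width(5, 0.2)
# ~ 2**6 * e**0.2, about 78: nearly two orders of magnitude per halving
gain10 = remainder_width(10, 0.4) / remainder_width(10, 0.2)
# ~ 2**11 * e**0.2, about 2500: roughly three orders of magnitude per halving
```

These factors match the behavior reported below for n=5 and n=10 on Baker's function.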
Similarly, for n=10, we obtain a gain of about three orders of magnitude each time the interval is divided by two. A sharpness of around 1E-12 is reached for a domain of width 0.1.
3) Compared to the other effects, the details of which polynomial bounder is being used are of rather secondary importance.
To illustrate this point, we purposely used neither the analytic bounder, nor the one outlined in my message of Sunday. Rather, we merely sent intervals through the respective polynomial, with the only sophistication being the use of an intrinsic interval square operation where applicable. As such, this method represents a rather crude lower bound of what to expect from polynomial bounders.
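The gain from an intrinsic interval square operation is easy to see in isolation: squaring an interval directly cannot produce negative values, while multiplying two independent copies of the same interval can. A minimal sketch of the two operations (my own, with outward rounding ignored):

```python
def interval_sqr(lo, hi):
    """Exact range of x**2 for x in [lo, hi]."""
    sq = [lo * lo, hi * hi]
    if lo <= 0.0 <= hi:
        return 0.0, max(sq)
    return min(sq), max(sq)

def interval_mul_self(lo, hi):
    """General interval product applied to two copies of [lo, hi]:
    treats the factors as independent, so it can overestimate."""
    p = [lo * lo, lo * hi, hi * lo, hi * hi]
    return min(p), max(p)

sharp = interval_sqr(-1.0, 2.0)       # (0.0, 4.0)
crude = interval_mul_self(-1.0, 2.0)  # (-2.0, 4.0): spurious negative part
```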
Nevertheless, the resulting total bounds behave much more favorably with Taylor models, giving an overestimation of about a factor of 2 for the largest width studied (as opposed to a factor of 2,000 for the naive intervals), and for the smallest width an overestimation certainly below a factor of 1.01, reaching the noise level of the scanning estimate (as opposed to a factor of 50 for the naive intervals).
4) To use naive intervals to achieve sharpness levels comparable to those that can be attained with Taylor models requires domain widths many orders of magnitude smaller.
To illustrate this effect, we list the widths of naive interval evaluations of the function for various widths, showing the expected linear scaling.
Domain width    Resulting width
0.1D+01  :  140666.8672350321
0.1D-01  :  356.3602597354367
0.1D-03  :  3.563758735169588
0.1D-05  :  0.3563758754273927E-01
0.1D-07  :  0.3563759053708537E-03
0.1D-09  :  0.3563790302862913E-05
0.1D-11  :  0.3567049766672881E-07

As usual, the effectiveness of the method is intimately tied to the fact that the bulk behavior of the functional dependence is captured in the Taylor expansion. In the example in question, there is a lot of cancellation between the different orders of the original polynomial, which makes interval calculations difficult and leads to overestimation. In the Taylor model approach, however, the Taylor models of the various terms of the polynomial can be added and subtracted, and the bulk of the cancellation problem disappears.
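The linear scaling shown in the table above can be reproduced with any polynomial that exhibits cancellation between terms. This sketch (my own, with made-up degree-5 coefficients rather than Gritton's polynomial, and outward rounding ignored) evaluates a polynomial in Horner form with naive interval arithmetic over shrinking boxes.

```python
def interval_mul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(p), max(p)

def interval_horner(coeffs, x):
    """Naive interval evaluation of sum c_k x^k via Horner's rule.
    coeffs = [c0, c1, ..., cn]; x = (lo, hi)."""
    acc = (coeffs[-1], coeffs[-1])
    for c in reversed(coeffs[:-1]):
        lo, hi = interval_mul(acc, x)
        acc = (lo + c, hi + c)
    return acc

# Hypothetical coefficients chosen to exhibit cancellation between orders.
coeffs = [1.0, -8.0, -129.0, 622.0, -914.0, -40.0]
widths = []
for w in (1e-2, 1e-4, 1e-6):
    lo, hi = interval_horner(coeffs, (1.0 - w, 1.0 + w))
    widths.append(hi - lo)
# Each 100-fold reduction of the domain width shrinks the enclosure width by
# roughly a factor of 100: the expected linear scaling of naive intervals.
```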
The detailed results of Kyoko's Taylor model computations are as follows.
*** Width 0.4000000000000000
REMAINDER BOUND INTERVAL
  Order  5: [ -0.3416357780869606      , 0.3416357780869606      ]
  Order 10: [ -0.2328535934303353E-05  , 0.2328535934303353E-05  ]
Bound evaluation of the function
  Taylor Model:   [ -9.251451355410341 , 11.57747692493020 ]
  Naive Interval: [ -12562.21065212304 , 10660.56655625021 ]
  Rastering:      [ -5.273490704793460 , 4.617158578875092 ]

*** Width 0.2000000000000000
REMAINDER BOUND INTERVAL
  Order  5: [ -0.4621661359731618E-002 , 0.4621661359731619E-002 ]
  Order 10: [ -0.1107624804821486E-08  , 0.1107624804821485E-08  ]
Bound evaluation of the function
  Taylor Model:   [ 1.592114281270570 , 5.993640367078595 ]
  Naive Interval: [ -4604.902777062254 , 3766.828066581296 ]
  Rastering:      [ 2.842020783777230 , 4.617163053028378 ]

*** Width 0.1000000000000000
REMAINDER BOUND INTERVAL
  Order  5: [ -0.6726596225651657E-004 , 0.6726596225651657E-004 ]
  Order 10: [ -0.5340837667066269E-12  , 0.5340837667066269E-12  ]
Bound evaluation of the function
  Taylor Model:   [ 3.639055771004044 , 5.004415060553268 ]
  Naive Interval: [ -1942.104301934065 , 1832.828958028471 ]
  Rastering:      [ 3.794653671645733 , 4.617163892782912 ]

*** Width 0.5000000000000000E-01
REMAINDER BOUND INTERVAL
  Order  5: [ -0.1014678358854022E-005 , 0.1014678358854024E-005 ]
Bound evaluation of the function
  Taylor Model:   [ 4.175518439144360 , 4.716305432808793 ]
  Naive Interval: [ -901.0920355888309 , 904.9732440520786 ]
  Rastering:      [ 4.194958674398492 , 4.615510452684589 ]

*** Width 0.2500000000000000E-01
REMAINDER BOUND INTERVAL
  Order  5: [ -0.1557890476201449E-007 , 0.1557890476201449E-007 ]
Bound evaluation of the function
  Taylor Model:   [ 4.355105328209006 , 4.597802053505219 ]
  Naive Interval: [ -439.9913804625362 , 452.3847128010760 ]
  Rastering:      [ 4.357535212820721 , 4.575100319861519 ]

*** Width 0.1250000000000000E-01
REMAINDER BOUND INTERVAL
  Order  5: [ -0.2413011037357702E-009 , 0.2413011037357716E-009 ]
Bound evaluation of the function
  Taylor Model:   [ 4.426375238247726 , 4.541740224010647 ]
  Naive Interval: [ -217.5954692009348 , 227.8438762229445 ]
  Rastering:      [ 4.426678971590206 , 4.536372712588161 ]

Hope this helps -
If you have a question related to validated computing, interval analysis, or related matters, I recommend