A Global Convergence of Spectral Conjugate Gradient Method for Large Scale Optimization

In this paper we are concerned with conjugate gradient methods for solving unconstrained optimization problems, which are attractive for their simplicity and because they store no matrices. We propose two spectral modifications of the conjugate descent (CD) method. Both proposed methods produce sufficient descent directions for the objective function at every iteration under the strong Wolfe (inexact) line search, and their global convergence for general non-convex functions can be guaranteed. Numerical results show the efficiency of the two proposed methods.


1-Introduction.
Let f : R^n → R be a continuously differentiable function, and consider the unconstrained nonlinear optimization problem

    min f(x),  x ∈ R^n.                                          (1)

We use g(x) to denote the gradient of f at x. Because it requires little computer memory, the conjugate gradient method is very appealing for solving (1) when the number of variables is large. A conjugate gradient (CG) method generates a sequence of iterates by

    x_{k+1} = x_k + α_k d_k,                                     (2)

where the step length α_k is obtained by carrying out some line search, and the search direction is

    d_k = -g_k + β_k d_{k-1},  with d_0 = -g_0,                  (3)

where β_k is a scalar which determines the particular CG method [11]. There are many well-known formulas for β_k, such as Fletcher-Reeves (FR) [7], Polak-Ribière-Polyak (PRP) [13], [14], Hestenes-Stiefel (HS) [10], conjugate descent (CD) [8], Liu-Storey (LS) [12], and Dai-Yuan (DY) [5]. In a survey paper, Hager and Zhang [9] reviewed the development of the various nonlinear CG methods, with special attention given to global convergence properties.
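For reference, the classical choices of β_k named above can be written out directly in terms of g_k, g_{k-1}, and d_{k-1} (a minimal sketch in Python; vectors are plain lists and y_{k-1} = g_k - g_{k-1}):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg_betas(g_new, g_old, d_old):
    # g_new = g_k, g_old = g_{k-1}, d_old = d_{k-1}
    y = [a - b for a, b in zip(g_new, g_old)]           # y_{k-1} = g_k - g_{k-1}
    return {
        "FR":  dot(g_new, g_new) / dot(g_old, g_old),   # Fletcher-Reeves
        "PRP": dot(g_new, y) / dot(g_old, g_old),       # Polak-Ribiere-Polyak
        "HS":  dot(g_new, y) / dot(d_old, y),           # Hestenes-Stiefel
        "CD":  -dot(g_new, g_new) / dot(d_old, g_old),  # conjugate descent
        "LS":  -dot(g_new, y) / dot(d_old, g_old),      # Liu-Storey
        "DY":  dot(g_new, g_new) / dot(d_old, y),       # Dai-Yuan
    }
```

In the configuration that mimics a first iteration with an exact line search (d_{k-1} = -g_{k-1} and g_k orthogonal to d_{k-1}), all six formulas coincide, which is a convenient sanity check for an implementation.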
The standard CD method proposed by Fletcher [8] specifies

    β_k^CD = ||g_k||^2 / (-d_{k-1}^T g_{k-1}),                   (4)

where ||·|| denotes the Euclidean norm of a vector. An important property of the CD method is that it produces a descent direction under the strong Wolfe line search [18]:

    f(x_k + α_k d_k) ≤ f(x_k) + δ α_k g_k^T d_k,                 (5)

    |g(x_k + α_k d_k)^T d_k| ≤ -σ g_k^T d_k,                     (6)

with 0 < δ < σ < 1. Another popular approach to solving (1) is the spectral CG method, which was developed originally by Barzilai and Borwein [2]; Raydan [17] further developed the spectral gradient method for potentially large-scale unconstrained optimization problems. Birgin and Martínez [3] proposed a spectral CG method by combining the CG method with the spectral gradient method [17], multiplying the gradient g_k in equation (3) by a parameter θ_k in the following manner:

    d_k = -θ_k g_k + β_k d_{k-1}.                                (7)

Zhang [19] took the FR formula and proved that the resulting method is guaranteed to generate descent directions and is globally convergent. Matonoha et al. [15] proposed a modified CD method. Zhong [20] proposed the spectral PRP method, which uses the standard PRP formula with θ_k defined by (9), and Du and Liu [6] proposed the spectral HS method, which uses the standard HS formula with the same θ_k defined in (9). Liu and Jiang [11] proposed a successful spectral method by combining the CD method with the spectral gradient method, with θ_k defined in (11). In this paper we propose two spectral CG methods, both based on modifications of the standard CD formula (4), and we propose a suitable θ_k for each one so as to obtain effective spectral CD-CG methods.
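The spectral direction (7) and the sufficient descent test used throughout the convergence analysis are straightforward to express in code (a minimal sketch; the default constant c is illustrative, not a value from the paper):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def spectral_direction(g, d_prev, theta, beta):
    # d_k = -theta_k * g_k + beta_k * d_{k-1}, cf. equation (7)
    return [-theta * gi + beta * di for gi, di in zip(g, d_prev)]

def is_sufficient_descent(g, d, c=0.5):
    # sufficient descent condition: g_k^T d_k <= -c * ||g_k||^2 for some c > 0
    return dot(g, d) <= -c * dot(g, g)
```

With theta = 1 and beta = 0 the direction reduces to steepest descent, d = -g, which satisfies the test for any c ≤ 1.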
The rest of this paper is organized as follows. In the next section, a new modified spectral CD-CG method, denoted MCD1, is proposed by combining a modification of the CD method with θ_k defined in (11), and we give its algorithm. Section 3 is devoted to proving its global convergence. In Section 4, a second spectral CD-CG method, denoted MCD2, is proposed by combining another modification of the CD method with a θ_k defined in that section, and we give its algorithm. Section 5 is devoted to proving its global convergence. Finally, in Section 6, some numerical experiments are reported to test the efficiency of the two proposed methods.

2-Modified Spectral CD Conjugate Gradient and its Algorithm (MCD1).
In this section we present a new modified CD method, specified by the parameter β_k^MCD1 in (12). If an exact line search is used, then β_k^MCD1 reduces to the standard β_k^CD, and θ_k in (11) equals one; however, we use an inexact line search in our work. Substituting (12), with θ_k defined in (11), into (7) gives the search direction (13) of our proposed method. The resulting Algorithm (2.1) proceeds as follows, where Steps 1 and 2 select an initial point x_0 and compute the search direction d_k by (13):
Step 3: Determine the step length α_k by using the strong Wolfe line search conditions (5) and (6).
Step 4: Calculate the new point x_{k+1} by (2).
Step 5: Compute g_{k+1}.
Step 6: If k = n or the restart condition is satisfied, set k = 1 and go to Step 1; otherwise, set k = k + 1 and go to Step 2.
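The control flow of Steps 2-6 can be sketched as follows. This is only a skeleton: the paper's θ_k (11) and β_k^MCD1 (12) are not reproduced here, so θ = 1 and the standard CD β serve as loudly labeled placeholders, a simple Armijo backtracking search stands in for the strong Wolfe search of Step 3, and the test function is illustrative.

```python
import math

def f(x):                       # illustrative quadratic test function
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad(x):
    return [2.0 * x[0], 4.0 * x[1]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def armijo_step(x, d, g, delta=1e-4):
    # backtracking stand-in for the strong Wolfe conditions (5)-(6)
    alpha, fx, gd = 1.0, f(x), dot(g, d)
    while f([xi + alpha * di for xi, di in zip(x, d)]) > fx + delta * alpha * gd:
        alpha *= 0.5
    return alpha

def spectral_cg(x, n=2, tol=1e-8, max_iter=500):
    g = grad(x)
    d = [-gi for gi in g]                       # restart direction: d = -g
    k = 1
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol:         # stopping test
            break
        alpha = armijo_step(x, d, g)            # Step 3
        x = [xi + alpha * di for xi, di in zip(x, d)]     # Step 4
        g_new = grad(x)                         # Step 5
        if k == n:                              # Step 6: periodic restart
            d, k = [-gi for gi in g_new], 1
        else:
            theta = 1.0                         # placeholder for theta_k in (11)
            beta = dot(g_new, g_new) / (-dot(g, d))   # CD beta, placeholder for (12)
            d = [-theta * gi + beta * di for gi, di in zip(g_new, d)]
            if dot(g_new, d) >= 0.0:            # safeguard: keep d a descent direction
                d, k = [-gi for gi in g_new], 1
            else:
                k += 1
        g = g_new
    return x
```

On the strongly convex quadratic above, this skeleton drives the gradient norm below the tolerance; with the paper's own θ_k and β_k^MCD1 the restart safeguard would be covered by the sufficient descent theorem instead.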
The following theorem shows that Algorithm (2.1) possesses the sufficient descent property under the strong Wolfe line search conditions (5) and (6).
Theorem 2.1. Let the sequences {x_k} and {d_k} be generated by Algorithm (2.1). Then the sufficient descent condition

    g_k^T d_k ≤ -c ||g_k||^2,  for some constant c > 0,          (14)

holds for all k. Proof: Since d_0 = -g_0, the conclusion (14) holds for k = 0. Now assume that the conclusion is true for k - 1; we need to prove that it holds for k. Multiplying both sides of (13) by g_k^T gives an expression (15) for g_k^T d_k in terms of g_{k-1}^T d_{k-1}. Using (6) in (15), together with the induction hypothesis, we obtain (14).

3-The global convergence of MCD1 method.
In order to establish the global convergence result for MCD1, we impose the following assumptions on f, which have often been used in the literature to analyze the global convergence of CG methods with inexact line searches.

Assumption (I):
(i) The level set Ω = {x ∈ R^n : f(x) ≤ f(x_0)} is bounded.
(ii) In some neighborhood N of Ω, f is continuously differentiable and its gradient g satisfies the Lipschitz condition; namely, there exists a constant L > 0 such that

    ||g(x) - g(y)|| ≤ L ||x - y||  for all x, y ∈ N.             (16)

Obviously, from Assumption (I, i) there exists a positive constant B such that

    ||x - y|| ≤ B  for all x, y ∈ Ω,                              (17)

where B is the diameter of Ω. From Assumption (I, ii), we can also find a constant γ > 0 such that

    ||g(x)|| ≤ γ  for all x ∈ Ω.                                  (18)

To prove global convergence by contradiction, we assume that there is a positive constant ε such that

    ||g_k|| ≥ ε  for all k.                                       (19)

We are going to prove that lim inf_{k→∞} ||g_k|| = 0. Using (6) and (14) in (12) yields a bound (20) on β_k^MCD1.

Theorem 3.1
Suppose that Assumption (I) holds and consider any CG method of the form (2) and (7), with the parameter β_k^MCD1 defined by (12) and θ_k defined by (11), where the step length satisfies the strong Wolfe conditions (5) and (6). Then

    lim inf_{k→∞} ||g_k|| = 0.                                   (21)

Proof: From (12), (13) and (6) we obtain the bounds (22) and (23). Taking the norm of both sides of (7) and using (22) and (23) yields a bound on ||d_k|| which contradicts (19). Therefore the proof is complete. Now, to prove that the new algorithm is globally convergent for general functions, we establish a bound on the change in the normalized search directions, which we will use to conclude, by contradiction, that the gradients cannot be bounded away from zero [16].

Lemma 3.2
Suppose that Assumption (I) holds and consider Algorithm (2.1) with the direction d_k given by (13); then the bound (24) on the change in the normalized search directions holds. Proof: From the definition of v_k, and using (6), we obtain the required estimates. Combining these estimates shows that (24) holds, which completes the proof.

4-Modified Spectral CD Conjugate Gradient and its Algorithm (MCD2).
In this section we present a second new modified CD method, specified by the parameter β_k^MCD2 in (26) and θ_k in (27). If an exact line search is used, then β_k^MCD2 reduces to the standard β_k^CD, and θ_k equals one; however, we use an inexact line search in our work. Substituting (26) and (27) into (7) gives the new search direction (28). The resulting Algorithm (4.1) proceeds as follows, where Steps 1 and 2 select an initial point x_0 and compute the search direction d_k by (28):
Step 3: Determine the step length α_k by using the strong Wolfe line search conditions (5) and (6).
Step 4: Calculate the new point x_{k+1} by (2).
Step 5: Compute g_{k+1}.
Step 6: If k = n or the restart condition is satisfied, set k = 1 and go to Step 1; otherwise, set k = k + 1 and go to Step 2.
The following theorem shows that Algorithm (4.1) possesses the sufficient descent property under the strong Wolfe line search conditions (5) and (6).
Theorem 4.1. Let the sequences {x_k} and {d_k} be generated by Algorithm (4.1). Then the sufficient descent condition (29) holds for all k. Proof: Since d_0 = -g_0, the conclusion (29) holds for k = 0. Now assume that the conclusion is true for k - 1; we need to prove that it holds for k. Multiplying both sides of (28) by g_k^T and using (6), we obtain (29), so the proof is complete.

5-The global convergence of MCD2 method.
In this section we are going to prove the global convergence of the proposed method MCD2.

Theorem 5.1
Suppose that Assumption (I) holds and consider any CG method of the form (2) and (7), with the parameter β_k^MCD2 defined by (26), where the step length satisfies the strong Wolfe conditions (5) and (6). Then

    lim inf_{k→∞} ||g_k|| = 0.                                   (31)

Proof: We can rewrite (7) in the form (32). Squaring both sides of this equation and noting (33), it follows from (33) and (19) that the series Σ_k (g_k^T d_k)^2 / ||d_k||^2 diverges. This contradicts the Zoutendijk condition Σ_k (g_k^T d_k)^2 / ||d_k||^2 < ∞ [21]. Therefore the conclusion (31) holds, so the proof is complete.
The above theorem shows that the new proposed method is descent and globally convergent, independently of the particular line search used.

Lemma 5.2
Suppose that Assumption (I) holds and consider Algorithm (4.1) with the direction d_k given by (28); then the bound (34) holds. Proof: Using the stated identity, together with the condition r_k ≠ 0, the triangle inequality, and (41), we obtain (36). From the definition of v_k, and using (6), we obtain the remaining estimate. Therefore (34) holds, which completes the proof.

6-Numerical results
In this section we report some numerical results obtained with an implementation of the two new methods, MCD1 and MCD2, on a set of unconstrained test functions. The code was written in Fortran 90 in double precision arithmetic. Our experiments were performed on a set of 35 large-scale nonlinear unconstrained test functions, drawn from the CUTE collection (Bongartz [4]) and from Andrei [1].
All the algorithms are implemented with the strong Wolfe line search conditions (5) and (6), and all the methods are terminated when the stopping criterion is satisfied. For the purpose of our comparisons we record the number of iterations (NOI) and the number of function evaluations (NOF). Tables (1) and (2) give the computational results of the two new methods (MCD1 and MCD2) against the standard CD method with n = 100 and n = 10000, respectively, while Tables (3) and (4) give the percentage performance of the two proposed methods against the standard CD method, taking the CD totals over all the tools as 100%.
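The percentage comparison of Tables (3) and (4) amounts to expressing each method's NOI and NOF totals relative to the CD totals taken as 100% (a sketch; the totals below are illustrative, not the paper's data):

```python
def percentage_vs_cd(method_totals, cd_totals):
    # Express each measure (e.g. "NOI", "NOF") as a percentage of the CD
    # baseline, so that CD itself scores exactly 100% on every measure.
    return {key: 100.0 * method_totals[key] / cd_totals[key]
            for key in cd_totals}
```

For example, a method whose total NOI is 80 against a CD total of 100 scores 80%, i.e. a 20% improvement over CD on that measure.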