Numerical range

Article snapshot taken from[REDACTED] with creative commons attribution-sharealike license.

In the mathematical fields of linear algebra and convex analysis, the numerical range or field of values of a complex $n\times n$ matrix $A$ is the set

$$W(A)=\left\{\frac{\mathbf{x}^{*}A\mathbf{x}}{\mathbf{x}^{*}\mathbf{x}}\mid \mathbf{x}\in\mathbb{C}^{n},\ \mathbf{x}\neq 0\right\}=\left\{\langle\mathbf{x},A\mathbf{x}\rangle\mid \mathbf{x}\in\mathbb{C}^{n},\ \|\mathbf{x}\|_{2}=1\right\}$$

where $\mathbf{x}^{*}$ denotes the conjugate transpose of the vector $\mathbf{x}$. The numerical range includes, in particular, the diagonal entries of the matrix (obtained by choosing $\mathbf{x}$ equal to the unit vectors along the coordinate axes) and the eigenvalues of the matrix (obtained by choosing $\mathbf{x}$ equal to the eigenvectors).

In engineering, numerical ranges are used as a rough estimate of the eigenvalues of $A$. More recently, generalizations of the numerical range have been used to study quantum computing.

A related concept is the numerical radius, which is the largest absolute value of the numbers in the numerical range, i.e.

$$r(A)=\sup\{|\lambda|:\lambda\in W(A)\}=\sup_{\|\mathbf{x}\|_{2}=1}|\langle\mathbf{x},A\mathbf{x}\rangle|.$$
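The definitions above can be explored numerically. The following is a minimal sketch, not a definitive implementation: it samples Rayleigh quotients of random nonzero vectors, which yields points of $W(A)$ and therefore a lower bound on $r(A)$. The helper names `rayleigh` and `sample_numerical_range`, the example matrix and the sample count are illustrative assumptions.

```python
import random

def rayleigh(A, x):
    """Return x* A x / x* x for a square matrix A (list of rows) and nonzero x."""
    n = len(A)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    numerator = sum(x[i].conjugate() * Ax[i] for i in range(n))
    denominator = sum(abs(xi) ** 2 for xi in x)
    return numerator / denominator

def sample_numerical_range(A, samples=2000, seed=0):
    """Sample points of W(A) using random complex Gaussian vectors."""
    rng = random.Random(seed)
    n = len(A)
    points = []
    for _ in range(samples):
        x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        points.append(rayleigh(A, x))
    return points

# Hypothetical example matrix (upper triangular, eigenvalues 1 and 3 + i).
A = [[1 + 0j, 2 + 0j], [0j, 3 + 1j]]

points = sample_numerical_range(A)

# Coordinate unit vectors recover the diagonal entries of A.
diag = [rayleigh(A, [1 + 0j, 0j]), rayleigh(A, [0j, 1 + 0j])]

# Every sampled point lies in W(A), so this is a lower bound on r(A).
radius_lower_bound = max(abs(p) for p in points)
```

Random sampling only fills the set from the inside, so the boundary of $W(A)$ and the exact value of $r(A)$ are approached but generally not attained.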

Properties

In what follows, a sum of sets denotes their sumset, $S+T=\{s+t : s\in S,\ t\in T\}$.

General properties

  1. The numerical range is the range of the Rayleigh quotient.
  2. (Hausdorff–Toeplitz theorem) The numerical range is convex and compact.
  3. $W(\alpha A+\beta I)=\alpha W(A)+\{\beta\}$ for all square matrices $A$ and complex numbers $\alpha$ and $\beta$. Here $I$ is the identity matrix.
  4. $W(A)$ is a subset of the closed right half-plane if and only if $A+A^{*}$ is positive semidefinite.
  5. The numerical range $W(\cdot)$ is the only function on the set of square matrices that satisfies (2), (3) and (4).
  6. $W(UAU^{*})=W(A)$ for any unitary $U$.
  7. $W(A^{*})=W(A)^{*}$, where $W(A)^{*}$ denotes the set of complex conjugates of the elements of $W(A)$.
  8. If $A$ is Hermitian, then $W(A)$ lies on the real line. If $A$ is anti-Hermitian, then $W(A)$ lies on the imaginary line.
  9. $W(A)=\{z\}$ if and only if $A=zI$.
  10. (Sub-additive) $W(A+B)\subseteq W(A)+W(B)$.
  11. $W(A)$ contains all the eigenvalues of $A$.
  12. The numerical range of a $2\times 2$ matrix is a filled ellipse (possibly degenerate, that is, a line segment or a single point).
  13. $W(A)$ is a real line segment $[\alpha,\beta]$ if and only if $A$ is a Hermitian matrix whose smallest and largest eigenvalues are $\alpha$ and $\beta$.
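Several of the properties above come from identities that hold vector by vector. The sketch below checks three of them numerically for a fixed pair of example matrices: the affine rule behind (3), the conjugation rule behind (7), and the additivity of the quadratic form, which gives the inclusion in (10). The matrices, scalars and helper names are illustrative assumptions.

```python
import random

def rayleigh(A, x):
    """Rayleigh quotient x* A x / x* x for a square matrix A given as a list of rows."""
    n = len(A)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    numerator = sum(x[i].conjugate() * Ax[i] for i in range(n))
    return numerator / sum(abs(xi) ** 2 for xi in x)

def adjoint(A):
    """Conjugate transpose of A."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

# Hypothetical example data.
A = [[1 + 2j, -1j], [3 + 0j, 0.5 - 1j]]
B = [[0j, 2 + 1j], [1 - 1j, -2 + 0j]]
alpha, beta = 2 - 1j, 0.5 + 3j

affine = [[alpha * A[i][j] + (beta if i == j else 0) for j in range(2)] for i in range(2)]
total = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

rng = random.Random(1)
max_err = 0.0
for _ in range(200):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    q = rayleigh(A, x)
    errors = (
        abs(rayleigh(affine, x) - (alpha * q + beta)),   # property (3)
        abs(rayleigh(adjoint(A), x) - q.conjugate()),    # property (7)
        abs(rayleigh(total, x) - (q + rayleigh(B, x))),  # property (10)
    )
    max_err = max(max_err, *errors)
```

For (10) the identity only yields an inclusion, because a point of $W(A)+W(B)$ may use different vectors for $A$ and for $B$.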

Normal matrices

  1. If $A$ is normal, and $x\in\operatorname{span}(v_{1},\dots,v_{k})$ is a unit vector, where $v_{1},\ldots,v_{k}$ are eigenvectors of $A$ corresponding to $\lambda_{1},\ldots,\lambda_{k}$, respectively, then $\langle x,Ax\rangle\in\operatorname{hull}\left(\lambda_{1},\ldots,\lambda_{k}\right)$.
  2. If $A$ is a normal matrix, then $W(A)$ is the convex hull of its eigenvalues.
  3. If $\alpha$ is a sharp point on the boundary of $W(A)$, then $\alpha$ is a normal eigenvalue of $A$.

Numerical radius

  1. $r(\cdot)$ is a norm on the space of $n\times n$ matrices, and it is unitarily invariant in the sense that $r(UAU^{*})=r(A)$ for every unitary $U$.
  2. $r(A)\leq\|A\|_{op}\leq 2r(A)$, where $\|\cdot\|_{op}$ denotes the operator norm.
  3. $r(A)=\|A\|_{op}$ if (but not only if) $A$ is normal.
  4. $r(A^{k})\leq r(A)^{k}$ for every positive integer $k$.
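The bounds in (2) and the equality in (3) can be seen on two small examples. The sketch below assumes the standard reformulation $r(A)=\max_{\theta}\lambda_{\max}\big((e^{i\theta}A+e^{-i\theta}A^{*})/2\big)$, which is not stated in this article, and restricts to $2\times 2$ matrices so that eigenvalues have a closed form. All function names and the grid size are illustrative assumptions, and the grid over $\theta$ approximates the maximum from below.

```python
import cmath
import math

def lambda_max_2x2(H):
    """Largest eigenvalue of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a, d = H[0][0].real, H[1][1].real
    b = H[0][1]
    return (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)

def hermitian_part(A, theta):
    """(e^{i theta} A + e^{-i theta} A*) / 2 for a 2x2 matrix A."""
    w = cmath.exp(1j * theta)
    return [[(w * A[i][j] + (w * A[j][i]).conjugate()) / 2 for j in range(2)]
            for i in range(2)]

def numerical_radius_2x2(A, steps=3600):
    """Approximate r(A) by maximizing over a uniform grid of rotation angles."""
    return max(lambda_max_2x2(hermitian_part(A, 2 * math.pi * k / steps))
               for k in range(steps))

def operator_norm_2x2(A):
    """Operator norm: square root of the largest eigenvalue of A* A."""
    gram = [[sum(complex(A[k][i]).conjugate() * A[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]
    return math.sqrt(lambda_max_2x2(gram))

N = [[0, 1], [0, 0]]    # nilpotent, not normal
D = [[2, 0], [0, 1j]]   # diagonal, hence normal

r_N, op_N = numerical_radius_2x2(N), operator_norm_2x2(N)
r_D, op_D = numerical_radius_2x2(D), operator_norm_2x2(D)
```

For the nilpotent matrix `N` the operator norm is twice the numerical radius, so the upper bound in (2) cannot be improved, while the normal matrix `D` attains equality as in (3).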

Proofs

Most of the claims follow directly from the definition. Proofs of the less immediate ones are given here.

General properties

Proof of (13)

If $A$ is Hermitian, then it is normal, so $W(A)$ is the convex hull of its eigenvalues, which are all real. Hence $W(A)=[\alpha,\beta]$, where $\alpha$ and $\beta$ are the smallest and largest eigenvalues.

Conversely, assume $W(A)$ is on the real line. Decompose $A=B+C$, where $B=(A+A^{*})/2$ is a Hermitian matrix and $C=(A-A^{*})/2$ an anti-Hermitian matrix. For every unit vector $x$ we have $\langle x,Ax\rangle=\langle x,Bx\rangle+\langle x,Cx\rangle$, where the first term is real and the second is purely imaginary. If $C\neq 0$, then $W(C)\neq\{0\}$ by property (9), so $\langle x,Cx\rangle\neq 0$ for some unit vector $x$, and $W(A)$ would stray from the real line. Thus $C=0$, and $A$ is Hermitian.

Proof of (12)

The elements of $W(A)$ are of the form $\operatorname{tr}(AP)$, where $P$ is the orthogonal projection from $\mathbb{C}^{2}$ onto a one-dimensional subspace.

The space of all one-dimensional subspaces of $\mathbb{C}^{2}$ is the complex projective line $\mathbb{P}\mathbb{C}^{1}$, which is a 2-sphere. The image of a 2-sphere under a linear projection is a filled ellipse.

In more detail, such $P$ are of the form
$$\frac{1}{2}I+\frac{1}{2}\begin{bmatrix}\cos 2\theta & e^{i\phi}\sin 2\theta\\ e^{-i\phi}\sin 2\theta & -\cos 2\theta\end{bmatrix}=\frac{1}{2}\begin{bmatrix}1+z & x+iy\\ x-iy & 1-z\end{bmatrix}$$
where $(x,y,z)$, satisfying $x^{2}+y^{2}+z^{2}=1$, is a point on the unit 2-sphere.

Therefore $W(A)$, regarded as a subset of $\mathbb{R}^{2}$, is the image of the 2-sphere under the composition of two real affine maps, $(x,y,z)\mapsto\frac{1}{2}\begin{bmatrix}1+z & x+iy\\ x-iy & 1-z\end{bmatrix}$ and $M\mapsto\operatorname{tr}(AM)$. This composition maps the 2-sphere to a filled ellipse.
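The parametrization can be tried on a concrete matrix. The sketch below takes the nilpotent matrix with a single $1$ above the diagonal, chosen here as an illustrative assumption, for which $\operatorname{tr}(AP)=(x-iy)/2$, so the sampled values fill the disk of radius $1/2$ centered at the origin. The helper names `projector` and `trace_AP` and the grid size are likewise hypothetical.

```python
import math

def projector(x, y, z):
    """Rank-one projection (1/2) [[1+z, x+iy], [x-iy, 1-z]] for (x, y, z) on the unit sphere."""
    return [[(1 + z) / 2, complex(x, y) / 2],
            [complex(x, -y) / 2, (1 - z) / 2]]

def trace_AP(A, P):
    """tr(A P) for 2x2 matrices given as lists of rows."""
    return sum(A[i][k] * P[k][i] for i in range(2) for k in range(2))

# Hypothetical example: the 2x2 nilpotent Jordan block.
A = [[0, 1], [0, 0]]

values = []
steps = 60
for a in range(steps + 1):
    polar = math.pi * a / steps
    for b in range(steps):
        azimuth = 2 * math.pi * b / steps
        x = math.sin(polar) * math.cos(azimuth)
        y = math.sin(polar) * math.sin(azimuth)
        z = math.cos(polar)
        values.append(trace_AP(A, projector(x, y, z)))

# Points on the equator (z = 0) reach the boundary circle of radius 1/2.
max_modulus = max(abs(v) for v in values)
```

The poles of the sphere map to the center of the disk and the equator maps to its boundary, which matches the picture of a sphere being flattened onto the plane.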

Proof of (2)

$W(A)$ is the image of the unit sphere, which is compact, under the continuous map $x\mapsto\langle x,Ax\rangle$, so it is compact.

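For convexity, let $x,y$ be linearly independent unit vectors (if they are linearly dependent, then $\langle x,Ax\rangle=\langle y,Ay\rangle$ and there is nothing to prove). Let $P$ be an $n\times 2$ matrix whose columns form an orthonormal basis of the span of $x,y$, and compress $A$ to this span as $P^{*}AP$. For any unit vector $u\in\mathbb{C}^{2}$, the vector $z=Pu$ is a unit vector in the span, and
$$\langle z,Az\rangle=\langle u,P^{*}APu\rangle\in W(P^{*}AP)\subset W(A).$$
Both $\langle x,Ax\rangle$ and $\langle y,Ay\rangle$ arise in this way, and $W(P^{*}AP)$ is a filled ellipse by the previous result, hence convex. So for any $\theta\in[0,1]$, the point $\theta\langle x,Ax\rangle+(1-\theta)\langle y,Ay\rangle$ lies in $W(P^{*}AP)\subset W(A)$.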

Proof of (5)

Let $W$ be a function satisfying properties (2), (3) and (4). Let $W_{0}$ be the original numerical range.

Fix some matrix $A$. We show that the supporting planes of $W(A)$ and $W_{0}(A)$ are identical. This would then imply that $W(A)=W_{0}(A)$, since they are both convex and compact.

By property (4), $W(A)$ is nonempty. Let $z$ be a point on the boundary of $W(A)$; then we can translate and rotate the complex plane so that this point moves to the origin and the region $W(A)$ falls entirely within the closed right half-plane $\mathbb{C}^{+}$. That is, for some $\phi\in\mathbb{R}$, the set $e^{i\phi}(W(A)-z)$ lies entirely within $\mathbb{C}^{+}$, while for any $t>0$, the set $e^{i\phi}(W(A)-z)-t$ does not lie entirely in $\mathbb{C}^{+}$.

Properties (3) and (4) of $W$ then imply that
$$e^{i\phi}(A-z)+e^{-i\phi}(A-z)^{*}\succeq 0,$$
where $A-z$ is shorthand for $A-zI$, and that the inequality is sharp, meaning that $e^{i\phi}(A-z)+e^{-i\phi}(A-z)^{*}$ has a zero eigenvalue. This is a complete characterization of the supporting planes of $W(A)$.

The same argument applies to $W_{0}(A)$, so they have the same supporting planes.

Normal matrices

Proof of (1), (2)

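For (2), if $A$ is normal, then it has a full orthonormal eigenbasis, so (1) with $k=n$ shows that $W(A)$ is contained in the convex hull of the eigenvalues. The reverse inclusion holds because $W(A)$ is convex and contains every eigenvalue. It remains to prove (1).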

Since $A$ is normal, by the spectral theorem, there exists a unitary matrix $U$ such that $A=UDU^{*}$, where $D$ is a diagonal matrix containing the eigenvalues $\lambda_{1},\lambda_{2},\ldots,\lambda_{n}$ of $A$.

Let $x=c_{1}v_{1}+c_{2}v_{2}+\cdots+c_{k}v_{k}$ be a unit vector. Using the linearity of the inner product, that $Av_{j}=\lambda_{j}v_{j}$, and that $\left\{v_{i}\right\}$ are orthonormal, so that $\sum_{i=1}^{k}|c_{i}|^{2}=1$, we have:

$$\langle x,Ax\rangle=\sum_{i,j=1}^{k}c_{i}^{*}c_{j}\left\langle v_{i},\lambda_{j}v_{j}\right\rangle=\sum_{i=1}^{k}\left|c_{i}\right|^{2}\lambda_{i}\in\operatorname{hull}\left(\lambda_{1},\ldots,\lambda_{k}\right)$$
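A small numerical check of this computation, given as a sketch under stated assumptions: the example matrix below is normal but not diagonal, with eigenvalues $1+i$ and $1-i$, so every Rayleigh quotient should land on the vertical segment joining them. The matrix, the helper name `rayleigh` and the tolerances are illustrative choices.

```python
import random

def rayleigh(A, x):
    """Rayleigh quotient x* A x / x* x for a square matrix A given as a list of rows."""
    n = len(A)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    numerator = sum(x[i].conjugate() * Ax[i] for i in range(n))
    return numerator / sum(abs(xi) ** 2 for xi in x)

# Hypothetical example: A commutes with its adjoint (A A* = A* A = 2 I), so it is normal.
A = [[1 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]]

# (1, -i) is an eigenvector of A, so its Rayleigh quotient is the eigenvalue 1 + i.
eigenvalue_check = rayleigh(A, [1 + 0j, -1j])

rng = random.Random(2)
all_on_segment = True
for _ in range(500):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    q = rayleigh(A, x)
    # The hull of {1 + i, 1 - i} is the set of points with Re = 1 and |Im| <= 1.
    if abs(q.real - 1) > 1e-9 or abs(q.imag) > 1 + 1e-9:
        all_on_segment = False
```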

Proof of (3)

By affineness of $W$ (general property (3)), we can translate and rotate the complex plane, so that we reduce to the case where $\partial W(A)$ has a sharp point at $0$, and the two supporting planes at that point make angles $\phi_{1},\phi_{2}$ with the imaginary axis, with $\phi_{1}<\phi_{2}$ and $e^{i\phi_{1}}\neq e^{i\phi_{2}}$ since the point is sharp.

Since $0\in W(A)$, there exists a unit vector $x_{0}$ such that $x_{0}^{*}Ax_{0}=0$.

By general property (4), the numerical range lies in the sectors defined by:
$$\operatorname{Re}\left(e^{i\theta}\langle x,Ax\rangle\right)\geq 0\quad\text{for all }\theta\in[\phi_{1},\phi_{2}]\text{ and nonzero }x\in\mathbb{C}^{n}.$$
At $x=x_{0}$ the left-hand side equals $0$, its minimum value, so the directional derivative in any direction $y$ must vanish to maintain non-negativity. Specifically:
$$\left.\frac{d}{dt}\operatorname{Re}\left(e^{i\theta}\langle x_{0}+ty,A(x_{0}+ty)\rangle\right)\right|_{t=0}=0\quad\forall y\in\mathbb{C}^{n},\ \theta\in[\phi_{1},\phi_{2}].$$
Expanding this derivative:
$$\operatorname{Re}\left(e^{i\theta}\left(\langle y,Ax_{0}\rangle+\langle x_{0},Ay\rangle\right)\right)=0\quad\forall y\in\mathbb{C}^{n},\ \theta\in[\phi_{1},\phi_{2}].$$

Since the above holds for all $\theta\in[\phi_{1},\phi_{2}]$, an interval of positive length, we must have:
$$\langle y,Ax_{0}\rangle+\langle x_{0},Ay\rangle=0\quad\forall y\in\mathbb{C}^{n}.$$

For any $y\in\mathbb{C}^{n}$ and $\alpha\in\mathbb{C}$, substitute $\alpha y$ for $y$ in the equation. Since the inner product is conjugate-linear in its first argument, this gives
$$\alpha^{*}\langle y,Ax_{0}\rangle+\alpha\langle x_{0},Ay\rangle=0.$$
Choosing $\alpha=1$ and $\alpha=i$ and simplifying, we obtain $\langle y,Ax_{0}\rangle=0$ and $\langle x_{0},Ay\rangle=0$ for all $y$, thus $Ax_{0}=0$ and $A^{*}x_{0}=0$. Hence $x_{0}$ is an eigenvector of both $A$ and $A^{*}$ for the eigenvalue $0$.

Numerical radius

Proof of (2)

Let $v=\arg\max_{\|x\|_{2}=1}|\langle x,Ax\rangle|$, which exists because the unit sphere is compact. We have $r(A)=|\langle v,Av\rangle|$.

By Cauchy–Schwarz,
$$|\langle v,Av\rangle|\leq\|v\|_{2}\|Av\|_{2}=\|Av\|_{2}\leq\|A\|_{op}$$

For the other inequality, let $A=B+iC$, where $B=(A+A^{*})/2$ and $C=(A-A^{*})/(2i)$ are Hermitian. By the triangle inequality,
$$\|A\|_{op}\leq\|B\|_{op}+\|C\|_{op}$$

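For every unit vector $x$ we have $\langle x,Ax\rangle=\langle x,Bx\rangle+i\langle x,Cx\rangle$, with $\langle x,Bx\rangle$ and $\langle x,Cx\rangle$ both real. So $W(B)$, which is on the real line, and $W(iC)$, which is on the imaginary line, are the projections of $W(A)$ onto the two axes, and $|\langle x,Bx\rangle|\leq|\langle x,Ax\rangle|$, $|\langle x,Cx\rangle|\leq|\langle x,Ax\rangle|$. Since $B$ and $C$ are normal, it follows that $\|B\|_{op}=r(B)\leq r(A)$ and $\|C\|_{op}=r(iC)\leq r(A)$, which gives $\|A\|_{op}\leq 2r(A)$.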
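The decomposition used in this proof can be checked numerically. The following sketch, with a hypothetical example matrix and helper names, forms $B$ and $C$ and confirms on random unit vectors that $\langle x,Bx\rangle$ and $\langle x,Cx\rangle$ are the real and imaginary parts of $\langle x,Ax\rangle$, so neither exceeds $|\langle x,Ax\rangle|$ in absolute value.

```python
import math
import random

def adjoint(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def quad_form(A, x):
    """<x, Ax> = x* A x."""
    n = len(A)
    return sum(x[i].conjugate() * A[i][j] * x[j] for i in range(n) for j in range(n))

def cartesian_parts(A):
    """Return Hermitian B, C with A = B + iC."""
    n = len(A)
    S = adjoint(A)
    B = [[(A[i][j] + S[i][j]) / 2 for j in range(n)] for i in range(n)]
    C = [[(A[i][j] - S[i][j]) / 2j for j in range(n)] for i in range(n)]
    return B, C

# Hypothetical example matrix.
A = [[1 + 1j, 2 + 0j, 0j], [-1j, 0.5 + 0j, 3 - 2j], [4 + 0j, 1j, -1 + 1j]]
B, C = cartesian_parts(A)

rng = random.Random(3)
max_err = 0.0
bound_holds = True
for _ in range(300):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)]
    norm = math.sqrt(sum(abs(xi) ** 2 for xi in x))
    x = [xi / norm for xi in x]  # normalize to a unit vector
    qa, qb, qc = quad_form(A, x), quad_form(B, x), quad_form(C, x)
    # <x, Bx> and <x, Cx> should be real and equal to Re and Im of <x, Ax>.
    max_err = max(max_err, abs(qb - qa.real), abs(qc - qa.imag))
    if abs(qb) > abs(qa) + 1e-9 or abs(qc) > abs(qa) + 1e-9:
        bound_holds = False
```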
