Vectors and Covectors in Special Relativity
Jim Napolitano
March 12, 2010
These notes are meant to accompany the course Electromagnetic Theory for the Spring 2010 term at RPI. This material is covered thoroughly in Chapters One and Seven in our textbook Classical Electrodynamics, 2nd Ed. by Hans Ohanian, and also in the auxiliary textbook Principles of Electrodynamics by Melvin Schwartz. I am attempting here to present a condensed version of this material, along with several worked examples.

Before specializing the discussion to special relativity, we go over two introductory topics. One is the notation we use for vectors written as components, especially the Einstein summation notation. We will use this to come up with "grown up" definitions of scalars, vectors, and tensors. The second is a brief introduction to coordinate-free geometry, which necessitates a discussion of contravariant and covariant vectors. These two topics will then be combined to write down the formalism for special relativity.
Old Math with a New Notation
We think of a vector in ordinary three-dimensional space as an object with direction and
magnitude. Typically, when we need to calculate something with a vector, we write it in
terms of its coordinates using some coordinate system, say (x,y,z). For example
$$\mathbf{v} = v_x\,\hat{\mathbf{x}} + v_y\,\hat{\mathbf{y}} + v_z\,\hat{\mathbf{z}} \tag{1a}$$

$$v = |\mathbf{v}| = \left[ v_x^2 + v_y^2 + v_z^2 \right]^{1/2} \tag{1b}$$
It is important to separate the vector object from the vector written in terms of its coordinates. The vector is the same thing, that is, in the same direction with the same magnitude, regardless of what coordinate system we use to describe it. Indeed, we will, in a moment, define a vector in terms of how one goes from a description in one coordinate system to a description in another.
Figure 1 shows two sets of axes. Equation 1 describes the vector $\mathbf{v}$ in terms of the unprimed coordinate system.

Figure 1: Simple depiction of a default (unprimed) coordinate system, in black. A second (primed) coordinate system is drawn in red. The primed system can be described in terms of some rotation of the unprimed system.

We could just as easily have written the vector in the primed coordinate system. In that case
$$\mathbf{v} = v'_x\,\hat{\mathbf{x}}' + v'_y\,\hat{\mathbf{y}}' + v'_z\,\hat{\mathbf{z}}' \tag{2a}$$

$$v = |\mathbf{v}| = \left[ (v'_x)^2 + (v'_y)^2 + (v'_z)^2 \right]^{1/2} \tag{2b}$$
Since the unit vectors are orthogonal to each other within a particular coordinate system, it
is easy enough to write down the coordinates in the primed frame in terms of those in the
unprimed frame. We have the linear transformation
$$\hat{\mathbf{x}}'\cdot\mathbf{v} = v'_x = \hat{\mathbf{x}}'\cdot\hat{\mathbf{x}}\,v_x + \hat{\mathbf{x}}'\cdot\hat{\mathbf{y}}\,v_y + \hat{\mathbf{x}}'\cdot\hat{\mathbf{z}}\,v_z \tag{3a}$$

$$\hat{\mathbf{y}}'\cdot\mathbf{v} = v'_y = \hat{\mathbf{y}}'\cdot\hat{\mathbf{x}}\,v_x + \hat{\mathbf{y}}'\cdot\hat{\mathbf{y}}\,v_y + \hat{\mathbf{y}}'\cdot\hat{\mathbf{z}}\,v_z \tag{3b}$$

$$\hat{\mathbf{z}}'\cdot\mathbf{v} = v'_z = \hat{\mathbf{z}}'\cdot\hat{\mathbf{x}}\,v_x + \hat{\mathbf{z}}'\cdot\hat{\mathbf{y}}\,v_y + \hat{\mathbf{z}}'\cdot\hat{\mathbf{z}}\,v_z \tag{3c}$$
which can obviously be written as a matrix if we like.
We need a notation that is both more compact than what we have above, and also one
that can easily be generalized to more than three dimensions. Let the (Latin) indices $i$, $j$, ... represent the numbers 1, 2, and 3, corresponding to the coordinates $x$, $y$, and $z$ (or $x'$, $y'$, and $z'$). Write the components of $\mathbf{v}$ as $v_i$ or $v'_i$ in the two coordinate systems. For example, $v_1 = v_x$, $v_2 = v_y$, and so on. Also, define

$$a_{ij} \equiv \hat{\mathbf{x}}'_i \cdot \hat{\mathbf{x}}_j \tag{4}$$
For example, $a_{12} = \hat{\mathbf{x}}'\cdot\hat{\mathbf{y}}$. The $a_{ij}$ define the rotation, and in fact are individually just the cosines of the angle between one axis and another. This allows us to write the transformation (3) as

$$v'_i = \sum_{j=1}^{3} a_{ij}\, v_j \tag{5}$$
Now here is an important new agreement, called the Einstein summation convention.
We will agree that whenever an index appears twice, then it is implied that we sum over it.
(Sometimes this is called “contraction.”) Therefore (5) is written from now on as
$$v'_i = a_{ij}\, v_j \tag{6}$$
Since $a_{ij}$ and $v_j$ are just ordinary numbers, the order in which we write them is irrelevant. However, in (6) we took the opportunity to write it with the summed index $j$ "adjacent." That is, the two occurrences of $j$ appear next to each other. That means that we can write (6) as a matrix equation with the matrices $a$ and $v$ written in the same order as in (6), viz

$$\begin{pmatrix} v'_1 \\ v'_2 \\ v'_3 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \tag{7a}$$
or, if we agree to write matrices as sans serif characters, and leave their dimensionality to
context,
$$\mathsf{v}' = \mathsf{a}\,\mathsf{v} \tag{7b}$$
We emphasize that (6) is the unambiguous way to write the coordinate transformation.
Using (7) is handy for carrying through calculations, but relies on the individual factors to
be written in the correct order.
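To make the index notation concrete, here is a minimal numerical sketch (using NumPy; the rotation angle and test vector are arbitrary choices, not from the notes). It builds $a_{ij}$ for a rotation about the $z$ axis and checks that the explicit sum (5), the Einstein-convention contraction (6), and the matrix form (7b) all give the same components:

```python
import numpy as np

# Hypothetical example: primed axes obtained by rotating the unprimed
# axes by an angle theta about the z axis.
theta = 0.3
c, s = np.cos(theta), np.sin(theta)

# Rows are the primed basis vectors written in unprimed coordinates,
# so a[i, j] = x'_i . x_j, as in equation (4).
a = np.array([[ c,  s, 0],
              [-s,  c, 0],
              [ 0,  0, 1]])

v = np.array([1.0, 2.0, 3.0])

# Equation (5): explicit sum.  Equation (6): einsum.  Equation (7b): matrix.
v_sum = np.array([sum(a[i, j] * v[j] for j in range(3)) for i in range(3)])
v_ein = np.einsum('ij,j->i', a, v)
v_mat = a @ v

assert np.allclose(v_sum, v_ein) and np.allclose(v_ein, v_mat)
```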
Two common vector operations are now easy to write down in our new notation. For two
vectors v and w we have the “inner” or “dot” product
$$\mathbf{v}\cdot\mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 = v_i w_i = \mathsf{v}^T \mathsf{w} \tag{8}$$
making use of the “transpose” operation for matrices. The “cross product” is, in terms of
components
$$(\mathbf{v}\times\mathbf{w})_x = v_y w_z - v_z w_y \qquad \text{etc.}$$

or

$$(\mathbf{v}\times\mathbf{w})_i = \epsilon_{ijk}\, v_j w_k \tag{9}$$
(Note that this summation is to be done separately over $j$ and $k$, so nine terms.) The totally antisymmetric symbol $\epsilon_{ijk}$ is discussed on page five of your textbook. Briefly, it equals +1 for an even permutation of $ijk = 123$, −1 for an odd permutation, and 0 if any two indices are equal.
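The same kind of sketch works for (9). The following (again NumPy, with arbitrary test vectors) builds $\epsilon_{ijk}$ explicitly and checks the double contraction against the built-in cross product:

```python
import numpy as np

# Build the totally antisymmetric symbol epsilon_ijk explicitly
# (indices run 0,1,2 here rather than 1,2,3).
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = +1.0   # even permutations of (0,1,2)
    eps[i, k, j] = -1.0   # odd permutations (one pair swapped)

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# Equation (9): (v x w)_i = eps_ijk v_j w_k, summing over both j and k.
cross = np.einsum('ijk,j,k->i', eps, v, w)
assert np.allclose(cross, np.cross(v, w))
```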
You likely remember that you can describe the dot product geometrically. It is equal to the
product of the magnitudes of the two vectors, times the cosine of the angle between them.
That means, of course, that it should be independent of which coordinate system we use to
write the vector components. We say that the dot product is “invariant” under coordinate
rotations. Using our new notation, we can write this mathematically as
$$\mathbf{v}\cdot\mathbf{w} = v_i w_i = (v')_i (w')_i = \left( a_{ij} v_j \right)\left( a_{ik} w_k \right) = a_{ij}\, a_{ik}\, v_j w_k \tag{10}$$
In order for this equation to hold, it must be that
$$a_{ij}\, a_{ik} = \delta_{jk} \tag{11}$$
in which case (10) becomes

$$(v')_i (w')_i = \delta_{jk}\, v_j w_k = v_j w_j \tag{12}$$
which just says that the dot product is the same number whether we use the primed or
unprimed coordinates. (It doesn’t matter that the first summation is over i and the final
step is over j. The summation convention turns them into dummy indices.)
Notice carefully the order of the indices in (11): the first of the two indices on each factor of $a$ are contracted with each other. To write this as matrix multiplication, recall that the transpose of a matrix just reverses the rows and columns. That is, (11) can be written as
$$\left( \mathsf{a}^T \right)_{ji} a_{ik} = \delta_{jk} \qquad \text{or} \qquad \mathsf{a}^T \mathsf{a} = 1 \tag{13}$$
since, now, the indices i are adjacent. Matrices with this property are called “orthogonal.”
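As a quick numerical confirmation (a sketch reusing the hypothetical rotation matrix from above), both the index form (11) and the matrix form (13) hold:

```python
import numpy as np

# The same hypothetical rotation about the z axis used earlier.
theta = 0.3
c, s = np.cos(theta), np.sin(theta)
a = np.array([[ c,  s, 0],
              [-s,  c, 0],
              [ 0,  0, 1]])

# Index form (11): a_ij a_ik = delta_jk.  Note the contraction is
# over the *first* index of each factor.
delta = np.einsum('ij,ik->jk', a, a)
assert np.allclose(delta, np.eye(3))

# Matrix form (13): a^T a = 1.
assert np.allclose(a.T @ a, np.eye(3))
```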
Definitions: Scalars, Vectors, and Tensors
Given some orthogonal transformation $a_{ij}$ we can go on to classify certain objects based on
their transformation properties. This is the approach we take in this course, and in fact
in any study of physical systems defined on the basis of transformations between different
coordinate systems. Our new definition will contain the old definition of a vector, but can be
obviously extended to include other systems, in particular those with different transformation
properties and different numbers of dimensions.
• A scalar is a number $K$ which has the same value in different coordinate systems.

$$K' = K \tag{14}$$
• A vector is a set of numbers $v_i$ which transform according to

$$v'_i = a_{ij}\, v_j \tag{15}$$
• A (second rank) tensor is a set of numbers $T_{ij}$ which transform according to

$$T'_{ij} = a_{ik}\, a_{jl}\, T_{kl} \tag{16}$$
Higher rank tensors (that is, objects with more than two indices) are defined analogously.
(See Ohanian, page 13.) Indeed, a vector is a tensor of rank one, and a scalar is a tensor of
rank zero.
Further segmentation of these definitions is possible, but we won't discuss it at any length here. For example, depending on whether the determinant of $\mathsf{a}$ is ±1, we would characterize vectors as polar vectors or axial vectors. Also, it is easy to show that any tensor can be written as the sum of a symmetric part (i.e. $T_{ji} = T_{ij}$) and an antisymmetric part (i.e. $T_{ji} = -T_{ij}$). If and when we need these concepts in our course, we'll discuss them at that time.
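The symmetric/antisymmetric split is easy to see numerically; a minimal sketch with an arbitrary random tensor:

```python
import numpy as np

# Any second rank tensor T splits as T = S + A, with S symmetric
# (S_ji = S_ij) and A antisymmetric (A_ji = -A_ij).
rng = np.random.default_rng(0)
T = rng.normal(size=(3, 3))

S = 0.5 * (T + T.T)   # symmetric part
A = 0.5 * (T - T.T)   # antisymmetric part

assert np.allclose(S, S.T)
assert np.allclose(A, -A.T)
assert np.allclose(T, S + A)
```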
Contravariant and Covariant Vectors and Tensors
So that we can get ready for a discussion of special relativity, we need to take up a distinction
between different kinds of vectors and tensors based on abstract geometry.
If we use the symbol A to denote a vector, and A · B to denote the inner product between
vectors, then we are unnecessarily restricting ourselves to a particular geometry. Indeed, it
implies that our geometry is “flat.” Two vectors will define a plane surface. You can imagine
that there should be more freedom than that. In other words, why can’t we have a space
with some kind of “curvature”, and how would we go about specifying that curvature?
We can remove this restriction by defining two types of vectors. A contravariant vector (or just vector) $\vec{A}$ will be distinguished from a covariant vector (or just covector) $\overleftarrow{B}$. An inner product will only be defined between a contravariant vector and a covector, namely $\vec{A}\cdot\overleftarrow{B} = \overleftarrow{B}\cdot\vec{A}$. If I want to define an angle between two vectors by using the inner product, then I need to somehow determine the geometry. This is done with the metric tensor $\vec{\vec{g}}$, which turns a covector into a vector by the operation $\vec{B} = \vec{\vec{g}}\cdot\overleftarrow{B}$. The inverse metric tensor $\overleftarrow{\overleftarrow{g}}$ turns a vector into a covector. Clearly, $\vec{A}\cdot\overleftarrow{B} = \overleftarrow{A}\cdot\vec{B}$ implies that $\vec{\vec{g}}\cdot\overleftarrow{\overleftarrow{g}} = \mathbf{1}$.
Is there a good way to get a pictorial representation of vectors and covectors? Some options
are discussed in Amer. Jour. Phys. 65(1997)1037. One of these is depicted in Figure 2.
The contravariant vector is a “stick” whose magnitude is the length of the stick, while the
covector is like a serving of lasagna. The magnitude of the covector is the density of lasagna
noodles.
Figure 2: Examples of how you can picture contravariant and covariant vectors. A contravari-
ant vector is a “stick” with a direction to it. Its “worth” (or “magnitude”) is proportional to
the length of the stick. A covariant vector is like “lasagna.” Its worth is proportional to the
density of noodles; that is, the closer together are the sheets, the larger is the magnitude of
the covector. These and other pictorial examples of visualizing contravariant and covariant
vectors are discussed in Am.J.Phys.65(1997)1037.
Figure 3: Pictorial representation of the inner
product between a contravariant vector and a co-
variant vector. The “stick” is imbedded in the
“lasagna” and the inner product is equal to the
number of noodles pierced by the stick. The in-
ner product shown has the value 5.
The classic 3D example of a covector is the gradient. You actually think of the gradient
for some function in terms of contour lines, along which the function has a constant value.
When the contours are close together, the gradient is larger than when they are far apart.
(Before the end of this lesson, we will indeed show that a gradient transforms the way a
covector should transform, as opposed to a vector, in the case of special relativity.)
This representation lends itself to a nice pictorial representation of the inner product between
a vector and covector. See Figure 3. This gives us a handy way to view the inner product
without having to resort to the definition of an “angle.” In order to define angles we need
to define the metric tensor.
Of course, this is all consistent with our usual high school three dimensional geometry. We put all of this contravariant/covariant business behind us just by letting the metric tensor be the identity transformation.
However, the "geometry" of spacetime is a physical situation that is well suited to the distinction between contravariant and covariant vectors. Indeed, the metric tensor has important physical significance in this case.
Application to (Special) Relativity
This formal way of treating vectors is important for a good understanding of Einstein’s
theory of relativity.
First, some terminology. We work in four-dimensional "spacetime" instead of three-dimensional space. In place of a "coordinate system" we speak of a "reference frame." Instead of a "coordinate transformation" we use the Lorentz Transformation to relate different reference frames. Vectors and tensors will be specialized to "four-vectors" and "four-tensors", although we'll frequently just drop the "four-".
We write everything here in terms of coordinates, and will not try to do things purely
geometrically. We use upper and lower indices to distinguish components of contravariant
and covariant vectors, respectively, namely
Contravariant vector components: $A^\mu$ where $\mu = 0,1,2,3$
Covariant vector components: $A_\mu$ where $\mu = 0,1,2,3$
Inner products will only be between contravariant and covariant vectors or tensor components. That is, we will only be contracting upper indices with lower ones.
The geometry of spacetime is represented by the metric tensor $g_{\mu\nu}$. That is, the invariant spacetime interval is given by

$$ds^2 = dx^\mu\, dx_\mu = dx^\mu\, g_{\mu\nu}\, dx^\nu$$
In principle, $g_{\mu\nu}$ can be a function of spacetime, i.e. $g_{\mu\nu} = g_{\mu\nu}(x^\alpha)$. The equations of Einstein's general theory of relativity are used to determine $g_{\mu\nu}$ in the presence of distributions of matter and energy.
Special relativity is the case of flat spacetime, which is all that we will discuss in this course.
The metric tensor for flat spacetime is $g_{\mu\nu} = \eta_{\mu\nu}$, where $\eta_{00} = 1$, $\eta_{11} = \eta_{22} = \eta_{33} = -1$, and all of the other elements are zero. Notice that the (components of the) inverse metric tensor $\eta^{\mu\nu} = \eta_{\mu\nu}$. That is, it is simple to show that

$$\eta^{\mu\alpha}\, \eta_{\alpha\nu} = \delta^\mu{}_\nu$$
Don’t forget the Einstein summation convention, which now means we sum over repeated
indices equal to 0, 1, 2, and 3.
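As a one-line numerical check (a NumPy sketch, with index order $0,1,2,3$):

```python
import numpy as np

# Flat metric eta_{mu nu}; numerically its inverse eta^{mu nu} has the
# same entries, and their contraction gives the identity delta^mu_nu.
eta = np.diag([1.0, -1.0, -1.0, -1.0])
assert np.allclose(np.einsum('ma,an->mn', eta, eta), np.eye(4))
```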
The Lorentz transformation is now written as Ohanian (7.5), namely
$$x'^\mu = a^\mu{}_\nu\, x^\nu \tag{17}$$
where
$$a^\mu{}_\nu = a^\mu{}_\nu(+V) = \begin{pmatrix} \gamma & -\gamma V/c & 0 & 0 \\ -\gamma V/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{18}$$
for boosts in the $x^1$ direction. Such a transformation indeed leaves

$$ds^2 = dx^\mu\, \eta_{\mu\nu}\, dx^\nu = (c\,dt)^2 - dx^2 - dy^2 - dz^2$$
invariant. This was our original starting point for special relativity.
Note our care in (17) and (18) so that when we mix contravariant and covariant indices, it is
clear which is first, and therefore labels the row, and which is second, labeling the column.
We know that the inverse Lorentz transformation only amounts to V → −V , and so
$$x^\mu = a^\mu{}_\nu(-V)\, x'^\nu = a^\mu{}_\nu(-V)\, a^\nu{}_\alpha(+V)\, x^\alpha \tag{19}$$
Switching dummy indices α and ν, this means that
$$a^\mu{}_\alpha(-V)\, a^\alpha{}_\nu(+V) = \delta^\mu{}_\nu \tag{20}$$
which can be written in terms of 4 × 4 matrices as
$$\mathsf{a}(-V)\, \mathsf{a}(+V) = 1 \tag{21}$$
It is worth a few minutes’ time to convince yourself that the matrix in (18) satisfies (21).
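Here is one way to do that check numerically (a sketch, setting $c = 1$ and picking an arbitrary $V = 0.6$; it also verifies that the boost preserves the interval, which in matrix language reads $\mathsf{a}^T \eta\, \mathsf{a} = \eta$):

```python
import numpy as np

def boost(V, c=1.0):
    """Boost matrix a^mu_nu(V) of equation (18); rows carry the upper index."""
    gamma = 1.0 / np.sqrt(1.0 - (V / c) ** 2)
    return np.array([[ gamma,         -gamma * V / c, 0.0, 0.0],
                     [-gamma * V / c,  gamma,         0.0, 0.0],
                     [ 0.0,            0.0,           1.0, 0.0],
                     [ 0.0,            0.0,           0.0, 1.0]])

V = 0.6  # arbitrary speed in units of c

# Equation (21): a(-V) a(+V) = 1.
assert np.allclose(boost(-V) @ boost(+V), np.eye(4))

# Invariance of ds^2 for an arbitrary displacement (c dt, dx, dy, dz).
eta = np.diag([1.0, -1.0, -1.0, -1.0])
dx = np.array([1.0, 0.2, -0.5, 0.3])
dxp = boost(+V) @ dx
assert np.isclose(dxp @ eta @ dxp, dx @ eta @ dx)
```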
How do covectors transform? This is easy to answer using the metric tensor. So,

$$x'_\mu = \eta_{\mu\alpha}\, x'^\alpha = \eta_{\mu\alpha}\, a^\alpha{}_\beta\, x^\beta = \eta_{\mu\alpha}\, a^\alpha{}_\beta\, \eta^{\beta\nu}\, x_\nu$$
In other words, the transformation equation for covectors is
$$x'_\mu = a_\mu{}^\nu\, x_\nu \tag{22}$$
where

$$a_\mu{}^\nu = \eta_{\mu\alpha}\, a^\alpha{}_\beta\, \eta^{\beta\nu} \tag{23}$$
It is simple to show that $a_\mu{}^\nu(+V) = a^\nu{}_\mu(-V)$. See Ohanian (7.22). Therefore

$$a_\alpha{}^\mu\, a^\alpha{}_\nu = \delta^\mu{}_\nu \tag{24}$$

In other words, Lorentz transformations show exactly the same orthogonality rule, namely "sum over the first index to get the identity matrix," that we encountered with old fashioned three dimensional vectors in (11). We will make much use of (24).
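A numerical sketch of (23) and (24), reusing the hypothetical boost from the previous example (assumptions as before: $c = 1$, arbitrary $V$):

```python
import numpy as np

def boost(V, c=1.0):
    """Boost matrix a^mu_nu(V) of equation (18); rows carry the upper index."""
    gamma = 1.0 / np.sqrt(1.0 - (V / c) ** 2)
    return np.array([[ gamma,         -gamma * V / c, 0.0, 0.0],
                     [-gamma * V / c,  gamma,         0.0, 0.0],
                     [ 0.0,            0.0,           1.0, 0.0],
                     [ 0.0,            0.0,           0.0, 1.0]])

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # eta is its own inverse
V = 0.6
a_up = boost(+V)             # a^mu_nu(+V)
a_lo = eta @ a_up @ eta      # equation (23): a_mu^nu = eta a eta

# a_mu^nu(+V) = a^nu_mu(-V), as stated above (the boost matrix is symmetric).
assert np.allclose(a_lo, boost(-V))

# Equation (24): a_alpha^mu a^alpha_nu = delta^mu_nu.  The contraction
# runs over the first (row) index of each matrix, hence the transpose.
assert np.allclose(a_lo.T @ a_up, np.eye(4))
```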
That is pretty much everything we need to proceed with our study of electromagnetism. The
rest of this note gives some sample problems, and then their solutions. First, though, let
us summarize our fundamental objects on the basis of their fundamental properties under
Lorentz transformations between two reference frames moving with relative speed V :
• A scalar $K$ is a number such that $K' = K$. That is, it has the same value in all reference frames.
• A contravariant four-vector $A^\mu$ is a set of four numbers which transform as

$$A'^\mu = a^\mu{}_\nu\, A^\nu \tag{25}$$

• A covariant four-vector $A_\mu$ is a set of four numbers which transform as

$$A'_\mu = a_\mu{}^\nu\, A_\nu \tag{26}$$

Note equations (23) and (24), which relate the matrices $a^\mu{}_\nu$ and $a_\mu{}^\nu$.
• A (second rank) fully contravariant tensor $T^{\mu\nu}$ is a set of sixteen numbers which transform as

$$T'^{\mu\nu} = a^\mu{}_\alpha\, a^\nu{}_\beta\, T^{\alpha\beta} \tag{27}$$
Higher rank tensors are obvious generalizations.
• A (second rank) fully covariant tensor $T_{\mu\nu}$ is a set of sixteen numbers which transform as

$$T'_{\mu\nu} = a_\mu{}^\alpha\, a_\nu{}^\beta\, T_{\alpha\beta} \tag{28}$$
There are some obvious generalizations of tensors. Clearly, a vector can be considered a first rank tensor, and a scalar is a zero rank tensor. Higher rank tensors are clearly possible as well, and their definitions are straightforward extensions. We can also have mixed contra/covariant tensors. For example, the set of sixteen numbers $T^\mu{}_\nu$ would transform as

$$T'^\mu{}_\nu = a^\mu{}_\alpha\, a_\nu{}^\beta\, T^\alpha{}_\beta$$
Finally, remember that if you want to use matrix multiplication to carry out any contractions
with tensors, make sure that you write the matrices in the correct order. The correct order
is the one which matches adjacent indices.
Sample Problems
1. If $A^\mu$ and $B_\mu$ are four-vectors, prove that $K \equiv A^\mu B_\mu$ is a scalar.

2. If $A^\mu$ and $B^\nu$ are four-vectors, prove that $T^{\mu\nu} \equiv A^\mu B^\nu$ is a tensor.

3. If $B_\mu$ is a vector and $A^{\alpha\mu}$ is a tensor, prove that $X^\alpha \equiv B_\mu A^{\alpha\mu}$ is a vector.

4. The trace of a mixed tensor $T^\mu{}_\nu$ is $\mathrm{Tr}(T^\mu{}_\nu) \equiv T^\mu{}_\mu$. Show that $\mathrm{Tr}(T^\mu{}_\nu)$ is a scalar.

5. Prove that the four-gradient operator $\partial_\mu \equiv \partial/\partial x^\mu$ indeed transforms like a covector.

6. (Ohanian 7.2a) Consider sixteen numbers given by $A(\mu,\nu)$. Suppose that $A(\mu,\nu)B^\mu$ is a vector for any vector $B^\mu$. Prove that $A(\mu,\nu)$ is a tensor.

7. (Ohanian 7.2b) The number $K \equiv A_{\mu\nu} C^{\mu\nu}$ is a scalar for any tensor $C^{\mu\nu}$. Show that $A_{\mu\nu}$ is a tensor.
Sample Problems: Solutions
1. If $A^\mu$ and $B_\mu$ are four-vectors, prove that $K \equiv A^\mu B_\mu$ is a scalar.

$$K' = A'^\mu B'_\mu = \left( a^\mu{}_\alpha A^\alpha \right) \left( a_\mu{}^\beta B_\beta \right) = a^\mu{}_\alpha\, a_\mu{}^\beta\, A^\alpha B_\beta = \delta_\alpha{}^\beta\, A^\alpha B_\beta = A^\alpha B_\alpha = K$$
2. If $A^\mu$ and $B^\nu$ are four-vectors, prove that $T^{\mu\nu} \equiv A^\mu B^\nu$ is a tensor.

$$T'^{\mu\nu} = A'^\mu B'^\nu = \left( a^\mu{}_\alpha A^\alpha \right) \left( a^\nu{}_\beta B^\beta \right) = a^\mu{}_\alpha\, a^\nu{}_\beta\, A^\alpha B^\beta = a^\mu{}_\alpha\, a^\nu{}_\beta\, T^{\alpha\beta}$$
3. If $B_\mu$ is a vector and $A^{\alpha\mu}$ is a tensor, prove that $X^\alpha \equiv B_\mu A^{\alpha\mu}$ is a vector.

$$X'^\alpha = B'_\mu A'^{\alpha\mu} = \left( a_\mu{}^\nu B_\nu \right) \left( a^\alpha{}_\beta\, a^\mu{}_\gamma\, A^{\beta\gamma} \right) = a_\mu{}^\nu\, a^\mu{}_\gamma\, a^\alpha{}_\beta\, B_\nu A^{\beta\gamma} = \delta^\nu{}_\gamma\, a^\alpha{}_\beta\, B_\nu A^{\beta\gamma} = a^\alpha{}_\beta\, B_\nu A^{\beta\nu} = a^\alpha{}_\beta\, X^\beta$$
Note that we arranged the transformation matrices to show that we used (24) explicitly.
4. The trace of a mixed tensor $T^\mu{}_\nu$ is $\mathrm{Tr}(T^\mu{}_\nu) \equiv T^\mu{}_\mu$. Show that $\mathrm{Tr}(T^\mu{}_\nu)$ is a scalar.

$$T'^\mu{}_\mu = a^\mu{}_\alpha\, a_\mu{}^\beta\, T^\alpha{}_\beta = \delta_\alpha{}^\beta\, T^\alpha{}_\beta = T^\alpha{}_\alpha$$
5. Prove that the four-gradient operator $\partial_\mu \equiv \partial/\partial x^\mu$ indeed transforms like a covector.

First use some simple calculus to write

$$\partial'_\mu = \frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\,\frac{\partial}{\partial x^\nu} = \frac{\partial x^\nu}{\partial x'^\mu}\,\partial_\nu$$

So now we need to find $\partial x^\nu/\partial x'^\mu$, which is simple once we have the inverse Lorentz transform.

$$x'^\mu = a^\mu{}_\nu\, x^\nu = a^\mu{}_\alpha\, x^\alpha \qquad \text{so} \qquad a_\mu{}^\nu\, x'^\mu = a_\mu{}^\nu\, a^\mu{}_\alpha\, x^\alpha = \delta^\nu{}_\alpha\, x^\alpha = x^\nu$$

Therefore

$$\frac{\partial x^\nu}{\partial x'^\mu} = a_\mu{}^\nu \qquad \text{and} \qquad \partial'_\mu = a_\mu{}^\nu\, \partial_\nu$$
6. (Ohanian 7.2a) Consider sixteen numbers given by $A(\mu,\nu)$. Suppose that $A(\mu,\nu)B^\mu$ is a vector for any vector $B^\mu$. Prove that $A(\mu,\nu)$ is a tensor.

Written properly, $A(\mu,\nu)$ is either $A_\mu{}^\nu$ or $A_{\mu\nu}$. Work these separately.

a) $X^\nu \equiv A_\mu{}^\nu B^\mu$, so $X'^\nu = A'_\mu{}^\nu B'^\mu$. Since $B^\mu$ is a vector, $X'^\nu = A'_\mu{}^\nu\, a^\mu{}_\alpha B^\alpha = A'_\gamma{}^\nu\, a^\gamma{}_\alpha B^\alpha$. Alternatively, since $X^\nu$ is a vector, $X'^\nu = a^\nu{}_\beta X^\beta = a^\nu{}_\beta A_\mu{}^\beta B^\mu = a^\nu{}_\beta A_\alpha{}^\beta B^\alpha$. Since either of these is true for any vector $B^\mu$, we must have $A'_\gamma{}^\nu\, a^\gamma{}_\alpha = a^\nu{}_\beta\, A_\alpha{}^\beta$. Multiplying by $a_\mu{}^\alpha$ gives

$$a_\mu{}^\alpha\, a^\gamma{}_\alpha\, A'_\gamma{}^\nu = a_\mu{}^\alpha\, a^\nu{}_\beta\, A_\alpha{}^\beta \tag{29}$$

So, what is $a_\mu{}^\alpha\, a^\gamma{}_\alpha$? Unlike (24), it contracts on the second indices, not the first. However, $A^\nu = a^\nu{}_\mu(-V)\, A'^\mu = a_\mu{}^\nu\, A'^\mu$ for some vector $A^\nu$ (and likewise $B_\nu = a^\beta{}_\nu\, B'_\beta$), and therefore the (invariant) inner product is

$$A^\nu B_\nu = \left( a_\alpha{}^\nu A'^\alpha \right) \left( a^\beta{}_\nu B'_\beta \right) = a_\alpha{}^\nu\, a^\beta{}_\nu\, A'^\alpha B'_\beta = A'^\alpha B'_\alpha$$

which means that $a_\alpha{}^\nu\, a^\beta{}_\nu = \delta_\alpha{}^\beta$. In other words, you can sum over the second indices for the orthogonality condition, as well as the first indices. Therefore, (29) becomes

$$A'_\mu{}^\nu = a_\mu{}^\alpha\, a^\nu{}_\beta\, A_\alpha{}^\beta$$

b) $X_\nu \equiv A_{\mu\nu} B^\mu$, so $X'_\nu = A'_{\mu\nu} B'^\mu$. Working things the same way as for $X^\nu$, you have

$$A'_{\gamma\nu}\, a^\gamma{}_\alpha = a_\nu{}^\beta\, A_{\alpha\beta}$$
$$a_\mu{}^\alpha\, a^\gamma{}_\alpha\, A'_{\gamma\nu} = a_\mu{}^\alpha\, a_\nu{}^\beta\, A_{\alpha\beta}$$
$$A'_{\mu\nu} = a_\mu{}^\alpha\, a_\nu{}^\beta\, A_{\alpha\beta}$$
7. (Ohanian 7.2b) The number $K \equiv A_{\mu\nu} C^{\mu\nu}$ is a scalar for any tensor $C^{\mu\nu}$. Show that $A_{\mu\nu}$ is a tensor.

Since $C^{\mu\nu}$ can be anything, pick $C^{\mu\nu} = x^\mu y^\nu$ for arbitrary vectors $x^\mu$ and $y^\nu$. Then $K' = K$ means that $A'_{\alpha\beta}\, x'^\alpha y'^\beta = A_{\alpha\beta}\, x^\alpha y^\beta$. Now do $\partial^2/\partial x'^\mu\, \partial y'^\nu$ on both sides to get

$$A'_{\alpha\beta}\, \delta^\alpha{}_\mu\, \delta^\beta{}_\nu = A'_{\mu\nu} = A_{\alpha\beta}\, \frac{\partial x^\alpha}{\partial x'^\mu}\, \frac{\partial y^\beta}{\partial y'^\nu} = \frac{\partial x^\alpha}{\partial x'^\mu}\, \frac{\partial y^\beta}{\partial y'^\nu}\, A_{\alpha\beta}$$

In Problem 5 we showed that $\partial x^\nu/\partial x'^\mu = a_\mu{}^\nu$. Therefore

$$A'_{\mu\nu} = a_\mu{}^\alpha\, a_\nu{}^\beta\, A_{\alpha\beta}$$