
Schur functors

I’m going to describe the basic ideas behind the Schur functors, \mathbb{S}^\lambda(V), where \lambda is a partition and V is a vector space. These will turn out to be the complete set of irreducible polynomial representations of GL_n (for all n). The main facts to aim for are:

  • Every irreducible polynomial representation of GL_n is isomorphic to a unique Schur functor. Conversely, every Schur functor is irreducible.
  • The character of \mathbb{S}^\lambda(V) is the Schur polynomial s_\lambda.
  • The dimension of \mathbb{S}^\lambda(V) is the number of SSYTs of shape \lambda with entries from 1, \ldots, n (where n = \dim(V)). This fact will be explicit: there will be a “tableau basis” for the representation.
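For orientation, the two extreme shapes recover familiar functors: a single column gives the exterior power, and a single row gives the symmetric power,

\displaystyle{\mathbb{S}^{(1^k)}(V) = \bigwedge^k V, \qquad \mathbb{S}^{(k)}(V) = S^k V,}

with characters e_k and h_k respectively.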

As a corollary, we get an improved understanding of the Littlewood-Richardson numbers and the isomorphism \text{Rep}(\text{GL}) \to \Lambda between the representation ring and the ring of symmetric functions.

In particular:

  • The map takes \mathbb{S}^{\lambda} to s_\lambda, the Schur polynomial,
  • Since the map respects (tensor) products, the Littlewood-Richardson number c_{\lambda \mu}^\nu is the multiplicity of \mathbb{S}^\nu in the tensor product \mathbb{S}^\lambda \otimes \mathbb{S}^\mu. Equivalently,

\displaystyle{c_{\lambda \mu}^\nu = \dim_{\mathbb{C}} \text{Hom}(\mathbb{S}^\nu, \mathbb{S}^\lambda \otimes \mathbb{S}^\mu).}
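For example, since \mathbb{S}^{(1)}(V) = V, \mathbb{S}^{(2)}(V) = S^2 V and \mathbb{S}^{(1,1)}(V) = \bigwedge^2 V, the identity s_{(1)} s_{(1)} = s_{(2)} + s_{(1,1)} translates to

\displaystyle{V \otimes V \cong S^2 V \oplus \bigwedge^2 V,}

that is, c_{(1),(1)}^{(2)} = c_{(1),(1)}^{(1,1)} = 1.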

Finally:

  • If we allow twists by negative tensor powers of the determinant representation \det(V) = \bigwedge^n V, we also get all the rational algebraic representations of GL_n. These are the representations GL(V) \to GL(W) that are morphisms of algebraic groups. (This distinguishes them from the polynomial representations, which extend to morphisms from affine n^2-space, the space of all matrices.)
  • These are indexed by weakly decreasing sequences of (not necessarily positive) integers.

Let’s get started.

The Beginning: Sylvester’s Lemma

I want to try to motivate this construction. It generalizes the “exchange relation” I gave last post for \mathbb{S}^{(2,1)}, involving flag tensors. The key idea is that there are relations between wedges v_1 \wedge \cdots \wedge v_r and w_1 \wedge \cdots \wedge w_r when the subspaces spanned by the v_i‘s and w_j‘s are not sufficiently general.

The first step is the following:

Sylvester’s Lemma. Let v_1, \ldots, v_n, w_1, \ldots, w_n \in V, where \dim V = n. Let 1 \leq k \leq n. Then

\displaystyle{(v_1 \wedge \cdots \wedge v_n) \cdot (w_1 \wedge \cdots \wedge w_n) = \sum (v'_1 \wedge \cdots \wedge v'_n) \cdot (w'_1 \wedge \cdots \wedge w'_n),}

where the sum is over all ways of exchanging the first k vectors on the right, w_1, \ldots, w_k, for any k of the v_i‘s, preserving the ordering. (There are {n \choose k} summands.)

Note that this equation lives in S^2(\bigwedge^n V). (Special case: if n = k = 1, the identity reduces to v \cdot w = w \cdot v, which is true in S^2(V) and false in V \otimes V.)
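To make this concrete, take n = 2 and k = 1. The relation reads

\displaystyle{(v_1 \wedge v_2) \cdot (w_1 \wedge w_2) = (w_1 \wedge v_2) \cdot (v_1 \wedge w_2) + (v_1 \wedge w_1) \cdot (v_2 \wedge w_2),}

which, identifying \bigwedge^2 V \cong \mathbb{C} via the determinant, is the classical three-term identity \det(v_1, v_2)\det(w_1, w_2) = \det(w_1, v_2)\det(v_1, w_2) + \det(v_1, w_1)\det(v_2, w_2), easy to check directly on basis vectors.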

Fulton’s book has a very elegant coordinate-free proof, which I’ll briefly reproduce:

Proof. Let f be the difference of the two sides of the equation. Then, considered as a function of 2n vectors, f is obviously multilinear. It’s also easy to verify that f is alternating in v_1, \ldots, v_n. (To check, set v_i = v_{i+1}: each term where the two equal vectors land in the same wedge vanishes, and the remaining terms cancel in pairs with opposite signs.)
Now, we claim that, in addition, f is alternating on all of v_1, \ldots, v_n, w_1. To see this, define g to be the above expression with v_n = w_1 substituted in, considered as a function of 2n-1 vectors. By the same reasoning, g is alternating in v_1, \ldots, v_n. And g is also alternating on the larger set v_1, \ldots, v_n, w_2: if v_n = w_2, then three of the vectors coincide (v_n = w_1 = w_2), so everything vanishes. Thus g is alternating on n+1 vectors in an n-dimensional space, hence is identically zero. So f is alternating on the n+1 vectors v_1, \ldots, v_n, w_1, and the same dimension count shows f is identically zero.

Very nice! There are a few ways to strengthen this result. First of all, it didn’t matter that there were n vectors in the right-hand wedge, since the argument never used the vectors past w_2. So we have

Sylvester’s Lemma, v.2. The analogous relation holds in \bigwedge^n(V) \otimes \bigwedge^s(V), for any 1 \leq k \leq s.

Note that the case s = k = 1 is slightly different, since the key step — showing that the expression vanishes if v_n = w_1 — follows by inspection, not by alternation (g is alternating on only n vectors, which isn’t enough to force it to be zero, but it is in fact zero when you write it down).

To connect this to “flag tensors”, let’s extend this to \bigwedge^r V \otimes \bigwedge^s V, for tensors (v_1 \wedge \cdots \wedge v_r) \otimes (w_1 \wedge \cdots \wedge w_s) with

\text{Span}(v_1, \ldots, v_r) \supseteq \text{Span}(w_1, \ldots, w_s).

This is the “flag condition”. We have the following:

Sylvester’s Lemma, v.3. Let X \subseteq \bigwedge^r V \otimes \bigwedge^s V be spanned by “flag tensors”. Then the exchange relations hold on X, for any 1 \leq k \leq s. (Note: here, r \geq s.)

Proof. Given a flag tensor, let V_0 = \text{Span}(v_1, \ldots, v_r) be the larger subspace. Then the exchange relation actually lives in the space

\bigwedge^r V_0 \otimes \bigwedge^s V_0 \subseteq \bigwedge^r V \otimes \bigwedge^s V,

and now it follows from Sylvester’s Lemma, v.2 above.

There’s one final, separate improvement we can make, involving “incident-subspace tensors” rather than flag tensors — that is, tensors \alpha \otimes \beta \in \bigwedge^r V \otimes \bigwedge^s V such that \dim(\text{Span}(\alpha) \cap \text{Span}(\beta)) is big enough. I’ll discuss these briefly, but flag tensors are all I actually need.

In the basic argument we showed that the expressions g and f were alternating on v_1, \ldots, v_r, w_1. Well, in fact they’re also alternating on w_1, \ldots, w_k, the vectors being exchanged: those w_i‘s always end up in the same wedges, so if (for example) w_1 = w_2, then every single term in the relation is zero. So, g is actually alternating on r+k-1 vectors, hence vanishes if r+k \geq n+2. So:

Sylvester’s Lemma, v.4. Let r \geq s \geq k be such that r+k \geq \dim(V) + 2. Then the exchange relation holds on all of \bigwedge^r V \otimes \bigwedge^s V, for k exchanges.

Sylvester’s Lemma, v.5. Let r \geq s \geq k and let X \subseteq \bigwedge^r V \otimes \bigwedge^s V be spanned by “incident-subspace tensors” \alpha \otimes \beta, such that

\dim \text{Span}(\alpha) \cap \text{Span}(\beta) \geq s-k + 2.

Then the exchange relation holds on X for exchanges of k vectors.

Proof. Let V_0 = \text{Span}(\alpha,\beta) be the subspace generated by the r+s vectors in the relation. Then the relation lives in

\bigwedge^r V_0 \otimes \bigwedge^s V_0 \subseteq \bigwedge^r V \otimes \bigwedge^s V,

and since \dim(V_0) = r + s - \dim(\text{Span}(\alpha) \cap \text{Span}(\beta)), the inequality just says that r+k \geq \dim(V_0) + 2.

That’s enough Sylvester’s Lemma for now. Let’s put these together into actual Schur functors.

Schur Functors

Let \lambda be a partition with distinct column lengths, say \lambda^T = (c_1, \ldots, c_k), where c_1 > \cdots > c_k > 0. Then define the Schur functor

\mathbb{S}^{\lambda}(V) \subseteq \bigwedge^{c_1}(V) \otimes \cdots \otimes \bigwedge^{c_k} V

to be the subspace spanned by flag tensors, that is, tensors \alpha_1 \otimes \cdots \otimes \alpha_k with \text{Span}(\alpha_1) \supseteq \text{Span}(\alpha_2) \supseteq \cdots \supseteq \text{Span}(\alpha_k).
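For example, with \lambda = (2,1) we have \lambda^T = (2,1), so

\displaystyle{\mathbb{S}^{(2,1)}(V) \subseteq \bigwedge^2 V \otimes V}

is the span of the tensors (v_1 \wedge v_2) \otimes w with w \in \text{Span}(v_1, v_2) — exactly the flag tensors from the last post.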

The following facts are clear:

  • This is a sub-GL(V)-representation of the product of exterior powers.
  • It is functorial in V. (Note that if a map \varphi: V \to V' collapses the dimension of some piece of some flag tensor, then it will kill that element inside the exterior power.)
  • \mathbb{S}^{\lambda}(V) satisfies all the flag tensor exchange relations given by Sylvester’s Lemma, v.3 above.

To get the case where \lambda has repeated column lengths, say with the column length c_i occurring a times, replace \bigwedge^{c_i} V by the symmetric power S^a(\bigwedge^{c_i} V). (This is what’s going on in the original form of Sylvester’s Lemma, which lives in S^2(\bigwedge^n V).)
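For instance, \lambda = (2,2) has \lambda^T = (2,2), so the column length 2 is repeated twice and

\displaystyle{\mathbb{S}^{(2,2)}(V) \subseteq S^2(\bigwedge^2 V),}

which is precisely the home of the original Sylvester relation.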

Visualizing with Tableaux

We can think of a flag tensor as an element of V^{\times \lambda}. That is, we literally write the tensor as a Young diagram of shape \lambda, with vectors in each of the boxes. Each column corresponds to a piece of the flag.

The algebraic rules for manipulating these are:

  1. The diagram is multilinear in all of the boxes.
  2. The columns are alternating – that is, we can rearrange the entries of a column, introducing a minus sign with each transposition.
  3. The exchange relations hold in the following form. Fix two columns, say of sizes r, s, and 1 \leq k \leq s. Then the flag tensor/tableau T satisfies T = \sum T', where the sum runs over all ways of exchanging the first k entries of the right-hand (size s) column with any k entries in the left column.

This gives a very convenient (and compact) way of picturing elements of the Schur functor.

The Tableau Basis for \mathbb{S}^{\lambda}(V)

Fix a basis e_1, \ldots, e_n for V. We’ve already seen how to write down bases for \bigwedge^k V and S^k V in terms of a basis for V, which made it easy to read off the traces of the exterior and symmetric powers as GL-representations. Now we’ll do the same thing for all the Schur functors.

Lemma. The Schur functor is spanned by SSYTs in the basis e_1, \ldots, e_n. (An entry i in an SSYT means the vector e_i.)

Proof. First, use relation (2) to put all columns in increasing order. Strictly increasing order, in fact, since the tensor is zero if there are any repeats in a column. This gives us column-strictness.

To get the weakly-increasing-along-rows condition, use a careful choice of exchange relation (3): among all strict decreases along rows, find the one that is farthest to the right, then lowest down, and apply an exchange relation with just enough boxes to move the offending entry to the left. Continue in this manner; the details are in Fulton’s book.
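For a small example of one straightening step, take shape (2,1) and write the columns as wedges. The tableau with first column (2,3) and second column (1) has the row descent 2 > 1, and relation (3) with k = 1 exchanges the entry 1 with each entry of the left column:

\displaystyle{(e_2 \wedge e_3) \otimes e_1 = (e_1 \wedge e_3) \otimes e_2 + (e_2 \wedge e_1) \otimes e_3 = (e_1 \wedge e_3) \otimes e_2 - (e_1 \wedge e_2) \otimes e_3,}

and both tableaux on the right are semistandard.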

Next:

Lemma. The SSYTs are linearly independent.

Proof. Fulton proves this by explicitly realizing \mathbb{S}^{\lambda}(V) inside the “ring of matrix coefficients” \mathbb{C}[Z_{ij}], and using a monomial ordering.

Specifically, a column with entries j_1, \ldots, j_k corresponds to the top-justified k \times k minor of the matrix (Z_{ij}), using the top k rows and the columns j_1, \ldots, j_k. An SSYT corresponds to the product of the top-justified minors coming from its columns. Clearly, this is an explicit realization of a “flag tensor” in coordinates (the dimension-k step of the flag is spanned by the top k rows of the matrix).
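For example, the length-2 column with entries 1, 3 corresponds to the minor

\displaystyle{\det\begin{pmatrix} Z_{11} & Z_{13} \\ Z_{21} & Z_{23} \end{pmatrix} = Z_{11} Z_{23} - Z_{13} Z_{21},}

and the SSYT of shape (2,1) with columns (1,2) and (1) corresponds to the product (Z_{11} Z_{22} - Z_{12} Z_{21}) \cdot Z_{11}.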

Now, the standard lexicographic ordering of the monomials (English reading order on the matrix) distinguishes the SSYTs: each SSYT has a distinct leading monomial, so their images are linearly independent.

As a final note, let’s observe that this description specializes to the description of the bases for the exterior powers (when \lambda is a single column) and symmetric powers (when \lambda is a single row). The tableau basis is really in the same spirit.

Trace, Weight and the Schur polynomial

The nice thing about the tableau basis is that every SSYT is a “weight-vector”. Specifically, if x = \text{diag}(x_1, \ldots, x_n) is a diagonal matrix, so x \cdot e_i = x_i e_i, then the action of x on a tableau T is immediate:

x \cdot T = x^{w(T)} T,

where w(T) is the weight of T, the vector whose i-th entry counts the occurrences of i in T. So the trace of the representation, computed in the tableau basis, is

\displaystyle{ \chi(\mathbb{S}^{\lambda}(V)) = \sum_{T \in \text{SSYT}(\lambda,[n])} x^{w(T)} = \sum_{\mu} K_{\lambda\mu} m_{\mu} = s_{\lambda}(x_1, \ldots, x_n),}

the Schur polynomial in n variables (where n = \dim(V)), with K_{\lambda\mu} the Kostka numbers.
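If you want to play with this concretely, here’s a minimal sketch in plain Python (the function names and the dictionary representation of polynomials are my own choices, not anything from the references): it enumerates the SSYTs of a given shape and assembles the Schur polynomial, exactly as in the formula above.

def ssyt(shape, n):
    """Yield the semistandard Young tableaux of the given shape
    (a partition, e.g. (2, 1)) with entries in {1, ..., n}."""
    cells = [(r, c) for r, row_len in enumerate(shape) for c in range(row_len)]
    def fill(idx, tab):
        if idx == len(cells):
            yield tuple(tuple(row) for row in tab)
            return
        r, c = cells[idx]
        lo = 1
        if c > 0:                      # rows weakly increase
            lo = max(lo, tab[r][c - 1])
        if r > 0:                      # columns strictly increase
            lo = max(lo, tab[r - 1][c] + 1)
        for v in range(lo, n + 1):
            tab[r].append(v)
            yield from fill(idx + 1, tab)
            tab[r].pop()               # backtrack
    yield from fill(0, [[] for _ in shape])

def schur(shape, n):
    """The Schur polynomial s_shape(x_1, ..., x_n), as a dict mapping
    exponent vectors (weights) to coefficients (Kostka numbers)."""
    poly = {}
    for T in ssyt(shape, n):
        w = [0] * n
        for row in T:
            for v in row:
                w[v - 1] += 1
        w = tuple(w)
        poly[w] = poly.get(w, 0) + 1
    return poly

print(schur((2, 1), 2))    # {(2, 1): 1, (1, 2): 1}: s_{(2,1)}(x_1,x_2) = x_1^2 x_2 + x_1 x_2^2
print(len(list(ssyt((2, 1), 3))))    # 8 = dim S^{(2,1)}(C^3)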

Irreducibility

Let’s end with a discussion of why the Schur functors are irreducible representations. This uses the theory of Lie algebras and root spaces.

The basic idea is: given a representation \rho: GL(V) \to GL(W), we analyze its character by studying the action of the torus T \subseteq GL(V) of diagonal matrices. Since the torus is abelian (and acts semisimply), the representation splits as a direct sum of one-dimensional sub-T-representations; grouping together those with the same character gives the “weight spaces”.

On the Lie algebra side, we consider the induced Lie algebra representation \sigma : \mathfrak{gl}(V) \to \mathfrak{gl}(W), with the action of the Cartan subalgebra \mathfrak{t} of diagonal matrices. The story is even simpler: the possible torus weights form a lattice, organized by the “roots” of \mathfrak{gl}(V).

There’s now a notion of highest-weight vector (coming from ordering the basis e_1, \ldots, e_n: e_1 is highest, followed by e_2, and so on). These are vectors w \in W for which B \cdot w \subseteq \mathbb{C} \cdot w, where B is the Borel subgroup of upper-triangular matrices. In the case of the Schur functors, it’s easy enough to see that the unique highest-weight vector (up to scaling) is the SSYT with all 1s in the first row, all 2s in the next, and so on.
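For example, with \lambda = (2,1) and n = 3, the highest-weight tableau has columns (1,2) and (1), i.e. it is the flag tensor

\displaystyle{(e_1 \wedge e_2) \otimes e_1,}

of weight x_1^2 x_2. The raising operator E_{12} \in \mathfrak{gl}(V) (sending e_2 \mapsto e_1 and the other basis vectors to 0) kills it: acting by the Leibniz rule produces only the term (e_1 \wedge e_1) \otimes e_1 = 0.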

The following is the important fact:

Fact. A representation is irreducible if and only if it has a unique highest-weight vector (up to scaling). Furthermore, two irreducible representations with the same highest weight are isomorphic.

Moreover, in the case of the Lie algebra \mathfrak{gl}(V), the possible highest weights of polynomial representations are exactly the partitions (this comes from the theory of root systems of Lie algebras and “positive roots”, and the fact that \mathfrak{gl}(V) is comparatively simple).

So every irreducible polynomial representation corresponds to a unique partition, and we’ve exhibited one for each partition. So we’ve found them all.
