I. Introduction
Article I of the U.S. Constitution establishes “the principle of a House of Representatives elected ‘by the People,’ a principle tenaciously fought for ... at the Constitutional Convention.”
Wesberry v. Sanders,
One of the most challenging struggles has been to interpret the constitutional requirements for apportioning congressional districts. After all, “groups of voters elect representatives, individual voters do not.”
Davis v. Bandemer,
The Supreme Court, however, has never precisely defined the relevant “population” for the purposes of apportioning congressional representation. And behind this case there lies a theoretically difficult question: whether congressional districts must be of the same total population (the number of residents within each district) or of some different population that represents the number of votes cast in each district. Put differently, this is a choice between two conceptions of democratic equality, “electoral equality” and “equal representation.” 5
This question, however, comes to us clothed in a peculiar procedural posture and pled in such a way as to make the claim clearly without merit. For the reasons discussed below, we therefore affirm.
II. Factual and Procedural Background
New York State last redrew its congressional districts in 2002. See New York State Law § 111. Based on the 2000 Census, each district had the same total population of 654,360. 6 But, although all congressional districts in New York State have the same census data population, 7 the voting age populations vary widely from district to district.
Plaintiff, Michael Kalson, is a registered voter in the Fifteenth Congressional District in New York. Based on the 2000 Census, Mr. Kalson’s congressional district has a population of 654,361. Of these 654,361 people, 497,192 are 18 years or older (of voting age). Plaintiff is challenging any congressional district apportionment based on total population, arguing that such districts should instead be apportioned on the basis of voting-age population. He contends that his vote, cast in a district with 497,192 voting-age residents, counts for less than the votes of people in districts with fewer voting-age residents, 8 and that this dilution denies him his right to an equally weighted vote.
Mr. Kalson sued the Governor of New York and New York election officials for injunctive relief, asserting that the difference in voting-age population between congressional districts violates Article I, § 2 of the United States Constitution. Plaintiff’s claim relies primarily on dicta from
Wesberry v. Sanders,
Defendants moved for judgment on the pleadings pursuant to Fed.R.Civ.P. 12(c). In their memorandum of law supporting the motion for judgment, Defendants noted that although Plaintiff did not request a three-judge panel, 28 U.S.C. § 2284(a) normally requires that a three-judge panel adjudicate challenges to the constitutionality of congressional districts.
The District Court did not convene a three-judge panel but rather granted Defendants’ motion and entered judgment for them. The court found “no basis in the case law” for the claim that Article I, § 2 requires districts of equal voting-age population and therefore reasoned that Plaintiff failed to state “a substantial Constitutional claim.”
Plaintiff appealed. The appeal challenges only the merits of the District Court’s ruling. In supplemental letter briefing filed at our request, however, Plaintiff also argued that under 28 U.S.C. § 2284, a three-judge panel should have been convened to rule on his claim and that, as a result, the judgment on the pleadings must be vacated for lack of jurisdiction in the court below.
III. Discussion
A. Jurisdiction
Federal law requires that when “an action is filed challenging the constitutionality of the apportionment of congressional districts,” “[a] district court of three judges shall be convened.” 28 U.S.C. § 2284(a) (emphasis added). In ordinary circumstances, a single district court judge cannot adjudicate a case on the merits that is required to be heard by a three-judge court. 28 U.S.C. § 2284(b)(3). Under at least the predecessor version of the three-judge requirement, a single trial judge could decide such a case only on technical or jurisdictional grounds, such as a lack of standing or nonjusticiability.
See McLucas v. De Champlain,
At argument, Defendants-Appellees contended that the three-judge requirement is no longer a jurisdictional one. They concede that an earlier statute made the requirement jurisdictional but claim that the relevant statutory provisions were amended in 1976 and thereafter became “mandatory” but not jurisdictional. They therefore assert that while a district court must convene a three-judge court at the request of any party, the District Court here did not err by adjudicating the case on its own given that no party sought a three-judge court. If that is so, we, as an appellate court, have jurisdiction to treat this case like any other appeal and consider its merits.
The defendants’ position finds some support in the structure and language of 28 U.S.C. § 2284. The procedural portion of the statute, 28 U.S.C. § 2284(b), states only that “[u]pon the filing of a request for three judges” a panel must be convened. The section does not describe any procedure absent the filing of a request for three judges, thus suggesting that without such a request, a single judge might be empowered to adjudicate the case. 11
We nevertheless hold that the three-judge requirement in 28 U.S.C. § 2284 is jurisdictional. The text of 28 U.S.C. § 2284 uses typically jurisdictional language. Thus, 28 U.S.C. § 2284(a) states that a three-judge court “shall” be convened, a word that generally connotes a rule that is both mandatory and jurisdictional. And on that basis, the only Circuit that, to our knowledge, has ruled on the issue found the requirement to be jurisdictional.
See Armour v. Ohio,
There is, moreover, no reason to think that when in 1976 Congress amended the three-judge statute, it intended to make this imperative nonjurisdictional. In 1976, Congress vastly reduced the category of cases for which a three-judge court is mandated. In doing so, Congress gave no indication that it intended to alter the three-judge requirement, other than to reduce the category of cases in which it applied. Indeed, the Senate report stated that “the other powers here given the single judge, or expressly denied him, are similar to those stated in” the predecessor version of § 2284. S.Rep. No. 94-204, at 13 (1975). U.S.Code Cong. & Admin.News 1976, pp. 1988, 2001. The House of Representatives report made a similar statement,
see
H.R.Rep. No. 94-1379, at 7;
see also
S.Rep. No. 94-204, at 2, U.S.Code Cong. & Admin.News 1976 at p. 1989;
cf. LaRouche v. Fowler,
In light of the language “shall,” and the absence of any clear sign that Congress, when amending the statute in 1976, intended to make the three-judge court requirement nonjurisdictional, we hold that the rule that challenges to congressional apportionment be heard by a three-judge court remains jurisdictional.
B. Substantiality
Although 28 U.S.C. § 2284 is jurisdictional, it has long been held that a single judge may dismiss a claim that must normally be heard by a three-judge court if it is “insubstantial.”
See McLucas,
The contours of an “insubstantial” constitutional claim are, concededly, not well defined. As Judge Friendly observed, “these tests cannot be of mathematical precision.”
Green,
Our limited treatment of the “insubstantial” standard makes use of similar language.
See Loeber v. Spargo,
C. Plaintiff’s Case
Plaintiff’s theory that Article I requires electoral equality presents an interesting and plausible view of that Article’s apportionment requirement. And, although the Supreme Court has certainly indicated that Plaintiff’s argument must fail, 16 the theory is not strictly foreclosed by existing precedent from either the Supreme Court or this circuit. Nor is it “obviously frivolous.” Accordingly, it might well not qualify as “insubstantial” for purposes of jurisdiction.
Plaintiff’s actual claim, however, is another story. As he has pled it, Mr. Kalson is arguing specifically that Article I requires that congressional districts be apportioned by voting-age population. Even assuming, arguendo, that districts must be apportioned to create, or even just to approximate, an equal number of voters, it does not follow at all that districts should be apportioned by voting-age population. Were it true, as Plaintiff argues, that Article I creates an individual right to an equally weighted vote, that right is not vindicated by having districts of equal voting-age population. Many persons of voting age cannot vote, such as felons, ex-felons, and noncitizens, 17 and many eligible voters choose not to vote. 18
Plaintiff acknowledges that districts of equal voting-age population do not necessarily mean that there will be an equal number of votes cast within each congressional district or even that there will be an equal number of eligible voters in each district. Plaintiff argues instead that unequal voting-age populations are a distinct wrong and that he is not required to plead relief for other wrongs. This argument is meritless. The only claim that is at all plausible must depend on Article I and on the form of equality that it mandates; it could be of total population, it might be of eligible voters, or even perhaps of actual voters. But there is no reason that Plaintiff has presented that supports the thesis that Article I requires districts equal in voting-age population.
Significantly, Plaintiff does not assert that voting age is the best available proxy for actually equal voting power. And that is enough to make the claim before us insubstantial. One may question whether, even if Plaintiff had pled that equal voting-age population is the best proxy available, that assertion would be plausible. There are districts in which a large proportion of the voting-age population are noncitizens or felons ineligible to vote.
See, e.g., Hayden,
Because Plaintiff has not pled facts that would support an argument that if his theory of electoral equality were correct, apportionment by voting-age population would best achieve that electoral equality, his claim is insubstantial. Accordingly, the District Court did not err in denying him relief, and the judgment below is Affirmed.
Notes
. Although the populations must be the same, practical considerations do still come into play. For example, in states like New York where the total population is not perfectly divisible by the number of congressional districts, some districts have a population that is one person larger than others.
See Rodriguez v. Pataki,
No. 02 CIV. 618(RMB),
. Apportionment for state and local elected bodies, which is governed only by the Equal Protection Clause, is given more leeway as to population differences than congressional apportionment, which is governed by both the Equal Protection Clause and Article I, § 2. The equal protection and Article I, § 2 requirements are not precisely the same. Because judicial review of congressional apportionment could be based on Article I, § 2, which explicitly describes the right of the people to elect Congress, the Court has held that this stronger textual warrant justifies a stronger rule. The Court has also suggested a functional difference: congressional districts are larger than state districts, and therefore it is both less feasible and less sensible to design congressional districts consistently with existing political boundaries.
See White v. Weiser,
. Judge Kozinski used these terms in
Garza v. County of Los Angeles,
Although these two conceptions often go hand in hand, they are by no means equivalent. Given a population of any size, there are almost always some number of people who cannot vote (such as children, noncitizens, felons, or ex-felons) and some number of people who are eligible to vote but choose not to do so. Two districts of the same population, districts A and B, may ensure equal representation, but if district A has substantially more voters than district B, the voters in district B each individually have seemingly greater voting power. Thus, if representatives are apportioned to districts of the same total population in an area in which potentially eligible voters are unevenly distributed, the result will necessarily devalue the votes of individuals in the area with a higher percentage of potentially eligible voters.
In contrast, two districts with the same number of voters may ensure electoral equality, but if district A has a substantially larger population than district B, then there are three interrelated putative harms. First, residents of district A theoretically must compete more for the attention of their representative. Second, assuming that representatives can each provide the same amount of federal goods for their district, then district A will receive the same government benefits despite its larger population and, as a result, residents of district A will each receive a smaller share than residents of district B. Third, assuming that representatives are believed to represent every person in their district, then despite the fact that it has a larger population, district A's political power will not adequately reflect its size.
In many instances, population and voting population are rough proxies for each other. But this is not always the case. Although small differences are permissible under the Equal Protection Clause, and thus imperfect proxies would generally not pose a problem, Article I, § 2 requires virtually exact equality, see supra notes 3, 4, making it necessary to know which measure(s) of population are constitutionally permissible.
. There were population deviations plus or minus one person. Because congressional seats are allocated in whole numbers and cannot span state boundaries, total district populations, although essentially identical within a state, vary between states. For example, in Connecticut each district has a population of 681,113. See “110th Congressional District Summary File (100-Percent),” available at http://factfinder.census.gov/home/saff/main.html?_lang=en.
. See supra note 3.
. For example, the Sixteenth Congressional District, in New York, has a voting age population of 428,285.
. Although the statutes to which these cases referred, such as 28 U.S.C. §§ 2281, 2282, preclude on their face the issuance of an injunction by a single judge, they do not directly mandate a three-judge panel to dismiss or deny an injunction. The Supreme Court has, however, interpreted them to mean that a three-judge court is required for ruling on the merits of any injunction request in such cases.
See Goosby,
. The fact that neither party raised a jurisdictional issue on appeal is of no matter; we are obligated to determine whether jurisdiction exists nostra sponte. See Arbaugh v. Y & H Corp.,
. At least some academic commentators and judges have criticized treatment of the three-judge requirement as jurisdictional, noting that to convene a three-judge court when no party requests it and to throw out an otherwise valid judgment on appeal is “wasteful and senseless.”
See
17 Charles Alan Wright, Arthur R. Miller, Edward H. Cooper,
Federal Practice and Procedure
§ 4235. Professor David Currie described it as “the most obnoxious feature of the entire three-judge scheme,” David Currie,
The Three-Judge District Court in Constitutional Litigation,
32 U. Chi. L.Rev. 1, 77 (1964), and Judge Friendly termed the need to remand a case decided without objection by either party, “bizarre,”
Sardino
v.
Fed. Reserve Bank of New York,
. The report from the House Judiciary Committee does not discuss why reapportionment cases were kept in three-judge courts. See Wright, Miller, Cooper, & Amar, § 4235 n. 8 (citing H.R.Rep. No. 94-1379).
. Although the early cases establishing the “insubstantial claim” exception arose under the pre-1976 versions of the three-judge court requirement, there is no reason to think that the 1976 amendment intended to alter the exception.
Cf. LaRouche,
. In the case before us, the District Court did not dismiss for lack of subject-matter jurisdiction, nor did the court explicitly state that the claim was inappropriate for a three-judge court; rather, the court adjudicated the case on the merits of the pleadings per Fed.R.Civ.P. 12(c). If the theory behind the substantial federal claim exception is that the federal courts lack jurisdiction, then one might deem it curious that a judge could decline to convene a three-judge court on the grounds that there is no federal jurisdiction but then rule on the merits of the pleadings under Rule 12(c). A judgment on the pleadings may, however, include a judgment on the grounds of failure to state a claim upon which relief can be granted,
see
Fed.R.Civ.P. 12(h)(2);
see also Burnette v. Carothers,
. Some of our earlier decisions used slightly looser language in affirming decisions that declined to convene a three-judge court.
See, e.g., Abele v. Markle,
. Most on point, in
Kirkpatrick v. Preisler,
Moreover, in cases where the Supreme Court has evaluated population disparities between congressional districts, it has always done so by looking at total population.
See, e.g., Vieth,
Indeed, the reasoning of Wesberry, the case on whose language Plaintiff relies, suggests that Plaintiff’s claim is incorrect. The Supreme Court’s mandates of equally populous congressional districts began in Wesberry, when the Court analyzed Art. I, § 2 in “its historical context.”
Additionally, it is worth noting that in challenges to apportionment of state and local government bodies (that were brought as equal protection claims rather than under Article I), several circuits have held that total population is at the very least a permissible measure.
See Chen v. City of Houston,
. And there is often a significant geographic disparity with regard to groups that cannot vote. For example, in New York, many incarcerated prisoners come from urban centers but the census on which apportionment is based counts "incarcerated prisoners as residents of the communities in which they are incarcerated,” and thus may increase "upstate New York regions' populations at the expense of New York City's.”
Hayden v. Pataki,
. Even in 2004, an election that drew a very large voter turnout, turnout was at about 60% of eligible voters. See Comm. for the Study of the Am. Electorate, President Bush, Mobilization Drives Propel Turnout to Post-1968 High 12 (2004), available at http://www.american.edu/ia/cdem/csae/pdfs/csae041104.pdf.
