CONCORD MUSIC GROUP, INC., et al., Plaintiffs, v. ANTHROPIC PBC, Defendant.
Case No. 24-cv-03811-EKL (SVK)
UNITED STATES DISTRICT COURT NORTHERN DISTRICT OF CALIFORNIA
May 23, 2025
ORDER ON JOINT DISCOVERY SUBMISSIONS
Re: Dkt. Nos. 328, 341, 345
Before the Court are the Parties’ Joint Discovery Submissions regarding three discovery disputes. First, Defendant Anthropic PBC (“Anthropic”) seeks to compel Plaintiffs (“Publishers”) to disclose and produce currently undisclosed prompts to and outputs from Claude (and related settings) that Publishers and their agents submitted and obtained in the course of their legal investigation. Dkt. 328 at 2 (relating to Interrogatory no. 1 and Requests for Production (“RFP”) nos. 37-43). Second, the Parties disagree as to the appropriate scope of the sampling protocol in response to this Court’s prior order. Dkt. 341; see also Dkt. 318. Third, Publishers challenge certain confidentiality designations made by Anthropic as overbroad and improper. Dkt. 345. These matters came on for hearing on May 13, 2025. Having considered the Parties’ arguments, the relevant law and the record in this action, the Court DENIES without prejudice Anthropic’s request to compel production of Publishers’ prompts, outputs, and settings; ORDERS production of a sample of 5 million prompt-output pairs according to the protocol set forth below; and SUSTAINS-IN-PART and OVERRULES-IN-PART Publishers’ challenge to Anthropic’s confidentiality designations.
I. DKT. NO. 328: ANTHROPIC’S MOTION TO COMPEL PRODUCTION OF PUBLISHERS’ UNDISCLOSED PROMPTS AND OUTPUTS AND THE SETTINGS USED THEREFORE
A. Legal Standard
However, “[t]he privilege derived from the work-product doctrine is not absolute. Like other qualified privileges, it may be waived.” United States v. Sanmina Corp., 968 F.3d 1107, 1119 (9th Cir. 2020). “Similar to the waiver of the attorney-client privilege, a litigant can waive work-product protection to the extent that he reveals or places the work product at issue during the course of litigation.” Id. In other words, under the “fairness principle,” a party cannot “us[e] the privilege as both a shield and a sword.” Id. at 1117 (explaining the fairness principle in context of attorney-client privilege).
B. Discussion
In this case, Publishers conducted a pre-suit investigation into potential infringement by Anthropic and relied upon certain prompts and outputs from their investigation in support of their allegations of infringement. See Dkt. 1; Dkt. 337-2 (Ex. B); Dkt. 369 at 1-2. Publishers represent that they have produced all prompts and outputs on which they have relied, totaling nearly 5,000 prompt-output pairs. Dkt. 369 at 2-3. Anthropic has now served broad discovery requests directed to all prompts and outputs including those not relied upon by Publishers — in other words, those that presumably did not support claims of infringement. See Dkts. 328-1-2 (Interrogatory no. 1 and RFP nos. 37-43). In the joint statement, Anthropic argues that either (a) the unrelied-upon prompts/outputs are not privileged or (b), if privileged, that the privilege has been waived. Dkt. 328 at 1-5. On the record before it, the Court disagrees.
The closer issue is the extent to which Publishers have waived the attorney work product protection. Publishers admit that they have relied on certain prompts and outputs in their First Amended Complaint (Dkt. 337) and in various other filings, including their original and renewed motions for preliminary injunction and supporting declarations but also represent that they have produced all prompts and outputs relied upon. See Dkt. 369 (“Publishers have produced all Claude prompts and outputs from their investigation on which they have relied upon to date. In total, Publishers have produced approximately 4,659 Claude prompt/output records....”); see also Dkt. 372 (“Hrg. Tr.”) at 50:16-51:18. Under the “fairness principle” or sword-and-shield doctrine, there has been at least a limited waiver here; indeed, Publishers have produced nearly 5,000 prompt-output pairs upon which they rely. The issue is how far the waiver reaches.
The Ninth Circuit has made clear that “the scope of [a work product] waiver must be ‘closely tailored... to the needs of the opposing party’ and limited to what is necessary to rectify any unfair advantage gained....” Sanmina, 968 F.3d at 1124. Anthropic claims waiver by pointing to Publishers’ “ease of use” or “massive use” allegations in the complaint — for example, the allegation that “Anthropic knowingly trained its AI models on infringing content on a massive
This Court therefore DENIES Anthropic’s requests without prejudice to Anthropic seeking to discover, as necessary, specific facts supporting specific contentions or opinions disclosed in the course of the litigation.
II. DKT. NO. 341: SAMPLING PROTOCOL FOR RFP NOS. 50-51
This dispute arises from Publishers’ requests for prompts and outputs from Claude products that relate to song lyrics, for which this Court previously ordered that a “statistically significant sample would be proportional to the needs of the case.” See Dkts. 306, 318. The Parties have submitted competing proposals along with expert declarations, and the Court has had an opportunity to question the experts. See Dkts. 341, 341-2, 352, 372.
At the outset, the Court notes that during the hearing, Publishers asked this Court to examine Anthropic’s expert, Ms. Chen, and strike her declaration because at least one of the citations therein appeared to have been an “AI hallucination”: a citation to an article that did not exist and whose purported authors had never worked together. See Hrg. Tr. at 7:11-11:25. The Court gave Anthropic time to investigate the circumstances surrounding the challenged citation. Having considered the declaration of Anthropic’s counsel and Publishers’ response (Dkts. 371, 373), the Court finds this issue is a serious one, if not quite so grave as it at first appeared.
Anthropic’s counsel protests that this was “an honest citation mistake” but admits that Claude.ai was used to “properly format” at least three citations and, in doing so, generated a fictitious article name with inaccurate authors (who have never worked together) for the citation at issue. Dkt. 371, ¶¶ 3, 6. That is a plain and simple AI hallucination. Yet the underlying article exists, was properly linked to and was located by a human being using Google search; so, this is not a case where “attorneys and experts [have] abdicate[d] their independent judgment and critical thinking skills in favor of ready-made, AI-generated answers....” Contra Dkt. 373 at 2 (quoting Kohls v. Ellison, No. 24-cv-3754 (LMP/DLM), 2025 WL 66514 at *4 (D. Minn. Jan. 10, 2025)). A remaining serious concern, however, is Anthropic’s attestation that a “manual citation check” was performed but “did not catch th[e] error.” Dkt. 372, ¶ 6. It is not clear how such an error, including a complete change in article title, could have escaped correction during manual cite-check by a human being. Furthermore, although the undersigned’s standing order does not expressly address the use of AI by parties or counsel, Section VIII.G of Judge Lee’s Civil Standing Order requires a certification “that lead trial counsel has personally verified the content’s accuracy.” J. Lee’s Civil Standing Order, § VIII.G (emphasis added). Neither the certification nor
Turning to the substance of the dispute, based upon the Parties’ agreed-upon parameters, the Court adopts the values of a 95% confidence level corresponding to a “Z-score” of 1.96 and an expected prevalence of 0.00006. Dkt. 352, ¶ 12; Dkt. 341-2, ¶¶ 5, 12. The disagreement between the Parties is with respect to the margin of error: Anthropic offers a 25% margin of error, while Publishers would require a 5% margin. Dkt. 3, 7. During the hearing, the Parties’ experts testified that there was room within the 5 to 25% (Publishers’ expert Mr. Buchan) or 10 to 50% (Ms. Chen) range, i.e., agreeing that anywhere from 10 to 25% would be statistically valid. See Hrg. Tr. at 26:19-28:1; 32:21-33:2. Based upon the foregoing, the Court determines that a margin of error of approximately 11.3% is within the range that will yield a representative sample and will appropriately balance the associated burden against the needs of the case. Calculating a sample from these values according to the experts’ formulas, the Court ORDERS Anthropic to produce a sample in accordance with the following protocol:
- Anthropic shall produce a total of 5 million prompt-output pairs (i.e., each containing a given prompt and the corresponding Claude output);
- These prompt-output pairs shall be drawn equally from pre-suit and post-suit data, i.e., with 2.5 million records from between September 22 and October 18, 2023, and 2.5 million records from between October 19, 2023 and March 22, 2024; and
- These prompt-output pairs shall be randomly selected from the respective periods.
Anthropic will produce this sample as soon as practicable and no later than the deadline for substantial completion of remaining document productions, currently July 14, 2025.
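The arithmetic behind the 5 million figure can be illustrated with a minimal sketch. The order does not reproduce the experts’ formulas, so the sketch assumes the standard sample-size formula for estimating a proportion, with the margin of error treated as a fraction of the expected prevalence; that reading is an assumption, as are the function names `required_sample_size` and `implied_relative_margin`. Under that assumption, the adopted values (Z-score of 1.96, expected prevalence of 0.00006, margin of approximately 11.3%) yield a sample of roughly 5 million records.

```python
import math


def required_sample_size(z: float, prevalence: float, relative_margin: float) -> int:
    """Sample size for estimating a proportion (assumed standard formula).

    The margin of error is taken relative to the expected prevalence,
    which is an assumption about how the experts framed it.
    """
    absolute_margin = relative_margin * prevalence
    n = (z ** 2) * prevalence * (1 - prevalence) / (absolute_margin ** 2)
    return math.ceil(n)


def implied_relative_margin(z: float, prevalence: float, n: int) -> float:
    """Inverse of the above: relative margin of error implied by a sample of size n."""
    standard_error = math.sqrt(prevalence * (1 - prevalence) / n)
    return z * standard_error / prevalence


if __name__ == "__main__":
    z = 1.96              # 95% confidence level, as adopted by the Court
    prevalence = 0.00006  # expected prevalence, as adopted by the Court

    # Margins proposed by the Parties (25%, 5%) and the margin the Court selected (11.3%).
    for margin in (0.25, 0.113, 0.05):
        print(f"margin {margin:.1%}: n = {required_sample_size(z, prevalence, margin):,}")

    # Relative margin implied by a sample of exactly 5 million records.
    print(f"implied margin at 5,000,000: {implied_relative_margin(z, prevalence, 5_000_000):.2%}")
```

On this reading, a 25% margin would call for roughly 1 million records and a 5% margin for more than 25 million, which places the ordered 5 million between the Parties’ positions.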
III. DKT. NO. 345: ANTHROPIC’S CONFIDENTIALITY DESIGNATIONS
With regard to the final dispute, Publishers challenge three categories of “Highly Confidential—Attorneys’ Eyes Only” (“HC-AEO”) designated materials: (1) Anthropic’s 9,418 Claude prompt-output records produced to date; (2) various Claude use statistics and figures; and (3) two training datasets used by Anthropic. Dkt. 345 at 1. Publishers’ concerns regarding over-designation by Anthropic are not unfounded. The protective order (“PO”) in this case requires the Parties to “take care to limit any [confidentiality] designations to specific material that qualifies under the appropriate standard [and, to] the extent practicable... designate for protection only those parts of material documents, [etc.] that qualify.” Dkt. 293, § V.1. While some level of confidentiality is appropriate for the materials, a blanket HC-AEO designation is not.
First, with regard to Claude prompts and outputs generally, the Court is cognizant not only of the roughly 9,000 prompts produced to date but also of the forthcoming production of the 5 million record sample. This Court has previously recognized that Anthropic’s “users have a privacy interest at stake” in their use of Claude. Dkt. 318 at 4. Anthropic does not object to down-designating prompts and outputs to “Confidential” rather than “HC-AEO,” and Publishers are unable to identify prejudice apart from the assertion that it makes quoting from prompts and outputs in public filings more difficult. Accordingly, recognizing that it would be impractical if not impossible to require Anthropic to review and redact each prompt-output pair to maintain users’ privacy, Anthropic may designate prompts and outputs categorically as “Confidential” at the outset under the PO. However, Anthropic is cautioned that the standard for designation under the PO and this Court’s recognition of a general privacy interest are separate from any inquiry into whether documents in the public record should be sealed.
For the second category, Anthropic’s use of the “HC-AEO” designation for Claude use
Third and finally, the Court agrees with Anthropic that, as a general matter, the training datasets used to train Claude are competitively sensitive and may properly be maintained as “HC-AEO” material. Of course, if Anthropic has publicly disclosed the training datasets at issue, then confidentiality is not appropriate. But Anthropic maintains that it “has never publicly disclosed using either dataset to train Claude.” Dkt. 345 at 10. Neither Publishers’ citations to what Anthropic’s competitors have disclosed about their products, nor their citations to a 2021 article that predates Claude and does not mention Claude, refute this representation. Accordingly, Anthropic may maintain this information as “HC-AEO” material.
SO ORDERED.
Dated: May 23, 2025
SUSAN VAN KEULEN
United States Magistrate Judge
