The Government brings this interlocutory appeal of a pretrial ruling excluding evidence in a criminal prosecution that charged Arnold Katz (“Katz”) with violation of 18 U.S.C. § 2252(a)(2), receipt of child pornography. We affirm.
I. FACTS AND PROCEDURAL HISTORY
The government alleges that the following facts will be proven at trial.
On November 23, 1994, Katz posted a message on an Internet bulletin board, stating that he had homemade “pornos” and was interested in trading with others. An undercover customs agent responded and arranged to exchange videos with Katz. On April 7, 1995, agents executed a controlled delivery of a package containing a videotape entitled “Masturbating Lolita” and a computer disk containing eleven Graphic Image Files (“GIFs”) to Katz at his residence, which became the subject of Count II (receipt of child pornography). The Government also seized a videotape entitled “Dream Teens,” that Katz sent to the undercover agent which became the subject of Count I (distribution of child pornography). 1
At issue is whether the government’s evidence is sufficiently reliable that a jury could conclude beyond a reasonable doubt that the models depicted in the evidence were less than 18 years old at the time the images were produced. Katz filed a Daubert motion
pursuant to Federal Rules of Evidence 403 and 702, to exclude all expert witness testimony purporting to determine the age of the persons portrayed in the [evidence] upon the basis of the application of the “Tanner Scale” to review of a visual depiction. The application of the “Tanner Scale” to a visual depiction for the purpose of determining the age of the person depicted is not valid and reliable scientific methodology and does not comport with the requirements of evidentiary reliability articulated by the Supreme Court in Daubert v. Merrell Dow Pharmaceuticals, Inc., [509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469] (1993). The accused moves for a Daubert hearing on this issue pursuant to Federal Rules of Evidence 104(a) and 702.
The trial court set a Daubert hearing on the motion on March 11, 1997, 2 wherein it was developed that the Tanner Scale of Human Development for females is the recognized scientific test utilized for determining the age of postpubescent Caucasian females and consists of separately rating, on scales of 1 to 5, breast development and pubic hair development, with Stage 1 being pre-adolescent and Stage 5 being adult. However, the government’s expert witness testified that he could not use the Tanner Scale breast development scale for determining the age of the models in question because the age bands were too wide. For instance, the Tanner stage 5 breast development band encompasses ages 12 through 19. Further, the Tanner Scale is valid as to Caucasians, but it is not valid as to all ethnic groups. After hearing testimony, the parties stipulated and the district court found that the Tanner Scale has been subject to peer review and publication, that it is a scientifically valid methodology for determining the age of individuals, and that the Government’s expert, Dr. Woodling, was qualified to perform Tanner Scale analysis. Whether the Tanner Scale analysis could be adequately performed on the images in evidence remained in dispute. At the close of the hearing, the district court concluded there was sufficient ability to visualize the Tanner Scale criteria to permit the expert to express a reliable opinion whether the models were less than 18 years old and preliminarily determined that the videotape and the expert witness testimony were admissible.
A second hearing was conducted on December 1-4, 1997, immediately prior to the scheduled trial, to resolve all remaining evidentiary issues. The district court reaffirmed that the videotape and government’s expert testimony were admissible, which ruling is not challenged in this interlocutory appeal. The government brings this appeal challenging two district court rulings relating to the inadmissibility of the GIF files.
On the evening of December 1, 1997, the government turned over to defendant a computer disc containing the GIFs. The government chose five of the eleven GIF images from the computer disk to introduce at trial and at the second hearing, labeling them 1-A, 1-B, 1-C, 1-D, 1-E. Katz objected to the admission of the five color “photos” 3 from the GIF files which the government proposed using as exhibits because the government had provided only poor quality black and white versions of these images to the defense during discovery. The district court ruled that, as a sanction for failure to timely disclose the color images to the defendant, those images would not be admissible. However, for purposes of the pretrial hearing, the district court permitted the government to use a set of “better quality black and white photos” in place of the poorer quality images originally turned over to defense counsel and actually utilized the color versions at various times during the lengthy hearing.
In its rulings at the close of the December 1997 hearing, the district court enmeshed its Daubert analysis with a Federal Rule of Evidence 403 weighing of probative value against potential for prejudicial effect. After considering the GIF images and the testimony of the government’s expert, the district court concluded that the black and white images were inadmissible at trial pursuant to Federal Rule of Evidence 403 because they lacked sufficient clarity to determine the models’ ages under the Tanner Scale and therefore their probative value was outweighed by their prejudicial effect. Specifically, Dr. Woodling was unable to apply the Tanner Scale pubic hair analysis to 1-A because the poor quality of the photo precluded him from determining whether any of the model’s pubic hair had been removed. Participants in the production of child pornography may manipulate the appearance of a model’s pubic hair to make an older model look younger, thus impacting on the validity of the Tanner pubic hair development scale. The ethnicity of the model in 1-B was uncertain, and the district court held that the scientific methodology of the Tanner Scale was not sufficiently verified on non-Caucasian individuals. The district court found that the poor quality of the images and the models’ position in 1-C, 1-D and 1-E precluded the application of Tanner Scale pubic hair analysis. It is difficult to determine from the record which set of images some of the expert’s testimony referred to. Regardless, the expert was asked several times whether his testimony would change if he were to base it on the excluded color photos. He testified that it would not. As to one of the images in 1-E, the district court noted that the amount of pubic hair appeared quite different depending on whether the court viewed the black and white image or color image, and concluded that this discrepancy illustrated the lack of reliability of the images in depicting the actual appearance of the person shown. 
In summary the district court concluded that problems with visibility attributed to the angles of the photos and the quality of the prints precluded utilization of the Tanner scale and thereby greatly reduced the probity of the exhibits. Probity and reliability became inextricably linked which, when balanced against prejudice, tilted the scale toward excluding not only Dr. Woodling’s opinion testimony relating to the GIFs but the exhibits themselves.
The district court granted the government’s motion to stay the trial pending the interlocutory appeal of orders excluding the GIF images.
We would be remiss if we did not note that we are troubled by the amount of judicial resources that were devoted to the Daubert hearing. In a case capable of being tried start to finish in a day and one half, not only the court but the lawyers were engaged for the better part of five days in a hearing to determine the reliability of testimony and potential prejudice of exhibits involving a well known test that is applied in a quite straightforward manner. Daubert hearings in cases much more complex than this one are customarily conducted with dispatch, consuming only a few hours at best.
II. DISCUSSION
A. SANCTION FOR LATE DISCLOSURE OF EVIDENCE
The government challenges the district court’s ruling excluding the color versions of the GIF images. The district court found that the government’s failure to disclose the “photographs” to the defendant in the identical form it intended to produce them at trial was either an attempt to “sandbag” the defense or highly unprofessional conduct and therefore limited the government to the use of black and white images.
We review remedies for discovery violations imposed by a district court for abuse of discretion.
See United States v. Bentley,
First, the government reiterates its explanation for the delay presented at the hearing. Prior to the return of the superseding indictment, the black and white copies had been given to defense counsel as potential Rule 404(b) evidence. After the superseding indictment, which elevated the GIFs to intrinsic rather than extrinsic evidence, defense counsel never requested better images or copies of the computer disc. The district court rejected this explanation, finding instead that the reason disclosure was not made was the government’s attempt to sandbag the defense or highly unprofessional conduct. This finding is not clearly erroneous.
Second, the district court made repeated inquiry into whether its order would result in prejudice to the government by asking the government’s expert whether his testimony would be different if he were to base his answers on the color photos rather than the black and white photos under consideration at the pretrial hearing. The expert testified repeatedly that it would not.
Third, although the district court made no specific findings on this factor, potential for prejudice to the defendant was high because of the unique circumstances of this case: the computer disc had never been in Katz’s possession, so the defendant had no information about the contents of the images other than what he learned during discovery. Further, Katz’s defense was premised on expert testimony concerning the age of the models and it was necessary for his expert to examine the evidence and formulate an opinion prior to trial.
All three of these factors weigh in favor of affirming the district court’s ruling. However, the government argues that a continuance would have been an appropriate and less severe sanction than exclusion of the evidence.
See Sarcinelli,
B. EXCLUSION OF EVIDENCE UNDER RULE 403
Federal Rule of Evidence 403 provides:
Although relevant, evidence may be excluded if its probative value is substantially outweighed by the danger of unfair prejudice, confusion of the issues, or misleading the jury....
We review district court rulings excluding evidence for abuse of discretion.
See United States v. Pace,
Implicit in the district court’s ruling is the finding that the age of the models had to be determined by expert testimony. The ruling was no doubt influenced by the government’s position embracing the need for, and advocating the admissibility of, expert testimony on this issue. On appeal, the government changes its position and argues that a lay jury could determine the age of the post-puberty models without any assistance from its own expert, citing United States v. Lamb,
The threshold question — whether the age of a model in a child pornography prosecution can be determined by a lay jury without the assistance of expert testimony — must be determined on a case by case basis. As the government correctly points out, it is sometimes possible for the fact finder to decide the issue of age in a child pornography case without hearing any expert testimony.
See United States v. O’Malley,
In addition, the government argues the district court erred in limiting its expert to opinions based on the Tanner Scale pubic hair development. Although the expert testified that he could not perform the Tanner Scale pubic hair analysis on the GIF images, he was willing to give an opinion concerning the models’ ages based on breast development and general body habitus (the body’s shape, size and distribution of body fat). However, the expert also testified that the Tanner Scale breast stages are not scientifically useful in determining the age of the models because the range of ages for each stage was too broad and extended beyond the age of 18. The district court did not abuse its discretion in determining that the probative value of the images was compromised by the inability of the government’s expert to determine the age of the models using the portion of the Tanner scale which his own testimony advocated as scientifically valid and reliable. We do not mean to imply that, as a matter of law, the Tanner Scale pubic hair development scale is the only reliable basis for judging the age of models in child pornography cases. Rather, we hold only that a fair reading of the extensive record in this case reveals that Dr. Woodling’s expertise was linked to Tanner scale methodology, and that the district court did not abuse its gate-keeping function in limiting his opinion testimony to that methodology. However, the government is not precluded from attempting to persuade the district court that some other witness can express a reliable opinion concerning the age of the models using scientifically valid methodology that is not dependent on the Tanner Scale.
Finally, the government argues that because the GIF images are res gestae, they are particularly probative, and the district court erred in performing the weighing task required under Rule 403. The indictment alleges violation of 18 U.S.C. § 2252(a)(2), which requires proof that defendant knowingly and intentionally received visual depictions of children under the age of eighteen engaged in sexually explicit conduct. The government argues that the evidence is probative of elements of the crime charged other than the age of the models, specifically, that the defendant received depiction of individuals “engaged in sexually explicit conduct.” Because that element is not disputed and because the district court ruled admissible a videotape that fulfills the government’s burden on that element, we cannot say that the district court abused its discretion in rejecting the government’s argument that the res gestae nature of the images in question gave overwhelming weight to their probative value in spite of their prejudicial nature.
See Campbell v. Keystone Aerial Surveys, Inc.,
In sum, it was not an abuse of discretion to exclude images that, according to the government’s own expert, depict models whose ages are not susceptible to evaluation using the scale that the same expert advocates as scientifically reliable.
III. CONCLUSION
Based on the foregoing, we affirm the district court’s exclusion of the GIF images.
AFFIRMED.
Notes
1. Count I was dismissed on December 2, 1997. Count II, receiving child pornography through the mail, is the only currently pending count.
2. The district court considered evidence from Counts 1 and 2 at the March Daubert hearing, as Count 1 had not yet been dismissed.
3. The images printed from the computer disk are referred to in the record as “photos.” The color “photos” were produced by printing the GIFs from a computer with a color printer. The black and white images were produced by copying the computer generated color images on an ordinary office copier.
