Improve web benchmarks measurements (#127900)

By default, the browser fuzzes the timer APIs so that they have a granularity of approximately 100 microseconds (a Spectre mitigation). However, many of the things we are trying to measure have a much finer granularity than 100 microseconds. As a result, many of our benchmarks are extremely noisy and don't provide accurate data. By serving the initial script files with the `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp` HTTP headers, the browser runs the benchmarks in a `crossOriginIsolated` context, which restores the fine granularity of APIs such as `performance.now()` to microsecond precision.

Also, we were treating any value more than one standard deviation away from the mean as an outlier. In a normal distribution, that captures only 68% of the data and treats the rest as outliers, which is not ideal. Using two standard deviations captures 95% of the data and leaves the outliers in the remaining 5%, which seems much more reasonable.
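
To illustrate the outlier rule described above, here is a minimal, self-contained sketch of a one-sided cutoff at `mean + sigmas * stddev`. The helper names (`mean`, `populationStdDev`, `removeHighOutliers`) and the sample data are assumptions made for illustration; they are not the names used in `recorder.dart`.

```dart
import 'dart:math' as math;

/// Arithmetic mean of [values].
double mean(List<double> values) =>
    values.reduce((double a, double b) => a + b) / values.length;

/// Population standard deviation of [values].
double populationStdDev(List<double> values) {
  final double avg = mean(values);
  final double sumOfSquares = values
      .map((double v) => (v - avg) * (v - avg))
      .reduce((double a, double b) => a + b);
  return math.sqrt(sumOfSquares / values.length);
}

/// Hypothetical helper: keeps the samples at or below
/// `mean + sigmas * stddev`. Only the high side is trimmed, mirroring the
/// cutoff changed in this commit.
List<double> removeHighOutliers(List<double> values, {required double sigmas}) {
  final double cutOff = mean(values) + populationStdDev(values) * sigmas;
  return values.where((double v) => v <= cutOff).toList();
}

void main() {
  // Evenly spread samples with no obvious outlier (made-up data).
  final List<double> samples = <double>[1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

  // Number of samples kept with the old one-sigma cutoff.
  print(removeHighOutliers(samples, sigmas: 1).length);
  // Number of samples kept with the new two-sigma cutoff.
  print(removeHighOutliers(samples, sigmas: 2).length);
}
```

With this made-up data, the one-sigma cutoff discards the two largest samples even though nothing about them is unusual, while the two-sigma cutoff keeps all ten. Because the cutoff is applied only on the high side, the share of a normal distribution that is kept is somewhat larger than the two-sided 68% and 95% figures (roughly 84% and 98%).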

Commit e8f4d803, authored May 31, 2023 by Jackson Gardner; committed by GitHub May 31, 2023

Showing 2 changed files with 23 additions and 2 deletions

dev/benchmarks/macrobenchmarks/lib/src/web/recorder.dart +2 -1
dev/devicelab/lib/tasks/web_benchmarks.dart +21 -1

--- a/dev/benchmarks/macrobenchmarks/lib/src/web/recorder.dart
+++ b/dev/benchmarks/macrobenchmarks/lib/src/web/recorder.dart
@@ -655,7 +655,8 @@ class Timeseries {
    final double dirtyStandardDeviation = _computeStandardDeviationForPopulation(name, candidateValues);
    // Any value that's higher than this is considered an outlier.
-    final double outlierCutOff = dirtyAverage + dirtyStandardDeviation;
+    // Two standard deviations captures 95% of a normal distribution.
+    final double outlierCutOff = dirtyAverage + dirtyStandardDeviation * 2;
    // Candidates with outliers removed.
    final Iterable<double> cleanValues = candidateValues.where((double value) => value <= outlierCutOff);

--- a/dev/devicelab/lib/tasks/web_benchmarks.dart
+++ b/dev/devicelab/lib/tasks/web_benchmarks.dart
@@ -114,7 +114,7 @@ Future<TaskResult> runWebBenchmark({ required bool useCanvasKit }) async {
        profileData.completeError(error, stackTrace);
        return Response.internalServerError(body: '$error');
      }
-    }).add(createStaticHandler(
+    }).add(createBuildDirectoryHandler(
      path.join(macrobenchmarksDirectory, 'build', 'web'),
    ));
@@ -188,3 +188,23 @@ Future<TaskResult> runWebBenchmark({ required bool useCanvasKit }) async {
    }
  });
 }
+Handler createBuildDirectoryHandler(String buildDirectoryPath) {
+  final Handler childHandler = createStaticHandler(buildDirectoryPath);
+  return (Request request) async {
+    final Response response = await childHandler(request);
+    final String? mimeType = response.mimeType;
+    // Provide COOP/COEP headers so that the browser loads the page as
+    // crossOriginIsolated. This will make sure that we get high-resolution
+    // timers for our benchmark measurements.
+    if (mimeType == 'text/html' || mimeType == 'text/javascript') {
+      return response.change(headers: <String, String>{
+        'Cross-Origin-Opener-Policy': 'same-origin',
+        'Cross-Origin-Embedder-Policy': 'require-corp',
+      });
+    } else {
+      return response;
+    }
+  };
+}
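
One way to check the header logic in isolation is to wrap a stub handler and inspect the response. The sketch below assumes `package:shelf` is available; `withCrossOriginIsolation` and the stub handler are hypothetical names introduced for illustration, taking any inner `Handler` so that no build directory is needed.

```dart
import 'package:shelf/shelf.dart';

/// Hypothetical wrapper that mirrors the header logic in the diff above,
/// but accepts any inner [Handler] instead of a build directory path.
Handler withCrossOriginIsolation(Handler inner) {
  return (Request request) async {
    final Response response = await inner(request);
    final String? mimeType = response.mimeType;
    // Only HTML and JavaScript responses get the COOP/COEP headers.
    if (mimeType == 'text/html' || mimeType == 'text/javascript') {
      return response.change(headers: <String, String>{
        'Cross-Origin-Opener-Policy': 'same-origin',
        'Cross-Origin-Embedder-Policy': 'require-corp',
      });
    }
    return response;
  };
}

Future<void> main() async {
  // Stub standing in for the static file handler (assumption for this sketch).
  Response stub(Request request) => Response.ok(
        '<html></html>',
        headers: <String, String>{'content-type': 'text/html'},
      );

  final Handler handler = withCrossOriginIsolation(stub);
  final Response response = await handler(
    Request('GET', Uri.parse('http://localhost/index.html')),
  );

  // Header lookups on a shelf Response are case-insensitive.
  print(response.headers['Cross-Origin-Opener-Policy']);
  print(response.headers['Cross-Origin-Embedder-Policy']);
}
```

Running this should print the two header values for the HTML response. In the browser, evaluating the `crossOriginIsolated` global in the page's console is a quick way to confirm that the headers took effect.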