# Flutter devicelab

"Devicelab" (a.k.a. "cocoon") is a physical lab that tests Flutter on real
Android and iOS devices.

This package contains the code for the test framework and the tests. More
generally, the tests are referred to as "tasks" in the API, but since we use
them primarily for testing, this document refers to them as "tests".

Build results are available at https://flutter-dashboard.appspot.com.

# Reading the dashboard

## The build page

The build page is accessible at https://flutter-dashboard.appspot.com/build.html.
This page reports the health of build servers, called _agents_, and the statuses
of build tasks.

### Agents

A green agent is considered healthy and ready to receive new tasks to build. A
red agent is broken and does not receive new tasks.

In the example below, the dashboard shows that the `linux2` agent is broken and
requires attention. All other agents are healthy.

![Agent statuses](images/agent-statuses.png)

### Tasks

The table below the agent statuses displays the statuses of build tasks. Task
statuses are color-coded. The following statuses are available:

**New task** (light blue): the task is waiting for an agent to pick it up and
start the build.

**Task is running** (spinning blue): an agent is currently building the task.

**Task succeeded** (green): an agent reported a successful completion of the
task.

**Task is flaky** (yellow): the task was attempted multiple times, but only the
latest attempt succeeded (we currently only try twice).

**Task failed** (red): the task failed all of the attempts.

**Task underperformed** (orange): currently not used.

**Task was skipped** (transparent): the task is not scheduled for a build. This
usually happens when a task is removed from the `manifest.yaml` file.

**Task status unknown** (purple): currently not used.

In addition to color-coding, a task may display a question mark. This means
that the task was marked as flaky manually. The status of such a task is
ignored when considering whether the build is broken. For example, if a flaky
task fails, GitHub will not prevent PR submissions. However, if the latest
status of a non-flaky task is red, all pending PRs will contain a warning about
the broken build and recommend caution when submitting.

Legend:

![Task status legend](images/legend.png)

The example below shows that commit `e122d5d` caused a widespread breakage,
which was fixed by `bdc6f10`. It also shows that Cirrus and Chrome
Infra (left-most tasks) decided to skip building these commits. Hovering over
a cell will pop up a tooltip containing the name of the broken task. Clicking
on the cell will open the log file in a new browser tab (only visible to core
contributors as of today).

![Broken Test](images/broken-test.png)

## Why is a task stuck on "new task" status?

The dashboard aggregates build results from multiple build environments,
including Cirrus, Chrome Infra, and devicelab. While devicelab
tests every commit that goes into the `master` branch, other environments
may skip some commits. For example, Cirrus will only test the
_last_ commit of a PR that's merged into the `master` branch. Chrome Infra may
skip commits when they come in too fast.

## How the devicelab runs the tasks

The devicelab agents have a small script installed on them that continuously
asks the CI server for tasks to run. When the server finds a suitable task for
an agent, it reserves that task for the agent. If the task succeeds, the agent
reports the success to the server and the dashboard shows that task in green.
If the task fails, the agent reports the failure to the server; the server
increments the task's attempt counter and puts the task back in the pool of
available tasks. If a task does not succeed after a certain number of attempts
(as of this writing the limit is 2), the task is marked as failed and is shown
in red on the dashboard.
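
The retry logic can be sketched as follows. This is only an illustration of the
behavior described above; the names and types are hypothetical and do not come
from the actual cocoon/devicelab code:

```dart
// Hypothetical sketch of the server-side retry logic; not real cocoon code.
const int maxAttempts = 2;

class Task {
  Task(this.name);
  final String name;
  int attempts = 0;
  String status = 'new';
}

// Called when an agent reports the outcome of one attempt at a task.
void recordAttempt(Task task, bool succeeded) {
  task.attempts += 1;
  if (succeeded) {
    // Succeeding only on a retry is what the dashboard shows as "flaky".
    task.status = task.attempts > 1 ? 'flaky' : 'succeeded';
  } else if (task.attempts >= maxAttempts) {
    task.status = 'failed'; // shown in red on the dashboard
  } else {
    task.status = 'new'; // back into the pool of available tasks
  }
}

void main() {
  final Task task = Task('complex_layout__start_up');
  recordAttempt(task, false); // first attempt fails; task is requeued
  recordAttempt(task, true);  // second attempt succeeds; task is flaky
  print('${task.name}: ${task.status}'); // complex_layout__start_up: flaky
}
```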

# Running tests locally

Make sure your tests pass locally before deploying them to the CI environment.
Below is a handful of commands that run tests in a similar way to how the
CI environment runs them. These commands are also useful when you need to
reproduce a CI test failure locally.

## Prerequisites

You must set the `ANDROID_HOME` or `ANDROID_SDK_ROOT` environment variable to run
tests on Android. If you have a local build of the Flutter engine, then you have
a copy of the Android SDK at `.../engine/src/third_party/android_tools/sdk`.

You can find where your Android SDK is located by running `flutter doctor`.
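
For example (the path below is just an illustration; point it at wherever your
own SDK actually lives):

```sh
# Example paths only; substitute the location of your Android SDK.
export ANDROID_SDK_ROOT="$HOME/Android/Sdk"
export ANDROID_HOME="$ANDROID_SDK_ROOT"
```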

## Warnings

Running the devicelab will make changes to your environment. For instance, it
will start and stop Gradle.

## Running all tests

To run all tests defined in `manifest.yaml`, use option `-a` (`--all`):

```sh
../../bin/cache/dart-sdk/bin/dart bin/run.dart -a
```

## Running specific tests

To run a test, use option `-t` (`--task`):

```sh
# from the .../flutter/dev/devicelab directory
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t {NAME_OR_PATH_OF_TEST}
```

Where `NAME_OR_PATH_OF_TEST` can be either of the following:

- the _name_ of a task, which you can find in the `manifest.yaml` file in this
  directory. Example: `complex_layout__start_up`.
- the path to a Dart _file_ corresponding to a task, which resides in `bin/tasks`.
  Tip: most shells support path auto-completion using the Tab key. Example:
  `bin/tasks/complex_layout__start_up.dart`.
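
For example, both of the following invocations run the same task, once by name
and once by file path:

```sh
# By task name, as listed in manifest.yaml:
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t complex_layout__start_up

# By path to the task's Dart file:
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t bin/tasks/complex_layout__start_up.dart
```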

To run multiple tests, repeat option `-t` (`--task`) multiple times:

```sh
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t test1 -t test2 -t test3
```

To run tests from a specific stage, use option `-s` (`--stage`). Currently
there are only three stages defined: `devicelab`, `devicelab_ios`, and
`devicelab_win`.

```sh
../../bin/cache/dart-sdk/bin/dart bin/run.dart -s {NAME_OF_STAGE}
```

# Reproducing broken builds locally

To reproduce the breakage locally, `git checkout` the corresponding Flutter
revision. Note the name of the test that failed. In the example above, the
failing test is `flutter_gallery__transition_perf`. This name can be passed to
the `run.dart` command. For example:

```sh
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t flutter_gallery__transition_perf
```

# Writing tests

A test is a simple Dart program that lives under `bin/tasks` and uses
`package:flutter_devicelab/framework/framework.dart` to define and run a _task_.

Example:

```dart
import 'dart:async';

import 'package:flutter_devicelab/framework/framework.dart';

Future<void> main() async {
  await task(() async {
    // ... do something interesting ...

    // Aggregate results into a JSONable Map structure.
    final Map<String, dynamic> testResults = <String, dynamic>{};

    // Report success.
    return TaskResult.success(testResults);

    // Or, to report a failure instead:
    // return TaskResult.failure('Something went wrong!');
  });
}
```

Only one `task` is permitted per program. However, that task can run any number
of tests internally. A task has a name; it succeeds or fails, and is reported
to the dashboard, independently of other tasks.

A task runs in its own standalone Dart VM and reports results via the Dart VM
service protocol. This ensures that tasks do not interfere with each other and
lets the CI system time out and clean up tasks that get stuck.

# Adding tests to the CI environment

The `manifest.yaml` file describes a subset of tests we run in the CI. To add
your test, edit `manifest.yaml` and add the following to the `tasks` dictionary:

```yaml
  {NAME_OF_TEST}:
    description: {DESCRIPTION}
    stage: {STAGE}
    required_agent_capabilities: {CAPABILITIES}
```

Where:

 - `{NAME_OF_TEST}` is the name of your test, which must match the name of the
 file in `bin/tasks` without the `.dart` extension.
 - `{DESCRIPTION}` is a plain-English description of your test that helps
 others understand what it is testing.
 - `{STAGE}` is `devicelab` if you want to run on Android, or `devicelab_ios` if
 you want to run on iOS.
 - `{CAPABILITIES}` is an array that lists the capabilities required of
 the test agent (the computer that runs the test) to run your test. Available
 capabilities are: `has-android-device`, `has-ios-device`.
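
For example, a complete entry for the `complex_layout__start_up` test running
on Android might look like the following (the description text here is only
illustrative):

```yaml
tasks:
  complex_layout__start_up:
    description: >
      Measures the startup time of the Complex Layout sample app on Android.
    stage: devicelab
    required_agent_capabilities: ["has-android-device"]
```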