Mailspring/spec
Evan Morikawa 487dac31b3 fix(spec): fix asynchronous specs
Summary:
I believe I discovered why our tests intermittently fail, why those
failures cause a cascade of failures, and how to fix it.

The bug is subtle and it's helpful to get a quick refresher on how various
parts of our testing system work:

First a deeper dive into how Promises work:

1. Upon creating a new Promise, the "executor" block (the one that gets
   passed `resolve` and `reject`), gets synchronously run.
2. You eventually call `resolve()`.
3. The minute you call `resolve()`, Bluebird's `async.js` queues up
   whatever's downstream (the next `then` block)
4. The queue gets processed on every "tick".
5. Once the "tick" happens, our queue processes and downstream `then`
   blocks get run.
6. If any more Promises come in before the "tick", they get added to the
   queue. Nothing gets processed until the "tick" happens.

The important takeaway here is this "tick" in step 4. This "tick" is the
core of what makes Promises asynchronous. By default, Bluebird in our
Node-like environment uses `setImmediate` as the underlying
implementation.

Our test environment is different. We do NOT use `setImmediate` in our test
environment.

We use Bluebird's `Promise.setScheduler` API to implement our own "tick".  This
gives us much greater control over when Promises advance. Node's `setImmediate`
puts you at the whim of the underlying event loop.

Before today, our test "tick" implementation used `setTimeout` and
`advanceClock` triggered by `process.nextTick`.

Let me also quickly explain `setTimeout` in our test environment.

We globally override `setTimeout` in our test environment to not be based on
actual time at all. There are places in our code where we wait for several
hundred miliseconds, or have timeouts places. Instead of "sleeping" some amount
of time and hoping for the best, we gain absolute control over "time". "Time"
is just an integer value that only moves forward when you manually call
`advanceClock()` and pass it a certain number of "milliseconds".

At the beginning of each test, we reset time back to zero, we record
setTimeouts that come in, and as advanceClock gets called, we know if we need
to run any of the setTimeout callbacks.

Back to the Promise "tick" implementation. Before today, our testing "tick"
implementation relied our our stubbed `setTimeout` and our control of time.

This almost always works fine. Unfortunately tests would sometimes
intermittently fail, and furthermore cause a cascade of failures down the road.

We've been plauged with this for as long as I can remember. I think I finally
found how all of this comes together to cause these intermittent failures and
how to fix it.

The issue arises from a test like the one in query-subscription-pool-spec.  We
have tests here (and subtly in other yet unknown places) that go
something like this:

```
it("TEST A", () => {
  Foo.add(new Thing())
  expect(Foo._things.length).toBe(1)
})

it("TEST B", () => {
  expect(true).toBe(true)
})
```

At the surface this test looks straightforward. The problem is that
`Foo.add` may down the line call something like `trigger()`, which may
have listeners setup, which might try and asynchronously do all kinds of
things (like read the fs or read/write to the database).

The biggest issue with this is that the test 'finishes' immediately after
that `expect` block and immediately moves onto the next test. If `Foo.add`
is asynchronous, by the time whatever downstream effects of `Foo.add` take
place we may be in a completely different test. Furthremore if those
downstream function errors, those errors will be raised, but Jasmine will
catch them in the wrong test sending you down a rabbit hole of dispair.

It gets worse.

At the start of each test, we reset our `setTimeout` stub time back to
zero. This is problematic when combined with the last issue.

Suppose `Foo.add` ends up queuing a downstream Promsie. Before today, that
downstream Promise used `setTimeout(0)` to trigger the `then` block.

Suppose TEST A finishes before `process.nextTick` in our custom scheduler
can call `advanceClock` and run the downstream Promise.

Once Test B starts, it will reset our `setTimeout` stub time back to zero.

`process.nextTick` comes back after Test B has started and calls
`advanceClock` like it's supposed to.

Unfortunately, because our stub time has been reset, advanceClock will NOT
find the original callback function that would have resolved `Foo.add`'s
downstream Promise!!!

This means that Bluebird is now stuck waiting for a "tick" that will never
come anymore.

Since Bluebird thinks it's waiting for a "tick", all future Promises will
get queued, but never called (see Step 6 of the Promise description
above).

This is why once one test fails, downstream ones never complete and
Jasmine times out.

The failure is intermittent because `process.nextTick` is racing agianst a
test finishing, the next one starting, and how many and how far downstream
promises are setup.

Okay. So how do we fix this? First I tried to simply not reset the time back to
zero again in our stubbed time-override. This doesn't work because it simply
exposes the diasterous consequences of downstream Promises resolving after a
test has completed. When a test completes we cleanup objects, unmount React
components. Those downstream promises and timeouts come back and throw all
kinds of errors like: "can't read property x of undefined" and "can't find a
match for component X".

The fix that works the best is to simply MAKE PROMISES FULLY SYCNRHONOUS.
Now if you look at our custom Promise Test Scheduler in time-override,
you'll see that it immediately and sychronously calls the function. This
means that all downstream promises will run BEFORE the test ends.

Note that this is designed as a safeguard. The best way to make a more
robust test is to declare that your funtion body is asynchronous. If you
call a method that has downstream effects, it's your responsibility to
wait for them to finish. I would consider the test example above a very,
very subtle bug. Unfortunately it's so subtle that it's unreasonable to
expect that we'll always catch them. Making everything in our testing
environment synchronous ensures that test setup and cleanup happen when we
intuitively expect them to.

Addendum:

The full Promise call chain looks something like this:

-> `Promise::_resolveCallback`
-> `Promise::_fulfill`
-> `Promise::_async.settlePromises`
-> `AsyncSettlePromises`
-> `Async::_queueTick`
-> CHECK `Async::_isTickUsed`
-> `Async::_schedule`
-> `TimeOverride.scheduler`
-> `setTimeout`
-> `process.nextTick`
-> `advanceClock`
-> `Async::_drainQueues`
-> `Async::_drainQueue`
-> `THEN BLOCK RUNS`
-> `Maybe more things get added to queue`
-> `Async::_reset`
-> `Async::_queueTick` works again.

Test Plan: They now work

Reviewers: halla, mark, spang, juan, jackie

Reviewed By: juan, jackie

Differential Revision: https://phab.nylas.com/D3538
2016-12-20 16:47:26 -08:00
..
components fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
fixtures fix(quote): fix more cases of unwrapped signature detection 2016-11-14 11:55:37 -08:00
models fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
n1-spec-runner fix(spec): fix asynchronous specs 2016-12-20 16:47:26 -08:00
registries refactor(registry): move all registries into src/registries 2016-11-14 14:01:00 -08:00
services fix(rendering): Fix quote stripping of many plaintext emails 2016-12-15 18:40:34 -08:00
stores fix(sync-status): Change old per model status, in favor of per folder 2016-12-15 11:17:20 -08:00
tasks fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
themes rm(grim): We’re not using Grim for deprecations 2016-10-25 11:36:20 -07:00
utils fix(specs) Fix date-utils specs 2016-11-07 12:15:47 -08:00
action-bridge-spec.coffee fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
async-test-spec.es6 fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
auto-update-manager-spec.coffee fix(autoupdater): Change feed URL when Nylas ID or accounts changes 2016-08-18 13:22:56 -07:00
buffered-process-spec.coffee refactor(spec) move spec-nylas to spec 2015-10-01 21:39:44 -07:00
database-object-registry-spec.es6 refactor(registry): move all registries into src/registries 2016-11-14 14:01:00 -08:00
default-client-helper-spec.coffee feat(win32): Allow N1 to become the system-wide mailto: handler 2016-10-12 16:05:36 -07:00
list-selection-spec.coffee fix(specs): Add regression test for list-selection 2016-11-02 11:46:28 -07:00
mail-rules-processor-spec.coffee fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
mailbox-perspective-spec.es6 lint(*): Bump to ESLint 3.8 2016-10-17 18:07:35 -07:00
menu-manager-spec.coffee es6(*): Misc src to ES2016 2016-10-27 12:08:59 -07:00
module-cache-spec.coffee fix(specs): Update specs following 0.29.2 > 0.34.3 move 2015-11-17 17:40:06 -08:00
nylas-api-spec.coffee fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
nylas-env-spec.es6 fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
nylas-protocol-handler-spec.es6 fix(specs) convert nylas-protocol-handler-spec to ES6 (#2886) 2016-10-12 11:38:30 -07:00
nylas-test-utils.coffee cson(cleanup): Remove imports, only used for config.cson now 2016-04-24 20:33:34 -05:00
package-manager-spec.coffee fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
package-spec.coffee cleanup(specs): Remove space-pen. Goodbye, jQuery! 2016-04-26 13:14:07 -07:00
spellchecker-spec.es6 fix(spec): add support for async specs and disable misbehaving ones 2016-12-15 13:02:00 -05:00
undo-stack-spec.es6 lint(*): Bump to ESLint 3.8 2016-10-17 18:07:35 -07:00