Summary:
This commit also lowers the batch size of messages to fetch on folder sync down to 30. This prevents sync from getting stuck if we queue too many syncback tasks: we only update the range of fetched UIDs after we've actually fetched and processed messages, so if the batch size is too big and we are interrupted too often, we might never advance the range and end up refetching the same messages over and over.
This also makes the sync loop run faster through all folders in general.
Depends on D3689 to make sure that the batch size actually reflects a message count, i.e. to ensure that we are making /visible/ progress.
Test Plan: manual
Reviewers: spang, khamidou, evan
Reviewed By: evan
Maniphest Tasks: T7477
Differential Revision: https://phab.nylas.com/D3692
Summary:
Consolidating provider checks to use the same source of truth.
Fixes send issues with some provider types.
Test Plan: tested locally
Reviewers: tomasz
Reviewed By: tomasz
Differential Revision: https://phab.nylas.com/D3694
Summary:
Because we optimistically fetch UIDs by expanding a range without looking
at the actual UIDs in the inbox and the actual space of UIDs with messages
attached may be sparse due to message moves, we need to track how many
messages we actually download during a range expansion and continue
expanding the range if we haven't downloaded enough messages.
If we reach a large gap where we download no messages at all during a batch, we
pause and check the actual UID list for the folder for the next UID to
download, as otherwise we may spin indefinitely fetching UIDs that don't exist.
(Example: my "Deleted Items" folder had about 300k worth of empty UIDs between
a very small UID and a very large UID. With the new system, this registers as a
completed sync within a single iteration as soon as sync hits the gap.)
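
A minimal in-memory sketch of the range expansion described above, not the actual sync code. `fetchBatch`, `existingUids`, and `rangeSize` are hypothetical names, and the array filter stands in for real IMAP fetches:

```javascript
// Hypothetical sketch: `existingUids` stands in for the server's mailbox.
function fetchBatch(existingUids, fetchMax, batchSize, rangeSize) {
  const sorted = [...existingUids].sort((a, b) => a - b);
  const downloaded = [];
  let max = fetchMax; // exclusive upper bound of the next range to try

  // Keep expanding until we have actually downloaded enough messages.
  while (downloaded.length < batchSize && max > 1) {
    const min = Math.max(1, max - rangeSize);
    // Stand-in for an optimistic fetch of UIDs min..(max - 1).
    const found = sorted.filter((uid) => uid >= min && uid < max);

    if (found.length === 0) {
      // Empty batch: check the actual UID list for the next UID to download.
      const below = sorted.filter((uid) => uid < min);
      if (below.length === 0) {
        max = 1; // nothing older exists, so the folder is fully synced
        break;
      }
      max = below[below.length - 1] + 1; // jump over the gap
      continue;
    }

    downloaded.push(...found);
    max = min;
  }
  return { downloaded, nextFetchMax: max, done: max <= 1 };
}

const result = fetchBatch([3, 5, 300000, 300001], 300002, 3, 100);
console.log(result.downloaded, result.done);
```

In this sketch a folder with a ~300k gap finishes in three loop passes instead of thousands of empty fetches.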
Test Plan: manual
Reviewers: juan, evan
Reviewed By: juan, evan
Differential Revision: https://phab.nylas.com/D3689
Summary:
This patch changes the sync worker to back off exponentially when there is an issue syncing an account. This has two goals:
- first, it's a bit dangerous to retry immediately. We don't want hundreds of thousands of machines trying to refresh tokens unsuccessfully because our service is struggling.
- second, it's nicer on the CPU to wait a bit between retries.
Currently, we sleep for at most 2 minutes, with some random jitter added.
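
A rough sketch of capped exponential backoff with jitter, for illustration only. The base delay, the jitter range, and the choice to add jitter on top of the 2 minute cap are assumptions, and `backoffDelay` is a hypothetical name:

```javascript
const MAX_DELAY_MS = 2 * 60 * 1000; // the 2 minute cap from the summary
const BASE_DELAY_MS = 1000; // assumed base delay
const MAX_JITTER_MS = 1000; // assumed jitter range

// `attempt` is the number of consecutive failures so far (0 for the first).
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(attempt, random = Math.random) {
  const exponential = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  const jitter = random() * MAX_JITTER_MS;
  return exponential + jitter;
}

for (let attempt = 0; attempt < 10; attempt++) {
  console.log(attempt, Math.round(backoffDelay(attempt)));
}
```

The jitter keeps many clients that failed at the same moment from retrying in lockstep.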
Test Plan: Tested manually, stared at the code a long time.
Reviewers: evan, juan
Reviewed By: evan, juan
Differential Revision: https://phab.nylas.com/D3684
Summary:
Various errors are thrown when the sync worker tries accessing
a database that we've already deleted, so make sure the sync
worker has been stopped before we remove the database. This diff
involves modifying `Interruptible` so that `interrupt()` returns
a promise that resolves once the interrupt has been completed.
Addresses T7472
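
A simplified sketch of the idea, not the real `Interruptible` class. `MiniInterruptible`, `run`, and `deleteAccount` are hypothetical, and the real implementation may track its running work differently:

```javascript
class MiniInterruptible {
  constructor() {
    this._interrupted = false;
    this._running = null; // promise for the in-flight run, if any
  }

  // Runs async steps one at a time, checking the interrupt flag between steps.
  run(steps) {
    this._interrupted = false;
    this._running = (async () => {
      const completed = [];
      for (const step of steps) {
        if (this._interrupted) break;
        completed.push(await step());
      }
      return completed;
    })();
    return this._running;
  }

  // Resolves only once the in-flight run has actually stopped.
  interrupt() {
    this._interrupted = true;
    const running = this._running || Promise.resolve();
    return running.then(() => undefined, () => undefined);
  }
}

// Callers can now wait for the worker to stop before touching the database.
async function deleteAccount(worker, removeDatabase) {
  await worker.interrupt();
  removeDatabase();
}
```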
Test Plan: manual
Reviewers: evan, juan
Reviewed By: evan, juan
Differential Revision: https://phab.nylas.com/D3679
Summary: Treat any provider that isn't gmail or office365 as standard IMAP
Test Plan: manual
Reviewers: juan, evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3686
Summary:
On MG's machine this function performs EXTREMELY poorly and causes
things like archive to lock up when the console is running here, for some
reason. Not entirely sure what's causing it, but there were some
simple DB cleanups that will make it faster for large queries.
There are likely other things involved, since the sequelize DB being locked
up shouldn't affect the performLocal of the edgehill db for things like
archive. Still looking into that.
Test Plan: manual
Reviewers: juan
Reviewed By: juan
Differential Revision: https://phab.nylas.com/D3683
Summary:
Before trying to sync a folder, check if we actually need to do so. This will prevent us from doing unnecessary work that slows down the sync loop (like performing SELECT commands).
We will perform a folder sync if any of the following are true:
- The folder hasn't been completely synced
- There are new messages (using the IMAP STATUS command)
- There are attribute changes indicated via highestmodseq (using the IMAP STATUS command)
- The server doesn't support highestmodseq, and enough time has passed since we last ran an attribute scan on the folder
Addresses T7513
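
The conditions above could be expressed roughly like this. The function name, the field names on `saved` and `status`, and the scan interval are all assumptions made for the sketch:

```javascript
const ATTRIBUTE_SCAN_INTERVAL_MS = 10 * 60 * 1000; // assumed interval

// `saved` is what we stored after the last sync of the folder;
// `status` is the result of an IMAP STATUS on it (no SELECT needed).
function shouldSyncFolder(saved, status, now = Date.now()) {
  // The folder hasn't been completely synced.
  if (!saved.fullySynced) return true;
  // There are new messages.
  if (status.uidnext > saved.uidnext) return true;
  // Attribute changes, when the server reports highestmodseq.
  const supportsModseq = status.highestmodseq != null;
  if (supportsModseq) return status.highestmodseq !== saved.highestmodseq;
  // Otherwise fall back to a periodic attribute scan.
  return now - saved.lastAttributeScanAt >= ATTRIBUTE_SCAN_INTERVAL_MS;
}
```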
Test Plan: manual
Reviewers: evan, halla, spang
Reviewed By: halla, spang
Differential Revision: https://phab.nylas.com/D3675
Summary:
Currently, our mail sync strategy of expanding UID ranges from UIDNEXT
backwards until a UID of 1 implicitly assumes that every UID corresponds to an
actual message. This assumption is incorrect, and results in several
significant bugs regarding sync status.
This patch fixes issue 1:
Since UIDs are persistent and, so long as the UIDVALIDITY is valid, ascend
monotonically upward, every time you move a message to a new folder you "lose"
UIDs lower down in the range. In my work Inbox, where I get a lot of mail,
archive all the time, and generally have only a small number of threads in the
mailbox, the smallest UID is over 100k. This means that, after all my inbox
messages are synced, the sync loop will continue attempting to download
nonexistent old messages in this mailbox for hundreds of sync iterations, and
will not mark the mailbox as fully synced until fetchmin reaches 1, even
though no messages are actually being pulled down.
This patch needs a small associated patch to N1 to update how sync status is
calculated (coming soon).
The next patch in this series will deal with gaps in the UIDspace that slow
down syncing of a folder.
Test Plan: manual
Reviewers: halla, juan
Reviewed By: juan
Differential Revision: https://phab.nylas.com/D3677
Summary:
We want to do this in order to prevent send tasks from blocking the sync loop given that they can take a very long time to run. This is especially true when sending emails with large attachments to multiple recipients.
There is no real way to make sending in these cases faster, but we can prevent it from blocking the sync loop at least, especially because sending is mostly I/O bound.
This is a bit messy, but it should be fixed when we properly implement a sync scheduler.
Also added a limit to the total size of attachments you can upload to try to prevent weird EPIPE errors when sending.
See: D3670.
Also moved and renamed stuff a little
Test Plan: manual
Reviewers: halla, evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3669
Summary: Allows us to reset accounts in local-sync too
Test Plan: manual
Reviewers: mark, juan
Reviewed By: juan
Differential Revision: https://phab.nylas.com/D3672
Summary:
I happened to be testing between Jan 2017 and Dec 2016, so I
missed this logic flaw. Boo.
Test Plan: tested locally
Reviewers: evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3667
Summary: We did this for gmail, but not for other providers.
Test Plan: tested locally
Reviewers: juan, spang
Reviewed By: spang
Differential Revision: https://phab.nylas.com/D3665
Summary: While working on separating send out of the sync loop, I realized sync tasks could use some cleanup to be more consistent with how we implemented syncback tasks. I reorganized and renamed things a little bit. This will also help us move in the direction of the scheduler implementation under which everything is a task.
Test Plan: manual
Reviewers: evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3660
Summary:
Only updated within month precision. We can use this to show how
far back a folder has been synced.
Test Plan: tested locally
Reviewers: juan, evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3662
Summary:
Fixes https://phab.nylas.com/T7435
The old deepScan (now `scanForAttributeChanges`) and shallowScan (now
`fetchLatestAttributeChanges`) had some fatal flaws.
If you deep scanned, it would attempt to load the message attributes of all
messages ever and cause very bad memory leaks.
Also, if you left a mailbox running for a long time, there was a query
that would eventually run `Message.findAll` and, even though it was just
returning the headers, would still run insanely expensive operations.
This fixes these issues (and renames the scans).
Test Plan: manual
Reviewers: spang, halla, juan
Reviewed By: juan
Differential Revision: https://phab.nylas.com/D3657
Summary:
This swaps out our generic IMAP threading mechanism to use the threading
headers on the message instead of the prior way of grouping by subject
and then differentiating based on participants. That design was
somewhat driven by what we could accomplish easily given legacy data
schema decisions and has serious caveats, such as different threads between
the same people with the same subject being misthreaded together. With K2, we
have free rein to change the data format, so we can do it right.
The algorithm is super simple:
- Define "references" as the union of the Message-Id, In-Reply-To, and
References headers on a message, filtered for valid RFC2822 Message-IDs
- On message sync, if any element of the new message's references
matches any element of an existing message's references, thread them
together
In order to accomplish this, we need to store References in a way that
allows each element to be indexed for fast lookup. That meant either
using the sqlite JSON1 extension + expression-based indices, or creating
a new table. I chose the latter as a time-tested and simple solution,
since we don't need the flexibility of JSON here.
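
An in-memory sketch of the algorithm above, for illustration. A `Map` stands in for the new indexed references table, the regex is only a loose approximation of RFC 2822 Message-ID validation, and `ThreadIndex` and `extractReferences` are hypothetical names. Merging two existing threads that a new message links together is left out:

```javascript
// Loose stand-in for "valid RFC2822 Message-ID": <left@right>
const MESSAGE_ID_RE = /<[^<>\s@]+@[^<>\s@]+>/g;

// Union of Message-Id, In-Reply-To, and References, filtered for valid IDs.
function extractReferences(headers) {
  const raw = [headers['message-id'], headers['in-reply-to'], headers['references']]
    .filter(Boolean)
    .join(' ');
  return new Set(raw.match(MESSAGE_ID_RE) || []);
}

class ThreadIndex {
  constructor() {
    this.threadByReference = new Map(); // reference -> thread id
    this.nextThreadId = 1;
  }

  // Returns the id of the thread the message was placed in.
  addMessage(headers) {
    const refs = extractReferences(headers);
    let threadId = null;
    for (const ref of refs) {
      if (this.threadByReference.has(ref)) {
        threadId = this.threadByReference.get(ref);
        break;
      }
    }
    if (threadId === null) threadId = this.nextThreadId++;
    for (const ref of refs) {
      if (!this.threadByReference.has(ref)) this.threadByReference.set(ref, threadId);
    }
    return threadId;
  }
}
```

Because the subject never enters the lookup, two unrelated conversations with the same subject and participants land in separate threads.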
Test Plan: manual - unit tests coming
Reviewers: khamidou, evan, juan
Reviewed By: evan, juan
Differential Revision: https://phab.nylas.com/D3651
Examined the headers on a message we sent and found this:
> X-Mailer: nodemailer (2.5.0; +http://nodemailer.com/; SMTP/2.6.0[client:2.8.0])
No need to plaster which sendmail library we're using all over every
email our users send. Turn it off!
Summary:
Previously we would unconditionally issue a SELECT when openBox was
called. Now we check if the currently open box is the one we want first and
return immediately if it is, avoiding the unnecessary SELECT (which can be
quite expensive on large folders like INBOX). We were also calling closeBox
after iterating all the messages in a thread to mark them as read/unread.
This was unnecessary and was causing extra SELECTs to be issued. Now we don't!
This diff is a 5x speedup over the old behavior when marking lots of
threads in the same folder as read all at once.
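
A stripped-down sketch of the check, assuming a hypothetical wrapper class `CachedBoxConnection` with an injected `select` function in place of the real IMAP SELECT:

```javascript
class CachedBoxConnection {
  // `select` is a stand-in for issuing an IMAP SELECT; it returns a promise.
  constructor(select) {
    this._select = select;
    this._openBoxName = null;
  }

  async openBox(name) {
    // The box we want is already open: skip the expensive SELECT.
    if (this._openBoxName === name) return name;
    await this._select(name);
    this._openBoxName = name;
    return name;
  }
}

// Marking several threads in INBOX as read now costs one SELECT, not one each.
async function demo() {
  let selects = 0;
  const conn = new CachedBoxConnection(async () => { selects += 1; });
  for (let i = 0; i < 3; i++) await conn.openBox('INBOX');
  return selects;
}
```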
Test Plan: Run locally, measure perf with log statements
Reviewers: evan, juan
Reviewed By: evan, juan
Differential Revision: https://phab.nylas.com/D3654
Summary:
We were returning the wrong type in the case that we got no messages
back from the Gmail search API.
Test Plan: Run locally
Reviewers: evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3646
Summary:
Add a new dedicated IMAP connection to listen for any updates or new mail on the inbox.
Previously, we couldn't receive new mail events on the inbox during the sync loop
because other mailboxes would be open while we synced them. This caused big
delays in receiving new mail, especially if you have a lot of folders.
Test Plan: manual
Reviewers: spang, evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3650
Summary:
We don't want to inflate delete Transactions, but we do still want
to pass the delta itself along.
Test Plan: tested locally
Reviewers: evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3647
There are plenty of valid use cases for sending subject-only emails,
and we also want to still download the headers and create message
objects if e.g. the email consists of only an event invitation or
an attachment.
Summary:
This commit passes down the `socketTimeout` option to node-imap. However, just
passing this option doesn't seem to work reliably, so this commit manually implements
the socketTimeout option for our IMAPConnection.
It works by wrapping every operation in a timeout, augmenting the
`createConnectionPromise` construct that already existed.
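
A generic sketch of wrapping an operation in a timeout. This is not the actual `createConnectionPromise` code; `withSocketTimeout` is a hypothetical helper and the error message is made up:

```javascript
// Rejects if `operation` (a function returning a promise) has not settled
// within `timeoutMs`; otherwise passes its result or error through.
function withSocketTimeout(operation, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`IMAP operation timed out after ${timeoutMs}ms`));
    }, timeoutMs);
    operation().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}
```

A caller that sees the timeout error can then tear down the connection and reconnect once the network is back.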
Test Plan:
Tested locally by putting the computer to sleep and turning off wifi. The
connection errors as expected and is restarted. It reconnects when the
network is available again.
Reviewers: khamidou, halla, evan
Reviewed By: evan
Differential Revision: https://phab.nylas.com/D3642
Summary:
We may find it useful at some point to be able to tell which messages in
a user's mailbox were sent using N1/the REST API vs Nylas Mail.
Test Plan: manual
Reviewers: evan, juan
Reviewed By: juan
Differential Revision: https://phab.nylas.com/D3631
Summary:
Sometimes things go very wrong when trying to syncback an action. For example, the worker window could crash, preventing us from marking the current task as failed. What's worse, another sync iteration could try to run the task which crashed, thinking it is a new one. To prevent this, this diff adds a fourth syncback task state, `INPROGRESS`.
New syncback tasks are marked as `INPROGRESS` before being executed. When they complete we mark them as `SUCCEEDED/FAILED`. Stray `INPROGRESS` tasks are automatically marked as `FAILED` at the beginning of every sync iteration, to make sure we don't retry them again.
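
A small in-memory sketch of the state transitions described above. The `NEW` state name, the task shape, and `runSyncbackTasks` are assumptions; the real tasks live in the database:

```javascript
const State = {
  NEW: 'NEW', // assumed name for the initial state
  INPROGRESS: 'INPROGRESS',
  SUCCEEDED: 'SUCCEEDED',
  FAILED: 'FAILED',
};

// Runs at the beginning of a sync iteration.
function runSyncbackTasks(tasks, execute) {
  // Stray INPROGRESS tasks are left over from a crash: fail them, never retry.
  for (const task of tasks) {
    if (task.status === State.INPROGRESS) task.status = State.FAILED;
  }
  for (const task of tasks) {
    if (task.status !== State.NEW) continue;
    task.status = State.INPROGRESS; // marked before executing
    try {
      execute(task);
      task.status = State.SUCCEEDED;
    } catch (err) {
      task.status = State.FAILED;
      task.error = err.message;
    }
  }
  return tasks;
}
```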
Test Plan: Tested manually.
Reviewers: evan, juan
Reviewed By: evan, juan
Differential Revision: https://phab.nylas.com/D3635
A bug allowed multiple sync loop processes to start. This could lead to
double sending and the sync loop appearing as though it couldn't be
interrupted.