Commit graph

77 commits

Author SHA1 Message Date
Halla Moore 77ad25af24 [local-sync] Stop sync worker before deleting account database
Summary:
Various errors are thrown when the sync worker tries accessing
a database that we've already deleted, so make sure the sync
worker has been stopped before we remove the database. This diff
involves modifying `Interruptible` so that `interrupt()` returns
a promise that resolves once the interrupt has been completed.

Addresses T7472

Test Plan: manual

Reviewers: evan, juan

Reviewed By: evan, juan

Differential Revision: https://phab.nylas.com/D3679
2017-01-15 10:33:12 -08:00
Christine Spang 941c564443 [local-sync] fix comment 2017-01-14 13:31:24 -08:00
Christine Spang a23813a11c [local-sync] For generic IMAP, Thread based on Message-Id, In-Reply-To & References
Summary:
This swaps out our generic IMAP threading mechanism to use the threading
headers on the message instead of the prior way of grouping by subject
and then differentiating based on participants, as that design was
somewhat driven by what we could accomplish easily given legacy data
schema decisions and has serious caveats, such as different threads between
the same people with the same subject being misthreaded together. With K2, we
have free reign to change the data format, so we can do it right.

The algorithm is super simple:
- Define "references" as the union of the Message-Id, In-Reply-To, and
References headers on a message, filtered for valid RFC2822 Message-IDs
- On message sync, if any element of the new message's references
matches any element of an existing message's references, thread them
together

In order to accomplish this, we need to store References in a way that
allows each element to be indexed for fast lookup. That meant either
using the sqlite JSON1 extension + expression-based indices, or creating
a new table. I chose the latter as a time-tested and simple solution,
since we don't need the flexibility of JSON here.

Test Plan: manual - unit tests coming

Reviewers: khamidou, evan, juan

Reviewed By: evan, juan

Differential Revision: https://phab.nylas.com/D3651
2017-01-13 10:39:54 -08:00
Christine Spang 0dcbcee06c [local-sync] Messages generated for send should not be unread 2017-01-13 07:46:18 -08:00
Christine Spang ed57422412 [local-sync] Enable toggling ENABLE_SEQUELIZE_DEBUG_LOGGING on launch via an env var
It's much more convenient to be able to turn this on/off without
editing the code.
2017-01-12 07:43:34 -08:00
Christine Spang 9eb1aef139 [local-sync] Properly handle multiple occurrences of From/To/Cc/Bcc headers
Summary:
Well-behaved MUAs don't do this, but it is totally legal and we should
handle it.

The Python sync engine handles this properly also:
7db949fec9/sync-engine/inbox/util/addr.py (L26-L47)

(Found this out while scratching my head over the output format of
mimelib.parseHeaders---turns out it's an array for a reason:

> let to = mimelib.parseHeaders(`To: Christine Spang <spang@nylas.com>
... To: Foo Bar <foo@example.com>`);
undefined
> to
{ to:
   [ 'Christine Spang <spang@nylas.com>',
     'Foo Bar <foo@example.com>' ] }

)

Test Plan: unit test included

Reviewers: halla, jackie, mark

Reviewed By: mark

Differential Revision: https://phab.nylas.com/D3625
2017-01-10 15:07:36 -08:00
Christine Spang 15ac9f795b [local-sync] Remove deprecated comment
With the latest update to N1's CSS, this is no longer true---though a theme
can easily override the font to monospace if so desired.
2017-01-10 13:39:54 -08:00
Christine Spang 27b2c41fcc [local-sync] Make comment about Date/INTERNALDATE & message hashes better 2017-01-10 12:05:14 -08:00
Christine Spang a234570118 [local-sync] Rename extract{Snippet,Contacts} in message-factory
Summary:
We have another function called extractContacts, in its own file,
extract-contacts.js, which is used to create Contact objects. This has
confused me a number of times and also leads to grep collisions.

This patch also makes the snippet unit tests pass again after a recent
API change.

Test Plan: included unit tests, manual

Reviewers: halla, evan, juan

Reviewed By: juan

Differential Revision: https://phab.nylas.com/D3623
2017-01-10 10:41:35 -08:00
Christine Spang b618c4ef28 [local-sync] Fix typo in ENABLE_SEQUELIZE_DEBUG_LOGGING 2017-01-10 08:16:16 -08:00
Juan Tejada d48ed92d66 [local-sync] Make the sync loop interruptible
Summary:
This commit introduces interruptible sync operations. Now, the `SyncWorker`, `FetchFolderList` operation and `FetchMessagesInFolder` operation can be interrupted at several points during their execution. This improves the performance of SyncbackTasks, which now run almost immediately.

To achieve this, this commit adds an Interruptible abstraction, which is an object that can run functions and interrupt them at points marked by the function. For more info on how this works, see the docs on the Interruptible class.

This commit also splits up the SyncWorker a little bit to make it smaller, byadding a SyncbackTaskWorker.

Depends on D3613

Test Plan: Manual

Reviewers: spang, mark, jackie, khamidou, evan, halla

Reviewed By: evan, halla

Differential Revision: https://phab.nylas.com/D3612
2017-01-09 14:43:15 -08:00
Christine Spang 7073e19fe7 [local-sync] Don't filter contacts w/out emails out of To/From/Cc/Bcc fields
We separately filter out contacts without email addresses before
committing to the contacts table in the database for autocomplete (in
isContactMeaningful()), and if we filter out these already we can end up
excluding legitimate elements of the headers. For example, the Clutter
feature of Office 365 sends emails with a From: header like this:

From: Microsoft Outlook

Fixes: T7413
2017-01-09 13:45:18 -08:00
Evan Morikawa 4a760ff5ec [local-sync] audit database indices
Summary:
Fixes T7398

We were create unnecessary and duplicate indices for the IDs of all of
our objects and increasing db write overhead.

We were not creating the correct reverse index for our join tables.

The search API'd db is already in scope of the accountId, this is an
unnecessary constraint on the query

Test Plan: manual

Reviewers: spang, juan

Reviewed By: juan

Maniphest Tasks: T7398

Differential Revision: https://phab.nylas.com/D3606
2017-01-09 09:45:33 -08:00
Christine Spang 6d72bb2aaf [local-sync] Optimize header & MIME structure download
Summary:
Headers can be quite big, so we might as well download and store only
the ones that we care about. This patch also makes it so we stop
downloading MIME structures twice per message.

While it's possible that we _may_ want to make more headers accessible
later, we don't currently make the generic pile of headers accessible to
N1 or N1 plugins in any way, so doing that would end up requiring
changes to the sync code regardless. I think it's worth optimizing the
base experience rather than trying to predict what we may want in the
future. Plus, it seems more likely that we'll want to build future
extensibility using thread metadata, rather than message headers.

On inboxapptest1@fastmail.fm, this patch decreases the size of the
generated sqlite file for a fully synced mailbox by 35%.

Test Plan: manual

Reviewers: juan, evan

Reviewed By: evan

Differential Revision: https://phab.nylas.com/D3611
2017-01-08 13:27:10 -08:00
Christine Spang 8238fe9594 [local-sync] Correctly handle messages with non-alternative multipart text bodies
Summary:
It's possible to have multiple inline HTML parts in a message, or even
a multipart/alternative part that contains text and HTML, followed by a
plaintext signature. Previously, if there was more than one text part in
an email, we would pick the _last_ text/html or text/plain part that we
found, and treat that as the entire message body. This works most of the
time, but fails to display the full message body in some edge cases.
This patch fixes that by resolving multipart/alternative subparts to a
single part in the mimepart fetch stage, and then treating each desired
mime part separately when parsing the message, concatenating them if
there are multiple.

This makes K2's handling of multipart MIME message text better,
bug-wise, than the Python sync engine's, which has been mangling some
rare messages forever. (Example from my email: every email from the MIT
EECS Jobs List has never displayed the mailing list signature in N1.)

Note that this patch also removes our tentative support for PGP
encrypted messages. I'd rather add that back in later when I've dug up
some real example messages to test on, rather than leaving it in in its
current not-really-tested and probably not-really-working state, since
it makes it harder to make sure that the rest of the logic isn't broken.

Test Plan: manual for now - added examples of this to my growing list of regression tests to add to the message parser unit tests once I fix them

Reviewers: juan, evan

Reviewed By: evan

Differential Revision: https://phab.nylas.com/D3600
2017-01-07 14:11:27 -08:00
Christine Spang 4aa6cb3ca1 [local-sync] Port generic IMAP threading logic from Python Sync Engine
Summary:
We sync messages in the same order as the Python sync engine (newest to
oldest, generally), so we should be able to just use the same threading
algorithm. While we may still want to take into account References /
In-Reply-To at some point, this is a big step up from the current
thread-matching-only.

Test Plan: manual --- could pretty easily port the unit tests from the python codebase if we wanted

Reviewers: khamidou, juan, evan

Reviewed By: evan

Differential Revision: https://phab.nylas.com/D3604
2017-01-07 13:11:33 -08:00
Christine Spang d6e0b7eb8d [local-sync] Do not consider HTML or plaintext attachments to be body parts
Summary:
We were previously not taking into account the 'Content-Disposition'
MIME header, which differentiates between parts intended for display
('inline') and parts that are instead transferred files ('attachment').
See the RFC for more details:

https://www.ietf.org/rfc/rfc2183.txt

Fixes: T7367

Test Plan: unit test coming soon---have the test data and going to fix all message parsing test cases at once

Reviewers: juan, jackie, evan, halla

Reviewed By: evan, halla

Differential Revision: https://phab.nylas.com/D3585
2017-01-05 16:18:48 -08:00
Christine Spang 0cbe8e2600 [local-sync] Trim NUL bytes from body strings
These bytes will cause SQLite to blow up with the following error
on insertion:

'SQLITE_ERROR: unrecognized token'

Fixes: T7331
2017-01-04 15:47:16 -08:00
Jackie Luo a2c9555e2a [local-sync] Fix dates for message hashing
Summary: We were creating duplicate `Message` objects because the formatting for the date was different between `buildForSend` and `parseFromImap`. Now, we create the initial hash using the same format that `buildmail` uses to ensure that we generate the same IDs.

Test Plan: Tested locally.

Reviewers: evan, juan, spang

Reviewed By: spang

Differential Revision: https://phab.nylas.com/D3559
2016-12-29 15:58:21 -08:00
Christine Spang 40c7b09e27 [local-sync] Use mimelib to parse contacts
Summary:
Because of the way we were attempting to parse contacts from
From/To/Cc/Bcc headers by converting them to JSON with a regex, we were
erroneously breaking contacts that contained commas in quoted names into
multiple contacts. This could result in things like parsing multiple
addresses for the From: header, incorrectly!

To resolve the problem, replace our homegrown logic with mimelib's
seemingly excellent parseAddresses(), which handles this and a myriad of
other cases correctly.

Fixes: T7370

Test Plan: unit tests included

Reviewers: mark

Reviewed By: mark

Differential Revision: https://phab.nylas.com/D3565
2016-12-29 15:48:12 -08:00
Christine Spang 09749c53da [local-sync] Fix lint 2016-12-29 10:59:17 -08:00
Christine Spang 6dcc12843c [local-sync] Fix message parsing when no Date header present
Summary:
When the Date: header is not present, use the INTERNALDATE from the IMAP server
instead.

Test Plan: manual for now - will add a regression test for this though

Reviewers: juan, jackie

Reviewed By: jackie

Differential Revision: https://phab.nylas.com/D3556
2016-12-23 18:32:29 -08:00
Christine Spang de1b67287c fix(snippet-parsing): Don't add extraneous spaces after text format tags
Summary:
This was leading us to put funny things like 'Nylas !' in some snippets that used
tags like <i> and <b> for text formatting. This is probs a teeny little bit slower
than the previous version since it invokes a callback on a lot more nodes, but we
can't really fix this issue without knowledge of the preceding tag name.

Test Plan: unit test included!!

Reviewers: evan, jackie

Reviewed By: jackie

Differential Revision: https://phab.nylas.com/D3553
2016-12-23 18:26:35 -08:00
Christine Spang a947cc063a [local-sync] Use mimelib to parse headers
Using node-imap's parseHeader function to parse headers was resulting in
a huge number of message parse failures on Office365 accounts, because
the results contained unicode control character 9 and we'd then feed that
string to JSON.parse when extracting contacts, which would throw an
exception.

Using mimelib's header parsing function eliminates these errors.
2016-12-22 08:51:16 -08:00
Christine Spang e924e74c1b [local-sync] Optimize snippet extraction
Summary:
We were seeing JS blocking in snippet extraction of up to 2k ms. This
is because we were walking the entire DOM of a message and extracting
all text, regardless of message size---and using our own homegrown
DOM walker function.

To remedy this, use the standard TreeWalker from the Chrome browser
APIs (which in benchmarks looks 2-4x faster) and also exit out of
the DOM walking process once we've accumulated enough text to create
a snippet. Informal eyeballing of timing metrics for this function suggests
the new implementation is something like 10-100x faster for some messages.

As a bonus, we get to delete some code and end up with a cleaner
implementation!

Test Plan: old unit tests yaay

Reviewers: juan

Reviewed By: juan

Subscribers: evan

Differential Revision: https://phab.nylas.com/D3543
2016-12-20 15:33:27 -08:00
Jackie Luo 6bbceda709 fix(tracking): Save updated metadata correctly and stop sending account ID with request
Summary:
Save metadata correctly by reassigning an object to value.

Since account IDs are different between N1 and N1 Cloud, use just the message ID, which should be unique.

Test Plan: Tested locally.

Reviewers: evan, juan

Reviewed By: evan, juan

Differential Revision: https://phab.nylas.com/D3524
2016-12-19 15:40:38 -08:00
Juan Tejada 7c701c6369 [local-sync] Fix date parsing and tests
Make sure that we use the header date for our date field because that's
the one we can control and depend on for message id generation
2016-12-19 15:35:50 -08:00
Juan Tejada abc5f35255 [local-sync] Ensure send runs fast, clean up multisend tasks
Summary:
Associated N1 Diff: D3530

This commit converts multi-send from a 3 step process into a 2 step
process

The first step creates the base message and sends a message per
recipient, each with its customized message body for tracking.

The second step reconciles all sent messages, specifically removing any
sent messages created by gmail, and saving the correct message to the
sent folder

This commit also ensures that we run the send tasks immediately by
ensuring we restart the sync loop if its already running

Test Plan: Manual

Reviewers: evan, jackie, halla

Reviewed By: jackie, halla

Differential Revision: https://phab.nylas.com/D3529
2016-12-16 14:41:20 -08:00
Christine Spang e77fbc21e1 [local-sync] Update comment 2016-12-16 11:06:04 -08:00
Halla Moore 5a7aa45629 [local-sync] Fix a couple of message parsing bugs
Summary:
- Don't fail if there's no subject, just set it to `(no subject)`
- Support "BINARY" content-transfer-encoding. (This really means
that there is no encoding, so it's simple to add support for it)

Test Plan: tested locally

Reviewers: jackie

Reviewed By: jackie

Differential Revision: https://phab.nylas.com/D3528
2016-12-16 10:50:35 -08:00
Juan Tejada 3910799683 [local-sync] Fix contact parsing from T7327
Summary: See description at T7327

Test Plan: Manual, but this should have unit tests

Reviewers: mark, evan

Reviewed By: evan

Differential Revision: https://phab.nylas.com/D3513
2016-12-15 12:30:59 -08:00
Juan Tejada e17b6d8d17 [local-sync]: Move sendmail-client and errors to isomorphic-core
Summary:
Move sendmail-client and errors to isomorphic-core, given that they will
probably be used by cloud-workers (plugin backends) and cloud-api

Depends on D3510

Test Plan: Manual

Reviewers: halla

Reviewed By: halla

Differential Revision: https://phab.nylas.com/D3512
2016-12-15 12:29:56 -08:00
Juan Tejada c8e71464f9 [local-sync] Update send endpoints to use SyncbackTasks
Summary:
Associated N1 diff: D3511
Convert send endpoints to use syncback tasks for consistency with how we
perform other imap operations, but primarily:
- So that it triggers a sync loop immediately and we pick up changes quickly
- To keep track of various send operations as a single unit (e.g. sending + saving to sent folder or deleting from sent)

This commit also fixes SyncbackRequest error handling and processing in
N1-- previously we were saving error fields to the syncbackRequests with
a format that didn't match N1's API error and which wasn't properly
serializable. (Also rename HTTPError to APIError)

Test Plan: Todo/Manual

Reviewers: jackie, halla, evan

Reviewed By: evan

Differential Revision: https://phab.nylas.com/D3510
2016-12-15 11:55:40 -08:00
Christine Spang e5a9e2cc9e [local-sync] Allow logging parsed messages to disk with NYLAS_DEBUG env var
Summary:
I've found this useful for generating test cases and am tired
of adding and removing this code!

Test Plan: inspect output of /tmp/k2-parse-output

Reviewers: juan

Reviewed By: juan

Differential Revision: https://phab.nylas.com/D3518
2016-12-15 10:45:48 -08:00
Christine Spang ac5c7e3d2c [local-sync] Escape HTML entities in plaintext
Summary: This was understandably causing some messages to fail to display correctly.

Test Plan: unit tests are already broken for message parsing -- will fix in follow up diff

Reviewers: juan

Reviewed By: juan

Differential Revision: https://phab.nylas.com/D3517
2016-12-15 10:41:36 -08:00
Juan Tejada b79488ae43 [local-sync, cloud-api, cloud-workers] Fix msg id collision, tracking and sending issues, some refactoring
Summary:
This diff solves a few separate issues from T7313, T7316, T7282, and it refactors
the send code a little bit.

Initially, the problem that led to this diff was generating message ids that
wouldn't collide (which was causing errors in the message-processor). Collisions
in ids were being caused by messages that contained the exact same participants,
subject and date (most likely due bots or scripts sending emails in quick
succession)

To prevent collisions this commit adds the `message-id` header as part of the
database message id, and ensures that we set it correctly before sending, and
that it remains consistent through send, multi-send, and the sync loop.

During the refactor and review, I removed some code that assumed that we were
syncing drafts (which we aren't), and also fixes a few other known and
unknown issues around sending, message creation, and tracking, like assigning
the correct date header (we were previously assigning the draft creation date
from within N1), fixing the tracking regex, among other smaller bugs/typos.

Will address inline TODOs in a separate diff

Test Plan: TODO!!! I will add tests in another diff

Reviewers: evan, halla, jackie, khamidou

Reviewed By: halla, jackie

Differential Revision: https://phab.nylas.com/D3507
2016-12-14 19:35:48 -08:00
Christine Spang c214ba1e34 [local-sync] Parse DOM to extract snippets
Summary:
This fixes multiple issues, including snippets telling you you
ought to look at the HTML as well as cruft like HTML entities
and CSS in snippets.

Test Plan: unit tests included o.O

Reviewers: juan

Reviewed By: juan

Subscribers: evan

Differential Revision: https://phab.nylas.com/D3500
2016-12-13 16:32:22 -08:00
Mark Hahnenberg b47cd28d89 [local-sync] Implement /threads/search endpoint for Gmail
Summary: See title

Test Plan: Ran it locally

Reviewers: khamidou, juan, evan

Reviewed By: juan, evan

Differential Revision: https://phab.nylas.com/D3496
2016-12-13 13:44:00 -08:00
Halla Moore 547ff416e7 [local-sync] Fix a couple of multi-send bugs
Summary:
1) Send the custom body, rather than the generic body
2) Extract contacts correctly so that the saved sent message has all the
   participants, rather than just the last one

Test Plan: Tested locally

Reviewers: jackie

Reviewed By: jackie

Differential Revision: https://phab.nylas.com/D3499
2016-12-13 11:45:59 -08:00
Mark Hahnenberg f32d0df7e0 [local-sync] Increment streamAll offset by chunkSize (#5)
Otherwise we'll infinite loop if there are more than 2000 results.
2016-12-09 11:15:04 -08:00
Christine Spang 587f7787a6 fix(local-sync): Fix charset interpretation in message parsing
Summary:
This commit fixes the following bugs in message parsing:
- we were unilaterally decoding MIME bodies as UTF-8; instead, decode according
  to the charset data in the mimepart header
- '7bit' content-transfer-encoding means us-ascii, NOT utf-7
- only interpret valid content-transfer-encodings (previously we were trying
  to treat various charsets as transfer-encodings)
- clearer naming: s/values/parsedMessage/
- unify snippet cleanup between plaintext & stripped HTML (merging
  whitespace etc.)

Test Plan: units tests coming

Reviewers: juan

Differential Revision: https://phab.nylas.com/D3491
2016-12-09 11:01:04 -08:00
Jackie Luo 4a11bfe977 fix(message-factory): Unlink circular dependency 2016-12-08 18:10:17 -08:00
Jackie Luo 6e111c073a fix(message-ids): Use correct hashing for headers 2016-12-08 17:55:39 -08:00
Jackie Luo fae855f0fe feat(message-ids): Hash message IDs and replace in draft before sending 2016-12-08 17:48:34 -08:00
Christine Spang a23c68092e [local-sync] Add specs for message parsing
Summary:
This commit also fixes snippets for HTML-only messages to strip out HTML
tags, and makes us preserve whitespace for plaintext emails by
displaying them in <pre class="nylas-plaintext"> tags, and makes us log
messages that fail to parse at all to a tempdir.

The only issue I found with using <pre> tags for plaintext email was
that some lines may trigger scrolling, so there is an associated commit
(D3484) that changes the CSS for <pre class="nylas-plaintext"> to wrap
lines.

In the future, we can add regression tests to this test suite whenever
we fix parsing bugs.

Test Plan: unit tests included

Reviewers: bengotow

Reviewed By: bengotow

Differential Revision: https://phab.nylas.com/D3483
2016-12-07 07:25:28 -08:00
Ben Gotow 269d61171a [local-sync] fix(specs): New specs for threading 2016-12-05 15:00:37 -08:00
Ben Gotow 30c8bedd7a [local-sync] fix(specs): run npm test in local-sync dir 2016-12-05 12:16:53 -08:00
Evan Morikawa 2ba326dfc6 [local-sync] only enable logging in dev mode of N1 2016-12-02 17:34:05 -05:00
Ben Gotow 7712269402 [*] fix(deltas): Cloud-API not filtering deltas at all, refactor a few things
- Don’t need functions in delta.js which must be called to return promsies. Fun of promsies is that you don’t need to care when they’re built to attach a .then.

- Make boundary between route handler and delta stream builder more explicit, don’t do query parsing in helpers, always reply from handler.

- Remove pushJSON extension to outputStream which never actually received JSON.

- Remove `takeUntil` - disposing of the downstream observable should dispose of all the merged/upstream observables

- Rename inflate => stringify since the returned value is a string not an object.

- Remove support for delta streams with no cursors. Don’t think this was supposed to be a feature.

- Add accountId to Transaction models

- Make database hooks shared in isomorphic core
2016-12-01 18:41:46 -08:00
Karim Hamidou 15cfe2cec0 Strip the '[Gmail]/' prefix from folder names.
Summary:
T7253 has two related parts:
1. Stripping the '[Gmail]/' prefix from any canonical Gmail folders
2. Handling nested IMAP folders.

This diff fixes 1. Changes for 2. are forthcoming.

Test Plan: Tested manually. Checked that the part was stripped from the N1 folder list.

Reviewers: juan

Reviewed By: juan

Differential Revision: https://phab.nylas.com/D3475
2016-12-01 16:49:09 -08:00