proxmark3/doc/new_frame_format.txt
2019-05-04 11:36:35 +02:00

A major change is the support of variable length frames between host and Proxmark3.
This step is especially important for use over FPC/USART/BT.
Old format
==========
Previously, frames in both directions looked like this:
uint64_t cmd;
uint64_t arg[3];
union {
    uint8_t  asBytes[PM3_CMD_DATA_SIZE];
    uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
} d;
with PM3_CMD_DATA_SIZE = 512. There was no API abstraction: every caller forged and parsed these frames by hand.
So the frame size was fixed at 544 bytes, even for simple ACKs.
When snooping the USB transfers, we can observe that the host sends 544-byte Bulk USB frames, while the Proxmark3 is limited by its internal buffers and sends 128b, 128b, 128b, 128b, 32b, so 5 packets in total.
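The 544-byte figure follows directly from the field sizes (8 + 3*8 + 512), and the five-packet split follows from the 128-byte packets observed from the Proxmark3. A minimal sketch of that arithmetic, where the struct name OldFrame and the two helper functions are hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM3_CMD_DATA_SIZE 512

// Hypothetical name for the old fixed-size frame described above.
typedef struct {
    uint64_t cmd;
    uint64_t arg[3];
    union {
        uint8_t  asBytes[PM3_CMD_DATA_SIZE];
        uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
    } d;
} OldFrame;

// 8 bytes of cmd + 24 bytes of args + 512 bytes of payload.
static_assert(sizeof(OldFrame) == 8 + 3 * 8 + PM3_CMD_DATA_SIZE,
              "old frame is expected to be 544 bytes");

// Number of packets needed to carry frame_len bytes in chunks of max_packet.
static size_t packets_needed(size_t frame_len, size_t max_packet) {
    return (frame_len + max_packet - 1) / max_packet;
}

// Size of the last packet of that sequence.
static size_t last_packet_len(size_t frame_len, size_t max_packet) {
    size_t rem = frame_len % max_packet;
    if (rem != 0)
        return rem;
    return frame_len ? max_packet : 0;
}
```

With a 128-byte packet limit, packets_needed(sizeof(OldFrame), 128) gives 5 and last_packet_len(sizeof(OldFrame), 128) gives 32, which matches the observed 128+128+128+128+32 split.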
New format
==========
Even if we made the payload part of the old format variable, we would still have a minimum of 32 bytes per frame, with needlessly large fields (64-bit cmd and args).
So we designed a new format from scratch:
For commands being sent to the Proxmark:
uint32_t magic;
uint16_t length : 15;
bool ng : 1;
uint16_t cmd;
uint8_t data[length];
uint16_t crc;
* magic: arbitrary magic ("PM3a") to help re-sync if needed
* length: length of the variable payload, 0 if none, max 512 (PM3_CMD_DATA_SIZE) for now.
* ng: flag to tell if the data is following the new format (ng) or the old one, see transition notes below
* cmd: as previously, on 16b as it's enough
* data: variable length payload
* crc: either an actual CRC (crc14a) or a Magic placeholder ("a3")
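As an illustration, the sketch below serializes a command frame by hand. It assumes little-endian fields, with the length in the low 15 bits and the ng flag in the top bit of the same 16-bit word, which is what the reference frames at the end of this document suggest (e.g. 504d3361008009016133 for an empty CMD_PING). The function name encode_ng_command and the constants NG_FLAG and NG_CMD_OVERHEAD are made up for this example, and it always writes the "a3" placeholder instead of a real CRC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PM3_CMD_DATA_SIZE 512
#define NG_FLAG           0x8000u  // hypothetical name: top bit of the length word
#define NG_CMD_OVERHEAD   10       // magic(4) + length/ng(2) + cmd(2) + crc(2)

// Hypothetical helper: writes one NG command frame into out.
// Returns the frame size in bytes, or 0 if the payload or buffer is invalid.
static size_t encode_ng_command(uint8_t *out, size_t out_size,
                                uint16_t cmd, const uint8_t *data, size_t len) {
    if (len > PM3_CMD_DATA_SIZE || out_size < len + NG_CMD_OVERHEAD)
        return 0;
    if (len > 0 && data == NULL)
        return 0;

    memcpy(out, "PM3a", 4);                        // magic
    uint16_t length_ng = (uint16_t)len | NG_FLAG;  // 15-bit length + ng flag
    out[4] = (uint8_t)(length_ng & 0xFF);          // little-endian (assumed)
    out[5] = (uint8_t)(length_ng >> 8);
    out[6] = (uint8_t)(cmd & 0xFF);
    out[7] = (uint8_t)(cmd >> 8);
    if (len > 0)
        memcpy(out + 8, data, len);                // variable length payload
    out[8 + len] = 'a';                            // placeholder instead of a CRC
    out[9 + len] = '3';
    return len + NG_CMD_OVERHEAD;
}
```

Called with cmd 0x0109 (the CMD_PING value visible in the reference frames) and no payload, this produces the 10 bytes 50 4d 33 61 00 80 09 01 61 33. With a 512-byte payload the frame is 522 bytes long.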
For responses from the Proxmark:
uint32_t magic;
uint16_t length : 15;
bool ng : 1;
int16_t status;
uint16_t cmd;
uint8_t data[length];
uint16_t crc;
* magic: arbitrary magic ("PM3b") to help re-sync if needed
* length: length of the variable payload, 0 if none, max 512 (PM3_CMD_DATA_SIZE) for now.
* ng: flag to tell if the data is following the new format (ng) or the old one, see transition notes below
* status: a field to send back the status of the command execution
* cmd: as previously, on 16b as it's enough
* data: variable length payload
* crc: either an actual CRC (crc14a) or a Magic placeholder ("b3")
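The response layout can be decoded the same way. The sketch below is only an illustration under assumptions: little-endian fields, the ng flag in the top bit of the length word, and the magic bytes "PM3b" that the response reference frames at the end of this document start with (504d3362...). The type parsed_response_t and the functions read_le16 and parse_response are hypothetical names, and the postamble is returned raw without being verified.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PM3_CMD_DATA_SIZE 512
#define NG_RESP_OVERHEAD  12  // magic(4) + length/ng(2) + status(2) + cmd(2) + crc(2)

// Hypothetical decoded view of one response frame.
typedef struct {
    uint16_t length;
    bool ng;
    int16_t status;
    uint16_t cmd;
    const uint8_t *data;  // points into the input buffer
    uint8_t crc[2];       // raw postamble bytes, not verified here
} parsed_response_t;

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

// Returns true if buf holds exactly one well-formed response frame.
static bool parse_response(const uint8_t *buf, size_t n, parsed_response_t *r) {
    if (n < NG_RESP_OVERHEAD || memcmp(buf, "PM3b", 4) != 0)
        return false;
    uint16_t length_ng = read_le16(buf + 4);
    r->length = length_ng & 0x7FFF;
    r->ng = (length_ng & 0x8000) != 0;
    if (r->length > PM3_CMD_DATA_SIZE || n != (size_t)r->length + NG_RESP_OVERHEAD)
        return false;
    r->status = (int16_t)read_le16(buf + 6);
    r->cmd = read_le16(buf + 8);
    r->data = buf + 10;
    r->crc[0] = buf[10 + r->length];
    r->crc[1] = buf[11 + r->length];
    return true;
}
```

Fed with the 12-byte "hw pingng" reply 504d33620080000009016233, this yields ng = true, length = 0, status = 0 and cmd = 0x0109. A MIX reply parses the same way, with ng = false and a length of at least 24.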
We used to send an anonymous ACK; now we reply with the corresponding command name and a status.
CRC is optional: on reception, the magic placeholder ("a3" in commands, "b3" in responses, as seen in the reference frames below) is accepted as is. Any other value is checked as a CRC.
By default CRC is used over USART and disabled over USB, in both directions.
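The reception rule can be sketched as follows. Several things here are assumptions and not taken from the actual code: that "crc14a" refers to the ISO/IEC 14443-3 type A CRC (CRC_A), that the CRC covers every byte of the frame before the postamble, and that the 16-bit postamble is compared as a little-endian value. The function names crc_a and postamble_ok are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// ISO/IEC 14443-3 type A CRC (CRC_A): reflected polynomial 0x8408,
// initial value 0x6363, no final XOR.
static uint16_t crc_a(const uint8_t *data, size_t len) {
    uint16_t crc = 0x6363;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = (uint8_t)(data[i] ^ (crc & 0xFF));
        b = (uint8_t)(b ^ (b << 4));
        crc = (uint16_t)((crc >> 8) ^ ((uint16_t)b << 8) ^ ((uint16_t)b << 3) ^ (b >> 4));
    }
    return crc;
}

// Hypothetical receiver-side check.
// body:        the frame without its 2-byte postamble (assumed CRC coverage)
// received:    the postamble, read as a little-endian uint16_t (assumed)
// placeholder: 0x3361 for "a3" (commands) or 0x3362 for "b3" (responses)
static bool postamble_ok(const uint8_t *body, size_t body_len,
                         uint16_t received, uint16_t placeholder) {
    if (received == placeholder)
        return true;  // sender did not compute a CRC
    return received == crc_a(body, body_len);
}
```

One consequence of this scheme is that a frame whose real CRC happens to equal the placeholder value is indistinguishable from a frame sent without CRC. The frame is accepted either way.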
Internal structures used to handle these packets are:
* PacketCommandNGPreamble
* PacketCommandNGPostamble
* PacketCommandNGRaw
* PacketResponseNGPreamble
* PacketResponseNGPostamble
* PacketResponseNGRaw
But they are abstracted from the developer view with a new API.
Transition
==========
Because cleaning all the code of the old format is a long transition, and because we don't want to break things when flashing the bootloader, the old frames are still supported alongside the new frames. The old structures are now called PacketCommandOLD and PacketResponseOLD, and they are also abstracted from the developer view by the new API.
New format API
==============
So the new API is a merge of the old and the new frame formats, to ensure a smooth transition.
The boolean "ng" indicates if the structure is storing data from the old or the new format.
Old format can come from either old 544b frames or mixed frames (variable length but still with oldargs).
After the full transition, we might remove the fields "oldarg" and "ng".
typedef struct {
    uint16_t cmd;
    uint16_t length;
    uint32_t magic;      //  NG
    uint16_t crc;        //  NG
    uint64_t oldarg[3];  //  OLD
    union {
        uint8_t  asBytes[PM3_CMD_DATA_SIZE];
        uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
    } data;
    bool ng;             // does it store NG data or OLD data?
} PACKED PacketCommandNG;
typedef struct {
    uint16_t cmd;
    uint16_t length;
    uint32_t magic;      //  NG
    int16_t  status;     //  NG
    uint16_t crc;        //  NG
    uint64_t oldarg[3];  //  OLD
    union {
        uint8_t  asBytes[PM3_CMD_DATA_SIZE];
        uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
    } data;
    bool ng;             // does it store NG data or OLD data?
} PACKED PacketResponseNG;
On the client, for sending frames:
----------------------------------
(client/comms.c)
*****************************************
void SendCommandNG(uint16_t cmd, uint8_t *data, size_t len);
void SendCommandOLD(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len);
void SendCommandMIX(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len);
*****************************************
So commands should make the transition from SendCommandOLD to SendCommandNG to benefit from smaller frames (with the armsrc handlers adjusted accordingly, of course).
SendCommandMIX is a transition function: it uses the same API as SendCommandOLD but benefits to some extent from variable length frames. It occupies at least 24 bytes of data for the oldargs, so real data is limited to PM3_CMD_DATA_SIZE - 24 (defined as PM3_CMD_DATA_SIZE_MIX). Besides the size limitation, the receiving handler doesn't know whether this was an OLD frame or a MIX frame: it gets its oldargs and data as usual.
Warning: it makes sense to move from SendCommandOLD to SendCommandMIX only for *commands with small payloads*.
* otherwise both have about the same size
* SendCommandMIX has a smaller payload (PM3_CMD_DATA_SIZE_MIX < PM3_CMD_DATA_SIZE) so it's risky to blindly move from OLD to MIX if there is a large payload.
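To make the size trade-off concrete, here is a small sketch of the command frame sizes. The 10 bytes of NG overhead come from the frame layout above (magic, length/ng, cmd, crc). The macro NG_CMD_OVERHEAD and the helper functions are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define PM3_CMD_DATA_SIZE      512
#define PM3_CMD_DATA_SIZE_MIX  (PM3_CMD_DATA_SIZE - 24)  // room left after 3 x uint64_t oldargs
#define OLD_FRAME_SIZE         544                       // fixed, whatever the payload
#define NG_CMD_OVERHEAD        10                        // magic(4) + length/ng(2) + cmd(2) + crc(2)

// Size on the wire of an NG command frame carrying len payload bytes.
static size_t ng_cmd_frame_size(size_t len) {
    return NG_CMD_OVERHEAD + len;
}

// Size on the wire of a MIX command frame: the oldargs travel inside the payload.
static size_t mix_cmd_frame_size(size_t len) {
    return NG_CMD_OVERHEAD + 24 + len;
}

// A payload can only move from OLD to MIX if it still fits after the oldargs.
static bool fits_in_mix(size_t len) {
    return len <= PM3_CMD_DATA_SIZE_MIX;
}
```

An empty MIX command is 34 bytes and an empty NG command is 10 bytes, against 544 bytes for any OLD frame. At the MIX limit of 488 payload bytes the frame is 522 bytes, so the gain over OLD is marginal, and a 489-byte payload does not fit at all.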
Internally these functions prepare the new or old frames and call uart_communication which calls uart_send.
On the Proxmark3, for receiving frames:
---------------------------------------
(armsrc/appmain.c)
*****************************************
PacketCommandNG
*****************************************
AppMain calls receive_ng (common/cmd.c), which calls usb_read_ng/usart_read_ng to get a PacketCommandNG, then passes it to PacketReceived.
(Whether it's an old frame or a new frame, check the PacketCommandNG "ng" field to know if there are oldargs.)
PacketReceived is the command broker.
Old handlers will still find their data in the "oldarg" field.
On the Proxmark3, for sending frames:
-------------------------------------
(common/cmd.c)
*****************************************
int16_t reply_ng(uint16_t cmd, int16_t status, uint8_t *data, size_t len)
int16_t reply_old(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len)
int16_t reply_mix(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len)
*****************************************
So replies should make the transition from reply_old to reply_ng to benefit from smaller frames (with client reception adjusted accordingly, of course).
reply_mix is a transition function: it uses the same API as reply_old but benefits to some extent from variable length frames. It occupies at least 24 bytes of data for the oldargs, so real data is limited to PM3_CMD_DATA_SIZE - 24. Besides the size limitation, the client command doesn't know whether this was an OLD frame or a MIX frame: it gets its oldargs and data as usual.
Example with CMD_PING, which supports both styles (from client CmdPing or CmdPingNG) and replies with the new frame format when it receives the new command format:
if (packet->ng) {
    reply_ng(CMD_PING, PM3_SUCCESS, packet->data.asBytes, packet->length);
} else {
    // reply_old(CMD_ACK, reply_via_fpc, 0, 0, 0, 0);
    reply_mix(CMD_ACK, reply_via_fpc, 0, 0, 0, 0);
}
On the client, for receiving frames:
------------------------------------
(client/comms.c)
*****************************************
WaitForResponseTimeout -> PacketResponseNG
*****************************************
uart_communication calls uart_receive and creates a PacketResponseNG, then passes it to PacketResponseReceived.
PacketResponseReceived handles it immediately (prints it) or stores it with storeReply.
Commands call WaitForResponseTimeoutW (or dl_it), which uses getReply to fetch responses.
API transition
==============
In short, to move from one format to the other, we need for each command:
* (client TX) SendCommandOLD -> SendCommandNG (with all stuff in ad-hoc structs in "data" field)
* (pm3 RX) PacketCommandNG parsing, from "oldarg" to only the "data" field
* (pm3 TX) reply_old -> reply_ng (with all stuff in ad-hoc structs in "data" field)
* (client RX) PacketResponseNG parsing, from "oldarg" to only the "data" field
Meanwhile, a fast transition to MIX frames can be done with:
* (client TX) SendCommandOLD -> SendCommandMIX (but check the limited data size)
* (pm3 TX) reply_old -> reply_mix (but check the limited data size)
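As an illustration of "all stuff in ad-hoc structs in the data field", the sketch below shows how parameters that an OLD command would have passed through arg0/arg1 could be carried as an NG payload. Everything in it is hypothetical: the struct example_params_t, its fields, and the two helpers. The buffer returned by example_pack is what a client would hand to SendCommandNG as data and len, and example_unpack is what a handler would apply to packet->data.asBytes and packet->length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical parameters of an imaginary command. Only byte-sized members are
// used so that no padding can appear; the project's own structs are marked PACKED.
typedef struct {
    uint8_t blockno;
    uint8_t keytype;
    uint8_t key[6];
} example_params_t;

static_assert(sizeof(example_params_t) == 8, "payload layout must not contain padding");

// Client TX side: serialize the parameters into a payload buffer.
// Returns the payload length, or 0 if the buffer is too small.
static size_t example_pack(uint8_t *buf, size_t buf_size,
                           uint8_t blockno, uint8_t keytype, const uint8_t key[6]) {
    example_params_t p;
    if (buf_size < sizeof(p))
        return 0;
    p.blockno = blockno;
    p.keytype = keytype;
    memcpy(p.key, key, sizeof(p.key));
    memcpy(buf, &p, sizeof(p));
    return sizeof(p);
}

// pm3 RX side: recover the parameters from the payload, rejecting any
// payload whose length does not match the expected struct.
static bool example_unpack(const uint8_t *data, size_t length, example_params_t *out) {
    if (length != sizeof(*out))
        return false;
    memcpy(out, data, sizeof(*out));
    return true;
}
```

Checking the payload length on the receiving side replaces the implicit contract of the oldargs: a handler that only reads the data field has no other way to detect a mismatched client.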
Bootrom
=======
Bootrom code will still use the old frame format to remain compatible with other repos supporting the old format, and because it would hardly gain anything from the new format:
* almost all frames convey 512b of payload, so the difference in overhead is negligible
* flashing over usart sounds risky and would be terribly slow anyway (115200 baud vs. 7M baud).
On the Proxmark3, for receiving frames:
---------------------------------------
(bootrom/bootrom.c)
usb_read (common/usb_cdc.c) -> UsbPacketReceived (bootrom.c)
-> CMD_DEVICE_INFO / CMD_START_FLASH / CMD_FINISH_WRITE / CMD_HARDWARE_RESET / CMD_SETUP_WRITE
also usb_enable, usb_disable (common/usb_cdc.c)
On the Proxmark3, for sending frames:
-------------------------------------
(bootrom/bootrom.c)
reply_old (bootrom.c) -> usb_write (common/usb_cdc.c)
also usb_enable, usb_disable (common/usb_cdc.c)
On the client, for sending frames:
----------------------------------
Therefore, the flasher client (client/flasher.c + client/flash.c) must still use these old frames.
It uses a few commands in common with current client code:
OpenProxmark
CloseProxmark
SendCommandOLD
-> CMD_DEVICE_INFO / CMD_START_FLASH / CMD_FINISH_WRITE / CMD_HARDWARE_RESET
On the client, for receiving frames:
------------------------------------
As usual, old frames are still supported
WaitForResponseTimeout -> PacketResponseNG
New usart RX FIFO
=================
USART code has been rewritten to cope with unknown size packets.
* using USART full duplex with double DMA buffer on RX & TX
* using internal FIFO for RX
usart_init:
* USART stays enabled from usart_init() onwards, so there is no need to touch it in the RX/TX routines: pUS1->US_PTCR = AT91C_PDC_RXTEN | AT91C_PDC_TXTEN
usart_writebuffer_sync:
* still using DMA but accepts arbitrary packet sizes
* removed unneeded memcpy
* waits for the DMA buffer to be processed before returning, therefore "sync"
* we could make an async version but caller must be sure the DMA buffer remains available!
* as it's sync, no need for next DMA buffer
usart_read_ng:
* user tells expected packet length
* relies on usart_rxdata_available to know if there is data in our FIFO buffer
* fetches data from our FIFO
* dynamic number of tries (depending on FPC speed) to wait for the requested data
usart_rxdata_available:
* polls usart_fill_rxfifo
* returns number of bytes available in our FIFO
usart_fill_rxfifo:
* if next DMA buffer got moved to current buffer (US_RNCR == 0), it means one DMA buffer is full
* transfer current DMA buffer data to our FIFO
* swap to the other DMA buffer
* provide the emptied DMA buffer as next DMA buffer
* if current DMA buffer is partially filled
* transfer available data to our FIFO
* remember how many bytes we already copied to our FIFO
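The software side of this scheme can be sketched without any hardware access. The code below is not the firmware implementation: it is a plain ring-buffer FIFO plus a helper that copies only the not-yet-copied part of a DMA buffer, as described in the "partially filled" case. All names (rx_fifo_t, fifo_push, fifo_pop, fifo_available, drain_dma) and the FIFO capacity are hypothetical, and the DMA buffer is modelled as an ordinary array with a fill level.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_FIFO_SIZE 1024  // hypothetical capacity

typedef struct {
    uint8_t buf[RX_FIFO_SIZE];
    size_t head;   // next write position
    size_t tail;   // next read position
    size_t count;  // bytes currently stored
} rx_fifo_t;

// Counterpart of "returns number of bytes available in our FIFO".
static size_t fifo_available(const rx_fifo_t *f) {
    return f->count;
}

// Appends up to n bytes; returns how many were actually stored.
static size_t fifo_push(rx_fifo_t *f, const uint8_t *src, size_t n) {
    size_t stored = 0;
    while (stored < n && f->count < RX_FIFO_SIZE) {
        f->buf[f->head] = src[stored++];
        f->head = (f->head + 1) % RX_FIFO_SIZE;
        f->count++;
    }
    return stored;
}

// Removes up to n bytes in arrival order; returns how many were fetched.
static size_t fifo_pop(rx_fifo_t *f, uint8_t *dst, size_t n) {
    size_t fetched = 0;
    while (fetched < n && f->count > 0) {
        dst[fetched++] = f->buf[f->tail];
        f->tail = (f->tail + 1) % RX_FIFO_SIZE;
        f->count--;
    }
    return fetched;
}

// Copies the bytes of a DMA buffer that arrived since the last call.
// filled is the current fill level, *already_copied remembers earlier copies.
// When the buffer is swapped, the caller resets *already_copied to 0.
static void drain_dma(rx_fifo_t *f, const uint8_t *dma_buf, size_t filled,
                      size_t *already_copied) {
    if (filled > *already_copied)
        *already_copied += fifo_push(f, dma_buf + *already_copied,
                                     filled - *already_copied);
}
```

Calling drain_dma twice on the same buffer, first with a fill level of 4 and then of 7, stores 7 bytes in total and none of them twice, which is the point of remembering the copied count.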
Timings
=======
Reference (before new format):
linux usb: #db# USB Transfer Speed PM3 -> Client = 545109 Bytes/s
On a Windows VM:
proxspace usb: #db# USB Transfer Speed PM3 -> Client = 233998 Bytes/s
(common/usart.h)
USART_BAUD_RATE defined there
9600: #db# USB Transfer Speed PM3 -> Client = 934 Bytes/s
115200: #db# USB Transfer Speed PM3 -> Client = 11137 Bytes/s
460800: #db# USB Transfer Speed PM3 -> Client = 43119 Bytes/s
linux usb: #db# USB Transfer Speed PM3 -> Client = 666624 Bytes/s (equiv. to ~7Mbaud)
(uart/uart_posix.c)
// Receiving from USART need more than 30ms as we used on USB
// else we get errors about "Received packet frame ... too short"
// Now we're using 100ms
// FTDI 9600 hw status -> we need 20ms
// FTDI 115200 hw status -> we need 50ms
// FTDI 460800 hw status -> we need 30ms
struct timeval timeout = {
    .tv_sec  = 0,       //       0 second
    .tv_usec = 100000   // 100 000 micro seconds
};
Some communication delay is automatically added to the WaitForResponseTimeout & dl_it timeouts.
This applies only when using FPC: timeout = 2 * empirically measured delay (FTDI cable).
Empirically measured delay (FTDI cable) with "hw pingng 512" :
usb -> 6.. 32ms
460800 -> 40.. 70ms
9600 -> 1100..1150ms
(client/comms.c)
static size_t communication_delay(void) {
    if (conn.send_via_fpc_usart)  // needed also for Windows USB USART??
        return 2 * (12000000 / uart_speed);
    return 100;
}
Because some commands send a lot of frames before finishing (hw status, lf read,...),
the WaitForResponseTimeout & dl_it timeouts are reset at each packet reception,
so the timeout is actually counted from the latest received packet
and no longer depends on the number of received packets.
The pm3 RX usart maxtry also needed tuning:
(common/usart.c)
uint32_t usart_read_ng(uint8_t *data, size_t len) {
// Empirical max try observed: 3000000 / USART_BAUD_RATE
// Let's take 10x
uint32_t maxtry = 10 * ( 3000000 / USART_BAUD_RATE );
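To see what the two formulas above give at the baud rates used in this document, here is a small worked sketch. The function names are hypothetical, the baud rate is passed as a parameter instead of coming from uart_speed or USART_BAUD_RATE, and the millisecond unit of the delay is an assumption based on how the value is added to the WaitForResponseTimeout timeouts.

```c
#include <stddef.h>
#include <stdint.h>

// Same arithmetic as communication_delay() in the FPC case (integer division).
// The result is assumed to be in milliseconds.
static size_t fpc_communication_delay(uint32_t uart_speed) {
    return 2 * (12000000 / uart_speed);
}

// Same arithmetic as the maxtry computation in usart_read_ng():
// ten times the empirically observed maximum number of tries.
static uint32_t usart_maxtry(uint32_t baud_rate) {
    return 10 * (3000000 / baud_rate);
}
```

At 9600 baud this gives a delay of 2500 and a maxtry of 3120. At 115200 baud the values are 208 and 260, and at 460800 baud they are 52 and 60. Both formulas scale inversely with the baud rate, with the integer division rounding down.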
DbpStringEx using reply_old:
time client/proxmark3 -p /dev/ttyACM0 -c "hw status"
2.52s
time client/proxmark3 -p /dev/ttyUSB0 -b 460800 -c "hw status"
3.03s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "hw status"
4.88s
DbpStringEx using reply_old:
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "hw status"
26.5s
DbpStringEx using reply_mix:
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "hw status"
7.08s
DbpStringEx using reply_ng:
time client/proxmark3 -p /dev/ttyACM0 -c "hw status"
2.10s
time client/proxmark3 -p /dev/ttyUSB0 -b 460800 -c "hw status"
2.22s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "hw status"
2.43s
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "hw status"
5.75s
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "lf read"
50.38s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "lf read"
6.28s
time client/proxmark3 -p /dev/ttyACM0 -c "mem save f foo_usb"
1.48s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "mem save f foo_fpc"
25.34s
Sending multiple commands can still be slow because the client regularly waits for incoming RX frames and the timings are quite conservative because of BT (see struct timeval timeout in uart_posix.c, now at 200ms). When you know there is no response to wait for before the next command, you can use the same trick as in the flasher:
// fast push mode
conn.block_after_ACK = true;
some loop {
    if (sending_last_command)
        // Disable fast mode
        conn.block_after_ACK = false;
    SendCommandOLD / SendCommandMIX
    if (!WaitForResponseTimeout(CMD_ACK, &resp, some_timeout)) {
        ....
        conn.block_after_ACK = false;
        return PM3_ETIMEOUT;
    }
}
return PM3_SUCCESS;
Or if it's too complex to determine when we're sending the last command:
// fast push mode
conn.block_after_ACK = true;
some loop {
    SendCommandOLD / SendCommandMIX
    if (!WaitForResponseTimeout(CMD_ACK, &resp, some_timeout)) {
        ....
        conn.block_after_ACK = false;
        return PM3_ETIMEOUT;
    }
}
// Disable fast mode and send a dummy command to make it effective
conn.block_after_ACK = false;
SendCommandMIX(CMD_PING, 0, 0, 0, NULL, 0);
WaitForResponseTimeout(CMD_ACK, NULL, 1000);
return PM3_SUCCESS;
Reference frames
================
To help with debugging...
On linux USB
* sent packets can be 544
* received packets are max 128, so 544 = 128+128+128+128+32
On linux UART (FTDI)
* sent packets are max 256, so 544 = 256+256+32
* received packets are max 512, so 544 = 512+32
Initial connection:
TestProxmark: SendCommandOLD(CMD_PING, 0, 0, 0, NULL, 0);
->544=0901000000000000000000000000000000000000000000000000000000000000 -> OLD
CMD_PING: reply_mix(CMD_ACK, reply_via_fpc, 0, 0, 0, 0);
<-36=504d336218000000ff0000000000000000000000000000000000000000000000 <- MIX
main_loop pm3_version: SendCommandOLD(CMD_VERSION, 0, 0, 0, NULL, 0);
->544=0701000000000000000000000000000000000000000000000000000000000000 -> OLD
SendVersion: reply_old(CMD_ACK,...);
<-128=ff000000000000004f0a0b270000000009710300000000000000000000000000 <- OLD
<-128=6572696d656e74616c5f7661726c656e2f33646431663163372d64697274792d
<-128=484620696d616765206275696c7420666f7220327333307671313030206f6e20
<-128=0000000000000000000000000000000000000000000000000000000000000000
<-32=0000000000000000000000000000000000000000000000000000000000000000
"hw ping"
CmdPing SendCommandMIX(CMD_PING, 0, 0, 0, NULL, 0);
->34=504d336118000901000000000000000000000000000000000000000000000000 -> MIX
CMD_PING reply_mix(CMD_ACK, reply_via_fpc, 0, 0, 0, 0);
<-36=504d336218000000ff0000000000000000000000000000000000000000000000 <- MIX
"hw pingng"
CmdPingNG SendCommandNG(CMD_PING, data, len);
->10=504d3361008009016133 -> NG
CMD_PING reply_ng(CMD_PING, PM3_SUCCESS, packet->data.asBytes, packet->length);
<-12=504d33620080000009016233 <- NG
"hw pingng 512"
CmdPingNG SendCommandNG(CMD_PING, data, len);
->522=504d336100820901000102030405060708090a0b0c0d0e0f1011121314151617 -> NG
CMD_PING reply_ng(CMD_PING, PM3_SUCCESS, packet->data.asBytes, packet->length);
<-128=504d3362008200000901000102030405060708090a0b0c0d0e0f101112131415 <- NG
<-128=767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495
<-128=f6f7f8f9fafbfcfdfeff000102030405060708090a0b0c0d0e0f101112131415
<-128=767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495
<-12=f6f7f8f9fafbfcfdfeff6233