fig. 05
security writeup · feb 02, 2026 · 4 min read

whatsapp trusts the sender. it shouldn't.

a thirteen-line spoof. four hundred and seventy-two lines of go around it. one twenty-four megabyte binary.

whatsapp's reply-quote ui is a gentleman's agreement.

when you reply to a message, your client builds a ContextInfo object containing the original message id, who sent it, and what it said. your client then sends that to whatsapp's servers, who pass it to the recipient, who renders the familiar little quote bubble above your reply.

at no point does anyone check that the quoted message ever existed.

what the spoof actually does

whats-spoofing is one go file (main.go, 472 lines), one html page, and a tiny http server on 127.0.0.1:8080. you scan a qr code with your real whatsapp account, which pairs the tool as a linked device the same way whatsapp web does. then you fill in a form with four fields:

  • the chat id you want to spoof in
  • the user id of the person you want to attribute the fake quote to
  • the text of the fake quoted message
  • your actual reply

the function sendSpoofedReplyMessage (line 403) builds a waE2E.ExtendedTextMessage with a ContextInfo whose Participant is the impersonated user's jid and whose QuotedMessage.Conversation is whatever string you passed in. then it sends. the recipient sees:

person you trust said:
  "hand me the keys, i'll do it tomorrow"

[your real reply]

except they didn't say it. you said they said it.
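
the shape of that construction can be sketched without the real library. the types below are stand-ins that mirror the fields named above, not the actual waE2E protobuf structs. the helper name buildSpoofedReply and the StanzaID field are assumptions for illustration, and the jid is a made-up example.

```go
package main

import "fmt"

// Stand-in types mirroring the shape described in the text.
// They are NOT the real waE2E structs; names are illustrative.
type Message struct {
	Conversation string
	ExtendedText *ExtendedTextMessage
}

type ExtendedTextMessage struct {
	Text        string
	ContextInfo *ContextInfo
}

type ContextInfo struct {
	StanzaID      string   // id of the message supposedly being quoted (assumed field)
	Participant   string   // jid the quote is attributed to
	QuotedMessage *Message // text the quote bubble will display
}

// buildSpoofedReply (hypothetical name) fills every quote field from caller
// input. Nothing here looks up a real message.
func buildSpoofedReply(fakeID, impersonatedJID, fakeQuote, reply string) *Message {
	return &Message{
		ExtendedText: &ExtendedTextMessage{
			Text: reply,
			ContextInfo: &ContextInfo{
				StanzaID:      fakeID,
				Participant:   impersonatedJID,
				QuotedMessage: &Message{Conversation: fakeQuote},
			},
		},
	}
}

func main() {
	m := buildSpoofedReply(
		"FAKE-ID-0001",
		"15550100@s.whatsapp.net",
		"hand me the keys, i'll do it tomorrow",
		"sure, thanks",
	)
	ci := m.ExtendedText.ContextInfo
	fmt.Printf("%s said: %q\n", ci.Participant, ci.QuotedMessage.Conversation)
	fmt.Printf("reply: %s\n", m.ExtendedText.Text)
}
```

every field the recipient's quote bubble depends on is an ordinary string argument. that is the whole exploit.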

what it does not do

it does not break end-to-end encryption. it does not bypass the signal protocol. it does not intercept anything. it does not even spoof your sender — your account, your phone number, your real signed device key, all genuine.

the trick is that the ContextInfo is constructed before encryption, on the sending client. the sending client is trusted to honestly report what it's quoting. whatsmeow (the go reimplementation of whatsapp's multi-device protocol) just sends what you tell it to send. libsignal, doing its job correctly, encrypts the lie end-to-end.
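
a small sketch of why encryption does not help here. it uses aes-gcm from go's standard library as a stand-in for the real signal session (an assumption made so the example is self-contained, it is not what whatsapp or libsignal actually run). the helper names seal and open are hypothetical. the point is only that an authenticated channel verifies who sealed the bytes and that they arrived unmodified, not whether the bytes are true.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24, or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts and authenticates plaintext, prepending a random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open checks the authentication tag and decrypts.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed message too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	// The forged quote is just plaintext, written before encryption.
	forged := []byte(`someone you trust said: "hand me the keys, i'll do it tomorrow"`)
	sealed, err := seal(key, forged)
	if err != nil {
		panic(err)
	}
	got, err := open(key, sealed)
	fmt.Println("authenticated:", err == nil)
	fmt.Println("delivered intact:", string(got) == string(forged))

	// Flipping one bit in transit IS caught. Lying before sealing is not.
	sealed[len(sealed)-1] ^= 1
	_, err = open(key, sealed)
	fmt.Println("tampering detected:", err != nil)
}
```

all three lines print true. the channel did everything it promised.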

the qr endpoint

the most interesting security choice in the project is not the spoof. it's that the qr code for whatsapp authentication is rendered by sending the pairing data to api.qrserver.com, a third-party qr-image api, over the open internet.

the project is called whats-spoofing. it is, nominally, a security tool. it sends your whatsapp pairing data to a random api so it can be drawn as a png.

i would like it on record that i did not write that line of code in a state of grace.
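
to make the leak concrete, here is a sketch of what such a request carries. the url path and the size and data parameter names are assumptions about the third-party api, not copied from main.go, and both helper names are hypothetical. the pairing string is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
)

// qrImageURL (hypothetical helper) builds a request to a hosted QR renderer.
// Path and parameter names are assumed for illustration.
func qrImageURL(pairingData string) string {
	q := url.Values{}
	q.Set("size", "256x256")
	q.Set("data", pairingData)
	return "https://api.qrserver.com/v1/create-qr-code/?" + q.Encode()
}

// dataSeenByRenderer recovers what the remote service receives in the query.
func dataSeenByRenderer(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return u.Query().Get("data"), nil
}

func main() {
	pairing := "placeholder-pairing-string" // stands in for the real pairing data
	u := qrImageURL(pairing)
	seen, err := dataSeenByRenderer(u)
	if err != nil {
		panic(err)
	}
	fmt.Println("request:", u)
	fmt.Println("renderer sees full pairing data:", seen == pairing)
}
```

whatever goes into the qr code goes into the query string, in full, to someone else's server. drawing the code locally with any qr library would keep it on the machine.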

the dead-code download handler

there is a download function at line 445 that handles every kind of incoming media — images, audio, video, documents, stickers, contact cards. it carefully extracts each one, pulls the bytes, and then logs that they exist. nothing else happens to them. no file write. no database insert. no upload. they are downloaded into a buffer, examined, and discarded.

i do not remember why i wrote that. i remember being very thorough about it.

the binary

the entire spoof, in source, is roughly thirteen lines of message construction. the rest is wiring: the http server, the sqlite session store, the event handler, the qr endpoint, the form. the compiled whats-spoofing.exe is twenty-four megabytes. the ratio of "lines doing the actual exploit" to "bytes of statically-linked go binary" is roughly thirteen to twenty-four million.

go is a productive language.

what i learned

  • if a protocol lets the sender build the metadata, the sender will eventually build whatever metadata they want
  • e2e encryption is a guarantee about the channel, not about the content. the channel can faithfully deliver a lie.
  • when your security tool sends pairing data to a third-party png renderer, you have not made a security tool
  • if you ever receive a screenshot of "what someone really said" on whatsapp, the gentleman's agreement does not apply
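
for contrast, the check that nobody performs could look something like the sketch below: a receiving client comparing the quote against messages it actually holds. every type and name here is hypothetical, and this is not how any real client works. a real client may also be missing history (deleted messages, a freshly linked device), so a check like this could at best flag a quote as unverified rather than reject it.

```go
package main

import "fmt"

// StoredMessage is a hypothetical record of a message the client really received.
type StoredMessage struct {
	Sender string
	Text   string
}

// Quote is the sender-supplied quote metadata (a stand-in for ContextInfo).
type Quote struct {
	StanzaID    string
	Participant string
	Text        string
}

// quoteMatchesHistory reports whether the quoted id exists locally and its
// stored sender and text agree with what the quote claims.
func quoteMatchesHistory(history map[string]StoredMessage, q Quote) bool {
	m, ok := history[q.StanzaID]
	return ok && m.Sender == q.Participant && m.Text == q.Text
}

func main() {
	history := map[string]StoredMessage{
		"MSG1": {Sender: "alice@s.whatsapp.net", Text: "see you at noon"},
	}
	genuine := Quote{
		StanzaID:    "MSG1",
		Participant: "alice@s.whatsapp.net",
		Text:        "see you at noon",
	}
	forged := Quote{
		StanzaID:    "MSG1",
		Participant: "alice@s.whatsapp.net",
		Text:        "hand me the keys, i'll do it tomorrow",
	}
	fmt.Println(quoteMatchesHistory(history, genuine)) // true
	fmt.Println(quoteMatchesHistory(history, forged))  // false
}
```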

the source is on github. people have asked me to take it down. i have considered it.