(Draft) Contact tracing and privacy protection, with Nicky Case
Added 2020-05-13 21:48:02 +0000 UTC
Hey Everyone,
A friend of mine, Nicky Case, recently put out several excellent posts about COVID-19, including one about how digital contact tracing apps work. Or more specifically, how they can work without tracking people's locations. This is right in the intersection of "very counterintuitive" and "very important", so I wanted to do what I can to help amplify the message, and made this video adaptation of Nicky's post. If all looks good, I plan to put it out tomorrow. Let me know what you think!
-Grant
A translation from Swedish, from the newspaper NyTeknik:
Swedish bluetooth inventor denounces contact tracking with bluetooth
There are many disadvantages to this method. One of them is that bluetooth has never been developed to make any accurate distance estimates. There is thus a risk that the system will produce false positive or false negative results.
Now Sven Mattisson, one of the men behind the original bluetooth standard, says he shares that assessment.
- In the way that many BLE devices (BLE is the abbreviation for bluetooth low energy) are designed, the RSSI value (RSSI stands for received signal strength indication and is a measure of the strength of a received radio signal) can be quite coarse and not well calibrated, says Mattisson to the site The Intercept.
- And in addition to this RSSI uncertainty, there will be variations in path losses. If there is an obstacle, such as a human body, then there may be an attenuation of 70 dB at a distance of 1 meter, while in an open area the same attenuation may correspond to 10 meters. Exact numbers vary with objects, signal reflections and so on, says Sven Mattisson.
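To make the point about path loss concrete, here is a minimal sketch of how a distance might be estimated from RSSI with the textbook log-distance path-loss model. The reference power at 1 m (-60 dBm), the path-loss exponent (2.0) and the 20 dB of extra body attenuation are illustrative assumptions, not values from the article or from any real phone.

```python
def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-60.0, path_loss_exponent=2.0):
    """Invert the log-distance model: rssi = rssi_at_1m - 10 * n * log10(d)."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


# Two readings taken at the same true distance of 1 m.
clear_reading = -60.0                    # nothing in the way (assumed reference value)
blocked_reading = clear_reading - 20.0   # hypothetical 20 dB absorbed by an obstacle

print(estimate_distance_m(clear_reading))    # 1.0
print(estimate_distance_m(blocked_reading))  # 10.0
```

Under these assumptions, the same phone at the same distance is read as being ten times further away once an obstacle absorbs 20 dB, which is the kind of ambiguity Mattisson describes.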
One way to get around this is to use additional positioning techniques, such as triangulation. But then, on the other hand, the user's privacy risks being compromised, which is one of the reasons why the choice fell on bluetooth from the beginning.
Gregor Shapiro
2020-05-15 05:24:22 +0000 UTC
oh CONTACT v contRact. Gotcha.
2020-05-14 20:34:37 +0000 UTC
Amazing, I had no idea. It's a very efficient method!
Jorge Sinde
2020-05-14 19:59:30 +0000 UTC
I think the video nicely explains how the system works, but I think it gives too rosy a view of contact tracing, and incorrectly presents it as having no possible privacy concerns. There are ways to exploit contact tracing protocols to determine if someone is sick and who infected whom. This website (https://tracing-risks.com/) gives some obvious attacks.
That's not to say that automated contact tracing is useless and something we shouldn't do (and this approach is definitely better than collecting a bunch of GPS data like the Norwegian government is currently doing), but it's worth also talking about the potential drawbacks, and not present it as a flawless solution.
Arne Tobias Malkenes Ødegaard
2020-05-14 18:16:51 +0000 UTC
Hi Raphael,
Thank you for the thoughtful comment. To the concerns on privacy via IP addresses, if that's a real worry people have, would it not work to simply use a VPN, given that the content of the uploaded/downloaded message carries nothing identifying?
As to the size of the data store, the way I understand it is that all your codes come from a pseudorandom generating function, so all you really need to upload/download is the seed to that pseudorandom sequence, not every code that you've been broadcasting for the last 14 days.
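A rough sketch of that idea: if every broadcast code is derived deterministically from a per-day seed, then uploading the seed is enough for anyone to re-derive the codes. This is not the construction from the Apple/Google specification; HMAC-SHA256 is swapped in purely for illustration, and the names `codes_for_day` and `INTERVALS_PER_DAY` as well as the 10-minute interval are assumptions.

```python
import hashlib
import hmac
import os

INTERVALS_PER_DAY = 144  # assumption: one new code every 10 minutes


def codes_for_day(daily_seed):
    """Derive one 16-byte code per interval from a single secret daily seed."""
    return [
        hmac.new(daily_seed, i.to_bytes(2, "big"), hashlib.sha256).digest()[:16]
        for i in range(INTERVALS_PER_DAY)
    ]


daily_seed = os.urandom(16)            # stays on the phone unless the owner tests positive
broadcast = codes_for_day(daily_seed)  # what the phone sends over Bluetooth during the day

# Anyone who later downloads the seed can regenerate exactly the same codes.
assert codes_for_day(daily_seed) == broadcast
print(len(daily_seed), "bytes uploaded instead of", 16 * len(broadcast))
```

With these made-up parameters, 14 days of history would be 14 seeds rather than about two thousand individual codes.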
As to reliability, I certainly agree that it's not 100% accurate, but for that matter neither is traditional contact tracing with interviews. The real goal is to get R to stay consistently below 1; eliminating all false positives and false negatives will probably never happen.
As to false positives through walls, there are multiple levels of proximity sensing with Bluetooth, though to be honest I can't speak confidently about how the Apple/Google system will implement things. For my part, I don't think false positives should be much of a concern at the moment, given that having everyone shelter in place as we are now effectively puts an upper bound on the effect of too many false positives.
I will post the video, but you should know I definitely hear the concern you express.
-Grant
3blue1brown
2020-05-14 14:55:54 +0000 UTC
The exposure notification can be tuned to only go off for a sufficient amount of exposure. For example, if 6 codes are recognized, meaning 30+ minutes of contact. But I'm not sure I understand the worry of False positives, because the current practice of having everyone shelter in place is effectively demonstrating what 100% false positives would look like.
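As a sketch of what such tuning could look like, here is a threshold check that only fires once enough distinct codes from diagnosed users have been matched. The function name, the default of 6 matches, and the 5 minutes per code (implied by "6 codes, 30+ minutes") are assumptions for illustration, not the actual Apple/Google parameters.

```python
MINUTES_PER_CODE = 5  # assumption implied by "6 codes, meaning 30+ minutes"


def should_notify(matched_codes, min_matches=6):
    """Notify only if enough *distinct* matched codes were recorded."""
    return len(set(matched_codes)) >= min_matches


brief_pass = ["code-a"]                        # walked past someone once
long_meeting = [f"code-{i}" for i in range(7)] # sat near someone for a while

print(should_notify(brief_pass))                 # False
print(should_notify(long_meeting))               # True
print(len(set(long_meeting)) * MINUTES_PER_CODE) # 35
```

Raising `min_matches` trades fewer false positives for more missed short contacts.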
3blue1brown
2020-05-14 14:45:17 +0000 UTC
Thanks! This page gives details on providing subtitles: https://support.google.com/youtube/answer/6054623?hl=en
3blue1brown
2020-05-14 14:35:47 +0000 UTC
Hi Nikita,
Thanks for the feedback. The imagery comes directly from Nicky, who moreover helped me in putting together the video. I'm not sure about which post you're referring to, but for the one this is based on, the pink definitely means "infected, contagious, no symptoms yet". If there are other posts with competing conventions, I could see it being confusing, but within the confines of this video, it seems consistent, no?
3blue1brown
2020-05-14 14:34:59 +0000 UTC
When I took public transport to work I first took a bus and then a train, then I worked in an open plan office and finally took a train and a bus home. In fourteen days that's a lot of people even if you only take a sample every 5 minutes (IIRC Apple/Google ping every 10 minutes). The R values for COVID-19 are about 3 so I am likely to infect about 3 people if I get sick but hundreds will receive an exposure notification. That's a lot of false positives.
2020-05-14 11:28:28 +0000 UTC
Wow, it's so simple! (once explained clearly).
Daniel Armesto
2020-05-14 10:31:25 +0000 UTC
I did post this before but I must have accidentally deleted it. At least I couldn’t find it anymore… So sorry if this appears twice now.
Well this sounds too good to be true and I’m sorry to say, it kind of is. The concept seems sound in principle but fails on multiple fronts in practice.
## Privacy
Having a centralized store of tokens of phones of infected people means phones have to upload and download from that store. This means that the server knows the IP addresses of everyone. And since IP addresses are geo-locatable at least fairly accurately, the no-geolocation promise goes out the window.
Also, since push and pull are different operations, the server will specifically know IP addresses of infected people.
This data store will grow fairly quickly, I'd think, so we don't want to be downloading the entire catalog every time, which means the app will likely use some delta algorithm to download only the tokens it hasn't yet seen. That would mean sending the date of the last access (or some other identifier that allows the server to decide which tokens to send), which also makes the user a little bit less anonymous even if they have a new IP address.
## Tracing & Reliability
Bluetooth antennae are very uneven in practice. Their signals can travel far (and even through walls) in some directions and not very far in others.
In practice that means that the ways I can be exposed to a Sars-CoV-2 carrier and the ways I can be exposed to the bluetooth signal of said carrier’s phone just don’t overlap all that much.
Someone hastily walking by me and speaking into their phone in Spanish, English or another language with a th-sound (or another sound that spits), might not be in my bluetooth radius long enough for my phone to get the signal. Same thing with surfaces: Sars-CoV-2 can live a few days on ideal surfaces, whose interactions with carriers my phone has no clue about.
Conversely, my phone will pick up my neighbour’s bluetooth signals all the time even though we’re separated by thick walls through which no virus can travel.
Add to that the fact we don’t carry our phones with us all the time (or may have them set to airplane mode).
All in all this probably means such an app would have far too many false positives and false negatives to ever be useful.
Bruce Schneier made the same points here: https://www.schneier.com/blog/archives/2020/05/me_on_covad-19_.html
2020-05-14 10:14:41 +0000 UTC
Interestingly I don’t find my post anymore but I’m pretty sure I didn’t delete it…
2020-05-14 09:53:00 +0000 UTC
This really is a very clear explanation. Thanks Grant. Since I'm currently in India and the government's approach here is the Aarogya Setu app -- which *does* track a user's location, has been force-installed on phones, and is not open source -- this video will be incredibly beneficial in the realm of helping users see how things should be done. There is no reason for these apps to be closed-source (especially when they are created by governments) but, as you mention, the counter-intuitive aspect is that the apps need not track user locations at all, regardless of how they are licensed.
Once the video is released, if you can please link to your instructions for creating subtitles, I will ask my Hindi- and Tamil-fluent friends to create subtitles in both languages.
Thanks again!
2020-05-14 07:28:37 +0000 UTC
Dear Grant,
Thank you for your amazing job! I admire your ability to do the important stuff for the world in such a short amount of time!
I want just to add a few remarks:
* In Nicky’s original post pink faces mean NOT contagious people and this actually changes the plot a bit. As Nicky writes:
“Actually, let's add one more nuance: before an *grey face* becomes an *red face*, they first become *pink face* Exposed. This is when they have the virus but can't pass it on yet – infected but not yet infectious.”
(I was confused on this one while translating Nicky to Russian:))
* Part of a grey field from the third message is visible at 2:30
Again, thank you for your awesomeness and stay healthy!
2020-05-14 07:27:07 +0000 UTC
I'd love to continue Raphael's thoughts above... Let's think about a corrupt government which wants to track Zoomunists (imaginary political activists). You force/bribe/convince the companies which control phone updates to roll out an update which 'by mistake' always reports 'COVID-1984' tokens to the hospital without the owner's consent. Now you just need to capture or get access to the phones of a few of the activists and connect this with IP addresses, and here we go: the whole network of Zoomunists is captured.
It is amazing technology, it works well, but it can and will be misused for other purposes.
Mr. Duck
2020-05-14 06:23:00 +0000 UTC
I will likely publish tomorrow morning.
3blue1brown
2020-05-14 03:19:22 +0000 UTC
As I understand it, the "decentralized" aspect is about how alerting happens. The central database has no way of knowing who contacted whom.
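A toy sketch of that split, under simplified assumptions: the server only stores codes uploaded by diagnosed users, and each phone compares them against what it heard locally. The `TokenServer` and `Phone` classes are hypothetical and ignore everything real systems need (key rotation, authentication of diagnoses, network metadata).

```python
import os


class TokenServer:
    """Hypothetical central store: only sees codes uploaded by diagnosed users."""

    def __init__(self):
        self.published = set()

    def upload(self, codes):
        self.published.update(codes)

    def download(self):
        return set(self.published)


class Phone:
    def __init__(self):
        self.sent = [os.urandom(16) for _ in range(3)]  # random codes it broadcast
        self.heard = set()                              # codes heard nearby, kept on the device

    def meet(self, other):
        self.heard.update(other.sent)
        other.heard.update(self.sent)

    def check_exposure(self, server):
        # The comparison happens on the phone; `heard` is never uploaded.
        return bool(self.heard & server.download())


server = TokenServer()
alice, bob, carol = Phone(), Phone(), Phone()
alice.meet(bob)            # Carol never meets Alice
server.upload(alice.sent)  # Alice tests positive and publishes her own codes

print(bob.check_exposure(server))    # True
print(carol.check_exposure(server))  # False
```

In this sketch the server holds Alice's codes but has no record that Bob ever heard them, which is the sense in which the alerting is decentralized.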
3blue1brown
2020-05-14 03:19:06 +0000 UTC
Yes, the description will be full of various links.
3blue1brown
2020-05-14 03:13:31 +0000 UTC
Hi Randy, good suggestion. To be honest, at this point, it's unclear where the best app will come from (at least here in the US), so I'm hesitant to call anything out specifically in the video. It might just end up being whatever Google/Apple build into their phone's OS in due time. In either case, I'll try to keep an updated pinned comment, or otherwise follow up as things become more clear.
3blue1brown
2020-05-14 03:13:12 +0000 UTC
I really enjoyed this post and agree that it would be great to provide links to reliable apps to download.
2020-05-14 03:12:48 +0000 UTC
Thanks for the feedback. I'm not entirely sure I understand what you mean by "say something more". Do you want a more thorough definition of contact tracing at that point?
3blue1brown
2020-05-14 03:10:45 +0000 UTC
The random strings are cryptographically secure hashes of sufficient length to make collisions less probable than all of the people with coronavirus suddenly turning into bananas. :) Per the Apple/Google API, you get back the day of exposure (but not the time, at least as it stands now). False positives are definitely still possible, but more related to being within Bluetooth range, but not actually in a potentially transmissible state (e.g. sitting in different cars next to each other at a traffic light).
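For a sense of scale, here is a back-of-the-envelope birthday bound on accidental collisions. The 128-bit code length and the figure of 10^12 codes in circulation are assumptions chosen for illustration; the function name is made up.

```python
def collision_probability_upper_bound(num_codes, bits):
    """Birthday bound: P(any two codes collide) <= n * (n - 1) / 2 / 2**bits."""
    return num_codes * (num_codes - 1) / 2 / 2 ** bits


# Hypothetical scenario: a trillion uniformly random 128-bit codes.
p = collision_probability_upper_bound(10**12, 128)
print(f"{p:.3e}")
assert p < 1e-14
```

The bound only covers accidental collisions between uniformly random codes; it says nothing about the Bluetooth-range false positives mentioned above.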
2020-05-14 03:08:19 +0000 UTC
This is awesome. I actually hadn't heard about this algorithm until watching this video. It seems like such an easy and risk free method that could have a profound impact on the current pandemic and future ones. Can't wait to share this video with my friends and family. Thanks Grant!!
2020-05-14 02:08:57 +0000 UTC
I really wanted to share this NOW. When will you push the final version? (and yes, please add as much info as possible into the video description). How can I contribute by, say, writing subtitles in German? I will start working on them, but need to learn how to get them to you or YouTube. THANKS!!!
2020-05-14 01:21:23 +0000 UTC
The DP-3T solution is decentralized. Did I miss the part where you tackle this aspect? It seems that there is still a central database to which the phones of infected people post the tokens they emitted, and from which all the other phones pull them... What am I missing?
Thanks a lot for your great work, I will be investigating further!
2020-05-14 00:46:45 +0000 UTC
Great video as always. Go ahead and post. But a question about false positives: I assume the gibberish strings are much longer, and hence the chance of two phones creating the same random string is small (but not zero). Also, can you get a time estimate better than just "in the last 14 days"? Are the strings time-stamped at all? Can you exclude a contact from 14 days back if it is later shown that the contact was only exposed 7 days ago (hence I cannot be infected)? Otherwise I fear there will be too many false positives and people will ignore the app.
2020-05-14 00:42:51 +0000 UTC
Great work, let's hope this gets going soon, and with modification as described by Andy Howell above to catch those within 5 seconds of others, or whatever that time turns out to be
2020-05-14 00:15:51 +0000 UTC
Lovely! I hope a link to the original post will accompany the video.
2020-05-13 23:36:47 +0000 UTC
In terms of an explainer, it's perfect. I think it could really have a significant positive impact on public health.
2020-05-13 23:35:32 +0000 UTC
Nice work! Another significant contribution to the COVID effort. It's an even more enjoyable explanation than a Pueyo post. I only have one suggestion, a way to follow up - perhaps a link to the best website/org that is following contact tracing apps and tech. Help us answer the question of which one(s) I should use and when are they ready to download. You'll have our attention and motivation after the video, so it'd be a good time for us to sign up on some org's update list. Maybe it'll be clear when the time comes which app or apps to get and use, but I worry it's going to be a free-for-all. If a bunch of us sign up with an org, maybe they can influence deserving winners in the space. Thanks for your work on this important area.
2020-05-13 23:25:56 +0000 UTC
What Nicky wrote specifically was "As far as COVID-19 cares...", as if to say the virus doesn't care about those who are recovered.
3blue1brown
2020-05-13 23:16:26 +0000 UTC
It's interesting, I like the video a lot. I'd already seen the post it's based on as well.
Mike G.
2020-05-13 22:48:12 +0000 UTC
love it - lots of paranoia in the streets - this might help, especially as many regions are loosening restrictions. It would be really interesting to see, in a post-mortem of the pandemic, how much of an impact this kind of tech has with privacy laws intact.
RHall
2020-05-13 22:36:25 +0000 UTC
Also, +1 on agent-based modelling of disease transmission. This is important.
Poker Chen
2020-05-13 22:34:34 +0000 UTC
It would be useful to mention a short comparison with some of the existing ones: pushed by the Singaporean and Australian governments for example. They started off with storing your phone's and model, which would be a legit privacy loss when combined with other information.
Poker Chen
2020-05-13 22:32:51 +0000 UTC
Hi Grant, I'm new on Patreon! :)
I think that the video is very clear, but maybe you should say something more about that "Contact... Tracing" slide, which is interesting and you don't talk about it.
In fact, I found the "this" you said at 1:47 a little bit confusing, since it relates to the phrase you said before and not to the "Contract Tracing" text appearing at almost the same time. It was hard for me to listen to and read two not totally related phrases.
BUT I'm from Italy and my English level is not so high, as you might tell, so maybe it's a comprehension issue just for myself!
Thank you for your work!! :)
2020-05-13 22:25:56 +0000 UTC
In the opening lines, "there are three kinds", but your other video mentioned "recovered" as a category, so that could be seen as contradictory...
武明帥
2020-05-13 22:23:05 +0000 UTC
The one flaw I see in this is that you need to be in the same place at the same time. It will miss the person who sneezes and then leaves just before you arrive. High-traffic locations need a repeater that can republish recent packets, for example everything seen in the last 15 minutes. The video looks good. I will run the app on my phone.
2020-05-13 22:22:50 +0000 UTC
As someone who's been working on a related app, the answer is that you probably won't see real apps doing this for at least a few more weeks. Apple & Google just released their toolkit last week, and it takes some time to develop the apps. Further, they said that it'll only be "official" apps from health authorities that get to use this system, which probably means it'll take even longer. Just my guess.
2020-05-13 22:18:04 +0000 UTC
Neat! I'm a statistician doing some research on the Coronavirus, and it's really cool to see how similar some of the models we're using are, to what you've talked about in your videos. Also appreciated the mental health shoutout at the end :)
2020-05-13 22:14:36 +0000 UTC
Thanks! I'll check it out, and perhaps link to the article from the video description.
3blue1brown
2020-05-13 22:14:07 +0000 UTC
I missed the 3B1B look. But otherwise the video gets the concept across. No objections. P.S. Yes the drawings are cuuuuute.
Lionel Pöffel
2020-05-13 22:11:15 +0000 UTC
Hurry up and get this out there so I can post a link to it for all my friends on Facebook!
2020-05-13 22:10:13 +0000 UTC
Amazing! Great final message about empathy.
2020-05-13 22:09:10 +0000 UTC
It might be helpful to include information on whether these apps are close to being available for public use, or if that’s something that will happen a bit further down the road. I was wondering that myself.
2020-05-13 22:02:20 +0000 UTC
Amazing! I was wondering: if it transmits periodically, is it assumed that the other phones will be transmitting within the timespan in which a contact occurs?
2020-05-13 22:01:41 +0000 UTC
Great video! Two suggestions:
1) Many people are now trying to use the phrase "Exposure Alerting" (or "Exposure Notification") for these apps to contrast them with traditional contact tracing, via interviews, which does need to know your location history. This shift was started by this blog post by Harper Reed: https://harper.blog/2020/04/22/digital-contact-tracing-and-alerting-vs-exposure-alerting/ . You might consider adding that phrasing to the video, as an alternative for people to use.
2) The use of location in contact tracing apps is a bit of a nuanced subject, I just published a post all about that here: https://medium.com/@thefutureian/should-contact-tracing-apps-use-your-location-563d3cea3d95 . The long and the short of it is, yeah, not sharing location is absolutely the right choice for these protocols, but it might be useful to allow your app to track your own location locally (without sharing it) so you have more context on the exposure alerts you get. Not suggesting you include any of this in the video, just thought it might be interesting for you.
2020-05-13 21:59:59 +0000 UTC
Haha, whoops, that's just a typo
3blue1brown
2020-05-13 21:57:30 +0000 UTC
You got me a bit confused with the "Contract tracing" headline - I was thinking in the direction of autonomous agents or something like this at first. Maybe removing the "r" might help.
2020-05-13 21:53:03 +0000 UTC