Marcel Salathé's Blog

Goodbye

This is my last entry on this blog. I had good fun writing it, most of the time. My blogging was infrequent, but I enjoyed having a home away from social media where I was able to jot down my thoughts.

My last post is almost three years old. On March 15, 2020, I wrote about my outlook on the COVID pandemic. I expressed hope that a vaccine would be available within the year, which turned out to be the case. But many other hopes did not materialize, or did so in a different way.

How the world has changed since then! Or has it? Maybe it’s not the world that’s changed, but my perspective. I have morphed from a general optimist to a long-term optimist and short-term pessimist. Pre-pandemic, I always thought that the world would solve its problems by being reasonable, i.e. guided by science and using responsible technology. Post-pandemic, I still think the world will solve its problems in the long term, but it will get there in a very messy, chaotic, human way where science and technology are only a small part of the equation in the short term.

While I found that initially frustrating, I am now looking at it with a sense of awe, and humility. And it's boosted my motivation to do my part, however small, to ensure that the voice of science and responsible technology is heard.

Personally, most of that activity goes into an organization that acts on the local level (in Switzerland) called CH++, where I am working with an incredibly talented & growing group to strengthen the scientific and technological competencies of politics, authorities and civil society.

Professionally, the COVID-19 pandemic has been an eye-opener for me in terms of the potential of digital epidemiology. I’m thrilled for the advancements that will be made in the coming years and decades, and I’ll do everything in my power to help steer it in the right direction. This spring, I’ll be teaching a digital epidemiology class at EPFL, and I’ll be releasing a book along with it. I’m also launching a digital epidemiology substack to write about interesting developments (past, present, and future) in the field.

I am sunsetting this blog because writing takes a lot of time, and good writing takes even more time. A few years ago, I started a newsletter on AI & applied machine learning, but the field is simply too vast. Therefore, I have decided to focus on the one area that truly fascinates me beyond everything else - digital epidemiology. If you want to give it a try, it’s over at digitalepi.substack.com. I’ll post my first piece in the coming days.

So long, and thanks for reading!

COVID-19: Some thoughts on what's next (Mar 15)

[This post is replacing a Twitter thread]

What a time we live in. For many weeks, epidemiologists around the world had been looking at the COVID situation with great worry. As the story became bigger and bigger, some of us were sharing our thoughts both with decision makers and in the media, and were promptly called alarmists. But that's past, and water under the bridge. I'm offering here some thoughts on where I see things heading. This is not a scientific assessment, but rather a personal one.

All of Europe, and the US, is fighting an exponentially growing threat. The hashtag of the moment is #FlattenTheCurve - the idea to mitigate the epidemic wave in order to not overwhelm health care systems - but when you fight an exponential, at some point, a bit of flattening won't do enough. The question is of course when that point is.

Some countries in Europe have now gone into shutdown because they think they've reached that point. When you look at the daily case data, and the reports from hospitals, this is not surprising. What is surprising to me is that even at this point in time, many people have a hard time understanding exponential growth.

Following a shutdown, a few weeks later, the numbers will indeed go down. It doesn't happen immediately because the many people who got infected in the days before the shutdown will eventually develop COVID-19, thus the numbers will still increase for some time (the incubation period is up to 14 days). But when the numbers finally do go down, hopefully all governments will have copied South Korea's strategy of testing and isolation, and have the infrastructure in place to test and isolate.
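To make the delay concrete, here is a toy sketch in Python. It is not an epidemiological model, and every number in it is made up for illustration: the daily growth and decline rates, the shutdown day, and a fixed 5-day gap between infection and reporting are all assumptions. It only shows that when infections peak on the day of the shutdown, the reported numbers keep rising for as long as that gap lasts.

```python
def simulate(days=40, shutdown_day=20, growth=1.25, decline=0.85, delay=5, seed=10.0):
    """Toy model: new infections per day, and the cases reported `delay` days later.

    All parameters are hypothetical and chosen only for illustration.
    """
    infections = []
    current = seed
    for day in range(days):
        infections.append(current)
        # Exponential growth before the shutdown, exponential decline after it.
        current *= growth if day < shutdown_day else decline
    # A case only shows up in the statistics `delay` days after infection.
    reported = [0.0] * delay + infections[:days - delay]
    return infections, reported


infections, reported = simulate()

# Day on which each curve reaches its maximum.
peak_infections = max(range(len(infections)), key=infections.__getitem__)
peak_reported = max(range(len(reported)), key=reported.__getitem__)

print("Infections peak on day", peak_infections)
print("Reported cases peak on day", peak_reported)
```

With these assumed values, the reported cases peak exactly `delay` days after the infections do, which is why the daily numbers keep climbing for a while after a shutdown begins.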

Since most countries have been in some sort of partial shutdown before, this will likely be the time when the total shutdown will be relaxed back into a partial shutdown. During that time, I hope that every single case will be treated very very seriously, with all the isolation and quarantining necessary.

Once that is in place and we can feel we have it under control, there will be a semblance of life as we knew it. We will be careful, but confident - local outbreaks can be contained. The end game is the vaccine. When the vaccine arrives, it will be a magical moment for many, as life can finally go back to normal. The relief will be enormous, followed by a massive economic boom. History books will be written.

The optimist in me is hoping for more rapid relief. Any day now, a (non-fake-news) announcement of a medical intervention drastically lowering severity and mortality may appear. That would instantly improve the trajectory. The pessimist in me is worried in particular about middle income countries, where economic hardship could lead to serious instabilities.

All of this is speculation, but I am sharing it because it may help others think through the options. I am still among the optimists who think a vaccine is possible this year, and unexpected drugs may lead to rapid relief. But whatever happens, 2020 will be a year like none we've ever seen.

May we be strong, may we be lucky, may we be healthy. But most of all, may we learn something out of it. Never again shall another pandemic - basically guaranteed unless we get very serious about it - hit us in such a mind-boggling state of unpreparedness.

Death by Algorithm - an Amazon Post Mortem

On February 12, my research lab at EPFL received an email from Amazon (via the email address "mturk-noreply@amazon.com") which contained the following:

Greetings from Amazon Mechanical Turk,
We regret to inform you that your Amazon Mechanical Turk (MTurk) account has been suspended effective immediately. We took this action because you did not provide true and accurate information about your location during the registration process, which violates our Participation Agreement.

You can review the complete Participation Agreement at the link below:

https://www.mturk.com/participation-agreement

Your remaining prepaid HITs balance will be refunded to the payment method you used to purchase the Prepaid HITs once your outstanding Worker liability has been resolved.

Wow.

Some context: Amazon MTurk is a crowdsourcing platform that allows us to crowdsource certain tasks to people, and pay them for it. This platform is very popular especially in research. In our case, we use MTurk to annotate tweets, and while our usage of it is on and off, this email caught us in the middle of intense MTurk activity for a coronavirus study.

I should also add that we've been using Amazon MTurk for many years, and have spent tens of thousands of dollars on it. That's in addition to the tens of thousands of dollars we spend on AWS infrastructure every single year. We are no Netflix, but by all common standards we should be considered a good customer.

36 hours later, we've been able to resolve the situation. Through a personal contact at Amazon, I managed to get a clear explanation from MTurk. I created the account many years ago, while I was still in the US. For a few years now, we've been using the account from Switzerland. For some reason, this discrepancy - usage from Switzerland while having an address in the US - only triggered the system now.

The person at MTurk was very helpful and helped me resolve the issue quickly. We're up and running again. But many things about this experience rubbed me the wrong way (if you happen to follow me on Twitter, you have probably seen my strong reaction). I tried to summarize everything in an email to Amazon, which I am posting here as well.

Hi [name redacted]

Many thanks for your detailed message, and your help - much appreciated.

I’d like to explain my side of the story, and the reasons for my strong reaction. I’ve always been very positively impressed by interactions with Amazon staff, and after all, many of us learned customer obsession from Amazon. That’s why when we received the email on 2/12 about immediate account suspension, we were dumbfounded. How could this happen?

Your explanations now make things clear to me. The short version is, your system must have gotten triggered by an old US address from my time at Penn State many years ago, and as you correctly observed, all of our activity is coming out of Switzerland. But that has been the case for almost five years now, so I don’t understand why your system got triggered now, all of a sudden.

In any case, I don’t have insights into your system, but I’d like to share with you the customer experience side. We’ve been using Amazon MTurk for many years, on and off. We must have spent tens of thousands of dollars on it. We use AWS infrastructure for all our projects, and we spend tens of thousands of dollars on that every year. Right now, we were in the middle of a coronavirus research project when we got the message of immediate account suspension. From your records, I was able to find a message from MTurk on 1/29 at 4:52 AM my time, indicating that there might be an issue. At that time, I was at a large machine learning conference that I organize, and I had my email on auto-respond saying I won’t be able to read email, and that important mails should be resent. Of course, given that the Amazon email was automatic and coming from a noreply address, my response went nowhere. Thus, not having read that email, your system kicked us out without further warning on 2/12.

Initially, I thought you had suspended our account without warning. I can see now that you had sent us one warning. But I still find the action extremely draconian. Emails can get lost in spam filters, or can be missed due to multiple circumstances. To suspend an account that has had no problems for many years and spent reasonable amounts of money because of inaction after a single email is extremely frustrating, to say the least. We can live with not having access to MTurk for a day or two, but given our dependence on Amazon for infrastructure, the entire experience makes me incredibly nervous. Are we at risk of waking up one day with all our platforms down because we missed that one email? I can’t really take that risk, as I am sure you understand. So my first recommendation would be to follow up with one or two more emails. Rather than waiting 14 days between an email and a decision, please send a few emails in that time period, with escalating warning messages.

In any case, after not having seen any action on the account, your system decided to suspend the account. This left us completely stranded. Why would you do that? If after some warnings you realize that the customer has taken no action, why don’t you just shut down the service provision - without account suspension? That would have allowed us to immediately log into the account, see that something is wrong, and fix it - problem solved. The fact that you completely locked us out is actually the worst part of the story. It’s like a landlord locking you out of your apartment because you didn’t fix something, but you can’t get back in to fix it. So my second recommendation would be to not suspend accounts, but suspend services.

If you had followed these two recommendations, the situation would have been completely avoidable. We probably wouldn’t have experienced an interruption, and even if we had, we would have been able to fix the issue immediately.

I nevertheless have two other recommendations. The third is to make your emails clearer. The statement that "Our records indicate that you have provided incomplete or inaccurate information during the Amazon Mechanical Turk (MTurk) registration process" is not meaningful for me as a user. The addresses in the system didn’t show me any problems. Only your email below made it clear to me what the problem was (the old US address). In addition, keep in mind that I registered the account many years ago, which made the message even more confusing (what exactly did I do wrong many years ago?).

The fourth recommendation is to provide better ways to contact your support. Your actions put services into severe distress. Please give them the option to call / online chat somewhere, rather than providing a link to a totally standard, bland support form. I am privileged to be able to have a small audience on Twitter, and to know some Amazon employees personally. But that should not be a condition to get a situation like that resolved quickly. Put a human back into the loop. Yes, it may cost some money - but the risk of losing customers may be more costly down the line. And you’d probably find that the first two recommendations would mean that most situations would anyways sort themselves out before it gets to that stage. But the last option should always be the ability to talk to an actual person.

I hope this feedback is useful. As a service provider myself, I often find that I can learn the most from detailed customer feedback, even - or especially - when it’s criticism. In the grand scheme of things, the magnitude of this event is very low; but in an age of increasing concern about automated decision making, I nevertheless feel the episode is symptomatic of a development going in the wrong direction, and hope to have provided some ways to correct course.

Very best,

Marcel

Applied Machine Learning Days

The Applied Machine Learning Days (AMLD) 2020 is around the corner - in about two months, we’ll open our doors for the fourth time to an expected audience of over 2000 people, with 30 hands-on sessions and 29 tracks on machine learning and artificial intelligence with top speakers from around the world.

An industry representative told me the other day “it’s amazing how you managed to create such a high quality event, and such a brand, from nothing, in just three years”. I’m very grateful for these kind words. But it also made me think. Did we succeed? Where could we go? Where should we go? What did we do right, or wrong? And then I also realized that there is no written record of how AMLD came to be, and so I felt compelled to write this post.

In 2016, my lab at EPFL launched crowdAI, an AI challenge platform (today, it’s a spin off with the name AIcrowd). The idea was to run public machine learning challenges, in an affordable and open source way. We had ideas for a few challenges, specifically also around the research we were doing, and we knew how to build a challenge platform - but what could we offer to the community as prizes, with loads of money not being an option? After some thinking, we decided that one cool prize could be to bring people to Switzerland, in the winter (snowy Alps!), to a small workshop where top performers in these challenges could share their approaches, and learn from each other.

Around the same time, EPFL hired two new professors, Martin Jaggi (machine learning) and Bob West (data science). The three of us felt like it would be a cool idea to create something a bit bigger than a small workshop around the topic of applying machine learning to lots of interesting problems, and we created the Applied Machine Learning Days, with the idea to bring together ML practitioners to share the do’s and don’ts of this exciting technology. Without any resources, we went ahead and started inviting people to speak at AMLD, and managed to attract a great speaker lineup, mostly from our own personal networks. We were hopeful that around 100 people would show up (ML was a hot topic, after all). To our great surprise, hundreds of people signed up, and the largest room we could find on campus in the short term had a capacity of 450 people. AMLD 2017, a 2-day event with talks, was a great success, and we were motivated to do more after that.

When putting together the website for AMLD 2017, I added the slogan “2 days of talks and tutorials”. But given that the AMLD 2017 organization was rather rushed, we did not really have the time to organize tutorials. So for AMLD 2018, we created a call for workshops, and to our great delight, numerous high-quality workshops were proposed. Given that AMLD 2017 ended up being much bigger than planned, we felt that AMLD 2018 could be even bigger, and in the summer of 2017, we brought on an event manager to help us coordinate the event full time (hi Sylvain!). AMLD 2018 thus became a 4-day event, with 2 days of workshops, and 2 days of talks. Almost 1000 people ended up coming to the event, with the workshop weekend completely booked out and long waiting lists.

At this point, we realized we had hit a nerve. People really seemed to like the mix of academia and industry. In parallel, many AI events were popping up left and right (and of course we were not the first either), with some of them being very much focused on marketing and sales, while traditional ML conferences were highly technical. We seemed to have found a sweet spot in between these two extremes, where practitioners and enthusiasts from all types of organizations could come together and learn from each other.

AMLD 2018 was great, but we realized that the single track model would not work for much longer. Thus, the idea of domain-specific tracks - AI & your field - was born. For AMLD 2019, we opened a call for tracks, and once again, the community came along and put together awesome tracks! Given the expected increase in size, we asked Sylvain to stay onboard full time :-). Overall, AMLD 2019 ended up being again a 4-day event, with 2 days of workshops, and 2 days of conference with both keynotes and domain-specific parallel tracks that over 1700 people attended. Speakers like Garry Kasparov, Jeff Dean, and Zeynep Tufekci gave the event a very special vibe.

For AMLD 2020, we primarily thought “never change a winning team”. But nonetheless, I became personally frustrated that while we were holding this interesting event, the public discussion was getting increasingly negative and concerned about this technology, and most of the uncertainty - not surprisingly - was about work, jobs, and skills. So we decided to extend AMLD by one day, and to have a third day that more specifically focuses on all things AI & economy: jobs, skills, employment, HR, social policy, startups, etc. which we're organizing with our neighbors and colleagues from the University of Lausanne. Given the growth, we recently also brought on another person to help with the organization (hi Pauline!).

And once again, the community came along and put together absolutely stunning workshops and tracks. Some of the tracks have such a stellar speaker lineup that they could easily pass as independent conferences in their own right!

Reflecting on what made AMLD work so well, in such a short time, I’ve arrived at a number of insights. The first is to create an event that you would love going to. This is a truism in industry, certainly in the consumer sector - if you are not using your own product or service, why would anyone else? I keep reminding people that we are not organizing AMLD because somebody told us to. We are doing it simply because we want such an event to exist. Indeed, one of the most difficult things for us as organizers is to not be able to enjoy the event as visitors. Tough life 😉

The second insight is to not do it alone, but together with others. People were often shocked to hear that the event management team was composed of one person, for a conference of the size of AMLD. But of course there were hundreds of volunteers behind the scenes, from people in the labs of the organizers to others who came to help during the event. And most of all, of course, the workshop and track organizers who put together the program.

The final insight is to take it easy on the hype, and just stick to quality. The amount of AI bullshit available on the internet and at some events has taken on rather stunning proportions. Personally I have nothing against some long-term thinking and some excitement around it. But at some point, one should put up, or shut up. It’s for that reason that we want AMLDs to always be associated with academic institutions. That is not to say that non-academic institutions wouldn’t be able to put together great events; of course they are. But academic institutions have the benefit that they are full of deeply skeptical scientists that won’t tolerate overselling for too long, and most speakers will naturally focus on serious work when they present at an academic institution.

So, what is the future of AMLD? I can’t say for sure, but it’s worth reflecting on what the ultimate goal of AMLD is. An event is a huge effort, both for organizers and attendees. If you calculate the overall costs, and the energy spent by thousands of people coming together in a particular location, the numbers are absolutely enormous. So there’d better be a very good reason why you do this. For me, the ultimate reason to organize AMLD is to make sure that this technology remains on people’s radar, and becomes accessible to them. Modern machine learning is a once-in-a-lifetime kind of technology, and may even end up being a once-in-a-century kind of technology. If AMLD can help many more people to understand this technology and use it for their goals, then it will have been worth it. Because I believe very strongly in Feynman’s observation of “what I cannot create, I do not understand.”

That is the ultimate reason I believe that AMLD should grow much more, both in size and in scope. To give you an idea of the importance of machine learning, PwC believes that by 2030, AI (they mean machine learning) will boost GDP by 13% globally, and up to 26% locally. That’s 15.7 Trillion Dollars, more than today’s GDP of China and India combined. But more than money, machine learning will affect all social systems deeply. Not mastering this technology is simply not an option. Events like AMLD can do their share to ensure a well informed society, from academia to industry to the general public.

Why I am not interacting on LinkedIn

TL;DR: I won't participate in LinkedIn communications, because I have no more trust in LinkedIn. For important matters, please send email instead.

___

LinkedIn has the potential to be useful. Unfortunately, it has in recent times become a master of dark patterns (see https://www.darkpatterns.org/), and I see no indication of this stopping any time soon.

Just a few examples:

I've been getting LinkedIn requests from women "to share the passion of love", with LinkedIn apparently being unable / unwilling to filter this as spam.
LinkedIn has started prefilling communication fields in ways that I do not like at all - for example, pre-filling responses with "Hi XYZ, thanks for reaching out. I’d like to learn more." - No, I don't like to learn more. It's preposterous to pre-fill communication forms like that, LinkedIn.
I've been getting emails saying "You have 1 new message" - but instead of showing the message right there (I am looking at email right now, for goodness' sake), LinkedIn forces me to open the website, or the app, so that it can track me better.

I could of course leave LinkedIn, the same way I left Facebook a few years ago. The problem is that the network itself is very interesting, and unlike Facebook, I never had any real trust that information on LinkedIn would be private. So rather than abandoning it, I just want to clarify that I am not using it for communication purposes, because a) I cannot assume our communication to remain private, and b) I'd like to stay away from the dark patterns of LinkedIn as much as possible. Like everyone else, I am trying to do my best to mitigate the onslaught of digital overload - my detachment from LinkedIn is a further step in that direction.

The perils of "free" education

Imagine reading one day on a restaurant website the following:

"Come eat with us FOR FREE! That's right - we believe in open food, and that nutrition is a basic human right for everyone! We provide free meals, created for people at any hunger level! Eat as much as you want!"

Sounds ridiculous, doesn't it? Even a place that would offer "all your nutritional needs covered for just $49 / month" sounds incredibly suspicious. Would you really eat there? What could they possibly be putting on those plates to cover their expenses?

It's pretty simple - you know that food has a price, and that the people making and serving it have expenses to cover - so anything extremely cheap is most likely very bad quality, or a scam, and immediately raises red flags.

Yet, when it comes to education, we quickly seem to let our guard down. Free courses to learn anything I want? Bring it on! A full education for $49 / month? Sounds good, count me in! No suspicion is raised - this is normal. After all, most of us didn't pay for school either.

The fundamental problem there is that we easily confuse education with information. Yes, information can be free, and when it comes to knowledge about the world, I'd argue it should be free. But education is not just information - not by a long shot. Education is helping learners make a selection about what is worth learning (at least initially); it's helping learners differentiate good quality from bad quality; it's helping learners when they get stuck; it's reviewing learners' work, and giving them guidance on how to improve; it's assessing their knowledge at regular intervals, and eventually putting your name to vouch for the level of know-how they have. And it's a million other things as well, as anyone who has ever taught another person anything can readily confirm.

Many of these things cannot be automated yet, and the question is not only if they ever will be, but also if that's what learners really want. But whatever the future may bring: today, when you are getting an education, someone is paying for it. And thus, if it's free, or almost free to you, then someone else is paying for it. Do you know who that is, and why they are doing it? Do you know "the deal"?

Our students at EPFL, who currently pay about 1'200 $ per year - a tiny fraction of the true cost - (hopefully) know that it's the Swiss tax payers who are paying for them. They also hopefully know that the tax payers are paying because they think they're getting more in return in the long run - the until-now safe assumption being that a well-educated population will be a wealthy population. Same for all the parents in the country (the vast majority) who send their kids to the excellent public schools, at no direct cost to them - again paid for by the tax payers, for the same reason.

This is not free. In fact, many governments spend multiple percentage points of their GDP on education. The EU, for example, spent 715 Billion Euros on education in 2017. That's right: that's € 715'000'000'000 in a single year. So much for free education.

So that makes you wonder - what are all the people thinking who are signing up for (almost) free online education? That somehow, all of these mechanisms don't apply anymore? Part of the problem, as mentioned above, is that we confuse online education with online information. Online information can be free, yes - although even there, its creation and maintenance costs something.

But the problem with (almost) free online education goes further. In the same way that you are not a Facebook user, but the Facebook product (with advertisers being the customers), free online education means that you're not the only one who is learning something - someone else, with vested financial interests, is also learning something about you. What kind of learner you are, for example. How quickly you grasp new concepts. How well you work with others. How you solve problems. How you search for solutions. How motivated you are. If you go to any job interview, these are exactly the kinds of things companies want to know about potential hires. And there is a huge market developing that sells this information about you to companies that are hiring - directly, or indirectly through recruitment services. It's worth a lot of money - enough to pay for the education.

It may be a deal worth making. But we should be aware that there is a deal in the first place, and most of us simply are not. And we should realize that the education that we hope would advance our career may actually be putting a brake on it.

There is a long term solution, and a short term solution. The long term solution is appropriate legislation - that learners getting the free education deal must be kept fully informed about this deal. Perhaps an even better solution would be to prevent such deals in the first place, at least for adult, continued education; I'm not entirely sure yet. The short term solution is to avoid the problem in the first place, and find someone who is truly interested in your education so that they will pay for it. Oftentimes, that will be you; other times, that may be your employer, or perhaps even your government, should you be so lucky to live or work in an environment that supports life long learning and continued education.

At the EPFL Extension School, we think about these issues a lot. We offer courses and programs for digital up-skilling online, and the topics of lifelong learning, online education, and data ownership are parts of our daily discussions. The entire learning experience going through the EPFL Extension School is what we offer as a service, and because of that, we don't even have to think about selling any data about our learners to anyone. In fact, we fiercely protect our learners' data, far beyond our legal obligations. Being in total control of our learners' data was also a major factor when we decided to build our own learning platform, rather than using someone else's.

I think that's the fairer deal.

1. Meditate

This is one of those brain tools that I can no longer understand how I ever managed to do without.

My favorite - and largely only - form of meditation that I practice regularly is mindfulness meditation. I first encountered the concept about 15 years ago when I came across a book called "Wherever You Go, There You Are" by Jon Kabat-Zinn. I was about to become a PhD student at the time and so my natural instinct was to think that this was likely some trivial nonsense. But I was in enough adolescence-related mental pain at the time that I thought I'd give it a try. It changed the way I looked at myself, and at how the mind works. It was the first time I fully realized that I am not my thoughts, and that thoughts are objects I can study objectively. I've been expanding on this concept ever since.

I most recently came back to regular practice with the Waking Up app, which I very much like (and I can also recommend the book with the same name by the same author, Sam Harris).

Mindfulness meditation has become a key tool for me, and today, as we are in the midst of the attention economy, being able to realize when someone tries to hijack your mind has become extremely valuable. That's of course in addition to all the benefits you get from realizing when your mind gets hijacked by your own thoughts. I now rank the ability to do basic mindfulness meditation so highly that I will teach my kids to understand it before I teach them how to code (and if you've ever been on the receiving end of one of my sermons about everyone having to learn how to code, you know what that means).

So this is my first advice: Look into mindfulness meditation.

Notes to my younger self

I recently heard someone on a podcast ask a guest, "What advice would you give to your younger self?" The question was rhetorical, of course, as the younger self clearly missed the chance to listen to any advice, but I thought it was a nice way of soliciting condensed advice based on years of life experience. And naturally, I started thinking, what advice would I give? After some reflection, I decided to write it down in small, bite-sized blog posts. It's not going to be useful to me - but in the same way that I occasionally find other people's advice very useful, I hope this may be of use to someone else (hey there!).

Come to think of it, "advice" may be the wrong word here - let's go with "ways to think about the world based on some things I've experienced". I'll keep this list going and growing for a while. Whatever reason brings you here, I hope one of those may change the way you look at certain things in a way that benefits you. That'll already have made it worth it (#payingitforward).

1. Meditate

Facebook

It's impossible to escape the Facebook "scandal" at the moment, and it's important to be fully aware of what is going on. I think this is a defining moment in our digital evolution as a society, so it's worth spending some time reflecting on what is happening.

As you have surely heard, it has been revealed that a rogue researcher at Cambridge University by the name of Kogan built an app that scraped a lot of data from Facebook users and their "friends". (Apologies for the many quotes, but so many of the words used in this story have been hijacked to mean different things.) Nothing that Kogan did at that point was illegal by the terms of Facebook. This is, of course, the crux of the story - it may not have been illegal by Facebook's terms, but it may have been highly unethical nonetheless. In any case, Kogan then shared the data with a third party, which was illegal, and this is how the company Cambridge Analytica (CA) got hold of the data. It was then used by CA for political purposes. Some people say CA was a decisive factor in the Trump and Brexit victories, but there is at the moment no evidence for that.

The reactions of shock that I've heard so far are of four types:
1. Why does Facebook have so much data on us?
2. Why does Facebook allow others to obtain our personal data?
3. How is this data used to manipulate us?
4. Are all tech companies the same? What about Apple, Google, Amazon, Twitter?

Let's address each of these points briefly.

1. Why does Facebook have so much data on us?
The easy answer is because we give it to them. But there is more to this than meets the eye. Facebook tracks you almost everywhere you go online. Facebook also tricks you into sharing more data than you are probably aware of. As many Android users have found out, Facebook has been scraping their call and text message data for years - either without permission, or using extremely sleazy tricks to get "permission" from its users. Facebook's value proposition is targeted advertising. Advertisers pay lots of money to Facebook to show their ads specifically to a small target group. This is a highly efficient way to advertise because you know you are advertising to the right audience. It's this lucrative advertising model that has turned Facebook into one of the most highly valued companies on the planet. Yes, Facebook is a surveillance machine, but it has itself no malicious intent - it just wants to know everything about you so it can match you to advertisers. Facebook is not a data seller, it is a matchmaker. The more it knows about you, the better it can match you with those who are willing to pay.

2. Why does Facebook allow others to obtain our personal data?
If data is Facebook's gold, why would it share it with others on its platform - such as Kogan, or anyone developing a Facebook app? The best answer I can give is that by opening up to app developers, Facebook was hoping to increase engagement on its platform. The more you use the Facebook platform, the more Facebook knows about you, which is good for its matchmaking capabilities. Facebook is, of course, aware of this problem and began some time ago to limit data access. Given the current scandal and bad press, Facebook will almost certainly continue to constrain data access to third parties.

3. How is this data used to manipulate us?
As mentioned above, Facebook is in the matchmaking business. It sells this access to anyone willing to pay for it. This is no secret - you can go to Facebook and read in great detail how it works. Facebook writes: "With our powerful audience selection tools, you can target the people who are right for your business." It should come as no surprise that by business, they mean anyone willing to pay, including politicians and organizations with political intent. Advertising is manipulation.

4. Are all tech companies the same? What about Apple, Google, Amazon, Twitter?
It's easy to engage in the blame game and begin to accuse all tech companies of being "data hungry". Isn't it always good to know more about your users? Yes, but as we're learning, that knowledge is also a huge liability (and this doesn't even factor in direct legal liabilities - hello GDPR). The central question is whether that knowledge is core to your business. This is clearly not the case for Apple. The vast majority of Apple's business is selling hardware with a very high margin. Apple is now actively advertising the fact that it can take privacy very seriously because its business doesn't depend on user data, which is both true and smart. The majority of income for Amazon is also not advertising, but services (retail and web services). For Google and Twitter, the story is different, because their business does indeed depend on knowing their users for better advertisement. Close to 90% of Google's and Twitter's income comes from advertisement. Twitter may be in a better position because it is a micro-blogging platform, and it would be difficult to be outraged by the fact that Twitter data can be used by anyone given that it is de facto public data. In addition, Twitter's size is still very small compared to Facebook. Google may be the closest to Facebook in terms of business models. But importantly, Google does not run a social communication network - it tried with Google Plus, but failed - and that sets it a bit apart. It is difficult to insert manipulative political content into the discussion unless you are the discussion platform. Still, the concern with Google is that its business currently depends most strongly on knowing users intimately.

Now what?
These answers can provide us with some insights. The first is that Facebook is never going to change substantially. The more it knows about you, the better it can do its matchmaking, which is of existential importance to its multi-billion dollar business. That is why Mark Zuckerberg has been on a 14-year apology tour - he embodies the idea of asking for forgiveness, not for permission. The second is that Facebook will continue to be used for political manipulation. As historian Niall Ferguson put it so aptly, there are two kinds of politicians: those who understand Facebook advertising, and those who will lose. We have just seen the tip of the iceberg. The third is that regulation will be essential to tame the beast, which is not Facebook, but the extreme effectiveness of micro-targeting. I believe you can manipulate absolutely everyone if you know all the details about their lives, their friends, their fears, and their dreams. And it is generally not necessary to manipulate everyone very strongly; by just nudging a fraction of people undecided on an issue, systems can change rather dramatically. Nudging 10% of swing voters will define the victor; nudging 10% of undecided parents to opt out of vaccination will lead to large disease outbreaks, etc. The fourth is that Mark Zuckerberg may have to step down from Facebook, which could spell its end in the long run. He built Facebook, and stands for everything that happened, for better or worse. I fully believe Facebook did not have any malicious intentions - they simply discovered an extremely lucrative business model and ran with it. But this is not just another "oops - we're sorry" story that's going to go away soon. People are waking up to the core of the Facebook business model - and to some extent to the micro-targeting model - and they don't like it. Someone will have to face the consequences.

CODA
As a final note, I found it incredibly liberating to leave Facebook a bit more than a year ago. I did it because it took more from me than it gave me, and I continued to have the truly valuable interactions through other communication channels. I was also getting concerned about its surveillance power, but that was the lesser problem to me, then. But fundamentally, I do believe that the only way to solve the extreme micro-targeting problem is by abandoning those platforms whose businesses are entirely built on it, and for many of us, this should be easy. I am extremely disturbed to hear some people argue their ability to communicate with friends depends on Facebook. In the end, unless we realize that Facebook's business depends on being our communication platform, and on knowing everything that we communicate through it for efficient micro-targeting, we won't be able to argue we're part of the solution.

Rule 10: Be the best you can be, not the best there is

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

Comparing yourself to others is perhaps the greatest source of self-inflicted unhappiness there is. Unfortunately in academia, it's rampant. But by realizing that this is a major source of stress, you can better recognize when you fall victim to it, and try to ease its negative effects.

No matter how hard you try, there will always be someone who is better than you. It's a mathematical necessity for all but one person. The day you come to terms with this reality is the day you become more relaxed, and being relaxed makes you perform better (as already indicated in rule 9).

That doesn't mean sitting back and drinking mojitos all day long. In fact, becoming the best you can be is hard work. Some even argue (myself included) that it'll take you an entire life, because it's a never-ending task. Trying to consistently improve yourself seems like a smart strategy in general, not just for a career. The important question to ask is not "how can I be as good as or better than person X?", but "how can I be a bit better today than I was yesterday?". It seems like a small difference, but the effect is quite enormous.

I know I'll never be the best scientist in the world. I was never the best evolutionary biologist, never the best network scientist, not even the best digital epidemiologist. I won't be the best writer, the best blogger, the best pianist. And even though my kids tell me I'm the best dad in the world, I know I'll never be, because there are over a billion dads in the world, and surely some of them are better. And that's just fine with me, as long as I'm trying to be the best I can be. And if tomorrow I try to be just a little bit better than I was today, everything will turn out all right in the end.



Specialization is for Insects