Marcel Salathé's Blog

Goodbye

2022-11-27T17:57:58Z

This is my last entry on this blog. I had good fun writing it, most of the time. My blogging was infrequent, but I enjoyed having a home away from social media where I was able to jot down my thoughts.

My last post is almost three years old. On March 15, 2020, I wrote about my outlook on the COVID pandemic. I expressed hope that a vaccine would be available within the year, which turned out to be the case. But many other hopes did not materialize, or did so in a different way.

How the world has changed since then! Or has it? Maybe it’s not the world that’s changed, but my perspective. I have morphed from a general optimist to a long-term optimist and short-term pessimist. Pre-pandemic, I always thought that the world would solve its problems by being reasonable, i.e. guided by science and using responsible technology. Post-pandemic, I still think the world will solve its problems in the long term, but it will get there in a very messy, chaotic, human way where science and technology are only a small part of the equation in the short term.

While I found that initially frustrating, I am now looking at it with a sense of awe, and humility. And it's boosted my motivation to do my part, however small, to ensure that the voice of science and responsible technology is heard.

Personally, most of that activity goes into an organization that acts on the local level (in Switzerland) called CH++, where I am working with an incredibly talented & growing group to strengthen the scientific and technological competencies of politics, authorities and civil society.

Professionally, the COVID-19 pandemic has been an eye-opener for me in terms of the potential of digital epidemiology. I’m thrilled about the advancements that will be made in the coming years and decades, and I’ll do everything in my power to help steer them in the right direction. This spring, I’ll be teaching a digital epidemiology class at EPFL, and I’ll be releasing a book along with it. I’m also launching a digital epidemiology substack to write about interesting developments (past, present, and future) in the field.

I am sunsetting this blog because writing takes a lot of time, and good writing takes even more time. A few years ago, I started a newsletter on AI & applied machine learning, but the field is simply too vast. Therefore, I have decided to focus on the one area that truly fascinates me beyond everything else - digital epidemiology. If you want to give it a try, it’s over at digitalepi.substack.com. I’ll post my first piece in the coming days.

So long, and thanks for reading!

COVID-19: Some thoughts on what's next (Mar 15)

2020-03-15T06:55:11Z

[This post is replacing a Twitter thread]

What a time we live in. For many weeks, epidemiologists around the world have been looking at the COVID situation with great worry. As the story became bigger and bigger, some of us were sharing our thoughts both with decision makers and in the media, and were promptly called alarmists. But that's past, and water under the bridge. I'm offering here some thoughts on where I see things heading. This is not a scientific assessment, but rather a personal one.

All of Europe, and the US, is fighting an exponentially growing threat. The hashtag of the moment is #FlattenTheCurve - the idea to mitigate the epidemic wave in order to not overwhelm health care systems - but when you fight an exponential, at some point, a bit of flattening won't do enough. The question is of course when that point is.

Some countries in Europe have now gone into shutdown because they think they've reached that point. When you look at the daily case data, and the reports from hospitals, this is not surprising. What is surprising to me is that even at this point in time, many people have a hard time understanding exponential growth.

Following a shutdown, a few weeks later, the numbers will indeed go down. It doesn't happen immediately because the many people who got infected in the days before the shutdown will eventually develop COVID-19, thus the numbers will still increase for some time (the incubation period is up to 14 days). But when the numbers finally do go down, hopefully all governments will have copied South Korea's strategy of testing and isolation, and have the infrastructure in place to test and isolate.
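To make the delay concrete, here is a minimal sketch of the argument above. It is a toy model, not an epidemiological forecast: the function name `simulate` and every parameter value (daily growth factors, a 7-day reporting delay, a shutdown on day 10) are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: all parameter values below are assumptions,
# not estimates for any real country.

def simulate(days=40, shutdown_day=10, delay=7,
             growth_before=1.25, growth_after=0.90, initial=100.0):
    """Return (new_infections, reported_cases), one value per day.

    New infections are multiplied by `growth_before` each day until
    `shutdown_day`, and by `growth_after` from then on. A case is
    reported `delay` days after infection, standing in for the
    incubation period plus testing time.
    """
    infections = []
    current = initial
    for day in range(days):
        infections.append(current)
        factor = growth_before if day < shutdown_day else growth_after
        current *= factor
    # Reported cases on day d are the infections from day d - delay.
    reported = [infections[d - delay] if d >= delay else 0.0
                for d in range(days)]
    return infections, reported


infections, reported = simulate()
peak_infection_day = infections.index(max(infections))
peak_reported_day = reported.index(max(reported))
print(f"new infections peak on day {peak_infection_day}")
print(f"reported cases peak on day {peak_reported_day}")
```

In this toy model, new infections start falling on the day of the shutdown, but the reported numbers keep climbing for another `delay` days, which is why the case counts still look alarming for a while after the measures take effect.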

Since most countries have been in some sort of partial shutdown before, this will likely be the time when the total shutdown will be relaxed back into a partial shutdown. During that time, I hope that every single case will be treated very very seriously, with all the isolation and quarantining necessary.

Once that is in place and we can feel we have it under control, there will be a semblance of life as we knew it. We will be careful, but confident - local outbreaks can be contained. The end game is the vaccine. When the vaccine arrives, it will be a magical moment for many, as life can finally go back to normal. The relief will be enormous, followed by a massive economic boom. History books will be written.

The optimist in me is hoping for more rapid relief. Any day now, a (non-fake-news) announcement of a medical intervention drastically lowering severity and mortality may appear. That would instantly improve the trajectory. The pessimist in me is worried in particular about middle income countries, where economic hardship could lead to serious instabilities.

All of this is speculation, but I am sharing it because it may help others think through the options. I am still among the optimists who think a vaccine is possible this year, and unexpected drugs may lead to rapid relief. But whatever happens, 2020 will be a year like none we've ever seen.

May we be strong, may we be lucky, may we be healthy. But most of all, may we learn something out of it. Never again shall another pandemic - basically guaranteed unless we get very serious about it - hit us at such a mind-boggling stage of unpreparedness.

Death by Algorithm - an Amazon Post Mortem

2020-02-14T06:52:10Z

On February 12, my research lab at EPFL received an email from Amazon (via the email address "mturk-noreply@amazon.com") which contained the following:

Greetings from Amazon Mechanical Turk,
We regret to inform you that your Amazon Mechanical Turk (MTurk) account has been suspended effective immediately. We took this action because you did not provide true and accurate information about your location during the registration process, which violates our Participation Agreement.

You can review the complete Participation Agreement at the link below:

https://www.mturk.com/participation-agreement

Your remaining prepaid HITs balance will be refunded to the payment method you used to purchase the Prepaid HITs once your outstanding Worker liability has been resolved.

Wow.

Some context: Amazon MTurk is a crowdsourcing platform that allows us to crowdsource certain tasks to people, and pay them for it. This platform is very popular especially in research. In our case, we use MTurk to annotate tweets, and while our usage of it is on and off, this email caught us in the middle of intense MTurk activity for a coronavirus study.

I should also add that we've been using Amazon MTurk for many years, and have spent tens of thousands of dollars on it. That's in addition to the tens of thousands of dollars we spend on AWS infrastructure every single year. We are no Netflix, but by all common standards we should be considered a good customer.

36 hours later, we've been able to resolve the situation. Through a personal contact at Amazon, I managed to get a clear explanation from MTurk. Having used the service for many years, I created the account back in the day while I was still in the US. For a few years now, we've been using the account from Switzerland. For some reason, this discrepancy - usage from Switzerland while having an address in the US - only triggered the system now.

The person at MTurk was very helpful and helped me resolve the issue quickly. We're up and running again. But many things about this experience rubbed me the wrong way (if you happen to follow me on Twitter, you have probably seen my strong reaction). I tried to summarize everything in an email to Amazon, which I am posting here as well.

Hi [name redacted]

Many thanks for your detailed message, and your help - much appreciated.

I’d like to explain my side of the story, and the reasons for my strong reaction. I’ve always been very positively impressed by interactions with Amazon staff, and after all, many of us learned customer obsession from Amazon. That’s why when we received the email on 2/12 about immediate account suspension, we were dumbfounded. How could this happen?

Your explanations now make things clear to me. The short version is, your system must have gotten triggered by an old US address from my time at Penn State many years ago, and as you correctly observed, all of our activity is coming out of Switzerland. But that has been the case for almost five years now, so I don’t understand why your system got triggered now, all of a sudden.

In any case, I don’t have insights into your system, but I’d like to share with you the customer experience side. We’ve been using Amazon MTurk for many years, on and off. We must have spent tens of thousands of dollars on it. We use AWS infrastructure for all our projects, and we spend tens of thousands of dollars on that every year. Right now, we were in the middle of a coronavirus research project when we got the message of immediate account suspension. From your records, I was able to find a message from MTurk on 1/29 at 4:52 AM my time, indicating that there might be an issue. At that time, I was at a large machine learning conference that I organize, and I had my email on auto-respond saying I won’t be able to read email, and that important mails should be resent. Of course, given that the Amazon email was automatic and coming from a noreply address, my response went nowhere. Thus, not having read that email, your system kicked us out without further warning on 2/12.

Initially, I thought you had suspended our account without warning. I can see now that you had sent us one warning. But I still find the action extremely draconian. Emails can get lost in spam filters, or can be missed due to multiple circumstances. To suspend an account that has had no problems for many years and spent reasonable amounts of money because of inaction after a single email is extremely frustrating, to say the least. We can live with not having access to MTurk for a day or two, but given our dependence on Amazon for infrastructure, the entire experience makes me incredibly nervous. Are we at risk of waking up one day with all our platforms down because we missed that one email? I can’t really take that risk, as I am sure you understand. So my first recommendation would be to follow up with one or two more emails. Rather than waiting 14 days between an email and a decision, please send a few emails in that time period, with escalating warning messages.

In any case, after not having seen any action on the account, your system decided to suspend the account. This left us completely stranded. Why would you do that? If after some warnings you realize that a customer has taken no action, why don’t you just shut down the service provision - without account suspension? That would have allowed us to immediately log into the account, see that something is wrong, and fix it - problem solved. The fact that you completely locked us out is actually the worst part of the story. It’s like a landlord locking you out of your apartment because you didn’t fix something, but you can’t get back in to fix it. So my second recommendation would be to not suspend accounts, but suspend services.

If you had followed these two recommendations, the situation would have been completely avoidable. We probably wouldn’t have experienced an interruption, and even if we had, we would have been able to fix the issue immediately.

I nevertheless have two other recommendations. The third is to make your emails clearer. The statement that “Our records indicate that you have provided incomplete or inaccurate information during the Amazon Mechanical Turk (MTurk) registration process” is not meaningful for me as a user. The addresses in the system didn’t show me any problems. Only your email below made it clear to me what the problem was (the old US address). In addition, keep in mind that I registered the account many years ago, which made the message even more confusing (what exactly did I do wrong many years ago?).

The fourth recommendation is to provide better ways to contact your support. Your actions put services into severe distress. Please give them the option to call / online chat somewhere, rather than providing a link to a totally standard, bland support form. I am privileged to be able to have a small audience on Twitter, and to know some Amazon employees personally. But that should not be a condition to get a situation like that resolved quickly. Put a human back into the loop. Yes, it may cost some money - but the risk of losing customers may be more costly down the line. And you’d probably find that the first two recommendations would mean that most situations would anyways sort themselves out before it gets to that stage. But the last option should always be the ability to talk to an actual person.

I hope this feedback is useful. As a service provider myself, I often find that I can learn the most from detailed customer feedback, even - or especially - when it’s criticism. In the grand scheme of things, the magnitude of this event is very low; but in an age of increasing concern about automated decision making, I nevertheless feel the episode is symptomatic of a development going in the wrong direction, and hope to have provided some ways to correct course.

Very best,

Marcel
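For readers who think in code, the first two recommendations in the email can be sketched as a small decision function. This is a hypothetical illustration, not a description of how Amazon's systems work: the names `Action`, `Policy`, and `next_action`, and the reminder schedule inside the 14-day window, are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    REMIND = "send reminder"
    FINAL_WARNING = "send final warning"
    PAUSE_SERVICES = "pause services, keep login"


@dataclass
class Policy:
    # Hypothetical schedule, in days after the first notice.
    reminder_days: tuple = (4, 8)
    final_warning_day: int = 12
    deadline_day: int = 14


def next_action(days_since_notice: int, resolved: bool,
                policy: Policy = Policy()) -> Action:
    """Decide what to do about an account with an open issue."""
    if resolved:
        return Action.NONE
    if days_since_notice >= policy.deadline_day:
        # Recommendation 2: stop providing the service, but leave the
        # login intact so the customer can see and fix the problem.
        return Action.PAUSE_SERVICES
    if days_since_notice == policy.final_warning_day:
        return Action.FINAL_WARNING
    if days_since_notice in policy.reminder_days:
        # Recommendation 1: several escalating emails instead of one.
        return Action.REMIND
    return Action.NONE


print(next_action(4, resolved=False))
print(next_action(14, resolved=False))
```

The point of the sketch is that there is no branch that locks the customer out: the strongest action still leaves the account reachable.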

Applied Machine Learning Days

2019-11-23T10:36:38Z

The Applied Machine Learning Days (AMLD) 2020 is around the corner - in about two months, we’ll open our doors for the fourth time to an expected audience of over 2000 people, with 30 hands-on sessions and 29 tracks on machine learning and artificial intelligence with top speakers from around the world.

An industry representative told me the other day “it’s amazing how you managed to create such a high quality event, and such a brand, from nothing, in just three years”. I’m very grateful for these kind words. But it also made me think. Did we succeed? Where could we go? Where should we go? What did we do right, or wrong? And then I also realized that there is no written record of how AMLD came to be, and so I felt compelled to write this post.

In 2016, my lab at EPFL launched crowdAI, an AI challenge platform (today, it’s a spin off with the name AIcrowd). The idea was to run public machine learning challenges, in an affordable and open source way. We had ideas for a few challenges, specifically also around the research we were doing, and we knew how to build a challenge platform - but what could we offer to the community as prizes, with loads of money not being an option? After some thinking, we decided that one cool prize could be to bring people to Switzerland, in the winter (snowy Alps!), to a small workshop where top performers in these challenges could share their approaches, and learn from each other.

Around the same time, EPFL hired two new professors, Martin Jaggi (machine learning) and Bob West (data science). The three of us felt like it would be a cool idea to create something a bit bigger than a small workshop around the topic of applying machine learning to lots of interesting problems, and we created the Applied Machine Learning Days, with the idea to bring together ML practitioners to share the do’s and don’ts of this exciting technology. Without any resources, we went ahead and started inviting people to speak at AMLD, and managed to attract a great speaker lineup, mostly from our own personal networks. We were hopeful that around 100 people would show up (ML was a hot topic, after all). To our great surprise, hundreds of people signed up, and the largest room we could find on campus in the short term had a capacity of 450 people. AMLD 2017, a 2-day event with talks, was a great success, and we were motivated to do more after that.

When putting together the website for AMLD 2017, I added the slogan “2 days of talks and tutorials”. But given that the AMLD 2017 organization was rather rushed, we did not really have the time to organize tutorials. So for AMLD 2018, we created a call for workshops, and to our great delight, numerous high-quality workshops were proposed. Given that AMLD 2017 ended up being much bigger than planned, we felt that AMLD 2018 could be even bigger, and in the summer of 2017, we brought on an event manager to help us coordinate the event full time (hi Sylvain!). AMLD 2018 thus became a 4-day event, with 2 days of workshops, and 2 days of talks. Almost 1000 people ended up coming to the event, with the workshop weekend completely booked out and long waiting lists.

At this point, we realized we had hit a nerve. People really seemed to like the mix of academia and industry. In parallel, many AI events were popping up left and right (and of course we were not the first either), with some of them being very much focused on marketing and sales, while traditional ML conferences were highly technical. We seemed to have found a sweet spot in between these two extremes, where practitioners and enthusiasts from all types of organizations could come together and learn from each other.

AMLD 2018 was great, but we realized that the single track model would not work for much longer. Thus, the idea of domain-specific tracks - AI & your field - was born. For AMLD 2019, we opened a call for tracks, and once again, the community came along and put together awesome tracks! Given the expected increase in size, we asked Sylvain to stay onboard full time :-). Overall, AMLD 2019 ended up being again a 4-day event, with 2 days of workshops, and 2 days of conference with both keynotes and domain-specific parallel tracks that over 1700 people attended. Speakers like Garry Kasparov, Jeff Dean, and Zeynep Tufekci gave the event a very special vibe.

For AMLD 2020, we primarily thought “never change a winning team”. But nonetheless, I became personally frustrated that while we were holding this interesting event, the public discussion was getting increasingly negative and concerned about this technology, and most of the uncertainty - not surprisingly - was about work, jobs, and skills. So we decided to extend AMLD by one day, and to have a third day that more specifically focuses on all things AI & economy: jobs, skills, employment, HR, social policy, startups, etc. which we're organizing with our neighbors and colleagues from the University of Lausanne. Given the growth, we recently also brought on another person to help with the organization (hi Pauline!).

And once again, the community came along and put together absolutely stunning workshops and tracks. Some of the tracks have such a stellar speaker lineup that they could easily pass as independent conferences in their own right!

On reflecting on what made AMLD work so well, in such a short time, I’ve arrived at a number of insights. The first is to create an event that you would love going to. This is a truism in industry, certainly in the consumer sector - if you are not using your own product or service, why would anyone else? I keep reminding people that we are not organizing AMLD because somebody told us to. We are doing it simply because we want such an event to exist. Indeed, one of the most difficult things for us as organizers is to not be able to enjoy the event as visitors. Tough life 😉

The second insight is to not do it alone, but together with others. People were often shocked to hear that the event management team was composed of one person, for a conference of the size of AMLD. But of course there were hundreds of people behind the scenes: volunteers, people in the labs of the organizers, and others who came to help during the event. And most of all, of course, the workshop and track organizers who put together the program.

The final insight is to take it easy on the hype, and just stick to quality. The amount of AI bullshit available on the internet and at some events has taken on rather stunning proportions. Personally I have nothing against some long-term thinking and some excitement around it. But at some point, one should put up, or shut up. It’s for that reason that we want AMLDs to always be associated with academic institutions. That is not to say that non-academic institutions wouldn’t be able to put together great events; of course they are. But academic institutions have the benefit that they are full of deeply skeptical scientists that won’t tolerate overselling for too long, and most speakers will naturally focus on serious work when they present at an academic institution.

So, what is the future of AMLD? I can’t say for sure, but it’s worth reflecting on what the ultimate goal of AMLD is. An event is a huge effort, both for organizers and attendees. If you calculate the overall costs, and the energy spent by thousands of people coming together in a particular location, the numbers are absolutely enormous. So there’d better be a very good reason why you do this. For me, the ultimate reason to organize AMLD is to make sure that this technology remains on people’s radar, and becomes accessible to them. Modern machine learning is a once-in-a-lifetime kind of technology, and may even end up being a once-in-a-century kind of technology. If AMLD can help many more people to understand this technology and use it for their goals, then it will have been worth it. Because I believe very strongly in Feynman’s observation of “what I cannot create, I do not understand.”

That is the ultimate reason I believe that AMLD should grow much more, both in size and in scope. To give you an idea of the importance of machine learning, PwC believes that by 2030, AI (they mean machine learning) will boost GDP by 13% globally, and up to 26% locally. That’s 15.7 Trillion Dollars, more than today’s GDP of China and India combined. But more than money, machine learning will affect all social systems deeply. Not mastering this technology is simply not an option. Events like AMLD can do their share to ensure a well informed society, from academia to industry to the general public.

Why I am not interacting on LinkedIn

2019-09-21T16:03:46Z

TL;DR: I won't participate in LinkedIn communications, because I have no more trust in LinkedIn. For important matters, please send email instead.

___

LinkedIn has the potential to be useful. Unfortunately, it has in recent times become a master of dark patterns (see https://www.darkpatterns.org/), and I see no indication of this stopping any time soon.

Just a few examples:

I've been getting LinkedIn requests from women "to share the passion of love", with LinkedIn apparently being unable / unwilling to filter this as spam.
LinkedIn has started prefilling communication fields in ways that I do not like at all - for example, pre-filling responses with "Hi XYZ, thanks for reaching out. I’d like to learn more." - No, I don't like to learn more. It's preposterous to pre-fill communication forms like that, LinkedIn.
I've been getting emails saying "You have 1 new message" - but instead of showing the message right there (I am looking at email right now, for goodness' sake), LinkedIn forces me to open the website, or the app, so that it can track me better.

I could of course leave LinkedIn, the same way I left Facebook a few years ago. The problem is that the network itself is very interesting, and unlike Facebook, I never had any real trust that information on LinkedIn would be private. So rather than abandoning it, I just want to clarify that I am not using it for communication purposes, because a) I cannot assume our communication to remain private, and b) I'd like to stay away from the dark patterns of LinkedIn as much as possible. Like everyone else, I am trying to do my best to mitigate the onslaught of digital overload - my detachment from LinkedIn is a further step in that direction.

The perils of "free" education

2019-08-27T19:59:50Z

Imagine reading one day on a restaurant website the following:

"Come eat with us FOR FREE! That's right - we believe in open food, and that nutrition is a basic human right for everyone! We provide free meals, created for people at any hunger level! Eat as much as you want!"

Sounds ridiculous, doesn't it? Even a place that would offer "all your nutritional needs covered for just $49 / month" sounds incredibly suspicious. Would you really eat there? What could they possibly be putting on those plates to cover their expenses?

It's pretty simple - you know that food has a price, and that the people making and serving it have expenses to cover - so anything extremely cheap is most likely very bad quality, or a scam, and immediately raises red flags.

Yet, when it comes to education, we quickly seem to let our guard down. Free courses to learn anything I want? Bring it on! A full education for $49 / month? Sounds good, count me in! No suspicion is raised - this is normal. After all, most of us didn't pay for school either.

The fundamental problem there is that we easily confuse education with information. Yes, information can be free, and when it comes to knowledge about the world, I'd argue it should be free. But education is not just information - not by a long shot. Education is helping learners make a selection about what is worth learning (at least initially); it's helping learners differentiate good quality from bad quality; it's helping learners when they get stuck; it's reviewing learners' work, and giving them guidance on how to improve; it's assessing their knowledge at regular intervals, and eventually putting your name to vouch for the level of know-how they have. And it's a million other things as well, as anyone who has ever taught another person anything can readily confirm.

Many of these things cannot be automated yet, and the question is not only if they ever will be, but also if that's what learners really want. But whatever the future may bring: today, when you are getting an education, someone is paying for it. And thus, if it's free, or almost free to you, then someone else is paying for it. Do you know who that is, and why they are doing it? Do you know "the deal"?

Our students at EPFL, who currently pay about 1'200 $ per year - a tiny fraction of the true cost - (hopefully) know that it's the Swiss tax payers who are paying for them. They also hopefully know that the tax payers are paying because they think they're getting more in return in the long run - the until-now safe assumption being that a well-educated population will be a wealthy population. Same for all the parents in the country (the vast majority) who send their kids to the excellent public schools, at no direct cost to them - again paid for by the tax payers, for the same reason.

This is not free. In fact, many governments spend multiple percentage points of their GDP on education. The EU, for example, spent 715 Billion Euros on education in 2017. That's right: that's € 715'000'000'000 in a single year. So much for free education.

So that makes you wonder - what are all the people thinking who are signing up for (almost) free online education? That somehow, all of these mechanisms don't apply anymore? Part of the problem, as mentioned above, is that we confuse online education with online information. Online information can be free, yes - although even there, its creation and maintenance costs something.

But the problem with (almost) free online education goes further. In the same way that you are not a Facebook user, but the Facebook product (with advertisers being the customers), free online education means that you're not the only one who is learning something - someone else, with vested financial interests, is also learning something about you. What kind of learner you are, for example. How quickly you grasp new concepts. How well you work with others. How you solve problems. How you search for solutions. How motivated you are. If you go to any job interview, these are exactly the kinds of things companies want to know about potential hires. And there is a huge market developing that sells this information about you to companies that are hiring - directly, or indirectly through recruitment services. It's worth a lot of money - enough to pay for the education.

It may be a deal worth making. But we should be aware that there is a deal in the first place, and most of us simply are not. And we should realize that the education that we hope would advance our career may actually be putting a brake on it.

There is a long term solution, and a short term solution. The long term solution is appropriate legislation - that learners getting the free education deal must be kept fully informed about this deal. Perhaps an even better solution would be to prevent such deals in the first place, at least for adult, continued education; I'm not entirely sure yet. The short term solution is to prevent the problem in the first place, and find someone who is truly interested in your education so that they will pay for it. Oftentimes, that will be you; other times, that may be your employer, or perhaps even your government, should you be so lucky as to live or work in an environment that supports life long learning and continued education.

At the EPFL Extension School, we think about these issues a lot. We offer courses and programs for digital up-skilling online, and the topics of lifelong learning, online education, and data ownership are parts of our daily discussions. The entire learning experience going through the EPFL Extension School is what we offer as a service, and because of that, we don't even have to think about selling any data about our learners to anyone. In fact, we fiercely protect our learners' data, far beyond our legal obligations. Being in total control of our learners' data was also a major factor when we decided to build our own learning platform, rather than using someone else's.

I think that's the fairer deal.

1. Meditate

2019-06-04T15:57:01Z

This is one of the brain tools I can no longer understand how I ever managed without.

My favorite - and largely only - form of meditation that I practice regularly is mindfulness meditation. I first encountered the concept about 15 years ago when I came across a book called "Wherever You Go, There You Are" by Jon Kabat-Zinn. I was about to become a PhD student at the time and so my natural instinct was to think that this was likely some trivial nonsense. But I was in enough adolescence-related mental pain at the time that I thought I'd give it a try. It changed the way I looked at myself, and at how the mind works. It was the first time I fully realized that I am not my thoughts, and that thoughts are objects I can study objectively. I've been expanding on this concept ever since.

I most recently came back to regular practice with the Waking Up app, which I very much like (and I can also recommend the book with the same name by the same author, Sam Harris).

Mindfulness meditation has become a key tool for me, and today, as we are in the midst of the attention economy, being able to realize when someone tries to hijack your mind has become extremely valuable. That's of course in addition to all the benefits you get from realizing when your mind gets hijacked by your own thoughts. I now rank the ability to do basic mindfulness meditation so highly that I will teach my kids to understand it before I teach them how to code (and if you've ever been on the receiving end of one of my sermons about everyone having to learn how to code, you know what that means).

So this is my first advice: Look into mindfulness meditation.

Notes to my younger self

2019-06-04T15:56:33Z

I recently heard someone on a podcast ask a guest, "What advice would you give to your younger self?" The question was rhetorical, of course, as the younger self clearly missed the chance to listen to any advice, but I thought it was a nice way of soliciting condensed advice based on years of life experience. And naturally, I started thinking, what advice would I give? After some reflection, I decided to write it down in small, bite-sized blog posts. It's not going to be useful to me - but in the same way that I occasionally find other people's advice very useful, I hope this may be of use to someone else (hey there!).

Come to think of it, "advice" may be the wrong word here - let's go with "ways to think about the world based on some things I've experienced". I'll keep this list going and growing for a while. Whatever reason brings you here, I hope one of those may change the way you look at certain things in a way that benefits you. That'll already have made it worth it (#payingitforward).

1. Meditate

Facebook

2018-04-10T18:28:57Z

It's impossible to escape the Facebook "scandal" at the moment, and it's important to be fully aware of what is going on. I think this is a defining moment in our digital evolution as a society, so it's worth spending some time reflecting on what is happening.

As you have surely heard, it has been revealed that a rogue researcher by the name of Kogan at Cambridge University built an app that scraped a lot of data from Facebook users and their "friends". (Apologies for the many quotes, but so many of the words used in this story have been hijacked to mean different things.) Nothing that Kogan did at that point was illegal by the terms of Facebook. This is, of course, the crux of the story - it may not have been illegal by Facebook's terms, but it may have been highly unethical nonetheless. In any case, Kogan then shared the data with a third party, which was illegal, and this is how the company Cambridge Analytica (CA) got hold of the data. It was then used by CA for political purposes. Some people say CA was a decisive factor in the Trump and Brexit victories, but there is at the moment no evidence for that.

The reactions of shock that I've heard so far are of four types:
1. Why does Facebook have so much data on us?
2. Why does Facebook allow others to obtain our personal data?
3. How is this data used to manipulate us?
4. Are all tech companies the same? What about Apple, Google, Amazon, Twitter?

Let's address each of these points briefly.

1. Why does Facebook have so much data on us?
The easy answer is because we give it to them. But there is more to this than meets the eye. Facebook tracks you almost everywhere you go online. Facebook also tricks you into sharing more data than you are probably aware of. As many Android users have found out, Facebook has been scraping their call and text message data for years - either without permission, or using extremely sleazy tricks to get "permission" from its users. Facebook's value proposition is targeted advertising. Advertisers pay lots of money to Facebook to show their ads specifically to a small target group. This is a highly efficient way to advertise because you know you are advertising to the right audience. It's this lucrative advertising model that has turned Facebook into one of the most highly valued companies on the planet. Yes, Facebook is a surveillance machine, but it has itself no malicious intent - it just wants to know everything about you so it can match you to advertisers. Facebook is not a data seller, it is a matchmaker. The more it knows about you, the better it can match you with those who are willing to pay.

2. Why does Facebook allow others to obtain our personal data?
If data is Facebook's gold, why would it share it with others - such as Kogan, or anyone developing a Facebook app - on its platform? The best answer I can give is that by opening up to app developers, Facebook was hoping to increase engagement on its platform. The more you use the Facebook platform, the more Facebook knows about you, which is good for its matchmaking capabilities. Facebook is, of course, aware of this problem and began to limit data access some time ago. Given the current scandal and bad press, Facebook will almost certainly continue to constrain data access to third parties.

3. How is this data used to manipulate us?
As mentioned above, Facebook is in the matchmaking business. It sells this access to anyone willing to pay for it. This is no secret - you can go to Facebook and read in great detail how it works. Facebook writes: "With our powerful audience selection tools, you can target the people who are right for your business." It should come as no surprise that by business, they mean anyone willing to pay, including politicians and organizations with political intent. Advertising is manipulation.

4. Are all tech companies the same? What about Apple, Google, Amazon, Twitter?
It's easy to engage in the blame game and begin to accuse all tech companies of being "data hungry". Isn't it always good to know more about your users? Yes, but as we're learning, that knowledge is also a huge liability (and this doesn't even factor in direct legal liabilities - hello GDPR). The central question is whether that knowledge is core to your business. This is clearly not the case for Apple. The vast majority of Apple's business is selling hardware with a very high margin. Apple is now actively advertising the fact that it can take privacy very seriously because its business doesn't depend on user data, which is both true and smart. The majority of income for Amazon is also not advertising, but services (retail and web services). For Google and Twitter, the story is different, because their business does indeed depend on knowing their users for better advertising. Close to 90% of Google's and Twitter's income comes from advertising. Twitter may be in a better position because it is a micro-blogging platform, and it would be difficult to be outraged by the fact that Twitter data can be used by anyone given that it is de facto public data. In addition, Twitter's size is still very small compared to Facebook. Google may be the closest to Facebook in terms of business models. But importantly, Google does not run a social communication network - it tried with Google Plus, but failed - and that sets it a bit apart. It is difficult to insert manipulative political content into the discussion unless you are the discussion platform. Still, the concern with Google is that its business currently depends most strongly on knowing users intimately.

Now what?
These answers can provide us with some insights. The first is that Facebook is never going to change substantially. The more it knows about you, the better it can do its matchmaking, which is of existential importance to its multi-billion dollar business. That is why Mark Zuckerberg has been on a 14-year apology tour - he embodies the idea of asking for forgiveness, not for permission. The second is that Facebook will continue to be used for political manipulation. As historian Niall Ferguson put it so aptly, there are two kinds of politicians: those who understand Facebook advertising, and those who will lose. We have just seen the tip of the iceberg. The third is that regulation will be essential to tame the beast, which is not Facebook, but the extreme effectiveness of micro-targeting. I believe you can manipulate absolutely everyone if you know all the details about their lives, their friends, their fears, and their dreams. And it is generally not necessary to manipulate everyone very strongly; by just nudging a fraction of people undecided on an issue, systems can change rather dramatically. Nudging 10% of swing voters will define the victor; nudging 10% of undecided parents to opt out of vaccination will lead to large disease outbreaks, etc. The fourth is that Mark Zuckerberg may have to step down from Facebook, which could spell its end in the long run. He built Facebook, and stands for everything that happened, for better or worse. I fully believe Facebook did not have any malicious intentions - they simply discovered an extremely lucrative business model and ran with it. But this is not just another "oops - we're sorry" story that's going to go away soon. People are waking up to the core of the Facebook business model - and to some extent to the micro-targeting model - and they don't like it. Someone will have to face the consequences.

CODA
As a final note, I've found it incredibly liberating, a bit more than a year ago, to leave Facebook. I did it because it took more from me than it gave me, and the truly valuable interactions I continued to have through other communication channels. I was also getting concerned about its surveillance power, but that was the lesser problem to me, then. But fundamentally, I do believe that the only way to solve the extreme micro-targeting problem is by abandoning those platforms whose businesses are entirely built on it, and for many of us, this should be easy. I am extremely disturbed to hear some people argue their ability to communicate with friends depends on Facebook. In the end, unless we realize that Facebook's business depends on being our communication platform, and on knowing everything that we communicate through it for efficient micro-targeting, we won't be able to argue we're part of the solution.

Rule 10: Be the best you can be, not the best there is

2018-04-06T20:28:18Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

Comparing yourself to others is perhaps the greatest source of self-inflicted unhappiness there is. Unfortunately in academia, it's rampant. But by realizing that this is a major source of stress, you can better recognize when you fall victim to it, and try to ease its negative effects.

No matter how hard you try, there will always be someone who is better than you. It's a mathematical necessity for all but one person. The day you come to terms with this reality is the day you become more relaxed, and being relaxed makes you perform better (as already indicated in rule 9).

That doesn't mean sitting back and drinking mojitos all day long. In fact, becoming the best you can be is hard work. Some even argue (myself included) that it'll take you an entire life, because it's a never-ending task. Trying to consistently improve yourself seems like a smart strategy in general, not just for a career. The important question to ask is not "how can I be as good as or better than person X", but "how can I be a bit better today than I was yesterday". It seems like a small difference, but the effect is quite enormous.

I know I'll never be the best scientist in the world. I was never the best evolutionary biologist, never the best network scientist, not even the best digital epidemiologist. I won't be the best writer, the best blogger, the best pianist. And even though my kids tell me I'm the best dad in the world, I know I'll never be, because there are over a billion dads in the world, and surely some of them are better. And that's just fine with me, as long as I'm trying to be the best I can be. And if tomorrow I try to be just a little bit better than I was today, everything will turn out all right in the end.

Rule 9: Have alternatives

2018-04-02T13:34:29Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

This rule has carried me through both my academic and non-academic lives for two decades, and it's still going strong.

Having alternatives gives you peace of mind, and in my experience, peace of mind is what allows you to take that occasional extra risk that's necessary to excel at what you do, to innovate at a higher pace than what you'd be comfortable with in the absence of alternatives.

Having alternatives does not mean not being 100% committed to what you currently do. It simply means having that deep trust that tells you "even if things go totally wrong, I'll be fine. There will be something else".

Some people have that trust even when there are no obvious alternatives. I envy those people. Fundamentally, I think they are right. In the end, it'll be alright. In my dreams, I am as cool as that :-) But in my real life, I am not, and I love having a backup plan.

For somewhat random reasons, my backup plan has always involved web technologies. It's something I began playing with as an undergrad, and that I kept getting better at over the years, out of a fascination for the rapidly expanding web and all its implications. The day I realized these skills have serious market value was the day I became a much more relaxed and focused student of biology. I studied biology for the love of plants and animals, and I did my PhD in theoretical biology because I wanted to very deeply understand the most important idea in the world (evolution). I absolutely did what I loved, but it was also absolutely clear that the market for this kind of knowledge was virtually non-existent, and that having an alternative was necessary.

Asking people to reflect on alternative career paths is some kind of taboo - often used as a euphemism to suggest that they're not good enough at what they're doing. This is not at all what I mean when I invite people to reflect on alternatives; quite the opposite. Realizing that you have options is a great relief and brings back a sense of control. And because of that, it will most likely improve your ability to concentrate on what you're currently doing, enabling you to do the best work you possibly can.

Rule 8: Be visible

2018-03-30T14:45:20Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

As indicated at the end of the last rule (networks, networks, networks), talking about your work and ideas is very important, and it gets more and more important by the day.

Some of us have grown up in a culture that is deeply rooted in the exact opposite idea. When I grew up, I learned proverbs like "Reden ist Silber, Schweigen ist Gold" (speech is silver, silence is golden), or "Eigenlob stinkt" (self-praise stinks). I've written before that I think modest chronic under-confidence is much more harmful than modest chronic overconfidence, so here I'll focus exclusively on my belief that being quiet about your own work, in the hope that it'll be discovered because of its own merit, is a bad idea.

Ultimately, in order to be recognized for your work, it needs to be known. You need to be known. The traditional route is to publish in good journals, present at good conferences, and network with the right people. These are still very good ideas, precisely because they help you and your work be visible. But they are by no means the only routes. Today, there is a multitude of options that you can add to that arsenal, and amplify the effects of the traditional route. The most obvious one is public social media - in other words, Twitter. I didn't care too much about Facebook before the CA story, because at the end of the day, I don't need my "friends" to hear about my work - I need to reach everyone else. I strongly advise you to tweet, and tweet regularly; not just about your work, but generally interesting stuff. People follow other people if they think they are a good source of information. Try to be one.

The other extremely good way, and completely underutilized in my opinion, is to do interesting things on the web. There is no science that you could not somehow make more attractive on the web. Most of the work I do these days is fundamentally web-based, which makes things a little easier - it's already online by design. But even if you work in, say, molecular biology, you're only limited by your creativity with respect to what you can do on the web. Why don't you create that amazing website where you list your work, blog about it, blog about other people's work, create interactive visualizations of your models, write short tutorials on certain aspects of your work that you know are relevant to others? When you put consistent effort into such things, you'll grow your visibility dramatically - often explosively, if something you did on the web goes viral for one reason or another.

Naturally, there is a trade-off here, in the sense that you can only invest so much time in such visibility efforts. But when you think about it, the kinds of skills you'll learn doing that - mostly in the form of getting proficient with web technologies - are highly marketable, and will be extremely useful for the rest of your career. For PhD students, I would recommend spending at least 10% of your time on doing this. It'll be worth it.

Rule 7: Networks, networks, networks

2018-03-11T13:05:06Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

To get the job of your dreams, you need two things:

1. Have the right skills.
2. Be at the right place, at the right time.

Most people know what is needed to meet the first criterion: education & talent. That one's easy to agree on.

What's harder to agree on is how much the second point matters, and how you achieve that goal. Even die-hard fans of the idea that "I got here because I'm awesome and hard-working" are coming around to the idea that that's not the entire story. There are always people who are working harder, and are smarter than you, so other factors must be at play, too.

How to be at the right place, at the right time? Luck is one of the things that makes that happen. The problem with luck, of course, is that you can't do anything about it, by definition. "Just be lucky" isn't great advice.

Better advice can be found by thinking about social networks. The small-world phenomenon - the observation that you are connected to everyone on the planet by just a few hops - is now well understood and described. In other words, there is always the "I know someone who knows someone who knows someone who knows about this fantastic opportunity" situation. And to take advantage of this situation, you can improve your position in the network, to be closer than others to such opportunities.

This is what people usually mean when they say you should network. Honestly, I never understood exactly what they meant. "To network" seems like a verb, but it makes little sense. We are all part of the big human social network, so what exactly does it mean "to network"?

In my experience, to network productively means to try and get closer to interesting opportunities, and to interesting people (because interesting opportunities tend to cluster around interesting people). For that to happen, you need more connections to those people. One piece of advice could therefore be to talk to as many people as possible. But that alone won't cut it - if you spend all your time socializing, and talking to new people, what will you tell them? That you are spending 100% of your time on socializing? Clearly, there is a trade-off between doing novel, interesting things, and talking to others about them.

Importantly, the other extreme - doing 100% interesting work and 0% networking - is not a good idea either. Unfortunately, it remains some kind of ideal, especially in the academic world, where a lot of people continue to think that eventually, their work will speak for itself. That is very, very rarely, if ever, the case. If you're doing great things, tell others about it!

The other benefit of networking with interesting people is not just to tell them about what you're doing, but to learn about what they and their contacts are doing. The number of interesting ideas one can get from a good social network is absolutely astounding.

So overall, I would argue you should network as much as possible, i.e. to talk about your work, and to get more ideas, where "as much as possible" means as long as it doesn't negatively impact your work. Coincidentally, this is why I am such a huge fan of Twitter - it's an extremely efficient way to talk about your work and ideas, and to get input from other people you find interesting. But that's something for the next rule.

Closing tidbit 1: My own introduction to social network theory was during a sociology class at Stanford, where the professor asked us to read work by a sociologist named Mark Granovetter on "how people get jobs". Pretty boring, I thought. But as I dug deeper, I came to learn about his fascinating findings that most people seem to get crucial information about job opportunities not from strong ties in the network (good friends and family), but predominantly through weak ties (i.e. acquaintances). This pattern has been observed in many other kinds of networks. His paper "The strength of weak ties" has been cited over 45,000 times, and he's a strong contender for a Nobel.

Closing tidbit 2: The US Bureau of Labor Statistics says that 70 percent of jobs are found through networking.

Rule 6: Say no

2017-11-24T19:02:32Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

Ask anyone in more advanced stages of their career about their biggest weaknesses in their professional lives, and you'll very often hear the phrase "I say yes too often".

So here is a simple rule: say no more often.

It sounds like bad advice. Shouldn't we be more open? Shouldn't we welcome new opportunities? Shouldn't we be excited if we are asked for input? Yes! Yes, we should be, but the more choices we have, the more selective we have to be.

Your time is very limited. Your time of full concentration is even more limited. The real problem is that for every yes, you're taking away your resources from other things that you also said yes to. If you say yes to too many things, you either won't be able to give your projects the attention they need, or you'll disappoint people you said yes to previously (or both). Either way, it's bad.

"But isn't my CV more impressive if it has lots of stuff on it? The more, the better?" you may ask, especially early in your career. The advice I'd give here is the same as the advice I'd give on how to prepare a presentation - most people will be able to take away one thing from it, a few may be able to walk away with 2-3 things. That's it. I think the same is true for a CV - after some basic vetting, you will be mainly associated with one thing that you did exceptionally well. The thing that truly stood out. The thing nobody else did.

This reminds me of one of the many great pieces of startup advice that YC gives: "We often say that a small group of customers who love you is better than a large group who kind of like you." I would argue the same is true when people decide whether to hire you or fund you. If everybody feels OK about your work, you're in trouble. If you have a few people who love one or two things you did extremely well, you'll be doing much better - they'll be your champions. I know this flies in the face of the advice some people give, especially in academia, which boils down to "just be sure to have all checkboxes ticked off, and don't show any weaknesses". All I can say is these people are wrong. Of course, if you're looking to spend your working life at incredibly boring places, you should follow these rules. Which, coincidentally, reminds me of yet another great piece of advice I heard around YC: When looking for brilliant people, look for the presence of strength, not the absence of weakness.

What does this have to do with saying no? Simple: you can't do something great unless you devote very large chunks of time to it. With every yes, you dilute yourself. So be careful when you say yes. Say yes only to things you can absolutely commit to, and no to everything else. Don't feel bad about saying no - you're really saying yes again to the things that you've already committed to.

Rule 5: Get on board with tech

2017-11-14T12:08:17Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

This is one of the simpler rules, but I still find it surprising that even young people don't seem to grasp the extent to which technology is absolutely central in every job of the future (and increasingly of the present). Not being able to write and read code, and to understand how the web and computers work, at a fairly good level, will increasingly be the same as not being able to read and write.

Part of the reason, I suppose, has to do with the fact that it's currently very popular to take the contrarian view - you can find op-ed pieces saying "don't learn to code". The best advice I can give is to completely ignore these pieces. If you bother looking up some of these articles, you will almost invariably find that they are written by people who have made a great career based on their very ability to code. It's really simple: those who understand and shape technology will lead, the rest will follow.

Of course, not everyone who can program will be a programmer, just like not everyone who can write will become a writer.

A slight extension of this rule is to fully embrace technology. I am not saying that all technology is always good, nor would I say generally that the more technology, the better. We can argue about this forever, but there is a clear historical pattern you must be aware of: there has always been more technology at time t+1, than at time t. Fully embracing technology is the only way to be able to deal with it. Even if you come to the conclusion that a given technology is bad (for whatever reason), you will be much better equipped to criticize it if you fully understand it.

So, get on board with tech. It's not optional anymore.

Rule 4: Surround yourself with people who are better than you

2017-09-12T15:12:32Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

There's a saying that I love: if you're the smartest person in the room, you're in the wrong room.

As much as you will grow professionally on your own, I strongly believe that a large part - perhaps even the largest part - of your growth will come from the people you are surrounded by.

One way to look at this is the following: imagine that you will become the average of the five people you are surrounded by the most. I don't think this way of thinking is too far away from the truth. As a consequence, if you are surrounded by people who are in some ways better than you, then that means that you will be able to learn a lot from them, and grow. The opposite is also true, hence the saying that if you are the smartest person in the room, you should really find another room.

This doesn't feel natural to most of us. It certainly doesn't feel natural to me. For most of us, the feeling of being the smartest person in the room gives us a cozy feeling; a feeling of being in control of the situation; a feeling that there is nothing to worry about. But in reality, you should actually be worried, because it means you are not growing as much as you could.

On the flip side, being the least smart person in the room can be quite painful (notice that I use smart here somewhat liberally, not necessarily to mean intelligent, but simply to be very good at something). In my experience, the ability to stand this pain is an extreme booster for anything you do. Whether it's personal development, scientific research, sports, arts: if you surround yourself with people who are better than you, you will grow.

When I was younger, I had a phase where I was ambitious enough to become a decent squash player. At some point, one of my work colleagues at the time invited me to go and play squash with him. Never in my life was I so humiliated doing sports. I did not stand a chance against this guy. Nevertheless, it became obvious very quickly that I had never learned faster, and more, than playing against him. By playing against someone who was better than me, again and again, my own game improved dramatically. And ironically, my aspirations of becoming a decent squash player eventually came true (that was a long time ago ;-).

Another mantra that is relevant here, and that I am sure you have heard many times before, is to get out of your comfort zone. The idea here is exactly the same: by challenging yourself - truly challenging yourself so that it feels uncomfortable - you will build the resilience and strength that is important for growth.

So don't be afraid to feel stupid. Feeling stupid is a sure sign that you are exposing yourself to things you don't know. Feeling stupid is an opportunity to learn. A great read in this regard is the timeless essay The importance of stupidity in scientific research.

Rule 3: Enthusiasm makes up for a lot

2017-09-12T15:10:11Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

As mentioned in the first rule - do novel, interesting things - eighty percent of success is showing up, according to Woody Allen. Another famous quote is "success is 1% inspiration, and 99% perspiration" (attributed to Edison). Both of these quotes ring very true to me. And what you need in order to keep showing up, and to keep perspiring, is enthusiasm and drive.

Enthusiasm makes up for a lot. Not for everything, but for a lot. The best people I've worked with were deeply enthusiastic about the things they were working on. The vast majority of us are not born geniuses. But with enthusiasm, we can come as close as possible. Enthusiasm makes us continue in the face of difficulty, and failure. Enthusiasm keeps us going through the rough spots, which we will inevitably hit. Enthusiasm is contagious.

The advice here is not so much a simple "be enthusiastic", but rather, that if you don't feel deep enthusiasm for a particular thing, it's going to be very challenging. On the flip side, if you do feel deep enthusiasm for something, but don't feel you can compete with others in terms of brilliance, don't let that discourage you. By consistently showing up, and by continuing to work hard on it, you will eventually get farther than most.

Because enthusiasm is contagious, be sure to surround yourself with people who are truly enthusiastic about the things they're working on. Which brings us to the next rule: if you're the smartest person in the room, you're in the wrong room.

Rule 2: If you can't decide, choose change

2017-09-05T04:39:44Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

It took me about 30 years to figure this out, but ever since I stumbled on it, I've found it applicable to any situation.

We need to make decisions every single day, and it seems that much of the career angst that befalls all of us from time to time is based on the fear that we could make the wrong decisions. Decisions are easy when the advantages clearly outweigh the disadvantages (or the other way around). Things get tricky when the balance is not as clear, and when the lists of potential positives and negatives add up to roughly the same. The inability to make a decision is one of the most dreadful feelings.

Whenever I am in such a situation where I can't decide because all options seem roughly equal, I choose the one that represents the most change.

Here's why: on a path that is dotted with making decisions, you are inevitably going to have regrets down the line. There are two possible types of regrets; in the first one, you regret a path not taken; in the second, you regret having taken a path. My philosophy is to avoid the "path not taken" regret. It's the worse kind of regret. You will at times have regrets about having taken the wrong path - but at least you took the bloody path! It meant change, and it was probably exciting, at least for a while. Even if it turns out to have been the wrong decision, you've learned something new. You moved. You lived.

As far as we know, we only have this one life. Explore! Thus: when in doubt, choose change.

Rule 1: Do novel, interesting things

2017-09-03T18:19:39Z

(This post is part of a bigger list of rules that I have found helpful for thinking about a career, and beyond. See this post for an explainer).

This is possibly the most important rule, and thus I am putting it right at the start. The way this rule is phrased is partly inspired by Y Combinator's rule for startups: make something people want. If I were asked to condense career advice - specifically in academia, but I think also more broadly - into four words, it would be these: do novel, interesting things.

The rule is really composed of three sub rules: First, do something (surprisingly underestimated in my experience). Second, do something that is novel. And third, do something that is not just novel, but also interesting. Let's take a look at these, one by one.

Do something
I find it hard to overstate how important this is. I've met countless brilliant young people who were clearly very smart and creative but had nothing to show for it. In academia, this is often phrased in the more negative "publish or perish", which I think is slightly misleading and outdated. What it should really say is "show your work that demonstrates your thinking, creativity, brilliance, determination, etc.". It doesn't have to be papers - it could really be anything. A blog. A book. Essays. Software. Hardware. Events you organized. Whatever - as long as it has your stamp on it, as long as you can say "I did this", you'll be fine.

I need to repeat that it's hard to overstate how important that is. As Woody Allen famously said, "Eighty percent of life is showing up". This is why I urge anyone who wants a job in a creative field - and I consider science and engineering to be creative fields - to actually be creative and make things, and make them visible. The most toxic idea in a young person's mind is that they have nothing interesting to say, and so they shouldn't say it in the first place. It gets translated into not showing what you've done, or worse, into not even doing it. Don't fall into that trap (I've written a bit more about this in a post entitled The curse of self-contempt).

Do something novel
Novelty is highly underrated. This is a bit of a personal taste, but I prefer something that is novel but still has rough edges, over something that is a perfect copy of something existing. I suppose the reason most people shy away from novelty, especially early in their career, is that it takes guts: it's easy for others to ridicule novel things (in fact, most novel things initially seem a little silly). But early in your career is when novelty matters the most, because that's when you are most capable of doing novel things, since your brain is not yet completely filled up with millions of other people's ideas.

Novelty shouldn't be misunderstood as "groundbreakingly novel from every possible angle". It is often sufficient to take something existing and reinvent only a small part of it, which in turn may make the entire thing much more interesting.

Do something that is also interesting
While novelty is per se often interesting, it's not a guarantee. So make sure that what you do is interesting to you, and at least a few other people. It's obvious that doing things that are interesting will be good for your career. This doesn't need a lot of explanation, but it's important to realize that what is interesting is for you to figure out. Most people don't think a lot about this, and go on doing what everybody else is doing, because that must mean it's interesting (when in reality it often isn't, at least not to you). The ability to articulate for yourself why something is interesting is extremely important. Practice it by pitching your ideas to an imaginary audience - you'll very quickly feel whether an idea excites you, or whether you feel like you're just reciting someone else's thinking.

10 rules for the career of your dreams ;-)

2017-09-03T18:19:26Z

A few weeks ago, I was asked to give a short presentation at a career workshop for visiting international students at EPFL. The idea of the event was to have a few speakers who all shared an academic background to reflect on their (very different) career paths. As I started writing down the things that have guided me throughout the years, I began to realize that this would end up being one of those lists ("12 things to do before you die"), and naturally, I tried to come up with the tackiest title I could imagine, which is the title of this post.

Underneath the tacky title, however, is a serious list of rules that I've developed over the years. Some of these rules I've known to be helpful since I was a high school student. Others I've discovered much later, looking back on my path and trying to figure out, with hindsight, why I went one way, rather than the other.

Almost two years ago, I received an email from a student asking for career advice. I answered, and decided to post the answer on this blog. That post - advice to a student - has been viewed a few hundred times since then, and I figured I should also share the career workshop talk, as a few people, today and in the future, may find it helpful. There is little use in uploading the slides since they were just the list of the rules. What I am going to do here instead is to expand on each of the rules a little more. This is a bit of an experiment for me, but hopefully, this will work out fine. Each of the ten rules will be its own post, and I will keep updating this post with links to each rule once they get published. So without further ado, here is the unfinished list of rules which I hope to complete over the next few weeks:

1. Do novel, interesting things

2. If you can't decide, choose change

3. Enthusiasm makes up for a lot

4. Surround yourself with people who are better than you

5. Get on board with tech

6. Say no

7. Networks, networks, networks

8. Be visible

9. Have alternatives

10. Be the best you can be, not the best there is

AI-phobia - a convenient way to ignore the real problems

2017-06-26T10:55:29Z

Artificial intelligence - AI - is a big topic in the media, and for good reasons. The underlying technologies are very powerful and will likely have a strong impact on our societies and economies. But it's also very easy to exaggerate the effects of AI. In particular, it's very easy to claim that AI will take away all of our jobs, and that as a consequence, societies are doomed. This is an irrational fear of the technology, something I refer to as AI-phobia. Unfortunately, because it is so appealing, AI-phobia currently covers the pages of the world's most (otherwise) intelligent newspapers.

A particularly severe case of AI-phobia appeared this past weekend in the New York Times, under the ominous headline "The Real Threat of Artificial Intelligence". In it, venture capitalist Kai Fu Lee paints a dark picture of the future where AI will lead to mass unemployment. This argument is as old as technology - each time a new technology comes along, the pessimists appear, conjuring up the end of the world as we know it. Each time, when faced with the historic record that shows they've always been wrong, they respond by saying "this time is different". Each time they're still wrong. But, really, not this time, claims Lee:

Unlike the Industrial Revolution and the computer revolution, the A.I. revolution is not taking certain jobs (artisans, personal assistants who use paper and typewriters) and replacing them with other jobs (assembly-line workers, personal assistants conversant with computers). Instead, it is poised to bring about a wide-scale decimation of jobs — mostly lower-paying jobs, but some higher-paying ones, too.

Where is Wikipedia's [citation needed] when you need it the most? Exactly zero evidence is given for the rather outrageous claims that AI will bring about a wide-scale decimation of jobs. Which is not very surprising, because there is no such evidence.

Lee then goes into the next paragraph, claiming that the companies developing AI will make enormous profits, and that this will lead to increased inequality:

This transformation will result in enormous profits for the companies that develop A.I., as well as for the companies that adopt it. Imagine how much money a company like Uber would make if it used only robot drivers. Imagine the profits if Apple could manufacture its products without human labor. Imagine the gains to a loan company that could issue 30 million loans a year with virtually no human involvement. (As it happens, my venture capital firm has invested in just such a loan company.)

We are thus facing two developments that do not sit easily together: enormous wealth concentrated in relatively few hands and enormous numbers of people out of work. What is to be done?

There are numerous problems with this argument. Technology has always helped to do something better, faster, or cheaper - any new technology that wouldn't do at least one of these things would be dead on arrival. With every new technology that comes along, you could argue that the companies that develop it will make huge profits. And sometimes they do, especially those that manage to get a reasonable chunk of the market early on. But does this always lead to an enormous wealth concentration? The two most recent transformative technologies, microprocessors and the internet, have certainly made some people very wealthy, but by and large the world has profited as a whole, and things are better than they have ever been in human history (something many people find hard to accept despite the overwhelming evidence).

What's more, technology has a funny way of spreading in ways that most people didn't intend or foresee. Certainly, a company like Uber could in principle use only robot drivers (I assume Lee refers to autonomous vehicles). But so could everybody else, as this technology will be in literally every car in the near future. Uber would probably lower its prices even further to be more competitive. Other competitors could get into this market very easily, again lowering overall prices and diminishing profits. New businesses could spin up, based on a newly available, cheap transportation technology. These Uber-like companies could start to differentiate themselves by adding a human touch, creating new jobs that don't yet exist. The possibilities are endless, and impossible to predict.

Lee then makes a few sensible suggestions - which he calls "the Keynesian approach" - about increasing tax rates for the super wealthy, using the money to help those in need, and also argues for a basic universal income. These suggestions are sensible, but they are sensible already in a world without AI.

He then singles out the US and China in particular, and this is where things get particularly weird:

This leads to the final and perhaps most consequential challenge of A.I. The Keynesian approach I have sketched out may be feasible in the United States and China, which will have enough successful A.I. businesses to fund welfare initiatives via taxes. But what about other countries?

Yes, what about them? Now, I do not for a moment doubt that the US and China will have many successful AI businesses. The US in particular has almost single-handedly dominated the technology sector in the past few decades, and China has been catching up fast. But to suggest that these countries can tackle the challenge because they have AI businesses that will be able to fund "welfare initiatives via taxes" - otherwise called socialism, or a social safety net - is ignoring today's realities. The US in particular has created enormous economic wealth thanks to technology in the past few decades, but has completely failed to ensure that this wealth is distributed fairly among its citizens, and is consistently ranked as one of the most unequal countries in the world. It is clearly not the money that is lacking here.

Be that as it may, Lee believes that most other countries will not be able to benefit from the taxes that AI companies will pay:

So if most countries will not be able to tax ultra-profitable A.I. companies to subsidize their workers, what options will they have? I foresee only one: Unless they wish to plunge their people into poverty, they will be forced to negotiate with whichever country supplies most of their A.I. software — China or the United States — to essentially become that country’s economic dependent, taking in welfare subsidies in exchange for letting the “parent” nation’s A.I. companies continue to profit from the dependent country’s users. Such economic arrangements would reshape today’s geopolitical alliances.

Countries other than US and China beware! The AI train is coming and you will either be poor or become dependent slaves of the US and Chinese AI companies that will dominate the world! You could not make this up if you had to (although there are some excellent Sci-Fi novels that are not too far away from this narrative).

I am sure Kai Fu Lee is a smart person. His CV is certainly impressive. But it strikes me as odd that he wouldn't come up with better alternatives, and instead only offers a classical case of a false dichotomy. There are many possible ways to embrace the challenges, and only a few actually have to do with technology. Inequality does not seem to be a technical problem, but rather a political problem. The real problems are not AI and technology - they are schools that are financed by local property taxes, health insurance that is tied to employment, extreme tax cuts for the wealthy, education systems with exorbitant tuition fees, and so on.

Let's not forget these problems by being paralyzed by the irrational fear caused by AI-phobia.

Lee closes by saying

A.I. is presenting us with an opportunity to rethink economic inequality on a global scale.

It would be a shame if we indeed required artificial intelligence to tackle economic inequality - a product of pure human stupidity.

Gender diversity in tech - a promise

2017-03-24T15:35:32Z

It doesn't take much to realize that the gender ratio in technology is severely out of balance. Whether you look at employees at tech companies, computer science faculty members, graduates in computer and information sciences, or user surveys on StackOverflow, you find almost the same picture everywhere.

From personal experience, it seems to me that the situation is considerably worse in Europe than in the US, but I don't have any data to back this up.

If there is any good news, it's that the problem is increasingly recognized - not nearly enough, but at least it's going in the right direction. The problem is complex and there is a lot of debate about how to solve it most effectively. This post is not about going into this debate, but rather to make a simple observation, and a promise.

The simple observation is that I think a lot of it has to do with role models. We can do whatever we want, but if a particular group is overwhelmingly composed of one specific phenotype, we have a problem, because anyone who is not of that phenotype is more likely to feel "out of place" than they would otherwise, no matter how welcoming that group is.

The problem is that for existing groups, progress may be slow because adding new people to the group to increase diversity may initially be difficult, for many different reasons. Having a research group that is mostly male, I am painfully aware of the issues.

For new groups, however, progress can be faster, because it is often easier to prevent a problem than to fix one. And this is where the promise comes in. Last year, I became the academic director of a new online school at EPFL (the EPFL Extension School, which will focus on continued technology education). This sounds more glorious than it should, because at the time, this new school was simply an idea in my head, and the team was literally just me. But I made a promise to myself, namely that I would not build a technology program and have practically no women teaching on screen. No matter how well they would do it, if the teachers were predominantly male, we would be sending once again, without ill intent, the subtle signal that technology is something for guys.

Today, I want to make this promise publicly. At the EPFL Extension School, we will have gender parity for on-screen instructors. I can't guarantee that we will achieve this at all times, because we are not (yet) a large operation, and I also recognize that at any point in time we may be out of balance, hopefully in both directions, due to the fact that people come and people go. But it will be part of the school's DNA, and if we're off balance, we know what we have to do, and the excuse that it's a hard problem once you have it won't be acceptable.

Technology in public health: A discussion with Caroline Buckee

2017-03-19T11:10:54Z

A few weeks ago, I came across a piece in the Boston Globe entitled Sorry, Silicon Valley, but ‘disruption’ isn’t a cure-all. It's a very short op-ed, so I recommend reading it. The piece was written by Caroline Buckee, Assistant Professor at the Harvard T.H. Chan School of Public Health. I know Caroline personally, and given that she has written some of the key papers in digital epidemiology, I was surprised to read her rant. Because Caroline is super smart and her work extremely innovative, I started to ask myself if I am missing something, so I decided to write to her. My idea was that rather than arguing over Twitter, we could have a discussion by email, which we can then publish on the internet. To my great delight, she agreed, and I am now posting the current state of the exchange here.

From: Marcel Salathé
To: Caroline Buckee
Date: 16. March 2017

Dear Caroline,

I hope this email finds you well. Via Marc I recently found you on Twitter, and I’m looking forward to now seeing more frequently what you’re up to.

Through Twitter, I also came across an article you wrote in the Boston Globe (the "rant about pandemic preparedness", as you called it on Twitter). While I thought it hilarious as a rant, I also thought there were a lot of elements in there that I strongly disagree with. At times, you come across as saying “how dare these whippersnappers with their computers challenge my authority”, and I think if I had been a just-out-of-college graduate reading this, excited about how I could bring digital tools to the field of global health, I would have found your piece deeply demotivating.

So I wanted to clarify with you some issues you raised there, and share those with the broader community. Twitter doesn’t work well for this, in my experience; but would you be willing to do this over email? I would then put the entire discussion on my blog, and you can of course do whatever you want to do with it. I promise that I won’t do any editing at all, and I will also not add anything beyond what we write in the emails.

Would you be willing to do this? I am sure you are super busy as well, but I think it could be something that many people may find worthwhile reading. I know I would.

All the best, and I hope you won’t have to deal with snow any longer in Boston!

Cheers,
Marcel

From: Caroline Buckee
To: Marcel Salathé
Date: 16. March 2017

Hi Marcel,

Sure, I would be happy to do that, I think this is a really important issue - I'll put down some thoughts. As you know, I like having technical CS and applied math grads in my group, and in no way do I think that the establishment should not be challenged. We may disagree as to who the establishment actually is, however.

My concern is with the attitudes and funding streams that are increasingly prevalent among people I encounter from the start up world and Silicon Valley more generally (and these look to become even more important now that NIH funding is going away) - the attitude that we no longer need to do real field work and basic biology, that we can understand complex situations through remote sensing and crowd sourcing alone, that short term and quick fix tech solutions can solve problems of basic biology and complex political issues, that the problem must be to do with the fact that not enough physicists have thought about it. There is a pervasive arrogance in these attitudes, which are ultimately based on the assumption that technical skill can make up for ignorance.

As for the idea that my small article would give any new grad pause for thought, I hope it does. I do not count myself as an expert at this stage of my career - these issues require years of study and research. I believe I know enough to understand that a superficial approach to pandemic preparedness will be unsuccessful, and I am genuinely worried about it. The article was not meant to be discouraging, it was supposed to make that particular echo chamber think for a second about whether they should perhaps pay a little more attention to the realities, rich history, and literature of the fields they are trying to fix. (As a side note, I have yet to meet a Silicon Valley graduate in their early 20's who is even slightly deflated when presented with evidence of their glaring ignorance... but I am a bit cynical...!)

In my experience, my opinion is unpopular (at my university and among funders), and does not represent "the establishment". At every level, there is an increasing emphasis on translational work, a decreasing appetite for basic science. This alarms me because any brief perusal of the history of science will show that many of the most important discoveries happen in pursuit of some other scientific goal whose original aim was to understand the world we live in in a fundamental sense - not to engineer a solution to a particular problem. In my field, I think the problem with this thinking is illustrated well by the generation of incredibly complex simulation models of malaria that are intended to help policy makers but are impossible to reproduce, difficult to interpret, and have hundreds of uncertain parameters, all in spite of the fact that we still don't understand the basic epidemiological features of the disease (e.g. infectious duration and immunity).

I think there is the potential for an amazing synergy between bright, newly trained tech savvy graduates and the field of global health. We need more of them for sure. What we don't need is to channel them into projects that are not grounded in basic research and deeply embedded in field experience.

I would enjoy hearing your thoughts on this - both of us are well-acquainted with these issues and I think the field is quite divided, so a discussion could be useful.

I hate snow. I hate it so much!

Take care,

Caroline

From: Marcel Salathé
To: Caroline Buckee
Date: 18. March 2017

Dear Caroline,

Many thanks for your response, and thanks for doing this. I agree with you that it’s an important issue.

I am sorry that you encounter the attitude that we "no longer need to do real field work and basic biology, that we can understand complex situations through remote sensing and crowd sourcing alone”. This would indeed be an arrogant attitude, and I would be as concerned as you are. It does not reflect, however, my experience, in which the prevailing attitude has largely been the opposite: that basic research and field work are all that is needed, and the new approaches we and others tried to bring to the table were not taken seriously (the “oh you and your silly tech toys” attitude). So you can imagine why your article rubbed me a bit the wrong way.

I find both of these attitudes shortsighted. Let’s talk about pandemic preparedness, which is the focus of your article. Why wouldn’t we want to bring all possible weapons to the fight? It's very clear to me that both basic science and field work as well as digital approaches using mobile phones, social media, crowdsourcing, etc. will be important in addressing the threat of pandemics. Why does it have to be one versus the other? Is it just a reflection of the funding environment, where one dollar that is going to a crowdsourcing project is one dollar that is taken away from basic science? Or is there a more psychological issue, in that basic science is worthy of a dollar, but novel approaches like crowdsourcing are not?

You write that “the next global pandemic will not be prevented by the perfectly designed app. “Innovation labs” and “hackathons” have popped up around the world, trying to make inroads into global health using technology, often funded via a startup model of pilot grants favoring short-term innovation. They almost always fail.” And just a little later, you state that "Meanwhile, the important but time-consuming effort required to evaluate whether interventions actually work is largely ignored.” Here again, it’s easy to interpret this as putting one against the other. Evaluation studies are important and should be funded, but why can’t we at the same time use hackathons to bring people together and pick each other’s brains, even if only for a few days? In fact, hackathons may be the surest way to demonstrate that some problems can’t be solved on a weekend. And while it’s true that most ideas developed there end up going nowhere, some ideas take on a life of their own. And sometimes - very rarely, but sometimes - they lead to something wonderful. But independent of the outcome, people often walk away enlightened from these events, and have often made new connections that will be useful for their futures. So I would strongly disagree with you that they almost always fail.

Your observation that there is "an increasing emphasis on translational work, a decreasing appetite for basic science” is probably correct, but rather than blaming it on 20-year-old Silicon Valley graduates, we should ask ourselves why that is. Translational work is directly usable in practice, as per its definition. No wonder people like it! Basic research, on the other hand, is a much tougher sell. Most of the time, it will lead nowhere. Sometimes, it will lead to interesting places. And very rarely, it will lead to absolutely astonishing breakthroughs that could not have happened in any other way (such as the CRISPR discovery). By the way, in terms of probabilities of success, isn’t this quite similar to the field of mobile health apps, which you dismissed as "a wasteland of marginally promising pilot studies, unused smartphone apps, and interesting but impractical gadgets that are neither scalable nor sustainable”? But I digress. Anyway, rather than spending our time explaining this enormous value of basic research to the public, which ultimately funds it, we engage in petty fights over vanity publications and prestige. People holding back data so that they can publish more; people publishing in closed access journals; hiring and tenure committees valuing publications in journals with high impact factors much more than public outreach. I know you agree here, because at one point you express this very well in your piece when you say that "the publish-or-perish model of promotion and tenure favors high-impact articles over real impact on health."

This is exactly what worries me, and it worries me much, much more than a few arrogant people from Silicon Valley. We are at a point where the academic system is so obsessed with prestige that it has created perverse incentives leading to the existential crisis science finds itself in. We are supposed to have an impact on the world, but the only way impact is assessed is by measures that have very little relevance in the real world, such as citation records and prizes. We can barely reproduce each other’s findings. For a long time, science has moved away from the public, and now it seems that the public is moving away from science. This is obviously enormously dangerous, leading to “alternative fact” bubbles, and politicians stating that people have had enough of experts.

Against this background, I am very relieved to see scientists and funders excited about crowdsourcing, about citizen science, about creating apps that people can use, even at the risk that many of them will be abandoned. I would just wish that when traditional scientific experts see a young, just-out-of-college grad trying to solve public health with a shiny new app, they would go and offer to help with their expertise - however naive the approach, or rather *especially* when the approach is naive. If the grads are too arrogant to accept the help, so be it. The people who will change things will always appreciate a well-formed critique, or a piece of advice that helps them jump over a hurdle much faster.

What I see, in short, is that very often, scientific experts, who already have a hard time getting resources, feel threatened by new tech approaches, while people trying to bring new tech approaches to the field are getting the cold shoulder from the more established experts. This, to me, is the wrong fight, and we shouldn’t add fuel to the fire by providing false choices. Why does it have to be "TED talks and elevator pitches as a substitute for rigorous, peer-reviewed science”; why can’t it be both?

Stay warm,

Marcel

PS Have you seen this “grant application” by John Snow? It made me laugh and cry at the same time… tinyurl.com/lofaoop

From: Caroline Buckee
To: Marcel Salathé
Date: 18. March 2017

Hi Marcel,

First of all, I completely and totally agree about the perverse incentives, ridiculous spats, and inefficiencies of academic science - it's a broken system in many ways. We spend our lives writing grants, we battle to get our papers into "high impact" journals (all of us do even though we hate doing it), and we are largely rewarded for getting press, bringing in money, and doing new shiny projects rather than seeing through potentially impactful work.

You say that I am probably right about basic science funding going away, but I didn't follow the logic from there. We should educate the public instead of engaging in academic pettiness - yes, I agree. Basic science is a tough sell - not sure I agree about that as much, but this is probably linked to developing a deeper and broader education about science at every level. Most basic science leads nowhere? Strongly disagree! If you mean by "leads nowhere" that it does not result in a product, then fine, but if you mean that it doesn't result in a greater understanding of the world and insights into how to do experiments better, even if they "fail", then I disagree. The point is that basic science is about seeking truth about the world, not in designing a thing. You can learn a lot from engineering projects, but the exercise is fundamentally different in its goals and approach. Maybe this is getting too philosophical to be useful.

In any case, I think it's important to link educating the public about the importance of basic science directly to the arrogance of Silicon Valley; it's not unrelated. Given that NIH funding is likely to become even more scarce, increasing the time and effort spent getting funding for our work, these problems will only get worse. I agree with you that this is a major crisis, but I do think it is important to think about the role played by Silicon Valley (and other wealthy philanthropists for that matter) as the crisis deepens. As they generously step in to fill the gaps - and I think it's wonderful that they consider doing so - it creates the opportunity for them to set the agenda for research. Large donations are given by rich donors whose children have rare genetic conditions to study those conditions in particular. The looming threat of mortality among rich old (mostly white) dudes is going to keep researchers who study dementia funded. I am in two minds about whether this increasing trend of personalized, directed funding from individuals represents worse oversight than we have right now with the NIH etc., but it is surely worth thinking about. And tech founders tend to think that tech-style solutions are the way forward. It is not too ridiculous, I don't think, to imagine a world where much if not most science funding comes from rich old white dudes who decide to bequeath their fortunes to good causes. How they decide to spend their money is up to them, but that worries me; should it be up to them? Who should set the agenda? It would be lovely to fund everything more, but that won't happen - there will always be fashionable and unfashionable approaches, not everyone gets funded, and Silicon Valley's money matters.

Public health funding in low and middle income settings (actually, in every setting, but particularly in resource-limited regions) is also a very constrained zero sum game. Allocating resources for training and management of a new mHealth system does take money away from something else. Crowd sourcing and citizen science could be really useful for some things, but yes, in many cases I think that sexy new tech approaches do take funding away from other aspects of public health. I would be genuinely interested - and perhaps we could write this up collaboratively - to put together some case studies and try to figure out exactly how many and which mHealth solutions have actually worked, scaled up, and been sustained over time. We could also dig into how applied public health grants are allocated by organizations to short-term tech pilot studies like the ones I was critical of versus other things, and try to evaluate what that means for funding in other domains, and which, if any, have led to solutions that are being used widely. This seems like it might be a useful exercise.

We agree that there should be greater integration of so-called experts and new tech grads but I don't see that happening very much. I don't think it's all because the experts are in a huff about being upstaged, although I'm sure that happens sometimes. If we could figure this out I would be very happy. This is getting too long so I will stop, but I think it's worth us thinking about why there is so little integration. I suspect some of it has to do with the timescales of global health and requirements for long-term relationship building and slow, careful work in the field. I think some of it has to do with training students to value get-rich-quick start-up approaches and confident elevator pitches over longer term investments in understanding and grappling with a particular field. I do think that your example (a young tech grad trying to naively build an app, and the expert going to them to try to help) should be reversed. In my opinion, the young tech grad should go and study their problem of choice with experts in the field, and subsequently solicit their advice about how to move forward with their shiny app idea, which may by then have morphed into something much more informed and ultimately useful...

PS :)

From: Marcel Salathé
To: Caroline Buckee
Date: 19. March 2017

Dear Caroline

My wording of “leads nowhere” may indeed have been too harsh. I agree with you that basic research, if well designed, will always tell us something about the world. My point was indeed that it doesn’t necessarily lead to a product or a usable method. This is probably a good time to stress that I am a big proponent of basic research - anyone who doubts that is invited to go read my PhD thesis, which was on a rather obscure aspect of theoretical evolutionary biology!

I actually think that the success distribution of basic research is practically identical with that of VC investments. Most VC investments are a complete loss, some return the money, very few return a few X, and the very rare one gives you 100X - 1000X. So is it still worth doing VC investments? Yes, as long as that occasional big success comes along. And so it is with basic research, except, as you say, and I agree, that we will never lose all the money, because we always learn something. But even if you dismiss that entirely, it would still be worth doing.

The topic we seem to be converging on is how much money should be given to what. Unless I am completely misinterpreting you, the frustration in your original piece came from the notion that a dollar in new tech approaches is a dollar taken away from other aspects of public health. With respect to private money, I don’t think we have many options. Whoever gives their wealth gets to decide how it is spent, which is only fair. I myself get some funding from private foundations and I am very grateful for it, especially because I am given the necessary freedom I need to reach the goals I want to achieve with this funding. The issue we should debate more vigorously is how much public money should be spent on what type of approach. In that respect, I am equally interested in the funding vs outcome questions you raised.

As to why there isn’t more integration between tech and public health, I don’t have any answers. My suspicion is that it is a cultural problem. The gap between the two worlds is still very large. And people with tech skills are in such high demand that they can choose from many other options that seem more exciting (even if in reality they end up contributing to selling more and better ads). So I think there is an important role for people like us, who have legs in both worlds, and who can at least try to communicate between the two. This is why I am so careful not to present them as “either or” approaches - an important part of the future work will be done by the approaches in combination.

(I think we’ve clarified a lot of points and I understand your view much better now. I’m going to go ahead and put this on the blog, also to see if there are any reactions to it. I am very happy to go on and discuss more - thanks for doing this!)

Marcel

Self-driving cars: the public health breakthrough of the early 21st century

2017-02-17T13:22:33Z

As readers of this blog know, I am a big fan of self-driving cars. I keep saying that self-driving cars are going to be the biggest public health breakthrough of the early 21st century. Why? Because the number of people that get injured or killed by humans in cars is simply astounding, and self-driving cars will bring this number close to zero.

If you have a hard time believing this, consider these statistics from the Association for Safe International Road Travel:

In the US alone,

each year 37,000+ people die in car crashes - over 1,600 are children, almost 8,000 are teenagers
each year, about 2.3 million people are injured or disabled

Globally, the numbers are even more staggering:

each year, almost 1.3 million people die in car crashes
each year, somewhere between 20 and 50 million people are injured or disabled
Car crashes are the leading cause of death among young people ages 15-29, and the second leading cause of death worldwide among young people ages 5-14.

If car accidents were an infectious disease, we would very clearly make driving illegal.

Self-driving cars will substantially reduce those numbers. It has recently been shown that the current version of Tesla's autopilot reduced crashes by a whopping 40% - and we're in early days when it comes to the sophistication of these systems.

All these data points lead me to the conclusion stated above, that self-driving cars are going to be the biggest public health breakthrough of the early 21st century.

I cannot wait to see the majority of cars being autonomous. I have two kids, ages 4 and 7 - the only time I am seriously worried about their safety is when they are in a car, or when they play near a road, and the stats make this fear entirely rational. According to the CDC, transportation-related injuries are the leading cause of death for children in the US, and I don't assume that this is much different in Europe.

In fact, the only time I am worried about my own safety is when I am in a car, or near a car. I am biking to and from the train station every day, and if you were to plot my health risk over the course of a day, you'd see two large peaks exactly when I'm on the bike.

If there is any doubt that I am super excited to see fully autonomous vehicles on the street, I hope to have put it to rest. But what increasingly fascinates me about self-driving cars, beyond the obvious safety benefits, is what they will do to our lives, and how they will affect public transport, cities, companies, etc. I have some thoughts on this and will write another blog post later. My thinking on this has taken an unexpected turn after reading Rodney Brooks's blog post entitled "Unexpected Consequences of Self Driving Cars", which I recommend highly.

Some News(letter)

2016-12-08T15:54:03Z

(It appears that as I was updating this website, I accidentally sent out a test post - my sincere apologies. The digital transformation is hard. QED.)

I've recently deleted my Facebook account, once again. I'm pretty sure that it's final this time. I respect Facebook as a company and I can see how one can love their products. But it's not for me. It used to be great for staying in touch with friends, and seeing pictures and news of friends and family. But Facebook ended up being mostly ads and "news" about Trump. Then came the fake news story. Then came the long-held realization about micro-targeting for political purposes. Then came the censorship deal with China's government. And the realization that what I see on Facebook, and what my friends see about me, is entirely driven by an algorithm. And this algorithm is entirely driven to maximize profits for Facebook. None of this on its own is a deal breaker, but in combination, and given that there was little upside to being on Facebook, I had to quit. In a strange way, I felt dirty using Facebook - I knew I was the product that was being monetized, but I checked nevertheless multiple times a day, just to think each and every time "why am I doing this?".

I have seen a lot of "quit social media, you'll be better off" posts lately. For me, Twitter has been too much of a benefit, professionally and personally, to just walk away from it. It truly has become my major source of professional news. I try to keep the politics and personal stuff to a minimum, and appreciate it if others do that as well. To me, Twitter is what LinkedIn wanted to be - a professional network. The fact that my Twitter client does not algorithmically filter the tweets is a great benefit. The fact that Twitter has remained true to its roots - short messages where you need to get to the point fast - is a great benefit. The fact that Twitter is public by default is a great benefit - it means that I think twice about posting rants (sometimes I fail at this).

But I feel uneasy about Twitter too. The fact that I am delegating communication to a third party is troubling. What if Twitter gets sold to another company and becomes a horrible product? What if they change in a way I don't like? What if they go out of business? I cannot contact the people who follow me without Twitter's blessing. If Twitter is gone, my contacts are gone. This is not good.

That is why I decided to start a newsletter called Digital Intelligence. I know there are some people who follow me on Twitter because I provide them with interesting bits of news, typically around anything digital - technology, education, academia, economy, etc. - that I find on the web and that I love to share. I think for many it would in fact be more efficient to subscribe to the newsletter instead. Email may not feel like the hip thing in 2017, but I think there are many benefits to email that are simply not there with Twitter. All of us already use email. It's easily searchable. It's not typically filtered by an algorithm (beyond the spam filter, of course). It allows us to go beyond 140 characters when necessary.

But ultimately, I want to be able to communicate with other people without depending on a third-party platform. In the digital world, email is the only way to do that.

Data of the people, by the people, for the people

2016-06-14T08:42:14Z

About 150 years ago, the American president Abraham Lincoln gave a very short speech - only a few minutes long - on a battlefield in Gettysburg, Pennsylvania. The occasion was to honor the soldiers who died in a fierce battle at the height of the American Civil War. Despite the brevity of the speech, and the fact that almost nobody understood what Lincoln was saying, it is now perhaps the most famous speech in US history by a US president. It is only ten sentences long, but to condense it even further here, Lincoln essentially said that there is nothing anyone could do to properly honor the fallen soldiers, other than to help ensure that the idea of this newly conceived nation would continue to live on, and that “government of the people, by the people, for the people, shall not vanish from the earth.”

Why is this such a powerful line? It’s powerful because it expresses in very simple terms the basic idea of democracy, that we the people can form government, and that we the people can make political decisions, which is in itself the best guarantee that the decisions are made in the best interest of us, the people.

So, what does all of this have to do with open data?

Fundamentally, government is about organizing power. The vast majority of us agree that power should be distributed among the many, not the few. To quote John Dalberg-Acton: “Liberty consists in the division of power. Absolutism, in the concentration of power.” That is what democracy is about. And that is the discussion we should have about data. Because data is power. And if liberty consists in the division of power, or in the divided access to power, then that means that liberty also consists in the division of data.

But what does it even mean to say that data equals power?

Data contains information, and information can be used for commercial gains. We all understand that. But the power of data is much more fundamental than that. To understand this, we need to reflect on where we are as humans, at this point in time. We have now entered the second machine age - an age where machines will not only be much stronger, physically, as they have been for centuries, but also much, much smarter than we are. Not just a little smarter, but orders of magnitude smarter. Most of us have come to terms with the fact that machines will achieve human intelligence. But think about machines that are ten times smarter, a hundred times smarter. How do you feel about a machine that is a million times smarter than a human? It’s a question worth asking, because while we may not live to see such a machine, our children, or grandchildren, probably will. In any case, even a machine that’s 100 times smarter than us is something you wouldn’t want to compete against. You wouldn’t feel comfortable if such machines were controlled by a small elite group. However, if such a machine were an agent, at your service, and if everyone had such agents, which they’d use to make their lives better, that would be an entirely different story. Thus, when AI - artificial intelligence - becomes very powerful, it would be a disaster if that power were in the hands of a few. We would go back to absolutism, and despotism. We therefore need to ensure that the power of AI is distributed widely.

There are some efforts, like the non-profit organization OpenAI, that aim to ensure that this is the case. In fact, if you follow the field of machine learning a little bit, a field that is currently at the heart of many of the AI-relevant breakthroughs, then you would see that most organizations are now open-sourcing the code that’s behind these AI breakthroughs. That’s a good thing, because it helps ensure that the raw machinery to build AI, the algorithms, is indeed in the hands of many.

But this is not enough - not nearly. It’s very important to recognize that the power of AI is not simply in the algorithms; it’s not simply in the technology per se. It’s in the data. AI becomes intelligent when it can quickly learn on large amounts of data. AI without data does not exist. The analog version, the human brain, can perhaps help us to understand this idea a bit better. A human brain, in isolation, can only do so many things. It’s when the brain can learn on data that the magic happens. We call this education, or learning more generally. The brain itself is necessary, but it is the access to data - in the form of knowledge, and education - that makes us the most intelligent individuals to ever walk the face of the earth; of such an intelligence that we can even create artificial intelligence. And to take this analogy one step further, if you learn on small, false, or just generally crappy data, your brain will consistently make the wrong predictions. Coincidentally, this is why science has been such a boon for mankind: the scientific method helps us ensure that our brains get trained on high quality data.

So this is the central idea here:

The enormous power of AI is based on data. If we want everyone to have access to this power, we need widespread access to data.

Put slightly differently:

Broad open data access is an absolute necessity for human liberty in the machine age.

If we accept this, then the question immediately arises, how do we get there? The fact that AI power is derived from data also means that from an economic perspective, privileged data access is incredibly valuable. Market players with privileged data access have absolutely no interest in losing this privilege. This is understandable - in the information economy, being able to extract information from data that can be used commercially is a matter of life and death, economically speaking. Forcing these players to give up their privileged access to data, which they generally collected themselves, would likely have severely negative economic consequences. It would also be highly unethical - for example, I’d be very upset if we forced Google to open up their data centers where anyone could have access to my data. There has to be another way.

I would like to offer a suggestion for another way. Access to personal data should be controlled by those who generate the data, not by those who collect it. The data generator is the person whose data is collected. In order for the data generator to be able to control access, the collector needs to provide the person a copy of the personal data.

Let’s take an example. Let’s say you use a provider’s map on your smartphone to drive from A to B. As you’re driving, GPS data of your trip is collected by the app maker. The app maker uses this kind of data to give you real-time traffic information. Great service - but you’ll never be able to access this data. You should be able to access this data, either in real time or with some delay, and do whatever you please with it, from training your own AI to sharing or selling it to third parties.

Another example. Let’s say you track your fitness with some device, you always shop for food at the same grocery store, and you also took part in a cohort study where your genome was sequenced, with your permission of course. The fitness device maker may reuse your data to make a more compelling product; the grocery store may direct ads at you for new products that fit your profile; and the cohort study will use your DNA data for research. All good - but is it easy for you to combine these three data sources? Not at the moment. You should be able to access all three data sources - your fitness data, your nutrition data, and your DNA - without having to ask anyone for permission, for whatever reason. If you’re now asking, “why would anyone want that data”, you are asking the exact wrong question. It’s not anyone’s business why you would want that data - the point is that you should be able to get it with zero effort, in machine-readable form, and then you should be allowed to do with it whatever you want to. It's your data.

In some situations, we’re already close to this scenario. For example, when you open a bank account, of course you will be able to access every last detail of any transaction at any point in time, whenever and wherever you want to, without having to ask anyone. Any banking service without this possibility would be unthinkable. Why isn’t it like this with any service? If I can have my financial data like that, why can I not have the same access to my health data, my location data, my shopping data?

Once our own data is easily accessible for us, then it will be possible for us to let others access the data, provided we allow it. We can for example give the data to third parties such as trusted research groups, not-for-profit organizations, or even trusted parts of the government or trusted corporations. At the moment, this sounds very futuristic. But imagine, for example, a trusted health data organization, perhaps a cooperative, where hundreds of thousands or even millions of people share their health data. This would be an enormous data pool that could be studied by public health officials to make better recommendations. It could be investigated by pharmaceutical companies to design new drugs. And, to bring this back to the original thought about AI, anyone could use this data to improve the artificial intelligence agents that will increasingly make health decisions on our behalf.

Today, we’ll hear many excellent arguments that make the case for open data, highlighting social, political, economic and scientific aspects. My argument is that human liberty cannot exist in the machine age that is run by algorithms, unless people have broad access to data to improve their own intelligent agents. From this perspective, it makes no sense to be concerned about “smart machines”, or “smart algorithms” - the major concern should be about closed data. We won’t be able to leverage the phenomenal power of smart, learning machines for the public good, and for distributed AI - for distributed power, really - if all the data is locked away, accessible only to a select few. We need data of the people, by the people, for the people.

1 Year Apple Watch

2016-04-28T20:59:32Z

It's now been roughly one year since I started wearing Apple Watch. I must say I find it quite a compelling device. I use it primarily as an activity tracker, and I also really love the calendar. Generally, its tight integration with Apple's ecosystem makes it a real winner for me.

It's hard to objectively justify a device. For sure I have objectively become more aware of how much I move and exercise per day. I put it on more or less first thing in the morning, but when I forget to do that, I quickly have the feeling that something is missing. In other words, it has become part of my daily routine, something only a few devices have managed to do. By comparison, I've tried pretty much all iPads since version 1 and none of them ever managed to become irreplaceable.

I do hope Apple will use its "win by continuous improvements" strategy for the watch as well. It's clear we're in the very early days of the wearable space. I'm excited to see what's ahead.

I have only one very urgent request: please fix Siri. ("I'm sorry Marcel, I did not get what you said").

The curse of self-contempt

2016-03-25T11:46:21Z

(I wrote this post almost 9 months ago, but never published it, for reasons that now escape me. Realizing this omission today, I decided to publish it since I haven't changed my mind about the issue).

This morning, a friend shared an article on Twitter, originally published in the Guardian, with the title "Sophie Hunger: Sadly, I don't need a history to be able to exist somewhere". Sophie Hunger is a Swiss musician who, as the daughter of a diplomat, spent large parts of her life abroad (in other European countries). The article is about authenticity, home, and identity. In it, she writes:

"I can't be proud to be Swiss, although I'm predestined to have these kind of feelings. I'm afraid, I'm not an entirely humble person, but I do have the typical European extra dose of self-contempt. Yet, I discipline myself not to feel proud about my country because I know it is a dishonourable kind of feeling. What have I done to be Swiss, and why should it be an achievement? You see, there's a philosophical problem there."

Eight years ago, I left Switzerland - where I was born and raised - to travel the world a bit, and then to permanently move and live in the US. When I left the country I grew up in, I had the exact same feelings that Sophie expressed in her article. Now, having just returned, I see these feelings in an entirely different way: as part of the root of European angst, perhaps the root of European arrogance, and to some extent as a terrible curse: the curse of self-contempt.

Before I go on, let me clarify that I don't think all Europeans are angsty or arrogant. But while in the US, I have often been astounded by the arrogance of some Europeans, criticizing everything - especially the ones who were either just visiting, or hadn't been in the country for long. Even more surprisingly, Americans would usually take it lightly, laugh with the visitors, which in turn infuriated them even more - did they not get that they were being criticized (stupid!), or were they making fun of them (arrogant!)?

The curse of self-contempt is almost entirely absent in the US, at least compared to Europe. Instead, Americans are brought up to be proud of what they achieved, and full of hope for where they can go. It's widely known that American students are off the charts when it comes to self-esteem and self-confidence. And it's easy for the European critic to laugh this off, especially when the stats on important measures like reading ability and math skills are much more average. But I now believe that it's better to be overconfident about yourself than under-confident. Extremes in both directions are harmful. But in the long run, modest chronic under-confidence is much more harmful than modest chronic overconfidence.

In the culture I grew up in, I was taught that "Eigenlob stinkt", which literally means that "self-praise stinks". And that feeling is still part of the national identity - just two days ago, it was the headline of a paragraph in one of Switzerland's major newspapers (NZZ), in an article about the Swiss national holiday. Think about this expression for a moment. It quite clearly states that it is very bad if you praise yourself. How dare you praise yourself? What have you done that is worthy of praise? Let others be the judge to decide who is worthy of praise.

This is the curse of self-contempt - the inability to be content with yourself, or to praise yourself. It is almost unavoidable that arrogance follows. And as Sophie Hunger's paragraph shows, not only are you not supposed to praise yourself, but don't dare to be proud of your country, because it is dishonorable too. After all, you have not done anything to be Swiss, so how dare you be proud?

This is no critique of Sophie Hunger (the irony would be unbearable). As I mentioned, I had the exact same feelings, and I am grateful that she expressed those complex feelings in a few clear sentences. I am merely pointing out that I now consider these feelings harmful to any one person, and certainly harmful to a society. Of course it's crucial to live in a society where critical thought is possible and even encouraged, and an occasional dose of self-reflection and self-criticism is certainly healthy too. But not to be allowed to praise yourself, or to be proud of the place you grew up in, that strikes me as highly destructive to the development of healthy people and a healthy population.

I have no internal conflict feeling proud of what the Swiss, my ancestors, have achieved, while at the same time feeling disgust at some of the dark historical moments, and some current developments. Just like I don't have any conflict in feeling happy with myself, without ignoring, and while working on, my darker sides. On a higher level, I can also feel proud to be European - part of a continent that has inflicted so much harm on others, and on itself, until 60 years ago, but that has been remarkably peaceful and resilient in the past few decades, and that managed to keep its cool when others (cough, USA, cough) lost their temper for a decade. And I can do this perfectly well while being alarmed by the current inability to find good solutions to deep economic and social problems. Feeling love for something and retaining the ability to see and point out problems, with the goal to improve - these things are not mutually exclusive, but rather depend on each other.

It is my hope that I can retain this attitude for as long as possible. Just as living abroad for many years has changed me, being back at home will probably change me again over the years. Perhaps this is why I feel compelled to write this, so I can remember in the future.

Jack of all trades

2016-03-08T16:08:14Z

The Swiss National Science Foundation just published an interview with me, in the form of an article (you can read the article in English, French, or German). The last paragraph reads as follows:

So he wears the caps of scientist, entrepreneur, author and musician. Can he manage them all? "I envy those scientists who spend all of their energy on a single pursuit. Being active in a number of different research fields sometimes leads you to think that you lack depth in a number of them. But given that modern science is interdisciplinary, becoming involved in areas outside of one’s comfort zone is also an asset. After all, why choose one approach over another?"

I could probably write an entire book on the idea expressed in this paragraph. Interdisciplinary research has fascinated me from the beginning of my career as a scientist. Doing interdisciplinary science is hard. It's hard because, despite best efforts by the various institutions involved in science, the cards are stacked against you:

1. A truly interdisciplinary research project is hard to get funded; experts in one discipline won't understand - or worse, will trivialize - the challenges in the other disciplines.
2. A truly interdisciplinary research project is hard to execute; different domains speak different languages, have different theories, consider different issues relevant.
3. A truly interdisciplinary research project is hard to get published; such projects don't fit in the neat categories of most journals that are rooted in their disciplines, and there are only a few multidisciplinary journals. Also, point 1.
4. A truly interdisciplinary research project is hard to get noticed; there are almost no conferences, prizes, recognitions, societies, etc. for interdisciplinary work.

These challenges are increasingly recognized. Unfortunately, there is almost nothing substantial that is being done to address them. And it's not for the lack of trying. It is just simply a very, very hard problem to solve. Disciplines may be arbitrary, but they do exist for a good reason.

But the key point I tried to address in the interview - and which led to the highly condensed last paragraph cited above - is that the biggest hurdle for doing interdisciplinary science is found in oneself. At least, that is my experience. Doing interdisciplinary science means spending much time trying to understand the other disciplines. You can't do interdisciplinary science without having a basic grasp of the other disciplines. The more you understand of the other disciplines, the more interesting your interdisciplinary research will be.

And here's the catch: all this time you spend keeping up with understanding at least superficially what's going on in the other disciplines, is time you'd normally spend keeping up with your own field. As a consequence, you are constantly in danger of becoming a "jack of all trades, master of none". I highly recommend reading the Wikipedia entry on the etymology of this term. When it first emerged, it was simply "jack of all trades", meaning a person who was able to do many different things. The negative spin "master of none" was only added later, but it's deeply ingrained in our culture. The fact that similar sayings exist in many other languages, as listed on the Wikipedia page, speaks volumes.

In science, not being perceived as an outstanding expert in one particular field is a real danger to one's career, especially in the mid-career stage. The incentive structure of science is hugely influenced by reputation, which is the main reason scientists are so excited about anything with prestige. At the beginning of your career, as a student, it's clear you're not an expert; at the end, it's clear you're an expert, which presumably is why you survived in the system for so long (exceptions apply). But in the ever-growing stretch in between - especially the roughly ten years between PhD and tenure - you definitely do not want to be seen as a "jack of all trades, master of none".

Unless you don't give a damn, which, if you're like me, is what I advise you to do.

I wasn't sarcastic when I said that I envy scientists who spend all of their time working on a single topic. Focus is something I strive for in everything I do. How marvelous to be consumed by one particular question! How satisfying it must be to point all one's neurons to a single problem, like a laser! What a pleasure to be fully in command of all the literature in your speciality! How wonderful to go back to the same conferences, knowing everyone by name, being friends with most of them. Alas, it is not for me.

I'm drawn to many different fields, just like I'm drawn to experiencing many different types of food. Goodness knows I can get obsessed with one particular food item, spending years trying to perfect it. But that doesn't mean I'm not intensely curious about all the other things that surround me. In science, I've decided I find the space between disciplines too interesting to focus exclusively on one discipline.

But this is the catch-22: you need to be able to deal with the fact that you're not as much of an expert in your main discipline as you could be. Are you able to deal with this?

One piece of advice that I would give, completely unsolicited, like everything on this blog, is to first become very, very good in one particular field. Good enough that you find it easy to publish, get funding, get jobs, get invited to conferences, and so on. At this point, you'll be in a much stronger position to branch out. You'll still face all the negative incentives listed above, but at least you have a home base you can return to if things get too crazy.

And when everything goes haywire, always remember:

Specialization is for insects.

Open Data: Our Best Guarantee for a Just Algorithmic Future

2016-02-10T08:17:47Z

(Two days ago I gave a talk at TEDxLausanne - I'll post the video when it becomes available. This is the prepared text of the talk.)

Imagine you are coming down with the flu. A sudden, rapid onset of a fever, a sore throat, perhaps a cough. Worried, you start searching for your symptoms online. A few days later, as you're not getting better, you decide it's time to go see a doctor. Again a few days later, at your appointment with the doctor, you get diagnosed with the flu. And because flu is a notifiable disease, your doctor will pass on that information to the public health authorities.

Now, let's pause for a moment and reflect on what just happened. The first thing you did was to go on the internet. Let’s say you searched on Google. Google now has a search query from you with typical flu-related search terms. And Google has that information from millions of other people who are coming down with the flu as well - 1 to 2 weeks before that information makes it to the public health authorities. In other words, from the perspective of Google, it will be old news.

In fact, this example isn’t hypothetical. Google Flu Trends was the first big example of a new field called “digital epidemiology”. When it launched, I was a postdoc. It became clear to me that the data that people generate about being sick, or staying healthy, would increasingly bypass the traditional healthcare systems, and go through the internet, apps, and online services. Not only would these novel data streams be much faster than traditional data streams, they would also be much larger, because - sadly - many more people have access to the internet through a phone than access to a health care system. In epidemiology, speed and coverage are everything; something the world was painfully reminded of last year during the Ebola outbreak.

So I became a digital epidemiologist - and I wondered: what other problems could we solve with these new data? Diseases like the flu, Ebola, and Zika get all the headlines, but there is an entire world of diseases that regularly kill on a large scale that almost nobody talks about: plant diseases. Today, 500 million smallholder farmers in the world depend on their crops doing well, but help is often hard to get when diseases start spreading. Now that the internet and mobile phones are omnipresent, even in low-income countries, it seemed that digital epidemiology could help, and so a colleague, David Hughes, and I built a platform called PlantVillage. The idea was simple - if you have a disease in your field or garden, simply snap a picture with your phone and upload it to the site. We’ll immediately have an expert look at it and help you.

This system works well - but there are only so many human experts available in real time. Can we possibly get the diagnosis done by a machine too? Can we teach a computer to see what’s in an image?

A project at Stanford called ImageNet tried to do this with computer vision - they created a dataset of hundreds of thousands of images - showing things like a horse, a car, a frog, a house. They wanted to develop software that could learn from the images, to later correctly classify images that the software had never seen before. This process is called “machine learning”, because you are letting a machine learn on existing data. The other way of saying this is that you are training an algorithm on existing data. And when you do this right, then the end product - the trained algorithm - can work with information it hasn’t encountered before. But the people at ImageNet didn’t just use machine learning. They organized a challenge - a friendly competition - by saying “here, everybody can have access to all this data - if you think you can develop an algorithm that is better than the current state of the art, go for it!” And go for it, people did! Around the world, hundreds of research teams participated in this challenge, submitting their algorithms. And a remarkable thing happened. In less than five years, the field experienced a true revolution. At the end, the algorithms weren’t merely better than the previous ones. They were now better than humans.
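
For readers who like to see the idea in code, here is a toy sketch of "training an algorithm on existing data" and then classifying an example it has never seen. It is not how the ImageNet entries worked (those used far more sophisticated methods on real images); it is a simple nearest-centroid classifier, and the two numeric features per example (say, "size" and "greenness"), the values, and the function names `train` and `classify` are all made up for illustration.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Learn one average point (centroid) per label from labeled examples."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(model, features):
    """Return the label whose centroid is closest to the new example."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training data: (size, greenness) -> label.
training_data = [
    ((0.9, 0.1), "horse"),
    ((0.8, 0.2), "horse"),
    ((0.1, 0.9), "frog"),
    ((0.2, 0.8), "frog"),
]

model = train(training_data)

# An example the model has never seen before.
print(classify(model, (0.7, 0.3)))
```

The point is only the shape of the process: the `train` step looks at existing, labeled data; the `classify` step applies what was learned to something new.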

Machine learning is an incredibly hot and exciting research field, and it’s the basis of all the “artificial intelligence” craze that’s going on at the moment. And it's not just academic: it is how Facebook recognizes your friends when you upload an image. It is how Netflix recommends which movies you will probably like. And it is how self-driving cars will bring you safely from A to B in the very near future.

Now, take the ImageNet project, but replace the images of horses and cars and houses, with images of plant diseases. That is what we are now doing with PlantVillage. We are collecting hundreds of thousands of images from diseased and healthy plants around the world, making them open access, and we are running open challenges where everyone can pitch in algorithms that can correctly identify a disease. Imagine how transformational this can be! Imagine if these algorithms can be just as good, or perhaps even better, than human experts. Imagine what can happen when you build these algorithms into apps, and release those apps for free to the 5 billion people around the globe with smartphones.

It’s clear to me now that this is not only the future of PlantVillage, but a future of applied science more generally. Because if you can do this with plant diseases, you can do this with human diseases as well. You can in principle do it with skin cancer detection. Basically, any task where a human needs to make a decision based on an image, you can train an algorithm to be just as good. And it doesn’t stop at images, of course. Text, videos, sounds, more complex data altogether - anything is up for grabs. As long as you have enough good data that a machine learning algorithm can train on, it’s only a matter of time until someone develops an algorithm that will reach and exceed human performance. And here, we're not talking science fiction in the next 50 years; we're talking now, in the next couple of years. And this is why these large datasets - big data - are so exciting. Big data is not exciting because it’s big per se. It’s exciting because that bigness means that algorithms can learn from vast amounts of knowledge stored in those datasets, and achieve human performance.

If algorithms derive their power from data, then data equals power. So who has the data? Things may be ethically easy with images of horses, cars, houses, or even plant diseases - but what about the data concerning your personal health? Who has the data about our health, data which will form the basis for smart, personalized health algorithms? The answer may surprise you, because it’s not just about your past visits to doctors, and to hospitals. It’s your genome, your microbiome, all the data from your various sensors, from smartphones to smartwatches. The drugs you took. The vaccines you received. The diseases you had. Everything you eat, every place you go to, how much you exercise. Almost anything you do is relevant to your health in one way or another. And all that data exists somewhere. In hospital databases. In electronic health records. On the servers of the Googles and Apples and Facebooks of this world. In the databases of the grocery stores, where you buy your food. In the databases of the credit card companies who know where you bought what, when. These organizations have the data on which to train the future algorithms of smart personalized healthcare.

Today, these mainly business organizations provide us with compelling services that we love to use. In the process, they collect a lot of data about us, and store them in their mostly secure databases. Their use of these data is driven primarily by the potential for commercial gain. But the data are closed, not accessible to the public - we imprison our data in those silos that only a select few have access to, because we are afraid of privacy loss. And because of this fear, we don’t let the data work for us.

Remember Google Flu Trends that I mentioned a few minutes ago? Last year, Google shut it down. Why? We can only speculate. But what this reminds us of is that those who have the data with which they can build these fantastic services... can also shut them down. And when it comes to our health, to our wealth, to our public infrastructure, we should be really careful to think deeply about who owns the data. I applaud Google for what they have done with Google Flu Trends. I am a happy consumer of many Google services that I love to use. But it is our responsibility to ensure that we don’t start to depend too strongly on systems that can be shut down any day without warning, because of a business decision that's been made thousands of miles away.

So, how can we strike the right balance between protecting individual privacy and unleashing big data for the good of the public? I think the solution lies in giving each of us a right to a copy of our data. We can then take a copy of our data, and either choose to retain complete privacy - or we can choose to donate parts of these data to others, to research projects, or into the public domain to pursue a public good, with the reassurance that these data will not be used by insurance companies, banks and employers to discriminate against us.

Implementing this vision is not going to be easy, but it is possible. It has to be possible. Why? Two reasons (at least). First, our data is already digital, stored in machines somewhere and hence eminently hackable. We should have regulations in place to manage the risks of the inevitable data breaches. Second, we are now running full speed into a second machine age where machines will not only be much stronger than us - as they have been in the past decades - but also much, much smarter than us. We need to continue to ensure that the machines work in our common interest. It’s not smart machines and artificial intelligence we should be concerned about - they are smart and intelligent because of the data. Our concern should be about closed data. We won’t be able to leverage the phenomenal power of smart, learning machines for the public good if all the data is locked away.

Open data is not what we should be afraid of - it's what we should embrace. It’s our best guarantee that we remain in control of the algorithms that will rule our digital world in the future.