Death by Algorithm - an Amazon Post Mortem

On February 12, my research lab at EPFL received an email from Amazon (via the email address "mturk-noreply@amazon.com") which contained the following:

Greetings from Amazon Mechanical Turk,

We regret to inform you that your Amazon Mechanical Turk (MTurk) account has been suspended effective immediately. We took this action because you did not provide true and accurate information about your location during the registration process, which violates our Participation Agreement.

You can review the complete Participation Agreement at the link below:

https://www.mturk.com/participation-agreement 

Your remaining prepaid HITs balance will be refunded to the payment method you used to purchase the Prepaid HITs once your outstanding Worker liability has been resolved.

Wow.

Some context: Amazon MTurk is a crowdsourcing platform that allows us to crowdsource certain tasks to people, and pay them for it. This platform is very popular especially in research. In our case, we use MTurk to annotate tweets, and while our usage of it is on and off, this email caught us in the middle of intense MTurk activity for a coronavirus study.

I should also add that we've been using Amazon MTurk for many years, and have spent tens of thousands of dollars on it. That's in addition to the tens of thousands of dollars we spend on AWS infrastructure every single year. We are no Netflix, but by all common standards we should be considered a good customer.

36 hours later, we've been able to resolve the situation. Through a personal contact at Amazon, I managed to get a clear explanation from MTurk. We've used the service for many years, and I created the account back in the day while I was still in the US. For a few years now, we've been using the account from Switzerland. For some reason, this discrepancy - usage from Switzerland while having an address in the US - only triggered the system now.

The person at MTurk was very helpful and helped me resolve the issue quickly. We're up and running again. But many things about this experience rubbed me the wrong way (if you happen to follow me on Twitter, you have probably seen my strong reaction). I tried to summarize everything in an email to Amazon, which I am posting here as well.

Hi [name redacted]

Many thanks for your detailed message, and your help - much appreciated. 

I’d like to explain my side of the story, and the reasons for my strong reaction. I’ve always been very positively impressed by interactions with Amazon staff, and after all, many of us learned customer obsession from Amazon. That’s why when we received the email on 2/12 about immediate account suspension, we were dumbfounded. How could this happen?

Your explanations now make things clear to me. The short version is, your system must have gotten triggered by an old US address from my time at Penn State many years ago, and as you correctly observed, all of our activity is coming out of Switzerland. But that has been the case for almost five years now, so I don’t understand why your system got triggered now, all of a sudden.

In any case, I don’t have insights into your system, but I’d like to share with you the customer experience side. We’ve been using Amazon MTurk for many years, on and off. We must have spent tens of thousands of dollars on it. We use AWS infrastructure for all our projects, and we spend tens of thousands of dollars on that every year. Right now, we were in the middle of a coronavirus research project when we got the message of immediate account suspension. From your records, I was able to find a message from MTurk on 1/29 at 4:52 AM my time, indicating that there might be an issue. At that time, I was at a large machine learning conference that I organize, and I had my email on auto-respond saying I won’t be able to read email, and that important mails should be resent. Of course, given that the Amazon email was automatic and coming from a noreply address, my response went nowhere. Thus, since I had not read that email, your system kicked us out without further warning on 2/12.

Initially, I thought you had suspended our account without warning. I can see now that you had sent us one warning. But I still find the action extremely draconian. Emails can get lost in spam filters, or can be missed due to multiple circumstances. To suspend an account that has had no problems for many years and spent reasonable amounts of money because of inaction after a single email is extremely frustrating, to say the least. We can live with not having access to MTurk for a day or two, but given our dependence on Amazon for infrastructure, the entire experience makes me incredibly nervous. Are we at risk of waking up one day with all our platforms down because we missed that one email? I can’t really take that risk, as I am sure you understand. So my first recommendation would be to follow up with one or two more emails. Rather than waiting 14 days between an email and a decision, please send a few emails in that time period, with escalating warning messages.

In any case, after not having seen any action on the account, your system decided to suspend the account. This left us completely stranded. Why would you do that? If after some warnings you realize that a customer has taken no action, why don’t you just shut down the service provision - without account suspension? That would have allowed us to immediately log into the account, see that something is wrong, and fix it - problem solved. The fact that you completely locked us out is actually the worst part of the story. It’s like a landlord locking you out of your apartment because you didn’t fix something, but you can’t get back in to fix it. So my second recommendation would be to not suspend accounts, but suspend services.

If you had followed these two recommendations, the situation would have been completely avoidable. We probably wouldn’t have experienced an interruption, and even if we had, we would have been able to fix the issue immediately.

I nevertheless have two other recommendations. The third is to make your emails clearer. The statement that “Our records indicate that you have provided incomplete or inaccurate information during the Amazon Mechanical Turk (MTurk) registration process” is not meaningful for me as a user. The addresses in the system didn’t show me any problems. Only your email below made it clear to me what the problem was (the old US address). In addition, keep in mind that I registered the account many years ago, which made the message even more confusing (what exactly did I do wrong many years ago?).

The fourth recommendation is to provide better ways to contact your support. Your actions put services into severe distress. Please give customers the option to call or chat online with someone, rather than providing a link to a totally standard, bland support form. I am privileged to be able to have a small audience on Twitter, and to know some Amazon employees personally. But that should not be a condition to get a situation like that resolved quickly. Put a human back into the loop. Yes, it may cost some money - but the risk of losing customers may be more costly down the line. And you’d probably find that the first two recommendations would mean that most situations would sort themselves out anyway before they get to that stage. But the last option should always be the ability to talk to an actual person.

I hope this feedback is useful. As a service provider myself, I often find that I can learn the most from detailed customer feedback, even - or especially - when it’s criticism. In the grand scheme of things, the magnitude of this event is very low; but in an age of increasing concern about automated decision making, I nevertheless feel the episode is symptomatic of a development going in the wrong direction, and hope to have provided some ways to correct course.

Very best,
Marcel

Applied Machine Learning Days

The Applied Machine Learning Days (AMLD) 2020 is around the corner - in about two months, we’ll open our doors for the fourth time to an expected audience of over 2000 people, with 30 hands-on sessions and 29 tracks on machine learning and artificial intelligence with top speakers from around the world.

An industry representative told me the other day “it’s amazing how you managed to create such a high quality event, and such a brand, from nothing, in just three years”. I’m very grateful for these kind words. But it also made me think. Did we succeed? Where could we go? Where should we go? What did we do right, or wrong? And then I also realized that there is no written record of how AMLD came to be, and so I felt compelled to write this post.

In 2016, my lab at EPFL launched crowdAI, an AI challenge platform (today, it’s a spin off with the name AIcrowd). The idea was to run public machine learning challenges, in an affordable and open source way. We had ideas for a few challenges, specifically also around the research we were doing, and we knew how to build a challenge platform - but what could we offer to the community as prizes, with loads of money not being an option? After some thinking, we decided that one cool prize could be to bring people to Switzerland, in the winter (snowy Alps!), to a small workshop where top performers in these challenges could share their approaches, and learn from each other.

Around the same time, EPFL hired two new professors, Martin Jaggi (machine learning) and Bob West (data science). The three of us felt like it would be a cool idea to create something a bit bigger than a small workshop around the topic of applying machine learning to lots of interesting problems, and we created the Applied Machine Learning Days, with the idea to bring together ML practitioners to share the do’s and don’ts of this exciting technology. Without any resources, we went ahead and started inviting people to speak at AMLD, and managed to attract a great speaker lineup, mostly from our own personal networks. We were hopeful that around 100 people would show up (ML was a hot topic, after all). To our great surprise, hundreds of people signed up, and the largest room we could find on campus in the short term had a capacity of 450 people. AMLD 2017, a 2-day event with talks, was a great success, and we were motivated to do more after that.

When putting together the website for AMLD 2017, I added the slogan “2 days of talks and tutorials”. But given that the AMLD 2017 organization was rather rushed, we did not really have the time to organize tutorials. So for AMLD 2018, we created a call for workshops, and to our great delight, numerous high-quality workshops were proposed. Given that AMLD 2017 ended up being much bigger than planned, we felt that AMLD 2018 could be even bigger, and in the summer of 2017, we brought on an event manager to help us coordinate the event full time (hi Sylvain!). AMLD 2018 thus became a 4-day event, with 2 days of workshops, and 2 days of talks. Almost 1000 people ended up coming to the event, with the workshop weekend completely booked out and long waiting lists.

At this point, we realized we had hit a nerve. People really seemed to like the mix of academia and industry. In parallel, many AI events were popping up left and right (and of course we were not the first either), with some of them being very much focused on marketing and sales, while traditional ML conferences were highly technical. We seemed to have found a sweet spot in between these two extremes, where practitioners and enthusiasts from all types of organizations could come together and learn from each other.

AMLD 2018 was great, but we realized that the single track model would not work for much longer. Thus, the idea of domain-specific tracks - AI & your field - was born. For AMLD 2019, we opened a call for tracks, and once again, the community came along and put together awesome tracks! Given the expected increase in size, we asked Sylvain to stay onboard full time :-). Overall, AMLD 2019 ended up being again a 4-day event, with 2 days of workshops, and 2 days of conference with both keynotes and domain-specific parallel tracks that over 1700 people attended. Speakers like Garry Kasparov, Jeff Dean, and Zeynep Tufekci gave the event a very special vibe.

For AMLD 2020, we primarily thought “never change a winning team”. But nonetheless, I became personally frustrated that while we were holding this interesting event, the public discussion was getting increasingly negative and concerned about this technology, and most of the uncertainty - not surprisingly - was about work, jobs, and skills. So we decided to extend AMLD by one day, and to have a third day that more specifically focuses on all things AI & economy: jobs, skills, employment, HR, social policy, startups, etc. which we're organizing with our neighbors and colleagues from the University of Lausanne. Given the growth, we recently also brought on another person to help with the organization (hi Pauline!).

And once again, the community came along and put together absolutely stunning workshops and tracks. Some of the tracks have such a stellar speaker lineup that they could easily pass as independent conferences in their own right!

Reflecting on what made AMLD work so well, in such a short time, I’ve arrived at a number of insights. The first is to create an event that you would love going to. This is a truism in industry, certainly in the consumer sector - if you are not using your own product or service, why would anyone else? I keep reminding people that we are not organizing AMLD because somebody told us to. We are doing it simply because we want such an event to exist. Indeed, one of the most difficult things for us as organizers is to not be able to enjoy the event as visitors. Tough life 😉

The second insight is to not do it alone, but together with others. People were often shocked to hear that the event management team was composed of one person, for a conference of the size of AMLD. But of course there were hundreds of volunteers behind the scenes, from people in the labs of the organizers to others who came to help during the event. And most of all, of course, the workshop and track organizers who put together the program.

The final insight is to take it easy on the hype, and just stick to quality. The amount of AI bullshit available on the internet and at some events has taken on rather stunning proportions. Personally I have nothing against some long-term thinking and some excitement around it. But at some point, one should put up, or shut up. It’s for that reason that we want AMLDs to always be associated with academic institutions. That is not to say that non-academic institutions wouldn’t be able to put together great events; of course they would. But academic institutions have the benefit that they are full of deeply skeptical scientists who won’t tolerate overselling for too long, and most speakers will naturally focus on serious work when they present at an academic institution.


So, what is the future of AMLD? I can’t say for sure, but it’s worth reflecting on what the ultimate goal of AMLD is. An event is a huge effort, both for organizers and attendees. If you calculate the overall costs, and the energy spent by thousands of people coming together in a particular location, the numbers are absolutely enormous. So there’d better be a very good reason why you do this. For me, the ultimate reason to organize AMLD is to make sure that this technology remains on people’s radar, and becomes accessible to them. Modern machine learning is a once-in-a-lifetime kind of technology, and may even end up being a once-in-a-century kind of technology. If AMLD can help many more people to understand this technology and use it for their goals, then it will have been worth it. Because I believe very strongly in Feynman’s observation of “what I cannot create, I do not understand.”

That is the ultimate reason I believe that AMLD should grow much more, both in size and in scope. To give you an idea of the importance of machine learning, PwC believes that by 2030, AI (they mean machine learning) will boost GDP by 14% globally, and up to 26% locally. That’s 15.7 trillion dollars, more than today’s GDP of China and India combined. But more than money, machine learning will affect all social systems deeply. Not mastering this technology is simply not an option. Events like AMLD can do their share to ensure a well-informed society, from academia to industry to the general public.









Why I am not interacting on LinkedIn

TL;DR: I won't participate in LinkedIn communications, because I have no more trust in LinkedIn. For important matters, please send email instead.

___

LinkedIn has the potential to be useful. Unfortunately, it has in recent times become a master of dark patterns (see https://www.darkpatterns.org/), and I see no indication of this stopping any time soon.

Just a few examples:

  • I've been getting LinkedIn requests from women "to share the passion of love", with LinkedIn apparently being unable / unwilling to filter this as spam.
  • LinkedIn has started pre-filling communication fields in ways that I do not like at all - for example, pre-filling responses with "Hi XYZ, thanks for reaching out. I’d like to learn more." - No, I wouldn't like to learn more. It's preposterous to pre-fill communication forms like that, LinkedIn.
  • I've been getting emails saying "You have 1 new message" - but instead of showing the message right there (I am looking at email right now, for goodness' sake), LinkedIn forces me to open the website, or the app, so that it can track me better.

I could of course leave LinkedIn, the same way I left Facebook a few years ago. The problem is that the network itself is very interesting, and unlike with Facebook, I never had any real trust that information on LinkedIn would be private. So rather than abandoning it, I just want to clarify that I am not using it for communication purposes, because a) I cannot assume our communication to remain private, and b) I'd like to stay away from the dark patterns of LinkedIn as much as possible. Like everyone else, I am trying to do my best to mitigate the onslaught of digital overload - my detachment from LinkedIn is a further step in that direction.




The perils of "free" education

Imagine reading one day on a restaurant website the following:

"Come eat with us FOR FREE! That's right - we believe in open food, and that nutrition is a basic human right for everyone! We provide free meals, created for people at any hunger level! Eat as much as you want!"

Sounds ridiculous, doesn't it? Even a place that would offer "all your nutritional needs covered for just $49 / month" sounds incredibly suspicious. Would you really eat there? What could they possibly be putting on those plates to cover their expenses?

It's pretty simple - you know that food has a price, and that the people making and serving it have expenses to cover - so anything extremely cheap is most likely very bad quality, or a scam, and immediately raises red flags.

Yet, when it comes to education, we quickly seem to let our guard down. Free courses to learn anything I want? Bring it on! A full education for $49 / month? Sounds good, count me in! No suspicion is raised - this is normal. After all, most of us didn't pay for school either.

The fundamental problem there is that we easily confuse education with information. Yes, information can be free, and when it comes to knowledge about the world, I'd argue it should be free. But education is not just information - not by a long shot. Education is helping learners make a selection about what is worth learning (at least initially); it's helping learners differentiate good quality from bad quality; it's helping learners when they get stuck; it's reviewing learners' work, and giving them guidance on how to improve; it's assessing their knowledge at regular intervals, and eventually putting your name to vouch for the level of know-how they have. And it's a million other things as well, as anyone who has ever taught another person anything can readily confirm.

Many of these things cannot be automated yet, and the question is not only if they ever will be, but also if that's what learners really want. But whatever the future may bring: today, when you are getting an education, someone is paying for it. And thus, if it's free, or almost free to you, then someone else is paying for it. Do you know who that is, and why they are doing it? Do you know "the deal"?

Our students at EPFL, who currently pay about 1'200 $ per year - a tiny fraction of the true cost - (hopefully) know that it's the Swiss taxpayers who are paying for them. They also hopefully know that the taxpayers are paying because they think they're getting more in return in the long run - the until now safe assumption being that a well-educated population will be a wealthy population. Same for all the parents in the country (the vast majority) who send their kids to the excellent public schools, at no direct cost to them - again paid for by the taxpayers, for the same reason.

This is not free. In fact, many governments spend multiple percentage points of their GDP on education. The EU, for example, spent 715 Billion Euros on education in 2017. That's right: that's € 715'000'000'000 in a single year. So much for free education.

So that makes you wonder - what are all the people thinking who are signing up for (almost) free online education? That somehow, all of these mechanisms don't apply anymore? Part of the problem, as mentioned above, is that we confuse online education with online information. Online information can be free, yes - although even there, its creation and maintenance costs something.

But the problem with (almost) free online education goes further. In the same way that you are not a Facebook user, but the Facebook product (with advertisers being the customers), free online education means that you're not the only one who is learning something - someone else, with vested financial interests, is also learning something about you. What kind of learner you are, for example. How quickly you grasp new concepts. How well you work with others. How you solve problems. How you search for solutions. How motivated you are. If you go to any job interview, these are exactly the kinds of things companies want to know about potential hires. And there is a huge market developing that sells this information about you to companies that are hiring - directly, or indirectly through recruitment services. It's worth a lot of money - enough to pay for the education.

It may be a deal worth making. But we should be aware that there is a deal in the first place, and most of us simply are not. And we should realize that the education that we hope would advance our career may actually be putting a brake on it.

There is a long term solution, and a short term solution. The long term solution is appropriate legislation - that learners getting the free education deal must be made fully aware of this deal. Perhaps an even better solution would be to prevent such deals in the first place, at least for adult, continuing education; I'm not entirely sure yet. The short term solution is to prevent the problem in the first place, and find someone who is truly interested in your education so that they will pay for it. Oftentimes, that will be you; other times, that may be your employer, or perhaps even your government, should you be so lucky to live or work in an environment that supports lifelong learning and continuing education.

At the EPFL Extension School, we think about these issues a lot. We offer courses and programs for digital up-skilling online, and the topics of lifelong learning, online education, and data ownership are parts of our daily discussions. The entire learning experience going through the EPFL Extension School is what we offer as a service, and because of that, we don't even have to think about selling any data about our learners to anyone. In fact, we fiercely protect our learners' data, far beyond our legal obligations. Being in total control of our learners' data was also a major factor when we decided to build our own learning platform, rather than using someone else's.

I think that's the fairer deal.


1. Meditate

This is one of the brain tools I can no longer understand how I ever managed to do without.

My favorite - and largely only - form of meditation that I practice regularly is mindfulness meditation. I first encountered the concept about 15 years ago when I came across a book called "Wherever You Go, There You Are" by Jon Kabat-Zinn. I was about to become a PhD student at the time and so my natural instinct was to think that this was likely some trivial nonsense. But I was in enough adolescence-related mental pain at the time that I thought I'd give it a try. It changed the way I looked at myself, and at how the mind works. It was the first time I fully realized that I am not my thoughts, and that thoughts are objects I can study objectively. I've been expanding on this concept ever since.

I most recently came back to regular practice with the Waking Up app, which I very much like (and I can also recommend the book with the same name by the same author, Sam Harris).

Mindfulness meditation has become a key tool for me, and today, as we are in the midst of the attention economy, being able to realize when someone tries to hijack your mind has become extremely valuable. That's of course in addition to all the benefits you get from realizing when your mind gets hijacked by your own thoughts. I now rank the ability to do basic mindfulness meditation so highly that I will teach my kids to understand it before I teach them how to code (and if you've ever been on the receiving end of one of my sermons about everyone having to learn how to code, you know what that means).

So this is my first piece of advice: Look into mindfulness meditation.