security research, software archaeology, geek of all trades

Learning from Near Misses


[Update: Steve Bellovin has a blog post]

One of the major pillars of science is the collection of data to disprove arguments. That data gathering can include experiments, observations, and, in engineering, investigations into failures. One of the issues that makes security hard is that we have little data about large scale systems. (I believe that this is more important than our clever adversaries.) The work I want to share with you today has two main antecedents.

First, in the nearly ten years since Andrew Stewart and I wrote The New School of Information Security, and called for more learning from breaches, we’ve seen a dramatic shift in how people talk about breaches. Unfortunately, we’re still not learning as much as we could. There are structural reasons for that, primarily fear of lawsuits.

Second, last year marked 25 years of calls for an “NTSB for infosec.” Steve Bellovin and I wrote a short note asking why that was. We’ve spent the last year asking what else we might do. We’ve learned a lot about other Aviation Safety Programs, and think there are other models that may be better fits for our needs and constraints in the security realm.

Much of that investigation has been a collaboration with Blake Reid, Jonathan Bair, and Andrew Manley of the University of Colorado Law School, and together we have a new draft paper on SSRN, “Voluntary Reporting of Cybersecurity Incidents.”

A good deal of my own motivation in this work is to engineer a way to learn more. The focus of this work, on incidents rather than breaches, and on voluntary reporting and incentives, reflects lessons learned as we try to find ways to measure real world security. The writing and abstract reflect the goal of influencing those outside security to help us learn better:

The proliferation of connected devices and technology provides consumers immeasurable amounts of convenience, but also creates great vulnerability. In recent years, we have seen explosive growth in the number of damaging cyber-attacks. 2017 alone has seen the Wanna Cry, Petya, Not Petya, Bad Rabbit, and of course the historic Equifax breach, among many others. Currently, there is no mechanism in place to facilitate understanding of these threats, or their commonalities. While information regarding the causes of major breaches may become public after the fact, what is lacking is an aggregated data set, which could be analyzed for research purposes. This research could then provide clues as to trends in both attacks and avoidable mistakes made on the part of operators, among other valuable data.

One possible regime for gathering such information would be to require disclosure of events, as well as investigations into these events. Mandatory reporting and investigations would result in better data collection. This regime would also cause firms to internalize, at least to some extent, the externalities of security. However, mandatory reporting faces challenges that would make this regime difficult to implement, and possibly more costly than beneficial. An alternative is a voluntary reporting scheme, modeled on the Aviation Safety Reporting System housed within NASA, and possibly combined with an incentive scheme. Under it, organizations that were the victims of hacks or “near misses” would report the incident, providing important details, to some neutral party. This database could then be used both by researchers and by industry as a whole. People could learn what does work, what does not work, and where the weak spots are.

Please, take a look at the paper. I’m eager to hear your feedback.


Git PSA: git-rev-parse


Another public service announcement about Git.

There are a number of commands everyone learns when they first start out using Git. And there are some that almost nobody learns right away, but that should be the first thing you learn once you get comfortable using Git day to day.

One of these has the uninteresting-sounding name git-rev-parse. Git has a bewildering variety of notations for referring to commits and other objects. If you type something like origin/master~3, which commit is that? git-rev-parse is your window into Git's understanding of names:

  % git rev-parse origin/master~3
  37f2bc78b3041541bb4021d2326c5fe35cbb5fbb

A pretty frequent question is: How do I find out the commit ID of the current HEAD? And the answer is:

   % git rev-parse HEAD
   2536fdd82332846953128e6e785fbe7f717e117a

or if you want it abbreviated:

   % git rev-parse --short HEAD
   2536fdd

But more important than the command itself is the manual for the command. Whether or not you expect to use this command, you should read its manual. Because every command uses Git's bewildering variety of notations, and that manual is where the notations are completely documented.

When you use a ref name like master, Git finds it in .git/refs/heads/master, but when you use origin/master, Git finds it in .git/refs/remotes/origin/master, and when you use HEAD Git finds it in .git/HEAD. Why the difference? The git-rev-parse manual explains what Git is doing here.
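
For instance, assuming a clone with a remote named origin, you can ask rev-parse for the full ref name that a short name resolves to:

   % git rev-parse --symbolic-full-name master
   refs/heads/master
   % git rev-parse --symbolic-full-name origin/master
   refs/remotes/origin/master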

Did you know that if you have an annoying long branch name like origin/martin/f42876-change-tracking you can create a short alias for it by sticking

    ref: origin/martin/f42876-change-tracking

into .git/CT, and from then on you can do git log CT or git rebase --onto CT or whatever?

Did you know that you can write topic@{yesterday} to mean “whatever commit topic was pointing to yesterday”?

Did you know that you can write ':/penguin system' to refer to the most recent commit whose commit message mentions the penguin system, and that 'HEAD^{/penguin system}' means the most recent such commit reachable from HEAD?

Did you know that there's a powerful sublanguage for ranges that you can give to git-log to specify all sorts of useful things about which commits you want to look at?

Once I got comfortable with Git I got in the habit of rereading the git-rev-parse manual every few months, because each time I would notice some new useful tool.

Check it out. It's an important next step.



Love your bugs


In early October I gave a keynote at Python Brasil in Belo Horizonte. Here is an aspirational and lightly edited transcript of the talk. There is also a video available here.

I love bugs

I’m currently a senior engineer at Pilot.com, working on automating bookkeeping for startups. Before that, I worked for Dropbox on the desktop client team, and I’ll have a few stories about my work there. Earlier, I was a facilitator at the Recurse Center, a writers’ retreat for programmers in NYC. I studied astrophysics in college and worked in finance for a few years before becoming an engineer.

But none of that is really important to remember – the only thing you need to know about me is that I love bugs. I love bugs because they’re entertaining. They’re dramatic. The investigation of a great bug can be full of twists and turns. A great bug is like a good joke or a riddle – you’re expecting one outcome, but the result veers off in another direction.

Over the course of this talk I’m going to tell you about some bugs that I have loved, explain why I love bugs so much, and then convince you that you should love bugs too.

Bug #1

Ok, straight into bug #1. This is a bug that I encountered while working at Dropbox. As you may know, Dropbox is a utility that syncs your files from one computer to the cloud and to your other computers.

        +--------------+     +---------------+
        |              |     |               |
        |  METASERVER  |     |  BLOCKSERVER  |
        |              |     |               |
        +-+--+---------+     +---------+-----+
          ^  |                         ^
          |  |                         |
          |  |     +----------+        |
          |  +---> |          |        |
          |        |  CLIENT  +--------+
          +--------+          |
                   +----------+

Here’s a vastly simplified diagram of Dropbox’s architecture. The desktop client runs on your local computer listening for changes in the file system. When it notices a changed file, it reads the file, then hashes the contents in 4MB blocks. These blocks are stored in the backend in a giant key-value store that we call blockserver. The key is the digest of the hashed contents, and the values are the contents themselves.
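
To make that concrete, here’s a toy sketch of the client-side blocking (hypothetical code, and the choice of hash function is mine, not necessarily Dropbox’s):

    import hashlib

    BLOCK_SIZE = 4 * 1024 * 1024  # 4MB blocks, as described above

    def block_digests(path):
        """Hash a file in fixed-size blocks and return the list of digests."""
        digests = []
        with open(path, 'rb') as f:
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                digests.append(hashlib.sha256(block).hexdigest())
        return digests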

Of course, we want to avoid uploading the same block multiple times. You can imagine that if you’re writing a document, you’re probably mostly changing the end – we don’t want to upload the beginning over and over. So before uploading a block to the blockserver, the client talks to a different server that’s responsible for managing metadata and permissions, among other things. The client asks the metaserver whether it needs each block or has already seen it. The metaserver responds with whether or not each block needs to be uploaded.

So the request and response look roughly like this: The client says, “I have a changed file made up of blocks with hashes 'abcd,deef,efgh'”. The server responds, “I have those first two, but upload the third.” Then the client sends the block up to the blockserver.

                +--------------+     +---------------+
                |              |     |               |
                |  METASERVER  |     |  BLOCKSERVER  |
                |              |     |               |
                +-+--+---------+     +---------+-----+
                  ^  |                         ^
                  |  | 'ok, ok, need'          |
'abcd,deef,efgh'  |  |     +----------+        | efgh: [contents]
                  |  +---> |          |        |
                  |        |  CLIENT  +--------+
                  +--------+          |
                           +----------+

That’s the setup. So here’s the bug.

                +--------------+
                |              |
                |  METASERVER  |
                |              |
                +-+--+---------+
                  ^  |
                  |  |   '???'
'abcdldeef,efgh'  |  |     +----------+
     ^            |  +---> |          |
     ^            |        |  CLIENT  +
                  +--------+          |
                           +----------+

Sometimes the client would make a weird request: each hash value should have been sixteen characters long, but instead it was thirty-three characters long – twice as many plus one. The server wouldn’t know what to do with this and would throw an exception. We’d see this exception get reported, and we’d go look at the log files from the desktop client, and really weird stuff would be going on – the client’s local database had gotten corrupted, or python would be throwing MemoryErrors, and none of it would make sense.

If you’ve never seen this problem before, it’s totally mystifying. But once you’ve seen it once, you can recognize it every time thereafter. Here’s a hint: the character we’d most often see in the middle of each 33-character string, where the comma should have been, was l. These are the other characters we’d see in that position:

l \x0c < $ ( . -

The ordinal value for an ASCII comma – , – is 44. The ordinal value for l is 108. In binary, here’s how those two are represented:

bin(ord(',')): 0101100  
bin(ord('l')): 1101100  

You’ll notice that an l is exactly one bit away from a comma. And herein lies your problem: a bitflip. One bit of memory that the desktop client is using has gotten corrupted, and now the desktop client is sending a request to the server that is garbage.

And here are the other characters we’d frequently see instead of the comma when a different bit had been flipped.

,    : 0101100
l    : 1101100
\x0c : 0001100
<    : 0111100
$    : 0100100
(    : 0101000
.    : 0101110
-    : 0101101
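
Given that table, it’s easy to sanity-check that each of those characters is exactly one flipped bit away from a comma. A small sketch (not the Dropbox client’s actual code):

    def is_single_bitflip(observed, expected=','):
        diff = ord(observed) ^ ord(expected)
        # a nonzero power of two means exactly one bit differs
        return diff != 0 and diff & (diff - 1) == 0

    for ch in ['l', '\x0c', '<', '$', '(', '.', '-']:
        print(repr(ch), is_single_bitflip(ch))  # each prints True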

Bitflips are real!

I love this bug because it shows that bitflips are a real thing that can happen, not just a theoretical concern. In fact, there are some domains where they’re more common than others. One such domain is handling requests from users with low-end or old hardware, which describes a lot of the laptops running Dropbox. Another domain with lots of bitflips is outer space – there’s no atmosphere in space to protect your memory from energetic particles and radiation, so bitflips are pretty common.

You probably really care about correctness in space – your code might be keeping astronauts alive on the ISS, for example, but even if it’s not mission-critical, it’s hard to do software updates to space. If you really need your application to defend against bitflips, there are a variety of hardware & software approaches you can take, and there’s a very interesting talk by Katie Betchold about this.

Dropbox in this context doesn’t really need to protect against bitflips. The machine that is corrupting memory is a user’s machine, so we can detect if the bitflip happens to fall in the comma – but if it’s in a different character we don’t necessarily know it, and if the bitflip is in the actual file data read off of disk, then we have no idea. There’s a pretty limited set of places where we could address this, and instead we decide to basically silence the exception and move on. Often this kind of bug resolves after the client restarts.

Unlikely bugs aren’t impossible

This is one of my favorite bugs for a couple of reasons. The first is that it’s a reminder of the difference between unlikely and impossible. At sufficient scale, unlikely events start to happen at a noticeable rate.

Social bugs

My second favorite thing about this bug is that it’s a tremendously social one. This bug can crop up anywhere that the desktop client talks to the server, which is a lot of different endpoints and components in the system. This meant that a lot of different engineers at Dropbox would see versions of the bug. The first time you see it, you can really scratch your head, but after that it’s easy to diagnose, and the investigation is really quick: you look at the middle character and see if it’s an l.

Cultural differences

One interesting side-effect of this bug was that it exposed a cultural difference between the server and client teams. Occasionally this bug would be spotted by a member of the server team and investigated from there. If one of your servers is flipping bits, that’s probably not random chance – it’s probably memory corruption, and you need to find the affected machine and get it out of the pool as fast as possible or you risk corrupting a lot of user data. That’s an incident, and you need to respond quickly. But if the user’s machine is corrupting data, there’s not a lot you can do.

Share your bugs

So if you’re investigating a confusing bug, especially one in a big system, don’t forget to talk to people about it. Maybe your colleagues have seen a bug shaped like this one before. If they have, you might save a lot of time. And if they haven’t, don’t forget to tell people about the solution once you’ve figured it out – write it up or tell the story in your team meeting. Then the next time your team hits something similar, you’ll all be more prepared.

How bugs can help you learn

Recurse Center

Before I joined Dropbox, I worked for the Recurse Center. The idea behind RC is that it’s a community of self-directed learners spending time together getting better as programmers. That is the full extent of the structure of RC: there’s no curriculum or assignments or deadlines. The only scoping is a shared goal of getting better as a programmer. We’d see people come to participate in the program who had gotten CS degrees but didn’t feel like they had a solid handle on practical programming, or people who had been writing Java for ten years and wanted to learn Clojure or Haskell, and many other profiles as well.

My job there was as a facilitator, helping people make the most of the lack of structure and providing guidance based on what we’d learned from earlier participants. So my colleagues and I were very interested in the best techniques for learning for self-motivated adults.

Deliberate Practice

There’s a lot of different research in this space, and one of the ideas I find most interesting is deliberate practice. Deliberate practice is an attempt to explain the difference in performance between experts & amateurs. And the guiding principle here is that if you look just at innate characteristics – genetic or otherwise – they don’t go very far towards explaining the difference in performance. So the researchers, originally Ericsson, Krampe, and Tesch-Romer, set out to discover what did explain the difference. And what they settled on was time spent in deliberate practice.

Deliberate practice is pretty narrow in their definition: it’s not work for pay, and it’s not playing for fun. You have to be operating on the edge of your ability, doing a project appropriate for your skill level (not so easy that you don’t learn anything and not so hard that you don’t make any progress). You also have to get immediate feedback on whether or not you’ve done the thing correctly.

This is really exciting, because it’s a framework for how to build expertise. But the challenge is that as programmers this is really hard advice to apply. It’s hard to know whether you’re operating at the edge of your ability. Immediate corrective feedback is very rare – in some cases you’re lucky to get feedback ever, and in other cases maybe it takes months. You can get quick feedback on small things in the REPL and so on, but if you’re making a design decision or picking a technology, you’re not going to get feedback on those things for quite a long time.

But one category of programming where deliberate practice is a useful model is debugging. If you wrote code, then you had a mental model of how it worked when you wrote it. But your code has a bug, so your mental model isn’t quite right. By definition you’re on the boundary of your understanding – so, great! You’re about to learn something new. And if you can reproduce the bug, that’s a rare case where you can get immediate feedback on whether or not your fix is correct.

A bug like this might teach you something small about your program, or you might learn something larger about the system your code is running in. Now I’ve got a story for you about a bug like that.

Bug #2

This bug is also one that I encountered at Dropbox. At the time, I was investigating why some desktop clients weren’t sending logs as consistently as we expected. I’d started digging into the client logging system and discovered a bunch of interesting bugs. I’ll tell you only the subset of those bugs that is relevant to this story.

Again here’s a very simplified architecture of the system.

                                   +--------------+
                                   |              |
               +---+  +----------> |  LOG SERVER  |
               |log|  |            |              |
               +---+  |            +------+-------+
                      |                   |
                +-----+----+              |  200 ok
                |          |              |
                |  CLIENT  |  <-----------+
                |          |
                +-----+----+
                      ^
                      +--------+--------+--------+
                      |        ^        ^        |
                   +--+--+  +--+--+  +--+--+  +--+--+
                   | log |  | log |  | log |  | log |
                   |     |  |     |  |     |  |     |
                   |     |  |     |  |     |  |     |
                   +-----+  +-----+  +-----+  +-----+

The desktop client would generate logs. Those logs were compressed, encrypted, and written to disk. Then every so often the client would send them up to the server. The client would read a log off of disk and send it to the log server. The server would decrypt it and store it, then respond with a 200.

If the client couldn’t reach the log server, it wouldn’t let the log directory grow unbounded. After a certain point it would start deleting logs to keep the directory under a maximum size.

The first two bugs were not a big deal on their own. The first one was that the desktop client sent logs up to the server starting with the oldest one instead of starting with the newest. This isn’t really what you want – for example, the server would tell the client to send logs if the client reported an exception, so probably you care about the logs that just happened and not the oldest logs that happen to be on disk.

The second bug was similar to the first: if the log directory hit its maximum size, the client would delete the logs starting with the newest instead of starting with the oldest. Again, you lose log files either way, but you probably care less about the older ones.

The third bug had to do with the encryption. Sometimes, the server would be unable to decrypt a log file. (We generally didn’t figure out why – maybe it was a bitflip.) We weren’t handling this error correctly on the backend, so the server would reply with a 500. The client would behave reasonably in the face of a 500: it would assume that the server was down. So it would stop sending log files and not try to send up any of the others.

Returning a 500 on a corrupted log file is clearly not the right behavior. You could consider returning a 400, since it’s a problem with the client request. But the client also can’t fix the problem – if the log file can’t be decrypted now, we’ll never be able to decrypt it in the future. What you really want the client to do is just delete the log and move on. In fact, that’s the default behavior when the client gets a 200 back from the server for a log file that was successfully stored. So we said, ok – if the log file can’t be decrypted, just return a 200.
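
In rough pseudocode, the server-side decision ended up looking something like this (a sketch with hypothetical names, not the real handler):

    def handle_log_upload(encrypted_log, decrypt, store):
        """decrypt raises ValueError when a log can't be decrypted; store persists it."""
        try:
            log = decrypt(encrypted_log)
        except ValueError:
            # The client can never fix this file, so acknowledge it anyway;
            # the client will delete the log and move on instead of stalling.
            return 200
        store(log)
        return 200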

All of these bugs were straightforward to fix. The first two bugs were on the client, so we’d fixed them on the alpha build but they hadn’t gone out to the majority of clients. The third bug we fixed on the server and deployed.

📈

Suddenly traffic to the log cluster spikes. The serving team reaches out to us to ask if we know what’s going on. It takes me a minute to put all the pieces together.

Before these fixes, there were four things going on:

  1. Log files were sent up starting with the oldest
  2. Log files were deleted starting with the newest
  3. If the server couldn’t decrypt a log file it would 500
  4. If the client got a 500 it would stop sending logs

A client with a corrupted log file would try to send it, the server would 500, the client would give up sending logs. On its next run, it would try to send the same file again, fail again, and give up again. Eventually the log directory would get full, at which point the client would start deleting its newest files, leaving the corrupted one on disk.

The upshot of these three bugs: if a client ever had a corrupted log file, we would never see logs from that client again.

The problem is that there were a lot more clients in this state than we thought. Any client with a single corrupted file had been dammed up from sending logs to the server. Now that dam was cleared, and all of them were sending up the rest of the contents of their log directories.

Our options

Ok, there’s a huge flood of traffic coming from machines around the world. What can we do? (This is a fun thing about working at a company with Dropbox’s scale, and particularly Dropbox’s scale of desktop clients: you can trigger a self-DDOS very easily.)

The first option when you do a deploy and things start going sideways is to rollback. Totally reasonable choice, but in this case, it wouldn’t have helped us. The state that we’d transformed wasn’t the state on the server but the state on the client – we’d deleted those files. Rolling back the server would prevent additional clients from entering this state but it wouldn’t solve the problem.

What about increasing the size of the logging cluster? We did that – and started getting even more requests, now that we’d increased our capacity. We increased it again, but you can’t do that forever. Why not? This cluster isn’t isolated. It’s making requests into another cluster, in this case to handle exceptions. If you have a DDOS pointed at one cluster, and you keep scaling that cluster, you’re going to knock over its dependencies too, and now you have two problems.

Another option we considered was shedding load – you don’t need every single log file, so can we just drop some requests? One of the challenges here was that we didn’t have an easy way to tell good traffic from bad. We couldn’t quickly differentiate which log files were old and which were new.

The solution we hit on is one that’s been used at Dropbox on a number of different occasions: we have a custom header, chillout, which every client in the world respects. If the client gets a response with this header, then it doesn’t make any requests for the provided number of seconds. Someone very wise added this to the Dropbox client very early on, and it’s come in handy more than once over the years. The logging server didn’t have the ability to set that header, but that’s an easy problem to solve. So two of my colleagues, Isaac Goldberg and John Lai, implemented support for it. We set the logging cluster chillout to two minutes initially and then managed it down as the deluge subsided over the next couple of days.
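
The client-side half of that mechanism might look roughly like this (a sketch; only the header name comes from the story, the rest is hypothetical):

    import time
    import requests  # assumed HTTP library for this sketch

    def post_with_chillout(url, payload):
        """POST a payload, then honor any server-requested pause before returning."""
        response = requests.post(url, data=payload)
        chillout = int(response.headers.get('chillout', 0))
        if chillout > 0:
            time.sleep(chillout)  # make no further requests for this many seconds
        return response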

Know your system

The first lesson from this bug is to know your system. I had a good mental model of the interaction between the client and the server, but I wasn’t thinking about what would happen when the server was interacting with all the clients at once. There was a level of complexity that I hadn’t thought all the way through.

Know your tools

The second lesson is to know your tools. If things go sideways, what options do you have? Can you reverse your migration? How will you know if things are going sideways and how can you discover more? All of those things are great to know before a crisis – but if you don’t, you’ll learn them during a crisis and then never forget.

Feature flags & server-side gating

The third lesson is for you if you’re writing a mobile or a desktop application: You need server-side feature gating and server-side flags. When you discover a problem and you don’t have server-side controls, the resolution might take days or weeks as you push out a new release or submit a new version to the app store. That’s a bad situation to be in. The Dropbox desktop client isn’t going through an app store review process, but just pushing out a build to tens of millions of clients takes time. Compare that to hitting a problem in your feature and flipping a switch on the server: ten minutes later your problem is resolved.
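
As a sketch, server-side gating can be as simple as the client fetching a flag map and consulting it before choosing a code path (hypothetical endpoint and flag names):

    import requests  # assumed HTTP library for this sketch

    def load_feature_flags(url='https://example.com/client-flags'):
        """Fetch the current feature-flag map from the server."""
        return requests.get(url).json()  # e.g. {"new_logging_path": False}

    def upload_logs(flags):
        # Flipping the flag on the server changes client behavior on the next
        # fetch, with no new build or app-store review required.
        if flags.get('new_logging_path', False):
            print('using the new logging path')  # stand-in for the real code path
        else:
            print('using the old logging path')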

This strategy is not without its costs. Having a bunch of feature flags in your code adds to the complexity dramatically. You get a combinatoric problem with your testing: what if feature A and feature B are both enabled, or just one, or neither – multiplied across N features. It’s extremely difficult to get engineers to clean up their feature flags after the fact (and I was also guilty of this). Then for the desktop client there are multiple versions in the wild at the same time, so it gets pretty hard to reason about.

But the benefit – man, when you need it, you really need it.

How to love bugs

I’ve talked about some bugs that I love and I’ve talked about why to love bugs. Now I want to tell you how to love bugs. If you don’t love bugs yet, I know of exactly one way to learn, and that’s to have a growth mindset.

The psychologist Carol Dweck has done a ton of interesting research about how people think about intelligence. She’s found that there are two different frameworks for thinking about intelligence. The first, which she calls the fixed mindset, holds that intelligence is a fixed trait, and people can’t change how much of it they have. The other mindset is a growth mindset. Under a growth mindset, people believe that intelligence is malleable and can increase with effort.

Dweck found that a person’s theory of intelligence – whether they hold a fixed or growth mindset – can significantly influence the way they select tasks to work on, the way they respond to challenges, their cognitive performance, and even their honesty.

[I also talked about a growth mindset in my Kiwi PyCon keynote, so here are just a few excerpts. You can read the full transcript here.]

Findings about honesty:

After this, they had the students write letters to pen pals about the study, saying “We did this study at school, and here’s the score that I got.” They found that almost half of the students praised for intelligence lied about their scores, and almost no one who was praised for working hard was dishonest.

On effort:

Several studies found that people with a fixed mindset can be reluctant to really exert effort, because they believe it means they’re not good at the thing they’re working hard on. Dweck notes, “It would be hard to maintain confidence in your ability if every time a task requires effort, your intelligence is called into question.”

On responding to confusion:

They found that students with a growth mindset mastered the material about 70% of the time, regardless of whether there was a confusing passage in it. Among students with a fixed mindset, if they read the booklet without the confusing passage, again about 70% of them mastered the material. But the fixed-mindset students who encountered the confusing passage saw their mastery drop to 30%. Students with a fixed mindset were pretty bad at recovering from being confused.

These findings show that a growth mindset is critical while debugging. We have to recover from confusion, be candid about the limitations of our understanding, and at times really struggle on the way to finding solutions – all of which is easier and less painful with a growth mindset.

Love your bugs

I learned to love bugs by explicitly celebrating challenges while working at the Recurse Center. A participant would sit down next to me and say, “[sigh] I think I’ve got a weird Python bug,” and I’d say, “Awesome, I love weird Python bugs!” First of all, this is definitely true, but more importantly, it emphasized to the participant that finding something where they struggled was an accomplishment, and it was a good thing for them to have done that day.

As I mentioned, at the Recurse Center there are no deadlines and no assignments, so this attitude is pretty much free. I’d say, “You get to spend a day chasing down this weird bug in Flask, how exciting!” At Dropbox and later at Pilot, where we have a product to ship, deadlines, and users, I’m not always uniformly delighted about spending a day on a weird bug. So I’m sympathetic to the reality of the world where there are deadlines. However, if I have a bug to fix, I have to fix it, and being grumbly about the existence of the bug isn’t going to help me fix it faster. I think that even in a world where deadlines loom, you can still apply this attitude.

If you love your bugs, you can have more fun while you’re working on a tough problem. You can be less worried and more focused, and end up learning more from them. Finally, you can share a bug with your friends and colleagues, which helps you and your teammates.

Obrigada!

My thanks to folks who gave me feedback on this talk and otherwise contributed to my being there:

  • Sasha Laundy
  • Amy Hanlon
  • Julia Evans
  • Julian Cooper
  • Raphael Passini Diniz and the rest of the Python Brasil organizing team

A Clash of Cultures


There’s an Internet controversy going on between Dale Dougherty, the CEO of Maker Media, and Naomi Wu (@realsexycyborg), a Chinese Maker and Internet personality. Briefly, Dale Dougherty tweeted a single line questioning Naomi Wu’s authenticity, which is destroying Naomi’s reputation and livelihood in China.

In short, I am in support of Naomi Wu. Rather than let the Internet speculate on why, I am sharing my perspectives on the situation preemptively.

As with most Internet controversies, it’s messy and emotional. I will try my best to outline the biases and issues I have observed. Of course, everyone has their perspective; you don’t have to agree with mine. And I suspect many of my core audience will dislike and disagree with this post. However, the beginning of healing starts with sharing and listening. I will share, and I respectfully request that readers read the entire content of this post before attacking any individual point out of context.

The key forces I see at play are:

  1. Prototype Bias – how assumptions based on stereotypes influence the way we think and feel
  2. Idol Effect – the tendency to assign exaggerated capabilities and inflated expectations upon celebrities
  3. Power Asymmetry – those with more power have more influence, and should be held to a higher standard of accountability
  4. Guanxi Bias – the tendency to give foreign faces more credibility than local faces in China

All these forces came together in a perfect storm this past week.

1. Prototype Bias

If someone asked you to draw a picture of an engineer, who would you draw? As you draw the figure, the gender assigned is a reflection of your mental prototype of an engineer – your own prototype bias. Most will draw a male figure. Society is biased to assign high-level intellectual ability to males, and this bias starts at a young age. Situations that don’t fit into your prototypes can feel threatening; studies have shown that men defend their standing by undermining the success of women in STEM initiatives.

The bias is real and pervasive. For example, my co-founder in Chibitronics, Jie Qi, is female. The company is founded on technology that is a direct result of her MIT Media Lab PhD dissertation. She is the inventor of paper electronics. I am a supporting actor in her show. Despite laying this fact out repeatedly, she still receives comments and innuendo implying that I am the inventor or more influential than I really am in the development process.

Any engineer who observes a bias in a system and chooses not to pro-actively correct for it is either a bad engineer or they stand to benefit from the bias. So much of engineering is about compensating, trimming, and equalizing imperfections out of real systems: wrap a feedback loop around it, and force the error function to zero.

So when Jie and I stand on stage together, prototype bias causes people to assume I’m the one who invented the technology. Given that I’m aware of the bias, does it make sense to give us equal time on the stage? No – that would be like knowing there is uneven loss in a channel and then being surprised when certain frequency bands are suppressed by the time it hits the receivers. So, I make a conscious and deliberate effort to showcase her contributions and to ensure her voice is the first and last voice you hear.

Naomi Wu (pictured below) likely challenges your prototypical ideal of an engineer. I imagine many people feel a cognitive dissonance juxtaposing the label “engineer” or “Maker” with her appearance. The strength of that dissonant feeling is proportional to the amount of prototype bias you have.

I’ve been fortunate to experience breaking my own prototypical notions that associate certain dress norms with intelligence. I’m a regular at Burning Man, and my theme camp is dominated by scientists and engineers. I’ve discussed injection molding with men in pink tutus and learned about plasmonics from half-naked women. It’s not a big leap for me to accept Naomi as a Maker. I’m glad she’s challenging these biases. I do my best engineering when sitting half-naked at my desk. I find shirts and pants to be uncomfortable. I don’t have the strength to challenge these social norms, and secretly, I’m glad someone is.

Unfortunately, prototype bias is only the first challenge confronted in this situation.

2. Idol Effect

The Idol Effect is the tendency to assign exaggerated capabilities to public figures and celebrities. The adage “never meet your childhood hero” is a corollary of the Idol Effect – people have inflated expectations about what celebrities can do, so it’s often disappointing when you find out they are humans just like us.

One result of the Idol Effect is that people feel justified taking pot shots at public figures for their shortcomings. For example, I have had the great privilege of working with Edward Snowden. One of my favorite things about working with him is that he is humble and quick to correct misconceptions about his personal abilities. Because of his self-awareness of his limitations, it’s easier for me to trust his assertions, and he’s also a fast learner because he’s not afraid to ask questions. Notably, he’s never claimed to be a genius, so I’m always taken aback when intelligent people pull me aside and whisper in my ear, “You know, I hear Ed’s a n00b. He’s just using you.” Somehow, because of Ed’s worldwide level of fame that’s strongly associated with security technology, people assume he should be a genius level crypto-hacker and are quick to point out that he’s not. Really? Ed is risking his life because he believes in something. I admire his dedication to the cause, and I enjoy working with him because he’s got good ideas, a good heart, and he’s fun to be with.

Because I also have a public profile, the Idol Effect impacts me too. I’m bad at math, can’t tie knots, a mediocre programmer…the list goes on. If there’s firmware in a product I’ve touched, it’s likely to have been written by Sean ‘xobs’ Cross, not me. If there’s analytics or informatics involved, it’s likely my partner wrote the analysis scripts. She also edits all my blog posts (including this one) and has helped me craft my most viral tweets – because she’s a genius at informatics, she can run analyses on how to target key words and pick times of day to get maximum impact. The fact that I have a team of people helping me polish my work makes me look better than I really am, and people tend to assign capabilities to me that I don’t really have. Does this mean I am a front, fraud or a persona?

I imagine Naomi is a victim of the Idol Effect too. Similar to Snowden, one of the reasons I’ve enjoyed interacting with Naomi is that she’s been quick to correct misconceptions about her abilities, she’s not afraid to ask for help, and she’s a quick learner. Though many may disapprove of her rhetoric on Twitter, please keep in mind that English is her second language — the only cultural context in which she learned it was the Internet, reading social media and chat rooms.

Based on the rumors I’ve read, it seems fans and observers have inflated expectations for her abilities, and because of uncorrected prototype bias, she faces extra scrutiny to prove her abilities. Somehow the fact that she almost cuts her finger using a scraper to remove a 3D print is “evidence” that she’s not a Maker. If that’s true, I’m not a Maker either. I always have trouble releasing 3D prints from print stages. They’ve routinely popped off and flown across the room, and I’ve almost cut my fingers plenty of times with the scraper. But I still keep on trying and learning – that’s the point. And then there’s the suggestion that because a man holds the camera, he’s feeding her lines.

When a man harnesses the efforts of a team, they call him a CEO and give him a bonus. But when a woman harnesses the efforts of a team, she gets accused of being a persona and a front. This is uncorrected Prototype Bias meeting unrealistic expectations due to the Idol Effect.

The story might end there, but things recently got a whole lot worse…

3. Power Asymmetry

“With great power comes great responsibilities.”
-from Spider Man

Power is not distributed evenly in the world. That’s a fact of life. Not acknowledging the role power plays leads to systemic abuse, like those documented in the Caldbeck or Weinstein scandals.

Editors and journalists – those with direct control over what gets circulated in the media – have a lot of power. Their thoughts and opinions can reach and influence a massive population very quickly. Rumors are just rumors until media outlets breathe life into them, at which point they become an incurable cancer on someone’s career. Editors and journalists must be mindful of the power they wield and be held accountable when it is misused.

As CEO of Maker Media and head of an influential media outlet, especially among the DIY community, Dale Dougherty wields substantial power. So a tweet promulgating the idea that Naomi might be a persona or a fake does not land lightly. In the post-truth era, it’s especially incumbent upon traditional media to double-check rumors before citing them in any context.

What is personally disappointing is that Dale reached out to me on November 2nd with an email asking what I thought about an anonymous post that accused Naomi of being a fake. I vouched for Naomi as a real person and as a budding Maker; I wrote back to Dale that “I take the approach of interacting with her like any other enthusiastic, curious Maker and the resulting interactions have been positive. She’s a fast learner.”

Yet Dale decided to take an anonymous poster’s opinion over mine (despite a long working relationship with Make), and a few days later on November 5th he tweeted a link to the post suggesting Naomi could be a fake or a fraud, despite having evidence to the contrary.

So now Naomi, already facing prototype bias and idol-effect expectations, gets a big media personality with substantial power propagating rumors that she is a fake and a fraud.

But wait, it gets worse because Naomi is in China!

4. Guanxi Bias

In China, guanxi (关系) is everything. Public reputation is extremely hard to build, and quick to lose. Faking and cloning is a real problem, but it’s important to not lose sight of the fact that there are good, hard-working people in China as well. So how do the Chinese locals figure out who to trust? Guanxi is a major mechanism used inside China to sort the good from the bad – it’s a social network of credible people vouching for each other.

For better or for worse, the Chinese feel that Western faces and brands are more credible. The endorsement of a famous Western brand carries a lot of weight; for example Leonardo DiCaprio is the brand ambassador for BYD (a large Chinese car maker).

Maker Media has a massive reputation in China. From glitzy Maker Faires to the Communist party’s endorsement of Maker-ed and Maker spaces as a national objective, an association with Maker Media, or the lack thereof, can make or break a reputation. This is no exception for Naomi. Her uniqueness as a Maker combined with her talent at marketing has enabled her to do product reviews and endorsements as a source of income.

However, for several years she’s been excluded from the Shenzhen Maker Faire lineup, even for events she should have been a shoo-in for: wearables, Maker fashion shows, 3D printing. Despite this lack of endorsement, she’s built her own social media follower base both inside and outside of China, and built a brand around herself.

Unfortunately, when the CEO of Maker Media, a white male leader of an established American brand, suggested Naomi was a potential fake, the Internet inside China exploded on her. Sponsors cancelled engagements with her. Followers turned into trolls. She can’t be seen publicly with men (because others will say the males are the real Maker, see “prototype bias”), and as a result faces a greater threat of physical violence.

A single innuendo, amplified by Power Asymmetry and Guanxi Bias, on top of Idol Effect meshed against Prototype Bias, has destroyed everything a Maker has worked so hard to build over the past few years.

If someone spread lies about you and destroyed your livelihood – what would you do? Everyone would react a little differently, but make no mistake: at this point she’s got nothing left to lose, and she’s very angry.

Reflection

Although Dale had issued a public apology about the rumors, the apology fixes her reputation as much as saying “sorry” repairs a vase smashed on the floor.

Image: Mindy Georges CC BY-NC

At this point you might ask — why would Dale want to slander Naomi?

I don’t know the background, but prior to Dale’s tweet, Naomi had aggressively dogged Dale and Make about Make’s lack of representation of women. Others have noted that Maker Media has a prototype bias toward white males. Watch this analysis by Leah Buechley, a former MIT Media Lab Professor:

Dale could have recognized and addressed this core issue of a lack of diversity. Instead, Dale elected to endorse unsubstantiated claims and destroy a young female Maker’s reputation and career.

Naomi has a long, uphill road ahead of her. On the other hand, I’m sure Dale will do fine – he’s charismatic, affable, and powerful.

When I sit and think, how would I feel if this happened to the women closest to me? I get goosebumps – the effect would be chilling; the combination of pervasive social biases would overwhelm logic and fact. So even though I may not agree with everything Naomi says or does, I have decided that in the bigger picture, hiding in complicit silence on the sidelines is not acceptable.

We need to acknowledge that prototype bias is real; if equality is the goal, we need to be proactive in correcting it. Just because someone is famous doesn’t mean they are perfect. People with power need to be held accountable in how they wield it. And finally, cross-cultural issues are complicated and delicate. All sides need to open their eyes, ears, and hearts and realize we’re all human. Tweets may seem like harmless pricks to the skin, but we all bleed when pricked. For humanity to survive, we need to stop pricking each other lest we all bleed to death.

/me dons asbestos suit


A few thoughts on CSRankings.org


(Warning: nerdy inside-baseball academic blog post follows. If you’re looking for exciting crypto blogging, try back in a couple of days.)

If there’s one thing that academic computer scientists love (or love to hate), it’s comparing themselves to other academics. We don’t do what we do for the big money, after all. We do it — in large part — because we’re curious and want to do good science. (Also there’s sometimes free food.) But then there’s a problem: who’s going to tell us if we’re doing good science?

To a scientist, the solution seems obvious. We just need metrics. And boy, do we get them. Modern scientists can visit Google Scholar to get all sorts of information about their citation count, neatly summarized with an “H-index” or an “i10-index”. These metrics aren’t great, but they’re a good way to pass an afternoon filled with self-doubt, if that’s your sort of thing.

But what if we want to do something more? What if we want to compare institutions as well as individual authors? And even better, what if we could break those institutions down into individual subfields? You could do this painfully on Google Scholar, perhaps. Or you could put your faith in the abominable and apparently wholly made-up U.S. News rankings, as many academics (unfortunately) do.

Alternatively, you could actually collect some data about what scientists are publishing, and work with that.

This is the approach of a new site called “Computer Science Rankings”. As best I can tell, CSRankings is largely an individual project, and doesn’t have the cachet (yet) of U.S. News. At the same time, it provides researchers and administrators with something they love: another way to compare themselves, and to compare different institutions. Moreover, it does so with real data (rather than the Ouija board and blindfold that U.S. News uses). I can’t see it failing to catch on.

And that worries me, because the approach of CSRankings seems a bit arbitrary. And I’m worried about what sort of things it might cause us to do.

You see, people in our field take rankings very seriously. I know folks who have moved their families to the other side of the country over a two-point ranking difference in the U.S. News rankings — despite the fact that we all agree those are absurd. And this is before we consider the real impact of rankings (individual and institutional) on salaries, promotions, and awards. People optimize their careers and publications to maximize these stats, not because they’re bad people, but because they’re (mostly) rational and that’s what rankings inspire rational people to do.

To me this means we should think very carefully about what our rankings actually say.

Which brings me to the meat of my concerns with CSRankings. At a glance, the site is beautifully designed. It allows you to look at dozens of institutions, broken down by CS subfield. Within those subfields it ranks institutions by a simple metric: adjusted publication counts in top conferences by individual authors.

The calculation isn’t complicated. If you wrote a paper by yourself and had it published in one of the designated top conferences in your field, you’d get a single point. If you wrote a paper with a co-author, then you’d each get half a point. If you wrote a paper that doesn’t appear in a top conference, you get zero points. Your institution gets the sum-total of all the points its researchers receive.
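
To make the incentive structure concrete, here is a toy version of that adjusted count (illustrative only, not CSRankings’ actual code; the venue list is a hypothetical example):

    from collections import defaultdict

    TOP_CONFERENCES = {'IEEE S&P', 'CCS', 'USENIX Security'}  # hypothetical example list

    def institution_scores(papers):
        """papers: iterable of dicts like
        {'venue': 'CCS', 'authors': [('Alice', 'Univ A'), ('Bob', 'Univ B')]}"""
        scores = defaultdict(float)
        for paper in papers:
            if paper['venue'] not in TOP_CONFERENCES:
                continue  # zero credit outside the designated top venues
            share = 1.0 / len(paper['authors'])  # co-authors split the point
            for _author, institution in paper['authors']:
                scores[institution] += share
        return dict(scores)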

If you believe that people are rational actors who optimize for rankings, you might start to see the problem.

First off, what CSRankings is telling us is that we should ditch those pesky co-authors. If I could write a paper with one graduate student, but a second student also wants to participate, tough cookies. That’s the difference between getting 1/2 a point and 1/3 of a point. Sure, that additional student might improve the paper dramatically. They might also learn a thing or two. But on the other hand, they’ll hurt your rankings.

(Note: currently on CSRankings, graduate students at the same institution don’t get included in the institutional rankings. So including them on your papers will actually reduce your school’s rank.)

I hope it goes without saying that this could create bad incentives.

Second, in fields that mix systems and theory — like computer security — CSRankings is telling us that theory papers (which typically have fewer authors) should be privileged in the rankings over systems papers. This creates both a distortion in the metrics, and also an incentive (for authors who do both types of work) to stick with the one that produces higher rankings. That seems undesirable. But it could very well happen if we adopt these rankings uncritically.

Finally, there’s this focus on “top conferences”. One of our big problems in computer science is that we spend a lot of our time scrapping over a very limited number of slots in competitive conferences. This can be ok, but it’s unfortunate for researchers whose work doesn’t neatly fit into whatever areas those conference PCs find popular. And CSRankings gives zero credit for publishing anywhere but those top conferences, so you might as well forget about that.

(Of course, there’s a question about what a “top conference” even is. In Computer Security, where I work, NDSS is not considered a top conference. That’s because only three conferences are permitted for each field. The fact that this number seems arbitrary really doesn’t help inspire a lot of confidence in the approach.)

So what can we do about this?

As much as I’d like to ditch rankings altogether, I realize that this probably isn’t going to happen. Nature abhors a vacuum, and if we don’t figure out a rankings system, someone else will. Hell, we’re already plagued by U.S. News, whose methodology appears to involve a popcorn machine and live tarantulas. Something, anything, has to be better than this.

And to be clear, CSRankings isn’t a bad effort. At a high level it’s really easy to use. Even the issues I mention above seem like things that could be addressed. More conferences could be added, using some kind of metric to scale point contributions. (This wouldn’t fix all the problems, but would at least mitigate the worst incentives.) Statistics could perhaps be updated to adjust for graduate students, and soften the blow of having co-authors. These things are not impossible.

And fixing this carefully seems really important. We got it wrong in trusting U.S. News. What I’d like is this time for computer scientists to actually sit down and think this one out before someone imposes a ranking system on top of us. What behaviors are we trying to incentivize for? Is it smaller author lists? Is it citation counts? Is it publishing only in a specific set of conferences?

I don’t know that anyone would agree uniformly that these should be our goals. So if they’re not, let’s figure out what they really are.





Five Books to Make You Less Stupid About the Civil War


On Monday, the retired four-star general and White House Chief of Staff John Kelly asserted that “the lack of an ability to compromise led to the Civil War.” This was an incredibly stupid thing to say. Worse, it built on a long tradition of endorsing stupidity in hopes of making Americans stupid about their own history. Stupid enjoys an unfortunate place in the highest ranks of American government these days. And while one cannot immediately affect this fact, one can choose to not hear stupid things and quietly nod along.

For the past 50 years, some of this country’s most celebrated historians have taken up the task of making Americans less stupid about the Civil War. These historians have been more effective than generally realized. It’s worth remembering that General Kelly’s remarks, which were greeted with mass howls of protests, reflected the way much of this country’s stupid-ass intellectual class once understood the Civil War. I do not contend that this improved history has solved everything. But it is a ray of light cutting through the gloom of stupid. You should run to that light. Embrace it. Bathe in it. Become it.

Okay, maybe that’s too far. Let’s start with just being less stupid.

One quick note: In making this list I’ve tried to think very hard about readability, and to offer books you might actually complete. There are a number of books that I dearly love and have found indispensable that are not on this list. (Du Bois’s Black Reconstruction in America immediately comes to mind.) I mean no slight to any of those volumes. But this is about being less stupid. We’ll get to those other ones when we talk about how to be smart.

1) Battle Cry Of Freedom: Arguably among the greatest single-volume histories in all of American historiography, James McPherson’s synthesis of the Civil War is a stunning achievement. Brisk in pace. A big-ass book that reads like a much slimmer one. The first few hundred pages offer a catalogue of evidence, making it clear not just that the white South went to war for the right to own people, but that it warred for the right to expand the right to own people. Read this book. You will immediately be less stupid than some of the most powerful people in the West Wing.

2) Grant: Another classic in the Ron Chernow oeuvre. Again, eminently readable but thick with import. It does not shy away from Grant’s personal flaws, but shows him to be a man constantly struggling to live up to his own standard of personal and moral courage. It corrects nearly a half-century of stupidity inflicted upon America by the Dunning school of historians, which preferred a portrait of Grant as a bumbling, corrupt butcher of men. Finally, it reframes the Civil War away from the overrated Virginia campaigns and shows us that when the West was won, so was the war. Grant hits like a Mack truck of knowledge. Stupid doesn’t stand a chance.

3) Reading the Man: A Portrait of Robert E. Lee: Elizabeth Pryor’s biography of Lee, through Lee’s own words, helps part with a lot of stupid out there about Lee—chiefly that he was, somehow, “anti-slavery.” It dispenses with the boatload of stupid out there which hails the military genius of Lee while ignoring the world that all of that genius was actually trying to build.

4.) Out of the House of Bondage: A slim volume that dispenses with the notion that there was such a thing as “good,” “domestic,” or “matronly” slavery. The historian Thavolia Glymph focuses on the relationships between black enslaved women and the white women who took them as property. She picks apart the stupid idea that white mistresses were somehow less violent and less exploitative than their male peers. Glymph has no need of Scarlett O’Haras. “Used the rod” is the quote that still sticks with me. An important point here—stupid ideas about ladyhood and the soft feminine hand meant nothing when measured against the fact of a slave society. Slavery was the monster that made monsters of its masters. Compromising with it was morally bankrupt—and stupid.

5.) The Life and Times of Frederick Douglass: The final of three autobiographies written by the famed abolitionist, and my personal favorite. Epic and sweeping in scope. The chapter depicting the bounty of food on which the enslavers feasted while the enslaved nearly starved is just devastating.

So that should get you to unstupid—but don’t stop there. Read Du Bois. Read Grant’s own memoirs. Read Harriet Jacobs. Read Eric Foner. Read Bruce Levine. It’s not that hard, you know. You’ve got nothing to lose, save your own stupid.
