One day this fall, Ashutosh Garg, the chief executive of a recruiting service called Eightfold.ai, turned up a résumé that piqued his interest.
It belonged to a prospective data scientist, someone who unearths patterns in data to help businesses make decisions, like how to target ads. But curiously, the résumé featured the term “data science” nowhere.
Instead, the résumé belonged to an analyst at Barclays who had done graduate work in physics at the University of California, Los Angeles. Though his profile on the social network LinkedIn indicated that he had never worked as a data scientist, Eightfold’s software flagged him as a good fit. He was similar in certain key ways, like his math and computer chops, to four actual data scientists whom Mr. Garg had instructed the software to consider as a model.
The idea is not to focus on job titles, but “what skills they have,” Mr. Garg said. “You’re really looking for people who have not done it, but can do it.”
This is a book about models. It describes dozens of models in straightforward language and explains how to apply them. Models are formal structures represented in mathematics and diagrams that help us to understand the world. Mastery of models improves your ability to reason, explain, design, communicate, act, predict, and explore.
This book promotes a many-model thinking approach: the application of ensembles of models to make sense of complex phenomena. The core idea is that many-model thinking produces wisdom through a diverse ensemble of logical frames. The various models accentuate different causal forces. Their insights and implications overlap and interweave. By engaging many models as frames, we develop nuanced, deep understandings. The book includes formal arguments to make the case for multiple models along with myriad real-world examples.
The book has a pragmatic focus. Many-model thinking has tremendous practical value. Practice it, and you will better understand complex phenomena. You will reason better. You will exhibit fewer gaps in your reasoning and make more robust decisions in your career, community activities, and personal life. You may even become wise.
Twenty-five years ago, a book of models would have been intended for professors and graduate students studying business, policy, and the social sciences along with financial analysts, actuaries, and members of the intelligence community. These were the people who applied models and, not coincidentally, they were also the people most engaged with large data sets. Today, a book of models has a much larger audience: the vast universe of knowledge workers, who, owing to the rise of big data, now find working with models a part of their daily lives.
Organizing and interpreting data with models has become a core competency for business strategists, urban planners, economists, medical professionals, engineers, actuaries, and environmental scientists among others. Anyone who analyzes data, formulates business strategies, allocates resources, designs products and protocols, or makes hiring decisions encounters models. It follows that mastering the material in this book—particularly the models covering innovation, forecasting, data binning, learning, and market entry timing—will be of practical value to many.
Thinking with models will do more than improve your performance at work. It will make you a better citizen and a more thoughtful contributor to civic life. It will make you more adept at evaluating economic and political events. You will be able to identify flaws in your logic and in that of others. You will learn to identify when you are allowing ideology to supplant reason and have richer, more layered insights into the implications of policy initiatives, whether they be proposed greenbelts or mandatory drug tests.
These benefits will accrue from an engagement with a variety of models—not hundreds, but a few dozen. The models in this book offer a good starting collection. They come from multiple disciplines and include the Prisoners’ Dilemma, the Race to the Bottom, and the SIR model of disease transmission. All of these models share a common form: they assume a set of entities—often people or organizations—and describe how they interact.
The models we cover fall into three classes: simplifications of the world, mathematical analogies, and exploratory, artificial constructs. In whatever form, a model must be tractable. It must be simple enough that within it we can apply logic. For example, we cover a model of communicable diseases that consists of infected, susceptible, and recovered people that assumes a rate of contagion. Using the model we can derive a contagion threshold, a tipping point, above which the disease spreads. We can also determine the proportion of people we must vaccinate to stop the disease from spreading.
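The threshold and vaccination results described above can be sketched numerically. This is a minimal sketch, assuming the textbook form of the SIR model, in which the basic reproduction number R0 is the contagion rate divided by the recovery rate. The recovery rate, the function names, and the example rates are illustrative assumptions, not taken from the book.

```python
def basic_reproduction_number(contagion_rate: float, recovery_rate: float) -> float:
    """R0: expected new infections caused by one infected person
    in a fully susceptible population (textbook SIR assumption)."""
    if recovery_rate <= 0:
        raise ValueError("recovery_rate must be positive")
    return contagion_rate / recovery_rate


def disease_spreads(r0: float) -> bool:
    """The contagion threshold: the disease spreads only when R0 exceeds 1."""
    return r0 > 1.0


def vaccination_threshold(r0: float) -> float:
    """Proportion of people to vaccinate so the effective R0 drops to 1 or below."""
    if r0 <= 1.0:
        return 0.0  # already below the tipping point, no vaccination needed
    return 1.0 - 1.0 / r0


# Hypothetical rates chosen only for illustration.
r0 = basic_reproduction_number(contagion_rate=0.5, recovery_rate=0.25)
print(r0, disease_spreads(r0), vaccination_threshold(r0))
```

Under these assumptions, a more contagious disease (a higher R0) requires vaccinating a larger share of the population, which is the kind of conclusion a tractable model makes it possible to derive.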
As powerful as single models can be, a collection of models accomplishes even more. With many models, we avoid the narrowness inherent in each individual model. A many-models approach illuminates each component model’s blind spots. Policy choices made based on single models may ignore important features of the world such as income disparity, identity diversity, and interdependencies with other systems.1 With many models, we build logical understandings of multiple processes. We see how causal processes overlap and interact. We create the possibility of making sense of the complexity that characterizes our economic, political, and social worlds. And, we do so without abandoning rigor—model thinking ensures logical coherence. That logic can then be grounded in evidence by taking models to data to test, refine, and improve them. In sum, when our thinking is informed by diverse, logically consistent, empirically validated frames, we are more likely to make wise choices.
Without models, making sense of data is hard. Data helps describe reality, albeit imperfectly. On its own, though, data can’t recommend one decision over another. If you notice that your best-performing teams are also your most diverse, that may be interesting. But to turn that data point into insight, you need to plug it into some model of the world — for instance, you may hypothesize that having a greater variety of perspectives on a team leads to better decision-making. Your hypothesis represents a model of the world.
Though single models can perform well, ensembles of models work even better. That is why the best thinkers, the most accurate predictors, and the most effective design teams use ensembles of models. They are what I call many-model thinkers.
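One way to see why an ensemble can outperform its members is a simple identity for averaged numeric predictions under squared error: the ensemble's error equals the average individual error minus the diversity of the predictions. The sketch below illustrates that identity only; the passage above does not state it, and the function name and the example numbers are hypothetical.

```python
from statistics import mean


def ensemble_error_decomposition(predictions, truth):
    """Return (ensemble_error, average_individual_error, diversity)
    for a simple average of numeric predictions, using squared error."""
    ensemble = mean(predictions)
    ensemble_error = (ensemble - truth) ** 2
    average_individual_error = mean((p - truth) ** 2 for p in predictions)
    # Diversity: how far the individual predictions sit from their own average.
    diversity = mean((p - ensemble) ** 2 for p in predictions)
    return ensemble_error, average_individual_error, diversity


# Three hypothetical model predictions of a quantity whose true value is 60.
err, avg_err, div = ensemble_error_decomposition([40.0, 50.0, 75.0], truth=60.0)
print(err, avg_err - div)  # the two values agree up to floating-point rounding
```

Because diversity is never negative, the averaged prediction is never worse than the average individual model under this error measure, and it improves as the models disagree more.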
"We have charts and graphs to back us up. So f*** off.” New hires in Google’s people analytics department began receiving a laptop sticker with that slogan a few years ago, when the group probably felt it needed to defend its work. Back then people analytics—using statistical insights from employee data to make talent management decisions—was still a provocative idea with plenty of skeptics who feared it might lead companies to reduce individuals to numbers. HR collected data on workers, but the notion that it could be actively mined to understand and manage them was novel—and suspect.
Today there’s no need for stickers. More than 70% of companies now say they consider people analytics to be a high priority. The field even has celebrated case studies, like Google’s Project Oxygen, which uncovered the practices of the tech giant’s best managers and then used them in coaching sessions to improve the work of low performers. Other examples, such as Dell’s experiments with increasing the success of its sales force, also point to the power of people analytics.
But hype, as it often does, has outpaced reality. The truth is, people analytics has made only modest progress over the past decade. A survey by Tata Consultancy Services found that just 5% of big-data investments go to HR, the group that typically manages people analytics. And a recent study by Deloitte showed that although people analytics has become mainstream, only 9% of companies believe they have a good understanding of which talent dimensions drive performance in their organizations.
What gives? If, as the sticker says, people analytics teams have charts and graphs to back them up, why haven’t results followed? We believe it’s because most rely on a narrow approach to data analysis: They use data only about individual people, when data about the interplay among people is equally or more important.
People’s interactions are the focus of an emerging discipline we call relational analytics. By incorporating it into their people analytics strategies, companies can better identify employees who are capable of helping them achieve their goals, whether for increased innovation, influence, or efficiency. Firms will also gain insight into which key players they can’t afford to lose and where silos exist in their organizations.
Fortunately, the raw material for relational analytics already exists in companies. It’s the data created by e-mail exchanges, chats, and file transfers—the digital exhaust of a company. By mining it, firms can build good relational analytics models.
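As a rough illustration of how such digital exhaust might be turned into relational data, the sketch below counts interactions between pairs of people from a list of (sender, recipient) records. The record format, the function names, and the sample names are assumptions made for illustration; the article does not specify a data format or a method.

```python
from collections import Counter


def interaction_counts(messages):
    """Count undirected interactions between pairs of people.

    `messages` is an iterable of (sender, recipient) tuples,
    a hypothetical simplification of e-mail or chat logs.
    """
    counts = Counter()
    for sender, recipient in messages:
        if sender != recipient:  # ignore messages sent to oneself
            counts[frozenset((sender, recipient))] += 1
    return counts


def contacts_per_person(counts):
    """Number of distinct people each person has interacted with."""
    contacts = {}
    for pair in counts:
        a, b = tuple(pair)
        contacts.setdefault(a, set()).add(b)
        contacts.setdefault(b, set()).add(a)
    return {person: len(others) for person, others in contacts.items()}


# Hypothetical log entries.
log = [("ana", "ben"), ("ben", "ana"), ("ana", "cy"), ("dee", "ana")]
pairs = interaction_counts(log)
print(contacts_per_person(pairs))
```

Pair counts like these are only a starting point; a real analysis would also need to handle privacy, group messages, and time windows.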
In this article we present a framework for understanding and applying relational analytics. And we have the charts and graphs to back us up.
You want to know which teams are at the forefront of analytics? Just look around at the teams still playing.
Once upon a time, there was the Oakland Athletics and a sacred tome called "Moneyball." It was about baseball teams winning with statistics. Only it wasn't about that at all. It was about market inefficiency. Then John Henry bought the Boston Red Sox, hired Bill James, made Theo Epstein his general manager, and Moneyball spread to a big market.
We're several iterations past all of that. Things move fast in technology, so fast it can even carry a tradition-based industry like baseball into the digital age. These days, every team is playing Moneyball. All of them, as in 30 for 30.
"At this point, I think everyone assumes that their counterpart is smart," Brewers general manager David Stearns said. "And everyone is doing what they can do to unearth competitive advantages." To call it Moneyball is not right, either. Michael Lewis is still turning out ground-breaking work, but to fully capture what is happening in big league front offices, circa 2018, the next inside look at analytics and baseball would need to be authored by someone like the late Stephen Hawking. It's hard to say what you'd call it. "The Singularity" has already been taken.
In machine learning and deep learning we can’t do anything without data. So the people that create datasets for us to train our models are the (often under-appreciated) heroes. Some of the most useful and important datasets are those that become important “academic baselines”; that is, datasets that are widely studied by researchers and used to compare algorithmic changes. Some of these become household names (at least, among households that train models!), such as MNIST, CIFAR-10, and ImageNet.
We all owe a debt of gratitude to those kind folks who have made datasets available for the research community. So fast.ai and the AWS Public Dataset Program have teamed up to try to give back a little: we’ve made some of the most important of these datasets available in a single place, using standard formats, on reliable and fast infrastructure. For a full list and links see the fast.ai datasets page.
fast.ai uses these datasets in the Deep Learning for Coders courses, because they provide great examples of the kind of data that students are likely to encounter, and the academic literature has many examples of model results using these datasets which students can compare their work to. If you use any of these datasets in your research, please show your gratitude by citing the original paper (we’ve provided the appropriate citation link below for each), and if you use them as part of a commercial or educational project, consider adding a note of thanks and a link to the dataset.
An algorithm that was being tested as a recruitment tool by online giant Amazon was sexist and had to be scrapped, according to a Reuters report. The artificial intelligence system was trained on data submitted by applicants over a 10-year period, much of which came from men, it claimed.
Reuters was told by members of the team working on it that the system effectively taught itself that male candidates were preferable. Amazon has not responded to the claims.
Reuters spoke to five members of the team who developed the machine learning tool in 2014, none of whom wanted to be publicly named. They told Reuters that the system was intended to review job applications and give candidates a score ranging from one to five stars.
"They literally wanted it to be an engine where I'm going to give you 100 resumes, it will spit out the top five, and we'll hire those," said one of the engineers who spoke to Reuters.
In today's world, scientists in many disciplines and a growing number of journalists live and breathe data. There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.
Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher's site, a digital library, or an author's personal web page. To create Dataset Search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.
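To make the idea of a schema.org description concrete, here is a minimal sketch that builds a JSON-LD record of the schema.org `Dataset` type covering the items listed above: creator, publication date, collection method, and terms of use. The helper function, its parameter names, and all sample values are hypothetical; the choice of properties is an assumption about a reasonable minimum, not a statement of what Dataset Search requires.

```python
import json


def dataset_jsonld(name, description, creator, date_published,
                   license_url, collection_method):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": {"@type": "Person", "name": creator},  # who created it
        "datePublished": date_published,                   # when it was published
        "measurementTechnique": collection_method,         # how it was collected
        "license": license_url,                            # terms of use
    }


# Hypothetical dataset, for illustration only.
record = dataset_jsonld(
    name="Example city air quality readings",
    description="Hourly readings from a hypothetical sensor network.",
    creator="Jane Example",
    date_published="2018-09-01",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    collection_method="Fixed roadside sensors sampled hourly",
)
print(json.dumps(record, indent=2))
```

A publisher would typically embed output like this in the dataset's web page inside a `script` element of type `application/ld+json`.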
When colleges try to understand their students, they resort to a common tool: the survey.
And surveys are fine, says Dayna Weintraub, director of student-affairs research and assessment at Rutgers University at New Brunswick. But she also recognizes their drawbacks: poor response rates, underrepresentation of particular demographic groups, and, in certain instances, answers that lack needed candor.
And so, to assess and change student conduct in a more effective way, Weintraub and her colleagues have tried a new approach: find existing, direct, and detailed data on how Rutgers students conduct themselves, and combine them.
Leading the effort was Kevin Pitt, director of student conduct at the New Jersey university. Working alongside Weintraub, he and his team analyzed, with granular specificity, the behavior patterns of students in a variety of contexts: consuming excessive alcohol or drugs, in questionable sexual situations, and others. Pitt and his team examined student-level trends within those areas, combining a variety of previously siloed databases to sketch a more-informative picture of student life at Rutgers.
It seems like every business is struggling with the concept of transformation. Large incumbents are trying to keep pace with digital upstarts, and even digital native companies born as disruptors know that they need to transform. Take Uber: at only eight years old, it’s already upended the business model of taxis. Now it’s trying to move from a software platform to a robotics lab to build self-driving cars.
And while the number of initiatives that fall under the umbrella of “transformation” is so broad that it can seem meaningless, this breadth is actually one of the defining characteristics that differentiate transformation from ordinary change. A transformation is a whole portfolio of change initiatives that together form an integrated program.
And so a transformation is a system of systems, all made up of the most complex system of all — people. For this reason, organizational transformation is uniquely suited to the analysis, prediction, and experimental research approach of the people analytics field.
People analytics — defined as the use of data about human behavior, relationships and traits to make business decisions — helps to replace decision making based on anecdotal experience, hierarchy and risk avoidance with higher-quality decisions based on data analysis, prediction, and experimental research. In working with several dozen Fortune 500 companies with Microsoft’s Workplace Analytics division, we’ve observed companies using people analytics in three main ways to help understand and drive their transformation efforts.
Walk up a set of steep stairs next to a vegan Chinese restaurant in Palo Alto in Silicon Valley, and you will see the future of work, or at least one version of it. This is the local office of Humanyze, a firm that provides “people analytics”. It counts several Fortune 500 companies among its clients (though it will not say who they are). Its employees mill around an office full of sunlight and computers, as well as beacons that track their location and interactions. Everyone is wearing an ID badge the size of a credit card and the depth of a book of matches. It contains a microphone that picks up whether they are talking to one another; Bluetooth and infrared sensors to monitor where they are; and an accelerometer to record when they move.
“Every aspect of business is becoming more data-driven. There’s no reason the people side of business shouldn’t be the same,” says Ben Waber, Humanyze’s boss. The company’s staff are treated much the same way as its clients. Data from their employees’ badges are integrated with information from their e-mail and calendars to form a full picture of how they spend their time at work. Clients get to see only team-level statistics, but Humanyze’s employees can look at their own data, which include metrics such as time spent with people of the same sex, activity levels and the ratio of time spent speaking versus listening.
The University of Arizona is tracking freshman students’ ID card swipes to anticipate which students are more likely to drop out. University researchers hope to use the data to lower dropout rates. (Dropping out refers to those who have left higher-education entirely and those who transfer to other colleges.)
The card data tells researchers how frequently a student has entered a residence hall, library, and the student recreation center, which includes a salon, convenience store, mail room, and movie theater. The cards are also used for buying vending machine snacks and more, putting the total number of locations near 700. There’s a sensor embedded in the CatCard student IDs, which are given to every student attending the university.
“By getting their digital traces, you can explore their patterns of movement, behavior and interactions, and that tells you a great deal about them,” Sudha Ram, a professor of management information systems who directs the initiative, said in a press release.
The most illuminating moment of the Eagles’ enchanted season was a Week 3 play ridiculed in Philadelphia but celebrated here by a small cadre of people who recognized its significance almost immediately.
What fueled the excitement among members of the EdjSports crew was not the outcome of the play — a 6-yard sack of Carson Wentz on fourth-and-8 that gifted the Giants good field position — but rather the call itself. Leading by 7-0 on the Giants’ 43-yard line a few minutes before halftime, the Eagles opted not to punt.
By keeping Philadelphia’s offense on the field in a situation almost always played safe in the risk-averse N.F.L., Coach Doug Pederson did not buck conventional wisdom so much as roll his eyes at it.
An intern at EdjSports, responding to a flurry of text messages from his colleagues about the play, ran the numbers at home. The Eagles, by going for it, improved their probability of winning by 0.5 percent. Defending his decision (again) at a news conference the next day, Pederson cited that exact statistic.
Alongside the excitement and hype about our growing reliance on artificial intelligence, there’s fear about the way the technology works. A recent MIT Technology Review article titled “The Dark Secret at the Heart of AI” warned: “No one really knows how the most advanced algorithms do what they do. That could be a problem.” Thanks to this uncertainty and lack of accountability, a report by the AI Now Institute recommended that public agencies responsible for criminal justice, health care, welfare and education shouldn’t use such technology.
Given these types of concerns, the unseeable space between where data goes in and answers come out is often referred to as a “black box” — seemingly a reference to the hardy (and in fact orange, not black) data recorders mandated on aircraft and often examined after accidents. In the context of A.I., the term more broadly suggests an image of being in the “dark” about how the technology works: We put in and provide the data and models and architectures, and then computers provide us answers while continuing to learn on their own, in a way that’s seemingly impossible — and certainly too complicated — for us to understand.
The modern workplace is awash in meetings, many of which are terrible. As a result, people mostly hate going to meetings. The problem is this: The whole point of meetings is to have discussions that you can’t have any other way. And yet most meetings are devoid of real debate.
To improve the meetings you run, and save the meetings you’re invited to, focus on making the discussion more robust.
When teams have a good fight during meetings, team members debate the issues, consider alternatives, challenge one another, listen to minority views, and scrutinize assumptions. Every participant can speak up without fear of retribution. However, many people shy away from such conflict, conflating disagreement and debate with personal attacks. In reality, this sort of friction produces the best decisions. In my recent study of 5,000 managers and employees, published in my recent book, I found that the best performers are really good at generating rigorous discussions in team meetings. (The sample includes senior and junior managers and individual contributors from a range of industries in corporate America; my aim was to statistically identify work habits that correlate with higher performance.)
So how do you lead a good fight in meetings? Here are six practical tips:
Organizational Network Analysis (ONA) is the set of scientific methods and theories to help understand interactions within an organization. It helps executives and managers to intervene at critical times, increase performance, and reduce costs.
There’s increasing pressure on executives to drive sustained, long-term growth. Yet, they lack the information they need to make informed business decisions and successfully initiate change. As organizations restructure departments to have fewer hierarchical levels, work increasingly occurs between social networks, rather than through prescribed reporting structures. Research shows that employees look to their networks to find information and to solve problems. Communication no longer flows solely from senior management to individual contributors – information moves through social networks, between colleagues and different teams. Organizations can analyze social networks to assess how information flows between teams and to intervene at critical times in order to improve how work gets done.
– Explore the benefits of supporting organizational networks
– How network analysis can impact company performance
– How to interpret network graphs
– Business applications of ONA for human resources, business processes, and corporate real estate decisions
Digital computers have transformed work in almost every sector of the economy over the past several decades (1). We are now at the beginning of an even larger and more rapid transformation due to recent advances in machine learning (ML), which is capable of accelerating the pace of automation itself. However, although it is clear that ML is a “general purpose technology,” like the steam engine and electricity, which spawns a plethora of additional innovations and capabilities (2), there is no widely shared agreement on the tasks where ML systems excel, and thus little agreement on the specific expected impacts on the workforce and on the economy more broadly. We discuss what we see to be key implications for the workforce, drawing on our rubric of what the current generation of ML systems can and cannot do [see the supplementary materials (SM)]. Although parts of many jobs may be “suitable for ML” (SML), other tasks within these same jobs do not fit the criteria for ML well; hence, effects on employment are more complex than the simple replacement and substitution story emphasized by some. Although economic effects of ML are relatively limited today, and we are not facing the imminent “end of work” as is sometimes proclaimed, the implications for the economy and the workforce going forward are profound.