HireVue claims it uses artificial intelligence to decide who’s best for a job. Outside experts call it ‘profoundly disturbing.’
Amazon admitted this week that it experimented with using machine learning to build a recruitment tool. The trouble is, it didn't exactly produce fantastic results and it was later abandoned.
According to Reuters, Amazon engineers found that besides churning out totally unsuitable candidates, the so-called AI project showed a bias against women.
To Oxford University researcher Dr Sandra Wachter, the news that an artificially intelligent system had taught itself to discriminate against women was nothing new.
Don't simply show your data—tell a story with it!
storytelling with data teaches you the fundamentals of data visualization and how to communicate effectively with data. You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples—ready for immediate application to your next graph or presentation.
Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story.
Specifically, you'll learn how to:
Understand the importance of context
Determine the appropriate type of graph
Recognize and eliminate the clutter
Direct your audience's attention
Think like a designer when visualizing data
Leverage the power of storytelling to help your message resonate with your audience
Together, the lessons in this book will help you turn your data into high-impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data—storytelling with data will give you the skills and power to tell it!
Until that day arrives, Grosz, the Higgins Professor of Natural Sciences at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS), is working to instill in the next generation of computer scientists a mindset that considers the societal impact of their work, and the ethical reasoning and communications skills to do so.
“Ethics permeates the design of almost every computer system or algorithm that’s going out in the world,” Grosz said. “We want to educate our students to think not only about what systems they could build, but whether they should build those systems and how they should design those systems.”
At a time when computer science departments around the country are grappling with how to turn out graduates who understand ethics as well as algorithms, Harvard is taking a novel approach.
This is a book about models. It describes dozens of models in straightforward language and explains how to apply them. Models are formal structures represented in mathematics and diagrams that help us to understand the world. Mastery of models improves your ability to reason, explain, design, communicate, act, predict, and explore.
This book promotes a many-model thinking approach: the application of ensembles of models to make sense of complex phenomena. The core idea is that many-model thinking produces wisdom through a diverse ensemble of logical frames. The various models accentuate different causal forces. Their insights and implications overlap and interweave. By engaging many models as frames, we develop nuanced, deep understandings. The book includes formal arguments to make the case for multiple models along with myriad real-world examples.
The book has a pragmatic focus. Many-model thinking has tremendous practical value. Practice it, and you will better understand complex phenomena. You will reason better. You will exhibit fewer gaps in your reasoning and make more robust decisions in your career, community activities, and personal life. You may even become wise.
Twenty-five years ago, a book of models would have been intended for professors and graduate students studying business, policy, and the social sciences along with financial analysts, actuaries, and members of the intelligence community. These were the people who applied models and, not coincidentally, they were also the people most engaged with large data sets. Today, a book of models has a much larger audience: the vast universe of knowledge workers, who, owing to the rise of big data, now find working with models a part of their daily lives.
Organizing and interpreting data with models has become a core competency for business strategists, urban planners, economists, medical professionals, engineers, actuaries, and environmental scientists among others. Anyone who analyzes data, formulates business strategies, allocates resources, designs products and protocols, or makes hiring decisions encounters models. It follows that mastering the material in this book—particularly the models covering innovation, forecasting, data binning, learning, and market entry timing—will be of practical value to many.
Thinking with models will do more than improve your performance at work. It will make you a better citizen and a more thoughtful contributor to civic life. It will make you more adept at evaluating economic and political events. You will be able to identify flaws in your logic and in that of others. You will learn to identify when you are allowing ideology to supplant reason and have richer, more layered insights into the implications of policy initiatives, whether they be proposed greenbelts or mandatory drug tests.
These benefits will accrue from an engagement with a variety of models—not hundreds, but a few dozen. The models in this book offer a good starting collection. They come from multiple disciplines and include the Prisoners’ Dilemma, the Race to the Bottom, and the SIR model of disease transmission. All of these models share a common form: they assume a set of entities—often people or organizations—and describe how they interact.
The models we cover fall into three classes: simplifications of the world, mathematical analogies, and exploratory, artificial constructs. In whatever form, a model must be tractable. It must be simple enough that within it we can apply logic. For example, we cover a model of communicable diseases that consists of infected, susceptible, and recovered people that assumes a rate of contagion. Using the model we can derive a contagion threshold, a tipping point, above which the disease spreads. We can also determine the proportion of people we must vaccinate to stop the disease from spreading.
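The disease example above can be made concrete with a minimal sketch. It assumes the textbook SIR formulation, which may differ in notation from the book's own: a transmission rate `beta` and a recovery rate `gamma` (both names and all numeric values are illustrative assumptions), a basic reproduction number R0 = beta / gamma, a tipping point at R0 = 1, and a perfectly effective vaccine, so that the proportion to vaccinate is 1 - 1/R0.

```python
"""Toy SIR (susceptible, infected, recovered) model; all parameters are illustrative."""


def basic_reproduction_number(beta: float, gamma: float) -> float:
    """Expected new infections caused by one case in a fully susceptible population."""
    if beta < 0 or gamma <= 0:
        raise ValueError("beta must be non-negative and gamma must be positive")
    return beta / gamma


def vaccination_threshold(r0: float) -> float:
    """Fraction to vaccinate so that each case infects no more than one other person."""
    if r0 <= 1.0:
        return 0.0  # below the tipping point the disease dies out unaided
    return 1.0 - 1.0 / r0


def peak_infected(beta, gamma, vaccinated=0.0, initial_infected=0.001,
                  steps=2000, dt=0.1):
    """Euler-integrate the SIR equations and return the largest infected fraction."""
    s = 1.0 - vaccinated - initial_infected  # susceptible fraction
    i = initial_infected                     # infected fraction
    peak = i
    for _ in range(steps):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak


if __name__ == "__main__":
    r0 = basic_reproduction_number(beta=0.5, gamma=0.25)
    print("R0:", r0)
    print("vaccinate at least:", vaccination_threshold(r0))
    print("peak infected, no vaccination:", peak_infected(0.5, 0.25))
    print("peak infected, 60% vaccinated:", peak_infected(0.5, 0.25, vaccinated=0.6))
```

With these assumed rates, R0 is 2, so vaccinating more than half the population keeps the infected fraction from ever growing beyond its starting value, while with no vaccination the outbreak grows well past it.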
As powerful as single models can be, a collection of models accomplishes even more. With many models, we avoid the narrowness inherent in each individual model. A many-models approach illuminates each component model’s blind spots. Policy choices made based on single models may ignore important features of the world such as income disparity, identity diversity, and interdependencies with other systems. With many models, we build logical understandings of multiple processes. We see how causal processes overlap and interact. We create the possibility of making sense of the complexity that characterizes our economic, political, and social worlds. And, we do so without abandoning rigor: model thinking ensures logical coherence. That logic can then be grounded in evidence by taking models to data to test, refine, and improve them. In sum, when our thinking is informed by diverse, logically consistent, empirically validated frames, we are more likely to make wise choices.
ON MARCH 18, 2018, at around 10 P.M., Elaine Herzberg was wheeling her bicycle across a street in Tempe, Arizona, when she was struck and killed by a self-driving car. Although there was a human operator behind the wheel, an autonomous system—artificial intelligence—was in full control. This incident, like others involving interactions between people and AI technologies, raises a host of ethical and proto-legal questions. What moral obligations did the system’s programmers have to prevent their creation from taking a human life? And who was responsible for Herzberg’s death? The person in the driver’s seat? The company testing the car’s capabilities? The designers of the AI system, or even the manufacturers of its onboard sensory equipment?
“Artificial intelligence” refers to systems that can be designed to take cues from their environment and, based on those inputs, proceed to solve problems, assess risks, make predictions, and take actions. In the era predating powerful computers and big data, such systems were programmed by humans and followed rules of human invention, but advances in technology have led to the development of new approaches. One of these is machine learning, now the most active area of AI, in which statistical methods allow a system to “learn” from data, and make decisions, without being explicitly programmed. Such systems pair an algorithm, or series of steps for solving a problem, with a knowledge base or stream—the information that the algorithm uses to construct a model of the world.
Ethical concerns about these advances focus at one extreme on the use of AI in deadly military drones, or on the risk that AI could take down global financial systems. Closer to home, AI has spurred anxiety about unemployment, as autonomous systems threaten to replace millions of truck drivers, and make Lyft and Uber obsolete. And beyond these larger social and economic considerations, data scientists have real concerns about bias, about ethical implementations of the technology, and about the nature of interactions between AI systems and humans if these systems are to be deployed properly and fairly in even the most mundane applications.
Consider a prosaic-seeming social change: machines are already being given the power to make life-altering, everyday decisions about people. Artificial intelligence can aggregate and assess vast quantities of data that are sometimes beyond human capacity to analyze unaided, thereby enabling AI to make hiring recommendations, determine in seconds the creditworthiness of loan applicants, and predict the chances that criminals will re-offend.
But such applications raise troubling ethical issues because AI systems can reinforce what they have learned from real-world data, even amplifying familiar risks, such as racial or gender bias. Systems can also make errors of judgment when confronted with unfamiliar scenarios. And because many such systems are “black boxes,” the reasons for their decisions are not easily accessed or understood by humans—and therefore difficult to question, or probe.
One day this fall, Ashutosh Garg, the chief executive of a recruiting service called Eightfold.ai, turned up a résumé that piqued his interest.
It belonged to a prospective data scientist, someone who unearths patterns in data to help businesses make decisions, like how to target ads. But curiously, the résumé featured the term “data science” nowhere.
Instead, the résumé belonged to an analyst at Barclays who had done graduate work in physics at the University of California, Los Angeles. Though his profile on the social network LinkedIn indicated that he had never worked as a data scientist, Eightfold’s software flagged him as a good fit. He was similar in certain key ways, like his math and computer chops, to four actual data scientists whom Mr. Garg had instructed the software to consider as a model.
The idea is not to focus on job titles, but “what skills they have,” Mr. Garg said. “You’re really looking for people who have not done it, but can do it.”
An algorithm that was being tested as a recruitment tool by online giant Amazon was sexist and had to be scrapped, according to a Reuters report. The artificial intelligence system was trained on data submitted by applicants over a 10-year period, much of which came from men, it claimed.
Reuters was told by members of the team working on it that the system effectively taught itself that male candidates were preferable. Amazon has not responded to the claims.
Reuters spoke to five members of the team who developed the machine learning tool in 2014, none of whom wanted to be publicly named. They told Reuters that the system was intended to review job applications and give candidates a score ranging from one to five stars.
"They literally wanted it to be an engine where I'm going to give you 100 resumes, it will spit out the top five, and we'll hire those," said one of the engineers who spoke to Reuters.
In today's world, scientists in many disciplines and a growing number of journalists live and breathe data. There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.
Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher's site, a digital library, or an author's personal web page. To create Dataset Search, we developed guidelines for dataset providers to describe their data in a way that helps Google (and other search engines) better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.
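As a rough illustration of the kind of description mentioned above, the sketch below builds a schema.org `Dataset` record as JSON-LD. The property names (`name`, `description`, `creator`, `datePublished`, `measurementTechnique`, `license`) come from the schema.org vocabulary; the helper function `build_dataset_jsonld` and every value passed to it are hypothetical, and the guidelines themselves may require or recommend additional fields.

```python
import json


def build_dataset_jsonld(name, description, creator_name, date_published,
                         measurement_technique, license_url):
    """Return a JSON-LD string describing a dataset with schema.org terms.

    The helper and its arguments are illustrative; only the property
    names are taken from the schema.org Dataset vocabulary.
    """
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,                                   # what the dataset is
        "description": description,
        "creator": {"@type": "Person", "name": creator_name},  # who created it
        "datePublished": date_published,                # when it was published
        "measurementTechnique": measurement_technique,  # how it was collected
        "license": license_url,                         # terms of use
    }
    return json.dumps(record, indent=2)


if __name__ == "__main__":
    # All values below are made up for the example.
    print(build_dataset_jsonld(
        name="Example city bike counts",
        description="Hourly bicycle counts at ten hypothetical intersections.",
        creator_name="Jane Example",
        date_published="2018-09-05",
        measurement_technique="Automated roadside counters",
        license_url="https://creativecommons.org/licenses/by/4.0/",
    ))
```

A provider would typically embed the resulting JSON in the dataset's web page inside a `script` element of type `application/ld+json`, so that crawlers can read it alongside the human-readable content.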
In machine learning and deep learning we can’t do anything without data. So the people that create datasets for us to train our models are the (often under-appreciated) heroes. Some of the most useful and important datasets are those that become important “academic baselines”; that is, datasets that are widely studied by researchers and used to compare algorithmic changes. Some of these become household names (at least, among households that train models!), such as MNIST, CIFAR-10, and ImageNet.
We all owe a debt of gratitude to those kind folks who have made datasets available for the research community. So fast.ai and the AWS Public Dataset Program have teamed up to try to give back a little: we’ve made some of the most important of these datasets available in a single place, using standard formats, on reliable and fast infrastructure. For a full list and links see the fast.ai datasets page.
fast.ai uses these datasets in the Deep Learning for Coders courses, because they provide great examples of the kind of data that students are likely to encounter, and the academic literature has many examples of model results using these datasets which students can compare their work to. If you use any of these datasets in your research, please show your gratitude by citing the original paper (we’ve provided the appropriate citation link below for each), and if you use them as part of a commercial or educational project, consider adding a note of thanks and a link to the dataset.
Do you feel as if there’s always someone watching you at work?
You might be right: the way companies monitor employees has broadened beyond simply requiring workers to tap in and out of an office building. Advances in technology and a hunger for data have now created a market for devices that can measure workers’ movements, fitness and even sleep – all in the name of productivity.
Take Humanyze, for example, a start-up based in Boston, Massachusetts, which supplies companies with employee ID badges replete with inbuilt biometric measuring capabilities.
A plethora of tech within the badges tracks everything from movements and interactions around the office, to lengths of conversations, and even voice tone. CEO Ben Waber told Techworld earlier this year that microphones within the badges can process vocal information to detect whether a person dominates conversations, as well as tone, volume and speed of speech.
With these, the company aims to change the traditional role of management consultants in the workplace. According to Humanyze, these “people analytics” can help measure everything from how often workers are disrupted, to the effectiveness of diversity and inclusion programmes.
In order to model students' happiness, we apply machine learning methods to data collected from undergrad students monitored over the course of one month each. The data collected include physiological signals, location, smartphone logs, and survey responses to behavioral questions. Each day, participants reported their wellbeing on measures including stress, health, and happiness. Because of the relationship between happiness and depression, modeling happiness may help us to detect individuals who are at risk of depression and guide interventions to help them. We are also interested in how behavioral factors (such as sleep and social activity) affect happiness positively and negatively. A variety of machine learning and feature selection techniques are compared, including Gaussian Mixture Models and ensemble classification. We achieve 70% classification accuracy of self-reported happiness on held-out test data.