Blame bad incentives for bad science

Most of us spend our careers trying to meet — and hopefully exceed — expectations. Scientists do too. But the requirements for success in a job in academic science don’t always line up with the best scientific methods. The net result? Bad science doesn’t just happen — it gets selected for.

What does it mean to be successful in science? A scientist gets a job and funding by publishing a lot of high-impact papers with novel findings. Those papers and findings beget awards and funding to do more science — and publish more papers. “The problem that we face is that the incentive system is focused almost entirely on getting research published, rather than on getting research right,” says Brian Nosek, a psychologist at the University of Virginia in Charlottesville.

This idea of success has become so ingrained that when scientists give talks, they are even introduced by the number of papers they have published or the amount of grant funding they have, says Marc Edwards, a civil engineer at Virginia Polytechnic Institute and State University in Blacksburg.

But rewarding researchers for the number of papers they publish results in a “natural selection” of sloppy science, new research shows. Equating scientific “success” with the number of publications promotes not just lazy science but also unethical science, another paper argues. Both articles proclaim that it’s time for a culture shift. But with many scientific labs to fund and little money to do it, what does a new, better scientific enterprise look like?

As young scientists apply for tenure-track academic jobs, they may bring an application filled with tens to dozens of papers. Hiring committees can often no longer read or evaluate all of them. So they may come to use numbers as shorthand — numbers of papers published, how many times those papers have been cited and whether the journals the papers are published in are high-impact. “Real evaluation of scientific quality is as hard as doing the science in the first place,” Nosek says. “So, just like everyone else, scientists use heuristics to evaluate each other’s work when they don’t have time to dig into it for a complete evaluation.”

Too much reliance on the numbers means that scientists can — unintentionally or not — game the system. They can publish novel results from experiments with low power and effort. Those novel results inflate publication numbers, increase grant funding and get the scientist a job. Ideally, other scientists would catch this careless behavior in peer review, before the studies are published, weeding out poorly done studies in favor of strong ones. But Paul Smaldino, a cognitive scientist at the University of California, Merced, suspected that when the scientific idea of “meeting expectations” on the job is measured in publication rates, bad science would always win out.

So Smaldino and his colleague Richard McElreath at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, decided to create a computer simulation of the scientific “ecosystem,” based on a model for natural selection in a biological ecosystem. Each “lab” in the simulation was represented by a number. Those labs that best met the parameters for success survived and reproduced, spawning other labs that behaved in the same way. Those labs that didn’t meet expectations “died out.”

The model allowed Smaldino and McElreath to manipulate the definitions of “success.” And when that success was defined as publishing a lot of novel findings, labs succeeded when they did science that was “low effort,” meaning sloppy and probably irreproducible. Research groups doing high-effort, careful science didn’t publish enough. And they went the way of the dinosaurs.
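To make the selection idea concrete, here is a toy sketch in Python. It is not Smaldino and McElreath’s actual model. The effort scale of 1 to 100, the rule that papers are inversely proportional to effort, and the mutation size are all assumptions chosen only for illustration.

```python
import random


def simulate(n_labs=100, generations=200, seed=0):
    """Toy selection model: labs that publish more are copied more often.

    Returns the average effort across labs for each generation.
    All parameters here are hypothetical, not taken from the published study.
    """
    rng = random.Random(seed)
    # Each lab is a single number: its effort, from 1 (sloppy) to 100 (careful).
    efforts = [rng.uniform(1, 100) for _ in range(n_labs)]
    history = [sum(efforts) / n_labs]
    for _ in range(generations):
        # Assumption: the less effort a lab spends per study, the more papers it produces.
        papers = [100.0 / e for e in efforts]
        # Labs "reproduce" in proportion to their publication counts.
        parents = rng.choices(efforts, weights=papers, k=n_labs)
        # Offspring labs inherit the parent's effort with a small random change.
        efforts = [min(100.0, max(1.0, e + rng.gauss(0, 1))) for e in parents]
        history.append(sum(efforts) / n_labs)
    return history


history = simulate()
print("average effort at start:", round(history[0], 1))
print("average effort at end:  ", round(history[-1], 1))
```

When publication count is the only thing that counts as success, average effort in this sketch drifts downward over the generations, which is the qualitative pattern the article describes.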

Even putting an emphasis on replication — in which labs got half credit for double-checking the findings of other groups — couldn’t save the system. “That was a surprise for us,” Smaldino says. He assumed that if the low-effort labs got caught by failures to replicate, their success would go down. But scientists can’t replicate every single study, and in the simulation, the lazy labs still thrived. “The most successful are still going to low effort,” he explains, “because not everyone gets caught.” Smaldino and McElreath published their findings September 21 in Royal Society Open Science.
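The “not everyone gets caught” logic can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not the published model. The paper counts, false-finding rates, replication rate and penalty are hypothetical numbers picked only to show the shape of the argument.

```python
def expected_score(papers, false_rate, replication_rate, penalty):
    """Expected payoff for a lab under a hypothetical scoring rule.

    A lab earns one point per paper and loses `penalty` points for each
    false finding that someone happens to replicate (and that then fails).
    """
    caught = papers * false_rate * replication_rate
    return papers - penalty * caught


# Hypothetical labs: the careful lab publishes less but is rarely wrong.
careful = expected_score(papers=5, false_rate=0.05, replication_rate=0.1, penalty=5)
sloppy = expected_score(papers=20, false_rate=0.5, replication_rate=0.1, penalty=5)
print("careful lab:", careful)
print("sloppy lab: ", sloppy)
```

With only one study in ten being replicated, the sloppy lab still comes out ahead in this sketch. Under these made-up numbers, the ranking flips only when the replication rate gets close to every study being checked, which the article notes scientists cannot do.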

“I think the results they get are probably reasonable,” says John Ioannidis, a methods researcher at Stanford University in California. “Once you have bad practices they can propagate and ruin the scientific process and become dominant. And I think there’s some truth to it, unfortunately.”

The publish-or-perish culture may be having negative consequences already, Edwards says. “I’ve … seen ethical researchers leave academia, not enter in the first place or become unethical,” he says. Scientists might slice their research findings thinner, trying to publish more findings with less data, breaking experiments down to the least publishable unit. That in itself is not unethical, but Edwards worries the high stakes place scientists on the edge of a slippery slope, from least publishable units to sliced-and-diced datasets. “With the wrong incentives you can make anyone behave unethically, and academia is no different.”

Using a theoretical model of his own, Edwards and his colleague Siddhartha Roy show that, at some point, the current academic system could lead a critical mass of scientists to cross the line to unethical behavior, corrupting the scientific enterprise and losing the public’s trust. “If we ever reach this tipping point where our institutions become inherently corrupt, it will have devastating consequences for humanity,” Edwards says. “The fate of the world depends as never before on good trustworthy science to solve our problems. Will we be there?” Edwards and Roy report their model September 22 in Environmental Engineering Science.

To stay away from the slippery slope, scientists will need to change what scientific success looks like. Here’s the rub, though. Scientists are the primary people watching scientists work. When papers go through peer review at scientific journals, when ideas get examined in peer-review committees for grant funding or when a scientist is being considered for an academic job, it’s other scientists who are guarding those gates to scientific success. A single scientist might be publishing papers, peer-reviewing other people’s papers, submitting grants, serving on review committees for other people’s grants, editing a journal, applying for a job and serving on a hiring committee, all at the same time. And so the standards for scientific integrity, for rigorous methods, do not reside with the institutions or the funders or the journals. Those standards are within the scientists themselves. The inmates really do run the scientific asylum.

This is not an inherently bad thing. Science needs people with appropriate expertise to read the highly specialized stuff. But it does mean that a movement for culture change needs to come from within the scientific enterprise itself. “This is more likely to happen if you have a grassroots movement where lots of scientists are convinced and are used to performing research in a given way, leading to more reliable results,” Ioannidis says.

What produces more reliable research, though, still requires … research. “I think these are questions that could be addressed with scientific studies,” Ioannidis says. “This is where I’m interested in taking the research, to get studies that are telling us to [do science] this way, [or] this type of leadership is better…. You can test policies.” Science needs more studies of science.

The first step is admitting that problems exist in the current structure. “We’re bought into it — we invested our whole career into the game as it exists,” Edwards says. “We are taught to be cowards when it comes to addressing these issues, because the personal and professional costs of revealing these problems is so high.” It can be painful to see sloppy science exposed. Especially when that science is performed by colleagues and friends. But Edwards says fixing the system will be worth the pain. “I don’t want to wake up someday and realize I’m in a culture akin to professional cycling, where you have to cheat to compete.”

The solution is to add incentives for having an excellent research process, regardless of outcome, Nosek says. Scientists need to be rewarded, funded and promoted for careful, thorough research — even if it doesn’t produce huge differences and groundbreaking results. Nosek points to ideas like registered reports. These are systems where scientists report their experimental plans and methods to a journal, and the journal accepts the paper — whether or not the research produces any noteworthy results.

Despite his results, Smaldino is optimistic that incentives can change, allowing the best science to rise to the top. “I think science is great,” he says. “I think in general scientists aren’t bad scheming people.” The dire predictions of the models don’t have to come to pass. “This is not a condemnation of science,” Smaldino says. “I love science — there’s no other way to learn a lot of things that are important to learn about the world. But the science we do can always be better.”

Stone adze points to ancient burial rituals in Ireland

A stone chopping tool found in Ireland’s earliest known human burial offers a rare peek at hunter-gatherers’ beliefs about death more than 9,000 years ago, researchers say.

The curved-edge implement, known as an adze, was made to be used at a ceremony in which an adult’s largely cremated remains were interred in a pit, says a team led by archaeologist Aimée Little of the University of York in England. Previous radiocarbon dating of burned wood and a bone fragment from the pit, at a site called Hermitage near the River Shannon, places the material at between 9,546 and 9,336 years old.

A new microscopic analysis revealed a small number of wear marks on the sharpened edge of the still highly polished adze, which was probably attached to a wooden handle, the researchers report online October 20 in the Cambridge Archaeological Journal. Little’s group suspects someone wielded the 19.4-centimeter-long adze to chop wood for a funeral pyre or to fell a tree for a grave marker. A hole dug into the bottom of the riverside pit once held a tall wooden post indicating that a person lay buried there, the scientists suspect.

Once the adze fulfilled its ritual duties, a hard stone was ground across the tool’s sharp edge to render it dull and useless, further microscopic study suggests. The researchers regard this act as a symbolic killing of the adze. The dulled tool blade was then placed in the pit, next to the post grave marker, perhaps to accompany the cremated individual to the afterlife.

“By 9,000 years ago, people in Ireland were making very high quality artifacts specifically to be placed in graves, giving us a tantalizing glimpse of ancient belief systems concerning death and the afterlife,” Little says. Her conclusion challenges a popular assumption among researchers that stone tools found in ancient hunter-gatherers’ graves belonged to the deceased while they were still alive. In that scenario, tools and other grave items played no role in burial activities and rituals.

Archaeologist Erik Brinch Petersen of the University of Copenhagen is skeptical. No other European stone adzes or axes from around 10,000 to 6,000 years ago display blunted edges, Petersen says. That makes it difficult to say how such an unusual artifact was used or whether it was intended to accompany a cremated person to the afterlife. In addition, researchers have found only a few European cremations from the same time period.

Since there was no practical reason to turn an effective tool into a chunk of stone that couldn’t cut, Little responds, intentionally dulling the adze’s edge was likely a ritual act. Whatever the meaning, people in Ireland made polished stone tools several thousand years before such implements achieved widespread use in Europe with the arrival of agriculture, Little says.

Excavations in 2001 revealed the Hermitage burial pit. Two small stone tools lay near the polished adze. A couple more burial pits turned up nearby. One contained cremated remains of an adult human from around 9,000 years ago; the other held roughly 8,600-year-old cremated remnants too fragmentary to enable a species identification.

“Hermitage was a special place known about and returned to over hundreds of years,” Little says.

Sounds and glowing screens impair mouse brains

SAN DIEGO — Mice raised in cages bombarded with glowing lights and sounds have profound brain abnormalities and behavioral trouble. Hours of daily stimulation led to behaviors reminiscent of attention-deficit/hyperactivity disorder, scientists reported November 14 at the annual meeting of the Society for Neuroscience.

Certain kinds of sensory stimulation, such as sights and sounds, are known to help the brain develop correctly. But scientists from Seattle Children’s Research Institute wondered whether too much stimulation or stimulation of the wrong sort could have negative effects on the growing brain.

To mimic extreme screen exposure, mice were blasted with flashing lights and TV audio for six hours a day. The cacophony began when the mice were 10 days old and lasted for six weeks. After the end of the ordeal, scientists examined the mice’s brains.

“We found dramatic changes everywhere in the brain,” said study coauthor Jan-Marino Ramirez. Mice that had been stimulated had fewer newborn nerve cells in the hippocampus, a brain structure important for learning and memory, than unstimulated mice, Ramirez said. The stimulation also made certain nerve cells more active in general.

Stimulated mice also displayed behaviors similar to some associated with ADHD in children. These mice were noticeably more active and had trouble remembering whether they had encountered an object. The mice also seemed more inclined to take risks, venturing into open areas that mice normally shy away from, for instance.

Some of these results have been reported previously by the Seattle researchers, who have now replicated the findings in a different group of mice. Ramirez and colleagues are extending the work by looking for more detailed behavioral changes.

For instance, preliminary tests have revealed that the mice are impatient and have trouble waiting for rewards. When given a choice between a long wait for a good reward of four food pellets and a short wait for one pellet, stimulated mice were more likely to go for the instant gratification than non-stimulated mice, particularly as wait times increased.

Overstimulation didn’t have the same effects on adult mice, a result that suggests the stimulation had a big influence on the developing (but not fully formed) brain.

If massive amounts of audio and visual stimulation do harm the growing brain, parents need to ponder how their children should interact with screens. So far, though, the research is too preliminary to change guidelines (SN Online: 10/23/16).

“We are not in a position where we can give parents advice,” said neuroscientist Gina Turrigiano of Brandeis University in Waltham, Mass. The results are from mice, not children. “There are always issues in translating research from mice to people,” Turrigiano said.

What’s more, early sensory input may not affect all children the same way. “Each kid will respond very, very differently,” Turrigiano said. Those different responses might be behind why some children are more vulnerable to ADHD.

There’s still much scientists don’t understand about how sensory input early in life wires the brain. It’s possible that what seems like excessive sensory stimulation early in life might actually be a good thing for some children, sculpting brains in a way that makes them better at interacting with the fast-paced technological world, said Leah Krubitzer of the University of California, Davis. “This overstimulation might be adaptive,” she said. “The benefits may outweigh the deficits.”

Dogs form memories of experiences

Dogs don’t miss much. After watching a human do a trick, dogs remembered the tricks well enough to copy them perfectly a minute later, a new study finds. The results suggest that our furry friends possess some version of episodic memory, which allows them to recall personal experiences, and not just simple associations between, for instance, sitting and getting a treat.

Pet dogs watched a human do something — climb on a chair, look inside a bucket or touch an umbrella. Either a minute or an hour later, the dog was unexpectedly asked to copy the behavior with a “Do it!” command, an imitation that the dogs had already been trained to do. In many cases, dogs were able to obey these surprise commands, particularly after just a minute. Dogs didn’t perform as well when they had to wait an hour for the test, suggesting that the memories grew hazier with time.

Like people, dogs seem to form memories about their experiences all the time, even when they don’t expect to have to use those memories later, study coauthor Claudia Fugazza of Eötvös Loránd University in Budapest and colleagues write November 23 in Current Biology.

First signs of boron on Mars hint at past groundwater, habitability

SAN FRANCISCO — A new element has been found in Mars’ chemical arsenal.

While sampling rocks from Gale Crater, the Curiosity rover detected boron concentrations of about 10 to 100 parts per billion. The discovery is the first find of boron on the Red Planet and hints that the Martian subsurface may have once been habitable for microbes, scientists reported December 13 at the American Geophysical Union’s fall meeting.

The boron was discovered in veins of calcium sulfate. Such features on Earth indicate that nonacidic groundwater with a temperature of around zero to 60° Celsius once flowed through the area — conditions favorable to microbial life. As groundwater evaporates, boron and calcium sulfate are left behind.

How this process unfolded on Mars is uncertain, the researchers said, though they expect more clues to be uncovered as Curiosity continues its trek.

Antimatter hydrogen passes symmetry test

An antimatter atom abides by the same rules as its matter look-alike. Scientists studying antihydrogen have found that the energy needed to bump the atoms into an excited, or high-energy, state is the same as for normal hydrogen atoms.

Scientists at the European particle physics lab CERN in Geneva created antihydrogen atoms by combining antiprotons and positrons, the electron’s antiparticle. Hitting the resulting atoms with a laser tuned to a particular frequency of light boosted the antihydrogen atoms to a higher energy. The frequency of laser light needed to induce this transition was the same as that needed for normal hydrogen atoms, indicating that the energy jump was the same, scientists from the ALPHA-2 experiment report December 19 in Nature.

Antihydrogen’s similarity to hydrogen conforms to a principle known as charge-parity-time, or CPT, symmetry — the idea that the laws of physics would be unchanged if the universe were reflected in a mirror, time reversed, and particles swapped with antiparticles. So far scientists have never discovered a situation where this symmetry doesn’t hold up, but antihydrogen provides a precise way to check for subtle breakdowns in the rule.

Differences between matter and antimatter are essential for the existence of the universe as we know it: The Big Bang produced equal amounts of matter and antimatter, yet somehow antimatter became very rare. So scientists are still on the lookout for any unexpected behavior from antimatter.

Some pulsars lose their steady beat

GRAPEVINE, TEXAS — A pair of cosmic radio beacons known as pulsars keep switching off and on, suggesting that there might be vast numbers of undiscovered pulsars hiding in our galaxy.

Pulsars are rapidly spinning neutron stars, the ultradense cores left behind after massive stars explode. Neutron stars are like lighthouses, sweeping a beam of radio waves around the sky. Astronomers see them as steady pulses of radio energy.

But at least two in the Milky Way seem to spend most of their time turned off, Victoria Kaspi, an astrophysicist at McGill University in Montreal, reported January 4 at a meeting of the American Astronomical Society. One, first detected at Arecibo Observatory in Puerto Rico in November 2011, only pulses about 30 percent of the time. Another, also discovered at Arecibo, laid down a steady beat just 0.8 percent of the time when observed in 2013 and 2015. Then starting in August 2015, it abruptly jumped to being on 16 percent of the time for several months.

When sending out pulses, the pulsars seem to behave like any other pulsar, Kaspi said. “You wouldn’t know that they have this dual personality.” Researchers don’t yet know why some pulsars behave this way. But Kaspi said that it’s probably tied to changes in their magnetic fields, which astronomers think help control the radio beacons.

These two intermittent pulsars join three others that had been previously observed. Given that most spend much of their time off, Kaspi said, astronomers might be missing a large population of pulsars in the Milky Way.

Heart-hugging robot does the twist (and squeeze)

A new squishy robot could keep hearts from skipping a beat.

A silicone sleeve slipped over pigs’ hearts helped pump blood when the hearts failed, researchers report January 18 in Science Translational Medicine. If the sleeve works in humans, it could potentially keep weak hearts pumping, and buy time for patients waiting for a transplant.

To make the device contract, biomedical engineer Ellen Roche and colleagues lined it with two sets of narrow tubes. One set encircles the sleeve, like bracelets; the other runs from top to bottom. When air pumps through the tubes, the sleeve compresses (like a clenched fist) and twists (like wrung-out laundry). Those actions mimic how the layers of the heart contract.

Researchers programmed the sleeve to sync with the heart’s motion. And like a healthy heart, the robot sleeve’s double squeeze gets blood moving.

Roche’s team, which did the work while she was at Harvard University, triggered heart failure in six pigs and then measured the volume of blood pumped by the heart with and without the sleeve’s help. Heart failure cut the volume roughly in half, to about 1 liter of blood per minute. But the sleeve restored the pumped volume to about 2½ liters per minute — just about normal, Roche, now at National University of Ireland, Galway, and colleagues report.

Big genetics study blazes path for bringing back tomato flavor

An analysis of nearly 400 kinds of tomatoes suggests which flavor compounds could bring heirloom deliciousness back to varieties that were bred for toughness over taste.

About 30 compounds are important in creating a full-bodied tomato flavor, says study coauthor Harry Klee of the University of Florida in Gainesville. He and colleagues have identified 13 important molecules that have dwindled away in many mass-market varieties. Some of the flavor compounds deliver such a thrill to the human sensory system that even a modest increase could make a big difference, the researchers report January 26 in Science.

“I think this will definitely help,” says Alisdair Fernie, who was not part of the study but has studied tomato chemistry at the Max Planck Institute of Molecular Plant Physiology in Potsdam, Germany. “Taste is incredibly complex,” he says, so creating more appealing commercial varieties “for certain, requires a holistic approach.”

To achieve that holistic view, the researchers teamed up with geneticists at China’s Agricultural Genomics Institute in Shenzhen, who determined the full genetic makeup of a whopping 398 kinds of tomatoes, wild as well as heirloom and commercial. The scientists ran 96 varieties of tomatoes through taste-testing panels, looking for genetic and chemical similarities among those varieties ranked tastiest.

Much of what makes some tomatoes taste better is actually smell, Klee points out. Tongues can detect relatively few qualities, such as sweetness, acidity and softness. Chemical detectors in the nasal passages are far more varied and sensitive. So what really puts the “Mmmm” into a tomato is the whoosh of air forced up into the nasal passages as someone swallows. Airborne compounds, known as volatiles, are abundant in tomatoes, and Klee looks to them for flavor magic.

Of these volatile compounds, some appear in even the tastiest tomatoes at minuscule levels — only parts per trillion. But human senses respond so strongly to the odors that a little bit goes a long way. Tomatoes should taste noticeably better if researchers can breed just four or five heirloom versions of volatile-producing genes back into commercial varieties, Klee says.

Increasing the sweetness of today’s tomatoes, on the other hand, may be tougher. About 80 percent of the sugar in commercial tomatoes comes from the leaves and is transferred to the big red globes as they mature (SN: 7/28/12, p. 18). Because breeders have done such a great job of maximizing the number of fruits on a plant, the plants would need lots of leaves to sweeten them all. So the price of sweeter tomatoes would be making them smaller, and fewer.

“Now we come to the real crux of the problem,” Klee says. “I have to fix the flavor, but I can’t compromise all of the stuff that breeders have done to the modern tomatoes to make them healthier, more productive, more disease resistant and more shippable.”

And let’s not forget about what happens to tomatoes after they’re picked, says Ann Powell, who studied tomato ripening and disease resistance at the University of California, Davis and is now at the National Science Foundation. Cooling weakens flavor, as cooks who shriek at the horror of storing tomatoes in refrigerators have long known. Therefore, Powell says, another study of Klee’s from 2016 — on how chilling can turn on and off genes — makes an important companion to the new work. A combination of breeding better plants and coddling them strategically may be the way forward for tastier tomatoes.

DNA points to millennia of stability in East Asian hunter-fisher population

In a remote corner of eastern Russia, where long winters bring temperatures that rarely flicker above freezing, the genetic legacy of ancient hunter-gatherers endures.

DNA from the 7,700-year-old remains of two women is surprisingly similar to that of people living in that area today, researchers report February 1 in Science Advances. That finding suggests that at least some people in East Asia haven’t changed much genetically over the last 8,000 years or so, a time when other parts of the world saw waves of migrants settle in.

“The continuity is remarkable,” says paleogeneticist Carles Lalueza-Fox of the Institute of Evolutionary Biology in Barcelona, who was not involved with the work. “It’s a big contrast to what has been found in Europe.”

In Western Europe especially, scientists studying ancient DNA have put together a picture of flux, says study coauthor Andrea Manica. “Every few thousand years, there are major turnovers of people.” Around 8,000 years ago, he says, migrating farmers replaced hunter-gatherers in the area. And a few thousand years after that, Bronze Age migrants from Central Asia swept in.

In DNA collected from the bones and teeth of these ancient peoples, scientists can spot genetic signatures of different populations. When a population of farmers balloons, Lalueza-Fox says, the signatures of hunter-gatherers are mostly erased.

But whether that’s true across the globe is unclear, says Manica, of the University of Cambridge. “We wanted to see what happened in other places…. Asia is huge compared to Europe, and it’s been neglected.”

Manica’s team collected DNA from the skeletons of five ancient people found in a cave called Devil’s Gate. The cave rests in a far east finger of Russia, tucked along the border of China and North Korea, and holds human remains, scraps of textiles and bits of broken pottery.

Researchers gathered enough DNA from two of the people to piece together about 6 percent of the genome, the complete set of genetic instructions inside a cell’s nucleus. That’s not much, Manica says, but it’s enough to compare the Devil’s Gate denizens with other people. The researchers analyzed the genomes of people strewn across the far reaches of the continent — from the Dolgan in Siberia to the Thai thousands of kilometers south.

Genetically, the 7,700-year-old women closely resembled the Ulchi, a small group of hunter-fishers who still live off the land today. Manica can’t say whether the Ulchi are direct descendants of the two Devil’s Gate women, or just closely related. But the find suggests a pocket of stability in East Asia — a place where hunter-gatherers weren’t swept out by, or folded into, booming groups of farmers.

Perhaps farming didn’t take off there because the cold climate wasn’t good for growing crops, Manica says. Or maybe the ideas and technologies from farmers and other migrants made it to the Ulchi without an accompanying influx of people. (The Ulchi aren’t like primitive hunter-gatherers of the past. They farm a bit, and have adopted new ways to fish, hunt and store food, he points out.)

“This shows that ideas can travel without people moving with them,” Manica says.

That makes sense, Lalueza-Fox says. But scientists now need more data — additional samples from East Asia, and Southeast Asia, too, he says. “I have a feeling the whole story will be much more complicated.”