Research I'd Like to See
I have a backlog of half-baked ideas for potentially interesting research projects, and it doesn’t make much sense to keep them to myself. In part, I’ve written some of these ideas down as a kind of weak public commitment to write about them properly. But mostly, I hope this ends up being useful for others. I encourage others to share research ideas more — half-baked research ideas are not a precious commodity in fixed supply; it’s the doing something with them that’s valuable. And I think there is an unusually big ‘idea overhang’ in effective altruism and global priorities research. If you have a small garden but a windfall of seeds, (sidenote: Here are some other examples, from Logan Graham, Gwern, Rose Hadshar, Alexey Guzey, and Patrick Collison. Here is a doc and a spreadsheet compiled by Edo Arad. Michael Aird has a very impressive ‘central directory for open research questions’ which you can browse here. Effective thesis also has a great database of research ideas.)
Feel free to ask me for more specifics about something. I also expect I’ll occasionally revisit this and add things.
- An intergovernmental panel for existential risks — I think the IPCC is a very interesting example of how to translate research into policy, through the coordinating mechanism of the UN. For a long time I was wondering what a similar model could look like for catastrophic or existential risks, or perhaps specific kinds of risk (e.g. from powerful AI). A research project could look at how the IPCC was so successful, and how this might be transferred to the risk context.
- If that project ends up saying useful things about building a successful parallel to the IPCC for risks, then this could be relevant for the UN Secretary General’s recent proposals for (i) a Futures Lab for futures impact assessments and “regularly reporting on megatrends and catastrophic risks”, and (ii) a ‘Declaration on Future Generations’. If so, the research could be submitted to the UN’s recently proposed 2023 Summit of the Future. (See this post for more info)
- On the economics side, it would be good for someone to get clear on the idea of an ‘(existential) risk budget’. There is a sense in which (existential) risk is the ultimate nonrenewable resource — we cannot somehow undo or compensate for periods of risk, in the sense that we can regrow trees after cutting them down. But it would be good to precisify this.
- More broadly, when we think about long timescales, what should the idea of ‘sustainability’ actually mean? Surely not simply ‘a rate of consumption (of some resource) that could be maintained over long timescales’, for reasons Nick Bostrom explores in this paper. I think you could do some really useful work in coming up with good (or better) definitions of long-term sustainability, by adapting or adding to the existing literature on sustainability in economics.
- How should we define ‘existential security’? Roughly speaking, existential security is achieved by reaching a period of (sustainably) low existential risk. It’s a state we should aim for, by moving out of the current period of high existential risk. But how could we make this more precise? Which of several competing definitions should we favour?
- Consider the ‘hazard rate’ for any given time: the instantaneous level of existential risk (like for Poisson processes). We could simply say that we’ve reached existential security whenever the hazard rate is low. But this won’t quite do: if we expect the hazard rate to rise soon, we can hardly say we’re secure.
- Instead, I think it might be more suitable to understand existential security as being on a trajectory to not wasting our potential (in all likelihood). Not only is the hazard rate low, but there are reasons to think it will remain sustainably low. (Note the connection with above)
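To make that point vivid, here is a minimal sketch (with entirely made-up hazard numbers) of how long-run survival probability depends on the whole hazard trajectory, not just today’s level:

```python
import math

def survival_probability(hazard, years):
    """P(no existential catastrophe within `years`), treating the yearly
    hazard h(t) as the rate of an inhomogeneous Poisson process:
    S(T) = exp(-sum of h(t))."""
    return math.exp(-sum(hazard(t) for t in range(years)))

# Two made-up trajectories that look identical today:
flat = lambda t: 0.001                  # 0.1% per year, forever
rising = lambda t: 0.001 * (1.05 ** t)  # 0.1% today, compounding upward

print(survival_probability(flat, 500))    # still substantial
print(survival_probability(rising, 500))  # vanishes, despite low hazard today
```

Both trajectories start at the same hazard rate, but only the flat one leaves much long-run survival probability, which is why a definition of existential security plausibly needs to constrain the future trajectory and not just the present level.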
- Sticking with philosophy, you could write about what ethical views other than total utilitarianism have to say about longtermism and existential risk.
- How should we define ‘existential risk’? In The Precipice, Toby Ord goes for “the permanent destruction of human potential” as a working definition. But (i) this could be more precise, and (ii) even at this level of precision, there are competing definitions.
- We could stick with a definition that centers on ‘potential’, where ‘potential’ is understood as something like ‘the best future we could realistically achieve from this point’, and ‘achieving our potential’ means something like ‘ending up somewhere in the ballpark of the best futures we could have realistically achieved from some point’. This is nicely ecumenical: we don’t need to agree on what our best future looks like. It’s also objective: we can undergo an existential catastrophe without appreciating it. But this definition might sometimes say that an existential catastrophe has not occurred, when something equivalently bad obviously has occurred. This could happen if we get ‘locked-in’ to some stable regime, such that recovery is realistic (as in possible), but not positively likely. Our potential is preserved, but we’re unlikely ever to actually achieve it.
- To address this issue, you could instead define ‘existential risk’ as “risk of a sudden drop in the expected value of the future of humanity”. This would render existential risk analogous to a stock crash (because the price of a stock encodes expectations about future value). I can think of a few disadvantages. One is that this would admit as existential catastrophes events which don’t seem to deserve that title, such as learning that the expected lifespan of stars was much shorter than previously thought. A second is that different moral perspectives aren’t all going to give commensurable (or even any) expected values for the future, so you would need to assume a narrow range of theories that play well in this regard. A third is that on this definition, the probability of an existential risk becomes (roughly) the probability of a sudden drop in the probability of reaching an excellent future; and it’s awkward to talk about probabilities of probabilities.
- Both these options have things going for them. Is there a hybrid option, or something entirely novel, that captures what really matters about the pre-theoretical idea of existential risk, while avoiding some of these issues?
- Relatedly: when is ‘existential risk’ not a useful concept at all? What does it obscure? Are there related concepts which better illuminate some feature of improving the long-term future? Mostly this is a question about making sure our conceptual tools fully kit us out for improving the long-run future. But inevitably this kind of question will fade into empirical questions about which other longtermist interventions are competitive with existential risk mitigation. So it would also be good to think more about: what are the best objections to prioritising existential risk mitigation?
- How might we model the value of information, especially in the context of mitigating risks? We could imagine a model where we can spend to learn the relative likelihoods of various risks, and we can spend to reduce particular risks. How much should we spend on each? How might this be relevant for risk policy? Would such a model recommend spending much more than at present on researching/identifying which risks are actually biggest, instead of mitigating them all more evenly? If you have an econ background, there’s probably no limit on how sophisticated this could get.
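To give a flavour of how such a model could start, here is a deliberately crude sketch (all numbers invented) comparing spending a fixed budget blindly across two risks versus first paying to learn which risk is bigger:

```python
# A crude value-of-information sketch (all numbers invented). Two risks;
# one is 'big' and one is 'small', but we don't know which. Option 1:
# split the mitigation budget evenly (by symmetry, its expected outcome
# doesn't depend on which risk turns out to be big). Option 2: pay an
# information cost to learn which is big, then target the remainder at it.

def residual_risk(big, small, spend_big, spend_small, efficiency=0.1):
    """Risk left over after mitigation; each unit of spending removes
    `efficiency` of the targeted risk (a linear toy assumption)."""
    return big * max(0.0, 1 - efficiency * spend_big) + \
           small * max(0.0, 1 - efficiency * spend_small)

BIG, SMALL, BUDGET, INFO_COST = 0.10, 0.01, 8.0, 2.0

blind = residual_risk(BIG, SMALL, BUDGET / 2, BUDGET / 2)
informed = residual_risk(BIG, SMALL, BUDGET - INFO_COST, 0.0)

print(blind, informed)  # here, learning first wins despite the info cost
```

Richer versions would make the signal noisy and mitigation returns diminishing; at that point the optimal research/mitigation split becomes a genuinely interesting optimisation problem.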
- A final idea is to investigate common features of the most promising interventions to mitigate existential risk. In particular, it seems to me that (speaking very abstractly), certain kinds of asymmetric information are dangerous, and certain kinds of common knowledge are safe.
Representing future generations and longtermism
- Proposals for Our Common Agenda — The UN recently released a report outlining its agenda for the next few years, called Our Common Agenda. I summarised it here — I think it’s very exciting. The report outlines a few proposals relevant for longtermism and representing future generations. But they are deliberately vague, because the UN wants to solicit concrete ideas about what these proposals could do, and they will discuss these in 2023 (at their ‘Summit of the Future’). I think it would be really valuable to look at one proposal in particular, such as the idea of a Special Envoy for Future Generations, and discuss ideas for concrete aspects of that proposal. That might involve looking at the history of similar posts in the UN, seeing how and where they succeeded. I think this would be especially valuable work if it could be attached to an institution that is likely to appear credible in the eyes of the UN, such as a university or a longtermist research organisation.
- More on the economics side, I think a review of discounting in policy would be very useful. Future goods are discounted in policy for a variety of reasons — because of inflation, because of a ‘hazard/catastrophe rate’, or even because of ‘pure’ time preference. I think the discourse in policy-focused longtermist research can sometimes get a bit confused about how these are all different. So it could be useful to decompose reasons for discounting, assessing which make sense in which contexts, and then perhaps saying something about how policy changes with no rate of pure time preference. Useful links may be this, this, and this. More generally, I think ‘intergenerational welfare economics’ could be a really cool field to look into.
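The decomposition I have in mind is roughly the textbook Ramsey one: discount rate = pure time preference + elasticity of marginal utility × consumption growth, with perhaps a catastrophe rate added on top. A toy calculation (parameter values invented for illustration, not recommendations) shows how much the pure-time-preference term alone matters over long horizons:

```python
# Ramsey decomposition of the consumption discount rate: r = delta + eta*g,
# where delta is pure time preference, eta the elasticity of marginal
# utility, and g expected consumption growth. An exogenous catastrophe
# rate can be layered on. All parameter values are illustrative.

def present_value(years, delta, eta, g, hazard=0.0):
    """Value today of one unit of welfare `years` in the future."""
    r = delta + eta * g + hazard
    return (1 + r) ** -years

# A benefit 200 years out, with and without 1% pure time preference:
with_ptp = present_value(200, delta=0.01, eta=1.0, g=0.02)
zero_ptp = present_value(200, delta=0.00, eta=1.0, g=0.02)
print(with_ptp, zero_ptp)
```

Over two centuries, dropping a single percentage point of pure time preference changes present values roughly sevenfold, which is part of why this term dominates debates about long-term policy.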
- What are the best philosophical objections to longtermism? Could present-day longtermism be missing a crucial consideration?
- Again on the philosophy side, I’ve been wondering a bit recently about cases where we try to represent future generations by making the decisions they, in some sense, will prefer us to have made. But this is a very slippery idea, because our decisions today (right or wrong) will also affect these backward-looking preferences of future people. This is maybe an interesting framing of intergenerational governance — the challenge is to find ‘stable points’ where our decisions today lead to future assessments that we made the right decisions in retrospect. But this raises some questions. For instance, if we’re imagining future people voting on what we do now, do we normalise their votes by their total population? At the extreme, there are no future people because something goes terribly wrong — but in this case, there is nobody around (sidenote: Suppose option A can be expected to double the population by the date at which we’re considering retrospective votes, and option B keeps it at present-day levels. Suppose we expect A to have 60–40 support on A, and B to have 80–20 support on B.) How should we think about (sidenote: Choosing A recommends A, and choosing B recommends B) and (sidenote: Choosing A recommends B, and choosing B recommends A.)
- How can we reward political appointees who are tasked with representing future generations, if the effects of their decisions might only play out long after they make them, but we nonetheless want to incentivise the right decisions? One idea is to find ways to link pension bonuses with outcomes (in a way that is fair and difficult to game). Another idea could be finding ways to link some fraction of salary to the predictions of independent forecasters, or prediction markets.
- On the topic of longtermist institutional reform, you might also look at integrating long-term forecasting practices into government. We know that trained forecasters reliably beat various kinds of ‘expert’ at predicting political events, so it seems like integrating better forecasting methods into government and policy could be very useful. Similar things can be said for prediction markets (Robin Hanson has written extensively on them). These aren’t new thoughts, but you could investigate forecasting and prediction markets from a couple of new perspectives: the very long-term future, and existential risk. Both perspectives throw up limits and challenges for standard forecasting methods, but it’s possible to imagine some interesting workarounds. Some really promising work has been done on this already. You could build on the existing work by asking: how might governments best integrate forecasting methods for anticipating the effects of policy on the long-term future, and existential risk?
- This one’s more tangible and quantitative: I would love to see some efforts at answering some simple questions about human ancestry over long time periods. For instance: for two randomly chosen present-day people, how many generations back should I expect to go before finding their most recent common (genealogical) ancestor? How long ago lived the most recent common ancestor of me and at least 50% of the world? At current fertility/movement rates, at what point in the future should I expect that most (>50%) people are my genealogical ancestor? What about 99%? And so on. This excellent paper could be good inspiration.
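Some of these questions are tractable with very simple simulations. Here is a toy random-mating model in the Chang style (each person’s two parents drawn uniformly at random from the previous generation), with the obvious caveat that real answers would need geography and population structure:

```python
import random

def generations_to_common_ancestor(pop_size=1000, seed=0):
    """Toy random-mating model: each person's two parents are drawn
    uniformly at random from the previous generation (no geography or
    pairing structure). Returns how many generations back you must go
    before two randomly chosen people share a genealogical ancestor."""
    rng = random.Random(seed)
    a, b = {0}, {1}  # ancestor sets of two present-day individuals
    generations = 0
    while a.isdisjoint(b):
        generations += 1
        # Each ancestor contributes two parents; sets roughly double.
        a = {rng.randrange(pop_size) for _ in range(2 * len(a))}
        b = {rng.randrange(pop_size) for _ in range(2 * len(b))}
    return generations

# Ancestor sets roughly double each generation, so the two sets should
# meet after O(log pop_size) generations.
print(generations_to_common_ancestor(pop_size=1000))
```

Even this crude version makes the headline fact intuitive: because pedigrees grow exponentially, shared ancestry arrives logarithmically fast, which is why the real-world answers are so surprisingly recent.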
- In The Precipice, Toby Ord discusses the idea of ‘civilisational virtues’. Just as there are personal virtues, like compassion and equanimity, could large groups of people have virtues, distinct from the virtues of their members? I’m interested to know whether this is indeed a useful perspective, and whether there could be uniquely civilisational virtues — traits which aren’t desirable or applicable at the personal level. If it’s useful to talk about ‘group agents’, what makes a group more or less agentic? Is agency at the group level desirable? Furthermore, I’d be interested to explore whether we can talk about groups having beliefs (independently of the beliefs of their members). How might this function differently from individual beliefs? How might this actually shed light on how individual beliefs work? I think this maybe has something to do with ‘social epistemology’.
- I think we should take a bit of time to assess and refine the idea of a ‘long reflection’. How long are we actually talking? How feasible would it be to make this happen — what measures might be required to sustain such a period? If a very long and coordinated reflection doesn’t look feasible, is it nonetheless directionally right? Or is trying to get a long reflection but (sidenote: In particular, I worry about a situation (especially conditioning on not getting crazy transformative AI) where (i) it’s impossible to coordinate and stop defectors, or (ii) impossible to set things up such that in the limit of time it’s true/good ideas that win out. What should we do then? Maybe in that world, a long reflection isn’t even directionally good. Instead, you might be stuck with finding a best guess ($$\approx$$ liberal democracy, cosmopolitanism, consequentialism) and making sure those ideas outcompete nastier ideas ($$\approx$$ totalitarianism, parochialism).)
- What are some good ideas for longtermist visualisations? For a while, I’ve been thinking about which longtermist concepts might make for the most compelling visualisations. The original impetus for this question was seriously considering starting work on a ‘timeline of everything’, showing major events (astronomical, geological, historical) from the Big Bang to the end of time. Users can zoom in and out, much like existing apps that show the (sidenote: Appreciating the vast amount of time ahead of us, and the relatively brief period of time that all of recorded human history makes up, is a key underlying intuition for longtermist arguments.) I’m sure there are tons of other ideas which someone could make (sidenote: Inspiration: Information is Beautiful, Asteroid Close Calls, Nuke map, XKCD Timeline of recent climate, Timelapse of the Future, Deep Time Walk, BBC Timeline of the Far Future, (Old) ChronoZoom, Power of Ten (1977 Wiki).) Some sample ideas: parameter counts in (sidenote: Jaime Sevilla Molina has already done a great job at compiling this dataset.) improvements in algorithmic efficiency of AI models; digital information stored per person; cost to synthesise genetic material over time; philanthropic spending on various cause areas over time; biomass over time of humans versus farmed animals versus wild animals (or just other mammals); oldest institutions in the world; indicators of globalisation, connectedness, and cosmopolitanism; indicators of stagnation.
Philosophy of mind / psychology
What’s up with hypnosis? Hypnosis is manifestly real. A cruise ship magician with a couple years’ practice can get shy, sensible people to do quite stupid things in front of a large crowd. There are limits: as far as I can tell it’s close to impossible to hypnotise someone against their will, or compel people to do truly dangerous things. But it’s a remarkably powerful phenomenon. For instance, it can lead to lasting changes: phobias can be close to cured with hypnotherapy. It also just seems insanely illuminating and significant. Yet, I’m not really aware of a flourishing study of hypnosis in psychology; nor have I seen it get discussed in philosophy of mind. Why is that? Why can’t we have a field of hypnosis studies? What exactly is going on when someone gets hypnotised and what can that teach us about e.g. how beliefs work? (I realise this one’s more out on a limb)
- I’m also worried that something like hypnosis could be used for quite objectionable ends — especially because you don’t obviously need a person to do it; a screen and headphones might do. So perhaps one day we should start worrying about how to stop that happening.
Here’s something I’m more confident about: I’d love to see a well-funded, interdisciplinary research program on the meta-problem of consciousness. The philosopher Keith Frankish mentioned this when I spoke to him in November, and I was nodding furiously along.
- To a first approximation, the meta-problem is “the problem of explaining why we think that there is a problem of consciousness”. It looks like making progress on the meta-problem could shine considerable light on the first-order ‘hard problem’ of consciousness (roughly, how and why we are phenomenally conscious). Unlike the hard problem, the meta-problem is a question straightforwardly about physical and functional things. For instance, do children share intuitions about consciousness (e.g. inverted spectra)? When in the developmental process do those intuitions come online? Are there cultural differences to these intuitions? Why do philosophy professors make noises and marks on their blackboards to the effect of “there is a hard problem of consciousness”? And so on. This is a big research program which could span developmental psychology, anthropology, computational neuroscience, old school cognitive science, and indeed philosophy of mind. Here’s an example of a really promising bit of research along these lines.
I would also love to see more philosophical work on illusionism as a theory of consciousness. I find illusionism compelling, but I do think that a lot of useful work can be done to clarify precisely what claims it is making. If something seems to be the case, but isn’t, then it’s an illusion. But what exactly seems to be the case but isn’t when it comes to consciousness? Personally, I’m not sure that talk of ineffability, privacy, irreducibility, etc. gets to the nub of this question.
- Mostly, however, I’m interested in the ethical implications of illusionism. Maybe the case for worrying about this goes as follows: reasonable consequentialist theories ground out value, to a large extent, in mental states which are intrinsically valuable — happiness and suffering. Illusionism denies that such mental states exist as we prefer to imagine them. Illusionist views leave little room for saying that mental states are intrinsically (dis)valuable. Where should we go from here? A good place to start on this is François Kammerer’s ‘The Normative Challenge for Illusionist Views of Consciousness’.
Is introspection really overrated? I vaguely plan to write about this. Introspective processes are often placed on a pedestal as capable of delivering especially deep, robust knowledge. Indeed, some people say that it’s just incoherent to doubt the results of introspection. And so a lot of weight is put on introspection when we make decisions, and figure out difficult social questions, like whether to stay with someone or break up with them. But I think what feels like uncovering deep truths is, unnervingly often, just inventing things on the spur of the moment. So our reliance on introspection might be doing harm, and we should think about ways to notice its flaws and supplement it with other ways of learning about ourselves. The book The Mind is Flat does a great job at articulating some of this worry.
- What versions of the orthogonality thesis are true? What evidence do we have for them? I have in mind stronger and weaker ‘versions’.
- How reliable are extant measures of subjective wellbeing? I spent some time a couple years ago wondering whether different people use different scales for subjective wellbeing surveys in systematic ways. I think there are tons of similar questions to ask about these measures. For instance, what processes do people use to translate their ‘actual’ happiness or life satisfaction (an unbounded multidimensional mess) into a single, bounded number? What is the ‘zero’ point for where life is not worth living (from the perspective of the person living it)? How many people live below it? More philosophically: is life satisfaction just a crude indicator of total positive versus negative affect over time?
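One way to see the scale-use worry: here is a toy simulation (the functional forms are invented for illustration) where two groups share exactly the same latent wellbeing distribution but use different amounts of the 0–10 scale, producing systematically different survey averages:

```python
import random

rng = random.Random(42)
# One shared latent wellbeing distribution (arbitrary units):
latent = [rng.gauss(0.5, 1.0) for _ in range(10000)]

def report(x, scale_use):
    """Map latent wellbeing onto a 0-10 survey answer; `scale_use` is how
    much of the scale this respondent is willing to use (invented form)."""
    return min(10.0, max(0.0, 5 + scale_use * x))

def mean(xs):
    return sum(xs) / len(xs)

expansive = mean([report(x, 2.0) for x in latent])  # uses the whole scale
reserved = mean([report(x, 1.0) for x in latent])   # sticks near the middle
print(expansive, reserved)  # identical wellbeing, different survey averages
```

If scale use correlates with culture, age, or income, naive cross-group comparisons of these averages would be biased, which is exactly the kind of systematic distortion worth measuring.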
- How do we operationalise ‘autonomy’ in the context of building superintelligent AI, and wanting to preserve human autonomy? Imagine just for a second that we could build a powerful AI that ‘knows what’s best’ for us in a range of scenarios, and could just arrange things just so. Even if that AI were perfectly aligned with our own values, I think we would still want to trade off some of the advantages and convenience of ‘babysitter AI’ to keep hold of autonomy: to contravene its decisions, shape our own values, and learn from our own foibles. But what does autonomy actually mean here? Where’s the cutoff between strong ‘nudging’ and manipulation?
- I think you could run a similar question for authenticity.
- Similarly, how and when is it appropriate to talk about the intent of an AI agent? Is intentions talk translatable to just talk about beliefs and incentives / reward functions? For which systems is it appropriate to take an ‘intentional stance’? Which AI systems really mark a difference between direct and oblique intent? Note there is a decently big philosophical literature on intentions.
- Does complexity matter fundamentally? It’s difficult to know what to think about ‘complexity science’ (a lot of disparate things fall under that umbrella). Plus, complexity doesn’t have an obvious and agreed-upon measure (in the way that e.g. entropy does). Instead, there’s a bunch of more-or-less formal definitions spanning computer science and the social sciences. That said, even on rough/intuitive notions of complexity, I think it’s notable that more valuable states often seem to be more complex than less valuable ones. In particular, I want to say that (sidenote: This might conflict with an aggregative consequentialism that says that the value of some big region is the sum of the value of its constituent parts. If we decide some mental state is good, let’s ‘tile the universe’ with it. But ‘tiling the universe’ would not create a complex structure; and perhaps this partially explains why it seems bad to me.) But how could this be made precise?
Ambitious / speculative / potentially stupid
- How can we get (welfare) economics and ethics to listen to one another? This one’s very vague, and likely I’ve just Dunning-Krugered myself here. But for a while I’ve been a bit surprised by a couple things. First, questions in the more theoretical side of ethics (think Parfit, Broome) begin to resemble questions in economics. Perhaps a lot of confusion in ethics could be resolved by bringing the tools of economics to bear on even more such questions. Second, from the very little I know about e.g. welfare economics, I’ve been surprised by how shy economists are about making fairly conservative philosophical assumptions which might let (sidenote: For instance, as far as I can tell, it’s still rare to assume you can do interpersonal comparisons of utility in welfare economics and social choice theory. So you can talk about which options dominate others, and which resource allocations fall on a Pareto frontier, etc., but a great deal is left open and undecided. This is commendably modest, but in practice I think the gaps get filled in by politics, which is less committed to making sober, numerically literate, assessments.).
- What are the ethical implications of the many-worlds interpretation of quantum mechanics? In other words, if we learned that MWI / the ‘Everett interpretation’ were true, should that change how we act, ethically speaking? One obvious avenue you could explore is whether there is a sense in which MWI implies that there are many more moral patients at successive times, such that we should massively front-load harms and delay goods (so that far fewer people experience harms and far more experience goods)? Or, because there is a measure over worlds, (sidenote: Yes, I know, this is cartoonish language.) If yes, can this idea be extended to cases where we have computations running on thinner vs thicker wires, etc.? Another avenue is to look at whether MWI implies things for your attitudes to risk. In a crude sense, you could think of MWI translating subjective probabilities of determinate outcomes (this coin will definitely either come up H or T) into a spread of actual outcomes (this coin will come up both H and T). Does risk aversion make sense when this happens? Also, you could consider the implications for personal identity. Parfit talks a lot about the possibility of ‘fissioning’ into multiple people, and what that implies. But MWI says that this is actually (and constantly) happening. Finally, my impression is that there are still some open questions about decision-theoretic approaches to getting probabilities (sidenote: I tweeted about this recently, reflecting on this excellent interview with David Wallace.)
- How can we model ‘epistemic risks’? The world probably becomes much riskier (and just worse) when its (sidenote: Credit to Carla Zoe Cremer for these ideas.) We’re familiar with stories where the way that e.g. social networks are structured, certain content is prioritised, etc., can lead to quite harmful feedback dynamics (e.g. ‘echo chambers’, online radicalisation, polarisation). Can we come up with simple (computational) models of these dynamics and learn something interesting from them? What could worst-case ‘cascades’ look like, where harmful falsehoods win out over (sidenote: Also, what interventions could these models recommend? One interesting feature of e.g. designing social media platforms is that the network structure is endogenous — it can be changed. Is there a way to set up the system/structure to make it more robust to epistemic risks, like ‘cascades’ of false information?)
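For a sense of how simple these models can be, here is a toy threshold cascade in the Granovetter/Watts style (all parameters invented), where a node adopts a falsehood once enough of its neighbours have:

```python
import random

def cascade_size(n=200, degree=4, threshold=0.3, seeds=5, rng_seed=1):
    """Toy threshold cascade: a node adopts the falsehood once more than
    `threshold` of its neighbours have adopted it. Returns how many of
    the n nodes end up adopting."""
    rng = random.Random(rng_seed)
    # Build a random undirected graph with roughly `degree` links per node.
    nbrs = {i: set() for i in range(n)}
    for i in range(n):
        while len(nbrs[i]) < degree:
            j = rng.randrange(n)
            if j != i:
                nbrs[i].add(j)
                nbrs[j].add(i)
    adopted = set(rng.sample(range(n), seeds))  # initial sources of the falsehood
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i not in adopted:
                exposed = sum(1 for j in nbrs[i] if j in adopted)
                if exposed / len(nbrs[i]) > threshold:
                    adopted.add(i)
                    changed = True
    return len(adopted)

# How does network density change vulnerability to cascades?
print(cascade_size(degree=3), cascade_size(degree=8))
```

Models like this also make the intervention question concrete: since the network structure is a design choice, you can directly ask which structures are most robust to worst-case cascades.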
- How big a deal is space debris? The US Space Surveillance Network so far tracks 27,000 pieces of debris in LEO, but a far greater number of smaller pieces remain untracked: roughly half a million pieces of debris the size of a marble or larger, and some 100 million pieces one millimeter or larger. The amount of debris in space will only grow as the number of satellites in LEO is set to more than triple by 2028. Depositing space debris imposes a negative externality on everything else in or passing through that orbit: it makes (critical) damage more likely, and so it becomes costlier to put things in orbit. It also makes it a bit riskier to pass through orbit during a launch. But how risky? How much more expensive? Just how bad is this likely to get? Further: when space debris in LEO becomes dense enough, collisions between bits of junk could cause an avalanche-like cascade called ‘Kessler syndrome’. But is this remotely likely to happen in the foreseeable future?
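As a starting point on the ‘how bad could it get’ question, here is a back-of-the-envelope model (every coefficient invented) where debris grows through launches and debris-on-debris collisions and shrinks through orbital decay; the quadratic collision term is what produces Kessler-style runaway:

```python
# Toy Kessler dynamics: debris N grows via launches and debris-on-debris
# collisions (~N^2), and shrinks via orbital decay (~N). All coefficients
# are invented placeholders, not estimates.

def simulate_debris(n0, launches, collision_k, decay, years=200):
    """Euler-step the toy ODE dN/dt = launches + k*N^2 - decay*N."""
    n = n0
    for _ in range(years):
        n += launches + collision_k * n * n - decay * n
        if n > 1e12:  # the collision term has taken over: runaway
            return float("inf")
    return n

# Same starting population; only the collision coefficient differs:
stable = simulate_debris(n0=1e4, launches=100, collision_k=1e-10, decay=0.02)
runaway = simulate_debris(n0=1e4, launches=100, collision_k=2e-6, decay=0.02)
print(stable, runaway)
```

The serious version of this question is about estimating the real coefficients (collision cross-sections, decay times by altitude, launch forecasts) and locating the actual threshold, which is exactly the analysis I would like someone to do.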
- What will the economics of space look like? Will space exploration ever pay for itself? I think this one is really important and under-explored. The only space companies that are currently sustaining themselves are in the business of designing, building, operating, and launching satellites (mostly for communications). The rest of the money for space comes from governments (not expecting a return), and a bit comes from seed capital from speculative investors. If venturing into space is to become a growing, self-sustaining activity, it’s got to eventually pay for itself. But how and when could this happen? There are vague expectations that we could mine asteroids and planetary bodies for valuable resources, but what could the supply chains look like? It would be great if someone did more than a surface-level analysis of all this.
- What is the most likely offensive / defensive balance in space? The dynamics of military attack and defence are likely to look very different to Earth’s, for obvious reasons. But different in which ways? I’m not so interested in the specific weapons technologies (the kind of thing that’s hard to forecast), but rather the features of space that could make things systematically different (e.g. you can’t really sneak up on a planet). Gwern’s post on this is a good place to start.
- Finally, I would love to see an accessible summary of things we can confidently say about aliens. In the last decade or so, I think we’ve learned that we can say quite a lot of sensible stuff about aliens, in particular about how ubiquitous they are likely to be. Here and here are good examples of the genre (although I have some misgivings about grabby aliens).
- Which ideas came late? In the history of ideas, some advances are more contingent than others. Cases of ‘multiple discovery’ suggest certain ideas were waiting to be had, and so not especially contingent. Some ideas arrived much earlier than they had any right to, because of a fortuitous insight or unusual individual. But which contingent ideas arrived surprisingly late?
- What can philanthropists learn from the success of the neoliberals? And the immediate caveat that by ‘success’ I mean that the neoliberals successfully implemented their ideas at the highest levels of government. I do not mean that this was a good thing, just that it might be useful to understand how that happened.
- Why aren’t more EAs worried about cybersecurity? Nation states have started conducting very sophisticated cyber attacks, and I think already their extent and damage is quite under-appreciated. It seems very likely to me that many worst-case catastrophes (involving e.g. engineered pandemics, nuclear exchange, disabled infrastructure) will involve cyber attacks as a factor. It also seems like philanthropic spending could make things better, such as by funding or supporting open-source security packages that large parts of the world depend on. Plus there appears to be a cybersecurity skills shortage. Effective altruists have expressed a lot of interest in information security, but my understanding is that (sidenote: For obvious reasons, I’m sure part of the explanation here is that folks are interested, they just don’t yell about it online.) What’s going on here? How could large amounts of philanthropic spending help us be more resilient to the kind of cyber attack that could lead to major catastrophes?
Feel free to email me about any of these!