Roam Research Notes on “SELF-REFINE: Iterative Refinement with Self-Feedback” by Madaan et al.

  • Author:: Madaan et al.
  • Source:: link
  • Review Status:: [[complete]]
  • Recommended By:: [[Andrew Ng]]
  • Anki Tag:: self_refine_iterative_refinement_w_self_feedback_madaan_et_al
  • Anki Deck Link:: link
  • Tags:: #[[Research Paper]] #[[prompting [[Large Language Models (LLM)]]]] #[[reflection ([[Large Language Models (LLM)]])]]
  • Summary

    • Overview
      • SELF-REFINE is a method for improving outputs from large language models (LLMs) through iterative self-feedback and refinement. This approach uses the same LLM to generate an initial output, provide feedback, and refine it iteratively without the need for supervised training or additional data.
    • Key Findings
      • Performance Improvement: Evaluations using GPT-3.5 and GPT-4 across seven tasks show that SELF-REFINE improves performance by about 20%. Outputs are preferred by humans and score better on metrics.
      • Complex Task Handling: LLMs often struggle with complex tasks requiring intricate solutions. Traditional refinement methods need domain-specific data and supervision. SELF-REFINE mimics human iterative refinement, where an initial draft is revised based on self-feedback.
      • Iterative Process: The process uses two steps: FEEDBACK and REFINE, iterating until no further improvements are needed.
    • Specific Task Performance
      • Strong Performance:
        • Constrained Generation: Generating a sentence containing up to 30 given concepts. Iterative refinement allows correction of initial mistakes and better exploration of possible outputs.
        • Preference-based Tasks: Dialogue Response Generation, Sentiment Reversal, Acronym Generation. Significant gains due to improved alignment with human preferences.
      • Weaker Performance:
        • Math Reasoning: Difficulty in accurately identifying nuanced errors in reasoning chains.
    • Additional Insights
      • Avoiding Repetition: SELF-REFINE avoids repeating past mistakes by appending the entire history of previous feedback in the REFINE step.
      • Role-based Feedback: Suggestion to improve results by having specific roles for feedback, like performance, reliability, readability, etc.
        • Related Method: Providing a scoring rubric to the LLM with dimensions over which they should evaluate the output.
      • Specific Feedback Importance: Results are significantly better with specific feedback compared to generic feedback.
      • Iteration Impact: Results improve significantly with the number of iterations (i.e., feedback-refine loops) but with decreasing marginal improvements for each loop. In some cases, like Acronym Generation, quality could improve in one aspect but decline in another. Their solution was to generate numeric scores for different quality aspects, leading to balanced evaluation.
      • Model Size Impact: SELF-REFINE performs well for different model sizes, but for a small enough model (Vicuna-13B), it fails to generate feedback consistently in the required format, often failing even with hard-coded feedback.
    • Relevant [[ChatGPT]] conversations: here, here, here
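    • A minimal sketch of the FEEDBACK and REFINE loop summarized above. The `llm` callable, the prompt wording, and the `STOP` marker are assumptions for illustration and are not taken from the paper; `fake_llm` is a hypothetical stand-in so the loop runs without a real model.

```python
from typing import Callable, List, Tuple


def self_refine(
    task: str,
    llm: Callable[[str], str],
    max_iters: int = 4,
    stop_marker: str = "STOP",  # assumed convention for "no further changes"
) -> Tuple[str, List[Tuple[str, str]]]:
    """Generate a draft, then alternate FEEDBACK and REFINE with the same model."""
    output = llm(f"Task: {task}\nWrite an initial answer.")
    history: List[Tuple[str, str]] = []  # (answer, feedback) pairs seen so far
    for _ in range(max_iters):
        feedback = llm(
            f"Task: {task}\nAnswer: {output}\n"
            f"Give specific, actionable feedback, or reply {stop_marker} "
            "if no changes are needed."
        )
        if stop_marker in feedback:
            break
        history.append((output, feedback))
        # The whole history is passed to REFINE so earlier mistakes are not repeated.
        past = "\n".join(f"Answer: {o}\nFeedback: {f}" for o, f in history)
        output = llm(f"Task: {task}\n{past}\nWrite an improved answer.")
    return output, history


def fake_llm(prompt: str) -> str:
    """Hypothetical stub model: one round of feedback, then it is satisfied."""
    if prompt.endswith("Write an initial answer."):
        return "draft v0"
    if prompt.endswith("Write an improved answer."):
        return "draft v1"
    return "STOP" if "Answer: draft v1" in prompt else "Be more specific."


final_answer, feedback_history = self_refine("Write an acronym", fake_llm)
```

    • Passing the model in as a plain callable keeps the loop independent of any particular API; a numeric-score stopping rule could replace the `STOP` marker.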

Roam Notes on The Batch Newsletter (Andrew Ng) – We Need Better Evals for LLM Applications

  • Author:: [[Andrew Ng]]
  • Source:: link
  • Review Status:: [[complete]]
  • Anki Tag:: andrew_ng_the_batch_we_need_better_evals_for_llm_apps
  • Anki Deck Link:: link
  • Tags:: #[[Article]] #[[Large Language Models (LLM)]] #[[evals]] #[[[[AI]] Agents]] #[[Retrieval Augmented Generation (RAG)]]
  • Summary

    • Evaluating Generative AI Applications: Challenges and Solutions
      • Challenges in Evaluation:
        • Evaluating custom AI applications generating free-form text is a barrier to progress.
        • Evaluations of general-purpose models like LLMs use standardized tests (MMLU, HumanEval) and platforms (LMSYS Chatbot Arena, HELM).
        • Current evaluation tools face limitations such as data leakage and subjective human preferences.
      • Types of Applications:
        • Unambiguous Right-or-Wrong Responses:
          • Examples: Extracting job titles from resumes, routing customer emails.
          • Evaluation involves creating labeled test sets, which is costly but manageable.
        • Free-Text Output:
          • Examples: Summarizing customer emails, writing research articles.
          • Evaluation is challenging due to the variability of good responses.
          • Often relies on using advanced LLMs for evaluation, but results can be noisy and expensive.
      • Cost and Time Considerations:
        • [[evals]] can significantly increase development costs.
        • Running [[evals]] is time-consuming, slowing down experimentation and iteration.
      • Future Outlook:
        • Optimistic about developing better evaluation techniques, possibly using agentic workflows such as [[reflection ([[Large Language Models (LLM)]])]].
    • Richer Context for RAG (Retrieval-Augmented Generation)
      • New Development:
        • Researchers at Stanford developed RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval). Link to paper here.
        • RAPTOR provides graduated levels of detail in text summaries, optimizing context within LLM input limits.
      • How RAPTOR Works:
        • Processes documents through cycles of summarizing, embedding, and clustering.
        • Uses SBERT encoder for embedding, Gaussian mixture model (GMM) for clustering, and GPT-3.5-turbo for summarizing.
        • Retrieves and ranks excerpts based on cosine similarity to user prompts, optimizing input length.
      • Results:
        • RAPTOR outperformed other retrievers on the QASPER test set.
      • Importance:
        • Recent LLMs can process very long inputs, but it is costly and time-consuming.
        • RAPTOR enables models with tighter input limits to access more context efficiently.
      • Conclusion:
        • RAPTOR offers a promising solution for developers facing challenges with input context length.
        • This may be a relevant technique to reference if you get around to implementing [[Project: Hierarchical File System Summarization using [[Large Language Models (LLM)]]]]
    • Relevant [[ChatGPT]] conversations: here, here
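    • A rough sketch of the summarize, embed, cluster cycle described under “How RAPTOR Works”. The function name `build_summary_tree` and the three injected callables are assumptions for illustration; in the paper these roles are played by an SBERT encoder, a Gaussian mixture model, and GPT-3.5-turbo, which are replaced here by toy stubs so the example runs on its own.

```python
from typing import Callable, List, Sequence


def build_summary_tree(
    chunks: Sequence[str],
    embed: Callable[[str], List[float]],
    cluster: Callable[[List[List[float]]], List[List[int]]],
    summarize: Callable[[List[str]], str],
    max_levels: int = 5,
) -> List[List[str]]:
    """Return the tree level by level: raw chunks first, coarser summaries after."""
    levels = [list(chunks)]
    current = list(chunks)
    for _ in range(max_levels):
        if len(current) <= 1:
            break  # a single root summary has been reached
        vectors = [embed(text) for text in current]
        groups = cluster(vectors)  # each group is a list of indices into `current`
        if len(groups) >= len(current):
            break  # clustering made no progress, so stop rather than loop
        current = [summarize([current[i] for i in group]) for group in groups]
        levels.append(current)
    return levels


# Toy stand-ins (hypothetical): length as the "embedding", adjacent pairs as
# "clusters", and string joining as the "summary".
def pair_up(vectors: List[List[float]]) -> List[List[int]]:
    return [list(range(i, min(i + 2, len(vectors)))) for i in range(0, len(vectors), 2)]


tree_levels = build_summary_tree(
    ["a", "b", "c", "d"],
    embed=lambda text: [float(len(text))],
    cluster=pair_up,
    summarize=lambda texts: "+".join(texts),
)
```

    • At query time, nodes from any level can then be ranked by cosine similarity to the prompt and packed into the available context.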

Roam Research Notes on Dwarkesh Patel Conversation with Sholto Douglas & Trenton Bricken – How to Build & Understand GPT-7’s Mind

  • Author:: [[Dwarkesh Patel]], [[Trenton Bricken]], and [[Sholto Douglas]]
  • Source:: link
  • Review Status:: [[complete]]
  • Anki Tag:: dwarkesh_douglas_bricken_build_and_understand_gpt7_mind
  • Anki Deck Link:: link
  • Tags:: #[[Video]] #[[podcast]] #[[Large Language Models (LLM)]]
  • {{[video]: https://www.youtube.com/watch?v=UTuuTTnjxMQ}}
  • [[[[Large Language Models (LLM)]] context length]] (0:00 – 16:12)
    • Importance of context length is underhyped. Throwing a bunch of tokens in context can create similar improvements to [[evals]] as big increases in model scale.
    • [[sample efficiency ([[reinforcement learning]])]]: the ability to get the most out of every sample. E.g. playing PONG, humans understand immediately, while modern reinforcement learning algorithms need 100,000 times more data so they are relatively sample inefficient.
    • Because of large [[[[Large Language Models (LLM)]] context length]], LLMs may have more sample efficiency than we give them credit for – [[Sholto Douglas]] mentioned [[evals]] where the model learned an esoteric human language that wasn’t in its training data.
    • [[Sholto Douglas]] mentions a line of research suggesting [[in-context learning]] might be effectively performing [[gradient descent]] on the in-context data (link to paper). It’s basically performing a kind of meta-learning – learning how to learn.
      • Large [[[[Large Language Models (LLM)]] context length]] creates risks since it can effectively create a whole new model if it’s really doing [[gradient descent]] on the fly.
      • [[Sholto Douglas]] suggests that figuring out how to better induce this meta-learning in pre-training will be important for flexible / adaptive intelligence.
    • [[Sholto Douglas]] suggests that current difficulties with [[[[AI]] Agents]]’s long-term planning are not because of a lack of long [[[[Large Language Models (LLM)]] context length]] – it’s more about the reliability of the model and needing more [[nines of reliability]] since these agents chain a bunch of tasks together, and even a small failure rate implies a large failure rate when you sample many times.
      • The idea behind the NeurIPS paper about emergence being a mirage (link) is related to this idea of [[nines of reliability]] – there’s a threshold where you get enough nines of reliability and it looks like a sudden capability on certain metrics, but the capability was always there; the model just was not reliable enough for it to show. There are apparently better evals now like [[HumanEval]] (link) that have “smoother” evaluations that get around this issue.
      • In my mind this raises the question – if you have a big enough [[[[Large Language Models (LLM)]] context length]], wouldn’t you just leverage that rather than decomposing into a bunch of tasks? #[[Personal Ideas]]
        • No – The nature of longer tasks is that you need to break them down and subsequent tasks depend on previous tasks, so longer tasks performed by [[[[AI]] Agents]] will always need to be broken down into multiple calls.
        • However, larger context for any given call would improve reliability. For example, with each task call to the model, you could build up a large in-context history of what the model has done to give it more context, and you could of course push in more information specific to what that task is trying to solve.
      • Developing [[evals]] for long-horizon tasks will be important to understand the impact and capabilities of [[[[AI]] Agents]]. [[SWE-Bench]] (link) is a small step in this direction, but resolving a GitHub issue is still a sub-hour task.
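      • The [[nines of reliability]] point above can be made concrete with a little arithmetic. This sketch assumes every step in an agent’s chain succeeds independently with the same probability, which is a simplification; the function name `chain_success` is made up for illustration.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that all steps succeed, assuming independent, equally reliable steps."""
    return per_step ** steps


# Compare one, two, and three nines of per-step reliability over a 20-step task.
for per_step in (0.9, 0.99, 0.999):
    print(f"per-step {per_step}: 20-step success {chain_success(per_step, 20):.3f}")
```

      • Under these assumptions, going from one nine to two nines per step moves a 20-step task from roughly 12% to roughly 82% success, which is why a modest reliability gain can look like a sudden new capability.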
    • Many people speak of [[quadratic attention costs]], as a reason we can’t have long context windows – but there are ways around it. See this [[Gwern Branwen]] article.
    • [[Dwarkesh Patel]] makes an interesting hypothesis wondering whether learning in context (i.e. the “forward pass” where the model has been pre-trained and predictions are made based on input data) may be more efficient as it resembles how humans actively think and process information as they acquire it rather than passively absorb it.
      • Not sure how far you can push the analogies to the brain – as [[Sholto Douglas]] says, birds and airplanes both achieve the same end but use very different means. However, [[Dwarkesh Patel]]’s point sounds like [[in-context learning]] may be analogous to the frontal cortex region of the brain (responsible for complex cognitive behavior, personality expression, decision-making, moderating social behavior, working memory, speech production), while the pre-trained weights (calculated in the “backward pass” that trains the model via [[backpropagation]]) are analogous to the other regions (responsible for emotional regulation and processing, sensory processing, memory storage and retrieval). See this [[ChatGPT]] conversation.
    • The key to these models becoming smarter is [[meta learning]], which you start to achieve once you pass a certain scale threshold of [[pre-training]] and [[[[Large Language Models (LLM)]] context length]]. This is the key difference between [[GPT-2]] and [[GPT-3]].
    • [[ChatGPT]] conversations related to this section of the conversation: here and here
  • [[Intelligence]] is just associations (16:12 – 32:35)
    • [[Anthropic AI]]’s way of thinking about [[transformer model]]
      • Think of the [[residual ([[neural net]])]] as it passes through the neural network to predict the next token like a boat floating down a river that takes in information streams coming off the river. This information coming in comes from [[attention heads ([[neural net]])]] and [[multi-layer perceptron (MLP) ([[neural net]])]] parts of the model.
      • Maybe what’s happening is early in the stream it processes basic, fundamental things, in the middle it’s adding information on ‘how to solve this’, and then in the later stages it’s doing the work to convert it back to an output token.
      • The [[cerebellum]] behaves kind of like this – inputs route through it but they can also go directly to the end point the cerebellum “module” contributes to – so there are indirect and direct paths where it can pick up information it wants and add it in.
        • The [[cerebellum]] is associated with fine motor control, but the truth is it lights up for almost any task in an [[fMRI]] scan, and 70% of your neurons are there.
        • [[Pentti Kanerva]] developed an associative memory algorithm ([[Sparse Distributed Memory (SDM)]]) where you have memories, want to store them, and retrieve them to get the best match while dealing with noise / corruption. Turns out if you implement this as an electrical circuit it looks identical to the core [[cerebellum]] circuit. (Wikipedia link)
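      • A toy sketch of the residual-stream picture above: each layer reads the stream and adds its contribution back in rather than replacing it. The `attention` and `mlp` functions here are hypothetical stand-ins on a single vector (real attention mixes information across token positions), so this only illustrates the additive structure.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Block = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def run_residual_stream(x: Vector, layers: Sequence[Tuple[Block, Block]]) -> Vector:
    """Pass x through the layers; every block's output is added to the stream."""
    for attention, mlp in layers:
        x = add(x, attention(x))  # information flowing in from attention heads
        x = add(x, mlp(x))        # information flowing in from the MLP
    return x


# Stand-in blocks (assumptions, not real model components).
toy_attention: Block = lambda v: [1.0] * len(v)
toy_mlp: Block = lambda v: [2.0 * t for t in v]

stream_out = run_residual_stream([0.0, 0.0], [(toy_attention, toy_mlp)])
```

      • Because each block only adds to the stream, a later layer can still read what an earlier layer wrote, which matches the direct and indirect paths described for the [[cerebellum]].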
    • [[Trenton Bricken]] believes most intelligence is [[pattern matching]] and you can do a lot of great pattern matching with a hierarchy of [[associative memory]]
      • The model can go from basic low-level associations and group them together to develop higher level associations and map patterns to each other. It’s like a form of [[meta-learning]]
      • He doesn’t really state this explicitly, but the [[attention ([[neural net]])]] mechanism is a kind of associative memory learned by the model.
      • [[associative memory]] can help you denoise (e.g. recognize your friend’s face in a heavy rainstorm) but also pick up related data in a completely different space (e.g. the alphabet – seeing A points to B, which points to C, etc.)
      • It should be “association is all you need”, not “attention is all you need”
    • Relevant [[ChatGPT]] conversations: here, here, here, here, and here
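    • A very small sketch of the two uses of [[associative memory]] mentioned above: denoising a corrupted cue and following an association to something else. This is a plain nearest-neighbour lookup by Hamming distance, not Kanerva’s [[Sparse Distributed Memory (SDM)]] algorithm; the class and method names are invented for illustration.

```python
from typing import Dict, Tuple

Pattern = Tuple[int, ...]


def hamming(a: Pattern, b: Pattern) -> int:
    """Number of positions where two equal-length bit patterns differ."""
    return sum(x != y for x, y in zip(a, b))


class AssociativeMemory:
    def __init__(self) -> None:
        self._store: Dict[Pattern, str] = {}

    def write(self, key: Pattern, value: str) -> None:
        self._store[key] = value

    def read(self, noisy_key: Pattern) -> str:
        """Return the value whose stored key is closest to the (possibly noisy) cue."""
        best = min(self._store, key=lambda k: hamming(k, noisy_key))
        return self._store[best]


memory = AssociativeMemory()
memory.write((1, 1, 1, 1, 0, 0, 0, 0), "friend")
memory.write((0, 0, 0, 0, 1, 1, 1, 1), "stranger")

# One flipped bit plays the role of the rainstorm: the cue is still recognised.
recalled = memory.read((1, 1, 0, 1, 0, 0, 0, 0))
```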
  • [[[[Intelligence]] explosion]] and great researchers (32:35 – 1:06:52)
    • This part of the discussion explores whether automating AI researchers can lead to an intelligence explosion in a way that economists overlook (and they are apparently the ones with the formal models on [[[[Intelligence]] explosion]])
    • [[compute]] is the main bounding constraint on an [[[[Intelligence]] explosion]]
    • To me, it’s interesting that most people seem to think their job won’t be fully automated. People often agree that AI could make them much more productive, but the idea that it would completely automate their job they are skeptical of. It’s always other people’s jobs that are supposedly fully automatable (i.e. the jobs you have much less information about and context about what they do). People tend to overlook physical constraints, social constraints, and the importance of “taste” (which I would define as a human touch or high level human guidance to align things so they’re useful to us). This is probably why economists have been able to contribute the most here – all day long they’re thinking about resource constraints and the implications of those constraints. I mean, a common definition of economics is “studying the allocation of resources under constraints” #[[Personal Ideas]]
    • [[Sholto Douglas]] suggests the hardest part of an AI researcher’s job is not writing code or coming up with ideas, but paring down the ideas and shot calling under imperfect information. Complicating matters is the fact that things that work at small scale don’t necessarily work at large scale, as well as the fact that you have limited compute to test everything you dream of. Also, working in a collaborative environment where a lot of people are doing research can slow you down (lower iteration speed compared to when you can do everything yourself).
    • “ruthless prioritization is something which I think separates a lot of quality research from research that doesn’t necessarily succeed as much…They don’t necessarily get too attached to using a given sort of solution that they are familiar with, but rather they attack the problem directly.” – [[Sholto Douglas]]
    • Good researchers have good engineering skills, which enable them to try experiments really fast – their cycle time is faster, which is key to success.
    • [[Sholto Douglas]] suggests that really good data for [[Large Language Models (LLM)]] is data that involved a lot of reasoning to create. The key trick is somehow verifying the reasoning was correct – this is one challenge with generating [[synthetic data]] from LLMs.
    • [[Dwarkesh Patel]] makes an interesting comparison of human language being [[synthetic data]] that humans create, and [[Sholto Douglas]] adds that the real world is like a built-in verifier of that data. Doesn’t seem like a perfect analogy, but it does occur to me that some of the best “systems” in the world have some kind of built-in verification: capitalism, the scientific method, democracy, evolution, traditions (which I think of as an evolution of memes – the good ones stick).
    • Question I have is the extent to which [[compute]] will always be a constraint. Seems like it will obviously always be required to some extent, but I wonder what these guys think of the likelihood of some kind of model architecture or training method that improves [[statistical efficiency]] and [[hardware efficiency]] so much that, say, you can train a GPT-4 on your laptop in a day?
    • Relevant [[ChatGPT]] conversation here
  • [[superposition]] and secret communication (1:06:52 – 1:22:34)
    • When your data is high-dimensional and sparse (i.e. any given data point doesn’t appear very often), then your model will learn a compression strategy called [[superposition]] so it can pack more features of the world into it than it has parameters. Relevant paper from [[Anthropic AI]] here.
    • This makes interpretability more difficult, since when you see a [[neuron ([[neural net]])]] firing and try to figure out what it fires for, it’s confusing – like firing for 10% of every possible input.
      • This is related to the paper that [[Trenton Bricken]] and team at [[Anthropic AI]] put out called Towards Monosemanticity, which found that if you project the activations into a higher-dimensional space and provide a sparsity penalty, you get very clean features and everything starts to make more sense.
    • They suggest that [[superposition]] means [[Large Language Models (LLM)]] are under-parametrized given the complexity of the task they’re being asked to perform. I don’t understand why this follows.
    • [[knowledge distillation]]: the process of transferring knowledge from a large model to a smaller one
    • Puzzle proposed by [[Gwern Branwen]] (link): [[knowledge distillation]] gives smaller models better performance – why can’t you just train these small models directly and get the same performance?
      • [[Sholto Douglas]] suggests it’s because distilled models get to see the entire vector of probabilities for what the next token is predicted to be. In contrast, training just gives you a one-hot encoded vector of what the next token should have been, so the distilled model gets more information or “signal”.
      • In my mind, it’s not surprising that [[knowledge distillation]] would be more efficient given a certain amount of training resources. But do researchers find it’s better given any amount of training for the smaller model? That seems much less intuitive, and if that’s the case, what is the information being “sent” to the smaller model that can’t be found through longer training?
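      • A small numeric sketch of the “more signal” argument above. The three-token vocabulary and all probabilities are made up for illustration. Two students agree on the correct token but differ on the rest; the one-hot target cannot tell them apart, while the teacher’s full distribution can.

```python
import math
from typing import List


def cross_entropy(target: List[float], predicted: List[float]) -> float:
    """Cross-entropy of `predicted` against `target`, skipping zero-weight terms."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


# Hypothetical next-token distributions over ["cat", "dog", "car"].
one_hot_target = [1.0, 0.0, 0.0]   # ordinary training label: "cat"
teacher_probs = [0.6, 0.35, 0.05]  # teacher also says "dog" is plausible

student_a = [0.6, 0.35, 0.05]      # agrees with the teacher on the alternatives
student_b = [0.6, 0.05, 0.35]      # right about "cat", wrong about the rest

hard_a = cross_entropy(one_hot_target, student_a)
hard_b = cross_entropy(one_hot_target, student_b)
soft_a = cross_entropy(teacher_probs, student_a)
soft_b = cross_entropy(teacher_probs, student_b)
```

      • Here `hard_a` equals `hard_b`, while `soft_a` is smaller than `soft_b`, so only the soft targets give a gradient that pushes student B toward the teacher’s view of the alternatives.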
    • [[adaptive compute]]: spending more cycles thinking about a problem if it is harder. How is it possible to do this with [[Large Language Models (LLM)]]? The forward pass always does the same compute, but perhaps [[chain-of-thought (CoT)]] or similar methods are kind of like adaptive compute since they effectively produce more forward passes.
    • [[chain-of-thought (CoT)]] has been shown in tests to have some strange behaviour, such as giving the right answer even when the chain-of-thought reasoning is patently wrong, or giving the wrong answer it was trained to give and then providing a plausible-sounding but wrong explanation. E.g. this paper and this paper
    • Relevant [[ChatGPT]] conversations: here
  • [[[[AI]] Agents]] and true reasoning (1:22:34 – 1:34:40)
    • [[Dwarkesh Patel]] raises the question of whether agents communicating via text is the most efficient method – perhaps they should share [[residual ([[neural net]])]] streams.
      • [[Trenton Bricken]] suggests a good half-way measure would be using features you learn from [[Sparse Dictionary Learning (SDL)]] – more internal access but also more human interpretable.
    • Will the future of [[[[AI]] Agents]] be really long [[[[Large Language Models (LLM)]] context length]] with “[[adaptive compute]]” or instead will it be multiple copies of agents taking on specialized tasks and talking to one another? Big context or [[division of labour]]?
      • [[Sholto Douglas]] leans towards more agents talking to each other, at least in the near term. He emphasizes that it would help with interpretability and trust. [[Trenton Bricken]] mentions cost benefits as well since individual agents could be smaller and [[fine-tuning]] them makes them accurate.
      • Maybe in the long run the dream of [[reinforcement learning]] will be fulfilled – provide a very sparse signal and over enough iterations [[[[AI]] Agents]] learn from it. But in the shorter run, these will require a lot of work from humans around the machines to make sure they’re doing what we want.
    • [[Dwarkesh Patel]] wonders whether language is actually a very good representation of ideas as it has evolved to optimize human learning. [[Sholto Douglas]] adds that compared to “next token prediction”, which is a simple representation, representations in [[machine vision]] are more difficult to get right.
    • Some evidence suggests [[fine-tuning]] a model on generalized tasks like math, instruction following, or code generation enhances language models’ performance on a range of other tasks.
      • This raises the question in my mind – are there other tasks we want [[Large Language Models (LLM)]] to do where we might achieve better results by [[fine-tuning]] on a seemingly unrelated area? Like if we want a model to get better at engineering, should we fine tune on constructing a lego set, since so many engineers seem to have played with lego as a kid? What does real-world empirics about learning and performance tell us about where we should be fine-tuning? #[[Personal Ideas]]
    • Relevant [[ChatGPT]] conversations: here and here
  • How [[Sholto Douglas]] and [[Trenton Bricken]] got into [[AI]] research (1:34:40 – 2:07:16) #[[Career]]
    • [[Trenton Bricken]] has had significant success in interpretability, contributing to very important research, and has only been at [[Anthropic AI]] for 1.5 years. He attributes this success to [[luck]], ability to execute on putting together and quickly testing existing research ideas already lying around, headstrongness, willingness to push through when blocked where others would give up, and willingness to change direction.
    • [[Sholto Douglas]] agrees with those qualities for success (hard work, agency, pushing), but also adds that he’s benefited from being good at picking extremely high-leverage problems.
    • In organizations you need people that care and take direct responsibility to get things done. This is often why projects fail – nobody quite cares enough. This is one purpose of consulting firms like [[McKinsey]] ([[Sholto Douglas]] started there) – allows you to “hire” people you wouldn’t otherwise be able to for a short window where they can push through problems. They also are given direct responsibility as consultants which speaks to his first point.
    • [[Sholto Douglas]] also hustled – worked from 10pm-2am and 6-8 hours a day on the weekends to work on research and coding projects. [[James Bradbury]] (who was at [[Google]] but is now at [[Anthropic AI]]) saw [[Sholto Douglas]] was asking questions online that he thought only he was interested in, saw some robotics stuff on his blog, and then reached out to see if he wanted to work there. “Manufacture luck”
      • Another advantage of this fairly broad reading / studying he was doing was it gave him the ability to see patterns across different subfields that you wouldn’t get by just specializing in say, [[Natural Language Processing]].
    • One lesson here emphasized by [[Dwarkesh Patel]] is that the world is not legible and efficient. You shouldn’t just go to jobs.google.com or whatever and assume you’ll be evaluated well. There are other, better ways to put yourself in front of people and you should leverage that. Seems like it’s particularly valuable to do if you don’t have a “standard” background or look really good just on paper with degrees from Stanford or whatever. Put yourself out there and demonstrate you can do something at a world-class level.
      • This is what [[Andy Jones]] from [[Anthropic AI]] did with a paper on scaling laws and board games – when he published this, both Anthropic and [[OpenAi]] desperately wanted to hire him.
      • Another example is [[Simon Boehm]], who wrote a blog post which in [[Sholto Douglas]]’s view is the reference for optimizing a CUDA matmul kernel on a GPU.
      • “The system is not your friend. It’s not necessarily actively against you or your sworn enemy. It’s just not looking out for you. So that’s where a lot of proactiveness comes in. There are no adults in the room and you have to come to some decision for what you want your life to look like and execute on it.” -[[Trenton Bricken]]
      • “it’s amazing how quickly you can become world-class at something. Most people aren’t trying that hard and are only working the actual 20 hours or something that they’re spending on this thing. So if you just go ham, then you can get really far, pretty fast” – [[Trenton Bricken]]
    • Relevant [[ChatGPT]] conversation here
  • Are [[features]] the wrong way to think about [[intelligence]] (2:07:16 – 2:21:12)
    • [[Dwarkesh Patel]] and [[Trenton Bricken]] explore what a feature is in these large neural networks. A “feature” in a standard logistic regression model is quite clear and explicit – it’s just one of the terms in the regression.
    • [[ChatGPT]] provides a good answer here that helps resolve the confusion in my mind. It still makes sense to think of the model in terms of features, except in a [[neural net]], the features are learned rather than being explicitly specified. Each layer in a neural net can learn an increasingly complex and abstract set of features.
    • What would be the standard where we can say we “understand” a model’s output and the reasons it did what it did, ensuring it was not doing anything duplicitous?
      • You need to find features for the model at each level (including attention heads, residual stream, MLP, attention), and hopefully identify broader general reasoning circuits. To avoid deceptive behaviour, you could flag features that correspond to this kind of behaviour.
    • Relevant [[ChatGPT]] conversations: here, here, and here
  • Will [[[[AI]] interpretability]] actually work on superhuman models (2:21:12 – 2:45:05)
    • One great benefit of these [[Large Language Models (LLM)]] in terms of interpretability is they are deterministic, or you can at least make them deterministic. It’s like this alien brain you can operate on by ablating any part of it you want. If it does something “superhuman”, you should be able to decompose it into smaller spaces that are understandable, kind of like how you can understand superhuman chess moves.
    • Essentially, [[Trenton Bricken]] is hopeful that we can identify “bad” or “deceptive” circuits in [[Large Language Models (LLM)]] and essentially lobotomize them in those areas.
    • One interesting way he suggests of doing this is fine-tuning a model to have bad behaviour, and then use this bad model to identify the parts of the feature space that have changed.
    • There are similarities of features across different models that have been found. E.g. there are [[Base64]]-related features that are very common that fire for and model Base64 encoded text (common in URLs).
      • Similarity is measured using [[cosine similarity]] – a measure of similarity between two non-zero vectors in an inner product space. It takes a value in [-1, 1], where -1 means the vectors point in opposite directions, 0 means they are orthogonal, and 1 means they point in the same direction (not necessarily identical, since magnitude is ignored). Formula: (A • B) / (||A|| ||B||)
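      • A minimal sketch of the [[cosine similarity]] formula above in plain Python. The function name and the example vectors are illustrative; how feature vectors are actually extracted and compared across models is not shown here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """(A . B) / (||A|| ||B||) for two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # differs only in length
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```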
    • [[curriculum learning]]: training a model in a meaningful order from easy examples to hard examples, mimicking how human beings learn. This paper is a survey on this method – it seems to come with challenges and it’s unclear whether it’s currently used much to train models, but it’s a plausible avenue for future models to use to improve training.
    • [[feature splitting]]: a model tends to learn however many features it has capacity for that still span the space of representation. E.g. basic models will learn a “bird” feature, while bigger models learn features for different types of birds. “Oftentimes, there’s the bird vector that points in one direction and all the other specific types of birds point in a similar region of the space but are obviously more specific than the coarse label.”
      • The models seem to learn [[hierarchy]] – which is a powerful model for understanding reality and organizing a bunch of information so it is sensible and easily accessible. #[[Personal Ideas]]
    • [[Trenton Bricken]] makes the distinction between the [[weights ([[neural net]])]] that represent the trained, fixed parameters of the model and the [[activations ([[neural net]])]] which represent the actual results from making a specific call. [[Sholto Douglas]] makes the analogy that the weights are like the actual connection scheme between neurons, and the activations are the current neurons lighting up on a given call to the model. [[Trenton Bricken]] says “The dream is that we can kind of bootstrap towards actually making sense of the weights of the model that are independent of the activations of the data”.
    • [[Trenton Bricken]]’s work on [[[[AI]] interpretability]] uses a sparse autoencoding method which is unsupervised and projects the data into a wider space of features with more detail to see what is happening in the model. You first feed the trained model a bunch of inputs and get [[activations ([[neural net]])]], then you project into a higher dimensional space.
      • The amount of detail you get from a feature is determined by the [[expansion factor ([[neural net]])]], which represents how many times bigger the dimensionality of the space you’re projecting to is compared to the original space. E.g. if you have 1000 neurons and project to a 2000-dimensional space, the expansion factor is 2. The number of features you “see” in the space you’re projecting to depends on the size of this expansion factor.
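      • A rough sketch of the projection step described above: an untrained sparse autoencoder that maps activations into a space `expansion_factor` times wider, with a reconstruction loss plus an L1 sparsity penalty. The class name, the random initialization, the omission of bias terms, and the `l1_coeff` value are all assumptions for illustration; no training loop is shown.

```python
import random
from typing import List

Matrix = List[List[float]]


def make_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix: Matrix, vector: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    def __init__(self, n_inputs: int, expansion_factor: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_features = n_inputs * expansion_factor  # width of the feature space
        self.encoder = make_matrix(self.n_features, n_inputs, rng)
        self.decoder = make_matrix(n_inputs, self.n_features, rng)

    def encode(self, activations: List[float]) -> List[float]:
        """Project into the wider space; ReLU keeps feature values non-negative."""
        return [max(0.0, z) for z in matvec(self.encoder, activations)]

    def decode(self, features: List[float]) -> List[float]:
        return matvec(self.decoder, features)

    def loss(self, activations: List[float], l1_coeff: float = 1e-3) -> float:
        """Mean squared reconstruction error plus an L1 penalty on the features."""
        features = self.encode(activations)
        recon = self.decode(features)
        mse = sum((a - r) ** 2 for a, r in zip(activations, recon)) / len(activations)
        return mse + l1_coeff * sum(abs(f) for f in features)


sae = SparseAutoencoder(n_inputs=4, expansion_factor=2)
features = sae.encode([1.0, 0.0, -1.0, 0.5])
```

      • With 4 inputs and an expansion factor of 2 there are 8 candidate features; training would minimize `loss` over many recorded activations so that only a few features are active for any given input.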
    • A [[neuron ([[neural net]])]] can be polysemantic, meaning that it can represent multiple meanings or functions simultaneously. This polysemy arises because of “[[superposition]],” where multiple informational contents are superimposed within the same neuron or set of neurons. [[Trenton Bricken]] mentions that if you only look at individual neurons without considering their polysemantic nature, you can miss how they might code multiple features due to their superposition. Disentangling this might be the key to understanding the “role” of the Experts in [[mixture of experts (MoE)]] used in the recent [[Mistral ([[AI]] company)]] model – they could not determine the “role” of the experts themselves, so it’s an open question.
    • [[ChatGPT]] notes here
  • [[Sholto Douglas]] challenge for the audience (2:45:05 – 3:03:57)
    • A good research project [[Sholto Douglas]] challenges the audience with is to disentangle the neurons and determine the roles of the [[mixture of experts (MoE)]] model by [[Mistral ([[AI]] company)]], which is open source. There is a good chance there is something to discover here, since image models such as [[AlexNet]] have been found to have specialization that you can clearly identify.
  • Rapid Fire (3:03:57 – 3:11:51)
    • One rather disappointing point they make is that a lot of the cutting edge research on issues like multimodality, long-context, agent, reliability is probably not being published if it works well. So, published papers are not necessarily a great source to get to the cutting edge. This raises the question – how do you get to the cutting edge without working inside one of these tech companies?
      • [[Sholto Douglas]] mentions that academia and others outside the inner circle should work more on [[[[AI]] interpretability]], which is legible from the outside, and places like [[Anthropic AI]] publish all their research. It also typically doesn’t require a ridiculous amount of resources.

Notes on The Kimball Group Reader Chapter 1: The Reader at a Glance

  • Author:: [[Ralph Kimball]], [[Margy Ross]]
  • Reading Status:: #complete
  • Review Status:: #[[complete]]
  • Tags:: #books #[[dimensional modeling]]
  • Source:: link
  • Roam Notes URL:: link
  • Anki Tag:: kimball_group_reader kimball_group_reader_ch_1
  • Anki Deck Link:: link
  • Setting up for Success
    • 1.1 Resist the Urge to Start Coding ([[Ralph Kimball]], DM Review, November 2007) (Location 944)
      • Before writing any code or doing any modelling or purchasing related to your data warehouse, make sure you have a good answer to the following 10 questions:
        • [[Business Requirements]]: do you understand them? (Most fundamental and far-reaching question)
        • [[Strategic Data Profiling]]: are data assets available to support business requirements?
        • [[Tactical Data Profiling]]: Is there executive buy-in to support business process changes to improve data quality?
        • [[Integration]]: Is there executive buy-in and communication to define common descriptors and measures?
        • [[Latency]]: Do you know how quickly data must be published by the data warehouse?
        • [[Compliance]]: which data is compliance-sensitive, and where must you have protected chain of custody?
        • [[Data Security]]: How will you protect confidential or proprietary data?
        • [[Archiving Data]]: How will you do long-term archiving of important data and which data must be archived?
        • [[Business User Support]]: Do you know who the business users are, their requirements and skill level?
        • [[IT Support]]: Can you rely on existing licenses in your organization, and do IT staff have skills to support your technical decisions?
    • 1.2 Set Your Boundaries ([[Ralph Kimball]], DM Review, December 2007) (Location 1003) #[[Business Requirements]] #[[setting boundaries]]
      • This article is a discussion of setting clear boundaries in your data warehousing project to avoid taking on too many requirements.
  • Tackling DW/BI Design and Development
    • This group of articles focuses on the big issues that are part of every DW/BI system design. (Location 1071)
    • 1.3 Data Wrangling ([[Ralph Kimball]], DM Review, January 2008) (Location 1074) #[[data wrangling]] #[[data extraction]] #[[data staging]] #[[change data capture]]
      • [[data wrangling]] is the first stage of the data pipeline from operational sources to final BI user interfaces in a data warehouse. It includes [[change data capture]], [[data extraction]], [[data staging]], and [[data archiving]] (Location 1076)
      • [[change data capture]] is the process of figuring out exactly what data changed on the source system that you need to extract. Ideally this step would be done on the source production system. (Location 1080). Two approaches:
        • Using a change_date_time field in the source: a good option, but will miss record deletion and any override of the trigger producing the change_date_time field. (Location 1089)
        • Production system daemon capturing every input command: This detects data deletion but there are still DBA overrides to worry about. (Location 1089)
        • Ideally you will also get your source production system to provide a reason data changed, which tells you how the attribute should be treated as a [[slowly changing dimension (SCD)]]. (Location 1098) #[[change data capture]]
        • If you can’t do [[change data capture]] on the source production system, you’ll have to do it after extraction, which means downloading larger data sets. (Location 1103)
          • Consider using [[cyclic redundancy checksum (CRC)]] to significantly improve performance of the data comparison step here.
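          • A minimal sketch of the post-extraction comparison, assuming you keep one CRC per business key from the previous load and compare it against today's full extract. The helper names and the sample rows are hypothetical, and CRC-32 can collide, so treat this as an illustration rather than a guaranteed change detector.

```python
import zlib

def row_crc(row: dict) -> int:
    """CRC-32 over a canonical serialization of the row's columns."""
    payload = "|".join(f"{col}={row[col]}" for col in sorted(row))
    return zlib.crc32(payload.encode("utf-8"))

def detect_changes(previous_crcs: dict, extract: dict):
    """Compare stored CRCs (key -> crc) with a fresh extract (key -> row)."""
    inserted = [k for k in extract if k not in previous_crcs]
    deleted = [k for k in previous_crcs if k not in extract]
    updated = [
        k for k in extract
        if k in previous_crcs and row_crc(extract[k]) != previous_crcs[k]
    ]
    return inserted, updated, deleted

# Hypothetical extracts keyed by the source system's business key.
yesterday = {
    1: {"name": "Ann", "city": "Oslo"},
    2: {"name": "Bob", "city": "Rome"},
    3: {"name": "Cid", "city": "Lima"},
}
today = {
    1: {"name": "Ann", "city": "Oslo"},
    2: {"name": "Bob", "city": "Paris"},
    4: {"name": "Dee", "city": "Kyiv"},
}

previous_crcs = {key: row_crc(row) for key, row in yesterday.items()}
inserted, updated, deleted = detect_changes(previous_crcs, today)
print(inserted, updated, deleted)
```

          • Only one integer per row has to be stored and compared, instead of every column of the previous extract.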
      • [[data extraction]]: The transfer of data from the source system into the DW/BI environment. (Location 1117)
        • Two main goals in the [[data extraction]] step:
          • Remove proprietary data formats
          • Move data into [[flat files]] or [[relational tables]] (eventually everything is loaded into relational tables, but flat files can be processed very quickly)
      • [[data staging]]: [[Ralph Kimball]] recommends staging ALL data: save the data the DW/BI system just received in original target format you chose before doing anything else to it. (Location 1126)
      • [[data archiving]]: this is important for compliance-sensitive data where you have to prove data received hasn’t been tampered with. Techniques here include using a [[hash code]] to show data hasn’t changed. (Location 1129)
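      • A minimal sketch of the hash-code idea from the archiving bullet above: record a digest when data arrives, then recompute it later to show the archived copy is unchanged. SHA-256 and the sample payload are assumptions; the notes only say "hash code".

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Hex digest recorded alongside the archived extract."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical extract as received from the source system.
received = b"order_id,amount\n1001,25.00\n"
stored_digest = fingerprint(received)

# Later audit: an untouched copy matches, an altered copy does not.
untouched_ok = fingerprint(received) == stored_digest
tampered = received.replace(b"25.00", b"95.00")
tampered_ok = fingerprint(tampered) == stored_digest
print(untouched_ok, tampered_ok)
```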
    • 1.4 Myth Busters ([[Ralph Kimball]], DM Review, February 2008) (Location 1135) #[[dimensional modeling]]
      • Addresses various myths related to [[dimensional modeling]]
      • Myth: A dimensional model could be missing key relationships that exist only in a true relational view. (Location 1142)
        • In fact, dimensional models contain all the data relationships that normalized models have.
      • Myth: dimensional models are not sufficiently extensible and do not accommodate changing [[business requirements]]. (Location 1149)
        • It’s the opposite: normalized models are much harder to change when data relationships change. [[slowly changing dimension (SCD)]] techniques provide the basis for models to meet changing [[business requirements]].
      • Myth: dimensional models don’t capture data at sufficient level of granularity / detail (Location 1177)
        • In fact, models should capture measurement events in [[fact tables]] at the lowest possible grain.
    • 1.5 Dividing the World ([[Ralph Kimball]], DM Review, March 2008) (Location 1188)
      • Two main entities in [[dimensional modeling]] (Kimball estimates 98% of data can be immediately and obviously categorized as one of these):
        • [[dimension ([[dimensional modeling]])]]: the basic stable entities in our environment, such as customers, products, locations, marketing promotions, and calendars. In end user BI tools, dimensions are primarily used for constraints and row headers.
        • [[fact ([[dimensional modeling]])]]: Numeric measurements or observations gathered by all of our transaction processing systems and other systems. In end user BI tools, facts are primarily used for computations.
          • [[fact table grain]]: Description of measurement in physical, real-world terms – a description of what each row in the [[fact table]] represents. (Location 1209) There is sometimes a temptation to add facts not true to the grain to shortcut a query, but this often introduces complexity and confusion for business users. (Location 1218)
          • A [[fact ([[dimensional modeling]])]] should be additive whenever possible – it should make sense to add facts across records. A common example here is storing extended price (i.e. price * quantity) instead of just price in a fact table where the measurement is retail sale (Location 1224)
      • A distinct characteristic of [[dimensional modeling]] is not using [[normalized data]]. Normalized models are great in transaction processing systems, but they are not understandable by business users. Dimensional models, correctly designed, contain exactly the same data and reflect the same business rules, but are more understandable. [[understandability]] is a central goal of a BI system used by business users. (Location 1249)
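      • A small worked example of the additivity point above, assuming a retail sales fact table at the grain of one row per line item. The column names and values are hypothetical.

```python
# One row per retail sale line item (hypothetical sample data).
fact_rows = [
    {"product": "pen", "unit_price": 2.0, "quantity": 3},
    {"product": "pen", "unit_price": 2.5, "quantity": 2},
]

# Store the additive measure at load time: extended price = price * quantity.
for row in fact_rows:
    row["extended_price"] = row["unit_price"] * row["quantity"]

# Summing the additive fact across rows is meaningful (total revenue).
total_revenue = sum(row["extended_price"] for row in fact_rows)

# Summing unit prices across rows is not a meaningful business number.
summed_unit_prices = sum(row["unit_price"] for row in fact_rows)

print(total_revenue, summed_unit_prices)
```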
    • 1.6 Essential Steps for the Integrated [[Enterprise Data Warehouse (EDW)]] ([[Ralph Kimball]], DM Review, April 2008 and May 2008) (Location 1254) #[[data integration]]
      • This section provides an overall architecture for building an integrated [[Enterprise Data Warehouse (EDW)]] which supports [[Master Data Management (MDM)]] and has the mission of providing a consistent business analysis platform for an organization. (Location 1258)
      • Essential act of the [[Enterprise Data Warehouse (EDW)]] is [[drilling across]]: gathering results from separate [[business process subject area]]s and combining them into a single analysis. (Location 1282)
      • A key prerequisite to developing the [[Enterprise Data Warehouse (EDW)]] is a significant commitment and support from top-level management on the value of data integration. (Location 1303)
      • Having an existing [[Master Data Management (MDM)]] project is a good sign of executive buy-in for data integration, and significantly simplifies data warehouse [[data integration]]. (Location 1308)
      • [[conformed dimensions]] and [[conformed facts]] provide the basis for [[data integration]] (Location 1316)
        • [[conformed dimensions]]: two dimensions are conformed if they contain one or more common fields whose contents are drawn from the same domains. (Location 1318) Typical examples: customer, product, service, location, employee, promotion, vendor, and calendar. (Location 1367)
        • [[conformed facts]]: numeric measures that have the same business and mathematical interpretations so that they may be compared and computed against each other consistently. (Location 1320)
      • [[enterprise data warehouse (EDW) bus matrix]]: two-dimensional matrix with [[business process subject area]] on the vertical axis and [[dimension tables]] on horizontal axis. (Location 1324) An X in the matrix represents where a subject area uses a dimension. It helps you prioritize development of separate subject areas and identify possible scope of [[conformed dimensions]]. "The columns of the bus matrix are the invitation list to the conformed dimension design meeting." (Location 1333) This is an important item to send to senior management to review before conformed dimension design meetings. "If senior management is not interested in what the bus matrix implies, then to make a long story short, you have no hope of building an integrated EDW." (Location 1335)
        • Note that the different stakeholders don’t have to give up their domain specific private attributes that they need – stakeholders just need to agree on the [[conformed dimensions]]. (Location 1341)
        • Even when you get senior management full buy-in, there is a lot of operational management involved in the [[Enterprise Data Warehouse (EDW)]], including two abstract figures: the [[dimension manager]] (builds and distributes a conformed dimension to the rest of the enterprise) and the [[fact provider]] (downstream client to the dimension manager who receives and utilizes the conformed dimension, almost always while managing one or more fact tables within a subject area). (Location 1347)
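      • A minimal sketch of [[drilling across]] as described in this section: each subject area is summarized separately by a conformed dimension attribute, and the results are then merged on that shared row header. The table contents, the `product` attribute, and the helper names are hypothetical.

```python
from collections import defaultdict

def summarize(fact_rows, dim_attr, measure):
    """Aggregate one subject area's fact rows by a conformed dimension attribute."""
    totals = defaultdict(float)
    for row in fact_rows:
        totals[row[dim_attr]] += row[measure]
    return dict(totals)

def drill_across(summaries):
    """Merge per-subject-area summaries on the shared row header.

    summaries maps a measure name to {label: value}. A label missing from
    one subject area is reported as None rather than an invented zero.
    """
    labels = sorted({label for s in summaries.values() for label in s})
    return {
        label: {name: s.get(label) for name, s in summaries.items()}
        for label in labels
    }

# Two separate business processes sharing the conformed "product" attribute.
sales = [
    {"product": "pen", "sales_amount": 10.0},
    {"product": "ink", "sales_amount": 4.0},
    {"product": "pen", "sales_amount": 5.0},
]
shipments = [
    {"product": "pen", "units_shipped": 7.0},
    {"product": "pad", "units_shipped": 2.0},
]

report = drill_across({
    "sales_amount": summarize(sales, "product", "sales_amount"),
    "units_shipped": summarize(shipments, "product", "units_shipped"),
})
for label, measures in report.items():
    print(label, measures)
```

      • The merge only works because both subject areas draw `product` values from the same domain, which is what conforming the dimension provides.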
    • 1.7 Drill Down to Ask Why ([[Ralph Kimball]], DM Review, July 2008 and August 2008) (Location 1481) #[[decision making]]
      • Important to understand how your data warehousing system drives decision-making, not just your technical architecture.
      • [[Bill Schmarzo]] architecture for decision making, aka [[analytic application process]]: (Location 1489)
        1. Publish reports.
        2. Identify exceptions.
        3. Determine causal factors. Seek to understand the “why” or root causes behind the identified exceptions. Main ways you might do this: #[[causality]] #[[determining causality]]
          • Get more detail
          • Get a comparison
          • Search other data sets
          • Search the web for information about the problem
        4. Model alternatives. Provide a backdrop to evaluate different decision alternatives.
        5. Track actions. Evaluate the effectiveness of the recommended actions and feed the decisions back to both the operational systems and DW, against which published reporting will occur, thereby closing the loop.
    • 1.8 Slowly Changing Dimensions ([[Ralph Kimball]], DM Review, September 2008 and October 2008) (Location 1557) #[[slowly changing dimension (SCD)]]
      • The Original Three Types of [[slowly changing dimension (SCD)]] cover all the responses required for a revised or updated description of a dimension member (Location 1569)
        • [[type 1 slowly changing dimension (SCD)]]: Overwrite
        • [[type 2 slowly changing dimension (SCD)]]: Add a New Dimension Record
        • [[type 3 slowly changing dimension (SCD)]]: Add a New Field
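        • A minimal sketch of the three responses applied to an in-memory dimension table. The column names (`surrogate_key`, `natural_key`, `is_current`, `start_date`, `end_date`, `previous_<attr>`) are assumptions chosen for illustration, not a prescribed schema.

```python
def scd_type1(dim_rows, natural_key, attr, new_value):
    """Type 1: overwrite the attribute in place; history is lost."""
    for row in dim_rows:
        if row["natural_key"] == natural_key:
            row[attr] = new_value

def scd_type2(dim_rows, natural_key, attr, new_value, effective_date):
    """Type 2: expire the current row and add a new row with a new surrogate key."""
    current = next(
        r for r in dim_rows
        if r["natural_key"] == natural_key and r["is_current"]
    )
    current["is_current"] = False
    current["end_date"] = effective_date
    new_row = dict(
        current,
        surrogate_key=max(r["surrogate_key"] for r in dim_rows) + 1,
        is_current=True,
        start_date=effective_date,
        end_date=None,
    )
    new_row[attr] = new_value
    dim_rows.append(new_row)

def scd_type3(dim_rows, natural_key, attr, new_value):
    """Type 3: add a field holding the prior value next to the current one."""
    for row in dim_rows:
        if row["natural_key"] == natural_key:
            row[f"previous_{attr}"] = row[attr]
            row[attr] = new_value

def fresh_dimension():
    """Hypothetical one-row customer dimension used for each demo."""
    return [{
        "surrogate_key": 1, "natural_key": "C42", "city": "Oslo",
        "is_current": True, "start_date": "2008-01-01", "end_date": None,
    }]

t1 = fresh_dimension()
scd_type1(t1, "C42", "city", "Rome")

t2 = fresh_dimension()
scd_type2(t2, "C42", "city", "Rome", "2008-09-01")

t3 = fresh_dimension()
scd_type3(t3, "C42", "city", "Rome")

print(len(t1), len(t2), len(t3))
```

        • Type 2 is the only response here that grows the table, since each change adds a row.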
    • 1.9 Judge Your BI Tool through Your Dimensions – [[Ralph Kimball]], DM Review, November 2008 (Location 1650) #[[BI tools]] #[[BI tool selection]] #[[dimension tables]]
      • [[dimension tables]] implement the [[UI]] of your BI system: they provide the labels, the groupings, the drill-down paths.
      • This article describes [[requirements]] a BI tool should be able to meet with dimensions:
        • Assemble a BI query or report request by first selecting [[dimension table attributes]] and then selecting [[facts (dimensional modelling)]] to be summarized.
        • [[drilling down]] by adding a row header
        • Browse a dimension to preview permissible values and set constraints
        • Restrict the results of a dimension browse with other constraints in effect
        • [[drilling across]] by accumulating measures under labels defined by conformed dimension attributes
    • 1.10 [[fact tables]] – [[Ralph Kimball]], DM Review, December 2008 (Location 1707)
      • [[fact tables]] contain the fundamental measurements of the enterprise and are the target of most data warehouse queries.
      • Design rules for [[fact tables]]:
        • Stay true to the [[fact table grain]] – take care in defining the [[grain (dimensional modelling)]] – what a single record in the fact table represents. This is the first and most important design step. It ensures the [[foreign keys]] in the fact table are grounded and precise.
        • Build up from the lowest possible [[fact table grain]]. This ensures you have the most complete set of [[dimension tables]] that can describe the fact table and enables detailed [[drilling down]] for the user.
      • 3 types of [[fact tables]]: (Location 1741)
        • [[transaction grain [[fact table]]]]: Measurement taken at a single instance (e.g. each cash register beep). Transactions can happen after a millisecond or next month or never – they’re unpredictably sparse or dense.
        • [[periodic snapshot grain [[fact table]]]]: Facts cover a predefined span of time. Powerful guarantee: all reporting entities will appear in each snapshot, even if there is no activity – it’s predictably dense and applications can rely on certain key combinations being available.
        • [[accumulating snapshot grain [[fact table]]]]: Rows represent a predictable process with a well-defined beginning and end (e.g. order processing, claims processing).
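        • A minimal sketch contrasting the sparse transaction grain with the predictably dense periodic snapshot: every account gets a row for every period, even when it had no activity. The account names, periods, and amounts are hypothetical.

```python
def periodic_snapshot(transactions, accounts, periods):
    """Roll sparse transactions up into one row per (account, period)."""
    # Start every key combination at zero so inactive accounts still appear.
    totals = {(a, p): 0.0 for p in periods for a in accounts}
    for t in transactions:
        totals[(t["account"], t["period"])] += t["amount"]
    return [
        {"account": a, "period": p, "amount": totals[(a, p)]}
        for p in periods
        for a in accounts
    ]

# Transaction grain: rows exist only when something happened.
transactions = [
    {"account": "A", "period": "2009-01", "amount": 50.0},
    {"account": "A", "period": "2009-01", "amount": 25.0},
    {"account": "B", "period": "2009-02", "amount": 10.0},
]

snapshot = periodic_snapshot(
    transactions, accounts=["A", "B"], periods=["2009-01", "2009-02"]
)
for row in snapshot:
    print(row)
```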
    • 1.11 Exploit Your [[fact tables]] – [[Ralph Kimball]], DM Review, January/February 2009 (Location 1765)
      • This article describes basic ways to exploit the 3 main fact table designs in the front room and in the back room.
      • Front Room: [[aggregate navigation]] – choosing to give the user pre-aggregated data at run time, without the end user knowing the difference. Seamlessly provide aggregated and detailed atomic data.
      • Front Room: [[drilling across]] Multiple Fact Tables at Different Grains – you can do this as long as you choose [[conformed dimensions]] for the answer set row headers that exist for all the fact tables in your integrated query.
      • Front Room: Exporting Constraints to Different Business Processes – building connections to other [[business process subject area]] in the [[UI]] so you can explore related data in a single click or swipe.
      • Back Room: [[fact table surrogate keys (FSKs)]] – sometimes you want to do this for one of the following benefits
        • Uniquely and immediately identify single fact records.
        • FSKs assigned sequentially so a load job inserting new records will have FSKs in a contiguous range.
        • An FSK allows updates to be replaced by insert-deletes.
        • An FSK can become a foreign key in a fact table at a lower grain.
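      • A minimal sketch of the [[aggregate navigation]] idea from the Front Room bullet above, assuming the navigator picks the smallest table whose attributes cover the request. The table names, attribute sets, and row counts are hypothetical, and real navigators live inside the database or BI layer.

```python
def choose_table(requested_attrs, tables):
    """Return the name of the smallest table that can answer the request."""
    candidates = [t for t in tables if set(requested_attrs) <= t["attrs"]]
    if not candidates:
        raise LookupError("no table covers the requested attributes")
    return min(candidates, key=lambda t: t["row_count"])["name"]

# Hypothetical catalog: one atomic fact table and one pre-built aggregate.
tables = [
    {"name": "sales_atomic",
     "attrs": {"day", "month", "product", "store"},
     "row_count": 1_000_000},
    {"name": "sales_by_month_product",
     "attrs": {"month", "product"},
     "row_count": 5_000},
]

# The user asks the same kind of question either way; only the routing differs.
monthly_by_product = choose_table({"month", "product"}, tables)
monthly_by_store = choose_table({"month", "store"}, tables)
print(monthly_by_product, monthly_by_store)
```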

Notes on “Why Take Notes” by Mark Nagelberg

  • Author:: [[Mark Nagelberg]]
  • Source:: link
  • Reading Status:: [[complete]]
  • Review Status:: [[complete]]
  • Anki Tag:: nagelberg_why_take_notes
  • Anki Deck Link:: link
  • Blog Notes URL:: link
  • Tags:: #[[Spaced Repetition Newsletter]] #[[Blog Posts]] #[[PKM]] #[[note-taking]] #[[triple-pass system]] #retrieval #elaboration #[[knowledge management]] #[[articles]]
  • Notes

    • Two pillars of the "Triple-Pass System":
      • Note-taking
      • Spaced repetition
    • Why note-taking is necessary:
      • Preparing for [[Spaced Repetition]]
        • You don’t want to add directly to spaced repetition on first read – you’ll add too much unnecessary information or miss important context.
      • [[retrieval]] and [[elaboration]] practice
        • Reviewing, consolidating, and connecting your notes involves both retrieval and elaboration, which are beneficial for learning.
      • Computer-aided information [[retrieval]] and [[idea generation]]
        • Your notes can store more information and detail.
        • Digital search tools make it easy to look up information in your notes as well as find unexpected connections and insights you wouldn’t get from memory alone.

Notes on “3 Things I Wish I did as a Junior Dev” by Theo Browne

  • {{[video]: https://www.youtube.com/watch?v=1rC4cTRZeWc}}
  • Author:: [[Theo Browne]]
  • Reading Status:: [[complete]]
  • Review Status:: [[complete]]
  • Tags:: #Video #programming #learning #[[Career]]
  • Blog Notes URL:: https://www.marknagelberg.com/notes-on-3-things-i-wish-i-did-as-a-junior-dev-by-theo-browne/
  • Roam Notes URL:: link
  • Anki Tag:: theo_browne_3_junior_dev_tips
  • Anki Deck Link:: link
  • Notes

    • Overview: [[Theo Browne]] talks about the 3 main tactics he used when he was a new developer to level up extremely fast.
    • Tip 1: Try to Get On Call
      • Extremely valuable to see how things go wrong, and how they are fixed when they do.
    • Tip 2: You Don’t Learn Codebases in the Code Tab on GitHub. You Learn Codebases on the Pull Request Tab on GitHub. #[[pull requests]] [[GitHub]]
      • It helps you get critical [[context]] to see how code changes, how teams work in a codebase, what features are being developed, and why. This is what helps you build a mental map around the codebase to become a successful contributor.
    • Tip 3: Interview More #[[interviews]]
      • Do more interviews at other companies to see where you stand, but more importantly, do interviews yourself of prospective employees. You learn a lot about how good a developer you are, what expectations are, and what makes a good engineer. The more interviews you do on both sides, the more you understand the field overall.

Notes on “The Year of Fukuyama” by Richard Hanania

  • Title:: The Year of Fukuyama
  • Author:: [[Richard Hanania]]
  • Reading Status:: #complete
  • Review Status:: #[[complete]]
  • Tags:: #articles #[[politics]] #[[democracy]] #[[political science]]
  • URL:: https://richardhanania.substack.com/p/the-year-of-fukuyama
  • Source:: #instapaper
  • Roam Notes URL:: https://www.marknagelberg.com/notes-on-the-year-of-fukuyama-by-richard-hanania/
  • Anki Tag:: hanania_year_of_fukuyama
  • Anki Deck Link:: link
  • Notes

    • Many misunderstand [[Francis Fukuyama]] as saying nothing will ever happen again. His argument was not that there would be no wars or genocide, but that there would be no serious alternative to liberal [[democracy]]. (View Highlight) #[[Ankified]]
    • Before 2022, experts were bullish on some non-democratic states: #[[Ankified]]
      • Experts have spoken seriously about the advantages of the “[[[[China]] Model]]”: technocratic skill and political meritocracy over voting and mobilized citizenry. (View Highlight)
        • Their response to [[COVID-19]] was often trotted out as making the case for the model. E.g. [[New York Times]] reporting that life in [[China]] was back to normal in September 2020, compared to the West. (View Highlight)
      • Experts also were optimistic about [[Russia]]’s economic and geopolitical prospects (believing they will become a mid-tier European power). (View Highlight)
    • In 2022, it seems these threats to liberal democracy have collapsed in different ways, suggesting Western societies are far more robust. (View Highlight) #[[Ankified]]
      • [[China]] is sticking stubbornly with [[Zero Covid]] strategy, and is taking draconian measures to enforce it. This makes absolutely no sense in any [[cost-benefit analysis]], given vaccines and the new contagious variants. You could argue other terrible things they do, such as their treatment of [[Uighurs]], are "rational" and don’t prevent them maintaining growth and influence. But Zero Covid is simply stupid, bad strategy. (View Highlight) [[Peter Thiel]] said China is limited by its autistic and profoundly uncharismatic nature and [[Richard Hanania]] sees Zero Covid as evidence this is true: "I used to think that China could be the kind of autist that builds SpaceX. Instead, it’s the kind that is afraid to look strangers in the eye and stays up all night playing with his train collection." (View Highlight)
      • [[China]] is now more hostile to free markets which is what helped it succeed in the first place (see disappearing billionaires and overnight destruction of entire industries). It’s bad for government control and gets in the way of serving the state. (View Highlight)
      • [[Russia]] made a major blunder entering [[Ukraine]], which will make it certain to be poor and backwards for years as the West cuts it off. (View Highlight)
        • "It’s easy to mock Ukraine as a “current thing.” But we shouldn’t trivialize the strength of the Western reaction to the Russian invasion. This isn’t like the rise of zhe/zir pronouns or some new DEI initiative. Western leaders, with the support of both public and elite opinion, came together and formed a united front against an instance of international aggression, and helped a nation practically everyone thought would collapse or become a satellite of its neighbor maintain its independence. These societies did all this while having to make massive economic sacrifices, with countries in Europe wondering whether they will even have enough energy to heat their homes in the winter." (View Highlight)
    • As a result, "normie theories of [[democracy]]" seem to be correct (i.e. democracy provides checks and balances, peaceful transfer of power, peaceful correction of mistakes, and gives citizens a voice) (View Highlight) #[[Ankified]]
      • [[China]] failed because it was too [[risk]] averse. [[Russia]] failed because it’s too risk loving. In both cases, they failed because they "involve a governing elite that is willing and able to drag a public towards making massive sacrifices for a fundamentally irrational goal." (View Highlight) In a [[democracy]], flawed ideas like this typically don’t have the power of the state behind them for long.
      • "critics of democracy have to keep bringing up [[Lee Kuan Yew]] because there have been so few like him" (View Highlight)
      • Like [[Tyler Cowen]], always ask "are you long or short the market?" People that say democracy is crumbling in the West are never actually short the market: events like [[January 6th]] are in fact evidence of strength, [[wokeness]] is not going to have the impact some say it will. (View Highlight)
      • The world will continue to increasingly look like the West, because there simply is no other viable option. (View Highlight)

Notes on “The First Room-Temperature Superconductor Has Finally Been Found” by sciencenews.org

  • Title:: The First Room-Temperature Superconductor Has Finally Been Found
  • Author:: [[sciencenews.org]]
  • Recommended By:: [[Tyler Cowen]]
  • Reading Status:: #complete
  • Review Status:: #complete
  • Tags:: #articles #superconductor #technology #innovation #[[new technology]]
  • URL:: https://www.sciencenews.org/article/physics-first-room-temperature-superconductor-discovery
  • Source:: #instapaper
  • Roam Notes URL:: link
  • Anki Tag:: science_news_room_temp_superconductor
  • Anki Deck Link:: link
  • Notes

    • Scientists reported the discovery of the first room-temperature [[superconductor]], after more than a century of waiting. (View Highlight)
    • Superconductors transmit electricity without resistance, allowing current to flow without any energy loss. But all superconductors previously discovered must be cooled to very low temperatures, making them impractical. (View Highlight) #Ankified
    • If a room-temperature [[superconductor]] could be used at atmospheric pressure (the new material only works at very high pressure), it could save vast amounts of [[energy]] lost to resistance in the [[electrical grid]]. And it could improve current technologies, from [[MRI machines]] to [[quantum computers]] to [[magnetically levitated trains]]. Dias envisions that humanity could become a “superconducting society.” (View Highlight)
    • It’s a big advance, but practical applications are still a long way off.

Roam Notes on “Patrick Collison in conversation with Tyler Cowen | Full Q&A | Oxford Union Web Series”

https://www.youtube.com/watch?v=wfdRF_krbp8
  • Author:: [[Tyler Cowen]] and [[Patrick Collison]]
  • Source:: link
  • Recommended By:: [[Tyler Cowen]]
  • Tags:: #technology #progress
  • Roam Notes URL:: link
  • {{[video]: https://www.youtube.com/watch?v=wfdRF_krbp8}}
  • (0:29) [[Tyler Cowen]]: As the next big biomedical technology breakthroughs come, are you concerned that increased life expectancy would result in calcification of institutions by entrenching incumbents?
    • [[Patrick Collison]]: It’s a problem to be solved, but not a convincing objection, because the answer to the inverse proposal, "can we ensure everyone dies at age 80?", seems to clearly be "no".
  • (1:35) [[Tyler Cowen]]: To what extent do you think the attraction of progress is "feel / aesthetics" or giving people what they want?
    • PC: Correlation between happiness and GDP is about .78. So progress really does drive satisfaction. That suggests it’s more about the outcome rather than the process, but my intuition is that it’s the process of generating progress itself that is the relevant question.
  • (3:43) [[Tyler Cowen]]: You’ve written both optimistic and pessimistic visions for our path forward with technology. What is your underlying model?
  • (5:50) [[Tyler Cowen]]: The [[mRNA vaccines]] work, and there were at least 25 years where there was no marketplace adoption. All of a sudden paradise rains down during [[COVID-19]] – maybe this is how progress works and we shouldn’t be so pessimistic?
    • [[Patrick Collison]]: You could argue the opposite – the fact that we needed a pandemic to finally get to commercialization is an indicator of systemic problems. The fact they were so ready to deploy indicates the extent of the problem.
  • (7:43) [[Tyler Cowen]]: What is the most misleading statistic and what is the most underrated statistic for measuring progress?
    • [[Patrick Collison]]: Self-reported happiness is important but a lot of the comparisons you want to perform with it are fraught or misleading. Intertemporal comparisons lead to strange conclusions.
    • (10:15) [[Tyler Cowen]]: part of me thinks total [[population]] may be the ultimate measure of progress, which would not be good for [[Japan]]. Everyone admires small countries that are well run, but consider [[Brazil]] – obviously lots of problems, not as well run, but it’s produced many people.
  • (12:30) [[Patrick Collison]]: Culture is very important for determining progress. If you look at [[the Scottish Enlightenment]], they were very obsessed with things like [[culture]], [[norms]], and mindset, which seems old-fashioned now.
  • (13:30) [[Patrick Collison]]: [[Africa]] has a promising future because of the internet: the people there are suddenly able to compete there on the same level as other places in the world. They also have a significantly growing and young population.
    • [[Tyler Cowen]]: Another advantage – since there are many more countries there, they can run more experiments.
    • [[Patrick Collison]]: They understand the importance of progress better than many westerners, who tend to now have a complacent, postmodern view that it’s not that important.
  • (20:40) [[Patrick Collison]]: [[Ireland]] has a bit of an inferiority complex, so it doesn’t view itself as the best at everything, but this kind of attitude can help stoke progress.
  • (22:04) [[Patrick Collison]]: [[Mathematica]] is one of the most underrated achievements of our age. #programming
    • It’s been getting steadily better over multiple decades. Programming languages don’t innovate much after they’re released, partially because they’re [[open source software]] which can make it harder to make significant changes. Mathematica shows that a multi-decadal software project is totally sensible – it’s improving at a faster rate now than it ever has.
    • Mathematica is like [[Stripe]] in that they are both sort of programming languages – one for computing, one for financial infrastructure. Developer productivity is the primary focus for both. [[Stephen Wolfram]] is also admirable and ambitious. He doesn’t believe in libraries – he believes that your programming language should just do all the things! It’s like he’s building the Library of Alexandria in the programming language.
  • (26:00) [[Tyler Cowen]]: You showed an early interest in meta-programming languages such as [[Lisp]]. Why, and what does that show about your thought generally? #programming #[[functional programming]]
    • [[Patrick Collison]]: Two things:
      • In computing we’re stuck in these local maxima and there’s an entrenched status quo. The cost is probably much greater than people realize. Off the beaten path projects like Lisp and Mathematica helped to understand the design space and what was possible. #learning
      • Lisp is a programming language for individuals. It takes seriously the question "how do you make a single individual as enabled and productive as possible?" E.g. "reader macros" where you define on the fly the actual syntax of the language. To other programmers this is a disaster – how do you have a large project where you get a bunch of people to work on random syntax you defined? [[Stripe]] takes this individual view: how can we make it possible for one person to build a business with financial payments in one evening?
  • (29:20) [[Tyler Cowen]]: Why is Stripe a [[writing]] company? And how does this spring from your love of [[Lisp]] and [[Mathematica]]?
    • [[Patrick Collison]]: If you take ideas seriously, you have to become a writing culture. You want to find the best solution, not something that "just works". We’re still debating fundamental questions at Stripe that have been around for years. To make progress on that, you have to be a writing culture. If you don’t write ideas down extensively or specifically, it’s hard to say that they’re wrong and you can’t make progress.
  • (42:16) [[Patrick Collison]]: The prevalence of [[open office plans]] has a lot to do with being able to shuffle people around easily in a high-growth company. Three unique strategies of [[Stripe]] in terms of creating an optimal [[work environment]]:
    • Move teams quickly (every 3-6 months switch to a new location)
    • Move unrelated teams close together (for serendipity, creating a warm atmosphere)
    • Make the entire physical space as connected as possible (e.g. central stairwells to get as much serendipitous interaction as possible).
  • (48:45) [[Patrick Collison]]: It’s actually hard to get funding at top universities with large endowments. A lot of the best [[Fast Grants]] applications were actually from people from top universities, so there are potentially high returns to improving funding for the best researchers. #[[research funding]]
  • (52:20) [[Tyler Cowen]]: How should we better run funding institutions, and why is there so much [[conformism]] in universities / nonprofits / philanthropy, and how does all that tie together? #[[research funding]]
    • [[Patrick Collison]]: for science institutions, more structural diversity. In terms of which work is being funded, what the field delineations are, different models for how careers work or where work is done. Find all of the axes where you could try new and different things. A lot of people don’t realize how monochromatic it is – so much is downstream of institutions like [[NIH]] – researchers understand how stifling this is, but don’t speak out about it because they rely on the funding.

Roam Notes on “Revolt of the Public and the Crisis of Authority in the New Millennium” by Martin Gurri

  • Title:: Revolt of the Public and the Crisis of Authority in the New Millennium
  • Author:: [[Martin Gurri]]
  • Recommended By:: [[Austen Allred]]
  • Reading Status:: #complete
  • Review Status:: #[[third pass]]
  • Tags:: #books #information #media #authority #elites #[[the public]] #[[the internet]]
  • URL:: link
  • Source:: #kindle
  • Roam Notes URL:: link
  • Anki Tag:: gurri_public_revolt
  • Anki Deck Link:: link
  • Notes

    • Overview
      • The internet is transforming our world by dramatically increasing the volume of information and destroying elite quasi-monopolies on information. The result is a clash between the public (networked, egalitarian, bottom-up) and authority (elites, top-down, hierarchical).
      • The public, once a much more passive entity that would blindly accept direction from the top, has increased power in the networked age of the internet and can openly challenge elite narratives. Elites have not accepted this reality and continue to try to silence the public. Clashes between authority and the public will continue until something changes. The resulting turbulence puts much at risk, including liberal democracy itself.
    • 1. PRELUDE FOR A TURBULENT AGE (Location 85)
      • Supply of information has exploded in unprecedented ways in recent years and this decreases authority of any one source. (Location 117)
      • How Walter Cronkite became Katie Couric and the audience became the public (Location 146)
        • "Uncertainty is an acid, corrosive to authority. Once the monopoly on information is lost, so too is our [[trust]]…proof for and against approaches infinity, a cloud of suspicion about cherry-picking data will hang over every authoritative judgment." (Location 154) Disparities between elite interests and public interests become crystal clear. #uncertainty
        • The mass audience transformed into [[vital communities]]: groups of wildly disparate size gathered organically around a shared interest or theme. They are amateurs, educated non-elites and information suddenly began to circulate at this level. (Location 193) #Ankified
        • Ultimately this liberation of information changed the relationship between the public and authority in almost every domain, and this change in the relationship is the main theme of the book (Location 205).
      • I christen the new age and other definitional illusions (Location 207)
        • [[the public]]: amateurs fractured into vital communities, each clustered around an “affair of interest” to the group #Ankified
        • [[authority]]: trained professionals, with access to hidden knowledge, perched on top of a specialized hierarchy. They usually achieved their position through difficult accreditation, and are reluctant to listen to those who haven’t gone through the same hoops. (Location 247) Lasting authority comes from institutions that speak on their behalf. (Location 254). E.g. government, corporations, financial institutions, universities, mass media, politicians, scientific research industry, etc. #Ankified
        • [[information]] doesn’t grow linearly, it experiences huge sudden changes or "Waves" that transform the landscape: (Location 271) #Ankified
          • [[1st Wave (information growth)]]: The invention of writing
          • [[2nd Wave (information growth)]]: Development of the alphabet
          • [[3rd Wave (information growth)]]: The printing press and moveable type
          • [[4th Wave (information growth)]]: Mass media
          • [[5th Wave (information growth)]]: Gurri proposes we are in this new wave right now – information technology
    • 2. HODER AND WAEL GHONIM (Location 291)
      • A twenty-something in Toronto opens a new continent of expression for Iranians (Location 305)
        • [[Hossein Derakhshan (Hoder)]], better known by his blog name “Hoder,” is an influential Iranian blogger that the regime saw as a real threat because he started a blogging revolution among Iranians. In 2010 he was sentenced to 19 1/2 years in prison for blogging, but was pardoned in 2014. Gurri uses his example throughout this chapter (Location 292)
        • “The Dictator’s Dilemma”: for security, dictators must restrict communications to a minimum, but they also need prosperity to make their rule legitimate, and this can only be attained by the open exchange of information. (Location 332) #Ankified
        • "Bloggers, and in general all dabblers in digital communication, are often accused of insulting sacred things: presidents, religion, property rights, even the prerogatives of a democratic majority. They speak when there should be silence, and utter what should never be said. They trample on the sanctities, in the judgment of the great hierarchical institutions which for a century and half have controlled, from the top down, authoritatively, the content of every public conversation. The idea is not that some forbidden opinion or other has been spoken. It is the speaking that is taboo. It’s the alien voice of the amateur, of the ordinary person, of the public, that is an abomination to the ears of established authority." (Location 394)
        • Some people in authority are despots and thugs, but what’s relevant is their belief they have a unique legitimacy to speak about their domain, and challenges to this are a threat to this moral order "which must be crushed utterly in the name of all that is good and true" (Location 399). Examples:
          • News media failing economically, but describing it as a danger to democracy rather than just threatening their livelihood.
          • Current desperate claims of [[public health professionals]] trying to silence outsiders with reasonable opinions, but claiming it’s because of "[[misinformation]]" and "dangerous" views.
      • A burning man on Facebook lights the way for political change in Tunisia (Location 440)
        • This section focuses on a Tunisian uprising, which started when [[Mohamed Bouazizi]], a street vendor, set himself on fire after humiliation by regime officials. (Location 465)
        • One insight from the event is the incredible redundancy in the transmission of information, which means authoritarians can’t really shut it down. You can shut down pieces, even the entire internet, but not what Gurri calls the [[information sphere]], which refers to the broader space that includes internet, social media, mass media, and more. (Location 491) It can’t really be blocked by government, and it’s usually what determines the outcome of a political conflict. #Ankified
      • A Google employee in Dubai schedules an Egyptian revolution as a Facebook Event (Location 494)
        • "If you were to ask me to name the most significant geopolitical transformations since the fall of the Soviet Union, the 2011 uprising in Egypt, which followed close on the heels of Tunisia’s and repeated the same pattern, would rank very near the top." (Location 501) #Ankified
        • [[Wael Ghonim]] is a central character in the uprising. He created a Facebook page, "We Are All [[Khaled Said]]", named for a young man beaten to death by thugs in the Mubarak regime. Images of Khaled’s face after the beating were used in the marketing campaign against the regime. Ghonim created a Facebook Event calling for protests. #Ankified
        • He attracted a huge audience, despite the fact that internet penetration in Egypt was around 20%. It turns out, this is enough to enter the consciousness of the public, and researchers like [[Roland Schatz]] have estimated the tipping point of awareness for something to diffuse and gain widespread attention is about 15%. (Location 560) #Ankified
    • 3. MY THESIS (Location 690)
      • A war of the worlds, deduced from the devil’s excrement (Location 707)
        • "My thesis is a simple one. We are caught between an old world which is decreasingly able to sustain us intellectually and spiritually, maybe even materially, and a new world that has not yet been born. Given the character of the forces of change, we may be stuck for decades in this ungainly posture. You who are young today may not live to see its resolution." (Location 708) The two sides in this conflict are: #Ankified
          • [[authority]]: Industrial, top-down, hierarchical institutions of authority that have dominated globally for a century and a half. Slow, plodding, inflexible.
          • [[the public]]: fluid, networked, flexible, fast, unsteady in purpose
        • Changes in ownership and availability of information are what have stoked this conflict. The [[5th Wave (information growth)]] has networked and connected the public through digital devices. #Ankified
      • The center cannot hold and the border has no clue what to do about it (Location 755)
        • You can also categorize the two warring groups of authority and the public as [[the Center]] and [[the Border]] (terms employed by Mary Douglas and Aaron Wildavsky in another context)
          • [[the Center]] expects and protects the status quo (Location 764)
          • [[the Border]] is composed of “sects” or “networks”: voluntary associations of equals. Their purpose is to oppose the Center, and they have no intention of actually ruling, governing, or developing policy, since this would imply rank or hierarchy and the Border is opposed to this. Opposition provides unity for the Border. (Location 765) #Ankified
            • Sect: a group of people with somewhat different religious beliefs (typically regarded as heretical) from those of a larger group to which they belong. #Ankified
        • "Viewed from within this scheme, the stories of the last chapter appear in a new light. Hoder, Wael Ghonim, and Shawn Fanning emerged as sectarian heroes of the digital Border, striking at the forces of monopoly and centralization. Ahmadinejad, Mubarak, and Jack Valenti each represented a mighty hierarchy of the traditional Center, slow-turning yet implacable, perfectly willing to smash the individual to preserve the system." (Location 775)
        • The Center is failing (e.g. 2008 financial crisis, intelligence in Iraq), and the fractured, sectarian public criticizes, mocks, and magnifies, leading to perpetual distrust and conflict, especially since the public cannot govern or solve issues. (Location 798)
        • It’s uncertain what will happen, and unlikely that any particular group will "win". Gurri’s greatest concern is for the future of liberal democracy – it is part of the battleground. (Location 832)
      • [[Cyber-utopians]], [[cyber-skeptics]], [[cyber-pessimists]], and how all their sound and fury signifies very little (Location 836)
        • [[the public]] wasn’t really possible until the printing press – before that it was more of an inchoate lump managed by elites in authority. Two conditions required for a public to exist: self-consciousness (irritation or dissatisfaction to pry it apart from the elites) and a means of communication for the public to voice its thoughts and opinions. (Location 837) #Ankified
        • [[cyber-utopians]]: See digital media as a boost to human collaboration and democracy.
        • [[cyber-pessimists]]: Find many ills in the internet—the corruption of our culture, governments spying on their citizens.
          • Gurri is skeptical about these claims: "As analysis, the exhortations of the pessimists hover somewhere between pointless and trivially true. Of course dictatorships wish to spy on dissidents, just as dissidents seek to avoid detection—a game made vastly more difficult for those in power by the proliferation of digital hiding-places. Of course dictatorships wish to manipulate media of all kinds to influence opinion. In the industrial age, however, they did so boldly and officially, from authority, while under the new dispensation despots must try to impersonate the public to have any hope of influencing it." (Location 891)
          • [[Malcolm Gladwell]] would fit into this category. He thinks “If you’re taking on a powerful and organized establishment you have to be a hierarchy.” i.e. you have to be trained professionals for political change. Gurri suggests this has been contradicted by [[5th Wave (information growth)]]. (Location 885)
      • [[homo informaticus]], or how choice can bring down governments (Location 921)
        • How can information influence political power? (Location 922)
        • The predecessor to [[homo informaticus]] is [[unmediated man]] – he lacked access to media, was likely illiterate, and probably didn’t have the ability to travel far. His only information channels were the people around him. #Ankified
          • "The single most important aspect of this information environment was that so very little was new. The range of interests was narrow, the set of sources small." (Location 945)
          • Authorities only needed to control the community to stay legitimate, and unmediated man is limited in thinking about alternative stories. Authority receives little feedback or dissent from them.
        • [[homo informaticus]]: information man – we are all him – "end products of an evolutionary process involving the spread of education, expanded levels of wealth and security, and improved means of communication." #Ankified
          • They’re informed, literate, and have access to a variety of media. They’re exposed to the larger world. They may access information that subverts the legitimacy of the elite. This is why authoritarian regimes must deploy costly and elaborate state media; but they can only do so much. As sources increase -> greater chance of dissonance with the regime’s story, and the first step towards potential revolution. (Location 932)
            • He can be more easily influenced by [[demonstration effects]]: Information influencing actions by revealing something previously unknown or believed impossible. (Location 995)
          • "the rise of Homo informaticus places governments on a razor’s edge, where any mistake, any untoward event, can draw a networked public into the streets, calling for blood. This is the situation today for authoritarian governments and liberal democracies alike." (Location 1050)
          • Homo informaticus sounds a lot like [[Tyler Cowen]]’s "infovore" in his book The Age of the Infovore.
    • 4. WHAT THE PUBLIC IS NOT (Location 1083) #[[the public]]
      • It’s hard to define the public, so he uses [[Nassim Nicholas Taleb]]’s “[[subtractive knowledge]]” method to characterize complex systems: rather than assert what the public is, explain what it is not. Think chipping away at the stone until a portrait emerges. (Location 1098) #Ankified
        • "The public is not the people, but likes to pretend that it is" (Location 1106)
          • "this is true in all circumstances, everywhere. Since, on any given question, the public is composed of those self-selected persons interested in the affair, it possesses no legitimate authority whatever, and lacks the structure to enforce any authority that might fall its way." (Location 1135)
        • "The public is not the masses, but was once buried alive under them" (Location 1174) #[[the masses]]
          • The industrial age led to [[the masses]] becoming organized into gigantic hierarchies for every domain of activity. This buried the public as it served to benefit the hierarchy. (Location 1194) #Ankified
          • "The eighteenth-century public was minute but highly active. The public in the industrial age was immense but bullied into a reactive posture. The masses absorbed the hundreds of millions of ordinary persons who entered history in the nineteenth century, and placed them under the command of structures which allowed few authentic decisions, few real choices of opinion and action." (Location 1243)
        • "The public is not the crowd, but the two are in a relationship (it’s complicated)" (Location 1282) #[[the crowd]] #Ankified
          • "Members of the public tend to be dispersed, and typically influence events from a distance only, by means of “soft” persuasion: by voicing and communicating an opinion." (Location 1287)
          • "A crowd, on the contrary, is always manifest, and capable of great physical destructiveness and ferocity. It is a form of action which submerges the desires of many individuals under a single rough-hewn will." (Location 1291)
          • The public can create a crowd, but a crowd can also create its own public. E.g. Pope John Paul II’s trip to communist Poland in 1979 – the crowds provided [[demonstration effects]] that created a public of anti-communist resistance. People may have joined the crowd for religious reasons, but it inspired anti-communism, which wasn’t the original intent of the crowd. (Location 1303)
    • 5. PHASE CHANGE 2011 (Location 1406)
      • Elites never trusted the public, but what has changed is the public increasingly distrusts authority and has more power to translate that into action. (Location 1413)
      • Theme for this chapter: "At some moment of 2011, the script went awry. Toxic levels of distrust sickened democratic politics. People began to mobilize for “real democracy,” and denied that their elected representatives represented them. They were citizens of liberal democracies, but they demanded something different. They wanted radical change: and the great mystery, casting a shadow beyond 2011, was what this change away from current democratic practices might look like." (Location 1426) Gurri provides evidence of this change by pointing to various events that occurred in 2011 that share some important characteristics. #Ankified
      • Complex systems (e.g. society, politics) tend to experience sudden, dramatic changes. They accumulate noise, and the forces holding them together diverge silently under the surface. Eventually the dam breaks (e.g. [[Soviet Union]]). (Location 1434)
      • The limits of outrage, or the sound of a silent scream (Location 1448)
        • What now? If the old elite institutions are despised and destroyed, what will replace them? The public never provides clear answers to this question. Solutions often involve rules and hierarchy, which the sectarian public opposes. (Location 1515) #alternatives
      • The sources of outrage viewed from below, viewed from above (Location 1554)
        • The level of outrage among the public is often way out of line with their standard of living, which is often high. Two complementary perspectives explain the outrage (Location 1556): #Ankified
          • From below / ground level: revolt explained by failing of ruling institutions.
          • From above / bird’s-eye view of history: revolt propelled by nihilism – self-destructive contempt for the world, a complete rejection of the institutions leading to a desire to burn it all down, so we can have a better world. The "better world" part is always vague and unexplained, because you can’t know the results of experimental or alternative histories. Nihilism is all about negation, and it tends to be self-defeating – if the nihilists have their way, they will destroy themselves. This is "a political pathology frequently encountered in the wake of the Fifth Wave". Gurri’s definition of [[nihilism]] – The will to destruction, including self-destruction, for its own sake, with a frivolous disregard for consequences. (Location 1668)
      • How a tent city in Tel Aviv became a circus of middle class discontent (Location 1685)
        • The focus of this section is the tent city protests in Israel in the summer of 2011. These protests began on Facebook with [[Daphni Leef]], who was university educated, young, and affluent – a common demographic in these 5th wave events (Location 1691). She posted a Facebook event to pitch tents in the city after she found she could not afford an apartment within [[Tel Aviv]]. This caught on significantly and received widespread public support, changing the political landscape in Israel. The protestors often had contradictory political fantasies, and they had the nihilistic attitude of destroying their own roots. (Location 1737) #Ankified
        • The Israeli protestors ultimately wanted the government to make things right, somehow. They had no plans to do this nor did they really understand what it meant. In their view, it was the government’s job to figure this out. (Location 1754) #alternatives
        • The protests ultimately led to some significant political concessions, but demonstrators didn’t see it that way – incremental changes were seen as obstructionism or a bribe. Anything positive or specific was a threat. (Location 1774) #Ankified
      • Occupy Wall Street and the baffling politics of negation (Location 1797)
        • Occupy Wall Street fits the typical 5th Wave revolt for 3 reasons: similar demographic and behavioral characteristics, a drive born of negation (no coherent demands, lots of accusation), and participants who lived virtually (Location 1806)
      • London in August, or the recurring question of [[nihilism]] (Location 1947)
        • This section revolves around the story of Mark Duggan, who was shot dead in London in 2011. This led to protests, which broke out into 4 days of violent riots and looting, with 5 deaths. (Location 1951) These were called "The BlackBerry Riots" by the Economist, due to the use of BlackBerry Messenger among participants. #Ankified
        • "Belief that political power could switch off the [[information sphere]] was shown to be more than an aging dictator’s hallucination. It was a persistent delusion of [[the Center]]." (Location 2026)
          • Gurri’s point here is supported by the expert reaction to the [[COVID-19]] pandemic. There are constant calls to fight and squash "misinformation". The problem is, what is misinformation and who gets to define it? Early on, experts opposed people using masks. They also denounced any suggestion that COVID escaped from a [[China]] lab, and as of [[May 6th, 2021]], this is one hypothesis that can’t be eliminated.
        • Gurri believes these protestors used arguments similar to those made in other [[5th Wave (information growth)]] protests, and took them to their logical conclusion – they turned violent and destructive, descending into [[nihilism]]. "The British rioters acted as if the government, the police, and the law lacked legitimacy." In other protests in 2011, the rebels waffled on this: "Most were the children of the comfortable middle class, too interested in the drama of the moment to accept the implications of their own rhetoric." (Location 2033) #Ankified
          • "They behaved as if desirable things were part of the natural order, like the grass under their feet. Detestable systems of authority only stood in the way." (Location 2046)
      • What Guy Fawkes’s mask can teach us about the turmoil in 2011 (Location 2053)
        • "Fascination with a [[revenge]] melodrama offered a hint about how the young transgressors of 2011 viewed themselves—and what they imagined they were doing." (Location 2057)
          • They were self-dramatizers: "The disconnection between their words and their actions, between their understanding of effects and their indifference to causes, can be explained by this trait." (Location 2088)
        • The ideal world of the protestors and their expectation of government was incredibly high in scope and also ill-defined. They believed government could work miracles. (Location 2102)
        • "That was the most profound consequence of 2011: sowing the seeds of distrust in the democratic process. You can condemn politicians only for so long before you must reject the legitimacy of the system that produced them. The protests of 2011 openly took that step, and a considerable segment of the electorate applauded." (Location 2139) #democracy
    • 6. A CRISIS OF AUTHORITY (Location 2249) #authority
      • [[authority]]: flows from legitimacy and monopoly. The public needs to heed and trust them to some extent or else they’re simply not an authority. Authority must rely on [[persuasion]], because [[force]] destroys [[trust]]. (Location 2255) #Ankified
      • The crisis of authority has resulted from the visible gap between expert competence claims and their actual performance. This gap was always there, but the public now is hyper-aware of it. (Location 2287)
      • If science is the modern deity, then the public is on the verge of deicide (Location 2305) #science
        • Science is facing this crisis of authority. E.g. [[peer review]] – presupposes reviewers are independent and can evaluate manageable data, and both of these are called into question. (Location 2356). Gurri gives the [[climategate]] event as an example.
        • Science was once a sectarian "[[the Border]]" practice, and since 1919 with Einstein, it’s become part of [[the Center]] with large bureaucracies. (Location 2367)
        • The lack of trust in science may be understated and ready to explode, since most people think of Einstein when they think of science and not the bureaucracy and politics that pervade science today. On specific issues you can see the public’s distrust. (Location 2458). Expectations about scientists and their prestige are inflated, and now they are exposed due to expectations not meeting reality.
          • A disturbing example of this is the Italian scientists who were sentenced to jail because they didn’t predict an earthquake. They were really convicted because "they had been unwilling to admit, in public, to the degree of uncertainty which science imposed on them" (Location 2511). [[authority]] claims certainty, and works hard to avoid being perceived as uncertain. #uncertainty
      • The panic of the experts, or how those who thought they knew didn’t (Location 2537)
        • The economic crisis of 2008 is a key moment in the public loss of trust in experts. [[Alan Greenspan]] is a key example of an expert that fell from grace. (Location 2545).
        • The left blames a lack of regulation, the right blames government involvement, and they’re both right. The bigger issue is that nobody saw it coming, and the gap between expectations of experts and their actual abilities was exposed. (Location 2666)
      • A corporate bum’s rush, or the economic ramifications of the [[5th Wave (information growth)]] (Location 2683) #Business #[[private sector]]
        • The business world seems to be not only surviving, but thriving, in the networked age. Why? (Location 2688)
        • [[markets]] are pure [[trial and error]] and this works to their advantage: "The trial part of trial and error entails mostly error, unless the set of trials is large and competitive enough to produce a possible success, and the system is smart and [[agile]] enough to recognize success and reward it." (Location 2764)
          • There is an incredible amount of [[churn]] in companies. Many fail. "average lifespan of a company on the S&P 500 has declined from 67 years in the 1920s to 15 years today." (Location 2745) This churn gives the public what it wants – companies that don’t give them what they want fail. (Location 2762)
        • In contrast, many institutions that have been less successful in this environment face a single-trial process or define success hierarchically from [[authority]] (they ignore [[the public]]). They explain away and double down on failure. This doesn’t work in the modern age where the public can question everything. (Location 2769)
        • It’s not that businesses are smarter; it’s that [[capitalism]] is well-equipped to adapt and failing companies are replaced. Individual companies are usually not great at change, because [[innovation]] fundamentally threatens the [[authority]] of powerful people and groups in corporations, despite lip service to a "culture of innovation" and "thinking like a startup".
      • Uncertainty, impermanence, and other symptoms of life without authority (Location 2807)
        • Symptoms of the crisis of [[authority]]
          • Excessive expectations among the public which are encouraged by the authorities
          • Elite loss of control over the story told about their performance
          • Alternative centers of authority
          • Impermanence – a lack of inevitability is bad for authority because if people doubt its permanence the authority is reduced (Location 2860)
        • "You would expect, in a time of uncertainty, a landscape crowded with frauds and con artists peddling positive formulas for happiness, love, sex, good health, and better government. You would expect, too, the most trivial assertions to be attended with much noise and thunder: absent authority, every message must be shouted to have a hope of being heard. Stridency will infect every mode of communication, but will be most disruptive of political rhetoric. Just to keep an audience, politicians and commentators will have to scream louder and take more aggressive positions than the competition." (Location 2855)
        • Impermanence may lead to increased [[religiosity]]. E.g. in the Middle East, Islamist groups prospered where secular Arab authoritarians wobbled. [[Christianity]] in [[China]] is growing. #religion #Ankified
          • "For the governing classes and articulate elites of the world, this turn to religion is both appalling and incomprehensible—but this is a denial of human nature. If the City of Man becomes a passing shadow, people will turn to the City of God." (Location 2889)
    • 7. THE FAILURE OF GOVERNMENT (Location 3024) #government #[[government failure]]
      • Why is government rhetoric completely out of line with what government can reasonably achieve? (Location 3044)
      • How JFK won by failing while Obama succeeded his way to defeat (Location 3058)
        • The Bay of Pigs invasion was a major failure for [[JFK]]. He was forgiven by the media, which is surprising from the modern perspective – he would have been pilloried today. Obama had his governing majority shattered in Congress in 2010 due to failed stimulus, and the public was much less forgiving. (Location 3059)
        • For [[government failure]], you need two things: something that happens that’s perceived as a failure, and a ruptured relationship between government and governed. (Location 3243)
      • How Brasilia and Cabrini Green became Dodd-Frank and the EU Constitution (Location 3281)
        • [[Brasilia]] – a new capital built from nothing in [[Brazil]]. Government set grand expectations, and it didn’t come close to meeting those expectations (Location 3307). #Ankified
        • It’s a great example of a [[high modernism]] project: grand government projects that aim to make the world anew. Authoritarian examples include the [[Great Leap Forward]]. High modernism efforts of today would include the Obama stimulus, which took 1,000 pages to describe and cost $800 billion. #Ankified
        • [[late modernist]] government sometimes attempts a [[high modernism]] level of ambition. Despite failing to meet their objectives, high modernist projects dazzled elites and the public: because elites could control the story, these projects weren’t failures but "epic activity, high drama, reaching for the stars." (Location 3349) That simply doesn’t work anymore with a fractured and connected public. #Ankified
        • "[[high modernist]] government was an austere prophet, demanding the destruction of the muddled present to make room for the perfect future. [[late modernist]] government is more like a kindly uncle, passing out chocolate chip cookies to his favorite nieces and nephews. He doesn’t wish to transform them. He just wants them to be happy—most particularly, with him." (Location 3362)
        • [[late modernist]] government tries to address every little injustice and be recognized for it, but it necessarily spreads itself too thin and becomes ineffective. It also presumes everything can be solved by government. (Location 3390) It inherits the [[high modernist]] claim of being able to achieve anything, and adds the claim that it can intervene anywhere to promote happiness. (Location 3396) This typically fails, killing the legitimacy of government.
      • Paul Ormerod and why most things fail (Location 3415)
        • "At some point around the turn of the new millennium, elites lost control of information, and power arrangements began to flip. Assured of the public’s wrath, elected governments have acted, or failed to act, motivated by a terror of consequences. Legitimacy was equated with the deflection of blame, and the aim of governing became to exhibit a lack of culpability." (Location 3420)
        • Politicians will eventually be tempted to engage in the type of [[negation]] the public uses. Obama engaged in this type of negation rhetoric. (Location 3530)
      • [[Barack Obama]] and the joys of [[negation]] (Location 3536)
        • "There is a [[democrat’s dilemma]] that is no less perilous than the dictator’s. Politicians must promise the impossible to get elected. Elected officials must avoid meaningful action at all costs." (Location 3541) #Ankified
          • Obama dealt with this balance by eventually avoiding meaningful action, while engaging in rhetoric that made him sound like a righteous outsider calling out the corrupt establishment. He embraced public negation and fanned the flames. (Location 3556)
          • "Barack Obama, I believe, represented a new and disconcerting development in democratic politics: the conquest of the Center by the Border, and the rise of the sectarian temper to the highest positions of power." (Location 3649)
          • "The accusatory style of government must be understood as a pathological development, a deformation, brought about by the underground struggle between the public and authority." (Location 3672)
    • 8. NIHILISM AND DEMOCRACY (Location 3778) #nihilism #democracy
      • [[nihilism]] is a logical conclusion of the forces hypothesized in the book – the system bleeds legitimacy, and there will inevitably be people who argue it should be put out of its misery. (Location 3793)
      • Portrait of the nihilist as the sum of our negations (Location 3976)
        • The nihilist sees "[[government failure]]" as lying and cheating. (Location 3984). The nihilist is loud and irreconcilable. He turns to violence in physical contact, and online he is eager to find ideas to "hack, expose, paralyze the institutions that run the world".
        • The nihilist is now connected digitally to coordinate with other nihilists just as destructive. (Location 3990)
        • The disturbing thing about the nihilist is not what he is, but where he comes from: he’s a child of [[privilege]], he is a beneficiary of the system he comes from, he’s not marginalized. (Location 4007) #Ankified
          • "He’s healthy, fit, long-lived, university-educated, articulate, fashionably attired, widely traveled, well-informed. He lives in his own place or at worst in his parents’ home, never in a cave. He probably has a good job and he certainly has money in his pocket. In sum, he’s the pampered poster boy of a system that labors desperately to make him happy, yet his feelings about his life, his country, democracy—the system—seethe with a virulent unhappiness." (Location 4012)
        • He expects perfection, and any deviation from his expectation triggers his urge to destroy. (Location 4042) The nihilist appears once "[[privilege]] is felt to be natural, a matter of birth rather than previous effort". (Location 4065)
        • "Every great institution is justified by a story." (Location 4090) "Such stories aren’t surface gloss. They influence our behavior directly. This is why paper sometimes beats scissors: soft words ignite powerful historical memories, and the public takes to the streets." (Location 4096) #stories
    • 9. CHOICES AND SYSTEMS (Location 4205)
      • This chapter tries to answer what to do – what can the public, government, you, and me do to deal with the turbulence? (Location 4207)
      • If structure is destiny then the personal will trump the political (Location 4238) #[[personal sphere]]
        • [[personal sphere]] – "This is the circle of everyday life, experienced directly, in all its local specificity. Here the choices meaningful to an individual get generated: spouse, children, friends, career, faith." (Location 4238) #Ankified
        • The problem we face is not with [[democracy]], it’s with outlandish government claims that constantly fail to meet expectations. (Location 4284) #expectations #[[under-promise over-deliver]] #Ankified
          • [[the public]] must also update expectations about what democratic government can deliver. (Location 4303)
        • The solution is not [[direct democracy]], but a return to the [[personal sphere]]. Returning [[choice]] to the personal sphere will allow issues to be addressed directly with local knowledge, with lots of [[trial and error]]. Then personal failure doesn’t take down the entire system. (Location 4294) #Ankified
      • Telescopic philanthropy, or the politics of the impossible (Location 4305) #[[personal sphere]]
        • We need less reliance on broad indicators like [[GDP]] to evaluate performance. "Numbers like the GDP fulfill a rhetorical function. They partake of the prestige of science, appearing superior to the confused jumble of reality as actually experienced. They sustain the high modernist claim that we can know at a glance the truth about vast systems." (Location 4342) #[[high modernism]] #quantification
          • "But we know that we don’t know. The number is an illusion. If I lose my job, I understand what this signifies in all the intimate details, because I have direct access to my [[personal sphere]]. If I am told that the unemployment rate went up from 5.1 to 5.6 percent over the last month, I have no idea what this signifies. I lack access to the reality behind the number." (Location 4344)
          • "In the end, the most persuasive story wins, not the highest score." (Location 4354) #stories
          • People then confuse personal and statistical, leading to more [[negation]]. "I may hold down an excellent job, but the failure of the stimulus to meet its targets infuriates me." (Location 4358)
        • [[Charles Dickens]] illustrates the problem of overemphasizing the public sphere over the private one with Mrs. Jellyby of Bleak House – a character who spent much of her time working to improve the lives of others while completely neglecting her own children. "[[telescopic philanthropy]]—the trampling of the personal sphere for the sake of a heroic illusion." (Location 4364) #Ankified
          • "A telescopic philanthropist, from the moral heights, would call this selfishness or escapism. Yet selfishness, it seems to me, would entail the demand that the government meet all my needs. Escapism would mean burying my personal responsibilities under a concern for the brotherhood of man." (Location 4384)
        • Control and satisfaction can only be found in the [[personal sphere]]. (Location 4382)
        • You can engage in politics and government, form opinions and act, among other things. What you should not do is demand certainty of complexity or expect statistics will ordain the future. (Location 4390)
      • Advice to the prince, or the art of government in societies of distrust (Location 4393) #government #[[public administration]]
        • Government can try to will the world back to before the internet, but this is doomed to fail. The alternative is for government to retain some control by moving information online in a way that the public can interact with it. As much as possible, create "[[open government]]". (Location 4452) #Ankified
          • Shorter, readable laws. (Location 4459)
          • Work on drafts out in the open, online
          • Reduce pseudo-technical jargon
        • This could create incentives for persuasiveness in place of current incentives for opaqueness. They could get a more productive feedback loop from the public. It could also demystify government and set public expectations properly. (Location 4465)
    • 10. FINALE FOR SKEPTICS (Location 4536)
      • If my story has been fiction, the null hypothesis must be true (Location 4575) #predictions
        • This section discusses what we would expect to see if the thesis is true, versus what we would see if the null hypothesis is true.
        • If the thesis is true:
          • "Additional higher-level effects include a progressive loss of inhibition by the public in its attacks on authority, the rise of anti-establishment political groups, and the possibility, lurking in the shadows, of the nihilist and his fever dream of annihilation." (Location 4613)
          • Government will make it a priority to defend itself against the public. (Location 4618)
          • "In democracies, elected officials will be tempted to gain favor by distancing themselves from the democratic process." (Location 4621)
        • If the null hypothesis is true:
          • "A political environment safely entrenched within the processes of the industrial age. Government actions and policies are sheathed with authority and persuasiveness, while government failures implicate specific politicians or parties but never the system as a whole. You should expect, under such conditions, for political life to be characterized by continuity rather than disruption. Protests occur, but they target specific rather than systemic issues. Public opinion will be more forgiving—even, on occasion, as gentle as it was with JFK over the Bay of Pigs." (Location 4636)
          • Opposition is loyal, shares assumptions of people in power and sits within the political system. (Location 4641)
          • Information belongs to institutions and remains under their influence. (Location 4645)
      • The future’s uncertain but the present is always here (Location 4663)
        • "Books that interpret events sooner or later will be falsified by events: you just hope it’s later." (Location 4672) #predictions
      • The old democracies and the new structure of information (Location 4797)
        • History can be driven by [[negation]], not just contradiction. This means that, even though there is no better alternative to liberal democracy, it could still be in trouble due to forces of negation pushing along despite there being no clear alternative.
    • 11. (Addition to 2018 Edition) TRUMP, BREXIT, AND FAREWELL TO ALL THAT (Location 4958)
      • The Revolt of the Public was first published in June 2014. This chapter is an extension written in 2018 providing some reconsiderations. (Location 4960)
      • His main reconsideration is that "The great unraveling of the institutions has proceeded faster, further, and deeper than I imagined possible in 2014." (Location 5012)
      • Eternal surprise of the elites, or the world turned upside-down (Location 5026)
        • "From start to finish, the 2016 presidential race can best be understood as the political assertion of an unhappy and highly mobilized public. In the end, Trump was chosen precisely because of, not despite, his apparent shortcomings. He is the visible effect, not the cause, of the public’s surly and mutinous mood." (Location 5072)
        • Trump was lucky in his moment. His strategy wouldn’t have worked in 1980, 1990, or 2000.
      • The Russians are coming, the Nazis are here, and everywhere you look there’s Donald Trump (Location 5265)
        • "This is how the global elite class and many others interpret what I have called the revolt of the public: as the death of democracy and a descent into authoritarian darkness." (Location 5276)
        • The idea that we should control social media to tame the revolt of the public is misguided. People often praise [[China]] for some of their capabilities and even their censorship. "The regime in China survives on economic prosperity, which demands the free flow of information. But sooner or later, the economy will begin to wobble—should that information be allowed to flow?"
          • "China’s elites are riding a tiger and know it. Whatever the future brings to this antiquated power structure, it is no more likely than North Korea or Cuba to provide the escape route from liberal democracy in the twenty-first century." (Location 5316)
        • [[Vladimir Putin]] is significantly overrated in his capabilities and success as an authoritarian (Location 5327)
        • "I don’t see authoritarian rulers prospering under current conditions." (Location 5380) #[[To Ankify]]
          • Democracies are drifting toward dysfunction and paralysis, not authoritarianism. (Location 5453)
        • Politician’s dilemma: if they get into office as an anti-establishment person, they can continue to criticize institutions but they’ll damage the economy and their popularity; if they compromise with the elites, they lose credibility with their base. Trump’s rhetoric can be seen as a way to escape this – loud and vulgar negation rhetoric that is often unrelated to policy. (Location 5530) #Ankified
        • "For all the sound and fury about [[fake news]], not a shred of evidence exists that they influenced the election outcome." (Location 5617)
      • The fate of the industrial elites and the uncertain future of liberal democracy (Location 5881)
        • "The recovery of truth requires the restoration of trusted authority. At the moment, that is nowhere in sight. The question before us is whether the current elite class can ever resume that function." (Location 5903)
        • The defeat of ISIS demonstrates the ability of bipartisan elites working together with confidence to defeat the nihilists. (Location 5906)
        • Elites must bridge the gap with the public, and to do this they need a positive vision that counteracts the negation of the public and includes them in the vision. (Location 5912) Unfortunately, democratic elites have so far shown no interest in trying to reach the public. They go in the opposite direction, trying to restore distance and silence the public rather than persuade it.
        • [[devolution]] may be part of the answer: "the federal government is now an agent of division and polarization, state and local government, as well as certain private entities, can become rallying points of community. The negation of the nation-state must mean either anarchy or devolution to the city-state." (Location 5978)
        • What we really need is the emergence of a legitimate elite class. (Location 6086)
      • How is a legitimate hierarchy formed? (Location 6091) #hierarchy
        • Hierarchy forms naturally in human interaction in day-to-day life. Exemplary people are bestowed it through their example, and healthy societies assign authority this way. What is this quality that helps them rise? It "isn’t power, or wealth, or education, or even persuasiveness. It’s integrity in life and work." #Ankified
        • "Modern government’s original sin is pride. It was erected on a boast—that it can solve any “problem,” even to fixing the human condition—and it endures on a sickly diet of utopian expectations. We now know better." (Location 6143) #Ankified
        • Two qualities to look for in elites: [[honesty]] and [[humility]]. (Location 6147) Another key virtue: [[courage]]. This is required because "truth must be spoken even when it hurts the speaker or the audience. Distance must be reduced to a minimum, even at the risk of physical danger." (Location 6153) #Ankified
