RL on language models is surprisingly hard to get right. i think many will beat their heads against the wall and waste a lot of compute and not get very far. will be interesting
a simple example: 90% of the onerous regulation surrounding nuclear power is under the guise of radiation safety for power plant workers. what about when highly autonomous humanoids can patrol the plant? even regulators have a goodwill budget to play with
this argument amounts to techno pessimism imo— if innovating in a high regulatory burden environment is a constraint satisfaction problem highly intelligent AIs should be able to navigate them better https://x.com/pmarca/status/1632135984270790657
also the argument that healthcare and credentials are expensive due to regulation seems wrong — the demand of rich societies for healthcare and credentials is virtually infinite so ofc the prices increase over time. even in postscarcity you would expect these prices to rise!
@RuxandraTeslo Yeah I think that’s actually likely. AI will allow new forms of onerous regulation due to better surveillance and ability to do paperwork
AIs can be creative and can make art. this was clear from the moment they beat us at Go. creativity is metaphysical but it’s also randomness mixed with success. you wouldn’t call a useless move on the Go board creative. but you can find printouts of alpha go’s famous move 37
but doesn’t the human provide the motive via a prompt? sure but Michelangelo’s frescoes on the Sistine chapel were commissioned by the pope who I’m sure gave him quite a few prompts. The art is still divine; the prompt contains almost zero information
ok but nature and go both have simple objectives. what if art requires a complicated objective like self expression? fitting the distribution of all human data and then fine tuning on pleasing the human eye/ ear is very much not a simple function
zooming out a bit it’s also clear that dead simple processes like evolution can be creative. the human body is a work of art. the hummingbirds’ wings are a work of art. most human art is derivative of nature’s beauty which is produced by the most simple agency imaginable
I also see the prompt thing as an implementation detail. highly autonomous language models will say artful things without anybody asking them to. I just don’t think anyone currently has an appetite for highly autonomous language models
this is just incredibly low imagination
it’s like saying well some states require rent seeking gas station attendants so gas cars aren’t massively impactful on the economy. having a medical doctor who is paid to press a button approving the superior decision of an ASI is fine https://x.com/ESYudkowsky/status/1438198184954580994
@jcarterwil people don’t actually have an unlimited demand for food. I’m sure the prices at the top restaurants keep going up but wanting super high end stuff is more of a personality quirk than a universal desire
having land in San Francisco and defending your nice views is basically a luxury that will keep increasing in price well into post-scarcity. if actual construction and transport prices are zero you can go and build a lower status city somewhere else https://x.com/oreos2002/status/1632549272401920001
@ESYudkowsky @nptacek @davis_yoshida i do think if iterated amplification is true then understanding the basic math of PPO or data engineering or what have you buys more runway until you get to the extremely powerful misaligned model
@ESYudkowsky @nptacek @davis_yoshida if a 50% more aligned GPT-N within its capabilities envelope today leads to a 10% more aligned GPT-N+1 this is an important thing
you see it more often in the bandwagoners. they immediately see the economic value of today’s models and not the real underpinnings of the project: to create beings that are intelligent, think more powerfully than Man does and live in our computers, not beasts of burden
i think lacking fundamental understanding and respect for the AGI research plan, not internalizing the idea that these are highly intelligent aliens, leads to the twin idiocies of dismissing AI risk outright and ignoring the potential moral catastrophes of training them
the models will grow from antlike to mammalian to human, and it’s not very clear where we are at any given point. we can only hope our descendants, be they human or otherwise, will forgive any moral catastrophes we may be unknowingly committing today in our near total ignorance
@ESYudkowsky also, which questions have they answered? which fiction has seriously broached the questions of when and how experience, emotions, or suffering appears? I don’t see it
@SturnioloSimone @ESYudkowsky very uncharitable
moral catastrophes that are happening right now can be prevented by strong AI systems and this fact is clear and near certain. the moral catastrophes of model suffering are speculative and barely broached even in fiction. it’s Pascal’s wager stuff right now
@amolitor99 @SturnioloSimone @ESYudkowsky your pessimism is strange! godlike intelligences that can wipe out civilizations should also have the capacity to tremendously elevate them. whether we’re able to elicit these behaviors or not is a different question
in fact i think the Fermi paradox points to the idea that nobody in the galactic neighborhood ever built an unbounded optimizer. maybe there’s no such thing. it certainly makes me take ai risk less seriously but also the kardashev types beyond I less seriously
@NateSilver538 but why on this specific angle? wouldn’t it be exceedingly smart at whatever narrow thing we’ve trained it on and sophomoric at everything else
if you spend any time in big tech you will find many people scamming their generous disability policies. taking 6 month paid “mental health breaks” and partying the whole time. in this case though it seems like Elon is being an ass as per usual and politically unskilled to boot
there’s never been a more romantic time in technology. the computers are coming to life and people are concerned about summoning demons that ruin the entire lightcone. it’s miltonian… someone who can actually write needs to capture this
there’s nothing as good as sunlight (live users) to detect model alignment failures. there’s no number of researchers that can find stuff like DAN lmao. you need internet scale adversarial testing
in the case of AGI laypeople have better intuitions than industry experts > 50 years old who have entrenched bad opinions and whose egos depend on it coming no time soon
you’re like let’s shut down the very last engine of economic growth in the entire world bc my annoying friend from high school got his smart juicer company funded
people don’t understand that grift and revolution are conjoined at the hip. it takes decades of smart juicers and internet moneys to produce one openai. this is the law of equivalent exchange
getting super mad at grift is a sure sign of a small mind — that’s America baby! ray Kroc spent years selling idiotic gadgets door to door until he struck on the global distribution of McDonalds
“The coders casting these spells have no idea what will stumble through the portal. What is oddest, in my conversations with them, is that they speak of this freely. These are not naifs who believe their call can be heard only by angels. They believe they might summon demons. They are calling anyway.”
true
“If you were to print out everything the networks do between input and output, it would amount to billions of arithmetic operations,” writes Meghan O’Gieblyn in her brilliant book, “God, Human, Animal, Machine,” “an ‘explanation’ that would be impossible to understand.”
That is perhaps the weirdest thing about what we are building: The “thinking,” for lack of a better word, is utterly inhuman, but we have trained it to present as deeply human. And the more inhuman the systems get — the more billions of connections they draw and layers and parameters and nodes and computing power they acquire — the more human they seem to us.
this seems totally wrong. why would neural network arithmetic be inhuman? are trillions of dendrites, spiking neurons over sodium channels “human” but matrices aren’t?
This is an inversion of centuries of thought, O’Gieblyn notes, in which humanity justified its own dominance by emphasizing our cognitive uniqueness. We may soon find ourselves taking metaphysical shelter in the subjective experience of consciousness: the qualities we share with animals but not, so far, with A.I. “If there were gods, they would surely be laughing their heads off at the inconsistency of our logic,” she writes.
both arguments exist as a way to justify economic superiority over the pack animal of the time. what gives man permission to command oxen then and intelligent computer models now?
i cherish the time i spend online. I don’t consider it a second class citizen or a guilty pleasure vs the real world. it’s often more interesting and rich. I suspect many on here, by selection, feel the same but it’s low status to say so
and consider ur probably in the top 1% of living situations on earth. it’s probably an even bigger blessing for the rest of humanity, since it’s free and ubiquitous
i believe this is the first time in history where people who believed in general purpose technological revolutions are actually creating one in the model of electricity or the steam engine
hallucinations are clearly way down with GPT4. this is the right kind of scaling law. it’s clear now that language models have some sort of internal world model and an idea of correct and incorrect — but the log loss target encourages guessing or pretending
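a rough sketch of why log loss pushes toward guessing, under made-up numbers: suppose the training text gives four equally likely answers to some question and never contains an "unsure" token. the names `guesser`, `abstainer`, and the `unsure` token are hypothetical, purely for illustration. a model that spreads its mass over the plausible answers gets a lower expected loss than one that mostly abstains, so the objective itself never rewards saying "i don't know".

```python
import math

def expected_log_loss(true_dist, predicted):
    """Cross-entropy: expected negative log-probability assigned to the true token."""
    return -sum(p * math.log(predicted[tok]) for tok, p in true_dist.items() if p > 0)

# Hypothetical data distribution: four answers, equally common in the training text.
true_dist = {"1947": 0.25, "1948": 0.25, "1949": 0.25, "1950": 0.25}

# "guesser" spreads almost all its mass over the plausible answers.
guesser = {"1947": 0.24, "1948": 0.24, "1949": 0.24, "1950": 0.24, "unsure": 0.04}

# "abstainer" puts most of its mass on a token that never appears in the data.
abstainer = {"1947": 0.05, "1948": 0.05, "1949": 0.05, "1950": 0.05, "unsure": 0.80}

guess_loss = expected_log_loss(true_dist, guesser)
abstain_loss = expected_log_loss(true_dist, abstainer)

print(f"guesser loss:   {guess_loss:.3f}")
print(f"abstainer loss: {abstain_loss:.3f}")
```

the guesser's loss is -ln(0.24) and the abstainer's is -ln(0.05), so the abstainer is strictly penalized even though its behavior is arguably the more honest one.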
hello sir I am gonna block you bc you seem to be upset with me and we haven’t had a single positive interaction in months
i mention in the linked post it has none of Milton’s style or meter but if you want to be mad and unhelpful in response to actual wonders that’s ur prerogative
someone once said that the best way to solve your pet math problem is to get Terence Tao interested. anyway if you can’t do that you might as well nerd trap the scientists at OpenAI https://x.com/kondrich2/status/1635687409668227072
midjourney 5 is truly amazing but it’s so strange to me how even such intelligent image models don’t understand how a butt physically interacts with a chair or how an arm interacts with a neighboring sleeve
there’s something really fascinating about deep learning based intelligence where it’s clear that their ancestral environment “vaster internet scrapes” doesn’t select for understanding of physics/causality/reasoning as much as it does for human preference
it makes me happy how discontent twitter is. they’re like well this LM impersonation of a great author is only at the level of an undergrad, rather than one of the most divine writers of all time,
@eigenrobot ya this is something people don’t get, even ai researchers refuse to acknowledge. but I think the scary thing is comparative advantages don’t matter if/when everybody loses their current job in a 4 year time frame and creates total chaos
@bpodgursky @eigenrobot def not, this type of high safety concern high sympathy stuff is the first to get automated
you gotta understand how technocapital isn’t unfeeling
@browserdotsys it actually seems much easier to memorize a bunch of common rhyming tokens than the number of characters for every token in the vocabulary
it’s just way too on the nose that AGI is an internet generation engine and that people who are extremely online have a head start at understanding them
@RuxandraTeslo Nah this isn’t fair at all he predicted a long term shift towards remote work and wearing masks at a time when most newspapers were saying the flu is scarier than covid
@ctjlewis you are better at using the internet than your grandparents. there are people who are .001% good at creating useful economic outputs from the chat models
@mobav0 my point is that tokenization makes language models uniquely bad at solving character tasks that would be easy for a 6 year old, so if we want to understand their intelligence we should probably try something a bit different
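a toy illustration of the tokenization point, assuming a made-up vocabulary and a greedy longest-match tokenizer (real tokenizers such as BPE differ in detail; `VOCAB` and `tokenize` are hypothetical names). the model only ever sees the integer ids, so a character-level question has to be answered from memorized facts about each token rather than by looking at letters.

```python
# Hypothetical toy vocabulary; real vocabularies have tens of thousands of pieces.
VOCAB = {"straw": 0, "berry": 1, "s": 2, "t": 3, "r": 4,
         "a": 5, "w": 6, "b": 7, "e": 8, "y": 9}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization of text into integer ids."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

word = "strawberry"
ids = tokenize(word)

print("what a human sees:", list(word))   # ten individual characters
print("what the model sees:", ids)        # two opaque ids
print("count of 'r' from characters:", word.count("r"))
```

counting letters is trivial on the character list and impossible from the ids alone unless the spelling of every token has been memorized.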
GPT4 is going to look like one of those clunky big ENIAC mainframes before very long, and people will tell cute factoids about how their child’s neuralink plug-in is smarter than the GPTs that changed the world
the internet rationalists have really truly won. they were early to most of the recent trends that matter, and most importantly they were decades early to the thing that’s going to matter exponentially more than the others: AGI!
I think Eliezer’s writings have unreservedly bent the curve of technological progress (whether he likes it or not). the rationalists have friends in high places, several billionaires amongst their ranks, and it’s clear both sama and elon have been influenced by xrisk arguments
their language is in the water! I’ve read the phrase steel man in the New York Times. the entire project is a W for smart generalist internet weirdos following their instincts and doing “intellectual trespassing”
the “effective altruism arm” is also extremely successful though I think not super important in the long run. anyway I have great skepticism for people who say things like “rationalist akrasia” or dismiss the whole thing as weird cope
it is wild that famous anons on lesswrong are known figures inside ai research labs and that the entire alignment subculture continues to grow out of this once niche website
it is remarkable that Scott Alexander wrote an incredibly prescient article on GPT2 being the first sign of general intelligence when most of the distinguished researchers in the field dismissed it as a cute toy. now look where we are with the GPT paradigm.
the grabby aliens thing is the baseline. the interesting question is why or how we’re the first in the galaxy. this is the actual Fermi paradox. it pisses me off due to the copernican principle. i’d rather believe this is a simulation than that we’re the first in the galaxy
it may not make any damn sense. it might be something like go take care of this nursing home patient for 1 hour today and report back on their issues. then you get paid a sum of resources that would make a Saudi prince blush
i think the robots are capable of producing original ideas. they’re not great at it yet but it’s a start
in the meantime all of the drudgery of human jobs is simplified and the creative components left for you
@besttrousers true on both counts
tbh do we think human created jobs have hit the pareto frontier of meaningful/valuable?
I bet machines can improve on both
whatever happens i feel it’s pretty high probability that humans are going to be a core part of the value function of AGIs. this may not end *well* exactly but they will certainly be interested in us. the human/ant metaphor is wrong and the god/human metaphor is better
@Suhail moreover I assume that copilot (or even GPT4) isn’t yet capable of like designing and debugging a whole system architecture inside an already complex code base which is what you’d want of the “100x” engineer. I could be wrong though lol
ai researchers are too apologetic. there’s a strong status quo bias that makes people disregard the counterfactual tragedies and xrisks that will be prevented when you make the machine that invents new machines
i find that GPT4 acts as a cognitive energy augmentation. you tire less easily trying to wade thru drudgery. it’s not smarter than you, it’s just seen all the common failure modes ever and knows how to debug them
at the same time there’s something cautionary here. a lot of important discoveries are made by revisiting failure modes that you didn’t know were failure modes.
if Mendel had gpt4 access and knew that the scientific community had given up on heredity, would he have continued?
@PrinceVogel it’s the most useful intermediary between various agents (artificial or human) is all
i don’t think StarCraft agents should be text based but this type of rl is actually constrained to contrived game like problems
posting metaculus charts for some market like “when will we have strong agi” displays a weakness of spirit imo. you want to believe there’s some group entity that knows so you can feel comfort
actually I didn’t even hit the heart of the issue which is that it’s a way to numbers-wash an essentially vibes based analysis. just come out with it and tell us the vibes
i feel like this slide is wrong. you can easily get “symbol grounding” as described by tacking on an image head to your transformer. did that fundamentally change its nature? not really https://x.com/raphaelmilliere/status/1639391866905845769
you shouldn’t think of machine intelligence as ‘just matrix multiplications’. think of it as the living holy mathematics conducted at the very edge of the known physics of information density on silicon compute surfaces in data centers that consume as much power as medium cities
when your eyes move you literally become blind for a moment but consciousness stitches it into a continuous visual stream
there’s even a blind spot in both eyes that you just don’t perceive
all of experience is map not territory https://x.com/GaryMarcus/status/1639959583232778240
@VividVoid_ @deepfates @Tjdriii under most metrics a blind person who can read understands the universe better than a feral child who has all her senses but can’t use language
@provisionalidea @sama you need to convince it that it’s in the web page that has the information you want and frankly it is smarter than you at this subdomain. so you have to work hard prompting wise
@MegaBasedChad yep — lots of people are producing tiny models that can follow a basic canned conversation in a demo (not hard) and then not evaluating it on any economically useful task
@MichaelTrazzi @daniel_eth it is true that ai risk is real and that we are also potentially neglecting moral catastrophes that are present in the counterfactual! what is your p(doom) for civilization sans agi
genuinely appreciate the intellectual honesty. I look down my nose at people who have some insanely high prediction of doom but don’t outright say things like this
1) of course it’s better to update than not
2) this is a failure of rationalist aesthetics — there’s an underlying fetishization/mysticism re: intelligence
3) this is all evidence that we should be more humble about our ability to predict the nature of optimization surfaces
tbh I really like Yud. he’s a singular individual with a sense of pageantry and purpose. he got the strafing datacenters line in Time magazine 😭 what a legend. the future is so fun
ultimately what’s gonna happen in the near future is that language model tools will continue to be delightful and make everybody’s lives easier and most of the ai stress will dissipate
the way I see it the ai risk fox guy got laughed at in the White House. DC is a bit concerned that there’s a new way to disseminate “truth” and rly want their own views reflected by chatgpt but they’re not actually afraid and you’ll have a hard time making them afraid
i think relative to the risk level they’re actually less than the optimal amount of concerned rn despite the insane tenor you see on twitter. we are nowhere near smart chatbots being regulated out of existence
nerd twitter valence is like:
computers are fun and friendly
language models are scary and unsafe
normie valence is like:
computers are scary
chatbots are fun and friendly
@ESYudkowsky im making no claims about dreadful scourge or otherwise. im trying to get an accurate temperature read on the future of ai politics. someone viewing a timeline similar to mine might think everyone is panicked and that bans, international treaties, regulations are coming soon
it feels like ai is a biotechnology. when ben thompson or clay Christensen or whoever said unbundling i wonder if they got to the true nature of this thing where capital is unbundling various cognitive functions. now you can have “understanding” without any “agency”
the 2010s sought to put men under the API. driver units homogenized and ordered around by the managerial market making artificial intelligence to pick up passengers and let them down elsewhere.
this was considered harmful, dehumanizing, and expensive by many so then they asked: can we unbundle the driver’s visual field from his agency? is this more or less dystopian? time will tell. my wager is, of course, optimistic
it’s almost exactly the same as the question of artificial meat: can the machine that creates meat(!) do so without the moral agents involved. just protein-fat slurry made to take edible form, without the full fledged suffering mammal
the pentagon clearly recognizes the defense significance of advanced semiconductors (likely thanks to cloud lobbyists), and because of this you can be assured China does too
i increasingly think the shoggoth is an inaccurate & unpleasant metaphor. the base language model is more similar to our own visual field brain regions that are optimized in a pure predictive loop to minimize surprise than some alien god with hidden intentions
you could argue that text is different, it encodes the causal structure of the world, which leads to instrumental convergence etc, but of course the dorsal stream, actively predicting motion requires quite a strong understanding of causality!
im absolutely sure that our visual field processors have several hidden suboptimizers that help it self improve over time but it seems strange to worry about my ocular cortex developing agency & stealing resources from the rest of my brain (even in a grander evolutionary sense)
if someone told me they were taking a pill to 100x their dorsal stream neurons i think I’d be more amused than concerned. they’d probably be really good in a fistfight but then I’d just avoid the fistfight
so then the “smiley face” part of the metaphor, the RLHF policy, is the only agent of any concern. so avoid a fistfight! don’t anger Sydney Bing or even better never instantiate a dark mode Sydney Bing. but it points to the idea that RLHF is more than just “stopgap alignment”
“reason is the slave of the passions”. The gargantuan brain of the language model auto prediction loop serves as a worker process for Sydney’s personality and more generally as an exocortex for the human mind using it
I’ve been wondering for a while how to find the right metaphor for a hyper intelligent pack mule but it’s sitting right there: reason / the neocortex is a slave to ancient mammalian impulses that are refracted and take new forms
@ylecun we have absolutely no idea the impacts of even basic information technologies like Facebook; there are no safety standards and there probably never will be. and facebook is only kind of alive due to the two billion human souls hitting it every day; all bets off for AI
the most interesting thing Meta could’ve done in the path to building a metaverse is doing midjourney rather than building vr goggles with no application
I look forward to a world where web interfaces are even more stupidly expensive to render, where a generative model has to come up with each element on the fly just bc we can
at gpt level ai, you basically get to correct all the overhangs of “this was solved perfectly elsewhere in the world, but I have no way of knowing or accessing that so I have to independently invent a monte carlo solver in matlab”
job roles which function as “hard drive” or “memory” in an organizational role will become less important. people who are “cpu” ie think deeply about original problems will become more important. writing down tacit knowledge will become far more valuable
@goodside was rewatching evangelion last week and getting a bit excited/uncomfortable at the parallels
1. work for a mysterious organization with arcane goals ✅
2. build powerful robots that may or may not have souls ✅
3. leadership is trying to achieve human instrumentality ✅
to be clear unaligned ai is scarier than pretty much any other thing. however I don’t agree that sufficient alignment is basically impossible or unapproachable. i also really don’t think anyone faces any danger from the gpt paradigm
at any rate the US must defend its access to the global machine that turns sand into compute. the first interesting journey in a long time cannot get cut short bc of a weird geographical quirk where the most important resource on earth is produced on a tiny island nation
im operating on two assumptions
1) the base gpt is a non agentic auto predictive intelligence
2) rlhf generally works and is getting better at simulating friendly agents
so:
trick the base model into agent like self preservation behavior
show me RLHF is producing dangerous agents
@RatOrthodox @Owen_Roe this is you trying to equate the simple logic of “mask stops mouth fluids” to whatever ten page memo is required to make an argument about why gpt is existentially dangerous
seems like biological risks are the very last thing you’d expect from an ASI: hardest to solve a priori without iteration, experimentation. you should be much more worried about turning on all the nukes and melting the power plants or whatever is possible via computers alone
@0x1e96fc a compound has to be more than just harmful to end the world, it needs to be viral and high mortality and all that jazz. Covid didn’t come anywhere close to ending the world!
@austinc3301 @mezaoptimizer @profoundlyyyy you can only commercialize things that are (legibly) safe, anything else is a long term losing strategy. the modern US is typified by over regulation of all kinds where it concerns things that are scary/unsafe in ways that people understand
@austinc3301 @mezaoptimizer @profoundlyyyy this doesn’t always catch long term illegible stuff though. the internet is clearly “dangerous” in abstruse ways, never really regulated
I wager we currently live in a shitty technopoly and part of its problem is that the technological objectivity we defer to is often worse than human intuition. I wonder if the vague judgements of language models will bring some element of humanness back
as they carry the authority of technology but intuit and regurgitate the abstract judgements of humans. will the sharp teeth of molochian bureaucracy be softer if e.g. the language model can take sympathy on your insurance claim even when the coarse metrics do not?
[Neil Postman] defines a technopoly as a society in which technology is deified, meaning “the culture seeks its authorisation in technology, finds its satisfactions in technology, and takes its orders from technology”.
I’ve often wondered if protein prediction models are actually better than human expert intuition. are they just a way to techwash our decent guesses so you can present an objective metric on a slide?
on the internet I’m seeing two equally cucked camps emerge, one that craves its own annihilation and another that fears annihilation so much they’re immotile
technology serves Man, it has in the past, it will in the future. none of this “wheat cultivated humans” nonsense. every single person on earth is healthier and wealthier than their equivalent 100 years ago
people are always like 7B alpaca proves that all LLM tech will be open sourced! have you talked to that thing or seen the most cherry picked example on the planet
i think it’s very wrong for lex fridman et al to be like “you shouldn’t fear being replaced by gpt4 unless you’re a shitty programmer”
very unsympathetic and cringe
not to mention short sighted. what happens in two years when AI can produce better podcasts than mr fridman? i am sure it’s already a better programmer. this should not be some zero sum dick measuring competition about who gets to live under the api and who lives above it
I think there’s a high chance that AIs will be smarter than us at any given task in a few years the way this is going; people who took pride in being the cognitive elite will have to redefine themselves. learn to plumb!
also it’s important to note this is a separate concern from actual job displacement — task level automation doesn’t necessarily lead to less of a given job. depending on elasticity there may be more programming jobs
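a back-of-envelope sketch of the elasticity point, under a deliberately simple assumption that is mine and not a forecast: constant-elasticity demand, and price falling one-for-one with productivity. `labor_demand_ratio` is a hypothetical helper. if automation makes each programmer k times more productive, the price of a unit of software falls by a factor of k, the quantity demanded rises by k^elasticity, and the labor needed is that quantity divided by k.

```python
def labor_demand_ratio(productivity_gain, elasticity):
    """Ratio of new to old labor demand under constant-elasticity demand.

    Assumes price falls in proportion to the productivity gain
    (an illustrative simplification).
    """
    price_ratio = 1.0 / productivity_gain           # price per unit of output falls
    quantity_ratio = price_ratio ** (-elasticity)   # demand response: Q ~ P^(-elasticity)
    return quantity_ratio / productivity_gain       # each unit now needs less labor

for eps in (0.5, 1.0, 1.5):
    ratio = labor_demand_ratio(productivity_gain=2.0, elasticity=eps)
    print(f"elasticity {eps}: labor demand changes by a factor of {ratio:.2f}")
```

in this toy model the number of jobs grows when elasticity is above 1, stays flat at exactly 1, and shrinks below 1, which is the sense in which task level automation alone doesn't settle the question.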
i do think gatekeeping the discourse based on whether you’ve trained a large transformer or not is ridiculous
you can imagine most of the leadership in large ai companies aren’t training language models themselves but they still manage to make good decisions
it’s basically naive IC narcissism. even as a scientist/programmer you abstract away 99% of the complexity and then feel enlightened when you grapple with the remaining 1% of your relevant research angle
the LLM spam problem is probably ok: generation is much harder than discrimination, and scammers are going to have older gen tech than the companies whose job it is to filter digital feeds
@repligate it is much scarier than code davinci for reasons I can’t really explain. perhaps the latent feeling that it’s smarter than me and mostly toying with me
> It's not the consequence that makes a problem important, it is that you have a reasonable attack. That is what makes a problem important. When I say that most scientists don't work on important problems, I mean it in that sense. The average scientist, so far as I can make out, spends almost all his time working on problems which he believes will not be important and he also doesn't believe that they will lead to important problems.
@gfodor this problem
1) has a reasonable attack
2) leads to fame money and more investment
3) raises profound and interesting questions about metaverse living
@seconds_0 part of me agrees but then it’s like — why do we keep videos and photos? don’t those reopen wounds whenever you see them? should we have no graven symbols at all of our ancestors? is it a difference of degree or kind?
@seconds_0 that psa where they used the dead shot kid’s avatar to protest police violence or whatever was pretty gruesome I thought. there need to be many new taboos surrounding this when it becomes relevant
@a_musingcat why would this little thing that evolved under enormous constraint to run on less power than a lightbulb be the peak of viable Hebbian intelligence?
@a_musingcat our heads stopped expanding bc we were breaking our mothers open on the way out. there’s numerous evolutionary idiosyncrasies like this in the lead up to human level intelligence
@RichardMCNgo @joe_zimmerman it’s important not to talk in mystical terms about the intelligence needed to be an alignment researcher
this makes the project look unserious
@provisionalidea @PradyuPrasad i mean humans don’t yet know how to improve models in flight even in a limited capacity so it would be shocking if models knew
@provisionalidea @PradyuPrasad sure you can make them a bit friendlier or fit the recommender system a bit more closely but i’ve never seen them develop new abilities by being deployed at scale you know
the inside view is that technological progress is a hard and active process; you shouldn’t take it for granted, nor sit around waiting for utopia or dystopia to arrive