Previously: On the Proposed California SB 1047.
Text of the bill is here. It focuses on safety requirements for highly capable AI models.
This is written as an FAQ, tackling all questions or points I saw raised.
Safe & Secure AI Innovation Act also has a description page.
Editor’s Note: This post was originally published on May 2, 2024. On June 6, 2024, major changes were introduced to SB 1047, fixing the biggest issues and also weakening the bill. This post was updated on June 10, 2024 to reflect those changes.
Why Are We Here Again?
There have been many highly vocal and forceful objections to SB 1047. Some were in reaction to a (disputed and seemingly incorrect) claim that the bill has been ‘fast tracked.’
Later, after the changes were introduced, some critics softened or withdrew their objections. Other critics doubled down.
The bill continues to have a substantial chance of becoming law according to Manifold, where the market has not moved much on recent events. The previous version of the bill had been referred to two policy committees, one of which put out this 38-page analysis.
The purpose of this post is to gather and analyze all objections that came to my attention in any way, including all responses to my request for them on Twitter, and to suggest concrete changes that address some real concerns that were identified.
Some are helpful critiques pointing to potential problems, or good questions where we should ensure that my current understanding is correct. The original version of this post emphasized the need for two important fixes. Those fixes have now been made.
Some are based on what I strongly believe is a failure to understand how the law works, both in theory and in practice, or a failure to carefully read the bill, or both. Many of these issues have been further clarified by the bill changes.
Some are pointing out a fundamental conflict. They want people to have the ability to freely train and release the weights of highly capable future models. Then they notice that it will become impossible to do this while adhering to ordinary safety requirements. They seem to therefore propose to not have safety requirements.
Some are alarmist rhetoric that has little tether to what was ever in the bill, or how any of this works. I am deeply disappointed in some of those using or sharing such rhetoric.
Throughout such objections, there is little or no acknowledgement of the risks that the bill attempts to mitigate, suggestions of alternative ways to do that, or reasons to believe that such risks are insubstantial even absent required mitigation. To be fair to such objectors, many of them have previously stated that they believe that future more capable AI poses little catastrophic risk.
I get making mistakes, indeed it would be surprising if this post contained none of its own. Understanding even a relatively short bill like SB 1047 requires close reading. If you thoughtlessly forward anything that sounds bad (or good) about such a bill, you are going to make mistakes, some of which are going to look dumb.
What is the Story So Far?
If you have not previously done so, I recommend reading my previous coverage of the bill when it was proposed, while noting that this covered an older version of the bill, and has not been updated to reflect the changes.
In the first half of that post, I did an RTFB (Read the Bill). I read it again for this post.
The core bill mechanism is that if you want to train a ‘covered model,’ then you have various safety requirements that attach.
If your model is no more capable than one previously established as safe, you can invoke a limited duty exemption, and that is mostly the end of your duties. If not, then there are various things you must do.
If you fail in your duties you can be fined. If you purposefully lie about any of this then that is under penalty of perjury.
The definition of ‘covered model’ has contracted greatly due to the changes. Previously, a model counted if it used 10^26 flops, or if it was similarly powerful to what you could accomplish with 10^26 flops in 2024. Under the new version, a model counts only if it both uses 10^26 flops AND the compute used would cost over $100 million if purchased on the open market at time of training.
In practice, this means that the compute threshold is likely already higher than 10^26 flops, and it will continue to steadily rise over time. Where the previous bill tightened its threshold over time, now the threshold will rapidly loosen over time.
I concluded this was a good faith effort to put forth a helpful bill. As the bill deals with complex issues, it contains both potential loopholes on the safety side, and potential issues of inadvertent overreach, unexpected consequences or misinterpretation on the restriction side.
In the second half, I responded to Dean Ball’s criticisms of the bill, which he called ‘California’s Effort to Strangle AI.’
In the section What Is a Covered Model, I contend that zero current open models would count as covered models even under the old definition, and most future open models would not count, in contrast to Ball’s claim that this bill would ‘outlaw open models.’
Under the new wording, no existing open models would be covered.
In the section Precautionary Principle and Covered Guidance, I notice that what Ball calls ‘precautionary principle’ is an escape clause to avoid requirements, whereas the default requirement is to secure the model during training and then demonstrate safety after training is complete.
On covered guidance, I notice that I expect the standards there to be an extension of those of NIST, along with applicable ‘industry best practices,’ as indicated in the text.
In the section Non-Derivative, I notice that most open models are derivative models, upon which there are no requirements at all. As in, if you start with Llama-3 400B, the safety question is Meta’s issue and not yours.
In the section So What Would the Law Actually Do, I summarize my practical understanding of the law. I will now reproduce that below, with modifications for the changes to the bill (now including the later ones) and my updated understandings based on further analysis (the original version is here).
In Crying Wolf, I point out that if critics respond with similar rhetoric regardless of the actual text of the bill offered, as has been the pattern, and do not help improve any bill details, then they are not helping us to choose a better bill. And that the objection to all bills seems motivated by a fundamental inability of their preferred business model to address the underlying risk concerns.
What Do I Think The Law Would Actually Do?
This is an updated version of my previous list.
This reflects that they have introduced a ‘limited duty exemption,’ which I think mostly mirrors previous functionality but improves clarity. It also reflects several changes made in May 2024.
This is a summary, but I attempted to be expansive on meaningful details.
Let’s say you want to train a model. You follow this flow chart (a rough sketch of the flow in code follows the list), with ‘hazardous capabilities’ meaning roughly ‘can cause $500 million or more in damage in especially worrisome ways, or pose a similarly worrying threat in other ways,’ but clarification would be appreciated there.
Step 1: If your model is not over the 10^26 flops limit OR the compute you used would have cost less than $100 million on the open market?
You do not need to do anything at all. As you were.
You are not training a covered model.
You do not need a limited duty exemption.
That’s it.
Every other business in America and especially California is jealous.
Where the 10^26 threshold is above the estimated training compute of GPT-4 or the current versions of Google Gemini, and no open model is anywhere near it other than Meta’s prospective Llama-3 400B, which may or may not hit it. And the $100 million threshold will effectively be far higher than that.
Step 2: If your model is a derivative of an existing model, meaning you spent less than 25% of the compute originally used to train the model you modified?
You do not need to do anything at all. As you were.
All requirements instead fall on the original developer.
You do not need a limited duty exemption.
That’s it.
Most open models are derivative in this sense, often of e.g. Llama-N.
If you did use 25% or more of the originally used compute, then your model is no longer derivative. Proceed to step 3.
Step 3: If your model is projected to have lower benchmarks and not have greater capabilities than an existing non-covered model, or one with a limited duty exemption?
Your model qualifies for a limited duty exemption.
You can choose to accept the limited duty exemption, or proceed to step 4.
To get the exemption, certify why the model qualifies under penalty of perjury.
Your job now is to monitor events in case you were mistaken.
If it turns out you were wrong in good faith about the model’s benchmarks or capabilities, you have 30 days to report this and cease operations until you are in compliance as if you lacked the exemption. Then you are fully in the clear.
If you are judged not in good faith, then it is not going to go well for you.
Step 4: If none of the above apply, then you are training a covered model. If you do not yet qualify for the limited duty exemption, or you choose not to get one? What do you have to do in order to train the model?
Implement cybersecurity protections to secure access and the weights.
Implement a shutdown capability during training.
Implement all covered guidance.
Implement a written and separate safety and security protocol.
The protocol needs to ensure the model either lacks hazardous capability or has safeguards that prevent exercise of hazardous capabilities.
The protocol must include a testing procedure to identify potential hazardous capabilities, and what you would do if you found them.
The protocol must say what would trigger a shutdown procedure.
Step 5: Once training is complete: Can you determine a limited duty exemption now applies pursuant to your own previously recorded protocol? If no, proceed to step 6. If yes and you want to get such an exemption:
You can choose to file a certification of compliance to get the exemption.
You then have a limited duty exemption.
Once again, judged good faith gives you a free pass on consequences, if something were to go wrong.
To be unreasonable, the assessment also has to fail to take into account ‘reasonably foreseeable’ risks, which effectively means either (1) another similar developer, (2) NIST or (3) The Frontier Model Division already visibly foresaw them.
Step 6: What if you want to release your model without a limited duty exemption?
You must implement ‘reasonable safeguards and requirements’ to prevent:
An individual from being able to use the hazardous capabilities of the model.
An individual from creating a derivative model that was used to cause a critical harm.
This includes a shutdown procedure for all copies within your custody.
You must ensure that anything the model does is attributed to the model to the extent reasonably possible. It does not say that this includes derivative models, but I assume it does.
Implement any other measures that are reasonably necessary to prevent or manage the risks from existing or potential hazardous capabilities.
You can instead not deploy the model, if you can’t or won’t do the above.
After deployment, you need to periodically reevaluate your safety protocols, and file an annual report. If something goes wrong you have 72 hours to file an incident report.
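To pull the steps together, here is a minimal sketch of the flow above in code. The thresholds are the ones the bill sets, as summarized here; the function, its arguments, and the returned descriptions are my own illustration and simplification, not anything in the statute.

```python
# Rough sketch of the decision flow summarized above (Steps 1-6).
# All names, arguments, and return strings are my own illustration, not bill text.

FLOP_THRESHOLD = 1e26          # covered model compute threshold
COST_THRESHOLD = 100_000_000   # $100 million open-market compute cost
DERIVATIVE_FRACTION = 0.25     # under 25% of the original training compute stays derivative

def obligations(train_flops, compute_cost_usd,
                base_model_flops=None, weaker_than_exempt_model=False):
    """Return a rough description of what the bill asks of this training run."""
    # Step 1: not a covered model, so no obligations at all.
    if train_flops <= FLOP_THRESHOLD or compute_cost_usd <= COST_THRESHOLD:
        return "Not a covered model. No obligations."
    # Step 2: derivative of an existing model, so obligations fall on the original developer.
    if base_model_flops is not None and train_flops < DERIVATIVE_FRACTION * base_model_flops:
        return "Derivative model. Obligations fall on the original developer."
    # Step 3: projected to be no more capable than an existing exempt or non-covered model.
    if weaker_than_exempt_model:
        return ("Eligible for a limited duty exemption: certify why it qualifies, "
                "then monitor in case you were mistaken (or follow Steps 4-6 instead).")
    # Steps 4-6: covered model without an exemption.
    return ("Covered model: secure the weights, keep a shutdown capability during training, "
            "follow covered guidance, write a safety and security protocol, and provide "
            "reasonable assurance before deployment (or do not deploy).")

# Example: a $150 million, 2e26 flop non-derivative frontier run hits the full requirements.
print(obligations(train_flops=2e26, compute_cost_usd=150_000_000))
```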
Also, there are:
Some requirements on computing clusters big enough to train a covered model. Essentially do KYC, record payments and check for covered model training. Also they are required to use transparent pricing.
Some ‘pro-innovation’ stuff of unknown size and importance, like CalCompute. Not clear these will matter and they are not funded.
An open source advisory council is formed, for what that’s worth.
What are the Biggest Misconceptions?
That this matters to most AI developers.
It doesn’t, and it won’t.
Right now it matters at most to the very biggest handful of labs.
It only matters later if you are developing a non-derivative model that uses 10^26 or more flops and costs over $100 million in compute.
Or, it could matter indirectly if you were planning to use a future open model from a big lab such as Meta, and that big lab is unable to provide the necessary reasonable assurance to enable the release of that model.
That you need a limited duty exemption to train a non-covered or derivative model.
You don’t.
You have no obligations of any kind whatsoever.
That you need a limited duty exemption to train a covered model.
You don’t. It is optional.
You can choose to seek a limited duty exemption to avoid other requirements.
Or you can follow the other requirements.
Your call. No one is ever forcing you to do this.
That this is an existential threat to California’s AI industry.
Again, this has zero or minimal impact on most of California’s AI industry.
This is unlikely to ever change under the new rules. Few companies will want to spend over $100 million to train a model that competes with Google, Anthropic and OpenAI.
Even if you do spend over $100 million, if you are behind the frontier, you will be able to get a limited duty exemption.
It is possible that some other companies might create future narrow models for things like robotics or science sufficiently expensive to be covered. If so, those models not being general should greatly ease cost of compliance, and such projects would by construction have the budget to afford it.
That the bill threatens academics or researchers.
This bill very clearly does not. It will not even apply to them. At all.
Those who say this, such as Martin Casado of a16z, who was also the most prominent voice saying the bill would threaten California’s AI industry, show that they do not at all understand the contents or implications of the bill.
This is even more true now that the threshold is compute cost of $100 million.
There are even claims this bill is aimed at destroying the AI industry, or destroying anyone who would ‘challenge OpenAI.’
Seriously, no, stop it.
This bill is designed to address real safety and misuse concerns.
That does not mean the bill is perfect, or even good. It has costs and benefits.
That the requirements here impose huge costs that would sink companies.
The cost of filing the required paperwork is trivial versus training costs. If you can’t do the paperwork, then you can’t afford to train the model either.
The real costs are any actual safety protocols you must do if you are training a covered non-derivative model and cannot or will not get a limited duty exemption.
Which you should mostly be doing anyway.
The other cost is the inability to release a covered non-derivative model if you cannot get a limited duty exemption, and also cannot provide reasonable assurance of lack of hazardous capability.
Especially with the recent changes, this should only happen for a reason.
That this bill targets open weights or open source.
It does the opposite in two ways. It excludes shutdown of copies of the model outside your control from the shutdown requirement, and it creates an advisory committee for open source with the explicit goal of helping them.
When people say this will kill open source, what they mostly mean is that open weights are unsafe and nothing can fix this, and they want a free pass on this. So from their perspective, any requirement that the models not be unsafe is functionally a ban on open weight models.
Open model weights advocates want to say that they should only be responsible for the model as they release it, not for what happens if any modifications are made later, even if those modifications are trivial in cost relative to the released model. That’s not on us, they say. That’s unreasonable.
There was one real issue. The derivative model clause allowed someone to train ‘on top of’ your model and put the legal responsibility on you. The derivative model clause has been modified to mitigate this.
Many of the issues raised as targeting ‘open source’ apply to all models.
That developers risk going to jail for making a mistake on a form.
This (almost) never happens.
Seriously, this (almost) never happens.
People almost never get prosecuted for perjury, period. A few hundred a year.
When they do, it is not for mistakes, it is for blatant lying caught red handed.
And mostly that gets ignored too. The prosecutor needs to be really pissed off.
That hazardous capability includes any harms anywhere that add up to $500 million.
That is not what the bill says.
The bill says the $500 million must be due to cyberattacks on critical infrastructure, autonomous illegal-for-a-human activity by an AI, or something else of similar severity.
This very clearly does not apply to ‘$500 million in diffused harms like medical errors or someone using its writing capabilities for phishing emails.’
I suggest changes to make this clearer, but it should be clear already.
That the people advocating for this and similar laws are statists that love regulation.
Seriously, no. It is remarkable the extent to which the opposite is true.
That this would be administered in completely unreasonable ways by ‘doomers’ out to destroy the AI industry.
I am confused that people claim such people have this kind of power.
If they somehow gained that power and tried, the courts would block them.
What are the Real Problems?
The original version of the bill had three issues I highlighted. They were:
Derivative models can include unlimited additional training, thus allowing you to pass off your liability to any existing open model, in a way clearly not intended. This should be fixed by my first change below.
This has been addressed via a cap of 25% of original training costs, as I suggested. If you spend more than that, the new model is now non-derivative.
If you wanted to be more complex in order to fully rule out crazy enforcement of technical corner cases, and were willing to risk some errors in the other direction, you could add a second trigger of ‘or sufficient compute to qualify a model as covered.’
The comparison rule for hazardous capabilities risks incorporating models that advance mundane utility or are otherwise themselves safe, where the additional general productivity enables harm, or the functionality used would otherwise be available in other models we consider safe, but the criminal happened to choose yours. We should fix this with my second change below.
This has been addressed exactly as I suggested, and should be fine now.
In addition to those large problems, a relatively small issue is that the catastrophic threshold is not indexed for inflation. It should be.
It has been done.
Then there are problems or downsides that are not due to flaws in the bill’s construction, but rather are inherent in trying to do what the bill is doing or not doing.
First, the danger that this law might impose practical costs.
This imposes costs on those who would train covered models. Most of that cost, I expect in practice, is in forcing them to actually implement and document the security practices that they damn well should have had anyway. I do not expect the rest to be large compared to overall costs, since you need to be training a rather large non-derivative model for this law to apply to you, but there will be some amount of regulatory ass covering, and there will be real costs to filing the paperwork properly, hiring lawyers, and ensuring compliance and all that.
The change to a $100 million threshold makes it highly unlikely total compliance costs ever exceed ~2% of training compute costs, even if the worst warnings about total costs are true (which I very much do not think they are) and you are right at the threshold.
It is possible that there will be models where we cannot have reasonable assurance of their lacking hazardous capabilities, or even that we knew have such capabilities, but which it would pass a cost-benefit test to make available, either via closed access or release of weights.
Even a closed weights model can currently be jailbroken reliably. If a solution to that and similar issues cannot be found, alignment remains unsolved, and capabilities continue to improve, then once models become sufficiently hazardous and our safety plans seem inadequate, this could in the future impose a de facto cap on the general capabilities of AI models, at some unknown level above GPT-4. If you think that AI development should proceed regardless in that scenario, that there is nothing to worry about, then you should oppose this bill.
Similarly, open weights are unsafe and nothing we know of can fix this. If a solution cannot be found and capabilities continue to improve, then holding the open weights developer responsible for the consequences of their actions may in the future impose a de facto cap on the general capabilities of open weight models, at some unknown level above GPT-4 or relative to closed model capabilities, a cap that might not de facto apply to closed models capable of implementing safety protocols unavailable to open models. If you instead want open weights to be a free legal pass to not consider the possibility of enabling catastrophic harms and to not take safety precautions, you might not like this.
It is possible that there will be increasing regulatory capture, or that the requirements will otherwise be expanded in ways that are unwise.
It is possible that rhetorical hysteria in response to the bill will be harmful. If people alter their behavior in response, that is a real effect.
This bill could preclude a different, better bill.
There are also the risks that this bill will fail to address the safety concerns it targets, by being insufficiently strong, insufficiently enforced or motivating, or by containing loopholes. In particular, the fact that open weights models need not have the (impossible to get) ability to shut down copies not in the developer’s possession enables the potential release of such weights at all, but also renders the potential shutdown not so useful for safety.
With the later change to the $100 million threshold, it is plausible that the new effective threshold will become too high.
Also, the liability can only be invoked by the Attorney General, the damages are relatively bounded unless violations are repeated and flagrant or they are compensatory for actual harm, and good faith is a defense against having violated the provisions here. So it may be very difficult to win a civil judgment.
It likely will be even harder and rarer to win a criminal one. While perjury is technically involved if you lie on your government forms (same as other government forms) that is almost never prosecuted, so it is mostly meaningless.
Indeed, the liability could work in reverse, effectively granting model developers safe harbor. Industry often welcomes regulations that spell out their obligations to avoid liability for exactly this reason. So that too could be a problem or advantage to this bill.
What Are the Changes That Would Improve the Bill?
There were two important changes. They implemented both of them.
We should change the definition of derivative model by adding a 22606(i)(3) to make clear that if a sufficiently large amount of compute (I suggest 25% of original training compute or 10^26 flops, whichever is lower) is spent on additional training and fine-tuning of an existing model, then the resulting model is now non-derivative. The new developer has all the responsibilities of a covered model, and the old developer is no longer responsible.
They implemented exactly this change.
We should change the comparison baseline in 22602(n)(1) when evaluating difficulty of causing catastrophic harm, inserting words to the effect of ‘other than access to other covered models that are known to be safe.’ Instead of comparing to causing the harm without use of any covered model, we should compare to causing the harm without use of any safe covered model that lacks hazardous capability. You then cannot be blamed because a criminal happened to use your model in place of GPT-N, as part of a larger package or for otherwise safe dual use actions like making payroll or scheduling meetings, and other issues like that. In that case, either GPT-N and your model both have hazardous capability, or neither does.
They implemented exactly this change, with the comparison point now being versus access only to covered models that would qualify for a limited duty exemption (note it is ‘would qualify’ rather than ‘that have asked for one’).
In addition:
The threshold of $500 million in (n)(1)(B) and (n)(1)(C) should add ‘in 2024 dollars’ or otherwise be indexed for inflation.
Done.
I would clear up the language in 22606(f)(2) to make unambiguous that this refers to what one could reasonably have expected to accomplish with that many flops in 2024, rather than being as good as the weakest model trained on such compute, and if desired that it should also refer to the strongest model available in 2024. Also, we should clarify what date in 2024; if it is December 31, we should say so. The more I look at the current wording the clearer the intent is, but let’s make it a lot easier to see that.
This is moot since they killed the clause entirely.
After consulting legal experts to get the best wording, and mostly to reassure people, I would add 22602(n)(3) to clarify that to qualify under (n)(1)(D) requires that the damage caused be acute and concentrated, and that it not be the diffuse downside of a dual use capability that is net beneficial, such as occasional medical mistakes resulting from sharing mostly useful information.
After consulting legal experts to get the best wording, and mostly to reassure people, I would consider adding 22602 (n)(4) to clarify that the use of a generically productivity enhancing dual use capability, where that general increase in productivity is then used to facilitate hazardous activities without directly enabling the hazardous capabilities themselves, such as better managing employee hiring or email management, does not constitute hazardous capabilities. If it tells you how to build a nuclear bomb and this facilitates building one, that is bad. If it manages your payroll taxes better and this lets you hire someone who then makes a nuclear bomb, we should not blame the model. I do not believe we would anyway, but we can clear that up.
It would perhaps be good to waive levies (user fees) for sufficiently small businesses, at least when they are asking for limited duty exemptions, despite the incentive concerns, since we like small business and this is a talking point that can be cheaply defused.
Recent changes did not address the last three suggestions: the (n)(3) and (n)(4) clarifications and the fee waiver for small businesses.
Are You Ever Forced to Get a Limited Duty Exemption?
No. Never.
This perception is entirely due to a hallucination of how the bill works. People think you need a limited duty exemption to train any model at all. You don’t. This is nowhere in the bill.
If you are training a non-covered or derivative model, you have no obligations under this bill.
If you are training a covered model, you can choose to implement safeguards instead.
What is the Definition of Derivative Model? Is it Clear Enough?
Your model is derivative if it is based off an existing other model, and the compute you used from there is less than 25% of the compute cost of the original model.
In theory there could still be an issue. If Llama-5 is trained on 10^29 flops for billions of dollars in 2028, getting well ahead of the covered model thresholds and the point at which models could get catastrophically dangerous, and then Acme corporation creates Acme-3 by training on 10^28 flops, they could then be training an otherwise covered model that would still technically be the responsibility of Meta.
There is an inherent conflict here, and you have to choose between holding Meta accountable if they greatly reduce the cost of creating a dangerous model versus risking making Meta responsible for things someone could have trained anyway. You cannot fully guard against both errors. I think this solution is a fair compromise, but as noted above you could talk price if you worry more about one than the other.
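To put numbers on the hypothetical above (the figures come from that hypothetical, not from the bill), a quick check of where the 25% line falls:

```python
# Worked numbers for the hypothetical Llama-5 / Acme-3 scenario above.
llama_5_flops = 1e29
derivative_cap = 0.25 * llama_5_flops    # 2.5e28 flops of additional training

acme_3_flops = 1e28
print(acme_3_flops < derivative_cap)     # True: Acme-3 stays derivative, Meta's responsibility
print(3e28 < derivative_cap)             # False: a 3e28 flop run would be Acme's own non-derivative covered model
```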
Nick Moran suggests the derivative model requirement is similar to saying ‘you cannot sell a blank book,’ because the user could introduce new capabilities. He uses the example of not teaching a model any chemistry or weapon information, and then someone fires up a fine-tuning run on a corpus of chemical weapons manuals.
I think that is an excellent example of a situation in which this is ‘a you problem’ for the model creator. Here, it sounds like it took only a very small fine tune, costing very little, to enable the hazardous capability. You have made the activity of ‘get a model to help you do chemical weapons’ much, much easier to accomplish than it would have been counterfactually. So then the question is, did the ability to use the fine-tuned model help you substantially more than only having access to the manuals?
Whereas most of the cost of a book that describes how to do something is in choosing the words and writing them down, not in creating a blank book to print upon, and there are already lots of ways to get blank books.
If the fine-tune was similar in magnitude of cost to the original training run, then I would say it is similar to a blank book, instead, hence the 25% rule.
Charles Foster finds this inadequate, responding to a similar suggestion from Dan Hendrycks, and pointing out the combination scenario I may not have noticed otherwise.
Charles Foster: I don’t think that alleviates the concern. Developer A shouldn’t be stopped from releasing a safe model just because—for example—Developer B might release an unsafe model that Developer C could cheaply combine with Developer A’s. They are clearly not at fault for that.
The good news is I believe this is fixed by the second major change, as noted above.
The bottom line, as I see it, is:
We have fixed the definition of derivative model so that if you spend a substantial percentage of the original compute, responsibility correctly shifts.
If you are severely discounting the cost of creating an unsafe system, then that does not sound safe to me. We can talk price about what exactly the rule should be here.
If it is impossible to create a highly capable open model weights system that cannot be made unsafe at nominal time and money cost, then why do you think I should allow you to release such a model?
We should identify cases where our rules would lead to unreasonable assignments of fault, and modify the rules to fix them. The new rule is much better, but can plausibly be improved further.
What Constitutes Hazardous Capability?
Here is the current text. Struck text is shown in brackets; the language that replaces it follows.

(n) (1) “Hazardous capability” means the capability of a covered model to be used to enable any of the following harms in a way that would be significantly more difficult to cause without access to a covered [struck: model:] model that does not qualify for a limited duty exemption:

(A) The creation or use of a chemical, biological, radiological, or nuclear weapon in a manner that results in mass casualties.

(B) At least five hundred million dollars ($500,000,000) of damage through cyberattacks on critical infrastructure via a single incident or multiple related incidents.

(C) At least five hundred million dollars ($500,000,000) of damage by an artificial intelligence model that autonomously engages in conduct that would violate the Penal Code if undertaken by a [struck: human.] human with the necessary mental state and causes either of the following:

(i) Bodily harm to another human.

(ii) The theft of, or harm to, property.

(D) Other grave threats to public safety and security that are of comparable severity to the harms described in paragraphs (A) to (C), inclusive.
The change to the counterfactual tightens this considerably.
I presume that everyone is onboard with (A) counting as hazardous. We could more precisely define ‘mass’ casualties, but it does not seem important.
Notice the construction of (B). The damage must explicitly be damage to critical infrastructure. This is not $500 million from a phishing scam, let alone $500 from each of a million scams.
Similarly, notice (C). The violation of the penal code must be autonomous.
Both are important aggravating factors. A core principle of law is that if you specify X+Y as needed to count as Z, then X or Y alone is not a Z.
So when (D) says ‘comparable severity’ this cannot purely mean ‘causes $500 million in damages.’ In that case, there is no need for (B) or (C), one can simply say ‘causes $500 million in cumulative damages in some related category of harms.’
My interpretation of (D) is that, at a similar overall level of damages, the harm needs to be sufficiently acute and severe to be of comparable severity, or else the damages need to be sufficiently larger than $500 million. So something like causing a very large riot, perhaps.
You could do it via a lot of smaller incidents with less worrisome details, such as a lot of medical errors or malware emails, but we are then talking at least billions of dollars of counterfactual harm.
This seems like a highly reasonable rule.
However, people like Quintin Pope here are reasonably worried that it won’t be interpreted that way:
Quintin Pope: Suppose an open model developer releases an innocuous email writing model, and fraudsters then attach malware to the emails written by that model. Are the model developers then potentially liable for the fraudsters' malfeasance under the derivative model clause?
Please correct me if I'm wrong, but SB 1047 seems to open multiple straightforward paths for de facto banning any open model that improves on the current state of the art. E.g., - The 2023 FBI Internet Crime Report indicates cybercriminals caused ~$12.5 billion in total damages. - Suppose cybercriminals do similar amounts in future years, and that ~5% of cybercriminals use whatever open source model is the most capable at a given time.
Then, any open model better that what's already available would predictably be used in attacks causing > $500 million and thus be banned, *even if that model wouldn't increase the damage caused by those attacks at all*.
Cybercrime isn't the only such issue. "$500 million in damages" sounds like a big number, but it's absolute peanuts compared to things that actually matter on an economy-wide scale. If open source AI ever becomes integrated enough into the economy that it actually benefits a significant number of people, then the negative side effects of anything so impactful will predictably overshoot this limit.
My suggestion is that the language be expanded for clarity and reassurance, and to guard against potential overreach. So I would move (n)(2) to (n)(3) and add a new (n)(2), or I would add additional language to (D), whichever seems more appropriate.
The additional language would clarify that the harm needs to be acute and not as a downside of beneficial usage, and this would not apply if the model contributed to examples such as Quintin’s. We should be able to find good wording here.
I would also add language clarifying that general ‘dual use’ capabilities that are net beneficial, such as helping people sort their emails, cannot constitute hazardous capability.
This is something a lot of people are getting wrong, so let’s make it airtight.
Does the Alternative Capabilities Rule Use the Right Counterfactual?
To count as hazardous capability, this law now requires that the harm be ‘significantly more difficult to cause without access to a covered model that is ineligible for a limited duty exemption,’ not without access to this particular model, which we will return to later.
This is considerably stronger than ‘this was used as part of the process’ and considerably weaker than ‘required this particular covered model in particular.’
The obvious problem scenario, why you can’t use a weaker clause, is what if:
Acme issues a model that can help with cyberattacks on critical infrastructure.
Zenith issues a similar model that does all the same things.
Both are used to do crime that triggers (B), crime that required access to Acme or Zenith.
Acme says the criminals would have used Zenith.
Zenith says the criminals would have used Acme.
You need to be able to hold at least one of them liable.
The potential flaw in the other direction is, what if covered models simply greatly enhance all forms of productivity? What if it is ‘more difficult without access’ because your company uses covered models to do ordinary business things? Clearly that is not intended to count.
The solution implemented is that the clause now effectively says ‘without access to a covered model that itself has hazardous capabilities.’ So now:
Acme is a covered model.
Zenith is a covered model.
Zenith is used to substantially enable cyberattacks that trigger (B).
If this could have also been done with Acme with similar difficulty, then either both Zenith and Acme have hazardous capabilities, or neither of them do.
I am open to other suggestions to get the right counterfactual in a robust way.
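As a concrete illustration of how that counterfactual is meant to work, here is a minimal sketch. The difficulty scores, threshold, and function name are all invented for illustration; nothing here is drawn from the bill’s text.

```python
# Illustrative sketch of the counterfactual test, using made-up difficulty scores.
# Names and numbers are invented for illustration; nothing here is from the bill.

def has_hazardous_capability(difficulty_with_model, difficulty_with_exempt_models_only,
                             significant_gap=3):
    # The harm must be significantly more difficult to cause when the attacker only
    # has access to covered models that would qualify for a limited duty exemption.
    return difficulty_with_exempt_models_only - difficulty_with_model >= significant_gap

# Acme and Zenith make cyberattacks equally easy; safe exempt models do not help much.
acme = has_hazardous_capability(difficulty_with_model=2, difficulty_with_exempt_models_only=9)
zenith = has_hazardous_capability(difficulty_with_model=2, difficulty_with_exempt_models_only=9)
assert acme and zenith    # neither can escape by pointing at the other: both count

# A model that only speeds up ordinary business tasks already doable with exempt models.
payroll_model = has_hazardous_capability(difficulty_with_model=8,
                                         difficulty_with_exempt_models_only=9)
assert not payroll_model  # general productivity gains alone do not count
```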
None of this has anything to do with open model weights. The problem does not differentiate. If we get this wrong and cumulative damages or other mundane issues constitute hazardous capabilities, it will not be an open weights problem. It will be a problem for all models.
Indeed, in order for open models to be in trouble relative to closed models, we need a reasonably bespoke definition of what counts here, that properly identifies the harms we want to avoid. And then the open models would need to be unable to prevent that harm.
As an example of this and other confusions being widespread: The post was deleted so I won’t name them, but two prominent VCs posted and retweeted that ‘under this bill, open source devs could be held liable for an LLM outputting ‘contraband knowledge’ that you could get access to easily via Google otherwise.’ Which is clearly not the case.
Is Providing Reasonable Assurance of a Lack of Hazardous Capability Realistic?
It seems hard. Jessica Taylor notes that it seems very hard. Indeed, she does not see a way for any developer to in good faith provide assurance that their protocol works.
The key term of art here is ‘reasonable assurance.’ That gives you some wiggle room.
Indeed, the bill changes clarified this point, moving to more consistently use the exact term ‘provide reasonable assurance’ and adding:
(u) “Reasonable assurance” does not mean full certainty or practical certainty.
Jessica points out that jailbreaks are an unsolved problem. This is very true.
If you are proposing a protocol for a closed model, you should assume that your model can and will be fully jailbroken, unless you can figure out a way to make that not true. Right now, we do not know of a way to do that. This could involve something like ‘probabilistically detect and cut off the jailbreak sufficiently well that the harm ends up not being easier to cause than using another method’ but right now we do not have a method for that, either.
So the solution for now seems obvious. You assume that the user will jailbreak the model, and assess it accordingly.
Similarly, for an open weights model, you should assume the first thing the malicious user does is strip out your safety protocols, either with fine tuning or weights injection or some other method. If your plan was refusals, find a new plan. If your plan was ‘it lacks access to this compact data set’ then again, find a new plan.
As a practical matter, I believe that I could give reasonable assurance, right now, that all of the publicly available models (including GPT-4, Claude 3, and Gemini Advanced 1.0 and Pro 1.5) lack hazardous capability, if we were to lower the covered model threshold to 10^25 and include them.
If I was going to test GPT-5 or Claude-4 or Gemini-2 for this, how would I do that? There’s a METR for that, along with the start of robust internal procedures. I’ve commented extensively on what I think a responsible scaling policy (RSP) or preparedness framework should look like, which would carry many other steps as well.
One key point this emphasizes is that such tests need to give the domain experts jailbroken access, rather than only default access.
Perhaps this will indeed prove impractical in the future for what would otherwise be highly capable models if access is given widely. In that case, we can debate whether that should be sufficient to justify not deploying, or deploying in more controlled fashion.
I do think that is part of the point. At some point, this will no longer be possible. At that point, you should actually adjust what you do.
Is Reasonable Assurance Tantamount to Requiring Proof That Your AI is Safe?
No.
Again, the bill now says this explicitly.
(u) “Reasonable assurance” does not mean full certainty or practical certainty.
Reasonable assurance is a term used in auditing.
Here is Claude Opus’s response, which matches my understanding:
In legal terminology, "reasonable assurance" is a level of confidence or certainty that is considered appropriate or sufficient given the circumstances. It is often used in the context of auditing, financial reporting, and contracts.
Key points about reasonable assurance:
It is a high, but not absolute, level of assurance. Reasonable assurance is less than a guarantee or absolute certainty.
It is based on the accumulation of sufficient, appropriate evidence to support a conclusion.
The level of assurance needed depends on the context, such as the risk involved and the importance of the matter.
It involves exercising professional judgment to assess the evidence and reach a conclusion.
In auditing, reasonable assurance is the level of confidence an auditor aims to achieve to express an opinion on financial statements. The auditor seeks to obtain sufficient appropriate audit evidence to reduce the risk of expressing an inappropriate opinion.
In contracts, reasonable assurance may be required from one party to another about their ability to fulfill obligations or meet certain conditions.
The concept of reasonable assurance acknowledges that there are inherent limitations in any system of control or evidence gathering, and absolute certainty is rarely possible or cost-effective to achieve.
Is the Definition of Covered Model Overly Broad?
Jeremy Howard made four central objections to the old definition, and raised several other warnings covered below, which together seemed to effectively call for no rules on AI at all.
One objection, echoed by many others, is that the definition here is overly broad.
Given the changes, this seems clearly incorrect now.
Howard says this sentence, which I very much appreciate: “This could inadvertently criminalize the activities of well-intentioned developers working on beneficial AI projects.”
Being ‘well-intentioned’ is irrelevant. The road to hell is paved with good intentions. Who decides what is ‘beneficial?’ I do not see a way to take your word for it.
We don’t ask ‘did you mean well?’ We ask whether you meet the requirements.
I do agree it would be good to allow for cost-benefit testing, as I will discuss later under Pressman’s suggestion.
You must do mechanism design on the rule level, not on the individual act level.
The definition could still be overly broad, and this is central, so let’s break it down.
Here is (Sec. 3 22602):
(1) “Covered model” means an artificial intelligence model that [struck: meets either of the following criteria:] was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, and the cost of that quantity of computing power would exceed one hundred million dollars ($100,000,000) if calculated using average market prices of cloud compute as reasonably assessed by the developer at the time of training.

(2) [Adjust this for inflation.]
This probably covers zero currently available models, open or closed. It definitely covers zero available open weights models.
It is possible this would apply to Llama-3 400B, and it would presumably apply to Llama-4. The barrier is somewhere in the GPT-4 (4-level) to GPT-5 (5-level) range.
This does not criminalize such models. It says such models have to follow certain rules. If you think that open models cannot abide by any such rules, then ask why. If you object that this would impose a cost, well, yes.
You would be able to get an automatic limited duty exemption, if your model was below the capabilities of a model that had an existing limited duty exemption, which in this future could be a model that was highly capable.
There was previously a plausible case that the definition could become overly broad in the future, if the limited duty exemption was too burdensome, as the effective compute threshold lowered. That is moot now, as the threshold will rise instead.
Is the Similar Capabilities Clause Overly Broad or Anticompetitive?
The clause is gone, so this question is moot.
Does This Introduce Broad Liability?
No, and it perhaps could do the opposite by creating safe harbor.
Several people have claimed this bill creates unreasonable liability, including Howard as part of his second objection. I think that is essentially a hallucination.
There have been other bills that propose strict liability for harms. This bill does not.
The only way you are liable under this bill is if the attorney general finds you in violation of the statute and brings a civil action, seeking a civil penalty proportional to the model’s training cost. That is it.
What would it mean to be violating this statute? It roughly means that you failed to take reasonable precautions, did not follow the requirements, failed to act in good faith, and the courts agreed.
Even if your model is used to inflict catastrophic harm, a good faith attempt at reasonable precautions is a complete defense.
If a model were to enable $500 million in damages in any fashion, or mass casualties, even if it does not qualify as hazardous capability under this act, people are very much getting sued under current law. By spelling out what model creators must do to provide reasonable assurance, the bill lets labs argue that compliance should shield them from ordinary civil liability. I don’t know how effective that would be, but similar arguments have worked elsewhere.
The broader context of Howard’s second objection is that the models are ‘dual use,’ general purpose tools, and can be used for a variety of things. As I noted above, clarification would be good to rule out ‘the criminals used this to process their emails faster and this helped them do the crime’ but I am not worried this would happen either way, nor do I see how ‘well funded legal teams’ matter here.
Howard tries to make this issue about open weights, but it is orthogonal to that. The actual issue he is pointing towards here, I will deal with later.
Should Developers Worry About Going to Jail for Perjury?
Not unless they are willfully defying the rules and outright lying in their paperwork.
Here is California’s perjury statute.
Even then, mostly no. It is extremely unlikely that perjury charges will ever be pursued unless there was clear bad faith and lying. Even then, and even if this resulted in actual catastrophic harm, not merely potential harm, it still seems unlikely.
Lying on your tax return or benefit forms or a wide variety of government documents is perjury. Lying on your loan application is perjury. Lying in signed affidavits or court testimony is perjury.
Really an awful lot of people are committing perjury all the time. Also this is a very standard penalty for lying on pretty much any form, ever, even at trivial stakes.
This results in about 300-400 federal prosecutions for perjury per year, total, out of over 80,000 annual criminal cases.
In California for 2022, combining perjury, contempt and intimidation, there were a total of 9 convictions, none in the Northern District that includes San Francisco.
How Would This Be Enforced?
Unlike several other proposed bills, companies are tasked with their own compliance.
You can be sued civilly by the Attorney General if you violate the statute, with good faith as a complete defense. In theory, if you lie sufficiently brazenly on your government forms, like in other such cases, you can be charged with perjury, see the previous question. That’s it.
If you are not training a covered non-derivative model, there is no enforcement. The law does not apply to you.
If you are training a covered non-derivative model, then you decide whether to seek a limited duty exemption. You secure the model weights and otherwise provide cybersecurity during training. You decide how to implement covered guidance. You do any necessary mitigations. You decide what if any additional procedures are necessary before you can verify the requirements for the limited duty exemption or provide reasonable assurance. You do have to file paperwork saying what procedures you will follow in doing so.
There is no procedure where you need to seek advance government approval for any action.
Does This Create a New Regulatory Agency to Regulate AI?
No. It creates the Frontier Model Division within the Department of Technology. See section 4, 11547.6(c). The new division will issue guidance, allow coordination on safety procedures, appoint an advisory committee on (and to assist) open source, publish incident reports and process certifications.
Will a Government Agency Be Required to Review and Approve AI Systems Before Release?
No.
This has been in other proposals. It is not in this bill. The model developer provides the attestation, and does not need to await its review or approval.
Are the Burdens Here Overly Onerous to Small Developers?
If you do not spend $100 million to train your model, the law does not apply to you.
If you do spend that much, you are not so small, and you can owe us some reports.
The burden of the reports seems to pale in comparison to (and on top of) the burden of actually taking the precautions, or the burden of the compute cost of the model being trained. This is not a substantial cost addition once the models get that large.
One objection here is that ‘covered guidance’ is open ended and could change. I see good reasons to be wary of that, and to want the mechanisms picked carefully. But also any reasonable regime is going to have a way to issue new guidance as models improve, and again under $100 million in spend none of this applies to you.
This could still impact a small developer indirectly, if it changes what top labs such as Meta choose to do, and a small developer was relying on that. But that’s it.
Is the Shutdown Requirement a Showstopper for Open Weights Models?
It would be if it fully applied to such models.
The good news for open weights models is that this (somehow) does not apply to them. Read the bill.
(m) (1) “Full shutdown” means the cessation of operation of a covered model, including all copies and derivative models, on all computers and storage devices within the custody, control, or possession of a [struck: person,] nonderivative model developer or a person that operates a computing cluster, including any computer or storage device remotely provided by agreement.

(2) “Full shutdown” does not mean the cessation of operation of a covered model to which access was granted pursuant to a license that was not granted by the licensor on a discretionary basis and was not subject to separate negotiation between the parties.
I previously argued the old definition meant the same thing. The debate is moot now.
This seems like a real problem for the actual safety intent here, as I noted last time.
Rather than a clause that is impossible for an open model to meet, this is a clause where open models are granted extremely important special treatment, in a way that seems damaging to the core needs of the bill.
The other shutdown requirement is the one during training of a covered model without a limited duty exemption.
That one says, while training the model, you must keep the weights on lockdown. You cannot open them up until after you are done, and you run your tests. So, yes, there is that. But that seems quite sensible to me? Also a rule that every advanced open model developer has followed in practice up until now, to the best of my knowledge.
Thus objections like Kevin Lacker’s here are incorrect with respect to the shutdown provision. For his other more valid concern, see the derivative model definition section.
Do the Requirements Disincentive Openness?
On Howard’s final top point, what here disincentivizes openness?
Openness and disclosing information on your safety protocols and training plans are fully compatible. Everyone faces the same potential legal repercussions. These are costs imposed on everyone equally.
Indeed, a lot of what the bill does is require an important kind of openness.
To the extent they are imposed more on open models, it is because those models are incapable of guarding against the presence of hazardous capabilities.
Ask why.
Will This Have a Chilling Effect on Research or Academics?
No.
Howard raised this possibility, as did Martin Casado of a16z, who called the bill a ‘f***ing disaster’ and an attack on innovation generally.
I don’t see how this ever would have happened even under the old version. It seems like a failure to understand the contents of the bill, or to think through the details.
With the new version it is Obvious Nonsense.
The only people liable or who have responsibilities under SB 1047 are those that train covered models costing over $100 million. That’s it. What exactly is your research, sir?
Does the Ability to Levy Fees Threaten Small Business?
It is standard at this point to include ‘business pays the government fees to cover administrative costs’ in such bills, in this case with Section 11547.6 (c)(11). This aligns incentives.
It is also standard to object, as Howard does, that this is an undue burden on small business.
My response is, all right, fine. Let’s waive the fees for sufficiently small businesses, so we don’t have to worry about this. It is at worst a small mistake.
Will This Raise Barriers to Entry?
Howard warned of this.
Again, the barrier to entry can only apply if the rules apply to you. So ‘entry’ here now means spending over $100 million to train a model.
This could actively work the other way. Part of this law will be that NIST, other companies, and the Frontier Model Division will be publishing safety protocols for you to copy. That seems super helpful.
It is possible this could still be a small barrier to entry for new companies looking to go sufficiently big, but at most it would be small, and this could also be helpful.
Is This a Brazen Attempt to Hurt Startups and Open Source?
Did they, as also claimed by Brian Chau, ‘literally specify that they want to regulate models capable of competing with OpenAI?’
No, of course not, that is all ludicrous hyperbole, as per usual, even in the old version.
Brian Chau also goes on to say, among other things that include ‘making developers pay for their own oppression’:
Brian Chau: The bill would make it a felony to make a paperwork mistake for this agency, opening the door to selective weaponization and harassment.
Um, no. Again, see the section on perjury, and also the very explicit text of the bill. That is not what the bill says. That is not what perjury means. If he does not know this, it is because he is willfully ignorant of this and is saying it anyway.
And then the thread in question was linked to by several prominent others, all of whom should know better, but have shown a consistent pattern of not knowing better.
To those people: You can do better. You need to do better.
There are legitimate reasons one could think this bill would be a net negative even if its particular detailed issues are fixed. There are also particular details that need (or at least would benefit from) fixing. Healthy debate is good.
This kind of hyperbole, and a willingness to repeatedly signal boost it, is not.
Brian does then also make the important point about the definition of derivative model at the time being potentially overly broad, allowing unlimited additional training, and thus effectively the classification of a non-derivative model as derivative of an arbitrary other model (or at least one with enough parameters). See the section on the definition of derivative models, where I suggest a fix.
Even if you did somehow entertain such claims before, they are obviously moot now, as the relevant sections are gone from the bill.
Also note that if such claims were right about the purpose of the bill, these changes would not have happened this way.
Will This Cost California Talent or Companies?
Several people raised the specter of people or companies leaving the state.
It is interesting that people think you can avoid the requirements by leaving California. I presume that is not the intent of the law, and under other circumstances such advocates would point out the extraterritoriality issues.
If it is indeed true that the requirements here only apply to models trained in California, will people leave?
In the short term, no. Even if a big AI lab is covered and this is far more expensive to comply with than I expect, no one who this applies to would care enough to move. As I said last time, have you met California? Or San Francisco? You think this is going to be the thing that triggers the exodus? Compared to (for example) the state tax rate, this is nothing.
This bill at most covers a tiny fraction of people doing software development. Most companies will not spend $100 million to train a model. So the network effects are not going anywhere.
Will This Hurt Small Open Weights Companies Indirectly, for Example by Hurting Meta’s Ability to Release a Future Model Like Llama-4-1T?
This is possible.
This would be the result of Meta being unwilling or unable to provide reasonable assurance that Llama-4-1T lacked hazardous capabilities.
Ask why this would happen.
Again, it would come down to the fundamental conflict: open weights are unsafe and nothing can fix this. Indeed, this would happen precisely because Meta cannot fix it.
With the changes to the definition of derivative model, if Meta chooses not to release the weights of Llama-4-1T due to such concerns, it will be because they were unable to provide reasonable assurance of a lack of catastrophic risks if the weights are made available.
If you think Meta should, if unable to provide reasonable assurance, release the weights of such a future highly capable model anyway, because open weights are more important, then we have a strong values disagreement. I also notice that you oppose the entire purpose of the bill. You should oppose this bill, and be clear as to why.
Could We Use a Cost-Benefit Test?
John Pressman gets constructive and proposes the best kind of test: a cost-benefit test.
John Pressman: Since I know you [Scott Wiener] are unlikely to abandon this bill, I do have a suggested improvement: For a general technology like foundation models, the benefits will accrue to a broad section of society including criminals.
My understanding is that the Federal Trade Commission decides whether to sanction a product or technology based on a utilitarian standard: Is it on the whole better for this thing to exist than not exist, and to what extent does it create unavoidable harms and externalities that potentially outweigh the benefits?
In the case of AI and e.g. open weights we want to further consider marginal risk. How much *extra benefit* and how much *extra harm* is created by the release of open weights, broadly construed?
This is of course a matter of societal debate, but an absolute threshold of harm for a general technology mostly acts to constrain the impact rather than the harm, since *any* form of impact once it becomes big enough will come with some percentage of absolute harm from benefits accruing to adversaries and criminals.
I share others concerns that any standard will have a chilling effect on open releases, but I'm also a pragmatic person who understands the hunger for AI regulation is very strong and some kind of standards will have to exist. I think it would be much easier for developers to weigh whether their model provides utilitarian benefit in expectation, and the overall downstream debate in courts and agency actions will be healthier with this frame.
[In response to being asked how he’d do it]: Since the FTC already does this thing I would look there for a model. The FTC was doing some fairly strong saber rattling a few years ago as part of a bid to become The AI Regulator but seems to have backed down.
Zvi: It looks from that description like the FTC's model is 'no prior restraint but when we don't like what you did and decide to care then we mess you up real good'?
John Pressman: Something like that. This can be Fine Actually if your regulator is sensible, but I know that everyone is currently nervous about the quality of regulators in this space and trust is at an all time low.
Much of the point is to have a reasonable standard in the law which can be argued about in court. e.g. some thinkers like yourself and Jeffrey Laddish are honest enough to say open weights are very bad because AI progress is bad.
The bill here is clearly addressing only direct harms. It excludes ‘accelerates AI progress in general’ as well as ‘hurts America in its competition with China’ and ‘can be used for defensive purposes’ and ‘you took our jobs’ and many other things. Those impacts are ignored, whatever sign you think they deserve, the same way various other costs and benefits are ignored.
Pressman is correct that the natural tendency of a ‘you cannot do major harm’ policy is a ‘you cannot do major activities at all’ policy. A lot of people are treating the rule here as far more general than it is, with a much lower threshold than it has; I believe this includes Pressman. See the discussion on the $500 million threshold and what counts as a hazardous capability. But the foundational problem is there either way.
Could we do a cost-benefit test instead? It is impossible to fully ‘get it right,’ but it is always impossible to fully get it right. The question is, can we make this practical?
I do not like the FTC model. The FTC model seems to be:
You do what you want.
One day I decide something is unfair or doesn’t ‘pass cost-benefit.’
Retroactively I invalidate your entire business model and your contracts.
Also, you do not want to see me angry. You would not like me when I’m angry.
There are reasons Lina Khan is considered a top public enemy by much of Silicon Valley.
This has a lot of the problems people warn about, in spades.
If it turns out you should not have released the model weights, and I decide you messed up, what happens now? You can’t take it back. And I don’t think any of us want to punish you enough to make you regret releasing models that might be mistakes to release.
Even if you could take it back, such as with a closed model, are you going to have to shut down the moment the FTC questions you? That could break you, easily. If not, then how fast can a court move? By the time it rules, the world will have moved on to better models, you made your killing or everyone is dead, or what not.
It is capricious and arbitrary. Yes, you can make your arguments in court once the FTC (or other body) decides to have it out with you, but it is going to get ugly for you even if you are right. They can and do threaten you in arbitrary ways. They can and do play favorites and go after enemies while ignoring friends who break rules.
I think these problems are made much worse by this structure.
So I think if you want cost-benefit, you need to do the cost-benefit analysis in advance of the project. This would clearly be a major upgrade over, for example, NEPA (where I want to do exactly this), or over asking for permission to build housing, and other similar matters.
Could we make this reliable enough and fast enough for it to make sense? I think you would still have to do all the safety testing.
Presumably there would be a ‘safe harbor’ provision. Essentially, you would want to offer a choice:
You can follow the hazardous capabilities procedure. If your model lacks hazardous capabilities in the sense defined here, then we assume the cost-benefit test is now positive, and you can skip it. Or at least, you can release pending it.
You can follow the cost-benefit procedure. You still have to document what hazardous capabilities could be present, or we can’t model the marginal costs. Then we can also model the marginal benefits.
We would want to consider the class of models as a group as well, at least somewhat, so we do not run into the Acme-Zenith issue, where each release treats the other as already accounting for the downside and so both look beneficial.
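To make the marginal-risk framing concrete, here is a minimal sketch of what such a test could look like, purely as an illustration. This is my own formalization, not language from the bill or from Pressman, and the choice of terms and baseline is an assumption:

\[
\Delta B = \mathrm{E}[\text{benefit} \mid \text{release}] - \mathrm{E}[\text{benefit} \mid \text{no release}], \qquad
\Delta H = \mathrm{E}[\text{harm} \mid \text{release}] - \mathrm{E}[\text{harm} \mid \text{no release}]
\]
\[
\text{Allow the release (or grant safe harbor) only if } \Delta B > \Delta H.
\]

Note that this individual-release version is exactly where the Acme-Zenith problem bites: if each developer’s no-release baseline already includes the other’s release, both marginal harms look small even when the combined harm is large. That is why the class of comparable models would also need to be assessed as a group, as suggested above.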
Should We Interpret Proposals via Adversarial Legal Formalism?
Doomslide suggests that using the concept of ‘weights’ at all anchors us too much on existing technology, because regulation will be too slow to adjust, and we should use only input tokens, output tokens and compute used in forward passes. I agree that we should strive to keep the requirements as simple and abstract as possible, for this and other reasons, and that ideally we would word things such that we captured the functionality of weights rather than speaking directly about weights. I unfortunately find this impractical.
I do notice the danger of people trying to do things that technically do not qualify as ‘weights,’ but that is where ‘it costs a lot of money to build a good model’ comes in: you would be going to a lot of trouble and expense for something that would not be difficult to patch out of the law.
That also points to the necessity of having a non-zero amount of human discretion in the system. A safety plan that works if someone follows the letter but not the spirit, and that allows rules lawyers and munchkining and cannot adjust when circumstances change, is going to need to be vastly more restrictive to get the same amount of safety.
Jessica Taylor goes one step further, saying that these requirements are so strict that you would be better off either abandoning the bill or banning covered model training entirely. I believe the changes should largely address the particular argument here.
I think this is mostly a pure legal formalism interpretation of the requirements, based on a wish that our laws be interpreted strictly and maximally broadly as written, fully enforced in all cases and written with that in mind, while seeing our actual legal system as it functions today as operating in bad faith and corruptly. On that view, anyone who participated here would also have to be acting in bad faith and corruptly; otherwise, she sees this as a blanket ban.
I find a lot appealing about this alternative vision of a formalist legal system and would support moving towards it in general. It is very different from our own. In our legal system, I believe that the standard of ‘reasonable assurance’ will in practice be something one can satisfy, in actual good faith, with confidence that the good faith defense is available.
In general, I see a lot of people who interpret all proposed new laws through the lens of ‘assume this will be maximally enforced as written whenever that would be harmful but not when it would be helpful, no matter how little sense that interpretation would make, by a group using all allowed discretion as destructively as possible in maximally bad faith, and that is composed of a cabal of my enemies, and assume the courts will do nothing to interfere.’
I do think this is an excellent exercise to go through when considering a new law or regulation. What would happen if the state was fully rooted, and was out to do no good? This helps identify ways we can limit abuse potential and close loopholes and mistakes. And some amount of regulatory capture and not getting what you intended is always part of the deal and must be factored into your calculus. But not a fully maximal amount.
What Other Positive Comments Are Worth Sharing?
In defense of the bill, also see Dan Hendrycks’s comments, where he quotes Hinton and Bengio:
Geoffrey Hinton: SB 1047 takes a very sensible approach... I am still passionate about the potential for AI to save lives through improvements in science and medicine, but it’s critical that we have legislation with real teeth to address the risks.
Yoshua Bengio: AI systems beyond a certain level of capability can pose meaningful risks to democracies and public safety. Therefore, they should be properly tested and subject to appropriate safety measures. This bill offers a practical approach to accomplishing this, and is a major step toward the requirements that I've recommended to legislators.
What Else Was Suggested That We Might Do Instead of This Bill?
Howard has a section on this. It is the question I pose to all those who object:
If you want to modify the bill, how would you change it?
If you want to scrap the bill, what would you do instead?
Usually? Their offer is nothing.
Here are Howard’s suggestions, which do not address the issues the bill targets:
The first suggestion is to ‘support open-source development,’ which is the opposite of helping solve these issues.
‘Focus on usage, not development’ does not work. Period. We have been over this.
‘Promote transparency and collaboration’ is in some ways a good idea, but also this bill requires a lot of transparency and he is having none of that.
‘Invest in AI expertise’ for government? I notice that this is also objected to in other contexts by most of the people making the other arguments here. On this point, we fully agree, except that I say this is a complement, not a substitute.
The first, third and fourth answers here are entirely non-responsive.
The second answer, the common refrain, is an inherently unworkable proposal. If you put the hazardous capabilities up on the internet, you will then (at least) need to prevent misuse of those capabilities. How are you going to do that? Punishment after the fact? A global dystopian surveillance state? What is the third option?
The flip side is that Guido Reichstadter proposes that we instead shut down all corporate efforts at the frontier. I appreciate people who believe in that saying so. And here are Akash Wasil and Holly Elmore, who are of similar mind, noting that the current bill does not actually have much in the way of teeth.
Would This Interfere With Federal Regulation?
This is a worry I heard raised previously. Would California’s congressional delegation then want to keep the regulatory power and glory for themselves?
Senator Scott Wiener, who introduced this bill, answered me directly that he would still strongly support federal preemption via a good bill, and that this outcome is ideal. He cannot, however, speak for other lawmakers.
I am not overly worried about this, but I remain nonzero worried, and do see this as a mark against the bill. Whereas perhaps others might see it as a mark for the bill, instead.
Conclusion
Hopefully this has cleared up a lot of misconceptions about SB 1047, and we now have a much better understanding of what the bill actually says and does. As always, if you want to go deep and get involved, all analysis is a complementary good to your own reading; there is no substitute for RTFB (Read the Bill). So you should also do that.
This bill is about future, more capable models. It would have had zero impact on any currently available model outside of those from the three big labs of Anthropic, OpenAI, and Google DeepMind, plus at most one other model known to be in training, Llama-3 400B. If you build a ‘derivative’ model, meaning you are working off of someone else’s foundation model, you have to do almost nothing.
This alone wildly contradicts most alarmist claims.
In addition, if in the future you are rolling your own and build something that costs over $100 million to train and is far more capable than GPT-4, then so long as you are behind the existing state of the art, your requirements are again minimal.
Many other concerns are built on misunderstanding the threshold of harm, or the nature of the requirements, or the penalties and liabilities imposed and how they would be enforced. A lot of them are essentially hallucinations of provisions of a very different bill, confusing this with other proposals that would go farther. A lot of descriptions of the requirements imposed greatly exaggerate the burden this would impose even on future covered models.
If this law poses problems for open weights, it would not be because anything here targets or disfavors open weights, other than calling for weights to be protected during the training process until the model can be tested, as all large labs already do in practice. Indeed, the law explicitly favors open weights in multiple places, rather than the other way around. One of those is its tolerance of a major security problem inherent in open weight systems, the inability to shut down copies outside one’s control.
The problems would arise because those open weights open up a greater ability to instill or use hazardous capabilities to create catastrophic harm, and you cannot reasonably assure that this is not the case.
That does not mean that this bill has only upside or is in ideal condition.
Indeed, the bill has in many ways substantially improved. My two most important suggested changes to the bill have been implemented. They also substantially tightened what counts as a covered model.
This bill now seems to be a mostly excellent version of the bill it is attempting to be. That does not mean it could not be improved further, and I welcome and encourage additional attempts at refinement.
It certainly does not mean we will not want to make changes over time as the world rapidly changes, or that this bill seems sufficient even if passed in identical form at the Federal level. For all the talk of how this bill would supposedly destroy the entire AI industry in California (without subjecting most of that industry’s participants to any non-trivial new rules, mind you), it is easy to see the ways this could prove inadequate to our future safety needs. What this does seem to be is a good baseline from which to gain visibility and encourage basic precautions, which puts us in better position to assess future unpredictable situations.
I'm impressed that this bill narrowly focuses on hazardous capabilities, not social costs or jobs. That's a distinct issue that is best addressed separately. It is not sufficient to prevent full existential risk, but much better than I hoped for this early in the game.
Section 22605’s ‘transparent, uniform, publicly available price schedule’ requirement interferes with business models that are rapidly changing and is completely out of scope. Antitrust enforcement is arbitrary and out of hand enough as it is.
I would appreciate a distinct take on the penalties. The ‘preventive relief’ and ‘punitive damages’ terms in Section 22606 look like actual teeth even though the civil penalties are capped.
The derivative model carve-outs seem necessary but are concerning. There are too many complicated scenarios where real liability can be ducked. I would at least direct the courts to provide preventive relief (i.e., block dissemination, require deletion) for a public safety threat.