I think this is objectively a point in favor of the "don't regulate AI because the gov always screws things up" crowd. I don't 100% agree, and I think there probably are good regulations worth considering, but it does show the strength of the "the gov will screw it up" card in the AI-policy-deckbuilding-game. It also shows the fallacy most people hold of treating "regulation" as a Sim City style slider: set it to 0 and it costs no money and does nothing; push it up to 10 (or 11) and it costs lots of money and definitely works.
I think what it does show, as far as strategy, is that if you want good AI regulation, you should not just say those words, but positively say "we want *THIS* good AI regulation with *these* key features," along with as-efficient-as-possible soundbites about what those features are. And also go on to publicly slag (as this post does well) alternative proposals (like this bill) that are obviously dumb, "we just heard about this AI thing yesterday" kind of stuff.
AI as a political issue is too new for pols to have implicit understandings of the tradeoffs - unlike, say, gun control, where pols understand that pro-gun-control voters will not support "take guns away from the army!" and anti-gun-control voters will not support "give free guns to convicted murderers."
Yeah I agree with this.
Not sure I agree about the first point. You could also interpret this as the government randomly producing regulation of varying quality until something is passed, and you don't necessarily have control over what is passed, so you should support the better ones to crowd out the worse ones.
I don't think it's random, I think that gov will produce regulation that enhances their power (or that of highly incentivized interest groups) unless even more persuasive interest groups do what you describe here: "you should support the better ones " in order to pre-emptively make it clear that "better regulation" -> "more votes."
To clarify - the quality of the regulation is what's random, presumably because different legislators or interest groups are crafting the different bills, and some are better at it than others.
What is the goal that you believe legislators are pursuing, such that some of them are better at it than others? It's an agent problem. They aren't gonna craft good regulation unless you make them.
In the case of the two bills under discussion, we have:
1. Legislators attempting to protect the public from catastrophic and existential risks.
And
2. Legislators attempting to fulfill the desires of their constituents and lobbying groups by protecting "authentic" human content from imitation by AI.
Apposite to the content, this is the AI podcast episode for this post:
https://open.substack.com/pub/dwatvpodcast/p/rtfb-californias-ab-3211?r=67y1h&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
I might be misreading the bill, but if I’m not section 22949.90.2 may have accidentally banned the sale of analog cameras manufactured after 2026. Consider:
“‘Camera and recording device manufacturer’ means the makers of a device that can record photographic, audio, or video content, including, but not limited to, video and still photography cameras, mobile phones with built-in cameras or microphones, and voice recorders.”
Which includes analog cameras. These would be required to be able to provide a provenance watermark, which I don’t think they can do:
“‘Provenance watermark’ means a watermark of authentic content that includes details about the content, including, but not limited to, the time and date of production, the name of the user, details about the device, and a digital signature.”
Also, this doesn’t apply only to digital cameras, or else why would subsection 4 feel the need to call them out specifically?
Am I missing something? (Claude is wishy washy on this, but seems to think the bill would in fact do this)
I don't think it's necessarily impossible for e.g. a film camera to add a watermark, but it would definitely add a lot to the cost.
(This particular watermark isn't required to be imperceptible, indelible, or impossible to counterfeit, unless I'm missing something. If that was required, yeah, I think that would be close enough to impossible to be an effective ban on film cameras.)
I think you're right that it applies to analog cameras. Also to cassette recorders. And maybe notepads.
I agree they could add a watermark; it's the nature of the provenance watermark that will be the problem. As a reminder, they must provide:
1. the time and date of production
2. the name of the user
3. details about the device
4. a digital signature
#3 would be the easiest. For #4 you could probably argue it does not apply because the photo is not digital, and I doubt it's coherent enough to be enforceable anyway. But I don’t see how you comply with #1 and #2.
Sorry, maybe it's because I haven't read the whole bill carefully. I don't see anything in 22949.90.2 about what information the watermark has to include (which makes me doubt my own reading, because then what's the point?). It does say that there must be options to choose what types of provenance data are included, but doesn't say that any particular type has to be available.
I see that 22949.90 has definitions of "provenance data" (with non-exhaustive list of what that might include) and "watermark", but I don't interpret that as saying that all of that data must be included in every watermark mandated by the other sections. Maybe they intended it to be implied.
There are film cameras that can put the date in the image. I think they basically have an LED display that gets reflected onto the film by a prism or something. To add the user's name (is that actually mentioned in the law anywhere?) there would have to be some way to input a name, so this is definitely adding a lot of cost if the camera didn't already have some kind of digital UI.
The digital signature is definitely the hardest one, but also the only one that has any point I can see. Assuming it's not OK if it's possible to just copy the signature onto a different picture, you need a hash of the picture itself to sign, so you need to put a digital camera inside the film camera. And of course you have the same difficulties as with digital cameras, i.e. making it impractical to forge the signature, which means hiding a secret key inside every camera in a way that can't be reverse engineered.
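For concreteness, the signing step itself is the easy, well-understood part. Here is a minimal sketch using the Python `cryptography` library; the device key and image bytes are placeholders I made up, and in a real camera the key would live in tamper-resistant hardware, which is exactly the hard part described above. Note that the signature only covers exact digital bytes, which is why a purely analog capture has nothing to sign.

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()   # in reality: baked into the camera at manufacture
device_pub = device_key.public_key()

image_bytes = b"...raw sensor data for one frame..."   # placeholder
digest = hashlib.sha256(image_bytes).digest()

signature = device_key.sign(digest)   # the part a provenance watermark would need to carry

# Anyone with the device's public key can check the bytes were not altered;
# this raises cryptography.exceptions.InvalidSignature if they were.
device_pub.verify(signature, digest)
```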
Sorry, I got a bit nerd-sniped by this question. Probably not worth spending this much time analyzing a bad bill that is (hopefully) never going to be law. Unless California legislators or their staff read the comments in this blog, I guess.
You can find the required information in 22949.90.2, subsection (a)(3), the bit defining a provenance watermark. I assumed that when they said you had to include a provenance watermark, it meant one with all of that.
Also, upon rereading the bill I may have been incorrect: it looks like subsection (b)(1) only applies to digital cameras, so while an analog tape recorder would still be banned, an analog camera might not have to provide a provenance watermark at all.
At least it's only requiring that recording devices provide an *option* to watermark, not that they always add a watermark whether the user wants it or not. I don't think that was clear from Zvi's description.
But if it's just an option, I don't see why it has to be required in all devices. People who want the option enough to pay extra for it could pay extra for it, and people who don't care could still get cheap disposable cameras.
I think the option has to be on every camera though, rather than having it so that you can buy a camera with the option to watermark if you want. So the technical challenge is the same whether the end user wants to use it or not.
Yes, the law says it has to be on every camera. I just don't see what good that does. Maybe they figure people will not buy a camera with watermarking if there's a cheaper one without, but some people will use the watermarking option if it's there? And they want to encourage that, even though there will still be plenty of unwatermarked video out there? I guess that's probably it. Or else they just didn't think too hard about it.
“ I just don't see what good that does”
Nor do I, but then, we’ve already established this is a terrible bill, so why shouldn’t it be a bit worse.
“Or else they just didn't think too hard about it.”
I bet this is what happened. I’m pretty sure the bill wasn’t meant to apply to analog cameras at all, and that if this were pointed out to them they would amend the bill to specifically exclude them, certain digital cameras, and basically all the other tech that was included here by mistake (e.g., someone else pointed out it also applies to tape recorders). This just further highlights the problems with the bill: it's so poorly thought out that it accidentally bans unrelated products.
Probably obvious: With watermarking there's always a tradeoff between transparency (i.e. imperceptibility) and robustness (ability to survive changes to the watermarked media, whether it's someone intentionally trying to remove the watermark or changes that happen naturally, like noise added during broadcast or lossy compression).
Since this bill specifies "imperceptible", even "maximally indelible" watermarks would end up being easy to remove. Though this might depend a bit on the question of "imperceptible to whom?", and there's room for some ingenuity.
In audio (the kind of watermarking I used to work on), it's pretty easy to create a watermark that will be inaudible to the vast majority of people (maybe even all people), but will not survive even the most basic changes, like adding a tiny bit of noise or speeding up by 1%. If you're willing to accept that a few people might be able to hear it sometimes (like music mastering engineers and maybe a few audiophiles), it's possible (though not easy) to make it robust to e.g. MP3 compression down to a reasonable bit rate, small changes in speed, band-limited filtering, etc. Those tradeoffs continue all the way up to "completely overwrite the original sound with a carrier signal".
And then there's the fact that some sounds are easier to watermark than others, and questions of how short an audio clip can be before you can't detect the watermark, and how much computation you're willing to spend (which is two separate questions, for adding the watermark and for detecting it), and whether "imperceptible" means you get to compare it with the non-watermarked signal and try to notice the difference, and how much data you actually need to put in the watermark, and how unlikely you want false positives to be, and so on.
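To make that tradeoff concrete, here is a toy spread-spectrum sketch (my own illustration in Python/NumPy, not any scheme the bill names, and with none of the psychoacoustic shaping a real system would use). The `strength` parameter is the dial: lower is closer to inaudible, and also easier to wash out with noise or desynchronize with a speed change.

```python
import numpy as np

def embed(audio: np.ndarray, seed: int, strength: float = 0.003) -> np.ndarray:
    # Add a low-amplitude pseudorandom +/-1 carrier keyed by a secret seed.
    carrier = np.random.default_rng(seed).choice([-1.0, 1.0], size=audio.shape)
    return audio + strength * carrier

def detection_z(audio: np.ndarray, seed: int) -> float:
    # Correlate against the keyed carrier; with no watermark this is roughly N(0, 1).
    carrier = np.random.default_rng(seed).choice([-1.0, 1.0], size=audio.shape)
    return float(np.dot(audio, carrier) / (np.std(audio) * np.sqrt(audio.size)))

rng = np.random.default_rng(0)
speech = 0.1 * rng.standard_normal(44_100)      # one second of stand-in "audio"
marked = embed(speech, seed=42)

print(detection_z(speech, seed=42))             # ~0: nothing to find
print(detection_z(marked, seed=42))             # ~6: detected
print(detection_z(marked + 0.2 * rng.standard_normal(44_100), seed=42))  # ~3: fading under noise
# A 1% speed change would desynchronize the carrier entirely and push this back toward 0.
```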
For images, it's a bit more of a question of how hard you can look at the image before seeing the difference, although of course trained eyes will see it more easily. Video is maybe a bit easier, assuming you only care about the watermark being imperceptible when watching at normal speed; plus it's normal to have artifacts from compression so you might as well have similar artifacts for the watermark.
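Same point for images, as a toy (again my own illustration, not anything in the bill): least-significant-bit embedding is invisible to the eye, which is precisely why it is fragile; simulate a lossy re-encode with coarse quantization and the payload is gone.

```python
import numpy as np

def embed_lsb(img: np.ndarray, payload_bits: np.ndarray) -> np.ndarray:
    # Overwrite the least significant bit of the first len(payload_bits) pixels.
    flat = img.flatten()
    flat[: payload_bits.size] = (flat[: payload_bits.size] & 0xFE) | payload_bits
    return flat.reshape(img.shape)

def extract_lsb(img: np.ndarray, n_bits: int) -> np.ndarray:
    return img.flatten()[:n_bits] & 1

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in image
payload = np.random.randint(0, 2, size=128, dtype=np.uint8)

marked = embed_lsb(img, payload)
print((extract_lsb(marked, 128) == payload).mean())        # 1.0: survives an exact copy
lossy = (marked // 4) * 4                                  # crude stand-in for re-encoding
print((extract_lsb(lossy, 128) == payload).mean())         # ~0.5: no better than chance
```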
Watermarking text is a weird case. If you want it to be nontrivial to remove, I think all you can do is make subtle statistical changes to the word choices, which would (only) be detectable over long windows. I feel like I've heard of LLMs doing this already, but don't know if it would fit the requirements here. It's also probably not too hard to remove, using another LLM or just a thesaurus.
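For what it's worth, the statistical-word-choice idea can be sketched in a few lines. This is a toy of the general "green list" approach I've seen described for LLM watermarking, not a claim about what any particular lab ships; the vocabulary, key, and bigram sampler are all made up for illustration.

```python
import hashlib
import math
import random

KEY = b"secret-watermark-key"
VOCAB = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel"]

def is_green(prev: str, word: str) -> bool:
    # Keyed pseudorandom 50/50 split of the vocabulary, re-drawn for each previous token.
    digest = hashlib.sha256(KEY + prev.encode() + b"|" + word.encode()).digest()
    return digest[0] % 2 == 0

def generate(n: int, bias: float = 0.8, seed: int = 0) -> list[str]:
    # Toy "generator": prefer green words with probability `bias`.
    rng = random.Random(seed)
    out = ["<start>"]
    for _ in range(n):
        greens = [w for w in VOCAB if is_green(out[-1], w)]
        reds = [w for w in VOCAB if not is_green(out[-1], w)]
        pool = greens if (greens and rng.random() < bias) else (reds or greens)
        out.append(rng.choice(pool))
    return out[1:]

def z_score(tokens: list[str]) -> float:
    # Detector: count green tokens; under the null (no watermark) the rate is 0.5.
    prev, hits = "<start>", 0
    for w in tokens:
        hits += is_green(prev, w)
        prev = w
    n = len(tokens)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)

print(z_score(generate(200)))                                # large, roughly 8
print(z_score([random.choice(VOCAB) for _ in range(200)]))   # near 0
```

Paraphrasing with another model scrambles exactly these word choices, which is the "not too hard to remove" problem.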
There are many companies selling watermark technologies, and if something like this bill passed, they'd all be desperately trying to become the standard. There would be big advantages to standardization, but that probably involves giving someone a monopoly.
So anyway, yeah, I kind of like the idea of all AI-produced media being watermarked, but this bill sucks.
I guess the whole Digital Rights Management experience has taught us that watermarking doesn't work. If watermarking music or videos to indicate the copyright owner doesn't work, then watermarking to indicate it's AI generated probably isn't going to work either.
I mean, I think I could believe an honest argument of the form "generative AI creates a bunch of problems that are impossible or economically infeasible to defend against, so we're just going to make it illegal, with very, very big criminal penalties for anyone caught with an image generation model."
Like life in jail. With no research exemption.
""Purposes of prevention and detection of crime" would be an exemption of course, so the police officer who arrests Yann LeCun can collect evidence for the prosecution case without themselves becoming a target of prosecution,
"What did you think 'Butlerian Jihad' meant? Vibes? Papers? Essays?"
Why do bad bills happen to good people?
It seems to me that, in the absence of good bills, we're likely to get bad ones. Of course there are multiple ways a bill can be "bad"; a simple classification would be (1) bad goals – the bill is trying to accomplish an undesirable / suboptimal thing, (2) bad implementation – the bill uses poor language in service of its goals.
Goals are subjective, so a bill might have goals that I consider to be good and you consider to be bad. Implementation quality is less subjective; sometimes an implementation choice embodies a tradeoff for which different people would have different preferences, but often an implementation flaw is just plain bad for ~everyone.
It seems to me that a lot of the flaws in this particular bill are bad implementation, and I presume that's because it was drafted without participation from anyone having certain types of expertise. Shame on the authors for not seeking out such expertise, but at the same time, it seems like it should be a solvable problem in this case. The idea of watermarking content has been out there for a while; has anyone tried to create high quality model legislation? Perhaps someone should?
(If anyone knows of such work, or is interested in discussing this idea further, drop me a line!)
"This service is known to the state of California to contain inauthentic content" stickers on everything.
I guess you could treat AI models without watermarks as being basically like other forms of illegal content, which social media companies already have moderation teams in place to take down etc.
Something like a legal obligation to respond to takedown requests within a specified (and fairly short) time might be workable, I guess.
The only "maximally indelible watermark" I can think of in real life is the candid shots of you riding roller coasters they try to sell you. It seems like the watermark has to be big enough that people can't just take pictures of it with their phone; hence, your face is often obscured (and you really really have to pay to get them.) Is this what CA proposes to do to all AI generated content?
Disk space being fairly cheap, it is maybe not unreasonable to impose a requirement to store everything your AI generates.
To make sense, presumably, it is the person who runs the software, not the person who wrote it, who has the archiving requirement. So, if your product is open source, it's the people who download and run it who have the archiving requirement.
One obvious problem: what if the AI generates something illegal, so you're not allowed to store it?
Well, I guess part of this is that you promise your AI has never, ever generated anything illegal; the government can check this by serving you with a search warrant for your stored generations and scanning through them, and if it turns out that you don't have an archive when served with a search warrant, you go to jail for violating the archiving requirement.
(Compare the Regulation of Investigatory Powers Act in the UK: big criminal penalties for not decrypting something when the government asks you to, even if everyone agrees the plaintext was probably innocuous... doesn't matter.)
Is it possible to guarantee an AI will never generate anything illegal? As a trivial example, what if I send an LLM copyrighted (or even classified) text and ask it to repeat it back to me?
I'm quite sympathetic to the argument that image generating AI is already effectively illegal under existing laws, without the need for new bills like this one, and the government can put the people responsible for it in jail whenever they feel like it.
Though, I'm not sure your examples work ....
In the case of classified information, it's typically only an offense if you have a security clearance. (Which is why the treatment of Julian Assange has some journalists so worried.)
I'm not sure whether or not Zvi has a clearance, so I won't demonstrate by posting US Top Secret information here.
I don't but also other people can read this!
Your readers might, technically speaking, be mishandling the classified information by reading it on your substack from a computer that was not secured per requirements....
You shall all just have to imagine it as a hypothetical.
Not a lawyer, but I would think compelling literal speech by humans (and not just politicians running for office!) every time they want to speak on the internet poses some constitutional issues.
Part of me thinks that some clever (for specific values of "clever") intern realized that 5% of Google or Facebook's global revenue would go a very long way towards covering CA's budget deficit, and then they wrote a bill that would make collecting that penalty nearly guaranteed unless everyone stopped using AI forever. Sort of a "You know, the EU is on to something here... and people are scared of AI..." situation. Powerful enough to hit everyone at any time, but vague enough to allow selective enforcement when desired.
Hi Zvi,
Just to introduce myself, I'm a SAG-AFTRA performer. My digital identity is my only source of income. I like this bill.
With no ill intent, I feel compelled to counter this article with my own extremely nerdy view on the bill, which I support so strongly that I am about to travel to Sacramento to support it.
Sincerely,
Erik Passoja
----
This critique of AB-3211 raises several points of contention regarding the bill’s feasibility and potential impact on the tech industry. However, these criticisms overlook the bill’s essential goals of protecting digital identities and ensuring ethical AI use.
My view: if you take away a carpenter's voice, he can still build. But if you take away a voice actor's voice, they will never work again.
Protect us.
Your doctor calls and tells you to double up on your medication... ...but it’s not your doctor.
You call school to pick up your kids, and the school releases them... ...but it’s not you.
President Biden's voice was deepfaked in a robocall in New Hampshire this January. It took 32 days to catch the culprit.
With digital provenance info, law enforcement would have kicked the door down in minutes.
----
The Nerd Stuff re: the article:
Burden on AI Developers:
The requirement for databases with digital fingerprints and “maximally indelible watermarks” ensures accountability and traceability.
Theatrical Motion Picture watermarking has been standard since 2005 (https://creativepro.com/major-movie-studios-specify-digital-watermarking-for-digital-cinema/). DRM has been around since 1983, and when Napster reared its ugly head, the music industry would have tanked without it.
And while it may seem burdensome at first, these measures are crucial to prevent misuse and protect consumers. The development of such standards is necessary to maintain trust in AI technologies.
Conversational AI Disclosures:
The necessity for AI systems to disclose their nature at the start of interactions is compared to cookie notifications.
This comparison is misplaced; ensuring users are aware they are interacting with AI is vital for transparency and consent. This requirement protects users from being deceived by AI systems posing as humans.
Impact on Existing AI Systems:
The claim that AB-3211 retroactively bans existing AI models is an overstatement. The bill includes provisions for existing systems, allowing for compliance through retroactive decoders or proof of incapability to produce inauthentic content. This approach balances innovation with the need for robust safeguards.
Financial Penalties:
Fines of up to $1 million or 5% of global revenue are intended to enforce compliance and deter violations. Large tech companies with significant resources must be held accountable to ensure adherence to ethical standards. These penalties are proportional to the potential harm caused by non-compliance.
Furthermore, stealing Taylor Swift's face and putting it in a deepfake porn is about to cost someone a lot more than that (https://www.bbc.com/news/technology-68110476).
Practicality and Enforcement:
The ambiguities surrounding open weights models and the 24-hour reporting requirement are valid concerns. However, these provisions highlight the urgency and importance of addressing vulnerabilities promptly to prevent widespread misuse. The bill’s focus on immediate action aims to protect consumers in a rapidly evolving technological landscape.
The core intent of AB-3211 is to establish necessary safeguards for digital identities in the AI era.
Sure, balancing innovation with regulation is inherently complex, but the protections offered by AB-3211 are essential for ensuring ethical AI use and maintaining public trust. Continuous dialogue between lawmakers, tech industry stakeholders, and the public is crucial for refining and implementing effective legislation.
Don't throw the baby out with the bathwater, so to speak.
Related reading: http://www.protectdigitalidentity.org
Thank you! It's good to understand the other perspective - I'd previously literally not seen anyone in support of this bill.
As I read this, there are essentially two central points.
1. The requirements are not as onerous as I think, and the downsides are overstated.
2. The situation is so dire that we should be willing to take down generative AI the way we took out Napster, until a solution is found the way we eventually found a solution there via streaming services. A very large blast radius is acceptable.
Mostly I see arguments for #2, and I don't see a need to respond there - I think your points stand, the downsides are real, and people can decide on how bad they likely will get (I talk about these every week) and how much utility we should be willing to sacrifice in their name.
I do see a large factual (world model) disagreement on #1. You seem to think certain technologies exist or are feasible, in ways I don't. And certainly one could rely on #2 only, and say 'I don't care that's your problem, your product sucks and it should die until such time as it stops sucking' but one should do that eyes open.
In particular, if anyone technical thinks that either of the two options for existing LLMs is viable rather than a Can't Happen - showing it cannot produce inauthentic content, or building a 99% accurate detector - I'd be pretty surprised, and ask them to say how they think one could do that.
Posting this up top, because using "reply" cuts it off:
Hi Zvi,
Thank you for your thoughtful analysis...I appreciate the opportunity to offer a different perspective on AB-3211.
You've correctly identified two central points in my argument:
(1) The requirements are not as onerous as they might initially appear.
(2) The potential harm from unregulated AI is severe enough to warrant strong measures.
Let me elaborate on point #1, as I believe there's a factual disagreement about the feasibility of the technology required.
First, it's important to understand that much of the necessary technology already exists. Digital watermarking, for instance, has been standard in theatrical motion pictures since 2005, and companies like Digimarc and Verance have amazing products.
So, the challenge lies not in creating new technology, but in implementing and standardizing existing tools across industries. This process will involve:
a) Getting various government agencies to agree on an overall data standard to be watermarked on media
b) Customizing parts of this standard for different sectors (medical, industrial, entertainment, etc.)
c) Allowing technical experts to lead the implementation instead of politicians and lawyers and Luddites
d) Continuously updating security measures to stay ahead of potential hackers
Point (d) is crucial. Just as banks use complex algorithms to detect potential fraud, we need to be equally vigilant in protecting digital identities. The key word = paranoia. Paranoia is good when you deal in security.
Now, regarding the specific requirements for existing LLMs:
Digital Watermarking: advanced versions capable of marking 3D, video, image, audio, and motion capture files already exist and are in use.
Blockchain Integration: This technology can securely store consent information and usage rights.
Consent Matrix: A system for managing granular consent across different use cases and industries (for a performer version, check out www.consentmatrix.app).
Smart Contracts: To ensure proper compensation for authorized use of digital content.
While building this system will require effort, it's not beyond our current technological capabilities. In fact, I've personally experienced demos of some of these technologies (though I'm limited in what I can share due to NDAs).
Regarding the 99% accuracy requirement for detectors, I've worked with a company that can identify voices with 99% accuracy, even in live performances. Even better than Pindrop (https://www.pindrop.com). This suggests that such high accuracy rates are achievable.
I understand your concerns about the potential impact on innovation and existing AI systems. However, I believe we must balance these concerns against the very real threats to individual privacy, security, and livelihood that unchecked AI development poses.
The goal isn't to stifle innovation, but to ensure it proceeds responsibly. Just as DRM helped the music industry adapt to the digital age, these measures can help us navigate the AI revolution while protecting digital identities.
I'm always eager to learn more and hear different perspectives on this issue. If you see alternative solutions that could achieve the same goals, I'd be very interested in discussing them.
My commitment: on behalf of our great union, to protect, manage, and monetize the digital identities of 170,000 SAG-AFTRA performers at the dawn of Artificial Intelligence.
Sincerely,
Erik
I was making the exact opposite argument, namely that the experience with watermarking for copyright enforcement tells us that the watermark is going to get broken.
Well, watermarking alone won't work. Do consider the clone in the blockchain and a consent matrix. Nothing will work alone, just like your PIN code isn't the only thing protecting your debit card.