26 Comments
Steeven:

Memory seems to be intensely recency-biased. It’s interesting to have it make a picture of all your conversations to see what it thinks of you; most of the subject matter in my case was stuff from the past week.

Alex:

> I’m excited for both features, but long term I’m more excited for Google integration than for research. Yes, this should 100% be Gemini’s You Had One Job, but Gemini is not exactly nailing it, so Claude?

Have you enabled the Workspace App for Gemini and tagged @Workspace into your chats?

https://support.google.com/gemini/answer/15229592

tup99:

Codex says: "Codex always runs in a sandbox first"

Codex also says: "Linux – there is no sandboxing by default"

So I'm confused.

(https://github.com/openai/codex)

Victualis:

Maybe they assume most people running Linux are already inside a sandboxed container, but direct macOS use needs a sandbox.
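
A rough sketch of that reading (assuming Docker as the container runtime; the image name, mount path, and plain `codex` entrypoint are hypothetical illustrations, not anything from the Codex docs): on Linux you supply the sandbox yourself by running the agent in a network-disabled container that only sees the project directory.

```python
import subprocess

# Hypothetical sketch: on Linux, the "sandbox" is a container you provide.
# Image name, mount path, and entrypoint are illustrative, not official.
subprocess.run([
    "docker", "run", "--rm", "-it",
    "--network", "none",             # no outbound network from the sandbox
    "-v", "/home/me/project:/work",  # expose only the working directory
    "-w", "/work",
    "my-codex-image",                # hypothetical image with codex installed
    "codex",
])
```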

Austin Morrissey:

If you want to try full o3, reply to this message with the prompt and I’ll try it.

Kyle Wilson:

> Each context is a “day.” Then, the model is retrained each “night” on the day’s data so that it has long-term knowledge of what happened (just as humans sleep).

It seems difficult to robustly align a model whose weights are guaranteed to drift off in unknown directions.
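
To make the worry concrete, here is a toy sketch of the loop being quoted (every name in it is a hypothetical stand-in, not a real training API): the model acts in context by "day," is fine-tuned on that context by "night," and after enough nights the deployed weights are no longer the ones that were originally audited.

```python
# Toy sketch of the "day context / night retrain" loop quoted above.
# All names here are hypothetical stand-ins, not a real training API.

def collect_days_conversations(day: int) -> list[str]:
    """Stand-in for the transcripts the model saw during the 'day'."""
    return [f"conversation {day}-{i}" for i in range(3)]

def fine_tune(weights: dict, data: list[str]) -> dict:
    """Stand-in for a gradient pass folding the day's data into the weights."""
    updated = dict(weights)
    updated["seen"] = updated["seen"] + data
    return updated

weights = {"seen": []}  # the "aligned" snapshot at day zero
for day in range(7):
    day_data = collect_days_conversations(day)  # daytime: act in context
    weights = fine_tune(weights, day_data)      # nighttime: consolidate

# After a week the deployed weights differ from the audited day-zero
# snapshot -- exactly the drift the comment above worries about.
print(len(weights["seen"]))
```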

Mariana Trench:

I uploaded a black-and-white 1956 Kodak snapshot of my toddler sister in a small pool, and it got it right. I am duly impressed and terrified.

tup99:

In the "where was this photo taken" challenge?

Mariana Trench:

Yes.

Mariana Trench:

"Anna Gat is super gung ho on memory, especially on it letting ChatGPT take on the role of therapist."

We are living in a time when our personal data is even less secure than it was a month ago. I would be very wary of doing this. Yes, I know our personal data was never actually secure. But now it's even worse.

Rock Docs:

TWITTER is a trademark, not a copyright. Source: me, an IP lawyer who hates jokes.

Cjw:

If that were the case, and they’ve now gone without using the mark for coming up on a few years, have they abandoned it? Does having twitter.com redirect to x.com count?

Rock Docs:

They have definitely not abandoned it. I don't know about other countries, but in the U.S. the rule of thumb is at least three years of non-use is required for trademark abandonment. In this case, for example, twitter.com redirects to X.com and that alone may be sufficient to show use in commerce given how well-known and popular the service is.

Cjw:

Ok, thanks. I took IP law in 2003 and never use it, so I was curious whether an IP lawyer would think that was the case.

Jeffrey Soreff:

Many Thanks! I've bumped into IP law from the patent side a little, but very much appreciate the explanation!

Jeffrey Soreff:

One strange thing about Robin Hanson's view:

Yes, real interest rates have not risen significantly, which suggests skepticism about major near-term changes.

But the valuations of the major AI labs are in the hundreds of billions of dollars.

_Both_ the investors in e.g. U.S. Treasury bonds _and_ the investors in major AI labs have "skin in the game", and presumably are making the most accurate bet they can. How can they _both_ be right?

Hanson seems to believe the former investors - but why them instead of the latter?
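
(For reference, the standard macro link behind "flat real rates imply skepticism" is the Ramsey rule; the notation below is mine, not anything from the thread.)

```latex
% Ramsey rule: the real rate rises with expected per-capita growth.
%   r    = real interest rate
%   \rho = pure time preference
%   \eta = inverse elasticity of intertemporal substitution
%   g    = expected growth rate
r = \rho + \eta g
% If markets had priced in transformative AI, expected g (and hence r)
% should already be elevated; flat real rates are Hanson's evidence.
```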

[insert here] delenda est:

The argument would be that the T-bill market is multiples bigger than the AI market.

The counterargument is that the AI market is 100% people expressing their expected value of AI, whereas the Treasury market is made up of many different buyers, a large number of which are _not_ expressing an opinion on the future value of their Treasury bills.

Jeffrey Soreff:

Both good points. Many Thanks!

tup99:

We want our AI to not lie to us. Won't that make it an asshole? If you go to your grandma's house and she asks how you like her brisket, you tell her it's delicious. You don't tell her that it's dry and stringy. (And this is a white lie, not a pleasantry.)

If someone uploads a photo and asks if they are ugly or fat or their nose is too big, what do we want an AI to say? A non-asshole human would say a white lie. They would not say the truth, even gently using nice words.

Thor Odinson:

This could be a culture thing, I would prefer brutal honesty where applicable. Zvi is a New Yorker, and Jewish, and both of those groups are famous for preferring brutal honesty too.

Now, to be clear, I do think "gently, using nice words" matters quite a bit! But it is better to know the truth than not; believing false things causes far more harm in the long run.

tup99:

It is true that cultures vary quite a bit in this area. But I am pretty sure that no culture is 0% or 100% truthful. I don’t think you or Zvi would say “dry and stringy” if your grandma asked how her brisket was.

Also, even if YOU prefer 99.9% honesty, most people do not. Social niceties are necessary in most cultures to be considered not an asshole. So at best, this would be training AIs to be assholes in MOST cultures.

Mark Schröder:

Re: Please send text tokens for greater training efficiency.

There’s Whisper for that.
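
Concretely (using the open-source `openai-whisper` package; the model size and file name below are placeholders): transcribe the audio locally, then send the text.

```python
import whisper  # pip install openai-whisper

# Transcribe the audio locally, then send text tokens instead of raw audio.
# Model size and file name are placeholders.
model = whisper.load_model("base")
result = model.transcribe("voice_note.mp3")
print(result["text"])
```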

GoodGovernanceMatters:

> "Huh, Upgrades"

Well done.

Victualis:

I really would not draw any conclusions about AI (or human) deception from Mafia-style games like Among Us. The optimal strategy is sensitive to tiny tweaks in the rules. I would be more likely to conclude from Claude preferring deceit that deceit is optimal play for this variant of the game than that Claude prefers deceit. An AI system (or a human) might well prefer deceit outside the game setting, but to measure that you first have to have a perfect strategy to measure against.

dan mantena:

Gemini models integrate Google Docs fine at the moment, so I'm not sure what you mean when you say Google dropped the ball.

Also, Perplexity's deep research is another option that rivals Claude's, balancing depth with speed.
