Showing posts with label AI. Show all posts

Monday, April 27, 2026

Grounding AI in the Actual Law.

When I use an AI assistant for legal work, the first thing I do is give it the actual text of the statute. Not a summary, not whatever the model absorbed during training, but the current statutory language as published by the California Legislature. This is not a novel insight. It is the most basic version of a technique the AI community calls "grounding," and more specifically, retrieval-augmented generation, or RAG. You retrieve the authoritative source material and put it in front of the model before it generates a response. The model still does the reasoning, but it is reasoning about real text rather than a half-remembered approximation of it.
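
In code, the retrieve-then-generate pattern is only a few lines. What follows is a toy sketch, not my actual setup: the one-entry corpus, the naive keyword matcher, and the paraphrased section text are hypothetical stand-ins, and production RAG systems retrieve with embedding search over the full statutory corpus rather than word overlap.

```python
# Toy grounding sketch. The corpus, the matcher, and the section text
# are hypothetical stand-ins; real systems use embedding retrieval.

def retrieve_sections(corpus: dict[str, str], query: str) -> list[str]:
    """Naive keyword retrieval: return the section numbers whose text
    shares at least one word with the query."""
    query_words = set(query.lower().split())
    return [
        section for section, text in corpus.items()
        if query_words & set(text.lower().split())
    ]

def build_grounded_prompt(corpus: dict[str, str], query: str) -> str:
    """Put the retrieved statutory text in front of the model so it
    reasons about real language, not a training-time approximation."""
    sources = "\n\n".join(
        f"Code Civ. Proc. § {s}:\n{corpus[s]}"
        for s in retrieve_sections(corpus, query)
    )
    return (
        "Answer using ONLY the statutory text below. If the answer "
        "is not in the text, say so.\n\n"
        f"STATUTORY TEXT:\n{sources}\n\nQUESTION: {query}"
    )

# Hypothetical one-section corpus, paraphrasing Code Civ. Proc. § 1005(b).
corpus = {
    "1005(b)": "Moving and supporting papers shall be served and filed "
               "at least 16 court days before the hearing.",
}
prompt = build_grounded_prompt(corpus, "When must moving papers be filed?")
```

The instruction to say so when the answer is not in the text is doing real work: it gives the model a sanctioned alternative to inventing one.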


The problem is that grounding only works if you have the primary law in a format the AI can actually ingest. And that turns out to be harder than it should be. I went looking for a publicly available, regularly updated collection of California statutory text that I could use as a reference corpus. I did not find one. Westlaw and Lexis have the text, but it is behind a paywall and neither offers a public API for downloading statutory text at scale. The state's leginfo website lets you browse individual sections, but it is not designed for pulling down an entire code at once. I looked at legal data projects, open-source repositories, academic datasets. Nothing I found fit the requirement of being complete, current, and available in a format an AI can work with.

What I did find, eventually, is that the California Legislature publishes the entire statutory database as a single downloadable archive. It is a roughly 764-megabyte ZIP file containing data tables and XML content files for every active section of every California code. It is official. It is free. And as far as I can tell, almost nobody is using it for this purpose, which is surprising given how much attention legal AI hallucination has received.

I wrote a script that downloads that file, parses the database tables, and writes each of the 30 California codes out as clean plain text, one file per code, in proper statutory order with structural headings. Then I set up a free GitHub Actions workflow to run the script every night. If the Legislature has updated anything, it commits the changes automatically. The repository is public, at github.com/johnakelly-yahoo-com/california-codes. Anyone can use it.
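
The shape of the script is roughly the following. This is an illustrative sketch, not the repository's actual code: the archive URL is a placeholder, and the real leginfo dump's internal schema (data tables plus XML content files) is abstracted away behind a pre-parsed `sections` dictionary that the real script has to build by reading those tables.

```python
# Illustrative pipeline sketch. The URL is a placeholder, and the real
# leginfo archive's internal schema (data tables plus XML content
# files) is abstracted behind a pre-parsed `sections` dictionary.
import io
import pathlib
import urllib.request
import zipfile

ARCHIVE_URL = "https://example.invalid/statute-dump.zip"  # placeholder
OUT_DIR = pathlib.Path("codes")

def download_archive(url: str) -> zipfile.ZipFile:
    """Fetch the statewide dump into memory and open it as a ZIP."""
    with urllib.request.urlopen(url) as resp:
        return zipfile.ZipFile(io.BytesIO(resp.read()))

def write_code_files(sections: dict[str, list[tuple[str, str]]]) -> None:
    """Write one plain-text file per code. `sections` maps a code
    abbreviation to (section number, text) pairs already sorted in
    statutory order."""
    OUT_DIR.mkdir(exist_ok=True)
    for code, entries in sections.items():
        body = "\n".join(f"§ {num}\n{text}\n" for num, text in entries)
        (OUT_DIR / f"{code}.txt").write_text(body)
```

The nightly run is just this script under a scheduled GitHub Actions workflow that commits whatever changed.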

The practical payoff is straightforward. When I point the AI at the actual text of the relevant code, it can still get the reasoning wrong, but it is no longer fabricating the source material. That eliminates a common category of hallucination: the kind where the statute the AI cites does not say what the AI claims it says, or does not exist at all. By constraining the input to only the authoritative source, the model has no reason to invent a statute because the answer is either in the text or it is not.

Giving the AI the statute text is only half of it. The other half is telling the AI how to cite what it finds. My system prompt includes an instruction that specifies California Style Manual citation format: year before the case name, parallel citations in brackets, no year for code sections or Rules of Court. When the AI writes "(Code Civ. Proc. § 1005(b).)" instead of some improvised format, that is not because it memorized the California Style Manual during training. It is because I told it the format and gave it examples, a technique the AI community calls few-shot prompting. The grounding works the same way in both directions: authoritative source material constrains the substance, few-shot examples constrain the output. The AI does not have to guess at either one. Here is the instruction block from my system prompt:

Make this text accurate pursuant to the California Style Manual,
Fourth Edition, by Edward W. Jessen.

Put the year before the case citations.
Make sure the parallel cite is bracketed.
Do not include years for code section citations.
Do not include years for Rules of Court citations.

CASE CITATION:
(People v. Barton (1995) 12 Cal.4th 186
[47 Cal.Rptr.2d 569, 906 P.2d 531].)

CODE SECTION:
(Code Civ. Proc. § 1005(b).)

RULE OF COURT:
(Cal. Rules of Ct., Rule 3.1300(b).)
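
Wiring the two halves together looks roughly like this. The sketch uses the generic OpenAI-style chat message schema, the rules text is condensed from the instruction block above, and the model call itself is omitted so this stays a pure prompt-building example.

```python
# Combining the two halves: few-shot citation rules in the system
# message, retrieved statute text injected into the user turn. The
# message schema is the generic OpenAI-style chat format; the model
# call itself is omitted so this stays a pure prompt-building sketch.

CITATION_RULES = """Follow California Style Manual citation format.
Year before the case name; parallel cites bracketed; no years for
code sections or Rules of Court. Examples:

CASE: (People v. Barton (1995) 12 Cal.4th 186
[47 Cal.Rptr.2d 569, 906 P.2d 531].)
CODE SECTION: (Code Civ. Proc. § 1005(b).)
RULE OF COURT: (Cal. Rules of Ct., Rule 3.1300(b).)"""

def build_messages(statute_text: str, question: str) -> list[dict]:
    """Few-shot examples constrain the output format; the retrieved
    statute text constrains the substance."""
    return [
        {"role": "system", "content": CITATION_RULES},
        {"role": "user",
         "content": f"STATUTE:\n{statute_text}\n\nQUESTION: {question}"},
    ]

messages = build_messages(
    "Code Civ. Proc. § 1005(b): moving papers shall be filed at least "
    "16 court days before the hearing.",
    "How far in advance must moving papers be filed?",
)
```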

This is not a replacement for Westlaw or Lexis. It does not include case law, annotations, or secondary sources. What it does is solve a specific, well-defined problem: if you are going to use AI to work with California statutes, you need the AI to have access to the actual statutes. That sounds so simple it barely seems worth saying, but the gap between "the AI should have the real text" and "here is a maintained, current source of the real text in a usable format" turns out to be where most practitioners get stuck. They either trust the AI's training data and risk hallucination, or they copy-paste individual sections by hand, which does not scale.

I could not find anyone else who had built this, which is the part that still surprises me. The data has been sitting on the Legislature's website for years. The tools to automate the pipeline are free. The need is well-documented. If someone else has done this and I missed it, I would genuinely like to know about it.

The research supports the approach. Stanford's study of legal AI tools found that Lexis+ AI and Westlaw AI hallucinate in 17 to 33 percent of queries, even with retrieval-augmented generation designed to anchor the output in real sources. (Magesh et al., Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (2025) J. Empirical Legal Studies 1.) A March 2026 benchmark study went further and found that retrieval quality, not the choice of AI model, is the primary driver of whether a legal AI system gets the answer right. Most hallucinations in production systems, the authors concluded, are induced by retrieval failures. (Butler & Butler, Legal RAG Bench (Mar. 2, 2026) arXiv:2603.01710.) Get the retrieval right and the model follows. Get it wrong and no amount of model sophistication will save you.

And this is not a theoretical problem. Earlier this month, the Sixth Circuit sanctioned an attorney whose appellate briefs contained fabricated quotations attributed to real cases. The briefs had been drafted using Westlaw's CoCounsel. The court found three citations where "purported direct quotations do not appear in their cited sources" and could not locate any authority containing "the same or substantially similar language." (U.S. v. Farris, 6th Cir. Apr. 3, 2026.) This was not ChatGPT. This was the tool lawyers are paying for precisely because it is supposed to be grounded in real legal databases. The repository is there. It updates every night. And for anyone working with AI on California law, it is one less reason for the model to make things up.

Wednesday, February 18, 2026

The Greenspan Question.

This week, San Francisco Fed President Mary Daly gave a speech at the Silicon Valley Leadership Group that deserves attention. The topic was AI and productivity, but the real substance was a story about Alan Greenspan in the mid-1990s, and what that story means for how we should think about technology and economic data right now. The speech is worth reading in full (it is available on the San Francisco Fed's website), but the core of it is a historical parallel that I think cuts against a lot of the conventional wisdom in financial commentary.

Here is the setup. In 1995 and 1996, businesses were pouring money into information technology. PCs, networking, inventory management systems, GPS for trucking fleets. Investment was surging. But the official productivity numbers were flat. Standard macro models said the economy was overheating, the labor market was too tight, and the Fed should raise rates. Greenspan looked at the same economy and reached a different conclusion. He talked to executives. He walked factory floors. He saw firms not just buying computers but reorganizing their operations around them. Wholesale firms were using inventory management to cut warehouse stockpiling. Manufacturers were doing mass customization with computer-aided design. Trucking companies were eliminating deadhead hauling with GPS. Greenspan argued that official productivity data was lagging reality, and that the economy could grow faster than the models suggested without triggering inflation. The FOMC stayed patient. The roaring 1990s followed.

Daly's point is that we may be in a similar moment with AI. Businesses are investing. Use cases are multiplying. Firms in the Twelfth District (the Fed's western region, which includes Silicon Valley, Seattle, and the broader West Coast tech ecosystem) are reporting real savings from AI in consumer research, back-office operations, product development, and more. But the aggregate productivity statistics have not moved. Most macro studies find little evidence of a significant AI effect on economy-wide productivity growth. Daly cites Daron Acemoglu and a recent NBER working paper by Yotzov et al. surveying over 5,000 firm executives, both finding minimal impact so far.

The question Daly poses is whether GenAI is a sufficient catalyst to change the nature of production and business, or whether it is still at the stage of replacing a steam-powered motor with an electric one while leaving the factory floor unchanged. Her answer is honest: no one knows yet. But her implication is clear: "we won't find all the answers in the aggregate data on productivity, the labor market, or inflation," she says. "Seeing developments before they fully emerge requires digging deeper, relying on disaggregated information that foreshadows transformation." The lesson from the 1990s is that aggregate data will be the last place to see transformation, not the first.

I find this compelling, and I think it highlights a bias that runs through a lot of financial journalism. Robert Solow's 1987 line about seeing the computer age everywhere except in the productivity statistics is one of the most quoted observations in economics. It is clever, it is pithy, and for decades it has been the default frame for skeptics. The Economist ran a piece in late 2024 titled "There will be no immediate productivity boost from AI." Carl Benedikt Frey argued in the Financial Times that "AI alone cannot solve the productivity puzzle," noting that even Microsoft's Satya Nadella acknowledges AI has yet to have its transformative moment. Apollo's chief economist Torsten Slok recently told CNBC that AI is "everywhere except in the incoming macroeconomic data," consciously echoing Solow. Robert Gordon at Northwestern has built a career arguing that the great innovations are behind us and that nothing since indoor plumbing will move the needle the way electrification did.

There is something to this skepticism. Rigor matters. Demanding evidence before declaring a revolution is exactly what serious people should do. But I think the skepticism has an undercurrent that is worth naming. In financial circles, there is a professional incentive to be the person who sees through the hype. Calling a bubble, puncturing enthusiasm, being the grown-up in the room: these are status moves. And they are coupled, I think, to a cultural distance between the people who build technology and the people who trade on it. The nerds are excited, so the sophisticated response is to be unimpressed. This is not analysis. It is posture.

What Daly's speech illustrates is that Greenspan's innovation in the 1990s was not a model or a formula. It was a practice. He looked at data that contradicted his priors and investigated rather than dismissed. He listened to people doing the work instead of relying solely on the aggregate statistics that his own institution produced. That is harder than running a regression, and it is harder than writing a clever headline about the Solow paradox.

Erik Brynjolfsson at Stanford recently pointed to revised 2025 data suggesting U.S. productivity growth may have reached 2.7%, roughly double the prior decade's average, while job creation was minimal. If that holds up, it is the pattern you would expect from technology-driven productivity gains. Fewer workers producing more output. Brynjolfsson calls it the transition from the investment phase to the "harvest phase." It is early, and the data will be revised, but the signal is consistent with what Daly is describing.

The practical takeaway is simple. If Greenspan had listened to the macro models and the skeptics in 1996, the Fed would have raised rates and might have choked off one of the strongest periods of broad-based growth in modern American history. He did not, because he looked past the aggregates. Daly is asking whether we are willing to do the same thing now. The answer will matter for interest rates, for labor markets, and for whether the current wave of AI investment produces widely shared prosperity or just another round of capital returns to the firms that got there first. That is not a question for the nerds or the traders. It is a question for everyone.