Thursday, October 23, 2025

Electric short-haul vehicles

An electric forklift parked near a wall and plugged into a charger.

The market for forklifts and similar lift trucks electrified early. The market for Class I lift trucks intended mainly for use inside warehouses crossed 50% electric in approximately 2010. In indoor environments there is a substantial advantage for a vehicle which produces no exhaust fumes. The early models used lead-acid batteries, gradually shifting to Lithium chemistries as batteries aimed at electric vehicles improved.

More recently, other classes of forklift which handle heavier loads and outdoor use have been electrifying, as batteries have reached a capacity to operate all day without charging and electricity is less expensive than the equivalent propane for internal combustion forklifts.

Electric lift trucks have several inherent advantages over propane:

  • Efficiency: electric vehicles excel with frequent starts and stops, which a forklift spends all day doing.
  • Regenerative lowering: it takes energy to lift a heavy load, but regenerative braking techniques can recover energy while lowering it. Anything lifted up will (eventually) be lowered back down, so over time the fleet of forklifts will recover a good portion of the energy spent lifting.

Short haul trucking

Drayage trucks, heavy vehicles intended for short distance duty such as to and from transport hubs, are at the start of their electrification process now. They also have several inherent advantages:

  • Idling: even with a decade of effort in schedule optimization and just-in-time arrival, such vehicles spend a substantial amount of time waiting for loading and unloading. Internal combustion vehicles consume fuel while idling; electric trucks do not.
  • Emissions: air quality near ports and transportation hubs has been a concern for decades, and has resulted in ever stricter limits on emissions in the vicinity of the port. A zero-emission electric drayage vehicle more easily meets these requirements.

Vehicle to Grid?

To me, one of the interesting potential developments for electric vehicles is to take advantage of the battery capacity available in fleets of electric vehicles with one common owner and fairly stable usage patterns. Warehouses and ports mostly do not operate at 100% capacity 24/7. Noise ordinances and the economics of three-shift work mean they will often not be fully staffed overnight. There will be hours where some of the equipment is plugged into chargers and mostly sitting idle.

These are the same hours where solar production is not available. Might the fleet owner be able to make some amount of revenue while the equipment sits idle? The equipment does need to end the night with a mostly full battery for the first shift's work, but there may be an opportunity to charge while power is cheap and, knowing usage patterns in advance, participate in virtual power plants to bid into the day-ahead market.

School buses remain the best example of Fleet-V2G potential. They can charge after dropping off children after school, at a time when solar power is still generally feeding the grid. They will sit all night, and can supply some amount of power in the late evening hours.

Tuesday, October 14, 2025

Virtual Power Plants and the California Grid

The California Independent System Operator (CAISO) has published a report regarding electricity generation in the summer of 2025, which includes data through the end of September 2025. They provide a nifty interactive chart at that link, and the underlying data is easy to find in the source of the web page. I've included the data for July and September at the bottom of this post, as we're going to dive into those details in a moment.

Below is the graph of the sources of energy on the California electrical grid in September of 2025, the period ending about two weeks ago at the time of this writing. Importantly, the Solar resources shown are only utility connected solar arrays. Rooftop solar is behind the meter, reducing demand for electricity rather than adding to the supply measured here. There is approximately another 19 Gigawatts of solar capacity installed behind the meter.

Graph of energy sources in California in September 2025 with Methane providing 25 Gigawatts, Hydro 7 Gigawatts, Nuclear 3 Gigawatts, with Solar peaking to 12 Gigawatts in the middle of the day and Battery sustaining Solar contribution for a few more hours after the Sun sets


This is the same data, shown as a percentage of the total. Because total electricity consumption varies throughout the day, the contribution of baseload sources like Methane-fired generators varies as a percentage of the total even though their output is constant.
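To make the percentage view concrete, here is a small Python sketch that computes a source's share of the total at two hours, using the September figures from the raw data at the bottom of this post. It assumes the values are megawatts and that the categories listed there are the whole supply mix (the raw data is truncated, so the true totals may differ). The names `supply_mw` and `share_percent` are mine, purely for illustration.

```python
# Hourly supply (assumed to be megawatts) copied from the September raw data
# at the end of this post, for midnight (hour 0) and midday (hour 12).
# Demand Response and Battery Storage are zero at both of these hours.
supply_mw = {
    "Imports":          {0: 11665.0,     12: 11665.0},
    "Wind":             {0: 742.4458,    12: 216.38616},
    "Solar":            {0: 3.7224,      12: 13035.8448},
    "Other renewables": {0: 1755.632861, 12: 1755.632861},
    "Other":            {0: 1682.0,      12: 1682.0},
    "Hydro":            {0: 7014.0,      12: 7014.0},
    "Nuclear":          {0: 2280.0,      12: 2280.0},
    "Methane":          {0: 26188.0,     12: 26188.0},
}


def share_percent(source, hour):
    """Percentage of the total supply at `hour` contributed by `source`."""
    total = sum(values[hour] for values in supply_mw.values())
    return 100.0 * supply_mw[source][hour] / total


for hour in (0, 12):
    print(hour,
          round(share_percent("Methane", hour), 1),
          round(share_percent("Solar", hour), 1))
```

Methane's output is identical at both hours, yet its share falls from roughly 51% at midnight to roughly 41% at midday, because midday solar enlarges the total.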

Graph of energy sources in California in September 2025 with Methane providing roughly 50%, Hydro 13%, Nuclear 5%, with Solar peaking to 20% in the middle of the day and Battery sustaining Solar contribution for a few more hours after the Sun sets

A few observations:

  • The contribution of wind power is smaller than I expected. I may get a skewed view of the prevalence of wind generation in California as I pass through the Altamont Pass wind farm regularly, but I expected wind to be a larger percentage.
  • It is too small to be visible in the graph, but Solar power never actually drops to zero. It continues to supply about 4 Megawatts all night. I believe this might be Ivanpah, a solar thermal generation plant in the desert, which continues generating power from stored heat even after the sun has set.



I tried to incorporate behind-the-meter solar into a similar graph, below. This is a crude estimate:

  • In 2024, 19 Gigawatts of rooftop solar was installed versus 21 Gigawatts of utility-scale. I made the assumption that the production from rooftop solar would be approximately 19/21 of the utility number.
  • Prior studies show that rooftop solar is not installed in ideal locations nor properly angled toward the sun, so I made the additional assumption that it would be 80% as productive as utility solar.
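The two assumptions above combine into a single scaling factor. As a minimal sketch (the function and constant names are my own, and the input is the midday peak of the September utility solar series from the raw data at the bottom of this post, assumed to be in megawatts):

```python
UTILITY_CAPACITY_GW = 21.0   # utility-scale solar, as quoted above
ROOFTOP_CAPACITY_GW = 19.0   # behind-the-meter solar, as quoted above
ROOFTOP_DERATING = 0.80      # assumed productivity relative to utility solar


def estimate_rooftop_mw(utility_solar_mw):
    """Crude rooftop estimate: scale utility output by capacity ratio and derating."""
    capacity_ratio = ROOFTOP_CAPACITY_GW / UTILITY_CAPACITY_GW
    return utility_solar_mw * capacity_ratio * ROOFTOP_DERATING


peak_utility_solar_mw = 13035.8448  # hour 12 of the September Solar series
print(round(estimate_rooftop_mw(peak_utility_solar_mw)))
```

At the midday peak this works out to roughly 9.4 Gigawatts of estimated rooftop production on top of the 13 Gigawatts of utility-scale solar.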

The result hews considerably closer to the 67% renewable result announced by the state last year.

Graph of energy sources in California in September 2025 with Methane providing roughly 40%, Hydro 10%, Nuclear 4%, with the combination of utility-scale and rooftop Solar peaking to 32% in the middle of the day and Battery sustaining Solar contribution for a few more hours after the Sun sets


Virtual Power Plants

I have combined the categories which CAISO reported separately as Batteries and Demand Response, because until 4/2024 Virtual Power Plants formed via aggregation of residential batteries like Powerwalls were contracted as Demand Response. Only utility-scale battery installations like Megapacks were accounted for as Batteries. Splitting into two categories obscures and substantially minimizes the true contribution of battery power to the grid.

We participate in a Virtual Power Plant in northern California, allocating about half of the 27 kWh of capacity installed at the house. For a number of days this summer in the late afternoon and early evening, the house supplied about 6 kilowatts of power back to the grid. It is quite smooth: we don't even notice unless we look at the app to see what it is doing.
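As a rough back-of-the-envelope check on those numbers, this sketch estimates how long the allocated share of the battery could sustain that export rate. It ignores conversion losses and any household load, and the variable names are only illustrative.

```python
installed_capacity_kwh = 27.0   # total battery capacity at the house
vpp_fraction = 0.5              # "about half" is allocated to the VPP
export_power_kw = 6.0           # approximate power sent back to the grid

allocated_kwh = installed_capacity_kwh * vpp_fraction
hours_at_full_export = allocated_kwh / export_power_kw

print(allocated_kwh, hours_at_full_export)
```

Under these assumptions the allocated 13.5 kWh would last a bit over two hours at a steady 6 kilowatts.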




Raw data from the CAISO report web page.

"September": [
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1392.000662, 1319.840318,            // Demand Response
   1239.827827, 1177.653666, 1023.942241, 287.33, 0, 0],                                // Demand Response

  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2264.012689, 6312.981683,            // Battery Storage
   10219.53932, 9740.80126, 7382.450552, 5789.963039, 2869.891625, 0],                  // Battery Storage

  [11665, 11665, 11665, 11665, 11665, 11665, 11665, 11665, 11665, 11665, 11665,         // Imports
   11665, 11665, 11665, 11665, 11665, 5500, 5500, 5500, 5500, 5500, 5500, 5500,         // Imports
   11665],                                                                              // Imports

  [742.4458, 725.114696, 719.58188, 591.859728, 356.992952, 328.293048, 330.9584,       // Wind
   323.6145657, 277.992424, 205.762648, 221.590544, 199.863504, 216.38616,              // Wind
   311.644072, 316.5584821, 324.313968, 396.1921449, 535.622064, 785.470392,            // Wind
   905.120696, 1081.627632, 1164.064064, 1096.58392, 1058.725816],                      // Wind

  [3.7224, 3.7224, 3.95928, 4.43304, 5.02524, 7.1064, 155.30868, 3417.63696,            // Solar
   8622.78732, 11304.55656, 12390.19452, 12923.98668, 13035.8448, 12592.86717,          // Solar
   12166.76592, 11529.86232, 8779.161639, 5405.92308, 1148.5296, 12.79152,              // Solar
   5.34672, 4.01004, 3.8916, 3.84084],                                                  // Solar

  [1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861,        // Other renewables
   1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861,        // Other renewables
   1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861,        // Other renewables
   1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861, 1755.632861],       // Other renewables

  [1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682,        // Other
   1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682, 1682],                   // Other

  [7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014,        // Hydro
   7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014, 7014],                   // Hydro

  [2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280,        // Nuclear
   2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280, 2280],                   // Nuclear

  [26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188,         // Methane
   26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188, 26188,         // Methane
   26188, 26188],                                                                       // Methane

  ...
]

Sunday, October 12, 2025

Germany Photos Then and Now

In our collection of family photos we have a number of sets taken in Germany over the decades:

  • in the early 1950s by my spouse's grandparents
  • in the late 1960s by my father while stationed in Germany
  • in 1995 by my spouse
  • in summer 2024 on our first family trip to Germany
  • in summer 2025 on our second family trip to Germany

In our trips of the last several summers we've made an effort to take photos of places our parents visited, allowing side-by-side comparison. The main thing one notes is the evolution of camera technology and its handling of colors, amusingly enough.


 

Göbelstraße 2, Hannover

My spouse's grandparents owned a house on this land decades ago. That house has since been replaced by larger buildings.

Göbelstraße 2, Hannover

 

Burg Pfalzgrafenstein, Rhine River

Burg Pfalzgrafenstein, Rhine River

 

Gasthaus Rheingold, Rhine River

Gasthaus Rheingold, Rhine River

 

Assmannshausen, Rhine River

Assmannshausen, Rhine River

 

Glockenspiel in Munich

Glockenspiel in Munich

 

Odeonsplatz in Munich

Odeonsplatz in Munich

 

Karlsplatz in Munich

Karlsplatz in Munich

 

Heidelberg Bridge

Heidelberg Bridge

 

Heidelberg Castle

Heidelberg

Saturday, October 11, 2025

New Ohio Exonym Just Dropped

Tell me your game development is done outside of the United States without telling me your game development is done outside of the United States.

Cartoonish map of the upper midwest of the United States with Columbus, Ohio spelled Calambus

As an alumnus of the University of Michigan, I approve of the association of Ohio with "Calamity".

Friday, October 10, 2025

New Google Blogger Features ?!?

**Try our New Beta Features**: Create a more engaging reading experience with the help of Google

Google Search previews: Easily insert visual Google Search previews for popular people, locations, pop-culture and more directly in your blog! In Compose View, look for the ‘G’ button in the editor tool bar to get started.

That is the notice greeting me at the top of draft.blogger.com today. After years of not noticing any change in the service at all, it is now getting search previews.

Honestly I would have expected any sudden burst of activity in Google Blogger to be more distinctly AI-related, part of someone's promotion packet to sprinkle LLMs anywhere and everywhere.

Tuesday, October 7, 2025

EasyPASS and entering Germany

If you've ever flown into Germany you've seen the two lines at customs and immigration, one for citizens of the European Union and one for non-EU citizens. The non-EU line is often longer and slower, though not exceptionally so at the times I've been through. Nonetheless in a family like ours, where the rest of them go through the EU line and I wait in the non-EU line, it leads to some extra complexities in travel.

EasyPASS is a program to enroll passports from several non-EU countries including the United States, South Korea, and Taiwan to be able to use the electronic readers in the EU line at German airports. One fills out a form to submit at an airport with an EasyPASS office, including Berlin (Brandenburg), Köln/Cologne/Bonn, Düsseldorf, Frankfurt, Hamburg, Hanover, München/Munich, and Stuttgart.

I enrolled at the Munich airport in July 2025, where EasyPASS is handled at the police substation. Follow the signs for the Polizei office, which is at the far end of the airport where there are a number of restaurants and shops. You ring the buzzer for admittance.

Shopping court at the Munich airport, with an arrow pointing to the police substation

It is helpful if you speak a bit of German, but the police stationed at the airport have to deal with international travellers every day and understand English well. They were quite helpful in correcting a mistake I'd made in filling out the form. They then took my US passport to enroll in EasyPASS.

I don't have an outcome to report yet; it will likely be some months until our next trip to Germany, when I can try using the EU line at the airport.

Friday, September 26, 2025

On the Nature of Microservices

a series of interlocking sawtoothed gears, intended to represent a set of microservices

Microservices entered the lexicon of systems engineering a number of years ago, breaking up a system which might once have been delivered as a single large binary (though usually with a separate relational database at least) to instead consist of a series of small microservices, each performing a specific function and communicating amongst themselves to provide an overall service.

In my experience at least, moving to a microservices architecture does solve real problems, but the problems it solves are as much organizational as technical.

In a very large engineering team all trying to work together to deliver a solution, one frequently loses velocity simply because of the number of teams jostling against each other:

  1. The release process grows over time, and tends to add process of the form "make sure that problem never happens again."
  2. Rollbacks roll back everything. When every team has P0 deliverables, no team has P0 deliverables.
  3. The internal architecture might have been carefully planned... or might not. Even without an "if I can call it, I can use it" mentality, there still can be significant underspecification.

 

Moving to Microservices

A team which moves to microservices, devoting considerable effort to do so, has every incentive to solve these issues:

  1. The release process for each microservice can be (mostly) decoupled with disciplined versioning of the API between producers and consumers.
  2. Rollbacks impact only a portion of the overall service.
  3. The interfaces between microservices will have an API boundary, though Hyrum's Law still applies: un-promised behavior can become a load-bearing dependency.

The tradeoff is that one now has a distributed system, and distributed systems are hard.

  • Debugging is much more likely to cross multiple processes.
  • The 95% and 99% tail latency will suffer, as long code paths likely incur extra cross-process messaging.
  • A robust design will degrade gracefully if subsystems are unavailable, transforming the potential for a full outage into more frequent partial system failures.
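The tail latency point can be illustrated with a little arithmetic. As a simplified sketch, assume a request makes several cross-process calls in sequence and each call independently has a 1% chance of landing in its own slow tail (beyond its 99th percentile). Real services are rarely this independent, so treat this as intuition rather than a model; the function name is mine.

```python
def prob_any_call_slow(n_calls, percentile=0.99):
    """Chance that at least one of n independent calls exceeds its own percentile latency."""
    return 1.0 - percentile ** n_calls


for n_calls in (1, 5, 10, 50):
    print(n_calls, round(100 * prob_any_call_slow(n_calls), 1), "%")
```

With ten calls in the path, nearly one request in ten hits at least one slow call, even though each individual service looks healthy at its own 99th percentile.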

Kubernetes logo, a ship's wheel

If the problem being solved is "maximize effectiveness of a large engineering team" then these are perhaps good tradeoffs. One has the resources to develop tooling for debugging and latency and reliability. If one has further chosen to deploy those services via an orchestration system like Kubernetes, one has the personnel to cover the operational burden of that too.

If one doesn't have the personnel to cover the cost, microservices become a more questionable choice. Fundamentally: delivering via monolithic binary is not inherently bad. It isn't automatically a poor choice. It isn't poor engineering.

How to deliver software is influenced by the size and composition of the team delivering it.

Thursday, September 25, 2025

Try Turning the Train Off and Back On

BART train stopped at an outdoor station

Today I rode the Bay Area Rapid Transit train from the peninsula up to San Francisco. Our stop at the 24th and Mission station was unusually long. The conductor announced on the speaker that they were rebooting part of the train computer.

The universal first troubleshooting step now extends to turning the train off and back on.

Sunday, September 21, 2025

One Year of Electrified Caltrain

Electric Caltrain engine, destination San Francisco

In November 2020 California voters passed Measure RR to fund electrification of Caltrain down the San Francisco Peninsula. After several years of construction, the new electrified trains entered service on September 21, 2024: exactly one year ago at the time of this writing.

The electrification measure was an audacious plan for Caltrain to recover from the Covid-driven disruption in travel patterns by radically improving the service. The electrified fleet would provide more frequent service because the new engines would accelerate and decelerate far more strongly than the diesel locomotives did. The rolling stock would additionally be refreshed with new passenger cars.


So: did it work? Based on ridership data from the last year, to me it certainly appears so.

Ridership numbers drop suddenly in 3/2020 at the start of Covid, climb slowly until 8/2024, and then climb rapidly from 9/2024 through 8/2025

The large drop in ridership in 2020 is due to Covid. I tried to show the slope by drawing red lines in the few years between Covid and electrification, and between electrification and now. The rate of increase in ridership changed markedly for the better in almost exactly 9/2024 when the lines were electrified.

Ridership has not yet returned to pre-Covid levels, but now appears to be on track to do so if trends continue.

Monday, September 15, 2025

Android Signal Keyboard Privacy

Android keyboards can be quite sophisticated, including learning of commonly used words or languages. Some of these result in uploading what you enter to a service you may not know about. The Signal app on Android includes a setting to say that its keyboard input should not be uploaded.

Privacy settings for Android Signal app with Incognito Keyboard enabled

Malicious keyboard apps can ignore this, but if a malicious keyboard has made its way onto your device I think you have bigger problems.

Android keyboard with overlaid dialog: This app doesn't support voice input

The tradeoff: if you use speech-to-text on the Android keyboard, it will no longer work. The microphone will be greyed out and will bring up a small message saying "This app doesn't support voice input." The stock Android keyboard apparently has no on-device voice processing.

I learned of this from Liz Fong-Jones' Bluesky feed.

Wednesday, September 10, 2025

Continuous Improvement in LLM Code Generation

One week ago, I wrote:

"I wish wish wish that Claude Code would automatically populate a .gitignore for node_modules. Not for the first time, I checked 437 Megabytes of code into git and had to rewrite the history to remove it."

I used Claude Code to create a new frontend project, using Qwik this time, and what do I see?

dgentry@llm:frontend$ cat .gitignore
# Build
/dist
/lib
/lib-types
/server

# Development
node_modules
.env
*.local

...

A classic hacker stock photo in a darkened room sitting in front of a laptop wearing a hoodie and mask, except the person typing is a robot

I don't know if this represents something which the Claude Code team specifically made happen since the last time I had it generate code like this, or if the training data of Qwik codebases is so much more likely to have included node_modules in their .gitignore file.

It is one of the perverse things about use of tools like this: we tend to give credit to the tool, and not the community which created the information upon which it relies.

Tuesday, September 9, 2025

Passport Cards as Proof of Citizenship

Passport card issued November 2024

In November 2024 we ordered Passport Cards for the first time. The cards arrived 18 days after we mailed in the forms, without our paying for expedited service or even for express mail.

The Passport Card is a stiff plastic card, slightly thicker than our driver's licenses. It is specifically not valid for international air travel, though it can be used to board a domestic flight and is Real-ID compliant. It cost $30, versus $130 for a Passport Book, at least as of the timeframe we ordered in late 2024.

Most importantly though: it is much more reasonable to have a Passport Card with you at all times than it would be to carry around a Passport Book, and the card is a valid proof of citizenship. It can be used within the United States, can be used for travel within the Americas, and will allow re-entry into the US even if you have lost the regular Passport Book.

One caution: when it comes time to renew, both the Passport Book and Card will need to be turned in for renewal. Take good care of both: if one is lost, then the renewal of the other becomes a Lost Passport event which requires DS-64 and DS-11 forms to replace. The DS-11 requires birth certificates and other proof of citizenship, just as getting the passport for the first time did.

A Passport Card is a good way to have proof of citizenship with you at all times.

Monday, September 8, 2025

Konsulatstermine für Reisepässe (Consulate Appointments for Passports)

Hand holding four German Reisepässe

Upon acceptance of a Staatsangehörigkeit § 5 declaration, making one a German citizen, the next step is generally to order a German passport, called a Reisepass. This requires filling out the form and bringing the Urkunde über den Erwerb der deutschen Staatsangehörigkeit durch Erklärung (certificate of acquisition of German citizenship by declaration) and passport photos to the responsible Consulate in a passport appointment.

It can be difficult to get a passport appointment. You keep checking the site and there are never any appointment slots available.

German Consulates around the world add new appointments every weekday at midnight in Germany. For example, that is 3pm in California. If you start polling the appointment site at 2:59pm on Sunday, you have the best chance of seeing new appointments appear and grabbing one before they are all gone. Note that the Daylight Saving Time changeover dates differ by a few weeks between Europe and the US, so the two are not the same number of hours apart all year.
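To check what local time corresponds to midnight in Germany on a given date, including during the weeks when the two regions disagree about Daylight Saving Time, a small sketch using Python's standard `zoneinfo` module can help. The function name is my own, and the midnight release time is simply taken from the description above.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

BERLIN = ZoneInfo("Europe/Berlin")
CALIFORNIA = ZoneInfo("America/Los_Angeles")


def release_time_in_california(year, month, day):
    """California local time corresponding to 00:00 in Germany on the given date."""
    midnight_in_germany = datetime(year, month, day, 0, 0, tzinfo=BERLIN)
    return midnight_in_germany.astimezone(CALIFORNIA)


# Three Mondays: one in summer, one in the autumn changeover gap, one in winter.
for date in [(2025, 9, 8), (2025, 10, 27), (2025, 12, 1)]:
    print(date, "->", release_time_in_california(*date).strftime("%A %H:%M"))
```

For most of the year the result is 15:00 on the previous day, but in the gap between the European and US changeovers (for example the week of October 27, 2025) it shifts to 16:00.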

There are Honorary Consuls in a number of cities who can make copies of your documentation and forward it to the Consulate, which might be easier to get an appointment with if your Consulate is swamped.

Tuesday, September 2, 2025

LLMs to Blaze a Trail

As an engineering executive there are a few ideas and practices which I reinforce via repetition to the team, either explicitly at the start of a recurring meeting or implicitly by bringing it up whenever relevant. The first of these ideas is:

The product is not the code, not the features, not the designs.
    The product is that people can use the service for things that are important to them.
The business is not the code, not the features, not the designs.
    The business is that people can use the service for things that are valuable to them.


But the topic of this post is not that. The topic is the second thing I frequently reinforce via repetition:

Get something, anything, working end to end as quickly as you can.
Not even a minimum viable thing. Any thing.


It has been my experience that, as developers, we tend to focus in on one area of a system to explore its requirements and build it out sufficiently until we feel confident that we understand what else will need to be done before moving on to the next piece. This results in a system where the understanding and the plan for development is grown by accretion, each piece layered atop the previous which is left undisturbed by later developments. We might go back and harmonize all of them later... maybe.

It has also been my experience that everything starts progressing more quickly once the system does something, anything end-to-end.

  1. We gain perspective on how the whole system will work and apply it to everything we do subsequently.
  2. One can make a change and see it function all the way through. Enthusiasm improves productivity.
  3. It is far more effective when multiple people work on a system in parallel if they can all see the impacts of each other's work.

Thus:

Get something, anything, working end to end as quickly as you can.
Not even a minimum viable thing. Any thing.


LLMs to Blaze a Trail

With the maturing capabilities of LLM code generation, I tried an experiment with Claude Code. At Google one of the classes in orientation was to construct a web scraper. I asked Claude Code to build a scraper, but an even simpler one: scrape a metric.

In a new scraper directory, create a go program which will scrape a web page formatted
in prometheus metrics format, and extract a floating point value labeled "example"

Create an SQL schema for a timeseries, with columns for a timestamp and a floating point value.

Have scraper connect to a Postgres database and write each sample it collects to the database.

In a new webui/frontend directory, create a web page using React and typescript which will
poll a backend server for changes in a loop and display rows of timeseries data with timestamp,
sample name, and value.

In a new webui/backend directory, create a go program which will handle queries from
webui/frontend and fetch timeseries data from the postgres database.

A classic hacker stock photo in a darkened room sitting in front of a laptop wearing a hoodie and mask, except the person typing is a robot

It produced a small, functional implementation.

scraper     123 lines   Go
backend     169 lines   Go
frontend    127 lines   TypeScript
            100 lines   CSS
             43 lines   HTML
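For readers unfamiliar with the Prometheus text format, the heart of the scraping step is small. The sketch below is not the Go code Claude Code generated; it is my own minimal Python illustration of pulling one value out of a metrics page. It assumes "labeled example" means a metric named `example`, it handles only the common line shapes (name, optional labels, value, optional timestamp), and `extract_metric` is a hypothetical helper name.

```python
def extract_metric(text, name="example"):
    """Return the first sample value for metric `name`, or None if absent."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Form: name{label="value",...} value [timestamp]
            identifier, _, rest = line.rpartition("}")
            metric_name = identifier.split("{", 1)[0].strip()
        else:
            # Form: name value [timestamp]
            parts = line.split(None, 1)
            metric_name = parts[0]
            rest = parts[1] if len(parts) > 1 else ""
        fields = rest.split()
        if metric_name == name and fields:
            return float(fields[0])
    return None


sample_page = """\
# HELP example An example gauge.
# TYPE example gauge
example_other 1
example{source="demo"} 2.5 1700000000000
"""
print(extract_metric(sample_page))
```

A production scraper would more likely lean on an existing Prometheus client library than hand-parse the format, but the sketch shows how little logic the contrived example actually needs.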

A few interesting tidbits:

  • It produced no unit tests for the Go code. I didn't tell it to.
  • It did produce unit tests for the TypeScript code, even though I did not tell it to. I think this speaks well for the TypeScript community: the training data is infused with testing as an expected practice.
  • I wish wish wish that Claude Code would automatically populate a .gitignore for node_modules. Not for the first time, I checked 437 Megabytes of code into git and had to rewrite the history to remove it.

 

Unit Tests

Having no tests at all sets a bad example. I don't actually want to encourage the construction of large system test suites at this stage of a project, as the effort to keep updating a large test as the system evolves is likely to outweigh the value of the test at this stage. Yet I do want to set the example by ensuring there is something.

In the scraper directory, keep the main() function in main.go but move the rest of the code
to a scrape.go file. Write tests for scrape.go with a local prometheus server and in-memory
database. Check that metrics are correctly stored in the database.

Claude Code generated 377 lines of test cases, including scraping one value and several values. Most of the code was to set up an in-memory database using sqlite and to run a local Prometheus server.

The cost of the first prompt to generate the system and the second prompt to add unit tests: 93 cents.


 

Non-trivial example

That example was pretty contrived. How about an example of a more realistic system which:

  1. Implements a protocol connecting to a legacy communications system.
  2. Implements a set of modern protocols connecting to current Internet communications infrastructure, to forward messages to and from the legacy protocol.
  3. Has a management layer watching all of the connections and can stop or restart them as needed.
  4. Has a dashboard and console showing the status and configuration of the system.

Can it produce this? Well... not exactly. I kind of cheated: this is the first thing I attempted, and I made up the contrived system later.

The problem is that first step. Claude Code was not much help in producing the first piece, connecting to the legacy system. The tasks there were more like engineering archaeology:

  • Trying variations on the digest hash function until the remote system suddenly returned 200 OK.
  • Figuring out what portions of the poorly documented header fields were actually implemented.
  • Diagnosing failures when the only indication we get is "Invalid" with no further information about what was invalid.

There just isn't any training data for this, and so trying to rapidly get to a functioning end-to-end system entirely via code generation didn't work. I was able to work on the management layer and the dashboard and so on while still debugging the first piece, but it only started working when that first piece was done.

Could I have set that first piece aside with a mockup, and worked on the rest? Probably, but it was just me, not a team, and the first piece was the biggest risk. I focussed on eliminating the risk.

In an engineering team, I think I would approach this with a small team whose job is to sketch out the overall system. It might be entirely senior engineers or at least led by a quite senior engineer, and tasked to identify and quantify risks and to plan out a system. That team could multiply its efforts using LLMs to help generate the more well understood portions of the system.

Monday, September 1, 2025

On the Persistence of Human Memory

Tell me this looks wrong to you, too.

A screenshot of green Save and red Cancel buttons where the Save button is quite obviously lower on the screen than Cancel

Claude Code doesn't see it. I mean, of course Claude Code doesn't see it, it has no eyes or other senses. Nonetheless I tried to get Claude Code to fix it by leading it to a solution.

In frontend/ in the Delivery page, align the Save and Cancel buttons vertically.
In frontend/ in the Delivery page, remove the height property from the Save and Cancel
buttons. Put both buttons inside a div, and set the height of the div to 40px.

Neither of these fixed it, because these were not the problem. The actual problem was:

.save-button,
.add-button {
  background-color: #48bb78;
  margin-top: 1rem;
}

This was leftover from when the button was elsewhere on the page, and not removed when it moved to be next to the Cancel button. Poking around with Chrome's Developer Tools and looking at the Elements on the page identified it.


On the Persistence of Human Memory

One thing I am finding is that memory of code generated with the help of an LLM fades much more quickly. Some portions of this system were not amenable to getting help from Claude Code: things which involve low level interoperability with existing and legacy systems. There is no relevant material in the training set, and Claude Code could not help with the iterative debugging, which meant staring at the errors from the legacy system to figure out what to do next.

Those portions of the codebase, those developed with blood and sweat and tears, remain clear in my memory. Even months later I can predict how they will be impacted by other changes and what will need to be done.

That is not true of the portions which the LLM generated. Continuing with the analogy of treating it as an early career developer, I only reviewed the code; I didn't write it. As with any code review, the memory of how it works fades much more quickly compared with actually digging in to the work.

(This is better than Claude Code, though, which retains no memory at all of how code has evolved and instead discovers it all afresh at the start of each session).

Treating an LLM like an early career programming partner can provide large increases in productivity, but it also means that one has less personal recollection of the winding path the code took to get to its current state. One must be able to go spelunking. This isn't that much different from a codebase which one has worked on over a long period: little detailed memory of specific portions of the code remains, but an overall sense of the codebase is retained much longer.

Monday, August 25, 2025

Claude Code's 19 cent Parser

A brief prompt:

In authheader.go write a function to parse a SIP WWW-Authenticate header for Digest
authentication. It should return a map[string]string of key:value pairs which are
present. It should handle the case of valueless parameter with no "=" by populating
an empty string in the map.

Write unit tests, including these WWW-Authenticate headers:
1. WWW-Authenticate: Digest algorithm=MD5,realm="example.com",nonce="abcd="
2. WWW-Authenticate: Digest realm="example.com", nonce="efgh=", opaque="1234__", algorithm=MD5, qop="auth"

A classic hacker stock photo in a darkened room sitting in front of a laptop wearing a hoodie and mask, except the person typing is a robot

From this, Claude Code generated quite reasonable parsing code for a SIP WWW-Authenticate header. It did this in approximately one minute of wall-clock time at a cost of 19 cents. This is considerably faster and cheaper than I could have produced a similar function.

I made one manual fix: the string comparisons for "Digest" and for parameter field names are supposed to be case-insensitive, and I added unit tests for that. I hadn't specified this in the prompt, and Claude Code didn't figure that out from the mention of SIP.
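
The case-insensitivity fix can be sketched as follows. This is a minimal Python illustration, not the Go code in authheader.go (which this post does not show); `split_scheme` and `normalize_name` are hypothetical names. The scheme token and parameter names are lowercased before comparison, while parameter values are left exactly as they arrived.

```python
def split_scheme(header_value: str) -> tuple[str, str]:
    """Split 'Digest realm=...' into a lowercased scheme and the raw parameter text."""
    scheme, _, rest = header_value.strip().partition(" ")
    return scheme.lower(), rest.strip()


def normalize_name(name: str) -> str:
    """Parameter names are compared case-insensitively, so store them lowercased."""
    return name.strip().lower()


# Hypothetical header value with odd capitalization.
scheme, params = split_scheme('DIGEST Realm="example.com", NONCE="abcd="')
print(scheme == "digest")
print(normalize_name(" Realm "))
```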

I remain of the opinion that vibe coding can be a force multiplier for expertise, not a complete replacement for expertise.


 

Wisdom

Returning to an earlier topic: does the code which Claude Code generated exhibit wisdom? Did it have shortcomings which would be harmful? Claude Code came up with the following test cases, and wrote a Go table-driven test case for them.

  1. The two I explicitly gave it.
  2. Header with valueless parameter
  3. Header with unquoted values
  4. Empty header
  5. Header with comma in quoted value
  6. Header with extra spaces

I looked into the handling of unquoted values. The SIP standard says that fields like algorithm or qop, whose values are enumerated in the specifications, can be left unquoted. What Claude Code generated would allow any field to be unquoted, including arbitrary text strings like realm.

The spec says that free-form values like realm must be quoted. Yet there is also the Robustness Principle: be liberal in what you accept and strict in what you send.


 

Postel's Law Considered Harmful

Nowadays I think this principle has ultimately been more harmful than good. Over time we end up with a protocol which is only partially specified, where real implementations require a never-ending series of quirks handling to work around the behaviors of widely deployed yet incorrect implementations which other implementations have liberally accepted. For new protocols I'm a fan of being strict in what you send and strict in what you accept, so that quirks are not allowed to accumulate. Like barnacles, quirks slow forward progress over time and tend to cause standards to bog down and eventually stop even trying to evolve.

But SIP is ancient. In Internet Years it is a centenarian. What should one do about SIP? Being strict in what one accepts would lead to a series of relaxations being added during deployment, when engineering philosophy meets the harsh reality that there are a lot of barely-compliant production services run by vendors far too large to care what some Internet Rando thinks of their implementation.


 

Epilogue

I did consider whether to just leave it this way, and allow unquoted strings for all fields. Life is too short to fight the weight of Internet Protocol Inertia... but I couldn't do it. That would make my little corner of the SIP world be part of the problem. I made it only accept unquoted strings for algorithm and qop, the two enumerated fields which my system deals with.

In authheader.go:parseWWWAuthenticate() fields named “algorithm” or “qop” may be
quoted or unquoted. Any other field name must have its value quoted to be accepted.

In authheader_test.go add test cases:
1. fields named “algorithm” or “qop” may be quoted or unquoted.
2. Any other field name must have its value quoted to be accepted.
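
The rule in the prompt above can be sketched as follows. This is a Python illustration under stated assumptions, not the Go function Claude Code produced: `parse_www_authenticate`, `split_params`, and `UNQUOTED_OK` are hypothetical names, raising `ValueError` is an assumed way to express "not accepted", and backslash escapes inside quoted strings are not handled.

```python
# Fields whose values may be unquoted; the two named in the prompt above.
UNQUOTED_OK = {"algorithm", "qop"}


def split_params(text: str) -> list[str]:
    """Split on commas that fall outside double quotes (no escape handling)."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def parse_www_authenticate(header: str) -> dict[str, str]:
    """Parse a Digest challenge into {lowercased name: value}.

    Raises ValueError for a non-Digest scheme, or for an unquoted value on
    any field other than algorithm or qop.
    """
    value = header.strip()
    # Accept the value with or without a leading "WWW-Authenticate:" name.
    name, sep, rest = value.partition(":")
    if sep and name.strip().lower() == "www-authenticate":
        value = rest.strip()

    scheme, _, params_text = value.partition(" ")
    if scheme.lower() != "digest":
        raise ValueError(f"unsupported scheme: {scheme!r}")

    result: dict[str, str] = {}
    for param in split_params(params_text):
        key, eq, raw = param.partition("=")
        key, raw = key.strip().lower(), raw.strip()
        if not eq:
            result[key] = ""  # valueless parameter, no "="
        elif len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            result[key] = raw[1:-1]  # quoted value: always accepted
        elif key in UNQUOTED_OK and '"' not in raw:
            result[key] = raw  # unquoted value on an enumerated field
        else:
            raise ValueError(f"value for {key!r} must be quoted")
    return result


print(parse_www_authenticate(
    'WWW-Authenticate: Digest algorithm=MD5,realm="example.com",nonce="abcd="'
))
```

Rejecting the whole header is the strict choice. A deployment that has to interoperate with a sloppy peer could skip the offending parameter instead, which is the kind of relaxation the Postel's Law section worries about.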

Monday, August 18, 2025

Training Gemma3-270m for German Q-and-A

Google recently introduced Gemma3-270M, a smaller Gemma3 model with "only" 270 million parameters instead of billions.

The most interesting aspect of this model to me is that it is explicitly intended to be able to run locally, without requiring highly specialized infrastructure — well within what is achievable outside of specialized datacenters. The potential to run the model with an air gap, isolating it from outside, would be interesting for some future stuff I'm working on.

The eventual uses would involve communication in the German language, so I decided to see about adding training to answer questions in German specifically. I referenced an existing Colab notebook, which uses Gemma3-270M to predict chess moves. Chess as an application for LLMs isn't as interesting for me personally, since we have better ways to use neural networks to play chess, but the training flow is the same.

We start by loading dependencies and instantiating the gemma-3-270m-it model.

%%capture
import os
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft
    !pip install --no-deps trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1,<4.0.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth


from unsloth import FastModel
import torch
max_seq_length = 2048
model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/gemma-3-270m-it",
    max_seq_length = max_seq_length, # Choose any for long context!
    load_in_4bit = False,  # 4 bit quantization to reduce memory
    load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...", # use one if using gated models
)

We set it up to accept training data in a chat format using the Hugging Face deepset/germanquad dataset, a curated set of training data from the German Wikipedia and various academic sources.

model = FastModel.get_peft_model(
    model, r = 128,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 128, lora_dropout = 0, bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407, # Seems pretty random
    use_rslora = False, loftq_config = None,
)

from unsloth.chat_templates import get_chat_template
tokenizer = get_chat_template(tokenizer, chat_template = "gemma3")

from datasets import load_dataset
dataset = load_dataset("deepset/germanquad", split = "train[:10000]")

def convert_to_chatml(example):
    # One chat per row: the context passage goes in the system turn, the
    # question in the user turn, and the first listed answer in the
    # assistant turn.
    return {
        "conversations": [
            {"role": "system", "content": example["context"]},
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answers"]["text"][0]}
        ]
    }
dataset = dataset.map(convert_to_chatml)

def formatting_prompts_func(examples):
    # Render each conversation to a single training string using the
    # gemma3 chat template, then drop the leading <bos> marker.
    convos = examples["conversations"]
    texts = [
        tokenizer.apply_chat_template(
            convo, tokenize = False, add_generation_prompt = False,
        ).removeprefix('<bos>')
        for convo in convos
    ]
    return { "text" : texts, }
dataset = dataset.map(formatting_prompts_func, batched = True)

from trl import SFTTrainer, SFTConfig
trainer = SFTTrainer(
    model = model, tokenizer = tokenizer,
    train_dataset = dataset, eval_dataset = None,
    args = SFTConfig(
        dataset_text_field = "text",
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 1,
        warmup_steps = 5, num_train_epochs = 1,
        max_steps = 100, learning_rate = 5e-5,
        logging_steps = 1, optim = "adamw_8bit",
        weight_decay = 0.01, lr_scheduler_type = "linear",
        seed = 3407, output_dir="outputs",
        report_to = "none",
    ),
)

from unsloth.chat_templates import train_on_responses_only
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<start_of_turn>user\n",
    response_part = "<start_of_turn>model\n",
)

We then train the model. This took about three minutes on Google Colab using an NVIDIA Tesla T4 GPU.

trainer_stats = trainer.train()

Now, the real test: can it give good answers to questions not in its training data?

messages = [
    {'role': 'system','content': 'Bielefeld'},
    {"role" : 'user', 'content' : 'Gibt es Bielefeld?'}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True, # Must add for generation
).removeprefix('<bos>')

from transformers import TextStreamer
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 125,
    temperature = 1, top_p = 0.95, top_k = 64,
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)

<bos><start_of_turn>user
Gibt es Bielefeld?
<end_of_turn>

<start_of_turn>model
Ja
<end_of_turn>

Indeed yes, it can!

If that interaction doesn't make much sense: it is a German joke, alleging that the city of Bielefeld doesn't actually exist. Wikipedia has an explanation in English.

The trained model says that Bielefeld does exist. Clearly it has no sense of humor.

Sunday, August 17, 2025

Iceland Carbfix Tour

In July 2025 we took a tour of the Geothermal Exhibition at Hellisheiðarvirkjun in Iceland, all about Geothermal power. My spouse is a Professional Geologist, for whom this was an especially interesting tour.

We took a slightly more extensive version of the tour which included the Carbfix plant, a carbon capture and sequestration project where carbon dioxide is injected deep underground to mineralize.

Near the Carbfix injection site is the Climeworks Mammoth plant, a direct air carbon capture facility. We didn't get to go inside; we could only see it from a distance.

large steam pipes running over a low hill and across the field
Steam pipes from the geothermal vents back to the power plant.
Steam pipes within the power plant.
Turbine within the power plant.
Building with a very large number of fans to pull air through
Climeworks Direct Air Capture facility.
Carbfix piping driving H2S and CO2 deep underground
Carbfix H2S+CO2 pumping facility.
Carbfix H2S and CO2 meters
Carbfix H2S+CO2 meters.

Saturday, August 16, 2025

Survey of Germany-related blog posts

A gothic building with a huge animatronic clock

My spouse's German mother emigrated to the United States in 1958. Until 1975, German mothers did not pass on citizenship to children born in wedlock. My spouse was not born a German citizen for this reason. The modern state of Germany has decided that this gender discriminatory policy was unconstitutional, and defined a declaration process under Staatsangehörigkeitsgesetz § 5 (StAG5) by which descendants of such persons can declare their German citizenship.

Hand holding four German Reisepässe

Our journey in this area began in 2020 with genealogical research, then filing a declaration of citizenship for my spouse and our children, and finally taking trips to Germany as new German citizens. I've written a number of blog posts on this topic, roughly categorized below.


 

German Genealogy




German Citizenship




Other Topics of Interest to Americans, Concerning Germany




Our European Experiences

Monday, August 11, 2025

High School German with UCScout On Demand

I've written about our journey to German citizenship for my wife and our children. Yet merely having a German passport, Passdeutsche, isn't our goal: we want our children to be able to function comfortably in Europe if they choose to do so at any point in their lives. That means learning to speak German conversationally, if not fluently.

UCScout logo

Two of our kids are in High School. My school in Missouri lo these many years ago offered French, Spanish, and German, but times and school funding levels were different back then. Our High School now offers Spanish, the most widely used language in California after English, but we'd prefer they use this time to learn German instead.

Last year we started taking an online German course from UCScout, which is run by the University of California. The UCScout On Demand courses are self-paced but have an instructor available to assist, grade assignments, and conduct sessions in German. The On Demand courses cost $399 per semester, are accredited high school courses, and meet California's A-G requirements.

We have opted out of the Spanish class offered at school and instead enrolled in the UCScout German course, for 10th and 11th grade so far. At the end of each two semester course UCScout sends a report which our school incorporates into their regular transcript. There won't be a separate report card for the German classes when they apply for admission to college, it will all be part of their High School transcript.

A gothic building with a huge animatronic clock

This has worked out quite well for us. During the school year they use the hour which would have otherwise been the Spanish class to work on their German. They've also taken a class over each of the last two summers while we were in Germany. Being able to work on their own schedule lets them do the classwork in the evenings after we're done for the day.

If you choose to do something like this, start early. It took the entire first year of high school to get agreement that the kids would be allowed to drop Spanish and take German instead. It helped that the school had used UCScout during the pandemic to offer their Spanish course, so they already had a way to incorporate the grades into their system.

Monday, August 4, 2025

Germany trip 7.2025

Reprising last year's trip, we spent another July in Europe this year.

One somewhat less pleasant aspect of last year's trip was the flights, particularly the return from Frankfurt to San Francisco where we spent 13 hours in the air. This year we broke up the time in the air:

  1. San Francisco -> Pittsburgh, to visit family
  2. Pittsburgh -> Iceland
  3. Iceland -> Munich, Germany
  4. Munich -> Potsdam, near Berlin
  5. Potsdam -> Hannover
  6. Hannover -> Hamburg
  7. Hamburg -> Reykjavik, Iceland
  8. Iceland -> New York City
  9. New York -> San Francisco

 

Pittsburgh

We mainly visited family in Pittsburgh, but saw a few sights like the Duquesne Incline.

Duquesne Incline in Pittsburgh, a funicular railway with a rail car climbing a steep slope
Duquesne Incline

 

Iceland Geothermal Exhibit

We rented a car in Iceland and went to the Geothermal Exhibit, all about Geothermal power. My spouse is a Professional Geologist, for whom this was an especially interesting tour.

Turbines and steam pipes
Geothermal power plant at Hellisheiðarvirkjun
Carbon capture system

 

Munich

We spent four days in Munich; a highlight was watching a performance of the Glockenspiel.

A gothic building with a huge animatronic clock
Munich Rathaus Glockenspiel

 

Potsdam

Last year we stayed in Berlin, and didn't find time to make it down to Potsdam but wanted to. So this year, we spent four days in Potsdam. We toured the Sanssouci Palace.

Sanssouci Palace

 

Hannover

My wife's family is from Hannover, and we visit each time we are in Germany. This year we went to the Maschsee and the Herrenhäuser Gärten.

Panoramic shot of a lake
Maschsee
Statue of a man lying in the lap of a woman
Herrenhäuser Gärten

 

Hamburg

We loved Hamburg. Hamburg and Potsdam were our favorite cities on this trip, mainly because of the water. It reminded us of the San Francisco Bay.

Miniature replica of a city
Miniatur Wunderland

 

Reykjavik

We went back to Iceland on the return trip, staying in Reykjavik. We visited the Hallgrímskirkja church.

Hallgrímskirkja

 

New York City

I've been to New York a number of times but the rest of the family had not been, so this was a special treat. We took tours of the United Nations and of the Empire State Building.

View of a large room with two concentric seating areas, the UN Security Council chamber
United Nations
View of Manhattan from high above
View from the Empire State Building

Wednesday, July 30, 2025

Personal View of NYC Congestion Pricing

In the 2010s I managed an engineering organization with teams in California and New York. I travelled to NYC a number of times, typically staying near Chelsea Market.

The Maritime on 16th Street was my usual lodging, next to Google's NYC office and with the 14th Street subway station nearby. I recall the blaring of car horns being ever-present, continuing late into the night.

We brought the whole family to New York City in July, the first time I have been there in almost 10 years. We stayed in Manhattan in the Financial District, and went for pizza very near the Google building. The streets were very clear, nowhere near the level of traffic I remember.

view from high above the streets of Manhattan, with almost no cars visible driving on the roads
(view from the Empire State Building)
ground level view of an empty intersection in New York City
(on the way to pizza)

In January 2025, New York City implemented a congestion pricing mechanism to increase tolls for cars entering the city. It had an almost immediate impact in reducing traffic.

The Federal government, always eager to increase fossil fuel consumption, has revoked the needed authorizations and demanded that NYC end the congestion pricing mechanism. The two parties will present their arguments in court in October 2025.

I hope congestion pricing stays. The city is better for having it in place.