Creating Minecraft Mods with LLMs

Rewriting My Childhood with LLMs and Minecraft Mods
13 years ago, Minecraft captured my imagination. I remember the first time a friend showed me the game. I was in middle school and I was hooked. I didn't know it then, but that game would ultimately set me down the path that would teach me how to code (my first ever lines of code were written inside of Minecraft), lead me to study CS at MIT, and ultimately bring me into the world of tech.
Minecraft is a world of endless possibilities, and for the first time I really felt like my imagination could stretch as far as I wanted it to. But after many hundreds of hours, my imagination demanded even more. My friends and I learned about mods, and eventually I couldn't help but wonder how I might make my own. I tried and failed. At that point, though, I had the bug: I knew I wanted to create things, and now I knew coding was how I could do that. And 13 years later, I have the powers that I wish I had back then.
Ice Cream for Dinner
When I was a kid, I used to think that once I was an adult I would eat ice cream for dinner every day. I mean, what's the point of having adult responsibilities if you can't do that? And had I known I would one day be half-decent at coding, middle-school me would certainly have expected me to be making Minecraft mods. Unfortunately I (usually) don't eat ice cream for dinner and (usually) end up coding things other than Minecraft mods. But modern LLM capabilities made me wonder: could I somehow put my coding powers of today into the creative hands of middle-school-aged me?
The Experiment - LLMs for Creating Mods
I decided to see whether an LLM could help me create the Minecraft mods I always wanted. It seemed like a great litmus test of model capability. The setup was relatively simple thanks to the efforts of the Minecraft modding community over the last 13 years. I downloaded Minecraft Java Edition and selected the relatively simple Fabric mod API. To develop mods, I went with Cursor since that's where I was already doing most of my coding, and I could easily switch between different model providers there.
From here, I basically started prompting to see how far I could get. To be honest, models alone took me a whole lot further than I expected. They ended up nailing pretty complex ideas. But I started simple:
- "Create a giant pig that shoots fireballs at me."
Sounds like something I might have wanted back in the day. I gave this a go with Claude 3.7 Sonnet. I loaded up the game with no build errors, spawned the pig, aaand:

Not so giant.
But then I gave it a go with OpenAI's o3, and it miraculously debugged and implemented the original idea. Minutes later, I also had a family of giant chickens, sheep, and cows to go along with it.

This kicked off an hours-long spree of making my old dreams come true. For the most part, o3 just worked. It implemented complex ideas and debugged its own build errors. Many prompts it solved in one go. I wasn't using AI to fill in the gaps; I was relying on it exclusively to implement every mod in full, and it was delivering.
Next was a lightning staff (solved first attempt):

Then we followed up with a lightning arrow and an explosive arrow. This gets into more complex gameplay mechanics since bows need to consume arrows. But o3 and I eventually worked through it. Along the way, I had o3 build a debugging toolset, which it then started using for future mod development.
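To make the "bows need to consume arrows" point concrete, here is a minimal plain-Java sketch of the bookkeeping involved. It is not the code o3 generated and it uses no Minecraft or Fabric classes; every name in it (ArrowConsumptionSketch, Stack, consumeArrow, impactEffect) is hypothetical, and the rule that the first non-empty stack gets used is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Illustrative only: models arrow consumption without any game classes.
final class ArrowConsumptionSketch {
    enum ArrowType { NORMAL, LIGHTNING, EXPLOSIVE }

    /** Mutable stack of identical arrows in one inventory slot. */
    static final class Stack {
        final ArrowType type;
        int count;
        Stack(ArrowType type, int count) { this.type = type; this.count = count; }
    }

    /**
     * Consumes one arrow from the first non-empty stack and reports its type.
     * Returns empty when there is nothing to shoot, so the bow should not fire.
     */
    static Optional<ArrowType> consumeArrow(List<Stack> inventory) {
        for (Stack stack : inventory) {
            if (stack.count > 0) {
                stack.count--;
                return Optional.of(stack.type);
            }
        }
        return Optional.empty();
    }

    /** Describes what should happen where the arrow lands. */
    static String impactEffect(ArrowType type) {
        return switch (type) {
            case LIGHTNING -> "strike lightning at impact point";
            case EXPLOSIVE -> "explode at impact point";
            case NORMAL -> "deal normal arrow damage";
        };
    }

    public static void main(String[] args) {
        List<Stack> inventory = new ArrayList<>();
        inventory.add(new Stack(ArrowType.LIGHTNING, 1));
        inventory.add(new Stack(ArrowType.EXPLOSIVE, 2));
        consumeArrow(inventory).ifPresent(t -> System.out.println(impactEffect(t)));
    }
}
```

The arrow type is decided at the moment of consumption, so the projectile carries its effect with it instead of the bow guessing later which arrow was fired.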

Now that I had some confidence in o3's capabilities for pretty basic mods, I wanted to start to push the limits. To do this, I went for mods that either:
- Had requirements outside of the capabilities of the vanilla game
- Needed nuanced/advanced development like 3D rendering
- Or had more involved ideas with more moving parts to them
Gravity Gauntlet
To push the capabilities on game mechanics, I requested a gravity gauntlet that could pick up mobs from far away and allow me to carry them and then throw them in any direction. I thought this would be a bit involved since it's nowhere near as simple as taking an existing mob and making it bigger or making an arrow explode on impact. To my surprise, o3 solved this in its first attempt. Within a couple of minutes of having the idea, my gravity gauntlet was in the game.
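For a sense of what "carry and throw" involves, the core of the mechanic is a little vector math: float the mob at a point in front of the player's eyes, then launch it along the look direction on release. The sketch below is plain Java, not the mod o3 wrote. It avoids Minecraft and Fabric classes entirely, all names (GravityGauntletSketch, Vec3, holdPoint, throwVelocity) are hypothetical, and it assumes a yaw/pitch convention where yaw 0 faces +Z and positive pitch looks down.

```java
// Plain-Java sketch of the vector math behind a "gravity gauntlet".
// No Minecraft or Fabric classes are used; every name here is hypothetical.
final class GravityGauntletSketch {

    /** Minimal immutable 3D vector. */
    record Vec3(double x, double y, double z) {
        Vec3 add(Vec3 o) { return new Vec3(x + o.x, y + o.y, z + o.z); }
        Vec3 scale(double s) { return new Vec3(x * s, y * s, z * s); }
        double length() { return Math.sqrt(x * x + y * y + z * z); }
    }

    /**
     * Unit look vector from yaw and pitch in degrees.
     * Assumed convention: yaw 0 faces +Z, positive pitch looks down.
     */
    static Vec3 lookVector(double yawDeg, double pitchDeg) {
        double yaw = Math.toRadians(yawDeg);
        double pitch = Math.toRadians(pitchDeg);
        double horizontal = Math.cos(pitch);
        return new Vec3(-Math.sin(yaw) * horizontal, -Math.sin(pitch), Math.cos(yaw) * horizontal);
    }

    /** Point where a carried mob floats: holdDistance blocks in front of the eyes. */
    static Vec3 holdPoint(Vec3 eye, double yawDeg, double pitchDeg, double holdDistance) {
        return eye.add(lookVector(yawDeg, pitchDeg).scale(holdDistance));
    }

    /** Velocity given to the mob on release: along the look direction. */
    static Vec3 throwVelocity(double yawDeg, double pitchDeg, double strength) {
        return lookVector(yawDeg, pitchDeg).scale(strength);
    }

    public static void main(String[] args) {
        Vec3 eye = new Vec3(0.0, 1.62, 0.0);
        System.out.println(holdPoint(eye, 0.0, 0.0, 3.0));
        System.out.println(throwVelocity(90.0, -45.0, 2.0));
    }
}
```

In a real mod, the hold point would be recomputed every tick while the mob is carried, and the throw velocity applied once on release.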

TNT Catapult
What about something that looks nothing like anything else in the game? Anything that's not a block or handheld item in Minecraft has a custom renderer that takes a texture and maps it onto a 3D structure described by code. A catapult that launches TNT is a fairly simple block, but what if I wanted to actually see it as a catapult? Well, o3 managed to deliver on that too. It's not perfect - there's no catapult motion and it's a pretty basic look. But for something entirely generated? Pretty solid.
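To illustrate what "maps a texture onto a 3D structure" means in practice, here is a rough plain-Java sketch of the common "box UV" unwrap, where the six faces of a cuboid are laid out as rectangles on a flat texture. This is an illustration under assumptions, not the renderer o3 produced: it uses no Minecraft or Fabric classes, the names (BoxUvSketch, Rect, unwrap) are hypothetical, and the two side faces are labeled generically because which one counts as left or right depends on the modeling tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: computes where each face of a cuboid lands on a texture.
final class BoxUvSketch {

    /** Rectangle on the texture, in pixels, with (u, v) as its top-left corner. */
    record Rect(int u, int v, int width, int height) {}

    /**
     * Unwraps a cuboid of size w x h x d (width, height, depth) whose
     * texture region starts at pixel (u, v), using the box UV layout:
     * top and bottom in the first row, the four sides in the second row.
     */
    static Map<String, Rect> unwrap(int u, int v, int w, int h, int d) {
        Map<String, Rect> faces = new LinkedHashMap<>();
        faces.put("top", new Rect(u + d, v, w, d));
        faces.put("bottom", new Rect(u + d + w, v, w, d));
        faces.put("side1", new Rect(u, v + d, d, h));
        faces.put("front", new Rect(u + d, v + d, w, h));
        faces.put("side2", new Rect(u + d + w, v + d, d, h));
        faces.put("back", new Rect(u + 2 * d + w, v + d, w, h));
        return faces;
    }

    /** Total texture width the unwrapped cuboid occupies. */
    static int footprintWidth(int w, int d) { return 2 * (w + d); }

    /** Total texture height the unwrapped cuboid occupies. */
    static int footprintHeight(int h, int d) { return d + h; }

    public static void main(String[] args) {
        // An 8x8x8 cube, such as a simple mob head, starting at the texture origin.
        unwrap(0, 0, 8, 8, 8).forEach((face, rect) -> System.out.println(face + " -> " + rect));
    }
}
```

Every face has to land on exactly the right pixels, which is why a texture that looks fine as a flat image can still wrap onto the model as nonsense.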

Mimic Chest
The last piece I wanted to push the boundary on was something a bit more nuanced. What if we made a chest that turns into a zombie when I try to open it?
At first, o3 made a generic block that turned into a zombie on right click rather than an actual chest. That was already pretty close to the mark. One prompt later, I had my mimic chest.

Reaching The Limits
There were limits I hit that remained unsolved even after both o3 and I tried debugging for some time. A grappling hook never came to fruition, which makes sense as this is far outside the way the game engine is designed to allow the player to move.
Another interesting area of difficulty was generating textures for mobs. I mentioned earlier that Minecraft maps mob textures onto 3D structures. As it turns out, generating such a texture was out of reach for even the most capable image models I tried. This also makes sense: it's a very specific use case that requires high precision, and it's far outside the kinds of images in the training data. Here's an entertaining attempt at making a smiling cow:

You might have noticed, however, that I generated icons and item textures for the other mods above. This worked exceptionally well with GPT image generation.
There are still a lot of rough edges. The models did not always self-rescue, and I was often there to point the model in the right direction. All things considered, however, I was impressed. I wrote almost no code across the entire exercise and created mods that were significantly more complex than I thought possible.
Any ideas for other mods I should try building? Send them my way (wilson@creativemode.net). Or better yet, give it a go yourself!