Starting Small

Creating a repository, pushing that first branch, and opening the subsequent pull request is pretty daunting. It's arguably even worse when you're migrating an existing codebase piece by piece. How much is too much per pull request? Personally, I felt that getting my basic Message and Conversation structures pushed was good enough for v0.0.1. It really isn't much to look at, but it at least got me over that initial milestone of staking my claim and committing to open sourcing my LLM repository.
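
For a sense of scale, v0.0.1 amounted to little more than shapes like the sketch below. The names and fields here are illustrative stand-ins, not the crate's actual definitions:

// Illustrative only; the real neuromance types surely carry more.
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

struct Message {
    role: Role,
    content: String,
}

struct Conversation {
    id: String,
    messages: Vec<Message>,
}

impl Conversation {
    // Append a message, converting whatever stringly input we're given.
    fn push(&mut self, role: Role, content: impl Into<String>) {
        self.messages.push(Message { role, content: content.into() });
    }
}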

Big Push

Basic message and conversation structures need to be sent to an LLM if they're ever going to be useful. I'd consider the LLM client the second most important component in a repository that deals with inference providers, so coming out strong with a solid client was really important for my second pull request. Unfortunately, I ended up with a monster 7k+ line pull request bloated with redundant rustdoc comments and more examples than necessary. I also opted not to push my streaming client at first, choosing instead to focus on regular requests without dealing with deltas and chunks. This was an effort to contain the size of the pull request, but I would say it didn't really go as planned.
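
For all those lines, the non-streaming path is conceptually small. Stripped down, an OpenAI-style chat completion amounts to the sketch below, which assumes reqwest's blocking client and serde_json rather than whatever the crate actually uses internally:

use serde_json::json;

// Minimal non-streaming call; retries, typed responses, and real error
// handling are all elided, and the model name is just a placeholder.
fn complete(base_url: &str, api_key: &str, messages: &[serde_json::Value])
    -> Result<String, Box<dyn std::error::Error>>
{
    let resp: serde_json::Value = reqwest::blocking::Client::new()
        .post(format!("{base_url}/chat/completions"))
        .bearer_auth(api_key)
        .json(&json!({ "model": "some-local-model", "messages": messages }))
        .send()?
        .error_for_status()?
        .json()?;
    Ok(resp["choices"][0]["message"]["content"]
        .as_str()
        .unwrap_or_default()
        .to_owned())
}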

Getting out a basic OpenAI API client, along with an example showcasing an LLM calling a tool to get the current time and tell it like a pirate, gave us something we could build on. I now felt teed up to start bringing in the core chat loop, my agent building framework, and my tool crate with MCP support, all of which needed to be picked apart, scrutinized, and cleaned up. So I rolled up my sleeves and told myself I would do each in its own pull request.
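
A quick aside on that pirate demo: it leans on standard OpenAI-style tool calling, where you advertise a function schema, the model asks to call it, and you feed the result back. The schema for a time tool would look roughly like this; the name and parameters are my guesses, not the repo's actual example:

use serde_json::json;

// A guess at the time tool's schema, in the standard OpenAI
// function-calling format.
fn current_time_tool() -> serde_json::Value {
    json!({
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Return the current time for the model to retell like a pirate",
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone": { "type": "string", "description": "IANA timezone, e.g. UTC" }
                }
            }
        }
    })
}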

Ultimately that would not work out the way I thought it would.

Open Remote Distractions

I'm not really a fan of pushing work-in-progress branches to an open remote. There's always some risk of pushing sensitive information in a commit, and I was really paranoid about doing exactly that. Would it be the end of the world if I pushed an API key? Probably not, since I'd just roll the key, rewrite the history, and force push. Still, I really didn't want to deal with that on an open remote when I was finally starting to feel some momentum, and it just doesn't look good professionally.

So I started considering hosting my own git forge, either via cgit or Forgejo, or keeping it classic with a bare remote on my VPS.

It's pretty simple to stash your code on a cheap server for safekeeping:

ssh myuser@<ipaddress>
mkdir -p /var/git   # may need sudo and a chown to your user
git init --bare /var/git/neuromance.git

And then, from your local checkout, something like

cd neuromance
git remote add origin ssh://myuser@<ipaddress>/var/git/neuromance.git
git push -u origin main

You can find more detailed guides on other blogs. The point is that I really wanted somewhere new but still private to work through stripping my original repository for parts. Ultimately I decided to just create a new repository in my GitHub organization and run a private fork of neuromance, which I named protomance. I may still revisit hosting my own forge eventually, but for now this was going to have to do because I had other priorities to address.

Bigger Push

So at this point I've got my private fork set up, my prototype repository configured, and I'm ready to bring that core chat loop in. Given that my core chat loop is all about a tool execution feedback loop, the thing that makes an agent an agent, I quickly realized I needed the whole tool crate: the ToolImplementation trait, the ToolRegistry, and the ToolExecutor. The core loop is pointless without these.
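
To make that dependency concrete, here's a schematic of the feedback loop. Every type and signature below is a stand-in for the idea, not neuromance's published API:

// Stand-in types: the model's reply may request tool calls.
struct ToolCall { id: String, name: String, args: String }
struct Reply { text: String, tool_calls: Vec<ToolCall> }

// Stand-in for the ToolImplementation trait: a name plus an execution hook.
trait Tool {
    fn name(&self) -> &str;
    fn run(&self, args: &str) -> Result<String, String>;
}

// Stand-in for the registry/executor pair: look a tool up, then run it.
struct Registry { tools: Vec<Box<dyn Tool>> }

impl Registry {
    fn execute(&self, call: &ToolCall) -> Result<String, String> {
        self.tools
            .iter()
            .find(|t| t.name() == call.name)
            .ok_or_else(|| format!("unknown tool: {}", call.name))?
            .run(&call.args)
    }
}

// The loop itself: keep asking the model until it stops requesting tools.
fn run_turn(registry: &Registry, mut ask_model: impl FnMut(&[String]) -> Reply) -> String {
    let mut transcript: Vec<String> = Vec::new();
    loop {
        let reply = ask_model(&transcript);
        transcript.push(reply.text.clone());
        if reply.tool_calls.is_empty() {
            return reply.text; // no tool requests means the turn is done
        }
        for call in &reply.tool_calls {
            let result = registry.execute(call).unwrap_or_else(|e| e);
            transcript.push(format!("tool {} -> {}", call.id, result));
        }
        // Loop again so the model can read the tool results.
    }
}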

This was the point where I should have stuck to my original intentions and pushed feat: core chat loop with tools before moving on to the next thing, but frustration with responses that felt slow due to the lack of streaming finally got the best of me.

Local Inference & Streaming Client

It's worth noting that I do all of my development against local open weight models, and while my desktop can compute 83 TFLOPS according to Hugging Face, not being able to read the output of slower local models in real time was becoming cumbersome. I could have mocked responses to chase down bugs and formatting issues, but I think it's a lot more interesting to use 0.6B to 8B models during development. These smaller models keep growing more capable, but they can still get off track, which means killing the response mid-generation. So I had to address the streaming client even if it pushed my open source release back. A few thousand lines of toil later, streaming support was built into the OpenAI API client and I could move on with my life.
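
For the curious, the heart of the deltas-and-chunks work is incremental SSE parsing. A toy version, assuming OpenAI-style data: lines and ignoring tool call deltas, usage stats, and error frames entirely:

use std::io::{BufRead, Write};

// Print content deltas as they arrive from an OpenAI-style SSE stream.
fn print_stream(reader: impl BufRead) -> std::io::Result<()> {
    for line in reader.lines() {
        let line = line?;
        // Chunks arrive as: data: {"choices":[{"delta":{"content":"Arr"}}]}
        let Some(payload) = line.strip_prefix("data: ") else { continue };
        if payload == "[DONE]" {
            break;
        }
        if let Ok(chunk) = serde_json::from_str::<serde_json::Value>(payload) {
            if let Some(token) = chunk["choices"][0]["delta"]["content"].as_str() {
                print!("{token}");
                std::io::stdout().flush()?; // render each token immediately
            }
        }
    }
    println!();
    Ok(())
}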

Examples Aren't Enough

At this point I was getting sick of editing my examples to tinker with my libraries. While examples can be incredibly useful for development, a simple terminal-based interface offers that extra bit of quality of life to keep you going. Plus, I had already established a CLI crate in my original implementation, though I ultimately wanted to redo it. There are a few great options for building terminal interfaces in Rust, like ratatui, but I went with rustyline, which is based on linenoise, a readline alternative.

Rustyline offers a simple readline interface that is easily powered by /slash commands such as /resume <conversation_id>. This was simple enough, and I had already gone down a rabbit hole once before by building a TUI with ratatui in the original repository. It started as an extremely simple interface, quickly gained vim bindings, emacs chords, and async display, and finally grew an emacs-like buffer implementation. It was pretty cool and worked decently enough for my needs, but through a lack of proper refactoring and interfaces it had taken over my repository. Suffice it to say, I won't be revisiting a proper TUI until I feel stable in the implementation, structures, and architecture. When I do, I'll likely opt for a client/server setup where a neuromance.d daemon runs and the TUI communicates with it over gRPC. That should really enforce keeping it isolated from the repository at large.
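
Back to rustyline: the whole interface reduces to a readline loop plus a little command dispatch, along the lines of the sketch below. The /resume handling is illustrative, not the CLI crate's actual behavior:

use rustyline::{error::ReadlineError, DefaultEditor};

fn main() -> rustyline::Result<()> {
    let mut rl = DefaultEditor::new()?;
    loop {
        match rl.readline("> ") {
            Ok(line) => {
                rl.add_history_entry(line.as_str())?;
                if let Some(id) = line.strip_prefix("/resume ") {
                    println!("resuming conversation {id}"); // hand off to the store
                } else {
                    println!("sending to the model: {line}"); // normal chat turn
                }
            }
            // Ctrl-C or Ctrl-D ends the session cleanly.
            Err(ReadlineError::Interrupted) | Err(ReadlineError::Eof) => break,
            Err(err) => return Err(err),
        }
    }
    Ok(())
}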

Feature Creep

Protomance is drowning in features at this point, and while my efforts to keep them isolated are holding strong, the majority of the changes live within the core underpinnings of the repository. That makes it basically unavoidable to ship them as one large feature commit in the open source repository.

I've gone back and forth with myself about porting the individual commits from protomance over to the open source neuromance. I have a clean commit history that tells a tale of features, fixes, performance optimizations, documentation updates, and more, and all of it could be worth bringing over to the open remote properly. However, I opted for one large feature commit because of how new the repository is and how subject everything is to change.

Wrapping Up

So I've finally refactored and rewritten some of my critical features, amounting to a hefty +6.8k/-922 LOC pull request. This proved I hadn't learned my lesson and left me facing a daunting final self-review.

Thankfully, Claude Code is a great companion and has excellent pull request features built in. I've also experimented with the Claude Code GitHub Action and found it useful, although for now I prefer having an ongoing pull request conversation in the Claude Code TUI instead. I still think it's worthwhile to have the @claude workflow ready to go in a private repository.

With that said, as much as I love open weight models, using a top model like Claude has a noticeable impact during development. Recreating that experience with open weight models is something I really aspire to and want to see for the community.

After three days of self-review with Claude at my side, I finally decided that if I didn't just push this out, it would go on forever and languish. So I set forth: I created my branch, pushed to origin, made a pull request, and merged, followed by pushing v0.0.3 to crates.io. I could finally get back to toiling away at my other features and ideas.

As expected, the moment a friend tried my newly established "baseline repository", a bug was found: I was passing None in my OpenAI API requests, which caused providers like OpenRouter to reject them. So I quickly fixed that and pushed out a v0.0.4.
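
If you've never hit this class of bug: serde serializes an unset Option as an explicit null unless told otherwise, and some providers reject null fields. I won't swear my patch looked exactly like this, and the field at fault here is illustrative, but the usual fix has this shape:

use serde::Serialize;

// Without the attribute, temperature: None serializes as "temperature": null,
// which some providers reject. With it, the field is omitted entirely.
#[derive(Serialize)]
struct ChatRequest {
    model: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    temperature: Option<f32>,
}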

In the end, I violated a lot of my principles to get this out the door, didn't self-correct on pull-requesting mountains of code, and still ended up with a silly bug. On to better things, at least: I have a follow-on post about what's next for the repository and how I'm going to keep building it up.