Earlier this week, I hooked my agent, Andy, up to a Twilio phone number.

Now, I can call Andy and unspool my thoughts, log any todos that come to mind, and even make audio journal entries.

Twilio transcribes the phone call; NanoClaw picks up the transcript and hands it to Andy, a Claude agent with access to my Obsidian vault. I wrote Andy instructions for exactly how to break down my calls and where each piece should go in Obsidian, which I’m using as my knowledge graph.

Andy creates and adds to sections: one for my journal, one for ideas I’m brainstorming, one for action items, and one for decisions made.

It identifies people, projects, and concepts I mention and links them to the relevant pages in my graph. If someone doesn’t have a page yet, Andy makes one. Action items get pulled out and added to my weekly todo list.

Andy dumps the raw transcript at the bottom of the call log. I considered using Whisper, but haven’t felt the need to because the transcription has been pretty accurate so far – even in the Chicago wind with AirPods.
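A processed call note might come out looking something like this – the section names are the real ones, but the headings, entries, and links here are hypothetical, using Obsidian’s `[[wikilink]]` syntax:

```markdown
# Call – 2026-01-28 (hypothetical example)

## Journal
Good walk in this morning, despite the wind.

## Ideas
- Voice-first capture beats typing when it's below freezing

## Action Items
- [ ] Follow up with [[Jesse Vincent]] about the post-meeting debrief idea

## Decisions
- Sticking with Twilio's transcription instead of [[Whisper]] for now

## Raw Transcript
(full Twilio transcript pasted here)
```

Anyone or anything in double brackets becomes a node in the graph, which is what makes the entity-linking step pay off.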

I did consider having Andy talk back, and I may even build that out, but for now, I like using him as a way to capture my thoughts.

When I was in San Francisco earlier this week, Jesse Vincent told me that he built a tool where his AI calls him after meetings so he can debrief over voice. I thought that was a great idea. Talking is faster, more fluid, and isn’t limited by your words per minute.

Around the same time, I’d made Obsidian my personal HQ and linked Clawdbot to it. I could text it over Telegram and it would figure out how to route whatever I sent it – a todo, grocery list, a passing thought. I started using it a lot on my walks when random thoughts popped up. But then the temperatures in Chicago dropped, my gloved hands stayed in my pockets, and texting was no longer an option. (Dictation ring companies, take note: Chicago hands are off limits five months a year.)

Harper recommended I get an 8BitDo Micro, a tiny Bluetooth controller that iOS registers as a regular Bluetooth keyboard.

There’s an Accessibility setting called Full Keyboard Access that lets you bind any key combo to a Siri Shortcut. So I bound the “A” button to a “Voice to Telegram” Siri Shortcut that I built. The shortcut records audio, transcribes it on-device using Apple Speech, and sends the text straight to my Telegram bot.
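The Telegram leg of that Shortcut is just one HTTPS call to the Bot API’s `sendMessage` method. Here’s a Python equivalent of what that final step does, with `BOT_TOKEN` and `CHAT_ID` as placeholders:

```python
import json
import urllib.parse
import urllib.request


def build_send_message(token: str, chat_id: str, text: str):
    """Build the sendMessage request URL and form-encoded payload."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    payload = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    return url, payload


def send_to_bot(token: str, chat_id: str, text: str) -> dict:
    """POST the message to Telegram; the API replies with {"ok": true, ...}."""
    url, payload = build_send_message(token, chat_id, text)
    with urllib.request.urlopen(urllib.request.Request(url, data=payload)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url, payload = build_send_message("BOT_TOKEN", "CHAT_ID", "add eggs to the grocery list")
    print(url)
```

In the Shortcut itself this is a single “Get Contents of URL” action; once the text lands in the chat, the bot takes over the routing.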

Did I build this? You bet.

Does it work? Yes. I pressed the controller and Clawdbot accurately captured and routed my asks to the relevant grocery lists, inboxes, and action lists.

But did I have a lot of issues with Siri Shortcuts? Also yes.

  • “Dictate Text” requires the screen to be on. So I walked to work with my phone screen lit up in my pocket.
  • Full Keyboard Access puts a blue border on your screen. It made the phone glitchy and weird to use for everything else.
  • Recording is either fixed duration or tap to stop. Neither works with gloves on. I set a 60-second timer and just waited it out every time, talking or not.

Could I have just done this with WhatsApp’s built-in voice messages and skipped Shortcuts entirely? Technically yes, but WhatsApp banned chatbots, and to get around it we’d need another phone for the bot with its own number.

If I were to do it over again, would I buy another phone? Yes.