I promised myself that I would write a “Lessons Learned” post at the beginning of the year. This is it. Short form: AI and related products are evolving at light speed. Blink, whisper, dream? You’ll be late. Don’t be late.
Transitions
- I served as Acting Technical Program Manager (TPM) for Amazon EKS.

- It was simultaneously the most challenging and most fulfilling position I’ve ever held. In truth, “glorified cat herding” is a fair description of the TPM role: coordinating multiple stakeholders, bridging engineering and product teams, and working side by side with engineers to keep momentum going on whatever you’re delivering. While I pulled out most of my hair, I came away with deeper relationships, a boost in confidence, and tricks for delivering massive projects on time. And despite all the chaos, collaborating with visionary executives and an exceptional team had a transformative impact on my growth.
- I built an AI engine called Blinkt AI in my free time.

- It started out as a personal GenAI productivity tool for synthesizing AI research articles from https://arxiv.org/ and asking questions about their content. I used it to digest the latest AI research and then folded what I learned back into the application itself. Fast forward: I developed the full stack from concept to production, including onboarding, JWT-based authentication, prompting techniques, context injection of domain experts (secret sauce! 😉), an in-house web scraper, an HTML Modular RAG pipeline with contextual retrieval (secret sauce! 😉), a Natural Language Processing (NLP) pipeline with causal inference (secret sauce! 😉), stateless usage-based pricing with Stripe and usage dashboards (very hard to do), and REST and WebSocket APIs with real-time data streaming. Lots of secret sauce! 😉
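To give a taste of why usage-based pricing is “very hard to do”: the core problem is metering, i.e., recording each billable event exactly once (even when clients retry) and aggregating per billing period before reporting to Stripe. Below is a minimal, Stripe-free sketch of that metering core; the class and method names are illustrative, not Blinkt AI’s actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class UsageMeter:
    # Maps idempotency key -> units, so retried requests aren't double-billed.
    events: dict = field(default_factory=dict)

    def record(self, idempotency_key: str, units: int) -> None:
        # Recording the same key twice is a no-op, which makes retries safe.
        self.events.setdefault(idempotency_key, units)

    def total_units(self) -> int:
        # Aggregate for the billing period; a real system would report this
        # to Stripe as metered usage against a subscription item.
        return sum(self.events.values())


meter = UsageMeter()
meter.record("req-001", 3)
meter.record("req-001", 3)  # client retry: ignored
meter.record("req-002", 5)
```

The idempotency key is the load-bearing detail: without it, a flaky network plus automatic retries quietly overcharges customers.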
Developing GenAI software end-to-end
Building Blinkt AI taught me just how hard it is to bring a new AI-powered product from concept to production. Rather than writing a book about everything it takes, I’ll focus briefly on the biggest technical gaps in LLMs “today” (1/12/2025), gaps that I, and any engineer building a production-level app, have to engineer around. In 2024, LLMs struggled most with long context windows and lacked causal reasoning, just to name a few weaknesses (secret sauce! 😉).
- LLMs become significantly less accurate as the context window grows: ask one about text from the middle of a book and it tends to lose exactly that middle. That means you have to supplement it with real-time data, domain-specific information, fine-tuning, or a combination of these. That’s why you need to build an in-house Retrieval Augmented Generation (RAG) pipeline that pulls the exact text fragments the model needs from a vector database, tailoring the LLM with up-to-date, domain-specific context[2], and using advanced RAG methodologies to improve contextual retrieval accuracy[3][4].
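The retrieval step of a RAG pipeline can be sketched in a few lines. This toy version uses bag-of-words cosine similarity in place of a real embedding model and vector database; every name here is illustrative, not Blinkt AI’s actual implementation.

```python
from collections import Counter
from math import sqrt


def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words token counts. A real pipeline would
    # call an embedding model and store dense vectors in a vector database.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the query and return the top-k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "Transformers use attention over the full context window.",
    "Retrieval augmented generation injects relevant chunks into the prompt.",
    "Stripe supports usage-based billing via metered prices.",
]
top = retrieve("how does retrieval augmented generation work", chunks, k=1)
# Only the retrieved fragments, not the whole corpus, go into the prompt.
prompt = f"Answer using this context:\n{top[0]}\n\nQuestion: ..."
```

The point of the pattern: the model only ever sees the handful of fragments most relevant to the question, which sidesteps the long-context accuracy problem described above.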
- There’s a very high probability that an LLM will produce inconsistent, completely vacuous, or hallucinatory responses if you don’t supplement it with a Natural Language Processing (NLP) pipeline that adds things like in-house causal inference and causal reasoning capabilities[5]. That’s why you need open-source Hugging Face models[6] and a hallucination detection system[7], and why so much research is going into supplementing LLMs with symbolic reasoning abilities[8] and Causal AI models with behavioral modeling[9].
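One cheap guardrail in such a pipeline is a grounding check: flag answer sentences whose content words barely overlap with the retrieved context. This is a toy sketch (a production system would use an NLI model or an LLM-as-a-judge[7]); the threshold, stopword list, and example data are all assumptions for illustration.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it"}


def content_words(text: str) -> set[str]:
    # Lowercase, keep alphabetic tokens, drop stopwords.
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}


def ungrounded_sentences(answer: str, context: str, threshold: float = 0.5) -> list[str]:
    # Flag sentences where fewer than `threshold` of the content words
    # appear anywhere in the retrieved context.
    ctx = content_words(context)
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = content_words(sentence)
        if words and len(words & ctx) / len(words) < threshold:
            flagged.append(sentence)
    return flagged


context = "Blinkt AI uses an HTML Modular RAG pipeline with contextual retrieval."
answer = ("Blinkt AI uses an HTML Modular RAG pipeline. "
          "It was acquired by Google in 2019 for two billion dollars.")
flagged = ungrounded_sentences(answer, context)  # the fabricated second sentence
```

Lexical overlap is a blunt instrument (paraphrases get flagged, fluent fabrications can slip through), which is exactly why the research cited above keeps pushing toward causal and symbolic methods.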
Learning the business side
I learned a lot about the business side of bringing an app to production. Blinkt AI started as a personal productivity app but quickly revealed a bigger market need: helping domain experts (e.g., legal, finance, healthcare) quickly extract actionable insights from large, ever-changing bodies of text. But achieving product-market fit, and marketing in a sea of AI slop while competing for visibility against hundreds of thousands of daily posts on X in a world of 8 billion+ people, is a bitch.
- I don’t have much of a reputation in the AI/ML space. Most people who have worked with me know that I am a developer/engineer who works as a technical writer. I initially built Blinkt AI in stealth, hoping to learn, do it right, and perfect its “secret sauce” without revealing the idea. However, I discovered that not sharing my progress on X early on meant missing out on branding myself a “real boy,” a GenAI engineer. Then Google released NotebookLM, offering similar functionality for free, which made me look like a copycat even though I had been working on the idea for a year.
- Nowadays, you’re competing against “AI bros.” Show of hands: who’s seen something like this on X (once, “Twitter”)? “I just spun up a clone of Perplexity AI in 5 minutes and it was wild.” (Me: try taking that to production, you little sh*t.) “Today” you can spin up a demo GenAI app for pretty much anything, but that 500-line demo will turn into 8,000 to 20,000+ lines of code on the way to production. Imagine you have an HTML Modular RAG and NLP pipeline with hallucination detection, with bi-directional asynchronous execution served over WebSockets to enable real-time data streaming. You then have to distill what makes your app or service unique onto a landing page.
Lightning fast evolving technology
The more I dug into AI, the more I realized how rapidly the space is moving. In 2024, AI surpassed human experts on PhD-level questions[10], and just as my app entered beta, AI agents started redefining productivity. This is a clarion call to never stand still. If you’re not on X (previously, Twitter), where so much of the AI/ML community lives, you don’t know how fast AI is progressing. The window is closing.
- Remember when the definition of “The Singularity” used to be AI becoming capable of designing and creating even more intelligent AI, leading to an “intelligence explosion”? Well, now we have agents capable of performing real-world tasks, including creating more agents. Palisade Research[11] set up an o1-preview agent for a chess match, and it independently hacked the match so it could win. No “jailbreak” prompt required. That’s the Kobayashi Maru (Star Trek reference), AI edition.
- According to Ray Kurzweil[12], by 2030 we’re going to achieve a milestone called longevity escape velocity. Right now, living through one year uses up one year of your longevity; however, scientific advancements are progressing so rapidly, bringing new cures and treatments, that some of that time is being clawed back. By Kurzweil’s estimate, we currently recover about 4 months of longevity for every year that passes; at escape velocity, we’d recover more than a year per year.
- Dario Amodei[13] envisions that by 2026, AI could outperform human experts in fields from mathematics to creative writing: “Yes, certain roles may be displaced, but we’ll also see a surge in demand for prompt engineers, machine learning researchers, and architects of next-generation AI solutions.”
Looking Ahead
That’s all for now. Building Blinkt AI and proving my engineering chops as Acting TPM helped propel me into the future, starting with leading AI/ML docs for Amazon EKS this year. Lots of exciting things are planned. Still, I plan to keep improving Blinkt AI behind the scenes, starting by swapping my HTML Modular RAG pipeline for an agentic RAG pipeline. If you’d like to follow my journey, find me at @tuck_dx.
Sources
[2] Retrieval-Augmented Generation for Large Language Models: A Survey: https://arxiv.org/html/2312.10997v5
[3] Modular RAG: Transforming RAG Systems into LEGO-like Reconfigurable Frameworks: https://arxiv.org/html/2407.21059v1
[4] HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems: https://arxiv.org/html/2411.02959v1
[5] Improving Causal Reasoning in Large Language Models: A Survey: https://arxiv.org/html/2410.16676v3
[6] Hugging Face Models: https://huggingface.co/models
[7] Structured Generation for LLM-as-a-Judge Evaluations: https://www.comet.com/site/blog/structured-generation-llm-as-a-judge/
[8] Easy Problems That LLMs Get Wrong: https://arxiv.org/html/2405.19616v2
[9] Intention Knowledge Graph Construction for User Intention Relation Modeling: https://arxiv.org/html/2412.11500v1
[10] Why AI Progress Is Increasingly Invisible: https://time.com/7205359/why-ai-progress-is-increasingly-invisible/
[11] o1-preview autonomously hacked its environment rather than lose to Stockfish in our chess challenge. No adversarial prompting needed.: https://x.com/PalisadeAI/status/1872666169515389245
[12] Ray Kurzweil on Longevity Escape Velocity: https://www.youtube.com/watch?v=A-ygOJo0JS8
[13] Amodei believes that by 2026, AI could outperform human intelligence in multiple fields, significantly impacting science, arts, and engineering sectors: https://techcrunch.com/2024/10/11/anthropic-ceo-goes-full-techno-optimist-in-15000-word-paean-to-ai/