BlinkAI

Desktop AI assistant that combines local chat, screen capture, voice input, MCP tools, Composio integrations, local memory, and cloud-backed workflow automation into one workspace

Timeline: 2026
Role: Full Stack AI Engineer
Team: Solo
Status: In-progress

Technology Stack

TypeScript
React
Electron
Mastra
MCP
Composio
Bun
AWS
Tailwind CSS
Deepgram
LibSQL

Key Challenges

  • Desktop Agent UX
  • Harness Pool and Thread Lifecycle
  • MCP and Composio Tool Routing
  • Voice and Screen Context
  • Local Memory
  • AWS-backed File, Graph, and Queue Workflows

Key Learnings

  • Mastra Agent Architecture
  • Electron Desktop Surfaces
  • MCP Tool Orchestration
  • Composio Workflow Integrations
  • Harness Pool Design
  • AWS S3, DynamoDB, and SQS
  • Voice-first Assistant UX

BlinkAI: Desktop AI Assistant

Overview

BlinkAI is a desktop AI assistant built around Electron, React, TypeScript, Bun, and Mastra. It is designed to live where work actually happens: on the desktop, close to the screen, files, browser context, messages, and the tools that usually force people into a stack of tabs.

The product idea is simple: keep the assistant close enough to observe context, but give it enough tool access to do useful work. BlinkAI combines local chat, screen capture, voice input, file handling, persistent memory, MCP servers, Composio integrations, and AWS-backed workflow storage into one agent workspace.

Why I Built This

Most AI productivity workflows still ask the user to leave the task they are doing. A developer sees an error, opens a browser, pastes context into ChatGPT, checks GitHub, opens docs, and comes back with half the original thread lost. A product or ops person does the same loop across Gmail, calendar, WhatsApp, docs, and issue trackers.

BlinkAI explores a different interface: a desktop assistant that can read visible context, accept voice or chat input, route into the right tools, and complete small workflows without turning every request into a browser-tab detour.

The goal is not just "chat on desktop." The goal is a local command surface for real work: ask, inspect, route, act, remember.

Key Capabilities

  • Desktop AI chat for drafting, summarizing, debugging, planning, and task execution
  • Screen capture and OCR context so the assistant can reason about what is visible, not just what is pasted
  • Voice input with Deepgram transcription and voice activity detection for hands-free interaction
  • File handling for documents, images, code context, and user uploads directly inside the assistant loop
  • MCP and Composio routing for GitHub, Gmail, Google Calendar, WhatsApp, and a wider tool ecosystem
  • Mastra agent modes for fast answers, planning, and build-style task execution
  • Local memory backed by LibSQL and FastEmbed-style semantic retrieval
  • AWS-backed infrastructure using S3 for objects, DynamoDB for knowledge graph state, and SQS for async work

Architecture

BlinkAI is organized as a desktop-first agent system with a local interaction layer, a routing layer, a Mastra agent core, a tool layer, and storage split between local memory and cloud workflow state.

Figure: Technical architecture (core components, runtime flow, and durability boundaries).

This structure keeps the interface responsive while still giving the assistant access to the systems that make it useful beyond plain chat.

Request Flow

A typical request moves through the system in a few stages:

1. The user asks a question from the desktop app, web widget, or WhatsApp bridge.
2. The router sends the request into the active thread in the harness pool.
3. The harness chooses the right agent path: FAST, PLAN, or BUILD.
4. The agent gathers context from memory, screen capture, files, or voice input.
5. The tool orchestrator selects MCP, Composio, workspace, or AWS-backed tools.
6. Long-running work is pushed into queue/storage paths instead of blocking the UI.
7. The response streams back while the conversation and useful context are persisted.

For example, "summarize the emails from the design team and draft replies" can become a routed workflow: pull Gmail context through Composio, retrieve user tone from memory, create draft responses, schedule follow-ups, and keep the thread state available for the next instruction.
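
Step 2 of the flow, routing a request into the active thread in the harness pool, might look roughly like the sketch below. It is not BlinkAI's actual implementation: the `Harness` interface, the `HarnessPool` class, and the least-recently-used eviction policy are all assumptions chosen to show one plausible thread lifecycle.

```typescript
// Hypothetical harness: one per conversation thread. The real Mastra-backed harness is not shown here.
interface Harness {
  threadId: string;
  handle(message: string): string;
}

class HarnessPool {
  // A Map preserves insertion order, so the first key is the least recently used thread.
  private harnesses = new Map<string, Harness>();
  private capacity: number;
  private create: (threadId: string) => Harness;

  constructor(capacity: number, create: (threadId: string) => Harness) {
    this.capacity = capacity;
    this.create = create;
  }

  // Return the harness for a thread, creating it on first use and
  // evicting the least recently used harness when the pool is full.
  acquire(threadId: string): Harness {
    const existing = this.harnesses.get(threadId);
    if (existing) {
      // Re-insert to mark this thread as most recently used.
      this.harnesses.delete(threadId);
      this.harnesses.set(threadId, existing);
      return existing;
    }
    if (this.harnesses.size >= this.capacity) {
      const oldest = this.harnesses.keys().next().value;
      if (oldest !== undefined) this.harnesses.delete(oldest);
    }
    const harness = this.create(threadId);
    this.harnesses.set(threadId, harness);
    return harness;
  }

  activeThreads(): string[] {
    return Array.from(this.harnesses.keys());
  }
}

let created = 0;
const pool = new HarnessPool(2, (threadId) => {
  created++;
  return { threadId, handle: (message: string) => `[${threadId}] ${message}` };
});

pool.acquire("desktop-1");
pool.acquire("whatsapp-7");
pool.acquire("desktop-1"); // reused, now the most recent thread
pool.acquire("widget-3");  // pool is full, so "whatsapp-7" is evicted
console.log(pool.activeThreads()); // [ 'desktop-1', 'widget-3' ]
```

Capping the pool keeps memory bounded on the desktop; an evicted thread would need its state reloaded from persisted conversation history on the next message.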

Component Breakdown

| Layer | Responsibility | Implementation |
|---|---|---|
| Desktop surface | Persistent assistant UI close to the user's work | Electron, React, TypeScript, Tailwind CSS |
| Routing layer | Accept messages and route them into active threads | Hono routes, API handlers, harness pool |
| Agent core | Decide whether the task needs a fast answer, a plan, or execution | Mastra agent modes |
| Context capture | Bring the user's real working context into the loop | Screen capture, OCR, files, Deepgram voice input |
| Local memory | Keep conversation and semantic recall available without forcing everything into cloud state | LibSQL, embeddings |
| Tool orchestration | Connect the assistant to real workflows | MCP servers, Composio, workspace tools |
| Cloud workflow state | Handle durable files, graph state, and async tasks | AWS S3, DynamoDB, SQS |

Local-First vs Cloud-Backed

BlinkAI keeps the interaction loop local-first where it matters: desktop UI, visible context, thread state, and semantic memory can stay close to the user. That keeps the assistant fast and avoids making every action depend on a remote product shell.

The cloud layer is used for the parts that benefit from durability and asynchronous execution:

  • S3 stores uploaded files, shared documents, and object context.
  • DynamoDB acts as the persistent knowledge graph layer for entities, workflows, and relationships.
  • SQS handles background work, delayed jobs, scraping tasks, and workflows that should not block the chat UI.

That split is the core architecture choice: local responsiveness for interaction, cloud durability for workflows.
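
The local versus cloud split can be expressed as a small routing function. This is a sketch under stated assumptions: the `WorkItem` categories and the `destinationFor` helper are hypothetical, and no AWS SDK calls are made. It only shows which kind of work lands where according to the description above.

```typescript
// Hypothetical work items; the categories mirror the split described in this section.
type WorkItem =
  | { kind: "chat-turn"; threadId: string }
  | { kind: "memory-write"; text: string }
  | { kind: "file-upload"; fileName: string }
  | { kind: "graph-update"; entity: string }
  | { kind: "background-job"; jobName: string };

type Destination = "local" | "s3" | "dynamodb" | "sqs";

// Decide where a piece of work is stored or executed.
function destinationFor(item: WorkItem): Destination {
  switch (item.kind) {
    case "chat-turn":
    case "memory-write":
      return "local"; // interaction state stays on the desktop
    case "file-upload":
      return "s3"; // durable object storage
    case "graph-update":
      return "dynamodb"; // persistent entity and relationship state
    case "background-job":
      return "sqs"; // queued so it does not block the chat UI
  }
}

const items: WorkItem[] = [
  { kind: "chat-turn", threadId: "desktop-1" },
  { kind: "file-upload", fileName: "design-brief.pdf" },
  { kind: "background-job", jobName: "scrape-docs" },
];

console.log(items.map((item) => `${item.kind} -> ${destinationFor(item)}`));
// [ 'chat-turn -> local', 'file-upload -> s3', 'background-job -> sqs' ]
```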

Agent Modes

BlinkAI uses multiple agent paths instead of treating every request as the same kind of problem.

  • FAST is for short, low-tool answers where loading heavy context would waste time.
  • PLAN is for multi-step tasks that need decomposition before execution.
  • BUILD is for action-heavy workflows where the assistant may call tools, touch files, or coordinate external services.

This keeps simple questions lightweight while still leaving room for deeper automation.
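
One way to picture the choice between the three paths is a small rule-based selector. This write-up does not say how BlinkAI actually classifies requests, so the `RequestSignals` fields and the rules in `selectMode` are assumptions made purely for illustration.

```typescript
type AgentMode = "FAST" | "PLAN" | "BUILD";

// Hypothetical signals; a real system might derive these from a classifier or from the model itself.
interface RequestSignals {
  estimatedSteps: number;  // rough count of distinct sub-tasks in the request
  needsToolCalls: boolean; // whether the request will call tools or touch files
}

// Pick the lightest mode that can handle the request.
function selectMode(signals: RequestSignals): AgentMode {
  if (signals.needsToolCalls) return "BUILD";
  if (signals.estimatedSteps > 1) return "PLAN";
  return "FAST";
}

console.log(selectMode({ estimatedSteps: 1, needsToolCalls: false })); // FAST
console.log(selectMode({ estimatedSteps: 4, needsToolCalls: false })); // PLAN
console.log(selectMode({ estimatedSteps: 3, needsToolCalls: true }));  // BUILD
```

Checking the tool signal first is a simplification; a request that needs both decomposition and tool use could just as reasonably pass through PLAN before BUILD.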

Tool Routing

The tool layer combines two ideas:

  • MCP servers provide a standard way to attach local or community tools.
  • Composio integrations provide broad access to SaaS workflows such as GitHub, Gmail, Google Calendar, and more.

The result is an assistant that can move from "what does this error mean?" to "create an issue, attach the context, and draft the follow-up" without changing surfaces.
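
The idea of one tool layer over several providers can be sketched as a registry that resolves tools by name. The sketch deliberately avoids the real MCP and Composio APIs: the `ToolEntry` shape, the `ToolRegistry` class, the tool names, and the stubbed `run` functions are all hypothetical, and real tools would be asynchronous and schema-driven.

```typescript
type Provider = "mcp" | "composio" | "workspace";

// Hypothetical tool shape used only for this illustration.
interface ToolEntry {
  name: string;
  provider: Provider;
  run(input: Record<string, unknown>): string;
}

class ToolRegistry {
  private tools = new Map<string, ToolEntry>();

  // Register a tool, rejecting duplicate names so routing stays unambiguous.
  register(tool: ToolEntry): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`duplicate tool: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Look up a tool by name and run it, regardless of which provider supplies it.
  call(name: string, input: Record<string, unknown>): string {
    const tool = this.tools.get(name);
    if (!tool) {
      throw new Error(`unknown tool: ${name}`);
    }
    return tool.run(input);
  }

  // List the tool names supplied by one provider.
  namesFor(provider: Provider): string[] {
    return Array.from(this.tools.values())
      .filter((tool) => tool.provider === provider)
      .map((tool) => tool.name);
  }
}

const registry = new ToolRegistry();
registry.register({
  name: "github.create_issue", // stub standing in for a Composio-backed action
  provider: "composio",
  run: (input) => `issue created: ${String(input["title"])}`,
});
registry.register({
  name: "fs.read_file", // stub standing in for a local MCP server tool
  provider: "mcp",
  run: (input) => `contents of ${String(input["path"])}`,
});

console.log(registry.call("github.create_issue", { title: "Fix build error" }));
// issue created: Fix build error
console.log(registry.namesFor("mcp")); // [ 'fs.read_file' ]
```

Because the agent only sees tool names, a tool can move from one provider to another without changing how the agent calls it.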

What I Built

The project touches the full stack of an agent product:

  • Electron desktop shell and React UI
  • Mastra-based agent orchestration
  • Hono/API routing and harness lifecycle management
  • screen, voice, and file context ingestion
  • local semantic memory
  • MCP and Composio tool integrations
  • AWS-backed file, graph, and queue infrastructure
  • demo and web preview surface

What This Shows

BlinkAI shows the kind of AI system I like building: product-shaped, tool-aware, and close to real user workflows. It is not just a prompt wrapper. It combines desktop UX, agent architecture, memory, workflow automation, cloud infrastructure, and integration design into one assistant surface.

The most important proof is the architecture: BlinkAI is built around routing, context, memory, and execution. That is the difference between a chatbot demo and an assistant that can become part of someone's actual operating system for work.

Made with ❤️ by Mohit Goyal
© 2026. All rights reserved.