Heart Cove

~6 min read

heartcove.app
Next.js · TypeScript · Firebase · MongoDB · SST · AWS Lambda · Gemini · RAG

Overview

Heart Cove is a private digital space for couples I built as a personal project. It covers a shared photo feed with real-time reactions and comments, notes and checklists (shared or private), daily habit tracking, and a dual-mode expense tracker with automated monthly PDF reports. An AI assistant backed by RAG retrieval lets us surface memories and notes through natural language queries.

Deployed at heartcove.app on AWS via SST v4, the stack uses Next.js 16 App Router, Firebase for real-time state and media, MongoDB as the primary database, and Google Gemini for generation and image captioning.

My Role

I designed and built the entire product from scratch: infrastructure as code via SST, backend server actions, frontend, Firebase integrations, and the AI and embedding pipeline. Being both the sole engineer and a primary user kept every decision grounded. Several patterns I validated here have since carried over into production work.

Architecture

The system uses two storage layers divided by access pattern. MongoDB is the source of truth for all durable data, and all reads and mutations flow through Next.js server actions. Firebase handles the narrow set of data that genuinely needs real-time sync: comment threads, emoji reactions, and push notifications via FCM. Media goes to Firebase Storage via direct browser upload, bypassing Lambda entirely to sidestep its payload size constraints (synchronous invocation payloads cap out around 6 MB); a Cloud Function then optimizes each image and triggers the captioning pipeline.

Firebase and MongoDB: Division of Responsibility

Rather than forcing one database to do everything, the architecture leans into what each system does best. MongoDB handles complex queries, aggregations, and durable writes. Firebase RTDB handles the narrow slice of data that needs instant sync across both clients. This keeps Firebase usage tight and costs predictable.

Serverless Next.js with SST

The Next.js app runs on Lambda via OpenNext with response streaming enabled. This is a hard requirement for the AI assistant: without it, Lambda buffers the full response before returning anything, which hits the default function timeout on longer generations. Streaming lets the AI SDK deliver chunks to the client as they arrive. Infrastructure is defined entirely in TypeScript through SST v4, with a warmer cron to reduce cold start frequency during active hours.
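As a rough sketch, the infrastructure definition lives in a single `sst.config.ts`. Component names follow SST v4's `sst.aws.Nextjs` and `sst.aws.Cron`, but treat the exact option fields (`domain`, `schedule`, the handler path) as assumptions rather than the project's actual config:

```typescript
// sst.config.ts — minimal sketch, not the production config.
// Field names are assumptions; check the SST docs for the exact shapes.
export default $config({
  app() {
    return { name: "heartcove", home: "aws" };
  },
  async run() {
    // Next.js on Lambda via OpenNext; response streaming is enabled in
    // the OpenNext config so AI answers flush to the client chunk by chunk.
    new sst.aws.Nextjs("Site", {
      domain: "heartcove.app",
    });

    // Warmer cron: periodic pings keep a server instance warm
    // during active hours to reduce cold-start frequency.
    new sst.aws.Cron("Warmer", {
      schedule: "rate(5 minutes)",
      function: "functions/warm.handler",
    });
  },
});
```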

AI Assistant and RAG

The assistant answers questions across the full history of shared memories and notes. Queries like "what were we doing last New Year's?" or "find my note about the Bohol trip" retrieve semantically relevant content from MongoDB and inject it as context before generation.
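The retrieval step itself is cosine-similarity ranking over stored embeddings. A self-contained sketch of that ranking, with tiny hand-written vectors standing in for the MongoDB embeddings and the Gemini-embedded query:

```typescript
// Rank stored memories by cosine similarity to a query embedding.
// In production the vectors come from MongoDB and a Gemini embedding model;
// here small hand-written vectors stand in for both.
type Memory = { id: string; text: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], memories: Memory[], k: number): Memory[] {
  return [...memories]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

const memories: Memory[] = [
  { id: "1", text: "New Year's fireworks on the rooftop", embedding: [0.9, 0.1, 0.0] },
  { id: "2", text: "Note about the Bohol trip", embedding: [0.1, 0.9, 0.2] },
  { id: "3", text: "Grocery checklist", embedding: [0.0, 0.2, 0.9] },
];

// A query vector close to memory 1 ranks it first; the top hits become
// the context injected into the prompt before generation.
const results = topK([0.85, 0.15, 0.05], memories, 2);
console.log(results.map((m) => m.id)); // ["1", "2"]
```

The real system pushes this ranking down into the database rather than doing it in application code, but the scoring is the same.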

Embedding Visual Memories

A photo feed presents a harder retrieval problem than notes alone. Most posts have short captions, so embedding only the caption text produces weak recall for descriptive queries.

The fix was to generate a Gemini vision caption for each uploaded image and combine it with the user's own caption before producing the embedding. The resulting vector encodes both what the user wrote and what is visually present in the photo, so queries that describe a scene, activity, or location surface the right memories even when the original captions were brief. Embeddings are stored in MongoDB alongside the rest of the data. Using a dedicated vector database was not warranted at this scale, and keeping everything in one store simplifies the operational surface.
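The fusion step is simple: concatenate the two captions into one text before embedding. A minimal sketch (the helper name `buildEmbeddingText` is hypothetical, not the app's actual function):

```typescript
// Hypothetical helper: fuse the user's caption with the Gemini vision
// caption into one text that gets embedded. The resulting vector encodes
// both what the user wrote and what is visually present in the photo.
function buildEmbeddingText(userCaption: string, visionCaption: string): string {
  // Skip empty parts so a missing caption doesn't leave a blank line.
  const parts = [userCaption.trim(), visionCaption.trim()].filter(Boolean);
  return parts.join("\n");
}

// Even a brief user caption yields a descriptive embedding input.
const text = buildEmbeddingText(
  "best day 💛",
  "Two people on a white-sand beach at sunset, palm trees behind them."
);
console.log(text);
// best day 💛
// Two people on a white-sand beach at sunset, palm trees behind them.
```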

Where It's Going

The current AI layer is retrieval-only: Clove (the in-app assistant) can answer questions about shared memories and notes but cannot act on them. The next phase is moving from a passive assistant to an action-oriented agent with a broader surface area.

Agentic Actions on App Data

Clove should be able to do what the user can do: create a feed post, write or edit a note, delete content they own, all from a natural language instruction. This means exposing the existing server actions as tools the agent can call, with ownership checks enforced at the tool layer the same way they are for direct user mutations. The agent orchestration would run through the AI SDK's tool-calling interface, keeping the same streaming pipeline already in place.
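The shape of that tool layer can be sketched without the AI SDK itself: a tool wraps an existing mutation and runs the ownership check before anything is written. Everything here (`deleteNoteTool`, the in-memory `notes` map) is a hypothetical stand-in, not the app's real server actions:

```typescript
// Sketch of the planned tool layer: a tool wraps an existing mutation and
// enforces ownership before it runs, exactly as direct user mutations do.
// The real orchestration would go through the AI SDK's tool-calling
// interface; this strips that away to show the authorization pattern.
type Note = { id: string; ownerId: string; body: string };

// In-memory stand-in for the MongoDB notes collection.
const notes = new Map<string, Note>([
  ["n1", { id: "n1", ownerId: "alice", body: "Bohol itinerary" }],
]);

// Ownership check enforced at the tool layer, before any mutation runs.
function assertOwnership(note: Note | undefined, userId: string): Note {
  if (!note) throw new Error("not found");
  if (note.ownerId !== userId) throw new Error("forbidden");
  return note;
}

const deleteNoteTool = {
  description: "Delete a note the current user owns",
  execute({ noteId, userId }: { noteId: string; userId: string }): string {
    assertOwnership(notes.get(noteId), userId);
    notes.delete(noteId);
    return `deleted ${noteId}`;
  },
};

// The owner can delete; any other caller is rejected before the mutation.
console.log(deleteNoteTool.execute({ noteId: "n1", userId: "alice" }));
```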

A Couple-Aware Assistant with Web Access

Beyond the app's private data, there is a class of queries that need external context: date ideas nearby, what to do for an anniversary, how to navigate a recurring argument. The plan is to extend Clove with web search as a tool, combined with a couple-specific system prompt that gives the model grounding in the relationship context it already has from the feed and notes. The goal is an assistant that can blend personal history with real-world information in a single response, rather than treating them as separate modes.

Habit and Expense Analysis

The habit tracker and expense data are currently query-only. Extending the agent to cover these would let Clove surface patterns: spending trends across months, habit streaks worth calling out, or budget forecasts based on the current month's pace. The data is already in MongoDB in a structured shape. It is mostly a matter of writing the tool definitions and deciding what the agent should proactively surface versus only answer when asked.
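A budget forecast from the current month's pace is one of the simpler tool definitions to imagine. A hedged sketch (the function name and signature are hypothetical, not existing app code):

```typescript
// Hypothetical tool body: project the month's total spend by scaling the
// spend so far by days elapsed versus days in the month.
function forecastMonthlySpend(
  spentSoFar: number,
  dayOfMonth: number,
  daysInMonth: number
): number {
  if (dayOfMonth < 1 || dayOfMonth > daysInMonth) {
    throw new Error("dayOfMonth out of range");
  }
  return (spentSoFar / dayOfMonth) * daysInMonth;
}

// 6,000 spent by the 10th of a 30-day month projects to 18,000.
console.log(forecastMonthlySpend(6000, 10, 30)); // 18000
```

The agent version would pull `spentSoFar` from the expense collection in MongoDB and return the projection as a tool result for Clove to phrase.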