2024

Bookmarked: Your Go-To Tool for Curating Tweets in Notion

A full-stack application that enables users to save tweets to Notion via a Telegram bot.

Overview

Bookmarked is a personal bookmarking tool inspired by Save to Notion. It features a web interface and a Telegram bot as clients, enabling users to save tweets and threads simply by forwarding URLs. The backend retrieves content from Twitter using web scraping.

Curious about what it looks like in action? Check out my personal collection of bookmarked tweets here.

Goals and Motivation

X (formerly known as Twitter) is a treasure trove of tech insights and tips, and I often found myself collecting valuable information there. Initially, I used the like button to save these tweets, but retrieving specific information later became tedious and inefficient.

That’s when I discovered Save to Notion, which was a game changer. It combined the best of both worlds—useful information and an organized second brain, making it easy to retrieve saved content.

While it served its purpose, as a free user, I encountered several limitations. The biggest inconvenience was the need to publicly mention the SaveToNotion Twitter bot in a reply to save a tweet or thread.

Saving via Mentions ✨ Just mention me @SaveToNotion in a reply to the tweet/thread you wanna save. With one of these hashtags: #tweet or #thread. Note: if your account is private🔒 You won't be able to use this method. Use Saving via DM instead 📷

So, I built my own bookmarking tool to save tweets to Notion, tailored to my preferences. Using a Telegram bot as the client, it allows me to forward tweet links directly to the bot, especially on mobile devices, without the need for public mentions.

Meet my NT Stacks

Notion, Next.js, NestJS, and Telegram, split across three repositories: backend, full-stack web application, and Telegram bot.

Notion

Stores all bookmarked tweets. Each tweet or thread is saved as an individual page in a Notion database with the following properties: Author, Tags, Tweet Date, Tweet Link, and Type (Tweet/Thread).
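
To make the page layout concrete, here is a minimal sketch of how those five properties could be expressed as the `properties` object of a Notion "create page" request. The `Bookmark` interface and `toNotionProperties` are hypothetical names, and the property types are assumptions (Author as the title property, Tags as a multi-select, Type as a select); the actual database may be configured differently.

```typescript
// Hypothetical shape of a scraped bookmark; field names are illustrative.
interface Bookmark {
  author: string;
  tags: string[];
  tweetDate: string; // ISO 8601 date, e.g. "2024-12-02"
  tweetLink: string;
  type: "Tweet" | "Thread";
}

// Builds the `properties` object for a Notion "create page" request.
// Assumption: Author is the title property, Tags a multi-select,
// Tweet Date a date, Tweet Link a URL, and Type a select.
function toNotionProperties(b: Bookmark) {
  return {
    Author: { title: [{ text: { content: b.author } }] },
    Tags: { multi_select: b.tags.map((name) => ({ name })) },
    "Tweet Date": { date: { start: b.tweetDate } },
    "Tweet Link": { url: b.tweetLink },
    Type: { select: { name: b.type } },
  };
}
```

The returned object would be passed as `properties` alongside a `parent` database ID when creating the page.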

NestJS

Powers the back-end, built with TypeScript for enhanced type safety. This project was my introduction to NestJS, offering a great learning experience.

  • Puppeteer: Utilized as a headless browser for web scraping.
  • BullMQ: Handles queue management for the scraping process, seamlessly integrated with NestJS.
  • EventEmitter2: Constructs responses to send real-time updates to the client.
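
The interplay between the queue and the progress events can be sketched without either library. The class below is a dependency-free stand-in, not BullMQ's or EventEmitter2's API: `ScrapeQueue`, `onProgress`, and the `scrape` callback are hypothetical names that only illustrate the pattern of processing jobs one at a time and notifying listeners after each one.

```typescript
type ProgressListener = (done: number, total: number, url: string) => void;

// In-memory stand-in for a job queue that reports progress to listeners.
class ScrapeQueue {
  private jobs: string[] = [];
  private listeners: ProgressListener[] = [];

  onProgress(listener: ProgressListener): void {
    this.listeners.push(listener);
  }

  add(url: string): void {
    this.jobs.push(url);
  }

  // Processes queued URLs sequentially and returns how many were handled.
  async run(scrape: (url: string) => Promise<void>): Promise<number> {
    const total = this.jobs.length;
    let done = 0;
    while (this.jobs.length > 0) {
      const url = this.jobs.shift() as string;
      await scrape(url);
      done += 1;
      // Notify every listener once the job has finished.
      for (const listener of this.listeners) listener(done, total, url);
    }
    return done;
  }
}
```

A real queue adds what this sketch leaves out, such as retries, concurrency limits, and persistence across restarts.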

Next.js

Builds the web app and handles authentication using Auth.js. It also manages user data and acts as a proxy to communicate with the NestJS backend.

Telegram

Functions as the client for submitting bookmark requests and receiving responses about bookmark status. Built using Telegraf.js, it provides a convenient interface for mobile users to forward tweet links effortlessly.
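
Before a forwarded message can become a bookmark request, the tweet link has to be pulled out of the message text. A minimal sketch of that step follows; `parseTweetUrl` and `TweetRef` are hypothetical names, and the pattern assumes the usual `twitter.com` or `x.com` status URL shape.

```typescript
interface TweetRef {
  username: string;
  tweetId: string;
}

// Matches twitter.com / x.com status links, with an optional "www." or
// "mobile." prefix. Query strings after the ID are ignored.
const TWEET_URL =
  /https?:\/\/(?:www\.|mobile\.)?(?:twitter|x)\.com\/([A-Za-z0-9_]{1,15})\/status\/(\d+)/;

// Returns the first tweet reference found in the text, or null if none.
function parseTweetUrl(messageText: string): TweetRef | null {
  const match = TWEET_URL.exec(messageText);
  if (!match) return null;
  return { username: match[1], tweetId: match[2] };
}
```

A bot handler could call this on the incoming message text and reply with an error when it returns `null`.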

Features

Save Tweets Privately

No need to mention a Twitter bot publicly. Just forward the tweet or thread’s URL to the Telegram bot.

Never Lose Your Bookmarked Tweet Again

The media might disappear if the author deletes the post, goes private, or deactivates their account, but the text content will not. Once a tweet is successfully bookmarked, it stays in Notion until you delete it manually.

Reading Experience

Say goodbye to the “Show more” button. Once saved to Notion, you can read a long thread conveniently without having to press the “Show more” button to expand each tweet’s full content.

From this

Tweet thread with "Show more" button to expand each tweet content

to this

Bookmarked tweet thread with full content

Easily Organize and Share Your Bookmarks Collection

Customize how you view your bookmarked tweets. Organize them by tags, display them as a list or table, and explore a variety of options. Transform your collection into a public site that anyone can access. Share your bookmarks with friends, including long threads, even with those who don’t use Twitter—no account needed to read them!

Real-Time Bookmarking Process Updates

Get live updates on which tweet is currently being scraped and how many tweets have been processed so far. Currently, this feature is only available on the web.

Challenges and How I Overcame Them

Twitter API Costs vs. Web Scraping: My Workaround

As someone who can’t justify paying for the Twitter API just for this personal project, I decided to try web scraping instead. At first, I was skeptical about this method. My initial approach was to scrape the HTML and manually parse it. To do so, I had to wait until the page fully loaded. Plus, I could only scrape one tweet at a time, which was very time-consuming.

Then, I wondered if it would be possible to scrape the API response instead.

Turns out, it was!  🥳

Fortunately, Puppeteer provides an interface to listen to network activity. This made the scraping process much faster since I didn’t have to wait for the entire page to load. The best part? The data was already structured in JSON format, so it required less effort to map compared to raw HTML.
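
A rough sketch of the idea: register a `response` listener on the page and keep only the JSON bodies coming from the API endpoint of interest. The interfaces below mirror just the Puppeteer members used (`page.on("response", ...)`, `response.url()`, `response.json()`) so the snippet stands alone. The `TweetDetail` URL marker and the function names are assumptions for illustration; the real endpoint should be confirmed in the browser's network tab.

```typescript
// Minimal structural types covering only the Puppeteer members used here.
interface ResponseLike {
  url(): string;
  json(): Promise<unknown>;
}
interface PageLike {
  on(event: "response", handler: (response: ResponseLike) => void): unknown;
}

// Assumption: tweet data arrives from a GraphQL endpoint whose URL
// contains this marker.
const API_MARKER = "TweetDetail";

function isTweetApiResponse(url: string): boolean {
  return url.includes("/graphql/") && url.includes(API_MARKER);
}

// Registers a listener and hands every matching JSON body to `onData`.
function captureTweetJson(page: PageLike, onData: (body: unknown) => void): void {
  page.on("response", (response) => {
    if (!isTweetApiResponse(response.url())) return;
    response
      .json()
      .then(onData)
      .catch(() => {
        // The body was not valid JSON or is no longer available; skip it.
      });
  });
}
```

The listener needs to be attached before navigating to the tweet URL, otherwise early responses are missed.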

TMI, there’s actually an easier way to get tweet data without web scraping, which I only discovered after finishing this project 🤡. You can check out this video for more details. I’ve already tried it and used it to embed this tweet. Well, at least it was a nice try, right?

Vercel Free Plan Limitations

As mentioned earlier, my Next.js app acts as a proxy for my NestJS backend, including forwarding SSE responses from the backend to the web client. In this process, the server side of the Next.js app first receives the SSE response from the NestJS backend. It then forwards this response to the client side of the Next.js app.

SSE response flow from the backend to the web client
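
For reference, each update travelling along that flow is a small block of text in the `text/event-stream` format. The sketch below shows how one progress update could be serialized; `ScrapeProgress`, its fields, and the `progress` event name are hypothetical, chosen only to illustrate the wire format.

```typescript
// Hypothetical progress payload; the field names are illustrative.
interface ScrapeProgress {
  current: number;
  total: number;
  status: "scraping" | "done" | "failed";
}

// Serializes one event in the text/event-stream wire format: an optional
// "event:" line, one "data:" line per payload line, then a blank line.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  for (const line of data.split("\n")) lines.push(`data: ${line}`);
  return lines.join("\n") + "\n\n";
}

const progress: ScrapeProgress = { current: 2, total: 5, status: "scraping" };
const frame = formatSseEvent(JSON.stringify(progress), "progress");
// frame is:
// event: progress
// data: {"current":2,"total":5,"status":"scraping"}
// (followed by a blank line)
```

A proxy in the middle only has to pass these frames through unchanged, as long as it does not buffer the response.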

Typically, consuming SSE happens directly in the browser using the EventSource Web API.

And EventSource does not exist on Node  🙃

To address this limitation, I used the EventSource package, which makes EventSource available in a Node.js environment. While this solution works perfectly in my local environment, it breaks when deployed to Vercel 🙃.
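
To show what an EventSource client has to do under the hood, here is a minimal hand-written parser for the `text/event-stream` format. This is an illustrative alternative, not the package's implementation: `parseSse` and `SseEvent` are hypothetical names, and the sketch ignores the `id` and `retry` fields and bare `\r` line endings.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses complete events out of a text/event-stream buffer. Returns the
// parsed events plus any trailing partial event to prepend to the next chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? "";
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = "message"; // default event type when no "event:" line is sent
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1);
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Blocks without any data line are not dispatched.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return { events, rest };
}
```

Each decoded chunk from the upstream response would be appended to `rest` from the previous call before parsing again.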

The issue arises because Vercel’s free plan has a 60-second connection timeout, which is limiting when scraping longer threads that need more time. Edge Functions do not have this limitation, but they lack support for the EventSource package, as it requires a Node.js runtime.

You can read more about it in the Vercel Functions documentation: https://vercel.com/docs/functions#choosing-the-right-runtime

To work around these limitations, I self-hosted the Next.js app on a free VPS that I got through my GitHub Student benefits. This allowed me to bypass Vercel’s constraints. Setting up SSL, Nginx, and Docker involved a steep learning curve, especially since I needed Docker to get Puppeteer (a headless browser) running on my VPS.

Why proxy the backend through the server side of the Next.js app instead of accessing the data directly from the client side?

My NestJS backend is dedicated solely to web scraping and saving data to Notion. It doesn’t handle database operations or user information. Authentication and user management are handled by the Next.js app using Auth.js. To bridge this gap, I enrich the payload on the server side of the Next.js app with user-specific data before sending it to the NestJS backend.
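
The enrichment step can be pictured as a small pure function on the Next.js server side. Everything in this sketch is hypothetical: the interfaces, the field names, and `enrichPayload` are stand-ins for whatever user-specific data the real app attaches.

```typescript
// Hypothetical shapes; none of these field names come from the project.
interface BookmarkRequest {
  url: string;
  tags?: string[];
}
interface UserContext {
  userId: string;
  notionAccessToken: string;
  notionDatabaseId: string;
}
interface BackendPayload extends BookmarkRequest {
  user: UserContext;
}

// Combines the client's request with data that only the server knows.
function enrichPayload(request: BookmarkRequest, session: UserContext | null): BackendPayload {
  if (!session) throw new Error("Not authenticated");
  // Copy only known fields so the client cannot supply its own `user` object.
  return { url: request.url, tags: request.tags ?? [], user: { ...session } };
}
```

Keeping this on the server means the user's Notion credentials never have to reach the browser.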

The Road to Restarting My Project

This project actually started in 2023, but I abandoned it for almost a year. The main reason? I hadn’t found anyone who had built something exactly like what I had in mind. The idea existed only in my head, and no tutorial matched what I was trying to do. Back then, I underestimated myself.

Fortunately, I decided to revisit the project, though I wasn’t entirely sure why. Perhaps it was just a matter of giving myself another chance. I began by documenting the idea clearly to make it feel more real, then broke it down into smaller, manageable parts. This made it easier to explore and conduct research, and allowed me to compile tutorials from various sources.

Building something totally new with an unfamiliar tech stack while working a 9-5 job was a real challenge. I pushed myself to work as efficiently as possible, hoping to free up enough time to continue working on the project. Balancing work, personal life, and the demands of a side project taught me valuable lessons in time management and determination.

Future Expansions

Speeding up the Bookmarking Process

The web scraping method could be replaced by the faster approach I mentioned earlier. This improvement would not only speed up the bookmarking process but also enable video embedding in Notion—something that isn’t possible with the current scraping approach.

Improving the Mobile Experience

Using the Notion app on my 🥔 Android phone feels a bit sluggish. To improve this, I’ve developed a personal Bookmarked Android client using Jetpack Compose and Hono.js. I may share more about this development in the future.

Preserving Tweet Media

The tweet media could be stored externally, ensuring that even if the tweet is deleted or the author’s account is deactivated, the media would still be accessible.

Demo

Repositories

Outro

This project has been an incredible learning experience. Every challenge I faced and overcame taught me something new. While there’s still plenty of room for improvement, completing this project has boosted my confidence and made me even more excited to work on future side projects.

Start Bookmarking 🎉

  1. Clone the Template: Get started by cloning this Notion template.
  2. Create an Account: Head over to this website and sign up.
  3. Connect Your Notion: Sign in and link your Notion account.
  4. Use the Telegram bot to bookmark tweets effortlessly.
  5. Start Bookmarking!

I’d love to hear your thoughts on this project! Feel free to connect with me on Twitter. 😊

Last updated on December 2, 2024 at 5:33 PM UTC+7. See Changelog
