BuzzFeed engineering blog

Try:

BuzzFeed1 year ago

Crafting Analytical Summaries with Chat GPT

The article explains how BuzzFeed's analytics team automated recurring traffic summaries by combining spreadsheet-generated topline statements, SQL queries against BigQuery, a Python orchestration script, and calls to ChatGPT's API for deep-dive narrative summaries. It focuses on iterative prompt engineering, using system role prompts and tightly constrained inputs to produce repeatable outputs, and documents practical lessons for integrating LLMs into analytics workflows.

BuzzFeed2 years ago

The Right Tools For The Job

BuzzFeed Tech describes how it augmented and productionized LLM-powered features by using embeddings-based semantic search to address recency limits, building a production nearest-neighbor search architecture (event-driven with NSQ), migrating from Google Matching Engine to Pinecone for cost reasons, experimenting with self-hosted models (FLAN-T5 + LoRA), and moving from LangChain to a homegrown ReAct implementation to retain control over error handling and instrumentation while continuing to use OpenAI models for text generation.

BuzzFeed2 years ago

Lessons Learned Building Products Powered by Generative AI

BuzzFeed engineers describe lessons learned building products with generative AI: democratizing model access (OpenAI Playground, Slack/CMS integrations), close collaboration on prompt engineering, moderation guardrails (system prompts, logit_bias, banned-words), staff education on LLM mechanics, scaling and outage strategies when using OpenAI's API, cost optimization via fine-tuning open-source models (FLAN-T5 + LoRA + PEFT) and hosting Stable Diffusion on GCP, and tooling choices such as HuggingFace Transformers/Diffusers, Vertex AI Matching Engine, Vercel, and TorchServe.

BuzzFeed3 years ago

Hack Week 2022: Together Again!

BuzzFeed4 years ago

CLS at BuzzFeed — Part 3: Dealing with the unpredictable

BuzzFeed reduced Cumulative Layout Shift (CLS) from third‑party embeds by applying predictable inline placeholders and by crowdsourcing embed dimensions from users via their analytics pipeline. They exposed the gathered data through a Content Layout Stability API so embeds can be sized before third‑party code runs, producing a large sitewide CLS improvement (~20% → ~80% good pageviews).

BuzzFeed4 years ago

CLS at BuzzFeed — Part 2: Getting help from real users

BuzzFeed used real-user instrumentation (Layout Instability API + RUM) and their analytics pipeline to capture DOM elements and CLS scores, sent those events into BigQuery, and built a Data Studio dashboard using an impact metric (volume * CLS) to prioritize fixes. This revealed that a Branch SDK app-install banner placed with position:fixed was a major CLS offender; switching it to position:sticky removed the shifts and improved RUM CLS scores from the 50s to the low 70s.

BuzzFeed4 years ago

CLS at BuzzFeed — Part 1: Raising the floor

BuzzFeed describes their first-phase work to reduce Cumulative Layout Shift (CLS): improving observability (RUM into BigQuery, synthetic testing with Calibre), creating layered synthetic renders for consistent tests, and implementing front-end fixes (image sizing, static placeholders for ads and embeds). They used Chrome DevTools and Lighthouse to diagnose issues and found a gap between synthetic and real-user data that they plan to address in subsequent parts.

BuzzFeed4 years ago

Hack Week 2021 Round-Up!

BuzzFeed4 years ago

18 Things I Learned Taking a Business School Class on Management

Personal reflections from an MBA management class listing 18 lessons on growth mindset, psychological safety, goal-setting (OKRs), personality typing (MBTI), meeting practices, coaching, delegation, feedback, stress framing, emotional culture, and storytelling — practical managerial and leadership guidance.

BuzzFeed4 years ago

How We Made Your AI-Generated Lover With A GAN

The post details building a BuzzFeed quiz that paired users with GAN-generated face images using StyleGAN2 ADA. It covers infrastructure needs such as NVIDIA GPUs and CUDA and recommends Google Colab and an existing Colab notebook to run the model. To mitigate bias and unsafe outputs (including child-like faces) they precomputed thousands of images and used a human-review mini-webapp to label and filter images by gender, age bucket, and problematic artifacts. For generated text they used curated template populations from a spreadsheet rather than live AI text generation to avoid incoherent or offensive outputs.

BuzzFeed4 years ago

Accessible BuzzFeed

BuzzFeed undertook a two-year, cross-functional initiative to make buzzfeed.com accessible. After an external audit flagged 400+ issues, engineers, designers, product folks, QA, and editorial staff collaborated: engineers fixed UI/semantic problems and built accessible components and documentation, while editors learned to author descriptive alt text and updated style guides. The result was an accessibility certification and a cultural shift embedding accessibility into design patterns and workflows.

BuzzFeed5 years ago

Best of Hack Week 2020

We recently wrapped up our 2020 Hack Week, appropriately themed “Hacking from Home”. Even though our tech teams are 100% remote, that didn’t stop us from wor...

BuzzFeed5 years ago

Hacking From Home

Or: Chaotic Group Projects Meet Social DistancingIf you look at the to-do lists and notes apps of BuzzFeeders in tech this week, you might be a little confus...