Expedia logo

Expedia

An inside look into the tech and people powering the travel industry

119 posts

Why You Should Prefer MERGE INTO Over INSERT OVERWRITE in Apache Iceberg

The article argues for preferring MERGE INTO with Merge-on-Read in Apache Iceberg over INSERT OVERWRITE for many update workloads. It explains COW vs MOR, shows how MOR reduces I/O and compute costs (with an EMR+S3 example), warns about INSERT OVERWRITE pitfalls when partitioning evolves, and outlines necessary compaction and maintenance best practices. It also highlights Iceberg v3 deletion vectors as an improvement for high-churn update workloads.

Chill Your Data with Iceberg Write Audit Publish

Technical guide and Expedia case study showing how to implement a Write‑Audit‑Publish (WAP) workflow using Apache Iceberg branches and tags to validate data in isolated branches, audit changes, and atomically publish to production—reducing UAT duplication, lowering costs, and improving auditability.

Beyond the Handoff: Boosting ML Outcomes Through Integrated Scientist and Engineer Collaboration

An Expedia Group Tech article advocating integrated collaboration between ML scientists and ML engineers across the full model lifecycle. It recommends federated collaboration, details MLOps best practices (feature stores, versioning, reproducible training, CI/CD, monitoring, retraining), and emphasizes aligning production constraints, shared KPIs, and cultural practices to move models from research to reliable production.

Contextual Property Embeddings for Coarse-grained Personalization

Expedia extended hotel2vec to produce contextual property embeddings conditioned on traveler loyalty tier (blue/silver/gold) by early-fusing a traveler-context embedding with property features. Trained on one year of click sessions with negative sampling, the contextual embeddings improved offline metrics (hits@k, slight NDCG@10 gains) and yielded qualitative shifts (tier-appropriate price/quality changes). In production, tier-specific embeddings are stored in a feature store and retrieved at inference, and online tests showed significant CVR improvements. Future work will explore more contexts and downstream tasks.

Preventing Revenue Loss With Real-Time A/B Test Monitoring

Expedia Group built EGTnL Circuit Breaker to provide real-time monitoring and automatic suspension of underperforming A/B tests. The post details the first-phase design using Apache Flink and Kafka: filtering, user-state collection with distinct-user tracking, and a two-stage aggregation (partial and final) to produce cumulative metrics (including sums and sum-of-squares) while handling bot reclassification. It explains technical requirements (accuracy, timeliness), implementation choices, and challenges (result delay, computational bursts, and scalability limits), and reports business impact from the initial deployment.
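
The reason a two-stage aggregation would carry sums and sums-of-squares is that both are additive, so partial results can be merged in any order and still yield a mean and variance at the end. The following is a minimal Python sketch of that idea only. It is not the Flink job described in the post, and the names `Partial`, `add`, `merge`, and `mean_and_variance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Additive summary of a metric: count, sum, and sum of squares."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, x: float) -> "Partial":
        # Partial stage: fold one observation into the summary.
        return Partial(self.count + 1, self.total + x, self.total_sq + x * x)

    def merge(self, other: "Partial") -> "Partial":
        # Final stage: summaries combine by plain addition, in any order.
        return Partial(
            self.count + other.count,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )


def mean_and_variance(agg: Partial) -> tuple[float, float]:
    """Recover the mean and the sample variance from a merged summary."""
    if agg.count < 2:
        raise ValueError("need at least two observations")
    mean = agg.total / agg.count
    variance = (agg.total_sq - agg.count * mean * mean) / (agg.count - 1)
    return mean, variance


# Two partial aggregates (for example, from two parallel workers), merged once.
left = Partial().add(1.0).add(2.0)
right = Partial().add(3.0).add(4.0)
mean, variance = mean_and_variance(left.merge(right))
```

Subtracting large, nearly equal terms can lose precision in floating point, so a production system might prefer a numerically stabler merge; the sketch favors clarity.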

Does High Click-through Rate Lead to High Conversion Rate?

Expedia Group describes an iterative journey improving property recommendation systems: starting from item-to-item interaction matrices, moving to contextual LightGBM rankers, then a hyper-personalized model using historical clicks and embeddings, and finally a multi-task pAction model optimizing clicks, long clicks, and bookings. Key findings include that high CTR does not necessarily yield high CVR, motivating label and objective redesign to drive bookings. The post reports offline and online metric improvements and discusses future work such as position-bias correction and GenAI-powered recommendations.

Protect the Base

The article describes how Expedia Group’s Checkout team established an Operational Excellence (OpEx) forum and culture to reduce revenue-impacting checkout incidents. They standardized tech and business metrics (SLOs, availability, performance, incident metrics), aligned teams, set targets/OKRs, adopted monitoring/observability tooling, improved testing and deployment practices, and achieved measurable improvements (36% reduction in production incidents, 43% LCP improvement, cloud cost optimizations, better compliance and test coverage).

Lessons from a Rollback Gameday

Expedia Group shares lessons from a rollback gameday, covering planning and cadence for drills, metrics to capture (Total Rollback Time, SRE golden signals, rollback success rate), aligning rollbacks with SLO/MTTR, using safe-harbor tagging in CI/CD pipelines, Kubernetes health probes and performance checks, and building and iterating rollback playbooks to speed recovery and improve incident response.

Elevating Travel Experiences with AI

This Expedia Group Technology article offers guidance on when to use generative AI (LLMs) versus traditional AI for travel products. It compares the strengths and limitations of each approach, lists travel-specific use cases (content creation, chatbots, search personalization, propensity models, review summarization), and provides decision criteria including problem type, data availability, time-to-market, performance measurement, cost, and scalability. The article also highlights hybrid architectures and operational considerations (inference cost, latency, delegation to tools/APIs), with examples such as Expedia’s Romie assistant.

Turbocharge Efficiency & Slash Costs: Mastering Spark & Iceberg Joins with Storage Partitioned Join

Technical deep dive from Expedia’s Analytics Data Engineering team on using Spark + Iceberg storage-partitioned joins (SPJ) to reduce shuffle and sort overhead in large-scale batch joins. The article explains join strategies (sort-merge, shuffle-hash, SPJ), SPJ requirements and Spark/Iceberg configurations, provides EMR/S3 benchmark scenarios showing large time and cost savings (45–70%), and discusses when SPJ is appropriate (and when broadcast joins or other strategies are better).

Polymorphic Deserialization in Spring Data MongoDB: A Reactive approach

A technical how-to demonstrating how to handle polymorphic deserialization of Kotlin/Java sealed classes stored in MongoDB when using Spring Data's reactive stack. The post explains the deserialization error caused by missing alias-to-type mappings and presents solutions including a custom ReadingConverter and a Jackson-backed converter, then shows how to plug custom conversions into ReactiveMongoTemplate via MappingMongoConverter and MongoMappingContext.

Identifying Top-Scoring Arms in Ranking Bandits With Linear Payoffs in Real-Time

Expedia Group describes a technique to efficiently find the exact top-scoring ordering (arm) in high-cardinality ranking problems for linear contextual bandits by reformulating arm-selection as an assignment problem. Using an encoding that decomposes ordering scores into position-item contributions, they build a score-contribution matrix per context and solve it with the Hungarian algorithm (the "Assignment Solver") to obtain exact, low-latency decisions. The method yields large speedups over exhaustive and greedy approaches and is designed for real-time production use.
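
To make the reformulation concrete: if an ordering's score is the sum of one contribution per (position, item) pair, then picking the best ordering is an assignment problem over a position-by-item matrix. The sketch below is a plain-Python Hungarian algorithm applied to such a matrix. It assumes the contribution matrix is already computed for the current context; the post's actual encoding and solver are not reproduced here, and `best_ordering` and `contrib` are hypothetical names.

```python
def best_ordering(contrib):
    """Return (ordering, score) maximizing sum(contrib[pos][ordering[pos]]).

    contrib[pos][item] is the assumed score contribution of placing `item`
    at position `pos`. Requires len(contrib) <= len(contrib[0]).
    Hungarian algorithm, O(n^2 * m), run on negated scores (it minimizes).
    """
    n, m = len(contrib), len(contrib[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j
    way = [0] * (m + 1)   # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = -contrib[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    ordering = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            ordering[p[j] - 1] = j - 1
    score = sum(contrib[pos][item] for pos, item in enumerate(ordering))
    return ordering, score


# Three positions, three items; values are illustrative only.
contrib = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
ordering, score = best_ordering(contrib)
```

Exhaustive search over the same matrix would evaluate every permutation, which grows factorially with the number of items; the assignment formulation keeps the cost polynomial while still returning an exact optimum.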

Gateways, Guardrails, and GenAI Models

Expedia Group Technology describes their in-house GenAI toolkit—GenerativeAI Proxy (GAP) and EG-Guardrails—that centralizes access to generative AI APIs (third-party and self-hosted), enforces authentication and fine-grained rate limits, reports per-call costs, and applies pre- and post-request guardrails for data protection, content filtering, and compliance. The post highlights avoiding vendor lock-in and extending GAP to multiple vendors.

Enhancing Data Reliability With An SLO Platform

Expedia Group built an SLO Platform to centralize SLO telemetry from multiple Datadog instances. A Spring Boot service ingests raw SLO records into Kafka, Kafka Streams enriches and transforms them, PostgreSQL persists the enriched data, and APIs expose it to developer portals, BI, ML, and governance consumers. The platform addresses Datadog rate limits, extends retention, improves SLO data quality, supports error-budget policies, and is planned to gain future integrations such as chaos engineering.

Optics: A Real-time Data Analytics Solution

Expedia Group built "Optics," a real-time analytics framework that uses microservices to pre-curate events, Apache Druid for low-latency ingestion, and a custom Data Resolver API plus a React-based visualization library to deliver sub-15s dashboards at scale. The redesign replaced Snowflake/Looker for this workload, reduced costs, improved responsiveness, and met high concurrency and availability SLAs.

Inside Expedia’s Migration to ScyllaDB for Change Data Capture

Expedia migrated a critical 15-node Cassandra 'Identity' cluster to ScyllaDB to leverage Scylla's native CDC and simplify operations. They executed a zero-downtime migration of 52 tables (~1TB data, ~3TB raw replicated) using scylla-migrator with a dedicated Spark cluster, dual-writes, extensive tuning (Spark and scylla-migrator), TLS/workaround steps, validator-based data verification, and temporary scaling of the target cluster; the migration completed successfully and reduced operational complexity and cost.

Enabling Core Machine Learning Platform Capabilities

Expedia Group built core ML platform components: a Model Repository Service for registering, versioning, storing artifacts and model cards, and a Model Deployment Service to standardize and orchestrate online model inference deployments (including LLMs), with CI/CD integration and centralized metrics for traceability.

Quantifying Stress for Customer Service Agents at Expedia Group

Expedia Group developed and validated a brief, reliable 3-item psychometric metric to quantify customer service agent stress using shadowing, surveys, and experiments. The metric distinguishes negative from neutral situations, predicts burnout, and shows that providing resolutions reduces stress. The article argues firms can use this measure to monitor agent stress, guide tool and workplace improvements, reduce turnover, and improve customer experience.

Channel-Smart Property Search: How Expedia Tailors Rankings for You

Expedia describes adapting its lodging ranking to channel-specific behavior (property search vs destination search). The team trains a deep neural learning-to-rank model on converted search logs (booking > click > impression), using search-context and property features (including Word2vec-like property embeddings) and adding similarity features for property searches. Feature ablation and A/B tests showed that a blended relevance+similarity approach performs best.

The Perils of Deprecating a Legacy Microservice

Expedia Engineering describes deprecating a legacy Cache Service used by the Flight Details stack. Attempts to integrate Cassandra directly into the Detail Service exposed serialization, framework, and upgrade challenges, so the team adopted Redis for caching (using FastInfoset-serialized values). The change simplified PubSub logic, reduced supplier calls and connection errors, improved TP95 latency by ~50ms, and saved about $5,000/month. The effort also required Spring Boot/Java upgrades and code rewrites—turning a planned small deprecation into a broader modernization.