State Of Mlops
List of awesome mlops articles. Curated from Feb 2022.
Installation
npx state-of-mlops
8 Tips for Writing Agent Skills
Date: Apr 13, 2026
Tags: Practice
Gemma 4 Fine-tuning Guide
Tags: Guide
MirrorCode: Evidence that AI can already do some weeks-long coding tasks
Date: Apr 10, 2026
Tags: Blog
Full-Stack Optimizations for Agentic Inference with NVIDIA Dynamo
Date: Apr 17, 2026
Tags: Engineering
Company: NVIDIA
Can I Run AI locally?
Tags: Portal
Introducing routines in Claude Code
Date: Apr 14, 2026
Tags: Press release
Company: Anthropic
Towards developing future-ready skills with generative AI
Date: Apr 13, 2026
Tags: Blog
Company: Google
Claude Managed Agents: get to production 10x faster
Date: Apr 8, 2026
Tags: Press release
Company: Anthropic
Simulate realistic users to evaluate multi-turn AI agents in Strands Evals
Date: Apr 2, 2026
Tags: OSS, Engineering
Company: Amazon
Components of A Coding Agent
Date: Apr 4, 2026
Tags: Blog
Quantization from the ground up
Date: Mar 25, 2026
Tags: Learning resource
Deploying Disaggregated LLM Inference Workloads on Kubernetes
Date: Mar 23, 2026
Tags: Engineering
Company: NVIDIA
KAI Scheduler
Tags: OSS, Kubernetes, GPU
Five techniques to reach the efficient frontier of LLM inference
Date: Mar 28, 2026
Tags: Engineering
Company: Google
How Kimi, Cursor, and Chroma Train Agentic Models with RL
Date: Mar 28, 2026
Tags: Learning resource
How to Choose the Best Embedding Model for RAG in 2026: 10 Models Benchmarked
Date: Mar 27, 2026
Tags: Review
Company: zilliz
If DSPy is So Great, Why Isn't Anyone Using It?
Date: Mar 21, 2026
Tags: Blog
Building an MCP Ecosystem at Pinterest
Date: Mar 20, 2026
Tags: Practice
Company: Pinterest
Personalization at Bluesky
Date: Feb 24, 2026
Tags: Engineering
Company: Bluesky
TurboQuant: Redefining AI efficiency with extreme compression
Date: Mar 24, 2026
Tags: Research
Company: Google
A Visual Guide to Attention Variants in Modern LLMs
Date: Mar 22, 2026
Tags: Learning resource
Simon Willison: Engineering practices that make coding agents work - The Pragmatic Summit
Date: Mar 19, 2026
Tags: Presentation
Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster
Date: Mar 18, 2026
Tags: Hands-on
LumberChunker: Long-Form Narrative Document Segmentation
Date: Mar 17, 2026
Tags: Research, RAG
Company: CMU
State of RL for reasoning LLMs
Date: Mar 15, 2026
Tags: Learning resource
The Best Tacit Knowledge Videos on Every Subject
Date: Apr 1, 2024
Tags: Learning resource
Mamba-3
Date: Mar 17, 2026
Tags: Press release
How We Hacked McKinsey's AI Platform
Date: Mar 9, 2026
Tags: Security
Company: CodeWall
Improving instruction hierarchy in frontier LLMs
Date: Mar 10, 2026
Tags: Practice
Company: OpenAI
Code Concepts: A Large-Scale Synthetic Dataset Generated from Programming Concept Seeds
Date: Mar 11, 2026
Tags: Engineering
Research note: Many SWE-bench-Passing PRs Would Not Be Merged into Main
Date: Mar 10, 2026
Tags: Blog
The Optimization Ladder
Date: Mar 10, 2026
Tags: Python, Practice
Native Observability & Alerts for Your OpenClaw with Opik
Date: Mar 5, 2026
Tags: Press release
Company: comet
Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model
Date: Mar 4, 2026
Tags: Press release, Engineering
Company: Microsoft
Mixture of Experts (MoEs) in Transformers
Date: Feb 26, 2026
Tags: Blog
Announcing the Opik Claude Code Plugin: Automatically Configure Observability for Complex Agentic Systems
Date: Mar 2, 2026
Tags: Press release
Company: comet
Give your agentic chatbots a fast and reliable long-term memory
Date: Feb 28, 2026
Tags: Blog
Company: Google
Is Boosting Still All You Need for Tabular Data?
Date: Mar 1, 2026
Tags: Blog
MediaFM: The Multimodal AI Foundation for Media Understanding at Netflix
Date: Feb 24, 2026
Tags: Engineering
Company: Netflix
How we caught our AI agent embezzling tokens
Date: Feb 23, 2026
Tags: Blog
Honey, I Tiled the Tensors
Date: Feb 26, 2026
Tags: GPU, NVIDIA Cute
Detecting and preventing distillation attacks
Date: Feb 23, 2026
Tags: Blog
Company: Anthropic
Pydantic Monty: you probably don't need a full sandbox
Date: Feb 27, 2026
Tags: Blog, OSS
Company: Pydantic
Scaling LLM Post-Training at Netflix
Date: Feb 13, 2026
Tags: Engineering, Design
Company: Netflix
Optimizing AI IDEs at Scale
Date: Feb 17, 2026
Tags: Practice
Company: comet
OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments
Date: Feb 12, 2026
Tags: Survey, OSS
Company: Turing
Two different tricks for fast LLM inference
Date: Feb 15, 2026
Tags: Survey
We Extracted OpenClaw’s Memory System and Open-Sourced It (memsearch)
Date: Feb 13, 2026
Tags: Press release
Company: milvus
Harness engineering: leveraging Codex in an agent-first world
Date: Feb 11, 2026
Tags: Practice
Company: OpenAI
How AI assistance impacts the formation of coding skills
Date: Jan 29, 2026
Tags: Research
Company: Anthropic
2026 Practical Data Community State of Data Engineering
Date: Jan, 2026
Tags: Review
Beyond one-on-one: Authoring, simulating, and testing dynamic human-AI group conversations
Date: Feb 10, 2026
Tags: Research
Company: Google
TabICLv2: A state-of-the-art tabular foundation model
Date: Feb, 2026
Tags: OSS
Towards a science of scaling agent systems: When and why agent systems work
Date: Jan 28, 2026
Tags: Research
Company: Google
Build TikTok's Personalized Real-Time Recommendation System in Python with Hopsworks
Date: Oct 7, 2024
Tags: Presentation
The Case for RL-Aligned Ranking in RecSys
Date: Jan 3, 2026
Tags: Presentation
The AI Evolution of Graph Search at Netflix: From Structured Queries to Natural Language
Date: Jan 27, 2026
Tags: Engineering
Company: Netflix
Inside OpenAI’s in-house data agent
Date: Jan 29, 2026
Tags: Engineering
Company: OpenAI
Gas Town’s Agent Patterns, Design Bottlenecks, and Vibecoding at Scale
Date: Jan 30, 2026
Tags: Blog, Design
ATLAS: Practical scaling laws for multilingual models
Date: Jan 27, 2026
Tags: Research
Company: Google
Scaling PostgreSQL to power 800 million ChatGPT users
Date: Jan 22, 2026
Tags: Engineering
Company: OpenAI
Heaps do lie: debugging a memory leak in vLLM.
Date: Jan 21, 2026
Tags: Engineering
Company: Mistral
LLM Context Pruning: A Developer’s Guide to Better RAG and Agentic AI Results
Date: Jan 15, 2026
Tags: OSS, Practice
Company: Milvus
How We Built a Semantic Highlight Model To Save Token Cost for RAG
Date: Jan 15, 2026
Tags: OSS, Practice
Company: Hugging Face
Counterfactual Evaluation for Recommendation Systems
Date: Apr, 2022
Tags: Learning resource
Universal Commerce Protocol
Date: Jan 23, 2026
Tags: Press release
FLUX.2 [klein]: Towards Interactive Visual Intelligence
Date: Jan 15, 2026
Tags: Press release
A gRPC transport for the Model Context Protocol
Date: Jan 14, 2026
Tags: Press release
Company: Google
What's New in FastMCP 3.0
Date: Jan 20, 2026
Tags: Press release, OSS
Best practices for coding with agents
Date: Jan 9, 2026
Tags: Practice
Company: Anthropic
Supercharging LLMs: Scalable RL with torchforge and Weaver
Date: Jan 9, 2026
Tags: Practice
Company: PyTorch
MIPRO: The Optimizer That Brought Science to Prompt Engineering
Date: Jan 12, 2026
Tags: Blog
Company: comet
Towards Generalizable and Efficient Large-Scale Generative Recommenders
Date: Jan 2, 2026
Tags: Engineering
Company: Netflix
2025: The year in LLMs
Date: Dec 31, 2025
Tags: Review
Multi-Agent Systems: The Architecture Shift from Monolithic LLMs to Collaborative Intelligence
Date: Jan 5, 2026
Tags: Press release
Company: comet
Why Stochastic Rounding is Essential for Modern Generative AI
Date: Dec 20, 2025
Tags: Press release
Company: Google
2025 LLM Year in Review
Date: Dec 28, 2025
Tags: Review
Python Data Science Handbook
Tags: Learning resource
Agents Meet Databases: The Future of Agentic Architectures
Date: Aug 12, 2025
Tags: Engineering
Company: MongoDB
Prompt Drift: The Hidden Failure Mode Undermining Agentic Systems
Date: Dec 23, 2025
Tags: Blog
Company: Comet
Google's year in review: 8 areas with research breakthroughs in 2025
Date: Dec 23, 2025
Tags: Press release
Company: Google
Top Python libraries of 2025
Date: Dec 18, 2025
Tags: OSS, Practice
Interactions API: A unified foundation for models and agents
Date: Dec 11, 2025
Tags: Press release
Company: Google
How to Build Privacy-Preserving Evaluation Benchmarks with Synthetic Data
Date: Dec 12, 2025
Tags: Engineering
Company: NVIDIA
Introducing AISAQ in Milvus: Billion-Scale Vector Search Just Got 3,200× Cheaper on Memory
Date: Dec 10, 2025
Tags: Engineering
Company: milvus
Introducing OpenSearch 3.4
Date: Dec 16, 2025
Tags: Press release
Company: OpenSearch
Claude Agent Skills: A First Principles Deep Dive
Date: Oct 26, 2025
Tags: Hands-on
Top 5 AI Model Optimization Techniques for Faster, Smarter Inference
Date: Dec 9, 2025
Tags: Practice
Company: NVIDIA
The State of Production ML in 2025
Date: Dec, 2025
Tags: Review
State of PyTorch Hardware Acceleration 2025
Date: Dec, 2025
Tags: Review
Announcing MCP support in Apigee: Turn existing APIs into secure and governed agentic tools
Date: Dec 11, 2025
Tags: Press release
Company: Google
Claude Code is coming to Slack, and that’s a bigger deal than it sounds
Date: Dec 8, 2025
Tags: Press release
OpenAI to acquire Neptune
Date: Dec 3, 2025
Tags: Press release
Company: OpenAI
PyData Boston March 2025 Meetup | Best practices for hiring data scientists
Date: Apr 15, 2025
Tags: Presentation
Making Sense of Memory in AI Agents
Date: Nov 20, 2025
Tags: Practice
Real-Time Anomaly Detection with Apache Flink
Date: Nov 18, 2025
Tags: Engineering
Skill Learning: Bringing Continual Learning to CLI Agents
Date: Dec 2, 2025
Tags: Blog
Company: Letta
OpenSearch as an agentic memory solution: Building context-aware agents using persistent memory
Date: Dec 2, 2025
Tags: Press release
Company: OpenSearch
Build and Run Secure, Data-Driven AI Agents
Date: Nov 24, 2025
Tags: Press release
Company: NVIDIA
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
Date: Dec 3, 2025
Tags: Learning resource
650GB of Data (Delta Lake on S3). Polars vs DuckDB vs Daft vs Spark.
Date: Nov 13, 2025
Tags: Blog
Advent of Sysadmin 2025
Date: Dec, 2025
Tags: Advent Calendar
Advice for New Principal Tech ICs (i.e., Notes to Myself)
Date: Oct, 2025
Tags: Career
The Thinking Game | Full documentary | Tribeca Film Festival official selection
Date: Nov 26, 2025
Tags: Documentary
Company: DeepMind
Introducing agentic search in OpenSearch: Transforming data interaction through natural language
Date: Nov 24, 2025
Tags: Press release
Company: OpenSearch
Disrupting the first reported AI-orchestrated cyber espionage campaign
Date: Nov 14, 2025
Tags: Blog
Company: Anthropic
Reciprocal Rank Fusion and Relative Score Fusion: Classic Hybrid Search Techniques
Date: Nov 21, 2025
Tags: Learning resource
Company: MongoDB
Continuous batching
Date: Nov 25, 2025
Tags: Learning resource
Company: Hugging Face
Code execution with MCP: Building more efficient agents
Date: Nov 4, 2025
Tags: Blog, Practice
Company: Anthropic
Gemini 3 Prompting: Best Practices for General Usage
Date: Nov 19, 2025
Tags: Blog, Practice
Generative UI: A rich, custom, visual interactive user experience for any prompt
Date: Nov 18, 2025
Tags: Press release
Company: Google
Qdrant 1.16 - Tiered Multitenancy & Disk-Efficient Vector Search
Date: Nov 19, 2025
Tags: Press release
Company: Qdrant
Introducing real-time streaming for AI models and agents in OpenSearch
Date: Nov 18, 2025
Tags: Press release
Company: OpenSearch
Building an Interactive AI Agent for Lightning-Fast Machine Learning Tasks
Date: Nov 7, 2025
Tags: Blog
Company: NVIDIA
The Definitive Guide to Agentic AI: What AI Agents Actually Are and How to Build Them for Production
Date: Nov 13, 2025
Tags: Blog
Company: comet
Mapping LLMs with Sparse Autoencoders
Date: Oct, 2025
Tags: Learning resource
KEYNOTE: Hannes Mühleisen - Data Architecture Turned Upside Down | PyData Amsterdam 2025
Date: Oct 31, 2025
Tags: Presentation
ADK architecture: When to use sub-agents versus agents as tools
Date: Nov 8, 2025
Tags: Blog
Company: Google
Building powerful RAG pipelines with Docling and OpenSearch
Date: Nov 10, 2025
Tags: Blog, Design
Company: OpenSearch
Best LLM Observability Tools of 2025: Top Platforms & Features
Date: Nov 11, 2025
Tags: Review
Company: Comet
Human-in-the-Loop Review Workflows for LLM Applications & Agents
Date: Nov 11, 2025
Tags: Blog
Company: Comet
Omnilingual ASR: Advancing Automatic Speech Recognition for 1,600+ Languages
Date: Nov 10, 2025
Tags: Press release
Company: Meta
Introducing Nested Learning: A new ML paradigm for continual learning
Date: Nov 7, 2025
Tags: Paper
Company: Google
Scaling Large MoE Models with Wide Expert Parallelism on NVL72 Rack Scale Systems
Date: Oct 20, 2025
Tags: Engineering
Company: NVIDIA
Clario streamlines clinical trial software configurations using Amazon Bedrock
Date: Oct 31, 2025
Tags: Engineering, System design
Company: AWS
Introducing torchforge – a PyTorch native library for scalable RL post-training and agentic development
Date: Oct 22, 2025
Tags: Engineering
Company: PyTorch
Build your first AI Agent with Gemini, n8n and Google Cloud Run
Date: Oct 30, 2025
Tags: Hands-on
Stress-testing model specs reveals character differences among language models
Date: 2025
Tags: Paper
Beyond Standard LLMs
Date: Nov 4, 2025
Tags: Learning resource
Introducing Aardvark: OpenAI’s agentic security researcher
Date: Oct 30, 2025
Tags: Press release
Company: OpenAI
the bug that taught me more about PyTorch than years of using it
Date: Oct 22, 2025
Tags: Blog
LLM Tracing: The Foundation of Reliable AI Applications
Date: Oct 28, 2025
Tags: Blog
Company: comet
Post-Training Generative Recommenders with Advantage-Weighted Supervised Finetuning
Date: Oct 24, 2025
Tags: Algorithm
Company: Netflix
DORA - State of AI Assisted Software Development 2025
Date: 2025
Tags: Report
Company: Google
AI Engineering 101 with Chip Huyen (Nvidia, Stanford, Netflix)
Date: Oct 23, 2025
Tags: Presentation
But what is a Laplace Transform?
Date: Oct 12, 2025
Tags: Presentation
Causal Inference for The Brave and True
Tags: Learning resource
Legacy data to RAG : Modernise Your Apps with Amazon Sagemaker Unified Studio
Date: Oct 16, 2025
Tags: Blog
Company: Weaviate
DeepSeek-OCR
Tags: OSS
Dexterous Robotic Foundation Models
Date: Oct 19, 2025
Tags: Learning resource, Presentation
The State of Open Models
Date: Oct 16, 2025
Tags: Learning resource, Presentation
Claude Skills are awesome, maybe a bigger deal than MCP
Date: Oct 16, 2025
Tags: Blog
SuperOffload: Unleashing the Power of Large-Scale LLM Training on Superchips
Date: Oct 9, 2025
Tags: Blog
Company: PyTorch
Securing your agents with authentication and authorization
Date: Oct 13, 2025
Tags: Engineering
Company: LangChain
Rearchitecting Letta’s Agent Loop: Lessons from ReAct, MemGPT, & Claude Code
Date: Oct 14, 2025
Tags: Practice
Company: Letta
Scaling Request Logging from Millions to Billions with ClickHouse, Kafka, and Vector
Date: Oct 2, 2025
Tags: Engineering
So you want to build a data mesh ...with your dbt project
Date: Oct 15, 2025
Tags: Blog
Andrej Karpathy — AGI is still a decade away
Date: Oct 18, 2025
Tags: Video
Scaling Pinterest ML Infrastructure with Ray: From Training to End-to-End ML Pipelines
Date: Jun 25, 2025
Tags: Practice, Design
Company: Pinterest
Practical LLM Security Advice from the NVIDIA AI Red Team
Date: Oct 2, 2025
Tags: Security
Company: NVIDIA
Introducing CodeMender: an AI agent for code security
Date: Oct 6, 2025
Tags: Security
Company: Google
A small number of samples can poison LLMs of any size
Date: Oct 9, 2025
Tags: Practice
Company: Anthropic
How tech companies measure the impact of AI on software development
Date: Sep 17, 2025
Tags: Blog, Productivity
Why Multi-Agent Systems Need Memory Engineering
Date: Sep 25, 2025
Tags: Practice
Company: MongoDB
Developing an open standard for agentic commerce
Date: Sep 29, 2025
Tags: Press release
Company: Stripe
How to Integrate Computer Vision Pipelines with Generative AI and Reasoning
Date: Sep 25, 2025
Tags: Blog
Company: NVIDIA
Stanford CS230 | Autumn 2025 | Lecture 1: Introduction to Deep Learning
Tags: Learning resources
dspy-profiles
Tags: OSS
An Introduction to Speculative Decoding for Reducing Latency in AI Inference
Date: Sep 17, 2025
Tags: Blog, Practice
Company: NVIDIA
Introduction to LLM-as-a-Judge For Evals
Date: Sep 22, 2025
Tags: Overview
Company: comet
Rapid ML experimentation for enterprises with Amazon SageMaker AI and Comet
Date: Sep 22, 2025
Tags: Hands-on
Company: AWS, comet
Adding Document Understanding to Claude Code
Date: Sep 22, 2025
Tags: Blog
Deep researcher with test-time diffusion
Date: Sep 19, 2025
Tags: Research
Company: Google
Diffusion Beats Autoregressive in Data-Constrained Settings
Date: Sep 22, 2025
Tags: Research
Introducing SedonaDB: A single-node analytical database engine with geospatial as a first-class citizen
Date: Sep 24, 2025
Tags: OSS
How to turn Claude Code into a domain specific coding agent
Date: Sep 11, 2025
Tags: Blog
Company: LangChain
Powering AI commerce with the new Agent Payments Protocol (AP2)
Date: Sep 17, 2025
Tags: Press release, OSS
Company: Google
The Ultimate Guide to LLM Evaluation: Metrics, Methods & Best Practices
Date: Sep 11, 2025
Tags: Blog
Company: comet
Training an LLM-RecSys Hybrid for Steerable Recs with Semantic IDs
Date: Sep, 2025
Tags: Blog, Hands-on
Reducing Cold Start Latency for LLM Inference with NVIDIA Run:ai Model Streamer
Date: Sep 16, 2025
Tags: Press release
Company: NVIDIA
Build and scale adoption of AI agents for education with Strands Agents, Amazon Bedrock AgentCore, and LibreChat
Date: Sep 8, 2025
Tags: Practice, Design
Company: AWS
SQL performance improvements: finding the right queries to fix (part 1)
Date: Sep 17, 2025
Tags: Learning resource, Hands-on
A postmortem of three recent issues
Date: Sep 17, 2025
Tags: Blog
Company: Anthropic
Will Amazon S3 Vectors Kill Vector Databases—or Save Them?
Date: Sep 4, 2025
Tags: Blog
How to Spot (and Fix) 5 Common Performance Bottlenecks in pandas Workflows
Date: Aug 22, 2025
Tags: Blog, Technique
Company: NVIDIA
From Frequencies to Coverage: Rethinking What “Representative” Means
Date: Sep 3, 2025
Tags: Blog
The two versions of Parquet
Date: Feb 10, 2025
Tags: Blog
Transparent, Robust and Ultra-Sparse Trees (TRUST)
Tags: Blog
Learn how Amazon Health Services improved discovery in Amazon search using AWS ML and gen AI
Date: Aug 26, 2025
Tags: Practice
Company: Amazon
Fine-Tuning gpt-oss for Accuracy and Performance with Quantization Aware Training
Date: Aug 29, 2025
Tags: Hands-on
Company: NVIDIA
From Generalist to Specialist: Fine-Tuning Gemini for Terraform Scans & Phishing Detection
Date: Aug 29, 2025
Tags: Blog
Data Lake Table Formats (Open Table Formats)
Date: Sep 3, 2025
Tags: Learning resource, Open Table Formats
AgentScope: Agent-Oriented Programming for Building LLM Applications
Tags: OSS
Launch of Polars Cloud and Distributed Polars
Date: Sep 3, 2025
Tags: Press release
Company: Polars
DSPy 0‑to‑1 Guide: Building Self‑Improving LLM Applications from Scratch
Tags: Hands-on
Le Chat. Custom MCP connectors. Memories.
Date: Sep 2, 2025
Tags: Press release
Company: Mistral
Claude Code Tutorial
Tags: Learning resource
101+ gen AI use cases with technical blueprints
Date: Aug 22, 2025
Tags: Practice, Blueprint
Company: Google Cloud
8-bit Rotational Quantization: How to Compress Vectors by 4x and Improve the Speed-Quality Tradeoff of Vector Search
Date: Aug 26, 2025
Tags: Practice
Company: weaviate
JUDE: LLM-based representation learning for LinkedIn job recommendations
Date: May 22, 2025
Tags: Practice
Company: LinkedIn
Learning DSPy (1): The power of good abstractions
Date: Aug 26, 2025
Tags: Blog, Learning resource
Why Stacking Sliding Windows Can't See Very Far
Date: Aug 25, 2025
Tags: Blog, Learning resource
MIT How to AI (Almost) Anything, Spring 2025
Date: Aug 27, 2025
Tags: Lecture, Learning resource
Python: The Documentary | An origin story
Date: Aug 29, 2025
Tags: Fun
mini-swe-agent
Tags: OSS
Best Practices for Building Agentic AI Systems: What Actually Works in Production
Date: Aug 14, 2025
Tags: Practice
Basic Feature Engineering with DuckDB
Date: Aug 15, 2025
Tags: Blog, Hands-on, SQL
Company: DuckDB
Accelerating MoE’s with a Triton Persistent Cache-Aware Grouped GEMM Kernel
Date: Aug 18, 2025
Tags: Blog
Company: PyTorch
From Facts & Metrics to Media Machine Learning: Evolving the Data Engineering Function at Netflix
Date: Aug 22, 2025
Tags: Blog
Company: Netflix
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels
Date: Aug 18, 2025
Tags: Hands-on
Beyond billion-parameter burdens: Unlocking data synthesis with a conditional generator
Date: Aug 14, 2025
Tags: Paper
Company: Google
MCP Vulnerabilities Every Developer Should Know
Date: Aug 11, 2025
Tags: Security
Four places where you can put LLM monitoring
Date: Aug 10, 2025
Tags: Blog, Observability
AI Agent Design Patterns: How to Build Reliable AI Agent Architecture for Production
Date: Aug 8, 2025
Tags: Blog
Company: comet
Elysia: Building an end-to-end agentic RAG app
Date: Aug 12, 2025
Tags: Press release
Company: Weaviate
7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows
Date: Aug 1, 2025
Tags: Technique
Company: NVIDIA
Architecting LARGE software projects.
Date: Aug 5, 2025
Tags: Software engineering
FM-Intent: Predicting User Session Intent with Hierarchical Multi-Task Learning
Date: May 21, 2025
Tags: Practice, Recommender system
Company: Netflix
GPT-5: Key characteristics, pricing and model card
Date: Aug 7, 2025
Tags: Blog
AI judging AI: Scaling unstructured text analysis with Amazon Nova
Date: Aug 4, 2025
Tags: Blog, Hands-on
Company: Amazon
Remember this: Agent state and memory with ADK
Date: Aug 2, 2025
Tags: Blog, OSS, Agent development
Company: Google
Pretraining: Breaking Down the Modern LLM Training Pipeline
Date: Aug 1, 2025
Tags: Learning resource
Company: comet
Why, When and How to Fine-Tune a Custom Embedding Model
Date: Aug 5, 2025
Tags: Learning resource
Company: weaviate
The Internals of PostgreSQL
Tags: Learning resource
Build enterprise workflows with Langchain and Weaviate v3
Date: Jul 30, 2025
Tags: Press release, Hands-on
Company: Weaviate
Introducing Letta Filesystem
Date: Jul 24, 2025
Tags: Press release
Company: Letta
MLOps with Databricks: Free Edition
Date: Jul, 2025
Tags: Learning resources
Vibe code is legacy code
Date: Jul 30, 2025
Tags: Blog
Agentic Coding Things That Didn’t Work
Date: Jul 30, 2025
Tags: Blog, Practice
Building and evaluating alignment auditing agents
Date: Jul 24, 2025
Tags: Blog, Practice
Evaluating Grok 4’s Math Capabilities
Date: Jul 25, 2025
Tags: Blog, Review
The Big LLM Architecture Comparison
Date: Jul 19, 2025
Tags: Learning resources, LLMs
Using GitHub Spark to reverse engineer GitHub Spark
Date: Jul 24, 2025
Tags: Blog
Ray Data, Train & Tune at Klaviyo
Date: Jul 22, 2025
Tags: Blog
Stop Saying RAG Is Dead
Date: Jul 12, 2025
Tags: Learning resources, Slides
Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning
Date: Jul 9, 2025
Tags: Press release
Company: Microsoft
Voxtral
Date: Jul 15, 2025
Tags: Press release, Voice AI model, OSS
Company: mistral
Introducing Kiro
Date: Jul 14, 2025
Tags: Press release
Company: kiro
Graph foundation models for relational data
Date: Jul 10, 2025
Tags: Press release
Company: Google
Reflections on OpenAI
Date: Jul 16, 2025
Tags: Blog
What is a Principal Engineer at Amazon? With Steve Huynh
Date: Jul 10, 2025
Tags: Vlog
Andrew Ng: Building Faster with AI
Date: Jul 11, 2025
Tags: Presentation, For startup
Agent Memory: How to Build Agents that Learn and Remember
Date: Jul 7, 2025
Tags: Blog
Integrating Long-Term Memory with Gemini 2.5 with Mem0
Date: Jul 3, 2025
Tags: Blog, Hands-on
Advancing Claude for Education
Date: Jul 10, 2025
Tags: Press release
Company: Anthropic
Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training
Date: Jul 1, 2025
Tags: Engineering
Company: NVIDIA
Measuring AI code assistants and agents
Date: Jul, 2025
Tags: Research
Company: GetDX
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
Date: Jul 10, 2025
Tags: Research
Company: METR
AI Assisted Coding with Cursor AI and Opik
Date: Jul 7, 2025
Tags: Blog
Company: comet
How LLMs are served efficiently at scale
Date: Jun 27, 2025
Tags: Design, In-house AI APIs
Training and Finetuning Sparse Embedding Models with Sentence Transformers v5
Date: Jul 1, 2025
Tags: OSS, Search domain, Hands-on
A practical guide to building agents
Date: Jul, 2025
Tags: Guide
Company: OpenAI
How to Scale Your Model
Date: Jul, 2025
Tags: Book, TPU
Build an agentic multimodal AI assistant with Amazon Nova and Amazon Bedrock Data Automation
Date: Jun 23, 2025
Tags: Design, Hands-on
Company: Amazon
The Bitter Lesson is coming for Tokenization
Date: Jun 24, 2025
Tags: Learning resource, BPE
Agentic Misalignment: How LLMs could be insider threats
Date: Jun 21, 2025
Tags: Blog, Security
Company: Anthropic
Optimizing SQL (and DataFrames) in DataFusion, Part 2: Optimizers in Apache DataFusion
Date: Jun 15, 2025
Tags: Learning resource
Company: DataFusion
Why Your Vibe Coding Generates Outdated Code and How to Fix It with Milvus MCP
Date: Jun 13, 2025
Tags: Press release
Company: milvus
TPU Deep Dive
Date: Jun 18, 2025
Tags: Learning resource
How we built our multi-agent research system
Date: Jun 13, 2025
Tags: Practice, Design
Company: Anthropic
Andrej Karpathy: Software Is Changing (Again)
Date: Jun 17, 2025
Tags: Presentation
Google's Approach for Secure AI Agents
Date: 2025
Tags: Paper, Practice
Company: Google
The State of Engineering Leadership in 2025
Date: Jun 16, 2025
Tags: Other, Survey
Benchmarking Multi-Agent Architectures
Date: Jun 10, 2025
Tags: Benchmark, Design
Company: LangChain
Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix
Date: Jun 13, 2025
Tags: Blog, Design, Data management
Company: Netflix
Building a Production Multimodal Fine-Tuning Pipeline
Date: Jun 7, 2025
Tags: Hands-on, GCP
Company: Google
More efficient multi-vector embeddings with MUVERA
Date: Jun 5, 2025
Tags: Press release, Technique
Company: Weaviate
GitHub MCP Exploited: Accessing private repositories via MCP
Date: May 26, 2025
Tags: Blog, Security
A Technical Tutorial on Reinforcement Learning from Human Feedback
Date: Jun 1, 2025
Tags: Blog, Learning resource, Hands-on
RAG is dead, long live agentic retrieval
Date: May 29, 2025
Tags: Blog, Design, RAG
Company: LlamaIndex
The State of Enterprise AI in 2025: Measured Progress Over Hype
Date: May 27, 2025
Tags: Review
Company: weaviate
Written using Claude
Tags: AI Coding
Company: CloudFlare
Announcing Opik’s Guardrails Beta: Moderate LLM Applications in Real-Time
Date: May 23, 2025
Tags: Press release, OSS, LLM security
Company: comet
How I built an agent with Pydantic AI and Google Gemini
Date: Jan 7, 2025
Tags: Blog, Hands-on, Agent
Architecting a Multi-Agent System with Google A2A and ADK
Date: Apr 20, 2025
Tags: Blog, Hands-on, Multi-Agent
Iceberg Operation Journey: Takeaways for DB & Server Logs
Date: Apr 18, 2025
Tags: Practice, System Design
Company: kakao
Exploring Quantization Backends in Diffusers
Date: May 21, 2025
Tags: Practice, Research
Company: Hugging Face
OpenAI & Meta Distinguished Engineer (IC9) On Working With Zuck, Carmack & Career Growth | Philip Su
Date: May 23, 2025
Tags: Video, Story
LlamaIndex agentic workflows: Deep Research code-along
Date: May, 2025
Tags: Hands-on
How to think about agent frameworks
Date: Apr 20, 2025
Tags: Blog
Company: LangChain
Vector Search in the Real World: How to Filter Efficiently Without Killing Recall
Date: May 12, 2025
Tags: Blog, Practice
Company: milvus
How to Build an MCP Server in 5 Lines of Python
Date: Apr 30, 2025
Tags: OSS
MetaShuffling: Accelerating Llama 4 MoE Inference
Date: May 12, 2025
Tags: Blog, Learning resource
Company: PyTorch
Working on Complex Systems
Date: May 7, 2025
Tags: Blog
Introducing Codex
Date: May 16, 2025
Tags: Press release
Company: OpenAI
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
Date: May 14, 2025
Tags: Press release
Company: Google
What Every AI Engineer Should Know About A2A, MCP & ACP
Date: Apr 24, 2025
Tags: Learning resource
Zero to One: Learning Agentic Patterns
Date: May 5, 2025
Tags: Learning resource
Building News Agents for Daily News Recaps with MCP, Q, and tmux
Date: May, 2025
Tags: Blog, Hands-on
Build an automated generative AI solution evaluation pipeline with Amazon Nova
Date: Apr 21, 2025
Tags: Hands-on, LLM Evaluation, FMEval, Ragas
Company: AWS
Expanding on what we missed with sycophancy
Date: May 2, 2025
Tags: Press release, Retrospective
Company: OpenAI
Optimizing Mixtral 8x7B on Amazon SageMaker with AWS Inferentia2
Date: Apr 15, 2025
Tags: Hands-on
Company: AWS
Unlocking Gen AI at the Edge: Speeding up Transformers by 80% by Removing Self Attention
Date: Apr 21, 2025
Tags: Learning resource
The State of Reinforcement Learning for LLM Reasoning
Date: Apr 19, 2025
Tags: Learning resource
OpenAI Codex CLI, how does it work?
Date: Apr 17, 2025
Tags: Learning resource
ImageBind: a new way to ‘link’ AI across the senses
Date: Apr, 2025
Tags: Press release, LLM
Company: Meta
KubeCon + CloudNativeCon Europe 2025 - Keynotes
Date: Apr, 2025
Tags: Presentation
An Intro to DeepSeek's Distributed File System
Date: Apr 15, 2025
Tags: Learning resource, Engineering, 3FS
An Overview of Late Interaction Retrieval Models: ColBERT, ColPali, and ColQwen
Date: Apr 9, 2025
Tags: Learning resource, Machine Learning
Company: Weaviate
Decomposing Transactional Systems
Date: Apr 17, 2025
Tags: Learning resource, Engineering
Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
Date: Mar 25, 2025
Tags: Learning resource, Machine Learning, Sparse AutoEncoder
Generate videos in Gemini and Whisk with Veo 2
Date: Apr 15, 2025
Tags: Press release
Company: Google
Claude takes research to new places
Date: Apr 16, 2025
Tags: Press release
Company: Anthropic
Introducing GPT-4.1 in the API
Date: Apr 14, 2025
Tags: Press release
Company: OpenAI
Model Context Protocol (MCP) an overview
Date: Apr 3, 2025
Tags: Learning resource, MCP
Announcing the Agent2Agent Protocol (A2A)
Date: Apr 9, 2025
Tags: Press release, OSS, MCP
Company: Google
The “S” in MCP Stands for Security
Date: Apr 6, 2025
Tags: Blog, MCP, Security
Optimize Gemma 3 Inference: vLLM on GKE
Date: Apr 8, 2025
Tags: Blog, Infrastructure, LLM
Parsing is Hard: Solving Semantic Understanding with Mistral OCR and Milvus
Date: Apr 3, 2025
Tags: Hands-on, Mistral OCR
Company: milvus
Designing for AI Engineers: UI patterns you need to know
Date: Feb 10, 2025
Tags: Blog, UX, Design, AI Platform
Taking a responsible path to AGI
Date: Apr 2, 2025
Tags: Blog, Ethical AI
Company: Google
Big book of R
Date: Feb 8, 2025
Tags: Learning resource
The 2025 AI Index Report
Tags: Report, Industry
SelfCheckGPT for LLM Evaluation
Date: Mar 26, 2025
Tags: Blog, Hands-on, Opik
Company: Comet
Training and Finetuning Reranker Models with Sentence Transformers v4
Date: Mar 26, 2025
Tags: Blog, Hands-on
Recent reasoning research: GRPO tweaks, base model RL, and data curation
Date: Apr 1, 2025
Tags: Learning resource
Tracing the thoughts of a large language model
Date: Mar 27, 2025
Tags: Blog
Company: Anthropic
The 13 software engineering laws
Date: Apr 1, 2025
Tags: Practice
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
Date: Apr 5, 2025
Tags: Press release, Llama 4
Company: Meta
All RL Algorithms from Scratch
Tags: Learning resource, OSS
Build GraphRAG applications using Spanner Graph and LangChain
Date: Mar 22, 2025
Tags: Design, GraphRAG
Company: Google
Open-Source AI Agent Frameworks: Which One Is Right for You?
Date: Mar 19, 2025
Tags: Overview, LLM workflow
Parsing PDFs with LlamaParse: a how-to guide
Date: Mar 20, 2025
Tags: Press release, Data extraction
Company: LlamaIndex
Scaling Supervision or: How We Learned to Stop Worrying and Love Bitbucket Pipelines
Date: Mar 6, 2025
Tags: Blog, Human-in-the-loop, Data management
Microsoft unveils Microsoft Security Copilot agents and new protections for AI
Date: Mar 24, 2025
Tags: Press release, Security agent
Company: Microsoft
Architecture Patterns with Python
Tags: Learning resource, DDD
On the Biology of a Large Language Model
Tags: Learning resource, LLM
Hands-On APIs for AI and Data Science
Tags: Learning resource, APIs, Python
Improving Recommendation Systems & Search in the Age of LLMs
Date: Mar, 2025
Tags: Overview, Learning resources, RecSys
Foundation Model for Personalized Recommendation
Date: Mar 22, 2025
Tags: Blog, LLMApp, RecSys
Company: Netflix
QueryGPT – Natural Language to SQL Using Generative AI
Date: Sep 19, 2024
Tags: Blog, LLMApp
Company: Uber
Scaling Recommendation Systems Training to Thousands of GPUs with 2D Sparse Parallelism
Date: Mar 11, 2025
Tags: Press release, PyTorch
Company: Meta
Introducing the Weaviate Transformation Agent
Date: Mar 11, 2025
Tags: Press release
Company: Weaviate
100 Most Watched Python Talks Of 2024
Date: Mar 19, 2025
Tags: Learning resources
What Are Agentic Workflows? Patterns, Use Cases, Examples, and More
Date: Mar 6, 2025
Tags: Overview, AI agent workflow
Company: Weaviate
OmniAI OCR Benchmark
Date: Feb 20, 2025
Tags: Benchmark, OCR, LLM usecase
Company: OmniAI
Defense Against Dishonest Charts
Tags: Learning resource, Data science
Using generative AI to scale DuoRadio 10x faster
Date: Mar 11, 2025
Tags: Blog
Company: Duolingo
LeRobot goes to driving school
Date: Mar 11, 2025
Tags: Blog, Autonomous driving
New tools for building agents
Date: Mar 11, 2025
Tags: Press release, AI workflow
THE STARTUP CTO'S HANDBOOK
Tags: Learning resource, Engineering management
The data validation landscape in 2025
Date: Mar 5, 2025
Tags: Blog, Toolings, Data engineering
Optimizing Query Performance with Materialized Views with Arun Parthiban
Date: Mar 5, 2025
Tags: Presentation, Engineering
Company: Datadog
MIT 6.824 Distributed Systems (Spring 2020)
Tags: Learning resources
Company: MIT
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Tags: Paper
Company: Sony
CoreWeave and Weights & Biases to Join Forces
Date: Mar 4, 2025
Tags: News
LLM Evaluation Frameworks
Date: Feb 17, 2025
Tags: Overview, LLM Evaluation
LLM Juries for Evaluation
Date: Feb 24, 2025
Tags: Blog, Opik, LLM Evaluation
Company: Comet
Keynote: Why people think "agent" is a buzzword but it isn't
Date: Feb 22, 2025
Tags: Talk, Agent, Chip Huyen
Beyond RAG: Implementing Agent Search with LangGraph for Smarter Knowledge Retrieval
Date: Feb 22, 2025
Tags: Case, LangGraph, Retrieval
Company: LangChain
10x Cheaper PDF Processing: Ingesting and RAG on Millions of Documents with Gemini 2.0 Flash
Date: Feb 14, 2025
Tags: Case, RAG
The State of Machine Learning Competitions
Date: Feb, 2025
Tags: Overview, Competitions
AIBrix
Tags: OSS, LLM inference
Wan AI
Tags: OSS, Video generation
Company: Alibaba
Emerging Patterns in Building GenAI Products
Date: Feb 19, 2025
Tags: Practice, GenAI App
When Imperfect Systems are Good, Actually: Bluesky's Lossy Timelines
Date: Feb 19, 2025
Tags: Engineering
Deep Dive into LLMs like ChatGPT
Date: Feb 6, 2025
Tags: Learning resource, LLM
From PDFs to Insights: Structured Outputs from PDFs with Gemini 2.0
Date: Feb 7, 2025
Tags: Hands-on, Pydantic, Python
Hybrid Search Explained
Date: Jan 27, 2025
Tags: Blog, Technique
Company: Weaviate
Introducing Perplexity Deep Research
Date: Feb 14, 2025
Tags: Press release
Company: Perplexity
The Anthropic Economic Index
Date: Feb 10, 2025
Tags: News
Company: Anthropic
Introducing AgentWorkflow: A Powerful System for Building AI Agent Systems
Date: Jan 22, 2025
Tags: Press release, OSS
Company: LlamaIndex
Build your own xxx
Tags: Links, DIY
Your Company Needs Small Language Models
Date: Dec 26, 2024
Tags: Blog, Strategy
Introducing deep research
Date: Feb 2, 2025
Tags: Press Release
Company: OpenAI
Building Opik: A Scalable Open-Source LLM Observability Platform
Date: Jan 29, 2025
Tags: OSS, Observability
Company: Comet
G-Eval for LLM Evaluation
Date: Jan 28, 2025
Tags: OSS, Evaluation
Company: Comet
Choosing the Right AI Agent Framework: LangGraph vs CrewAI vs OpenAI Swarm
Date: Dec 3, 2024
Tags: Blog, Comparison
AI Engineering with Chip Huyen
Date: Feb 6, 2025
Tags: Interview
Introducing Citations on the Anthropic API
Date: Jan 24, 2025
Tags: RAG App, New Service
Company: Anthropic
The Illustrated DeepSeek-R1
Date: Jan 28, 2025
Tags: Learning resource, Algorithm
Which AI to Use Now: An Updated Opinionated Guide
Date: Jan 26, 2025
Tags: Overview
How to align open LLMs in 2025 with DPO & synthetic data
Date: Jan 23, 2025
Tags: Learning resource, Algorithm
State of open video generation models in Diffusers
Date: Jan 27, 2025
Tags: Overview
SmolVLM Grows Smaller – Introducing the 250M & 500M Models!
Date: Jan 23, 2025
Tags: New LLM, Small LLM
International AI Safety Report
Date: Jan, 2025
Tags: Report
Common pitfalls when building generative AI applications
Date: Jan 16, 2025
Tags: Lesson, AI app
Lessons Learned from Building an AI Sales Assistant
Date: Jan 21, 2025
Tags: Blog, AI app, LlamaIndex Workflows
Company: NVIDIA
Building knowledge graph agents with LlamaIndex Workflows
Date: Jan 15, 2025
Tags: Blog
Company: LlamaIndex
The Rise of Single-Node Processing: Challenging the Distributed-First Mindset
Date: Jan 6, 2025
Tags: Overview, Data processing
Cellm
Tags: OSS, Excel extension
Observability: the present and future, with Charity Majors
Date: Jan 23, 2025
Tags: Talk, Observability, Honeycomb
NVIDIA CEO Jensen Huang Keynote at CES 2025
Date: Jan 7, 2025
Tags: Keynote, Overview
Company: NVIDIA
Introducing Agentic Document Workflows
Date: Jan 9, 2025
Tags: Blog, LLM App, Agent
Company: LlamaIndex
Codestral 25.01
Date: Jan, 2025
Tags: PR, Coding Support
Company: Mistral
How to Build AI Agents with LangGraph: A Step-by-Step Guide
Date: Sep 6, 2024
Tags: Hands-on, LangGraph
The 2025 AI Engineer Reading List
Date: Dec 28, 2024
Tags: Learning resource, Paper
Agents
Date: Jan 7, 2025
Tags: Overview, Learning resource, AI Agent
AI Agent Workflow Design Patterns — An Overview
Date: Dec 11, 2024
Tags: Learning resource, AI Agent
Building Agentic Workflows with Inngest
Date: Jan 7, 2025
Tags: Blog, Workflow, AI Agent
Company: Weaviate
NeuralSVG: An Implicit Representation for Text-to-Vector Generation
Tags: OSS, Paper
Intro to LLM Observability: What to Monitor & How to Get Started
Date: Dec 19, 2024
Tags: Practice, LLM app, Observability
Company: Comet
Building effective agents
Date: Dec 20, 2024
Tags: Practice, LLM app, Agent, Workflow
Company: Anthropic
What Are Shapley Interactions, and Why Should You Care?
Date: Dec 4, 2024
Tags: Data science, Math
Streamlining AI Paper Discovery: Building an Automated Research Newsletter
Date: Dec 5, 2024
Tags: Blog, LLM app
Things we learned about LLMs in 2024
Date: Dec 31, 2024
Tags: Review, 2024
Databases in 2024: A Year in Review
Date: Jan 1, 2025
Tags: Review, 2024
DuckDB: Crunching Data Anywhere, From Laptops to Servers • Gabor Szarnyas • GOTO 2024
Tags: Learning resource, DuckDB
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Tags: OSS, LLM
Introducing Phi-4: Microsoft’s Newest Small Language Model Specializing in Complex Reasoning
Date: Dec 13, 2024
Tags: News, New Model
Company: Microsoft
Quick software tips for new ML researchers
Tags: Learning resource, ML and MLOps
Archetypes of LLM apps
Date: Nov 22, 2024
Tags: Learning resource, LLM App
Introducing Google Agentspace
Tags: News, New Service
Company: Google
Ilya Sutskever: "Sequence to sequence learning with neural networks: what a decade"
Date: Dec 15, 2024
Tags: Learning resource, Presentation, Review
Monolith: Real Time Recommendation System With Collisionless Embedding Table
Date: Sep 27, 2022
Tags: Paper, Online Training and Serving
Introducing Gemini 2.0: our new AI model for the agentic era
Date: Dec 11, 2024
Tags: News, New Model
Company: Google
Sora is here
Date: Dec 9, 2024
Tags: News, New Model
Company: OpenAI
Sharing new research, models, and datasets from Meta FAIR
Date: Dec 12, 2024
Tags: News, New Model
Company: Meta
Designing Multi-Tenancy RAG with Milvus: Best Practices for Scalable Enterprise Knowledge Bases
Date: Dec 4, 2024
Tags: Engineering, RAG, Production
Company: Milvus
Optimize parsing costs with LlamaParse auto mode
Date: Dec 9, 2024
Tags: News, New Feature
Company: LlamaIndex
Taming LLMs
Tags: Practice, LLMOps
5-Day Gen AI Intensive Course with Google Learn Guide
Tags: Hands-on, Learning resource, GenAI, LLMOps
Company: Google
Create a self-escalating chatbot in Conversational Agents using Webhook and Generators
Date: Nov 23, 2024
Tags: Hands-on, GenAI, Agent
Company: Google
Build an Agentic Video Workflow with Video Search and Summarization
Date: Dec 3, 2024
Tags: Hands-on, GenAI, Workflow
Company: NVIDIA
Deploy QwQ-32B-Preview the best open Reasoning Model on AWS with Hugging Face
Date: Dec 3, 2024
Tags: Hands-on, SageMaker, QwQ 32B Model
Veo and Imagen 3: Announcing new video and image generation models on Vertex AI
Date: Dec 4, 2024
Tags: PR, New Model
Company: Google
LLMOps Database
Tags: Learning Resource, LLMOps
Company: ZenML
Constructing a Knowledge Graph with LlamaIndex and Memgraph
Date: Nov 21, 2024
Tags: Hands-on
Company: LlamaIndex
Create a Swarm of Agents
Date: Nov 26, 2024
Tags: Hands-on
Company: Haystack
Perplexity for LLM Evaluation
Date: Nov 21, 2024
Tags: Evaluation, LLM
Company: Comet
Which Foundation Model is best for Agent Orchestration
Date: Nov 20, 2024
Tags: Comparison, LLM
An Intuitive Explanation of Sparse Autoencoders for LLM Interpretability
Date: Jun 11, 2024
Tags: Learning resource, Sparse Autoencoder
Automatically generating cloud configurations: Introducing RAGformation
Date: Nov 14, 2024
Tags: Architecture, LLM Agent
Company: LlamaIndex
65,000 nodes and counting: Google Kubernetes Engine is ready for trillion-parameter AI models
Date: Nov 14, 2024
Tags: News, GKE, Spanner
Company: Google
Tracking Voluntary Commitments
Date: Nov 18, 2024
Tags: Blog, Activities
Company: Anthropic
Large Language Models explained briefly
Date: Nov 21, 2024
Tags: Learning resource, LLM
Introducing Netflix’s TimeSeries Data Abstraction Layer
Date: Oct 9, 2024
Tags: Architecture, Data Platform
Company: Netflix
FireDucks : Pandas but 100x faster
Date: Nov 11, 2024
Tags: Pandas, Polars
prompt-tuning-playbook
Tags: Learning resources, Prompt tuning
Designing Cognitive Architectures: Agentic Workflow Patterns from Scratch
Date: Oct 25, 2024
Tags: Learning resources, Architecture, LLM App
How to deploy and serve multi-host gen AI large open models over GKE
Date: Nov 9, 2024
Tags: Hands-on, Llama, GKE
Company: Google
Deploying LLMs with TorchServe + vLLM
Date: Oct 31, 2024
Tags: TorchServe, LLM
Company: PyTorch
RFP Response Generation Workflow (with Human-in-the-Loop)
Tags: Hands-on, Llama
Vector Indexes
Date: Aug 14, 2024
Tags: Learning Resource, Vector index
Company: VectorHub
BI-as-Code and the New Era of GenBI
Date: Nov 4, 2024
Tags: Blog, BI, GenAI
Company: Rill
What is Agentic RAG
Date: Nov 5, 2024
Tags: Blog, RAG Agent
Company: Weaviate
Slurm vs Kubernetes: Which to choose for your ML workloads
Date: Jun 10, 2024
Tags: Learning resource, Slurm, Kubernetes
Introducing the next-level of AI-powered workflows with Amazon Q Developer inline chat
Date: Oct 29, 2024
Tags: Amazon Q, Coding support
Company: AWS
Time-MoE
Tags: OSS, Dataset
GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning
Date: Oct 30, 2024
Tags: Blog, Data poisoning
Company: FAR.AI
Evaluating Model Retraining Strategies
Date: Oct 21, 2024
Tags: Practice, Drift
Vector Databases Are the Wrong Abstraction
Date: Oct 29, 2024
Tags: Practice, OSS, pgai, pgvectorscale
Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku
Date: Oct 23, 2024
Tags: PR, Claude, RPA
Company: Anthropic
Introducing quantized Llama models with increased speed and a reduced memory footprint
Date: Oct 24, 2024
Tags: OSS, Llama, Quantization
Company: Meta
Un Ministral, des Ministraux
Date: Oct 16, 2024
Tags: OSS, Mistral, Ministral
Company: Mistral
Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking
Tags: OSS, Dataset, Ranking
Deploy Llama 3.2 Vision on Amazon SageMaker
Date: Oct 17, 2024
Tags: Hands-on, SageMaker, Llama
PyTorch 2.5 Release Blog
Date: Oct 17, 2024
Tags: Release note, PyTorch
BitNet
Tags: OSS, 1-bit LLMs
Company: Microsoft
Real-time Data Infrastructure at Uber
Date: Mar 31, 2021
Tags: Paper, System design
Company: Uber
Ray Batch Inference at Pinterest (Part 3)
Date: Oct 11, 2024
Tags: Ray Batch
Company: Pinterest
How Shopify improved consumer search intent with real-time ML
Date: Oct 16, 2024
Tags: Intent search, Real-time ML
Company: Shopify
Sharing new research, models, and datasets from Meta FAIR
Date: Oct 18, 2024
Tags: OSS, Data, Model, SAM
Company: Meta
STATE OF AI REPORT 2024
Tags: Report, AI
Announcing new products and features for Azure OpenAI Service including GPT-4o-Realtime-Preview with audio and speech capabilities
Date: Oct 1, 2024
Tags: Press release, Azure AI
Company: Microsoft
Building a Legal AI Agent using Azure AI Search, Azure OpenAI, LlamaIndex, and CrewAI
Date: Aug 22, 2024
Tags: Blog, LLM App
How we built ngrok's data platform
Date: Sep 26, 2024
Tags: Blog, Data platform
Company: ngrok
Airbyte
Tags: OSS, Data pipeline
Company: Airbyte
Superset
Tags: OSS, BI tool
Company: Superset
How to train a model on 10k H100 GPUs?
Date: Oct 2, 2024
Tags: Blog, GPU cluster
Introducing Netflix’s Key-Value Data Abstraction Layer
Date: Sep 19, 2024
Tags: Practice, Key-Value Store, Pagination
Company: Netflix
Optimize and deploy models with Optimum-Intel and OpenVINO GenAI
Date: Sep 20, 2024
Tags: Edge, OpenVINO, LLM
GPU-Puzzles
Tags: Learning resources, GPU, CUDA
GenOps: the evolution of MLOps for gen AI
Date: Sep 21, 2024
Tags: Blog, GenOps, MLOps
Company: Google
langfun
Tags: OSS, Plug-and-play
Company: Google
Measuring Developer Goals
Date: 2024
Tags: Paper, Developer Productivity
Company: Google
GPU acceleration with Polars and NVIDIA RAPIDS
Date: Sep 17, 2024
Tags: News, Polars, GPU, NVIDIA
Company: Polars
Introducing llama-deploy, a microservice-based way to deploy LlamaIndex Workflows
Date: Sep 5, 2024
Tags: llama-deploy, workflow
Company: LlamaIndex
CUDA-Free Inference for LLMs
Date: Sep 4, 2024
Tags: Practice
Company: PyTorch
Learning to Reason with LLMs
Date: Sep 12, 2024
Tags: News, GPT o1
Company: OpenAI
Opik
Tags: OSS, LLMOps, Platform
Company: Comet
How Much GPU Memory is Needed to Serve a Large Language Model (LLM)?
Date: Aug 17, 2024
Tags: Tips
The Pragmatic Programmer for Machine Learning
Tags: Learning resource
GenOps: learning from the world of microservices and traditional DevOps
Date: Aug 31, 2024
Tags: Practice, GenAIOps
Company: Google
Building a serverless RAG application with LlamaIndex and Azure OpenAI
Date: Aug 27, 2024
Tags: Hands-on, RAG, LlamaIndex, Azure
Company: LlamaIndex
Building a Low-Cost Local LLM Server to Run 70 Billion Parameter Models
Date: Aug 30, 2024
Tags: Hands-on, Local, LLM
Company: Comet
Things I Wished More Developers Knew About Databases
Date: Apr 22, 2020
Tags: Learning resource, DB
Aryn
Tags: OSS, Extract from PDF
Company: Aryn
Enriching and Ingesting Data into Weaviate with Aryn
Date: Sep 3, 2024
Tags: Hands-on, Aryn, Weaviate
Company: Weaviate
Locally running RAG pipeline with Verba and Llama3 with Ollama
Date: Jul 9, 2024
Tags: Hands-on, RAG, pipeline
Company: Weaviate
Prompt caching with Claude
Date: Aug 15, 2024
Tags: Prompt engineering
Company: Anthropic
Postgres as a search engine
Date: Aug 19, 2024
Tags: Blog, Postgres, Vector search
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation
Date: Aug 7, 2024
Tags: Hands-on, Multimodal Language Model
Introduction to Distributed Pipeline Parallelism
Tags: Tutorial, Distributed system
Company: PyTorch
How to Deploy the Open-Source Milvus Vector Database on Amazon EKS
Date: Aug 9, 2024
Tags: Hands-on, Milvus
Company: Milvus
Deploy open LLMs with Terraform and Amazon SageMaker
Date: Aug 5, 2024
Tags: Tutorial, LLMs, AWS SageMaker, Terraform
CPU-Optimized Embedding Models with fastRAG and Haystack
Date: Aug 1, 2024
Tags: Experiment, fastRAG
Company: Haystack
Securing Generative AI Deployments with NVIDIA NIM and NVIDIA NeMo Guardrails
Date: Aug 5, 2024
Tags: Tutorial, NVIDIA NIM, NVIDIA NeMo Guardrails
Company: NVIDIA
Amazon’s Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2
Date: Jul 25, 2024
Tags: Practice, Migration from Spark to Ray
Company: AWS
LLM Knowledge Graph Builder: From Zero to GraphRAG in Five Minutes
Date: Jun 19, 2024
Tags: Tutorial, Graph RAG
Company: Neo4j
A Visual Guide to Quantization
Date: Jul 22, 2024
Tags: Learning resource, Quantization
Building A Generative AI Platform
Date: Jul 25, 2024
Tags: Practice, RAG App
What We’ve Learned From A Year of Building with LLMs
Date: Jun 8, 2024
Tags: Practice, LLM Application
Maestro: Data/ML Workflow Orchestrator at Netflix
Date: Jul 23, 2024
Tags: OSS, Workflow
Company: Netflix
Reasoning through arguments against taking AI safety seriously
Date: Jul 9, 2024
Tags: Blog, AI Safety
Introducing Micro Agent: An (Actually Reliable) AI Coding Agent
Date: Jun 19, 2024
Tags: OSS, Coding assistant
Company: Builder.io
LLM Evaluation doesn't need to be complicated
Date: Jul 11, 2024
Tags: Tips, LLM App Evaluation
Building and scaling Notion’s data lake
Date: Jul 1, 2024
Tags: Engineering, Data lake
Company: Notion
Deploy Multilingual LLMs with NVIDIA NIM
Date: Jul 8, 2024
Tags: Tutorial, NVIDIA NIM
Company: NVIDIA
Google Cloud TPUs made available to Hugging Face users
Date: Jul 9, 2024
Tags: News
Company: Hugging Face
Extrinsic Hallucinations in LLMs
Date: Jul 7, 2024
Tags: Learning Resource, Hallucination
Multi AI Agent Systems 101
Date: Jun 17, 2024
Tags: Tutorial, CrewAI
Modernizing Uber’s Batch Data Infrastructure with Google Cloud Platform
Date: May 30, 2024
Tags: migration, GCP
Company: Uber
Step-by-Step Guide to Choosing the Best Embedding Model for Your Application
Date: Jun 4, 2024
Tags: FYI, Embeddings
Company: Weaviate
Constructing knowledge graphs from text using OpenAI functions
Date: Oct 20, 2023
Tags: Practice, Neo4j, LangChain
MLOps Org
Tags: Org
Mastering AI Department Reorganizations: Lessons from the Trenches
Date: Jun 13, 2024
Tags: Engineering Management
Scale LLMs with PyTorch 2.0 FSDP on Amazon EKS – Part 2
Date: Apr 1, 2024
Tags: AWS EKS, LLMs Training
Company: AWS
From bare metal to a 70B model: infrastructure set-up and scripts
Date: Jun 25, 2024
Tags: Practice, ML Infra
Company: Imbue
Deep dive into how Pinterest built its Text-to-SQL solution
Date: May 10, 2024
Tags: Text to SQL
Company: Pinterest
New Chunking Method for RAG-Systems
Date: Jun 2, 2024
Tags: Knowledge share
Benchmarking Haystack Pipelines for Optimal Performance
Date: Jun 24, 2024
Tags: Benchmarking, RAG, Performance
Company: Haystack
Gemma 2 is now available to researchers and developers
Date: Jun 27, 2024
Tags: OSS, LLMs
Company: Google
Data + AI Summit 2024 - Keynotes
Date: Jun, 2024
Tags: Summit, Data + AI Summit
Company: Databricks
Beyond the Basics of Retrieval for Augmenting Generation
Date: Jun 12, 2024
Tags: Practice, RAG
Sharing new research, models, and datasets from Meta FAIR
Date: Jun 18, 2024
Tags: OSS, Research, Models, Datasets
Company: Meta
How Meta trains large language models at scale
Date: Jun 12, 2024
Tags: Practice, Infrastructure
Company: Meta
Private Cloud Compute: A new frontier for AI privacy in the cloud
Date: Jun 10, 2024
Tags: Policy, Security, Privacy
Company: Apple
How to Build an End-to-End ML Pipeline in 2024
Date: Apr 7, 2024
Tags: Learning resources, MLOps
Uncensor any LLM with abliteration
Date: Jun 13, 2024
Tags: Technique, LLM Application
Advanced RAG: Corrective Retrieval Augmented Generation (CRAG) with LangGraph
Date: Apr 24, 2024
Tags: Technique, RAG Application
Monitoring LLM Security in Langfuse
Date: May 14, 2024
Tags: Tool, LLM, Security
Company: Langfuse
Let's reproduce GPT-2 (124M)
Date: Jun 10, 2024
Tags: Learning resources, Hands-on, GPT-2
Building RAG Applications with NVIDIA NIM and Haystack on K8s
Tags: Tutorial, NVIDIA NIM, Haystack
NVIDIA NIM Offers Optimized Inference Microservices for Deploying AI Models at Scale
Date: Mar 18, 2024
Tags: News, NVIDIA NIM
Company: NVIDIA
NVIDIA NIM Inference Microservice
Tags: Tutorial, NVIDIA NIM, LlamaIndex
Company: LlamaIndex
The 10 Minute Guide to Reliable RAG Systems Using Patronus AI, MongoDB Atlas, and LlamaIndex
Date: Jan 10, 2024
Tags: Tutorial, Patronus AI, RAG
Company: Patronus AI
AI in software engineering at Google: Progress and the path ahead
Date: Jun 6, 2024
Tags: Blog, Software Engineer, AI support
Company: Google
Developer Productivity for Humans, Part 7: Software Quality
Date: Dec 22, 2023
Tags: Paper, Developer Productivity
Company: Google
Open-source Model Fine-Tuning Leaderboard
Tags: Leaderboard, LLM, Fine Tuning
Company: Predibase
Hello Qwen2
Date: Jun 7, 2024
Tags: OSS, LLM
Company: Qwen
Unexpected Anti-Patterns for Engineering Leaders — Lessons From Stripe, Uber & Carta
Tags: Note, Engineering Management
From Predictive to Generative – How Michelangelo Accelerates Uber’s AI Journey
Date: May 2, 2024
Tags: System Design, MLOps, Michelangelo
Company: Uber
What We’ve Learned From A Year of Building with LLMs
Date: Jun 8, 2024
Tags: Practice, LLMs
Company: Applied LLMs
Create a Blog Writer Multi-Agent System using Crewai and Ollama
Date: May 27, 2024
Tags: Practice, LLM Apps
MLOps Infrastructure at Mission Lane (Part 1)
Date: Jan 31, 2024
Tags: System Design, MLOps
Company: Mission Lane
Powering Feature Stores with ClickHouse
Date: Jan 18, 2024
Tags: System Design, MLOps
Company: ClickHouse
How To Organize Continuous Delivery of ML/AI Systems: a 10-Stage Maturity Model
Date: May 17, 2024
Tags: Guideline, CICD, MLOps
Company: Outerbounds
Langfuse
Tags: OSS, LLMOps
Company: Langfuse
Building an Observable arXiv RAG Chatbot with LangChain, Chainlit, and Literal AI
Date: May 14, 2024
Tags: Practice, LLM App
Llama 3 implemented in pure NumPy
Date: May 16, 2024
Tags: Learning resource, Llama3, numpy
The 4 Advanced RAG Algorithms You Must Know to Implement
Date: May 4, 2024
Tags: Practice, RAG
Data Wrangler
Tags: VSCode extension, Data engineering
Company: Microsoft
MOMENT
Tags: OSS, Time series ML
Company: Carnegie Mellon University
A first attempt at DSPy Agents from scratch
Tags: Tutorial, DSPy
ScrapeGraphAI
Tags: OSS, Web Scraping, LLM
Common Pitfalls To Avoid When Using Vector Databases
Date: Apr 29, 2024
Tags: Practice, Vector database
Streaming Pipelines for Fine-tuning LLMs and RAG in Real-Time
Date: Apr 24, 2024
Tags: Streaming pipeline
Company: Comet
Building a Chat Application with LangChain, LLMs, and Streamlit for Complex SQL Database Interaction
Date: Feb 10, 2024
Tags: LangChain
A Visual Guide to Vision Transformers
Date: Apr 5, 2024
Tags: Learning resource, Vision Transformers
Snowflake Launches the World’s Best Practical Text-Embedding Model for Retrieval Use Cases
Date: Apr 16, 2024
Tags: OSS, LLM
Company: Snowflake
Rules of Machine Learning: Best Practices for ML Engineering
Tags: Practice, ML Engineering
Shepherd: How Stripe adapted Chronon to scale ML feature development
Date: Apr 15, 2024
Tags: Practice, Chronon
Company: Stripe
Verba: Building an Open Source, Modular RAG Application
Date: Mar 7, 2024
Tags: OSS, RAG Application, Verba
Company: Weaviate
Our next-generation Meta Training and Inference Accelerator
Date: Apr 10, 2024
Tags: Hardware, AI
Company: Meta
Measuring trends in AI
Tags: Report, Trend
Company: Stanford Univ
Introducing Meta Llama 3: The most capable openly available LLM to date
Date: Apr 18, 2024
Tags: LLM, Llama3
Company: Meta
NSA Publishes Guidance for Strengthening AI System Security
Date: Apr 15, 2024
Tags: Government, Guidance, Security
Company: NSA
Chronon, Airbnb’s ML Feature Platform, Is Now Open Source
Date: Apr 9, 2024
Tags: OSS, Feature Platform, Chronon
Company: Airbnb
But what is a GPT? Visual intro to transformers | Chapter 5, Deep Learning
Date: Apr 2, 2024
Tags: Learning Resource, Transformers
Data acquisition strategies for AI-first start-ups
Date: Apr 4, 2024
Tags: For Beginner, Data
Supporting Diverse ML Systems at Netflix
Date: Mar 8, 2024
Tags: Practice, ML system, Metaflow
Company: Netflix
Scaling AI/ML Infrastructure at Uber
Date: Mar 28, 2024
Tags: Practice, ML Infra
Company: Uber
Introducing DBRX: A New State-of-the-Art Open LLM
Date: Mar 27, 2024
Tags: OSS, LLM
Company: Databricks
CI/CD for Machine Learning in 2024: Best Practices to Build, Train, and Deploy
Date: Dec 27, 2023
Tags: Practice, MLOps
Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval
Date: Mar 22, 2024
Tags: LLM, Quantization
Company: Hugging Face
7 Methods to Secure LLM Apps from Prompt Injections and Jailbreaks
Date: Mar 26, 2024
Tags: Prompt Injection
Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform
Date: Mar 5, 2024
Tags: DevOps, Auto Remediation, OOM error
Company: Netflix
PDF-Based Question Answering with Amazon Bedrock and Haystack
Date: Jan 17, 2024
Tags: Practice, Bedrock, OpenSearch, Haystack
Company: deepset
Generative AI for Beginners
Tags: Learning resource
Company: Microsoft
Building Meta’s GenAI Infrastructure
Date: Mar 12, 2024
Tags: Infrastructure, High-Level Architecture
Company: Meta
LLM Evaluation Metrics: Everything You Need for LLM Evaluation
Date: Jan 22, 2024
Tags: deepeval, LLM Evaluation Metrics
Company: Confident AI
MIT 6.S087: Foundation Models & Generative AI (2024)
Tags: Lecture
Build an LLM-Powered API Agent for Task Execution
Date: Feb 21, 2024
Tags: LLM, Application, API Agent
Company: NVIDIA
Intro to DSPy: Goodbye Prompting, Hello Programming!
Date: Feb 29, 2024
Tags: LLMOps, DSPy, Prompt
How to Build an Advanced AI-Powered Enterprise Content Pipeline Using Mixtral 8x7B and Qdrant
Date: Feb 19, 2024
Tags: LLM, Search, Touch and Try
Top 5 Web Scraping Methods: Including Using LLMs
Date: Feb 26, 2024
Tags: Scraping
Company: Comet
Deploying LLMs Into Production Using TensorRT LLM
Date: Feb 22, 2024
Tags: NVIDIA, TensorRT LLM, Tutorial
Generative AI Design Patterns: A Comprehensive Guide
Date: Feb 14, 2024
Tags: GenAI, Practice, Application Pattern
Gemma Open Models
Tags: AI, OSS
Company: Google
GPT in 60 Lines of NumPy
Date: Jan 30, 2023
Tags: Learning resource
Automated Unit Test Improvement using Large Language Models at Meta
Tags: Paper, LLM, Unit Test Improvement
Company: Meta
My MLOps bookshelf
Date: Apr 17, 2023
Tags: Books
(Almost) Every infrastructure decision I endorse or regret after 4 years running infrastructure at a startup
Date: Feb 1, 2024
Tags: Ops, Infrastructure, Kubernetes, Good and Bad Practice
Top Evaluation Metrics for RAG Failures
Date: Feb 3, 2024
Tags: RAG, LLM, Ops, Metrics, Practice
Forecast Evaluation for Data Scientists: Common Pitfalls and Best Practices
Tags: Best Practice, Data Science, Evaluation
Enhance Conversational Agents with LangChain Memory
Date: Jan 25, 2024
Tags: LangChain, ChatOps, Chat agent
Company: Comet
Doubling Down on Production AI at Tecton
Date: Dec 19, 2023
Tags: News, Feast, Tecton
Company: Tecton
LLaVA-1.6: Improved reasoning, OCR, and world knowledge
Date: Jan 30, 2024
Tags: News, LLaVA
What is the dumbest thing you have seen in data science?
Tags: reddit, dumbest, Data Science
Host the Whisper Model on Amazon SageMaker: exploring inference options
Date: Jan 16, 2024
Tags: OpenAI Whisper, Amazon SageMaker
Company: AWS
Advanced RAG: Query Augmentation for Next-Level Search using LlamaIndex
Date: Jan 18, 2024
Tags: RAG, LlamaIndex
How Meta built the infrastructure for Threads
Date: Dec 19, 2023
Tags: ZippyDB, Infrastructure
Company: Meta
CI/CD for Machine Learning in 2024: Best Practices to Build, Train, and Deploy
Date: Dec 27, 2023
Tags: CI/CD/CT, MLOps, ML Lifecycle
https://github.com/collabora/WhisperSpeech
Tags: OSS, Speech to Text
https://github.com/recommenders-team/recommenders
Tags: OSS, Recommender system, Algorithm
Generating value from enterprise data: Best practices for Text2SQL and generative AI
Date: Jan 4, 2024
Tags: LLM application, AWS Bedrock
Company: AWS
Merge Large Language Models with mergekit
Date: Jan 9, 2024
Tags: OSS, LLMs merge, Algorithm
https://github.com/DAGWorks-Inc/hamilton
Tags: OSS, DAG, Dataflow
2023: A year of groundbreaking advances in AI and computing
Date: Dec 22, 2023
Tags: Summary
Company: Google
Evaluating Prompts: A Developer’s Guide
Date: Dec 19, 2023
Tags: LLM, Prompt engineering, Evaluation guide
Company: Arize
Speculative Decoding for 2x Faster Whisper Inference
Date: Dec 20, 2023
Tags: Whisper, Engineering
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Tags: Diffusion model, OSS
How To Use Comet At Different Stages of ML Projects
Date: Dec 20, 2023
Tags: Practice, MLOps
Company: Comet
Anti-hype LLM reading list
Tags: LLM links
How Pinterest scaled to 11 million users with only 6 engineers
Date: Oct 3, 2023
Tags: Engineering, Practice
Company: Pinterest
Optimizing LLMs From a Dataset Perspective
Date: Sep 15, 2023
Tags: LLMs, Dataset, Finetuning
Advanced RAG Techniques: an Illustrated Overview
Date: Dec 17, 2023
Tags: RAG, Techniques
Efficient Vector Similarity Search in Recommender Workflows Using Milvus with NVIDIA Merlin
Date: Dec 15, 2023
Tags: Practice, Vector search, Recommender, NVIDIA Merlin, Milvus
Company: Milvus
Our First Netflix Data Engineering Summit
Date: Dec 15, 2023
Tags: Data Engineering, Practice
Company: Netflix
Minimize real-time inference latency by using Amazon SageMaker routing strategies
Date: Nov 30, 2023
Tags: SageMaker, real-time inference
Company: AWS
A Guide on 12 Tuning Strategies for Production-Ready RAG Applications
Date: Dec 6, 2023
Tags: RAG, Practice
Deploy Mixtral 8x7B on Amazon SageMaker
Date: Dec 12, 2023
Tags: RAG, SageMaker, Mixtral
Five Categories of Data Quality and the Quality Management Process
Date: Dec 18, 2023
Tags: Data engineering
Company: 風音屋
Extracting Training Data from ChatGPT
Date: Nov 28, 2023
Tags: LLMs, Vulnerabilities
LMQL — SQL for Language Models
Date: Nov 28, 2023
Tags: LLMOps
Designing a Distributed SQL Engine: Challenges & Decisions
Date: Apr 23, 2023
Tags: Engineering
Company: OceanBase
Unveiling the Core of Instacart’s Griffin 2.0: A Deep Dive into the Machine Learning Training Platform
Date: Nov 21, 2023
Tags: ML Platform, Training
Company: Instacart
Mastering ML Model Evaluation with Giskard: From Validation to CI/CD Integration
Date: Oct 24, 2023
Tags: Model evaluation, Scan, Vulnerability
Company: Giskard
Boost inference performance for LLMs with new Amazon SageMaker containers
Date: Nov 27, 2023
Tags: LLMs, AWS SageMaker
Company: AWS
Automatic detection of hallucination with SelfCheckGPT
Date: Nov, 2023
Tags: Hallucination
Learning JAX as a PyTorch developer
Date: Nov 9, 2023
Tags: JAX, PyTorch, Tips
make real, the story so far
Date: Nov 19, 2023
Tags: Application, OpenAI, ChatGPT
Ingesting Data for Semantic Searches in a Production-Ready Way
Date: Nov 19, 2023
Tags: LLMOps
Key takeaways from the Biden administration executive order on AI
Date: Oct 31, 2023
Tags: Executive Order on AI
High-Performance Llama 2 Training and Inference with PyTorch/XLA on Cloud TPUs
Date: Nov 6, 2023
Tags: TPU, PyTorch, Llama2, Training, Inference, High performance
Company: PyTorch
Boosting RAG: Picking the Best Embedding & Reranker models
Date: Nov 3, 2023
Tags: RAG, Evaluation, Rerankers
Production-Ready Observability Platform for AI Systems
Date: Nov 3, 2023
Tags: MLOps, Observability, Good Reference
New models and developer products announced at DevDay
Date: Nov 6, 2023
Tags: Update, ChatGPT
Company: OpenAI
Lessons Learned from Twenty Years of Site Reliability Engineering
Date: Oct, 2023
Tags: SRE, Lessons learned
Company: Google
The Guide To LLM Evals: How To Build and Benchmark Your Evals
Date: Oct 13, 2023
Tags: LLMOps, Performance evaluation
An Intro to Real-Time Machine Learning
Date: Oct, 2023
Tags: Feature store, Realtime ML
Branches Are All You Need: Our Opinionated ML Versioning Framework
Date: Oct 10, 2023
Tags: ML versioning, Git branch
Advanced RAG Implementation on Custom Data Using Hybrid Search, Embed Caching And Mistral-AI
Date: Oct 9, 2023
Tags: RAG, Implementation
Reflections on AI Engineer Summit 2023
Date: Oct, 2023
Tags: LLMOps, Practice
Startup CTO Handbook
Tags: Engineering management
State of AI Report 2023
Date: Oct 12, 2023
Tags: Report
Analyzing the Security of Machine Learning Research Code
Date: Oct 4, 2023
Tags: Security, Tips, Practice, Credential
Company: NVIDIA
Train and Deploy Mistral 7B with Hugging Face on Amazon SageMaker
Date: Oct 5, 2023
Tags: AWS SageMaker, Serving an LLM, WebAPI
Personalize your generative AI applications with Amazon SageMaker Feature Store
Date: Oct 6, 2023
Tags: AWS SageMaker, Serving an LLM
Company: AWS
Multimodality and Large Multimodal Models (LMMs)
Date: Oct 10, 2023
Tags: LMM, Multimodal, CLIP, Flamingo
Scaling Large (Language) Models with PyTorch Lightning
Date: Oct 4, 2023
Tags: PyTorch Lightning, Training, Sample
Company: Lightning AI
Retrieval Augmented Generation on audio data with LangChain and Chroma
Date: Sep 26, 2023
Tags: RAG
Company: AssemblyAI
Chroma
Tags: OSS, Vector database
Weaviate
Tags: OSS, Vector database
Prompt Engineering Evolution: Defining the New Program Simulation Prompt Framework
Date: Sep 29, 2023
Tags: Prompt engineering
Training Foundation Improvements for Closeup Recommendation Ranker
Date: Sep 27, 2023
Tags: MLOps, Automation
Company: Pinterest
How LinkedIn Is Using Embeddings to Up Its Match Game for Job Seekers
Date: Oct 5, 2023
Tags: Embeddings, Two tower, Recommender system
Company: LinkedIn
Lessons learned from implementing user-facing analytics / dashboards?
Date: Sep 29, 2023
Tags: Lessons learned, BI tool
10 Ways to Improve the Performance of Retrieval Augmented Generation Systems
Date: Sep 19, 2023
Tags: RAG
Causality for Machine Learning
Tags: Causality, Book
7 Habits of Highly Effective Software Engineers
Tags: Mindset
Accelerating Vector Search: Fine-Tuning GPU Index Algorithms
Date: Sep 11, 2023
Tags: Vector search, Benchmark, RAPIDS AI RAFT
Company: NVIDIA
LLM Monitoring and Observability — A Summary of Techniques and Approaches for Responsible AI
Date: Sep 15, 2023
Tags: LLMOps, Monitoring, Evaluating, Tracking
Google’s Bard chatbot can now find answers in your Gmail, Docs, Drive
Date: Sep 19, 2023
Tags: Bard, Chatbot, Application
Company: Google
DALL·E 3
Tags: Text-to-image, ChatGPT Plus
Company: OpenAI
Evaluation & Hallucination Detection for Abstractive Summaries
Date: Sep, 2023
Tags: Hallucination, survey, LLM, Summarization
Optimizing LLMs From a Dataset Perspective
Date: Sep 15, 2023
Tags: LLM, Fine tuning, Technique
LLM Training: RLHF and Its Alternatives
Date: Sep 10, 2023
Tags: LLM, Tuning, Training model, RLHF
Unlocking Multi-GPU Model Training with Dask XGBoost
Date: Sep 7, 2023
Tags: XGBoost, RAPIDS, Dask, Multi-GPU
Company: NVIDIA
Introducing Glassdoor’s ML Registry: A Centralized Artifact Management Solution
Date: Aug 31, 2023
Tags: Artifact registry, In-house solution
Company: Glassdoor
Automated trace collection and analysis
Date: Sep 5, 2023
Tags: CPU, GPU, Utilization, Profiling
Company: PyTorch
Optimize open LLMs using GPTQ and Hugging Face Optimum
Date: Aug 31, 2023
Tags: LLMs quantization, GPTQ
Organize Your Prompt Engineering with CometLLM
Date: Aug 26, 2023
Tags: CometLLM
Company: Comet
Accelerating AI: Implementing Multi-GPU Distributed Training for Personalized Recommendations
Date: Jun 8, 2023
Tags: Distributed training, DP (Data Parallel), DDP (Distributed Data Parallel)
Evaluating the fairness of computer vision models
Date: Aug 31, 2023
Tags: DINOv2, FACET, dataset, fairness, CV
Company: Meta
Teaching with AI
Date: Aug 31, 2023
Tags: GPT, LLMs, Prompt
Company: OpenAI
Lessons Learnt From Consolidating ML Models in a Large Scale Recommendation System
Date: Aug 25, 2023
Tags: Architect, ML system
Company: Netflix
Retrieval-Augmented Generation: How to Use Your Data to Guide LLMs
Date: Aug 24, 2023
Tags: LLMs, RAG, Tuning, Vector search, Guide
Measuring developer productivity? A response to McKinsey
Date: Aug 30, 2023
Tags: Note, Productivity, DORA, SPACE
Cerebrium
Tags: SaaS, Serverless
Company: Cerebrium
An Elegant Puzzle: Systems of Engineering Management
Tags: Engineering management, Reading note
Introducing Code Llama, a state-of-the-art large language model for coding
Date: Aug 24, 2023
Tags: Coding support, OSS
Company: Meta
How to Build a Fully Automated Data Drift Detection Pipeline
Date: Aug 2, 2023
Tags: Drift detection, Workflow, Evidently, Kestra
ML Pipeline Architecture Design Patterns (With 10 Real-World Examples)
Date: Aug 11, 2023
Tags: Real example, ML pipeline, Workflow
Company: Neptune
Deploy thousands of model ensembles with Amazon SageMaker multi-model endpoints on GPU to minimize your hosting costs
Date: Aug 8, 2023
Tags: SageMaker, Multi-Model Endpoints, MMEs
Company: Amazon
Curating Trillion-Token Datasets: Introducing NVIDIA NeMo Data Curator
Date: Aug 8, 2023
Tags: NeMo, Data curation, Data scraping
Company: NVIDIA
A layered approach to MLOps
Date: Dec 20, 2022
Tags: MLOps, DDD, Working with researcher and engineer, Practice
Company: Microsoft
Tutorial: Build an Active Learning Pipeline using Data Engine
Date: Aug 15, 2023
Tags: Data Engine
Company: DagsHub
Patterns for Building LLM-based Systems & Products
Date: Jul, 2023
Tags: LLMs, Evaluation, RAG, Fine-tuning, Caching, Guardrails, Defensive UX, Collect user feedback
Effectively solve distributed training convergence issues with Amazon SageMaker Hyperband Automatic Model Tuning
Date: Jun 13, 2023
Tags: Hyperparameter tuning, Distributed training
Company: Amazon
Launching Data Engine – A toolset for rapid iteration on unstructured datasets
Date: Jun 24, 2023
Tags: Dataset, Data engine, Data engineering
Company: DagsHub
Adding Interpretability to PyTorch Models with Captum
Date: Jun 28, 2023
Tags: Explainability
Explainable AI: Visualizing Attention in Transformers
Date: Jul 18, 2023
Tags: Explainability, Transformers, Attention
ML system design: 200 case studies to learn from
Date: Jun 12, 2023
Tags: List of ML system design
Company: Evidently AI
How to Deploy an AI Model in Python with PyTriton
Date: Jun 28, 2023
Tags: PyTriton, Triton Inference Server
Company: NVIDIA
Dealing with Train-serve Skew in Real-time ML Models: A Short Guide
Date: Jun 23, 2023
Tags: Drift detection, Realtime ML, Monitoring
Company: Nubank
You Don't Need a Bigger Boat
Tags: MLOps, Sample
MLOps Landscape in 2023: Top Tools and Platforms
Tags: Landscape
Company: Neptune AI
NeMo Guardrails
Tags: Chatbot, LLMs, Guardrails, OSS
Company: NVIDIA
Mitigations
Tags: Security
Company: Mitre
8 annoying A/B testing mistakes every engineer should know
Tags: Tips, AB test, mistake
Monitoring Machine Learning Models in Production
Date: Jun 13, 2023
Tags: Monitoring
Company: Comet
Arize-ai/phoenix
Date: Jun 19, 2023
Tags: Visualization, Notebook, Drift detection
How Google Measures and Manages Tech Debt
Date: Jun 9, 2023
Tags: Technical debt, Maturity model
Company: Google
Model CI/CD with Comet
Date: May 25, 2023
Tags: MLOps, Automation, CICD, Drift detection
Company: Comet
ML Ops at Reasonable Scale
Date: Jul 23, 2021
Tags: Pipeline, MLOps, Data engineering
OWASP Top 10 List for Large Language Models
Date: Jun 12, 2023
Tags: Security, Generative AI, Checklist
Securing AI Systems — Defensive Strategies
Date: Jun 7, 2023
Tags: Security, Generative AI, Strategy
Large Language Models: A Complete Guide
Date: May 31, 2023
Tags: LLMs, Training, Deploying, Various topics
Efficiently Scale LLM Training Across a Large GPU Cluster with Alpa and Ray
Date: May 15, 2023
Tags: Alpa, Ray, JAX, Building LLMs
Company: NVIDIA
DevEx: What Actually Drives Productivity
Date: May 3, 2023
Tags: DevEx, Productivity, Management
MLOps Guide
Tags: MLOps, Reference, Career
Scaling deep retrieval with TensorFlow Recommenders and Vertex AI Matching Engine
Date: May 2, 2023
Tags: MLOps, Retrieval, TwoTower, GCP
Company: Google
Recommended Resources for Starting A/B Testing
Date: Apr 20, 2023
Tags: A/B Testing, Links, Reference
Evaluate Your Team’s ML Maturity
Date: Apr 12, 2023
Tags: MLOps, Reproducibility, Visibility, Debugging, Monitoring
Company: Comet
A Practitioner's Guide to Monitoring Machine Learning Applications
Date: Apr 11, 2023
Tags: Observability
Deploy ML models at the edge with Microk8s, Seldon and Istio
Date: Mar 24, 2023
Tags: Edge, IoT, Tips
Building a Machine Learning Platform [Definitive Guide]
Date: Mar 23, 2023
Tags: MLOps, Platform
Company: Neptune AI
The Magic of Merlin: Shopify's New Machine Learning Platform
Date: Apr 6, 2022
Tags: Ray, Merlin, MLOps, Platform
Company: Shopify
Training 175B Parameter Language Models at 1000 GPU scale with Alpa and Ray
Date: Mar 22, 2023
Tags: LLMs, Alpa, Ray, Training
Company: Anyscale
How Discord Stores Trillions of Messages
Date: Mar 6, 2023
Tags: MongoDB, Cassandra, ScyllaDB, journey
Company: Discord
Distributed Training: Errors to Avoid
Date: Mar 2, 2023
Tags: Tips, Distributed training
Company: Neptune AI
SOON (Spark cOntinuOus iNgestion) for near real-time data at Coinbase - Part 1
Date: Jan 25, 2023
Tags: Real-time data processing
Company: Coinbase
NVIDIA Merlin meets the MLOps ecosystem: building a production-ready RecSys pipeline on cloud
Date: Feb 23, 2023
Tags: Recommender, MLOps, OSS
Company: NVIDIA
Unleashing ML Innovation at Spotify with Ray
Date: Feb 1, 2023
Tags: Ray
Company: Spotify
Curated papers, articles, and blogs on data science & machine learning in production.
Date: Feb 27, 2023
Tags: Curated list, MLOps, Production
Blueprints for recommender system architectures: 10th anniversary edition
Date: Jan 29, 2023
Tags: Recommender
Global expansion of Machine Learning Models: how to distribute them as software products?
Date: Jan 13, 2023
Tags: Globalization, Lessons learned
Company: Nubank
FastAPI Best Practices
Date: Jan 10, 2023
Tags: FastAPI, Best practice
Measuring an engineering organization
Date: Jan 2, 2023
Tags: Engineering management, Measuring productivity
Bringing Machine Learning to Production at Ubisoft
Date: Dec 29, 2022
Tags: Usecase
Company: Ubisoft
Serve hundreds to thousands of ML models — architectures from industry
Date: Jan 14, 2022
Tags: Serving, Inferencing, Pattern
Using MLOps to Build a Real-time End-to-End Machine Learning Pipeline
Date: Nov 29, 2022
Tags: MLOps, Use case, pipeline
Company: Binance
Securing Machine Learning Algorithms
Date: Dec 14, 2021
Tags: Security, Reference
Thoughts on ML Engineering After a Year of my PhD
Date: Jul 18, 2022
Tags: Consideration, ML Engineering
Argo Rollouts at scale: Bringing Automated Rollbacks to 2,100+ services at Monzo
Date: Nov 7, 2022
Tags: Argo rollouts, experience, automation
Company: Monzo
The Architecture of a Modern Startup
Date: Nov 5, 2022
Tags: Use case
Company: SYZYGY AI
The Growing Importance of Metadata Management Systems
Date: Oct 27, 2022
Tags: Metadata management, Introduction
Data Mesh Architecture
Date: Oct 27, 2022
Tags: Data mesh, Introduction
Lessons Learned: The Journey to Real-Time Machine Learning at Instacart
Date: Sep 7, 2022
Tags: Real time ML, Use case
Company: Instacart
Principles for the security of machine learning
Date: Aug 31, 2022
Tags: Security, Principles
Best Practices for ML Engineering
Date: Sep 12, 2022
Tags: MLOps, Principles
Company: Google
ML Education at Uber: Frameworks Inspired by Engineering Principles
Date: Jul 28, 2022
Tags: MLOps, Reference, Guide
Company: Uber
Towards data quality management at LinkedIn
Date: Jun 9, 2022
Tags: Data, Architecture, Design
Company: LinkedIn
Managing Uber’s Data Workflows at Scale
Date: Feb 28, 2019
Tags: Data, ETL, Architecture, Design
Company: Uber
Supercharging A/B Testing at Uber
Date: Jul 21, 2022
Tags: AB testing, Architecture, Design
Company: Uber
A Guide to Data Annotation and Synthetic Data Generation Tools
Date: Jul 12, 2022
Tags: Annotation, Synthetic data, Introduction
Scaling productivity on microservices at Lyft (Part 1)
Date: Nov 11, 2021
Tags: Kubernetes, Productivity, Use case, Engineering
Company: Lyft
How Netflix Scales its API with GraphQL Federation (Part 1)
Date: Nov 10, 2020
Tags: GraphQL, Use case, Engineering
Company: Netflix
Operating Apache Pinot @ Uber Scale
Date: Oct 20, 2020
Tags: Pinot, Use case, Engineering
Company: Uber
Stanford CS25: Transformers United
Date: Jul 19, 2022
Tags: ML, Learning-resource, Transformers
Company: Stanford
Deployment for Free -- A Machine Learning Platform for Stitch Fix's Data Scientists
Date: Jul 19, 2022
Tags: MLOps, In-House case, Stitch Fix
Company: Stitch Fix
Machine Learning Operations (MLOps): Overview, Definition, and Architecture
Date: Jul 11, 2022
Tags: MLOps, Definition, Role and responsibility
Shipping to Production
Date: Jul 4, 2022
Tags: MLOps, Production, Introduction
MLOps: A Taxonomy and a Methodology
Date: Jul 4, 2022
Tags: MLOps, Overview, Survey
Evidently
Date: Jul 4, 2022
Tags: MLOps, Monitoring, Drift detection
Company: Evidently AI
Made With ML #MLOps
Date: Jun 27, 2022
Tags: MLOps, Tutorial, Reference
The SPACE of Developer Productivity
Date: Jun 13, 2022
Tags: Productivity, Engineering Management
My MLOps Stack
Date: May 30, 2022
Tags: MLOps, Reference
Monzo’s machine learning stack
Date: May 2, 2022
Tags: MLOps, Operations
Company: Monzo
GitLab DevSecOps 2021 Survey
Date: Apr 25, 2022
Tags: DevSecOps, Survey
Company: GitLab
Interpretable Machine Learning
Date: Mar 29, 2022
Tags: Guide, Explainability
Scaling Machine Learning Productivity at LinkedIn
Date: Jan 3, 2019
Tags: Auto ML, Feature store, Monitoring
Company: LinkedIn
The journey to build an explainable AI-driven recommendation system to help scale sales efficiency across LinkedIn
Date: Apr 6, 2022
Tags: Explainability, Recommendation, Narrative generation
Company: LinkedIn
Introducing Fabricator: A Declarative Feature Engineering Framework
Date: Apr 4, 2022
Tags: Data, Data Engineering, Feature Store
Company: DoorDash
Rules of Machine Learning
Tags: General
Data Distribution Shifts and Monitoring
Date: Feb 7, 2022
Tags: General, Drift, Data Distribution, Observability
Scaling Kubernetes to 7,500 Nodes
Date: Jan 25, 2021
Tags: Kubernetes, Modeling platform, Computing resource management
Explanation of MLOps and its common terms
Date: Sep 29, 2022
Tags: General, Drift, Data Distribution, Observability
Company: Censius
Curated list and description of MLOps tools
Date: Sep 29, 2022
Tags: General, Drift, Data Distribution, Observability, Guide
Company: Censius
