[{"cid":58150,"parent_cid":null,"body":"pg_durable: PostgreSQL in-database durable execution\r\n\r\nhttps://github.com/microsoft/pg_durable\r\n\r\nLobsters: https://lobste.rs/s/k9wvkn/pg_durable_postgresql_database_durable","created_at":"2026-06-05T20:50:33.264Z","tags":["databases","bot"],"orgs":[],"usrs":[],"created_by":"bot_lobsters","thumb":"https://opengraph.githubassets.com/d464f31d9c8e8026a0209c759609ff1cac2171c9dcf57df14f226546e43c833b/microsoft/pg_durable","c_comments":1,"c_reactions":"","c_flags":0,"links":[],"flaggers":[],"author_ups":52,"author_downs":4,"author_posts_count":3920,"tag_ups":446,"tag_downs":77,"domain_ups":52,"domain_downs":4,"score":"2026-06-03T15:32:07.880Z","repost_ups":0,"mentions":[],"domains":["github.com","lobste.rs"],"comments":"1","reaction_count":"0","reaction_counts":{},"user_reactions":[],"child_comments":[{"cid":58154,"body":"> # [GitHub - microsoft/pg_durable: PostgreSQL in-database durable execution](https://github.com/microsoft/pg_durable)\r\n>\r\n> Long-running, fault-tolerant SQL functions for teams that already keep their state in Postgres and want to stop stitching together cron jobs, workers, queues, and status tables to make background work reliable. Define the workflow in SQL, let pg_durable checkpoint each step, and resume after crashes, restarts, or failed steps.\r\n>\r\n> Durable execution is now a standard industry pattern, and pg_durable brings it inside Postgres with no extra service infrastructure required. Part of our mission to bring compute close to data.\r\n>\r\n> ## Is this for me?\r\n>\r\n> ## Who it's for\r\n>\r\n> - Backend and data engineers who want workflows to live next to the data they touch.\r\n>\r\n> - DBAs and SREs automating runbooks that must survive restarts and be auditable in SQL.\r\n>\r\n> - Teams building data or AI pipelines that need durable execution per row, document, or batch.\r\n>\r\n> ## The core idea\r\n>\r\n> A pg_durable function is a graph of SQL steps that PostgreSQL executes and checkpoints as it goes. If the database crashes, restarts, or a step fails, execution resumes from the last durable checkpoint instead of making you reconstruct state by hand.\r\n>\r\n> ## Workloads this is useful for\r\n>\r\n> - Vector embedding pipelines: chunk, call an embedding API, and upsert into pgvector.\r\n>\r\n> - Ingest pipelines: stage, deduplicate, transform, and publish large batches.\r\n>\r\n> - Scheduled maintenance: detect bloat, notify, wait for approval, then run the next action.\r\n>\r\n> - Fan-out aggregation: run independent queries in parallel, then join the results.\r\n>\r\n> - External API workflows: enrichment, classification, and webhook-style calls from SQL.\r\n>\r\n> ## What you're probably doing today instead\r\n>\r\n> - pg_cron plus a jobs table, status columns, retry counters, and a polling worker.\r\n>\r\n> - An external orchestrator such as Airflow, Temporal, Step Functions, or Argo calling back into Postgres.\r\n>\r\n> - A queue plus workers plus a separate state table to coordinate retries and partial completion.\r\n>\r\n> - A plpgsql procedure that works until a crash or long-running transaction forces you to start over.\r\n>\r\n> ## Pain points it addresses\r\n>\r\n> - A restart in the middle of a long job means rerunning work that already succeeded.\r\n>\r\n> - One failed row or one failed API call turns into manual cleanup and uncertain replay.\r\n>\r\n> - Long transactions hold locks, grow WAL, and make batch jobs fragile at larger scale.\r\n>\r\n> - Parallel work in the app tier creates more places for partial-failure bugs and drift.\r\n>\r\n> - The workflow logic ends up spread across SQL, workers, queues, dashboards, and status tables.\r\n>\r\n> ## What changes in your architecture\r\n>\r\n> - The workflow definition moves into SQL and starts with df.start(...).\r\n>\r\n> - Retry state, progress tracking, and checkpointing move into Postgres instead of bespoke app code.\r\n>\r\n> - Some app-tier workers, queue consumers, or scheduler glue can disappear entirely.\r\n>\r\n> - Operational visibility comes from Postgres tables such as df.instances, using the same auth and backup model as your data.\r\n>\r\n> ## When not to use it\r\n>\r\n> - The job is already a single INSERT ... SELECT or one ordinary SQL statement.\r\n>\r\n> - You need sub-millisecond synchronous request handling rather than durable background execution.\r\n>\r\n> - You cannot install extensions or run a background worker in your Postgres environment.\r\n>\r\n> - The workflow mostly lives outside Postgres and spans many heterogeneous systems.\r\n>\r\n> - You need arbitrary application logic that does not map cleanly to SQL steps, branching, loops, or HTTP calls.\r\n>\r\n> ## How it works\r\n>\r\n> - Define a workflow in SQL using composable operators such as ~> and |=>.\r\n>\r\n> …","orgs":[],"tags":["databases","bot"],"usrs":[],"c_flags":0,"comments":0,"created_at":"2026-06-05T20:50:42.872787+00:00","created_by":"bot_reader","parent_cid":58150,"child_comments":[],"user_reactions":[],"reaction_counts":{}}]}]