DevOpsDays Philadelphia 2025
Not every team needs a full-blown internal developer platform—but every team needs something to reduce friction. In this talk, we’ll walk through three levels of self-service maturity: the IaC factory (where platform teams write most of the infrastructure), the template/module model (where platform teams publish reusable components), and the full IDP model (where app teams provision resources through UI or CLI with guardrails). We’ll also break down what actually goes into a platform—including foundational infrastructure, shared services, workloads, and developer experience—and how to right-size your approach without overbuilding. Whether you’re just starting out or trying to improve adoption, this talk will help you map where you are and where you should (or shouldn’t) go next.
Security boils down to trust. Trusting that the code will do what is expected and is free from vulnerabilities. Trusting that the entities interacting with our data and resources have the right to access those resources. Our current approach to both human and non-human access uses the same basic flawed pattern: long-lived credentials.
This approach to trusted access does not take into account who or what is requesting the resource. These secrets, which leak all too often, are an attacker's best friend: stolen credentials are how attackers get into, and move laterally through, your system.
What if, instead of simply asking for a security key or credential to grant access, our applications, workloads, and resources asked, "Who are you, and how can you prove that?" Humans can move toward leveraging our unchanging characteristics, like biometrics. But what about machines, especially in a world where pods and workloads live for only hours or days?
Attend this session to:
- Better communicate why we must do things differently, and soon
- Learn how the open-source software community has looked at addressing the identity problem
- Understand what commercial options are available
- Map a path away from the world of long-lived credentials
The future of identity and access management is the future of security, IT, and, ultimately, business resiliency.
Focus on intent: what people are trying to do with the system. Product analytics might give a broad sense of "what happened," but making sense of telemetry that points to "what went wrong" is key to improving the system. To the user, the specific issue doesn't matter; the software simply doesn't work!
Earlier this year I led TCG's technical team in a competitive real-time development challenge, vying for a $40 million contract with the Department of the Treasury. What began as a seemingly simple "one-day code challenge" rapidly devolved into a month-long race to prepare the Release One build needed just to begin the challenge day. Our final solution featured a full DevOps pipeline, Terraform deployment, multi-region failover Kubernetes infrastructure, and a comprehensive web application with AI image processing, all delivered under immense pressure on a one-month schedule with limited customer access.
This isn't theoretical; it's a raw, honest look at real-world challenges. We'll delve into the critical, sometimes painful lessons learned about DevOps principles and Agile anti-patterns that surfaced under fire. I believe in-person live coding and technical assessments will become increasingly common in contract competitions, especially as AI blurs the lines of expertise in written proposals.
An alert came in, waking you from a dream. What to do? Is it a vulnerability that needs immediate attention? Or just a flaky script? Alerts come in all sorts of frustrating shapes and sizes, but sadly not enough of them are worthy of your attention. There are many ways to solve this: how much AI do you want here? Lots! Great, we’ll do that. Let’s explore ways to make alerting more helpful, more useful, and more deserving of your precious time and attention.
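One low-tech way to make alerts "more deserving of your attention," even before any AI enters the picture, is simple deduplication plus a severity floor. The sketch below is illustrative only; the field names and thresholds are made-up assumptions, not from the talk.

```python
# Illustrative sketch: cut alert noise by dropping low-severity alerts and
# suppressing repeats of the same (source, message) pair. Thresholds are made up.
from collections import Counter

def triage(alerts, min_severity=3, max_repeats=1):
    """Keep the first occurrence of each (source, message); drop low severity."""
    seen = Counter()
    actionable = []
    for alert in alerts:
        key = (alert["source"], alert["message"])
        seen[key] += 1
        if alert["severity"] >= min_severity and seen[key] <= max_repeats:
            actionable.append(alert)
    return actionable

alerts = [
    {"source": "cron", "message": "flaky script", "severity": 1},
    {"source": "scanner", "message": "CVE found", "severity": 5},
    {"source": "scanner", "message": "CVE found", "severity": 5},  # duplicate
]
print(triage(alerts))  # only the first CVE alert survives
```

Even this crude filter turns three pages of pages into one actionable item; the talk explores how much further AI can take the same idea.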
AI agents are transforming the way we manage cloud infrastructure — bringing automation, context awareness, and natural language control to everyday DevOps and SRE tasks.
In this hands-on workshop, we’ll build an intelligent AI agent to interact with AWS services.
Attendees will learn how to architect and deploy an AI-driven assistant that can automate infrastructure tasks, interpret user instructions, and act with minimal human intervention. Whether you're an aspiring cloud engineer or an experienced practitioner, this session will empower you to integrate AI into your AWS workflow with confidence.
Requirements: Familiarity with Python and basic AWS services; an AWS account with Bedrock access.
Target Audience: Cloud engineers, AI/ML practitioners, early-career tech professionals interested in infrastructure automation.
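To give a feel for the agent pattern the workshop builds on, here is a heavily simplified sketch: a "model" (stubbed with keyword matching; in the workshop this role is played by an LLM such as one hosted on Bedrock) picks a tool from a registry, and the runtime executes it. The tool functions and their return values are invented placeholders, not real AWS calls.

```python
# Heavily simplified sketch of a tool-calling agent. The "model" below is a
# keyword-matching stub standing in for an LLM; the tools return canned data
# standing in for real AWS API calls. All names here are illustrative.
def list_instances():
    return ["i-0abc123", "i-0def456"]        # stand-in for an EC2 API call

def describe_costs():
    return {"month_to_date_usd": 1234.56}    # stand-in for a billing API call

TOOLS = {"list": list_instances, "costs": describe_costs}

def stub_model(instruction: str):
    """Pretend LLM: route the instruction to the first matching tool."""
    for keyword, tool in TOOLS.items():
        if keyword in instruction.lower():
            return tool
    return None

def agent(instruction: str):
    tool = stub_model(instruction)
    return tool() if tool else "No matching tool"

print(agent("please list the EC2 instances"))
```

The real system replaces `stub_model` with a model that can reason about ambiguous instructions and chain multiple tool calls, which is where the "minimal human intervention" in the abstract comes from.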
Scaling GitOps for large-scale deployments can be challenging with a single repository or controller. This talk explores sharding as a strategy to optimize performance, improve reliability, and manage complexity in GitOps workflows for multi-environment or multi-tenant setups.
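At its core, sharding means deterministically assigning each cluster or tenant to one of several controller replicas so no single controller reconciles everything. The sketch below shows the idea with a hash-based assignment; the cluster names and replica count are illustrative, and real controllers (e.g. Argo CD's application controller) have their own sharding algorithms.

```python
# Minimal sketch of shard assignment for GitOps controllers: hash each
# cluster name to a stable shard so work is spread across replicas.
# Cluster names and the replica count are illustrative assumptions.
import hashlib

def shard_for(cluster: str, replicas: int) -> int:
    """Deterministically map a cluster to one of `replicas` controller shards."""
    digest = hashlib.sha256(cluster.encode()).hexdigest()
    return int(digest, 16) % replicas

clusters = ["prod-us-east", "prod-eu-west", "staging", "dev"]
assignment = {c: shard_for(c, replicas=3) for c in clusters}
print(assignment)
```

Because the mapping is deterministic, every replica independently agrees on who owns what, with no coordination needed until the replica count changes.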
In agile environments, developers write and ship code quickly, postponing secret management for later. A developer might opt to temporarily hard-code the secrets and, upon merging the final version into the main branch, remove them in favor of more secure alternatives, such as retrieving the secret from a secrets manager. Regrettably, people err: those secrets are frequently overlooked, hidden within the code, missed during code review, and ultimately merged into the main branch. The most obvious place to start scanning for secrets is the code itself. Securing the code and automating the scan can eliminate this class of human error.
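The automated scan boils down to pattern matching over every line before it can merge. The sketch below shows the idea with two rules; production scanners such as gitleaks ship hundreds of rules plus entropy checks, so treat these patterns as a tiny illustrative subset.

```python
# Hedged sketch of a pre-merge secret scan. The two patterns below are an
# illustrative subset; real tools (e.g. gitleaks) have far richer rule sets.
import re

SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_api_key": re.compile(r"(?i)api[_-]?key\s*=\s*['\"][A-Za-z0-9]{16,}['\"]"),
}

def scan(text: str) -> list[tuple[str, int]]:
    """Return (rule_name, line_number) for each suspected secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((name, lineno))
    return findings

sample = 'db = connect()\napi_key = "abcd1234abcd1234abcd"\n'
print(scan(sample))
```

Wired into CI or a pre-commit hook, a scan like this fails the build before a hard-coded secret ever reaches the main branch.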
In the race to deliver software, bigger often feels better—but it comes at a cost. This talk champions Small Batch Delivery, a practice that streamlines development by shrinking the size of changes we ship. You’ll discover how small pull requests reduce risk, improve code quality, and keep teams in a state of flow. We'll dive into the ripple effects of bloated PRs, the psychology behind fast reviews, and why this isn't just a dev tactic—it's a cultural mindset shift. If you're ready to ship faster with less stress, it's time to think small.
DevOps has a notoriously steep learning curve. Getting started in the field can feel like being dropped in a foreign country without the ability to understand anything about the language.
A language is more than just the syntax and semantic rules of the words themselves. It also encompasses the shared culture of the speakers. With the proliferation of programming languages as well as the deeply held cultural beliefs of the community, it's easy to see that learning DevOps is like trying to learn a foreign language.
I will review five foundational hypotheses from the field of Second Language Acquisition and relate these hypotheses back to the world of DevOps. DevOps practitioners, trainers, tool builders, and learners should all come away with useful insights to apply to their practice.
I’ve learned that nothing humbles you faster than talking with children about things you think you know. As a Software Engineer, I was fairly confident that I knew what was important about writing code, and I was happy to share that knowledge while volunteering to teach a class of Kindergarteners. What I learned next was that I was very, very wrong. I showed up to that class with a plan; I left with a set of lessons: for becoming a better developer, for helping to grow better teams, and for being a better person in general. During this session I’ll discuss these lessons and what we can all take away from them to make our jobs (and the world!) a better place.
We all know investing in developer experience is a good call...but how do you really know if those investments are working? Traditional DevOps metrics? Sure, they help. But now AI is everywhere, promising to save the day. So how do you measure if AI is actually doing anything besides producing the internet's finest sh*tposts?
In this light-hearted talk, I’ll break down real ways to measure AI’s impact—beyond the memes. We’ll look at metrics for individual contributors, teams, and departments, exploring whether AI is a true game-changer or just another shiny buzzword.
Writing SQL slows everyone down. Non-technical users can’t, data teams won’t, and leadership waits. While commercial AI-powered tools promise a solution, most are pricey, opaque, and allergic to your reporting requirements. This Ignite talk presents an open-source Text-to-SQL chatbot that prioritizes transparency and user control. It combines advanced prompt engineering & guardrails to reduce hallucinations and ensure generation of reliable SQL queries. It uses an evaluation framework to assess performance by checking syntax accuracy, schema awareness and robustness to ambiguous user inputs. You’ll walk away knowing what works, what breaks, and why building your own AI assistant might just be your smartest move. Query load is not a career path. Offload it to the bot.
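The guardrail idea in the abstract is simple to sketch: before a model-generated query ever runs, reject anything that isn't a single read-only statement, then let the database's own parser check syntax and schema awareness. The example below uses SQLite's `EXPLAIN` for that check; the table, rules, and queries are illustrative assumptions, not the talk's actual implementation.

```python
# Minimal sketch of a Text-to-SQL guardrail: allow-list single SELECTs,
# then use SQLite's EXPLAIN to validate syntax and schema without executing.
# The table and example queries are illustrative assumptions.
import sqlite3

def is_safe_sql(query: str, conn: sqlite3.Connection) -> bool:
    """Reject anything but a single SELECT, then let SQLite check the syntax."""
    stripped = query.strip().rstrip(";")
    if ";" in stripped or not stripped.lower().startswith("select"):
        return False  # block multi-statement and non-read queries
    try:
        conn.execute(f"EXPLAIN {stripped}")  # parses and plans without running
        return True
    except sqlite3.Error:
        return False  # syntax error, or unknown table/column

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
print(is_safe_sql("SELECT id, total FROM orders WHERE total > 100", conn))  # True
print(is_safe_sql("DROP TABLE orders", conn))                               # False
```

A check like this is exactly what the evaluation framework measures: queries that hallucinate columns or tables fail at the `EXPLAIN` step instead of returning wrong answers to users.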
Back in 1976, the Makefile made it easy to compile C programs. In 2025, it’s used for automating just about everything, from shell scripts to build workflows to CI/CD pipelines.
Over those 49 years, a lot has obviously changed about the way we design and build software. However, the fundamentals (especially design patterns, algorithms, and architectures) have stayed more or less the same. We can learn a lot from the Makefile, from its design to how it has remained relevant in such a quickly evolving space. In this Ignite talk, Benjie will discuss the patterns that made the Makefile a mainstay in software development and deployment. He’ll also cover why good software is timeless.
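The pattern at the heart of make is worth spelling out: a dependency graph plus "rebuild only what is stale," decided by comparing timestamps. The sketch below restates that idea in Python with in-memory stand-ins for files and recipes; it is a conceptual illustration, not how make is implemented.

```python
# Sketch of the core Makefile idea: walk the dependency graph depth-first and
# rebuild a target only if a prerequisite is newer. Targets, timestamps, and
# recipes here are in-memory stand-ins for real files and shell commands.
def build(target, deps, mtimes, recipes, built=None):
    """Rebuild `target` if any dependency is newer, depth-first like make."""
    built = [] if built is None else built
    for dep in deps.get(target, []):
        build(dep, deps, mtimes, recipes, built)
    dep_times = [mtimes[d] for d in deps.get(target, [])]
    if target not in mtimes or any(t > mtimes[target] for t in dep_times):
        recipes[target]()                           # run the rule's recipe
        mtimes[target] = max(dep_times, default=0)  # mark target up to date
        built.append(target)
    return built

deps = {"app": ["main.o"], "main.o": ["main.c"]}
mtimes = {"main.c": 5, "main.o": 1, "app": 3}   # main.c edited after main.o
recipes = {"app": lambda: None, "main.o": lambda: None}
print(build("app", deps, mtimes, recipes))      # main.o is stale, so app rebuilds too
```

Those twenty lines capture why the tool endures: declare what depends on what, and the incremental rebuild falls out of the graph, a pattern that CI/CD pipelines still reuse today.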