<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Chander Inguva]]></title><description><![CDATA[Chander Inguva]]></description><link>https://blog.inguva.dev</link><generator>RSS for Node</generator><lastBuildDate>Thu, 09 Apr 2026 13:24:25 GMT</lastBuildDate><atom:link href="https://blog.inguva.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How I Built an AI-Powered IPL Fantasy Cricket League for My Friend Group in a Weekend]]></title><description><![CDATA[Every IPL season, our friend group has the same problem: someone creates a WhatsApp poll, half the group forgets to pick their team, and the whole thing dies by match two. This year I decided to fix that properly — a real web app, with an AI that sug...]]></description><link>https://blog.inguva.dev/how-i-built-an-ai-powered-ipl-fantasy-cricket-league-for-my-friend-group-in-a-weekend</link><guid isPermaLink="true">https://blog.inguva.dev/how-i-built-an-ai-powered-ipl-fantasy-cricket-league-for-my-friend-group-in-a-weekend</guid><category><![CDATA[AI]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[cricket]]></category><category><![CDATA[React]]></category><category><![CDATA[TypeScript]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Sat, 28 Mar 2026 15:17:49 GMT</pubDate><content:encoded><![CDATA[<p>Every IPL season, our friend group has the same problem: someone creates a WhatsApp poll, half the group forgets to pick their team, and the whole thing dies by match two. This year I decided to fix that properly — a real web app, with an AI that suggests your Best XI, live match scores, and a leaderboard. Here's how I built it in a weekend and what I learned along the way.</p>
<hr />
<h2 id="heading-the-product">The Product</h2>
<p><strong>cric.inguva.dev</strong> — an IPL 2026 fantasy cricket league built for a small private group.</p>
<p>Features:</p>
<ul>
<li>Register / login (email+password or Google Sign-In)</li>
<li>Pick your fantasy XI from the full IPL 2026 squad with budget and role constraints</li>
<li>AI-generated Best XI suggestion powered by Claude</li>
<li>Post-toss playing 11 entry — AI re-picks only from confirmed players</li>
<li>Live match scores</li>
<li>Leaderboard with team drill-down</li>
<li>Share your AI suggestion to iMessage / clipboard with one tap</li>
</ul>
<hr />
<h2 id="heading-the-stack">The Stack</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<th>Layer</th><th>Technology</th></tr>
</thead>
<tbody>
<tr>
<td>API</td><td><a target="_blank" href="https://hono.dev">Hono</a> on Cloudflare Workers</td></tr>
<tr>
<td>Database</td><td>Cloudflare D1 (SQLite at the edge)</td></tr>
<tr>
<td>Cache / KV</td><td>Cloudflare Workers KV</td></tr>
<tr>
<td>Frontend</td><td>React + Vite + TypeScript</td></tr>
<tr>
<td>Styling</td><td>Tailwind CSS</td></tr>
<tr>
<td>State</td><td>TanStack Query + Zustand</td></tr>
<tr>
<td>Auth</td><td>PBKDF2 password hashing + JWT, Google OAuth</td></tr>
<tr>
<td>AI</td><td>Anthropic Claude (claude-sonnet-4-6)</td></tr>
<tr>
<td>Deploy</td><td>Cloudflare Pages + Workers</td></tr>
</tbody>
</table>
</div><p>Everything runs on Cloudflare's free tier. No servers, no containers, no ops.</p>
<hr />
<h2 id="heading-architecture">Architecture</h2>
<pre><code>Browser (React SPA on Cloudflare Pages)
        │
        │  /api/*  (proxied in dev, custom domain in prod)
        ▼
Cloudflare Worker (Hono router)
        │
        ├── D1 (SQLite) — users, players, fantasy teams
        └── KV — AI suggestion cache, toss/playing 11 data
                │
                └── Anthropic API — Best XI generation
</code></pre><p>The frontend is a single-page React app deployed to Cloudflare Pages. The API is a Hono app running on a Cloudflare Worker. They talk over <code>cric-api.inguva.dev</code>. D1 holds all the relational data; KV is used purely as a cache and ephemeral store for the day's toss data.</p>
<hr />
<h2 id="heading-the-ai-best-xi-making-claude-a-fantasy-analyst">The AI Best XI — Making Claude a Fantasy Analyst</h2>
<p>This was the most interesting part of the build.</p>
<p>The prompt gives Claude the full eligible player list with IDs, roles, credits, overseas status, and historical points. It also includes today's match schedule (fetched live from the IPL stats feed), pitch/venue context, the confirmed toss result if available, and the full scoring system breakdown.</p>
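<p>Assembling that context might look like the following sketch. The <code>PromptContext</code> shape and section names are illustrative, not the actual prompt:</p>
<pre><code class="lang-typescript">// Illustrative prompt assembly: each context section listed above,
// concatenated into a single user message for Claude.
interface PromptContext {
  players: string;   // eligible players with IDs, roles, credits, points
  schedule: string;  // today's fixtures from the stats feed
  venue: string;     // pitch/venue context
  toss?: string;     // confirmed toss result, when available
  scoring: string;   // full scoring-system breakdown
}

function buildBestXiPrompt(ctx: PromptContext): string {
  const sections = [
    `## Eligible players\n${ctx.players}`,
    `## Today's matches\n${ctx.schedule}`,
    `## Venue\n${ctx.venue}`,
    ctx.toss ? `## Toss\n${ctx.toss}` : '',
    `## Scoring system\n${ctx.scoring}`,
    'Return ONLY a JSON object in the agreed shape.',
  ];
  return sections.filter(Boolean).join('\n\n');
}
</code></pre>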
<p>Claude returns a JSON object with:</p>
<ul>
<li>11 player IDs</li>
<li>Captain and vice-captain</li>
<li>Role counts, total credits, overseas count</li>
<li>Pitch analysis, strategy, captain reasoning, VC reasoning</li>
<li>2-3 differential picks with reasons</li>
</ul>
<pre><code class="lang-json">{
  <span class="hljs-attr">"players"</span>: [<span class="hljs-number">455</span>, <span class="hljs-number">461</span>, <span class="hljs-number">472</span>],
  <span class="hljs-attr">"captain_id"</span>: <span class="hljs-number">461</span>,
  <span class="hljs-attr">"vice_captain_id"</span>: <span class="hljs-number">472</span>,
  <span class="hljs-attr">"total_credits"</span>: <span class="hljs-number">98.5</span>,
  <span class="hljs-attr">"overseas_count"</span>: <span class="hljs-number">4</span>,
  <span class="hljs-attr">"role_counts"</span>: {<span class="hljs-attr">"WK"</span>: <span class="hljs-number">1</span>, <span class="hljs-attr">"BAT"</span>: <span class="hljs-number">4</span>, <span class="hljs-attr">"AR"</span>: <span class="hljs-number">2</span>, <span class="hljs-attr">"BOWL"</span>: <span class="hljs-number">4</span>},
  <span class="hljs-attr">"pitch_analysis"</span>: <span class="hljs-string">"Wankhede is a batting paradise..."</span>,
  <span class="hljs-attr">"strategy"</span>: <span class="hljs-string">"Load up on MI batters..."</span>,
  <span class="hljs-attr">"captain_reasoning"</span>: <span class="hljs-string">"Rohit Sharma opens at Wankhede..."</span>,
  <span class="hljs-attr">"differential_picks"</span>: [{<span class="hljs-attr">"id"</span>: <span class="hljs-number">502</span>, <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Tilak Varma"</span>, <span class="hljs-attr">"reason"</span>: <span class="hljs-string">"..."</span>}]
}
</code></pre>
<h3 id="heading-the-hard-part-claude-doesnt-always-follow-the-rules">The hard part: Claude doesn't always follow the rules</h3>
<p>Fantasy cricket has strict constraints: exactly 11 players, ≤100 credits, max 4 overseas, minimum role counts. Claude occasionally violates one of these — usually the budget (it picks too many premium players) or overseas count.</p>
<p>My fix was a two-layer validation system on the server:</p>
<p><strong>Layer 1 — Retry with correction prompt.</strong> If violations are found, I send Claude the original conversation plus a correction message listing exactly which constraints were broken. This fixes structural violations (wrong role counts, wrong player count) almost every time.</p>
<p><strong>Layer 2 — Algorithmic budget fix.</strong> If the team is still over 100 credits after the retry, I run a greedy swap: find the most expensive player whose role has surplus players, swap them with the cheapest available alternative of the same role. Repeat until within budget.</p>
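<p>Sketched out, the swap loop looks roughly like this (the <code>SquadPlayer</code> shape and helper names are illustrative, not the production code):</p>
<pre><code class="lang-typescript">interface SquadPlayer { id: number; role: string; credits: number; }

function totalCredits(team: SquadPlayer[]): number {
  return team.reduce((sum, p) =&gt; sum + p.credits, 0);
}

// Greedy repair: while over budget, swap the most expensive player for the
// cheapest same-role alternative not already in the team. Role counts are
// untouched because every swap stays within the same role.
function fitBudget(team: SquadPlayer[], pool: SquadPlayer[], budget: number): SquadPlayer[] {
  const result = [...team];
  while (totalCredits(result) &gt; budget) {
    const byCost = [...result].sort((a, b) =&gt; b.credits - a.credits);
    let swapped = false;
    for (const expensive of byCost) {
      const alt = pool
        .filter((p) =&gt; p.role === expensive.role
          &amp;&amp; p.credits &lt; expensive.credits
          &amp;&amp; !result.some((r) =&gt; r.id === p.id))
        .sort((a, b) =&gt; a.credits - b.credits)[0];
      if (alt) {
        result[result.indexOf(expensive)] = alt;
        swapped = true;
        break;
      }
    }
    if (!swapped) break; // no legal swap left; give up
  }
  return result;
}
</code></pre>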
<p>I never cache a violating suggestion, and I bump the KV cache key whenever the validation logic changes — otherwise stale suggestions stick around.</p>
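<p>"Bumping" a KV key is cheap if the version lives inside the key itself; a minimal sketch (names are illustrative):</p>
<pre><code class="lang-typescript">// Any change to validation rules gets a version bump, which orphans every
// previously cached suggestion without an explicit purge.
const SUGGESTION_CACHE_VERSION = 3; // bump when validation logic changes

function suggestionCacheKey(matchId: string, userId: string): string {
  return `best11:v${SUGGESTION_CACHE_VERSION}:${matchId}:${userId}`;
}
</code></pre>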
<h3 id="heading-post-toss-only-pick-from-confirmed-players">Post-toss: only pick from confirmed players</h3>
<p>The real-world use case is: toss happens, playing 11 is announced, <em>then</em> you finalise your fantasy team. After someone enters the playing 11 (two text areas, one name per line), the AI should only consider those 22 players.</p>
<p>The tricky part was name matching. The IPL feed uses formats like "V Kohli" or "Virat Kohli" interchangeably. My fuzzy matcher:</p>
<pre><code class="lang-typescript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">isInPlaying11</span>(<span class="hljs-params">dbName: <span class="hljs-built_in">string</span>, playing11Names: <span class="hljs-built_in">string</span>[]</span>): <span class="hljs-title">boolean</span> </span>{
  <span class="hljs-keyword">const</span> norm = normName(dbName);
  <span class="hljs-keyword">const</span> normParts = norm.split(<span class="hljs-regexp">/\s+/</span>);
  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> n <span class="hljs-keyword">of</span> playing11Names) {
    <span class="hljs-keyword">const</span> normN = normName(n);
    <span class="hljs-keyword">if</span> (normN === norm) <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
    <span class="hljs-keyword">const</span> shorter = normN.length &lt;= norm.length ? normN : norm;
    <span class="hljs-keyword">const</span> longer  = normN.length &lt;= norm.length ? norm  : normN;
    <span class="hljs-keyword">if</span> (shorter.length &gt;= <span class="hljs-number">5</span> &amp;&amp; longer.includes(shorter)) <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
    <span class="hljs-keyword">const</span> lastName = normParts[normParts.length - <span class="hljs-number">1</span>];
    <span class="hljs-keyword">if</span> (lastName.length &gt; <span class="hljs-number">4</span> &amp;&amp; normN.includes(lastName)) <span class="hljs-keyword">return</span> <span class="hljs-literal">true</span>;
  }
  <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;
}
</code></pre>
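<p>The matcher leans on a <code>normName</code> helper that isn't shown above. A plausible version (an assumption, not the actual implementation) lowercases, strips diacritics and punctuation, and collapses whitespace:</p>
<pre><code class="lang-typescript">// Hypothetical normalizer: "V. Kohli" and "  Virat   KOHLI " both reduce
// to forms the substring checks above can compare.
function normName(name: string): string {
  return name
    .normalize('NFD')                  // split accented chars into base + mark
    .replace(/[\u0300-\u036f]/g, '')   // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z\s]/g, ' ')         // dots after initials etc. become spaces
    .replace(/\s+/g, ' ')              // collapse runs of whitespace
    .trim();
}
</code></pre>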
<h3 id="heading-the-race-condition-i-didnt-see-coming">The race condition I didn't see coming</h3>
<p>Cloudflare KV is eventually consistent. When the playing 11 is saved, I delete the AI suggestion cache key. But "delete" doesn't propagate globally in under a millisecond. If the frontend immediately fires a <code>GET /suggest/best11</code> (without <code>?refresh=1</code>), the Worker might read the old cached value before the delete has propagated — and serve the stale suggestion that includes players not in the playing 11.</p>
<p>The fix: after posting the playing 11, the frontend directly calls <code>GET /suggest/best11?refresh=1</code>, which skips the KV read entirely and forces a fresh generation. It then uses <code>setQueryData</code> to inject the result into TanStack Query's cache, avoiding a second fetch.</p>
<pre><code class="lang-typescript">onSuccess: <span class="hljs-keyword">async</span> () =&gt; {
  queryClient.invalidateQueries({ queryKey: [<span class="hljs-string">'toss-status'</span>] });
  <span class="hljs-keyword">const</span> fresh = <span class="hljs-keyword">await</span> api.suggest.best11(<span class="hljs-literal">true</span>); <span class="hljs-comment">// ?refresh=1 bypasses KV</span>
  queryClient.setQueryData([<span class="hljs-string">'ai-suggestion'</span>], fresh);
},
</code></pre>
<hr />
<h2 id="heading-hono-sub-router-gotcha">Hono Sub-Router Gotcha</h2>
<p>Hono v4 has a subtle behavior with sub-router root paths. If you do:</p>
<pre><code class="lang-typescript">app.route(<span class="hljs-string">'/api/teams'</span>, teamsRouter);
teamsRouter.post(<span class="hljs-string">'/'</span>, handler); <span class="hljs-comment">// does NOT match POST /api/teams in production</span>
</code></pre>
<p>The root path of a mounted sub-router doesn't match in Cloudflare Workers production (it works fine in local dev, which makes it extra confusing). The fix is to give every route a non-empty path:</p>
<pre><code class="lang-typescript">teamsRouter.post(<span class="hljs-string">'/create'</span>, handler); <span class="hljs-comment">// works</span>
</code></pre>
<p>This cost me about an hour of debugging a "not found" error that only appeared in production.</p>
<hr />
<h2 id="heading-player-credits-the-calibration-problem">Player Credits: The Calibration Problem</h2>
<p>I initially set player credits on a 7–13 scale because I thought bigger numbers looked more meaningful. Bad idea. The real fantasy.iplt20.com uses a 7–10.5 scale, which means the 100-credit budget is genuinely tight — you have to make real trade-offs between premium players and value picks. With a 13-credit ceiling you can pack in premium players and the budget constraint becomes trivial.</p>
<p>I re-seeded the entire database with accurate credits. One gotcha: Cloudflare D1 (SQLite) auto-increment IDs never reset on <code>DELETE</code> — they continue from the last highest ID. So re-seeding bumps all player IDs. I bumped the KV cache key to invalidate all stale AI suggestions.</p>
<hr />
<h2 id="heading-the-share-button">The Share Button</h2>
<p>One of the most-used features turned out to be the simplest: a Share button that formats the AI suggestion as text and sends it to the iMessage group.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">handleShare</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> text = buildShareText();
  <span class="hljs-keyword">if</span> (navigator.share) {
    <span class="hljs-keyword">await</span> navigator.share({ title: <span class="hljs-string">'AI Best XI'</span>, text });
    <span class="hljs-keyword">return</span>;
  }
  <span class="hljs-keyword">await</span> navigator.clipboard.writeText(text);
  setCopied(<span class="hljs-literal">true</span>);
  <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> setCopied(<span class="hljs-literal">false</span>), <span class="hljs-number">2500</span>);
}
</code></pre>
<p><code>navigator.share()</code> triggers the native iOS share sheet (perfect for iMessage). Desktop falls back to clipboard copy with a 2.5-second "Copied!" confirmation. A handful of lines of product code, but it's the feature people use most.</p>
<hr />
<h2 id="heading-whats-next">What's Next</h2>
<ul>
<li>Score entry UI (admin updates player points after each match)</li>
<li>Auto-scoring via IPL stats feed</li>
<li>Transfer window — limited swaps after the tournament starts</li>
<li>Head-to-head mini-leagues</li>
</ul>
<hr />
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>The whole thing took a weekend. Cloudflare's stack (Workers + D1 + KV + Pages) is genuinely excellent for this kind of project — you get a globally distributed backend with zero cold starts, a relational database, a cache, and static hosting, all on a free tier with a single <code>wrangler deploy</code>.</p>
<p>The AI integration was the most fun to build and the most work to get right. Claude is good at fantasy cricket strategy but needs guardrails — the constraint validation + algorithmic fallback pattern is something I'd reuse in any domain where an LLM needs to output structured data that satisfies hard rules.</p>
<p>If you're building a small internal tool for a friend group, skip the traditional backend infra and go straight to Workers + D1. You'll spend your time on product, not ops.</p>
<hr />
<p><em>Built with Cloudflare Workers, Hono, React, and Claude. Live at <a target="_blank" href="https://cric.inguva.dev">cric.inguva.dev</a>.</em></p>
]]></content:encoded></item><item><title><![CDATA[Building a Description Templates App for Jira with Atlassian Forge]]></title><description><![CDATA[If your team creates a lot of Jira issues, you've probably noticed that the description field is almost always blank. People fill it in differently every time or not at all. This post covers how I bui]]></description><link>https://blog.inguva.dev/building-a-description-templates-app-for-jira-with-atlassian-forge</link><guid isPermaLink="true">https://blog.inguva.dev/building-a-description-templates-app-for-jira-with-atlassian-forge</guid><category><![CDATA[atlassian-forge]]></category><category><![CDATA[JIRA]]></category><category><![CDATA[atlassian]]></category><category><![CDATA[Productivity]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Thu, 19 Mar 2026 23:32:22 GMT</pubDate><content:encoded><![CDATA[<p>If your team creates a lot of Jira issues, you've probably noticed that the description field is almost always blank. People fill it in differently every time or not at all. This post covers how I built a Forge app that pre-fills the description field in Jira's create dialog based on the issue type, so teams always start from a consistent template.</p>
<h2>What it does</h2>
<ul>
<li><p>Project admins configure rich text templates per issue type in Project Settings</p>
</li>
<li><p>When anyone opens the "Create issue" dialog for a configured issue type, the description is automatically pre-filled with the template</p>
</li>
<li><p>Users can freely edit it before submitting; it's just a starting point</p>
</li>
</ul>
<p>The app appears in the project sidebar under <strong>Apps → Description Templates</strong>.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/1a0dfd2a-28d7-492d-813b-8f4a787521da.png" alt="" style="display:block;margin:0 auto" />

<hr />
<h2>Tech stack</h2>
<ul>
<li><p><strong>Atlassian Forge</strong> :: serverless platform for building Jira/Confluence apps</p>
</li>
<li><p><strong>Forge UI Kit 2</strong> :: React-based component library (<code>@forge/react</code>)</p>
</li>
<li><p><strong>Jira UI Modifications API</strong> :: the mechanism that injects content into the create dialog</p>
</li>
<li><p><strong>Forge Storage</strong> :: key-value store for persisting templates</p>
</li>
</ul>
<hr />
<h2>The UI</h2>
<h3>Empty state</h3>
<p>When no templates are configured, the page shows a clear empty state with an "Add template" button in the top right: one clear call to action, no clutter.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/000ac4a8-92a8-4a4c-a58f-ec410a2ab841.png" alt="" style="display:block;margin:0 auto" />

<h3>Adding a template</h3>
<p>Clicking "Add template" opens the add view. You pick a work type from a dropdown (only unconfigured types appear),</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/bc7b5d7b-52f8-4f38-a891-a64b282b1574.png" alt="" style="display:block;margin:0 auto" />

<p>then write the template in a full rich text editor, the same <code>CommentEditor</code> component Jira uses natively. You get headings, lists, code blocks, links, colors, and more.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/58cbf961-a4c9-419e-9d37-987a27fe81c0.png" alt="" style="display:block;margin:0 auto" />

<h3>List view with Edit and Delete</h3>
<p>Once saved, the template appears in the list with <strong>Edit</strong> and <strong>Delete</strong> actions. A success banner confirms the save. Each configured work type gets its own row.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/19ace3c6-217b-4a00-a771-fae23b71d367.png" alt="" style="display:block;margin:0 auto" />

<p>The "Add template" button stays visible for any remaining unconfigured types.</p>
<h3>Editing an existing template</h3>
<p>Clicking Edit takes you straight to the editor pre-filled with the existing template. No need to re-select the work type.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/51f31a7d-0fff-4dfd-a59c-b4f69c6f9b7a.png" alt="" style="display:block;margin:0 auto" />

<p>You can also toggle to a <strong>Preview</strong> mode to see how the template will render before saving.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/47c64c31-ca9f-4fcc-88d5-6ee6ff6e2fe6.png" alt="" style="display:block;margin:0 auto" />

<hr />
<h2>The payoff - create dialog pre-fill</h2>
<p>When a user opens the create dialog for a configured issue type, the description field is already filled in with the template. They just fill in the blanks. Zero extra clicks.</p>
<img src="https://cdn.hashnode.com/uploads/covers/69b71990f4eb2f8b04ec0ee4/a56b3ad6-4092-4f6a-bf7b-134f42cb42d6.png" alt="" style="display:block;margin:0 auto" />

<hr />
<h2>Architecture</h2>
<p>The app has three parts:</p>
<h3>1. Settings page (<code>jira:projectSettingsPage</code>)</h3>
<p>A React UI (UI Kit 2) where admins pick a work type and write a template using <code>CommentEditor</code>. Templates are saved to Forge Storage and a UI Modification is registered via the Jira REST API.</p>
<h3>2. Resolver functions</h3>
<p>Serverless functions that handle:</p>
<ul>
<li><p><code>getIssueTypesWithTemplates</code> : fetches issue types for the project and merges in saved templates</p>
</li>
<li><p><code>saveTemplate</code> : persists the ADF to storage and registers/updates/deletes the UI Modification</p>
</li>
</ul>
<h3>3. UIM script (<code>jira:uiModifications</code>)</h3>
<p>A lightweight browser bundle that runs when the create dialog opens. It reads the ADF from the registered UI Modification and calls <code>api.getFieldById('description').setValue(adf)</code>.</p>
<hr />
<h2>Key lessons learned</h2>
<h3>1. Always call <code>ForgeReconciler.render()</code></h3>
<p>UI Kit 2 apps show a skeleton forever if you forget this at the bottom of your entry file:</p>
<pre><code class="language-js">import ForgeReconciler from '@forge/react';

ForgeReconciler.render(&lt;App /&gt;);
</code></pre>
<p>It's easy to miss when starting from scratch.</p>
<h3>2. Use <code>asUser()</code> for project reads, <code>asApp()</code> for UI Modification CRUD</h3>
<p>The Jira UI Modifications API requires app-level credentials; <code>asUser()</code> returns a 403. But reading project data works better with <code>asUser()</code> since it uses the logged-in user's permissions.</p>
<pre><code class="language-js">// Fetch issue types - use asUser()
const res = await asUser().requestJira(route`/rest/api/3/project/${projectId}`, {
  headers: { Accept: 'application/json' },
});

// Register UI Modification - use asApp()
const postRes = await asApp().requestJira(route`/rest/api/3/uiModifications`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', Accept: 'application/json' },
  body: JSON.stringify(payload),
});
</code></pre>
<h3>3. Always use the <code>route</code> tagged template literal</h3>
<pre><code class="language-js">// ❌ Wrong - throws "You must create your route using the 'route' export"
await requestJira(`/rest/api/3/project/${id}`);

// ✅ Correct
import { route } from '@forge/api';
await requestJira(route`/rest/api/3/project/${id}`);
</code></pre>
<h3>4. <code>viewType</code> must be <code>GIC</code>, not <code>CREATE_ISSUE</code></h3>
<p>The correct viewType for the create dialog is <code>GIC</code> (Global Issue Create). Using <code>CREATE_ISSUE</code> returns a 400 Bad Request with a confusing error message.</p>
<pre><code class="language-js">contexts: [{ projectId, issueTypeId, viewType: 'GIC' }]
</code></pre>
<p>And in <code>manifest.yml</code>:</p>
<pre><code class="language-yaml">jira:uiModifications:
  - key: description-template-uim
    resource: uim-resource
    viewType:
      - GIC
</code></pre>
<p>Without <code>viewType</code> in the manifest, Jira refuses to load the UIM script and shows: <em>"We couldn't load some of the UI modifications apps for this page, because they don't have required scopes."</em> The error is misleading: it actually means the module isn't configured correctly.</p>
<h3>5. Don't mix classic and granular scopes</h3>
<p>Mixing them causes UIM scripts to silently fail to load. Stick to classic scopes only:</p>
<pre><code class="language-yaml">permissions:
  scopes:
    - read:jira-user
    - read:jira-work
    - write:jira-work
    - manage:jira-configuration
    - storage:app
</code></pre>
<h3>6. The UIM <code>onInit</code> callback must be synchronous</h3>
<p><code>uiModificationsApi.onInit</code> doesn't await promises. If you pass an async function, <code>invoke</code> calls will never resolve and the field won't be set. Keep it synchronous and read the ADF directly from <code>uiModifications[0].data</code>:</p>
<pre><code class="language-js">import { uiModificationsApi } from '@forge/jira-bridge';

uiModificationsApi.onInit(
  ({ api, uiModifications }) =&gt; {
    if (!uiModifications?.length) return;
    const rawData = uiModifications[0].data;
    if (!rawData) return;
    let adf;
    try { adf = JSON.parse(rawData); } catch { return; }
    api.getFieldById('description')?.setValue(adf);
  },
  () =&gt; ['description']
);
</code></pre>
<h3>7. For team-managed projects, use the project endpoint for issue types</h3>
<p>The global <code>/rest/api/3/issuetype</code> endpoint returns an empty array for team-managed (next-gen) projects. Fetch issue types from the project endpoint instead:</p>
<pre><code class="language-js">const res = await asUser().requestJira(route`/rest/api/3/project/${projectId}`);
const body = await res.json();
const issueTypes = body.issueTypes.filter((it) =&gt; !it.subtask);
</code></pre>
<hr />
<h2>Wrapping up</h2>
<p>The combination of Forge Storage + Jira UI Modifications is a powerful pattern for contextual defaults in Jira. The main gotchas are around scopes, the <code>viewType</code> value, and keeping the UIM script synchronous. Once those are sorted, the result is seamless: users get a pre-filled description the moment they open the create dialog, with no extra clicks required.</p>
]]></content:encoded></item><item><title><![CDATA[How I Built Kernel: An AI-Powered IT Helpdesk That Deflects 80% of Support Tickets]]></title><description><![CDATA[A story of LangGraph, Claude AI, Okta, Slack, and the chaos of deploying to GKE without a CI/CD pipeline.

The Problem That Started It All
It was another Monday morning, and my Slack was already drown]]></description><link>https://blog.inguva.dev/how-i-built-kernel-an-ai-powered-it-helpdesk-that-deflects-80-of-support-tickets</link><guid isPermaLink="true">https://blog.inguva.dev/how-i-built-kernel-an-ai-powered-it-helpdesk-that-deflects-80-of-support-tickets</guid><category><![CDATA[AI]]></category><category><![CDATA[Python]]></category><category><![CDATA[FastAPI]]></category><category><![CDATA[langgraph]]></category><category><![CDATA[okta]]></category><category><![CDATA[slack]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[GCP]]></category><category><![CDATA[Devops]]></category><category><![CDATA[IT Automation]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Thu, 19 Mar 2026 02:47:36 GMT</pubDate><content:encoded><![CDATA[<p><em>A story of LangGraph, Claude AI, Okta, Slack, and the chaos of deploying to GKE without a CI/CD pipeline.</em></p>
<hr />
<h2>The Problem That Started It All</h2>
<p>It was another Monday morning, and my Slack was already drowning.</p>
<blockquote>
<p>"Hey, can someone add me to the GitHub security team?"</p>
<p>"I forgot my VPN password again."</p>
<p>"Who do I ask for access to Salesforce?"</p>
<p>"Is the dev environment down or just slow?"</p>
</blockquote>
<p>Same questions. Different people. Every single week.</p>
<p>Our IT team was spending more time copy-pasting the same Confluence links and filing the same Jira tickets than actually solving hard problems. We weren't understaffed; we were just inefficient. And I had a hypothesis: <strong>most of these requests follow a pattern</strong>. If they follow a pattern, they can be automated.</p>
<p>So I built <strong>Kernel</strong>, an AI-powered IT deflection system that lives in Slack, understands what employees need, and either resolves it automatically or escalates it gracefully to Jira Service Management.</p>
<p>This is the story of how it works, what I learned, and why building it almost broke me (in the best possible way).</p>
<hr />
<h2>What Kernel Does (The 60-Second Version)</h2>
<ol>
<li><p>An employee asks something in Slack: "Can I get access to the Lucid App in Okta?"</p>
</li>
<li><p>Kernel intercepts it, classifies the intent, and checks if there's a published playbook or KB article.</p>
</li>
<li><p>If it's an access request, Kernel looks up the right Okta group, finds the approver, and sends them a DM with approve/reject buttons.</p>
</li>
<li><p>If approved, it automatically provisions the access via the Okta API.</p>
</li>
<li><p>If it's a how-to question, it retrieves the most relevant Confluence docs using semantic search and replies with a formatted answer.</p>
</li>
<li><p>If it's something it can't handle, it creates a Jira ticket and keeps the user informed.</p>
</li>
</ol>
<p>The result: <strong>~80% of routine IT requests resolved without human involvement</strong>.</p>
<hr />
<h2>The Architecture: Standing on Many Shoulders</h2>
<p>Before I dive into the code, here's the tech stack at a glance:</p>
<table>
<thead>
<tr>
<th>Layer</th>
<th>Technology</th>
</tr>
</thead>
<tbody><tr>
<td><strong>AI Orchestration</strong></td>
<td>LangGraph (stateful agent graph)</td>
</tr>
<tr>
<td><strong>Language Model</strong></td>
<td>Claude Sonnet via Anthropic API / Vertex AI</td>
</tr>
<tr>
<td><strong>Backend</strong></td>
<td>FastAPI (Python 3.11, async throughout)</td>
</tr>
<tr>
<td><strong>Database</strong></td>
<td>PostgreSQL 16 + pgvector (for RAG embeddings)</td>
</tr>
<tr>
<td><strong>Cache / Broker</strong></td>
<td>Redis 7</td>
</tr>
<tr>
<td><strong>Async Tasks</strong></td>
<td>Celery (3 queues: critical, default, low)</td>
</tr>
<tr>
<td><strong>Identity</strong></td>
<td>Okta (SSO, user/group API, SCIM provisioning, OIDC)</td>
</tr>
<tr>
<td><strong>Messaging</strong></td>
<td>Slack Bolt for Python</td>
</tr>
<tr>
<td><strong>Ticketing</strong></td>
<td>Jira Service Management REST API</td>
</tr>
<tr>
<td><strong>KB</strong></td>
<td>Confluence (with incremental sync via CQL)</td>
</tr>
<tr>
<td><strong>Infrastructure</strong></td>
<td>GKE (Google Kubernetes Engine), Cloud SQL, Memorystore, Secret Manager</td>
</tr>
</tbody></table>
<p>It sounds like a lot, because it is. But each piece has a very clear job.</p>
<hr />
<h2>The Brain: A LangGraph Agent</h2>
<p>The heart of Kernel is a <strong>LangGraph state machine</strong> — not a simple LLM call, but a directed graph of nodes that each do one thing well.</p>
<p>Here's how the graph flows:</p>
<pre><code class="language-plaintext">User Message
     │
     ▼
[intent_classifier]
     │
     ├─── unclear ──► [clarification_asker] ──► END
     │
     ▼
[playbook_matcher]
     │
     ├─── match ──► [playbook_executor] ──► END
     │
     ▼
[rag_retriever]
     │
     ├─── access_request ──► [okta_checker] ──► [response_composer] ──► END
     │
     ├─── KB hit ──────────────────────────► [response_composer] ──► END
     │
     └─── KB miss ──► [jira_escalator] ──► [response_composer] ──► END
</code></pre>
<p>Why LangGraph? Because I needed <strong>stateful, branching logic</strong> — not a flat chain of prompts. When a user asks for access, I need to:</p>
<ol>
<li><p>Identify which system they want access to</p>
</li>
<li><p>Find the right Okta group (with fuzzy matching and semantic ranking)</p>
</li>
<li><p>Check if they already have access</p>
</li>
<li><p>Determine who the approver is</p>
</li>
<li><p>Compose a different response depending on all of the above</p>
</li>
</ol>
<p>A simple LLM call can't do that reliably. A graph can.</p>
<p>The state object that flows through the graph has <strong>over 60 fields</strong>: everything from the original Slack message to the matched Okta group, confidence scores, playbook outputs, and the final Block Kit response.</p>
<hr />
<h2>The Intent Classifier: Where It All Starts</h2>
<p>Every message starts with classification. I use Claude to categorize the request into one of five intents:</p>
<ul>
<li><p><code>access_request</code>: "Can I get access to X?"</p>
</li>
<li><p><code>how_to</code>: "How do I configure Y?"</p>
</li>
<li><p><code>incident</code>: "Z is broken / down"</p>
</li>
<li><p><code>password_reset</code>: "I can't log into W"</p>
</li>
<li><p><code>other</code>: "I need to talk to someone"</p>
</li>
</ul>
<p>The confidence thresholds are configurable per intent (and tuned from real data):</p>
<pre><code class="language-python">THRESHOLD_ACCESS_REQUEST = 0.30  # Low threshold — better to try than miss
THRESHOLD_PASSWORD_RESET = 0.85  # High threshold — wrong action causes user pain
THRESHOLD_INCIDENT = 0.75
THRESHOLD_HOW_TO = 0.70
</code></pre>
<p>The low threshold for access requests was intentional. If someone says "I need to get into the finance Jira project", that's almost certainly an access request even if it's phrased ambiguously. Better to engage the access flow than ignore it.</p>
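<p>The routing that consumes these thresholds can be sketched as a simple gate (threshold values taken from above; the function and fallback names are illustrative):</p>

```python
# Per-intent confidence gates: the classifier's (intent, confidence)
# pair only triggers a flow if it clears the configured bar.
THRESHOLDS = {
    "access_request": 0.30,
    "password_reset": 0.85,
    "incident": 0.75,
    "how_to": 0.70,
}

def route(intent: str, confidence: float) -> str:
    threshold = THRESHOLDS.get(intent)
    if threshold is None or confidence < threshold:
        return "clarification_asker"  # fall back to asking the user
    return intent                     # engage the matching flow
```
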
<hr />
<h2>RAG: Teaching Kernel to Know What the IT Team Knows</h2>
<p>For <code>how_to</code> requests, Kernel retrieves answers from our internal knowledge base using <strong>Retrieval-Augmented Generation (RAG)</strong> with pgvector.</p>
<p>The pipeline:</p>
<ol>
<li><p><strong>Ingestion</strong>: A background job pulls pages from Confluence (via CQL polling — no webhook admin access needed), chunks them, and generates embeddings using a sentence-transformer model.</p>
</li>
<li><p><strong>Retrieval</strong>: At query time, the user's message is embedded and compared against the KB using cosine similarity (<code>pgvector</code> operator <code>&lt;=&gt;</code>) to find the top-K most relevant chunks.</p>
</li>
<li><p><strong>Generation</strong>: Claude synthesizes those chunks into a clear, formatted answer with links to source pages.</p>
</li>
</ol>
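<p>The retrieval step boils down to cosine similarity over embeddings. A dependency-free sketch of the ranking (in production this is a single SQL query against pgvector, not a Python loop):</p>

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 2) -> list[str]:
    # pgvector's <=> operator returns cosine *distance*; ranking by
    # similarity descending is equivalent to ranking by distance ascending.
    ranked = sorted(chunks, key=lambda c: cosine(query, chunks[c]), reverse=True)
    return ranked[:k]
```
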
<p>The incremental sync is particularly clever: instead of re-indexing everything on a schedule, it uses CQL's <code>lastModified</code> filter to pull only the pages changed since the last run:</p>
<pre><code class="language-python">cql = f"space in ({space_list}) AND lastModified &gt;= '{since_str}' ORDER BY lastModified ASC"
</code></pre>
<p>This keeps the index fresh without hammering the Confluence API.</p>
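<p>A sketch of how that incremental query might be assembled (the 5-minute overlap window and the date format are my assumptions, not Kernel's exact code):</p>

```python
from datetime import datetime, timedelta, timezone

def build_incremental_cql(spaces: list[str], last_run: datetime) -> str:
    # A small overlap guards against clock skew between our job and Confluence;
    # re-ingesting a page twice is harmless, missing one is not.
    since = last_run - timedelta(minutes=5)
    since_str = since.strftime("%Y-%m-%d %H:%M")
    space_list = ", ".join(spaces)
    return (
        f"space in ({space_list}) AND lastModified >= '{since_str}' "
        "ORDER BY lastModified ASC"
    )
```
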
<hr />
<h2>The Okta Problem: Matching Groups at Scale</h2>
<p>Here's the part that surprised me most: <strong>resolving which Okta group a user actually wants</strong>.</p>
<p>When someone says "Can I get access to the data engineering Slack channel?", they don't say "okta-group-data-eng-slack-notifications-prod". They say "data engineering Slack channel."</p>
<p>I built a multi-signal matching pipeline:</p>
<ol>
<li><p><strong>Alias matching</strong> — each Okta group has an <code>AKA</code> custom attribute (e.g. "de-slack", "data engineering", "data-eng")</p>
</li>
<li><p><strong>Fuzzy string matching</strong> — Levenshtein distance for typos</p>
</li>
<li><p><strong>Semantic ranking</strong> — embedding similarity between the request and group descriptions</p>
</li>
<li><p><strong>Claude reranking</strong> — final pass using the LLM with full context</p>
</li>
</ol>
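<p>A toy version of the first two signals, blending alias hits with fuzzy name similarity via the standard library's <code>difflib</code> (the weights and the group shape are illustrative, not Kernel's tuned values):</p>

```python
from difflib import SequenceMatcher

def score_group(request: str, group: dict) -> float:
    """Blend alias and fuzzy-name signals into one 0..1 score.

    `group` mirrors the AKA custom attribute: {"name": ..., "aka": [...]}.
    """
    req = request.lower()
    alias_hit = any(alias.lower() in req for alias in group.get("aka", []))
    fuzzy = SequenceMatcher(None, req, group["name"].lower()).ratio()
    return 0.6 * float(alias_hit) + 0.4 * fuzzy

def best_match(request: str, groups: list[dict]) -> dict:
    return max(groups, key=lambda g: score_group(request, g))
```

<p>The semantic-ranking and Claude-reranking passes then only need to adjudicate among the top few candidates this cheap stage surfaces.</p>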
<p>The approver for each group is also stored as a custom Okta attribute, so Kernel knows exactly who to ping for approval without any hardcoded config.</p>
<p>When access is approved, a <strong>Celery task</strong> on the <code>critical</code> queue provisions the membership via the Okta Groups API within seconds. If it fails, there's a dead-letter mechanism that logs to Redis and alerts via Slack.</p>
<hr />
<h2>Playbooks: IT Automation Without Code</h2>
<p>One of my favorite features is the <strong>Playbook system</strong>. It lets IT admins define multi-step workflows in a no-code/low-code editor that Kernel can execute.</p>
<p>A playbook might look like:</p>
<ol>
<li><p>Show the user a form asking for their department and use case</p>
</li>
<li><p>Make an HTTP call to Workato to trigger an RPA workflow</p>
</li>
<li><p>Based on the response, branch: if approved → message user; if pending → create Jira ticket</p>
</li>
</ol>
<p>The playbook executor handles:</p>
<ul>
<li><p><strong>Form rendering</strong> in Slack Block Kit modals</p>
</li>
<li><p><strong>Conditional branching</strong> based on LLM decisions or API response codes</p>
</li>
<li><p><strong>HTTP steps</strong> with templated bodies (user data interpolated from form inputs)</p>
</li>
<li><p><strong>Slack message steps</strong> with rich formatting</p>
</li>
</ul>
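<p>Conceptually, the executor is an interpreter over a list of step dicts. A heavily simplified sketch (the step schema and field names are invented for illustration, and the injected <code>http</code> callable stands in for real HTTP calls):</p>

```python
def run_playbook(steps: list[dict], context: dict, http=None) -> list[str]:
    """Execute a toy playbook and return the log of actions taken."""
    log = []
    for step in steps:
        kind = step["type"]
        if kind == "form":
            context.update(step.get("defaults", {}))  # pretend the user submitted
            log.append("form shown")
        elif kind == "http":
            # Template the body from form inputs, then call out.
            body = {k: v.format(**context) for k, v in step["body"].items()}
            context["http_status"] = http(step["url"], body)
            log.append(f"http {context['http_status']}")
        elif kind == "branch":
            branch = step["cases"].get(str(context.get("http_status")), [])
            log.extend(run_playbook(branch, context, http))
        elif kind == "message":
            log.append("message: " + step["text"].format(**context))
    return log
```
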
<p>Test versions of playbooks can be run in a dedicated test channel without affecting real users, which made iteration fast.</p>
<hr />
<h2>JML: The Joiner/Mover/Leaver Automation</h2>
<p>One of the highest-ROI features wasn't AI at all: it was <strong>lifecycle automation</strong>.</p>
<p>Kernel listens to Okta Event Hook webhooks for three lifecycle events:</p>
<ul>
<li><p><strong>Joiner</strong> (new hire activates) → auto-add to standard groups, send welcome DM, create onboarding Jira ticket</p>
</li>
<li><p><strong>Mover</strong> (department change) → trigger access review, notify manager</p>
</li>
<li><p><strong>Leaver</strong> (deactivation) → revoke all access, open offboarding ticket, notify IT</p>
</li>
</ul>
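<p>The dispatch itself is a small lookup table from Okta event types to handlers. A sketch (the handler actions are placeholders, and <code>user.account.update_profile</code> stands in for whatever mover signal you filter on upstream):</p>

```python
# Map Okta Event Hook event types to lifecycle handlers.
def on_joiner(user): return [f"add {user} to standard groups", "send welcome DM"]
def on_mover(user):  return [f"trigger access review for {user}", "notify manager"]
def on_leaver(user): return [f"revoke all access for {user}", "open offboarding ticket"]

HANDLERS = {
    "user.lifecycle.activate": on_joiner,
    "user.account.update_profile": on_mover,   # filtered to department changes
    "user.lifecycle.deactivate": on_leaver,
}

def handle_event(event: dict) -> list[str]:
    handler = HANDLERS.get(event["eventType"])
    return handler(event["target"]) if handler else []
```
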
<p>This replaced a manual checklist that took 30-45 minutes per employee. For a company onboarding dozens of people a month, the time savings added up fast.</p>
<hr />
<h2>The Background Task Architecture</h2>
<p>Kernel runs <strong>six batches of background tasks</strong> (numbered 0–5), staggered on startup to avoid thundering-herd spikes on the database and Redis:</p>
<pre><code class="language-python"># Batch 0 (0s): follow-up checker + approval checker
# Batch 1 (3s): Okta sync + access expiry
# Batch 2 (6s): Confluence sync + SLA alerts + stale tickets
# Batch 3 (9s): digest + tips + access revocation + incident detector
# Batch 4 (12s): playbook scheduler + queue escalation + weekly report
# Batch 5 (15s): KB gap analysis + user profiles + shadow IT + trend forecast
</code></pre>
<p>Each batch introduces a 3-second delay before spawning its children. This simple trick eliminated the startup spike we were seeing in Cloud SQL connection pool metrics.</p>
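<p>The staggering trick is a few lines of asyncio. A runnable sketch (Kernel's real version spawns long-lived tasks rather than collecting names):</p>

```python
import asyncio

async def start_batches(batches, stagger: float = 3.0):
    # Launch each batch `stagger` seconds after the previous one so the
    # DB/Redis connection pools warm up gradually instead of all at once.
    started = []
    for i, batch in enumerate(batches):
        if i:
            await asyncio.sleep(stagger)
        for task in batch:
            started.append(task)  # in Kernel: asyncio.create_task(task())
    return started
```
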
<hr />
<h2>The Dashboard: Okta SSO + Redis Sessions</h2>
<p>The admin dashboard is a FastAPI-served HTML/JS single-page app protected by <strong>Okta OIDC authentication</strong>.</p>
<p>The flow:</p>
<ol>
<li><p>User hits <code>/</code> → checks Redis for a valid <code>kernel_session</code> cookie</p>
</li>
<li><p>If no session → redirect to <code>/auth/login</code> → redirect to Okta authorize endpoint</p>
</li>
<li><p>Okta redirects back to <code>/auth/callback?code=...&amp;state=...</code></p>
</li>
<li><p>State is verified against a Redis key (CSRF protection), code is exchanged for tokens</p>
</li>
<li><p>User info is fetched from Okta's <code>/v1/userinfo</code> endpoint</p>
</li>
<li><p><strong>Admin group membership is checked</strong> — only members of <code>App-Kernel-Admins</code> can proceed</p>
</li>
<li><p>Session token stored in Redis with configurable TTL, HTTP-only secure cookie set</p>
</li>
</ol>
<p>One bug that bit me hard: the Okta admin group check was <strong>case-sensitive</strong>. Our configmap had <code>APP-Kernel-Admins</code> but the actual Okta group was <code>App-Kernel-Admins</code>. Every login attempt was silently denied. It took me longer than I'd like to admit to spot that one.</p>
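<p>The fix was to stop comparing group names byte-for-byte. A defensive version of the check (a sketch, not Kernel's exact code):</p>

```python
def is_admin(user_groups: list[str], admin_group: str) -> bool:
    # Okta group names are effectively case-insensitive identifiers from
    # an operator's point of view, so normalize both sides before comparing.
    wanted = admin_group.casefold()
    return any(g.casefold() == wanted for g in user_groups)
```
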
<hr />
<h2>SCIM: Letting Okta Manage Users Automatically</h2>
<p>Instead of manually managing which users have dashboard access, Kernel implements the <strong>SCIM 2.0 protocol</strong> — so Okta can automatically provision and deprovision dashboard accounts.</p>
<p>When an Okta admin assigns someone to the Kernel app:</p>
<ol>
<li><p>Okta sends a <code>POST /scim/v2/Users</code> request to Kernel</p>
</li>
<li><p>Kernel creates or updates the user in the database</p>
</li>
<li><p>The user can immediately log in with their Okta credentials</p>
</li>
</ol>
<p>The SCIM endpoint is protected by a Bearer token (<code>SCIM_BEARER_TOKEN</code>), and the entire <code>/scim/v2</code> path is whitelisted through Cloud Armor.</p>
<p>Speaking of Cloud Armor — connecting Okta's SCIM provisioning to a Cloud Armor-protected endpoint required allowlisting <strong>269 unique Okta egress IPs</strong> across 27 firewall rules. That was a fun afternoon.</p>
<hr />
<h2>Secrets: GCP Secret Manager in Production</h2>
<p>In production, there's no <code>.env</code> file. Secrets are loaded from <strong>GCP Secret Manager</strong> at startup, before any <code>Settings</code> objects are initialized:</p>
<pre><code class="language-python"># api/main.py — must run before ANYTHING else
from core.secret_manager import load_secrets_into_env
load_secrets_into_env()
</code></pre>
<p>The secret manager pulls a predefined list of secrets by name, injects them into <code>os.environ</code>, and then Pydantic's <code>Settings</code> picks them up as if they were environment variables.</p>
<p>This means local dev uses a <code>.env</code> file and production uses Secret Manager — with zero code changes. The <code>KERNEL_ENV</code> variable is the only switch:</p>
<pre><code class="language-plaintext">KERNEL_ENV=local      → use .env file
KERNEL_ENV=production → use GCP Secret Manager
</code></pre>
<hr />
<h2>Deploying to GKE (Without a CI/CD Pipeline)</h2>
<p>When I first needed to test changes in the dev cluster, I didn't have a CI/CD pipeline. So I learned the manual deploy workflow the hard way.</p>
<p>The gotcha that cost me an hour: <strong>building Docker images on Apple Silicon (M2) for GKE (x86_64)</strong>.</p>
<p>If you just run <code>docker build</code> on an M2 Mac, you get an ARM image. Deploy that to GKE and you get:</p>
<pre><code class="language-plaintext">exec /usr/local/bin/python3: exec format error
</code></pre>
<p>The fix is always:</p>
<pre><code class="language-bash">docker buildx build --platform linux/amd64 -t gcr.io/your-project/kernel:tag . --push
</code></pre>
<p>The deployment steps I use:</p>
<pre><code class="language-bash"># 1. Build and push (always linux/amd64)
docker buildx build --platform linux/amd64 \
  -t gcr.io/GCP-PROJECT-ID/kernel:$(git rev-parse --short HEAD) . --push

# 2. Update the deployment image
kubectl set image deployment/kernel-api \
  kernel=gcr.io/GCP-PROJECT-ID/kernel:$(git rev-parse --short HEAD) \
  -n kernel

# 3. Watch the rollout
kubectl rollout status deployment/kernel-api -n kernel

# 4. Check logs
kubectl logs -l app=kernel,role=api -n kernel --tail=50 -f
</code></pre>
<hr />
<h2>Observability: Knowing When Things Break</h2>
<h3>Sentry for Error Tracking</h3>
<p>Sentry is configured with three integrations — FastAPI, SQLAlchemy, and Redis — and a custom <code>before_send</code> hook that strips PII before anything leaves the server:</p>
<pre><code class="language-python">def _before_send(event: dict, hint: dict) -&gt; dict | None:
    return redact_dict(event)
</code></pre>
<p>Health check routes are excluded from traces to avoid noise.</p>
<h3>PII Redaction in Logs</h3>
<p>Every log line passes through a <code>PIIRedactingFilter</code> that strips emails, phone numbers, SSNs, and API keys using regex patterns. This is non-negotiable when you're logging Slack messages that might contain personal data.</p>
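<p>A minimal version of such a filter using stdlib <code>logging</code> (the regexes shown cover only emails, SSNs, and a couple of token prefixes; the production set is larger):</p>

```python
import logging
import re

# Illustrative patterns, not the production list.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:xoxb|sk)-[\w-]{10,}\b"), "[API_KEY]"),
]

class PIIRedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern, repl in PATTERNS:
            msg = pattern.sub(repl, msg)
        record.msg, record.args = msg, ()
        return True  # never drop the record, only scrub it
```
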
<h3>Celery Worker Health</h3>
<p>A background loop pings Celery every 5 minutes and alerts the <code>#it-ops</code> Slack channel if no workers are detected. Okta provisioning runs on Celery, so a dead worker means access requests silently stall — exactly the kind of failure that's invisible until an employee escalates.</p>
<hr />
<h2>What I'd Do Differently</h2>
<p><strong>1. Start with playbooks, not custom agent code.</strong> The playbook system ended up being more powerful and more maintainable than custom agent nodes. I should have built it first and used it to prototype workflows before hardcoding anything.</p>
<p><strong>2. Set up CI/CD before anything else.</strong> Manually building Docker images and running kubectl commands is fine for a prototype. For anything beyond that, it creates too much friction. The deployment steps are well-documented now, but they should be automated.</p>
<p><strong>3. pgvector is deceptively powerful.</strong> I almost used a dedicated vector database (Pinecone, Weaviate). Using pgvector meant one fewer service to manage, and PostgreSQL's ACID guarantees made the KB index updates much simpler to reason about.</p>
<p><strong>4. Confidence thresholds need real data to tune.</strong> My initial thresholds were guesses. It took a few weeks of real traffic to calibrate them properly. Build in an A/B testing mechanism from the start.</p>
<p><strong>5. The Okta group AKA system saved us.</strong> Storing aliases as custom Okta attributes (instead of a separate database table) meant there was one source of truth. IT admins could update them directly in Okta without touching Kernel.</p>
<hr />
<h2>The Numbers (After 8 Weeks)</h2>
<ul>
<li><p><strong>78% deflection rate</strong> — nearly 4 in 5 requests resolved without a human</p>
</li>
<li><p><strong>~45 seconds</strong> average time to resolution for access requests (vs. 2–4 hours manually)</p>
</li>
<li><p><strong>0 manual onboarding tickets</strong> since JML automation went live</p>
</li>
<li><p><strong>$0 in vector database costs</strong> — pgvector handles the load fine</p>
</li>
</ul>
<hr />
<h2>Open Questions and What's Next</h2>
<p>A few things I'm still working through:</p>
<ul>
<li><p><strong>Multi-tenant support</strong>: Right now Kernel is single-tenant. The architecture supports it, but the Okta group model would need per-tenant scoping.</p>
</li>
<li><p><strong>Teams adapter</strong>: There's a disabled Microsoft Teams route in the codebase. If we ever need it, the Slack Bolt patterns translate pretty cleanly.</p>
</li>
<li><p><strong>LLM evaluation</strong>: I want a proper offline eval suite so I can test model upgrades without deploying to prod first.</p>
</li>
<li><p><strong>Playbook versioning</strong>: Right now there's a "test" and a "published" version. A proper version history with rollback would make playbook management much safer.</p>
</li>
</ul>
<hr />
<h2>Final Thoughts</h2>
<p>Building Kernel taught me that <strong>the hardest problems weren't the AI parts</strong> — they were the integration problems. Getting Okta groups to match reliably. Getting Cloud Armor to cooperate with Okta's egress IPs. Getting Celery to behave gracefully when Redis restarts.</p>
<p>The AI is almost the easy part. Claude is remarkably good at intent classification and response composition when you give it well-structured context. LangGraph makes the stateful orchestration manageable. pgvector makes semantic search approachable without a PhD.</p>
<p>What makes a system like this actually work in production is <strong>all the boring stuff around the AI</strong>: the dead-letter queues, the PII redaction, the circuit breakers, the health checks, the SCIM provisioning, the audit logs.</p>
<p>If you're thinking about building something similar for your team, I'd encourage you to start small, just the intent classifier and a single escalation path to Jira. Get real data. Then expand. The architecture scales, but your mental model of the system needs to scale with it.</p>
<hr />
<p><em>Built with FastAPI, LangGraph, Claude AI (Anthropic), Okta, Slack Bolt, PostgreSQL + pgvector, Redis, Celery, and a lot of patience.</em></p>
]]></content:encoded></item><item><title><![CDATA[From Spreadsheets to Automation: Rethinking SOX User Access Reviews with Airflow, Okta, and AI]]></title><description><![CDATA[Every quarter, someone at your company exports a spreadsheet of who has access to what, emails it to a dozen app owners, and then spends the next two weeks chasing responses. When the responses finall]]></description><link>https://blog.inguva.dev/from-spreadsheets-to-automation-rethinking-sox-user-access-reviews-with-airflow-okta-and-ai</link><guid isPermaLink="true">https://blog.inguva.dev/from-spreadsheets-to-automation-rethinking-sox-user-access-reviews-with-airflow-okta-and-ai</guid><category><![CDATA[okta]]></category><category><![CDATA[Identity]]></category><category><![CDATA[automation]]></category><category><![CDATA[AI]]></category><category><![CDATA[airflow]]></category><category><![CDATA[SOX Compliance ]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Wed, 18 Mar 2026 04:53:20 GMT</pubDate><content:encoded><![CDATA[<hr />
<p>Every quarter, someone at your company exports a spreadsheet of who has access to what, emails it to a dozen app owners, and then spends the next two weeks chasing responses. When the responses finally come in, someone else manually revokes access, takes a screenshot, and drops it in a shared drive folder called something like "Q1 2026 UAR Evidence FINAL v3."</p>
<p>I've been that person. I've also been the engineer sitting next to that person thinking — this entire workflow is automatable.</p>
<p>So I automated it.</p>
<hr />
<h2>What I built</h2>
<p>A fully automated SOX User Access Review pipeline using tools I already had running:</p>
<ul>
<li><p>Apache Airflow on a self-hosted GCP VM</p>
</li>
<li><p>Okta developer tenant</p>
</li>
<li><p>Terraform for test data</p>
</li>
<li><p>Jira and Confluence free tier</p>
</li>
<li><p>Claude API for AI-powered risk scoring</p>
</li>
<li><p>Slack for notifications</p>
</li>
</ul>
<p>No enterprise licenses. No professional services engagement. Just Python, APIs, and a free Sunday.</p>
<hr />
<h2>The actual problem with UAR</h2>
<p>SOX compliance requires quarterly reviews of who has access to important systems like NetSuite, Salesforce, Workday, GitHub, whatever your company uses. The process needs three things:</p>
<p>A point-in-time snapshot of access that can't be retroactively edited. Certification or revocation from the app owner. Documented proof that revocations actually happened.</p>
<p>The reason this lives in spreadsheets is inertia, not complexity. The data is all there. Okta knows exactly who has access to what. The problem is that nobody has connected the dots into an automated workflow.</p>
<hr />
<h2>Starting with realistic test data</h2>
<p>Before I could automate a review, I needed something worth reviewing. I used the Okta Terraform provider to create four SOX-scoped groups and ten test users spread across Finance, IT, HR, Sales, and one contractor.</p>
<p>The important part: I intentionally embedded real audit findings into the data.</p>
<p><code>dave.kim</code> is an IT Manager with access to <code>SOX-NetSuite-Admins</code>. That's a Segregation of Duties violation because IT shouldn't have admin access to the ERP that Finance uses.</p>
<p><code>ivan.petrov</code> is a Finance Contractor sitting in <code>SOX-NetSuite-Users</code>. Contractors with persistent ERP access are one of the first things external auditors flag.</p>
<p><code>carol.wong</code> is a Controller assigned to both NetSuite Admin and Workday Admin groups. Dual financial privilege across systems.</p>
<p>These three findings aren't hypothetical. I've seen all three in real environments. Having them in the test data made every demo conversation immediately credible.</p>
<hr />
<h2>The quarterly snapshot DAG</h2>
<p>The <code>uar_quarterly</code> DAG fires on the first of January, April, July, and October.</p>
<p>Task one pulls every Okta group prefixed with <code>SOX-</code>, fetches their current members, and ships the data to Claude with a prompt that reads roughly like a briefing to a SOX auditor: here's the access list, find SoD violations, contractor access, excessive privilege, and dormant accounts, return your findings as JSON.</p>
<p>Claude returns something like this:</p>
<pre><code class="language-json">{
  "system": "NetSuite",
  "overall_risk": "HIGH",
  "findings": [
    {
      "user": "dave.kim@company.com",
      "risk": "HIGH",
      "finding": "IT Manager with NetSuite Admin access. Creates Segregation of Duties violation. Recommend revoking Admin group membership."
    }
  ]
}
</code></pre>
<p>That JSON gets embedded into a Confluence page with a blue audit evidence panel at the top showing the exact UTC timestamp, the Airflow run ID, and a note to export to PDF for audit submission. The timestamp comes from Atlassian's server, not from my code, which is what makes it credible as audit evidence.</p>
<p>Task two creates a Jira ticket per SOX system with the AI risk summary at the top and the Confluence page linked. Anything Claude flagged as HIGH risk gets Priority: Highest automatically.</p>
<p>Task three sends a Slack message with links to all the tickets and calls out which systems need immediate attention.</p>
<hr />
<h2>The revocation DAG</h2>
<p>This one runs daily during the 30-day review window.</p>
<p>App owners respond to their Jira ticket with a comment using a simple syntax:</p>
<pre><code class="language-plaintext">REVOKE: ivan.petrov@company.com
CERTIFY: all
</code></pre>
<p>The DAG reads every open UAR ticket, parses comments for those keywords, and for any REVOKE instruction it calls Okta's API directly to remove the user from the group. It then appends a timestamped red revocation record panel to the Confluence page with the requester's name, the affected user, and the specific groups removed. When everything is certified, the ticket moves to Done automatically.</p>
<p>The Confluence page ends up being a complete audit trail. Snapshot at the top. Revocation evidence appended at the bottom. Auditors get a single URL they can export to PDF.</p>
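<p>The comment parsing is a small regex pass. A sketch of how those keywords might be extracted (the function and field names are mine, not the DAG's actual code):</p>

```python
import re

LINE = re.compile(r"^\s*(REVOKE|CERTIFY):\s*(\S+)\s*$", re.MULTILINE | re.IGNORECASE)

def parse_uar_comment(comment: str) -> dict:
    """Parse app-owner instructions out of a Jira comment body."""
    actions = {"revoke": [], "certify_all": False}
    for verb, target in LINE.findall(comment):
        if verb.upper() == "REVOKE":
            actions["revoke"].append(target.lower())
        elif target.lower() == "all":
            actions["certify_all"] = True
    return actions
```
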
<hr />
<h2>Wiring Jira to Airflow</h2>
<p>I didn't want app owners to need to know Airflow exists. They comment on a Jira ticket and things just happen.</p>
<p>A Jira Automation rule watches for comments matching <code>CERTIFY:</code> or <code>REVOKE:</code> on tickets labeled <code>uar</code>, and fires a webhook to Airflow's REST API to trigger the revocation DAG. The full loop (comment, Okta revocation, Confluence update, Slack notification) runs in under 30 seconds.</p>
<hr />
<h2>What auditors actually get</h2>
<table>
<thead>
<tr>
<th>Artifact</th>
<th>Where it lives</th>
<th>Why it holds up</th>
</tr>
</thead>
<tbody><tr>
<td>Access snapshot</td>
<td>Confluence page</td>
<td>Atlassian server timestamp in page history</td>
</tr>
<tr>
<td>AI risk findings</td>
<td>Embedded in snapshot</td>
<td>Reproducible from the same Okta data</td>
</tr>
<tr>
<td>App owner certification</td>
<td>Jira comment</td>
<td>Author and timestamp recorded by Jira</td>
</tr>
<tr>
<td>Revocation record</td>
<td>Confluence panel</td>
<td>Okta API call timestamp plus Airflow run ID</td>
</tr>
<tr>
<td>PDF export</td>
<td>Confluence export</td>
<td>System-generated header with timestamps</td>
</tr>
</tbody></table>
<p>The key design decision throughout was making sure timestamps come from the systems, not from my code. An auditor can verify the Confluence page history in Atlassian directly. They can check the Jira comment timestamp. They're not trusting my Python.</p>
<hr />
<h2>Three things I learned building this</h2>
<p>Terraform is the right tool for test data. A Python seed script would have worked, but Terraform gives you version-controlled, reviewable, idempotent data. When Okta's API rejected one of my test configurations, I pivoted to a different approach in minutes because the state was explicit.</p>
<p>Claude makes identity risk legible. The raw output from Okta, a list of users and group memberships, means nothing to an auditor. A sentence like "IT Manager with NetSuite Admin access creates a Segregation of Duties violation; recommend revoking Admin access" means everything. The AI layer doesn't replace the review; it makes the review faster and more consistent.</p>
<p>Free-tier constraints make better architecture. No enterprise Okta, no managed Airflow, no Jira Premium. Every design decision had to work within real limits, which meant simpler integrations, fewer dependencies, and a result that's more portable and easier to explain.</p>
<hr />
<h2>What I'm building next</h2>
<p>A reminder DAG that pings app owners daily as the 30-day review window nears its deadline.</p>
]]></content:encoded></item><item><title><![CDATA[Infrastructure as Code: Managing Okta, GCP, and Cloudflare with Terraform]]></title><description><![CDATA[Yesterday I automated employee onboarding with Okta and Airflow. The day before that, I built the entire platform from scratch for $10/month.
Today I asked a different question: what happens when I ne]]></description><link>https://blog.inguva.dev/infrastructure-as-code-managing-okta-gcp-and-cloudflare-with-terraform</link><guid isPermaLink="true">https://blog.inguva.dev/infrastructure-as-code-managing-okta-gcp-and-cloudflare-with-terraform</guid><category><![CDATA[Terraform]]></category><category><![CDATA[okta]]></category><category><![CDATA[IAM]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Tue, 17 Mar 2026 03:20:01 GMT</pubDate><content:encoded><![CDATA[<hr />
<p>Yesterday I automated employee onboarding with Okta and Airflow. The day before that, I built the entire platform from scratch for $10/month.</p>
<p>Today I asked a different question: what happens when I need to rebuild all of it?</p>
<p>Without Infrastructure as Code, the answer is: hours of clicking through dashboards, hoping you remember every setting, every DNS record, every Okta app configuration. With Terraform, the answer is: <code>terraform apply</code>.</p>
<p>This is the story of how I took everything I built and turned it into code.</p>
<hr />
<h2>Why Terraform</h2>
<p>I've been managing Okta, Cloudflare, and GCP through their respective UIs. It works — until it doesn't.</p>
<p>The problems with manual infrastructure management are subtle at first. A DNS record gets changed and nobody remembers why. An Okta app's redirect URI gets updated during a migration and the old value is lost. A firewall rule exists but nobody can explain when it was added or what it's for.</p>
<p>Terraform solves all of this. Every resource is defined in a <code>.tf</code> file, committed to Git, and applied through a controlled workflow. The state of your infrastructure becomes a fact, not a memory.</p>
<hr />
<h2>The Stack</h2>
<p>By the end of today, three providers are fully managed as code:</p>
<pre><code class="language-plaintext">terraform-inguva/
├── gcp/          ← VM, static IP, firewall rules
├── okta/         ← SSO app, groups, users, assignments
└── cloudflare/   ← A, CNAME, TXT, DMARC, SPF records
</code></pre>
<p>State for all three is stored in <strong>Terraform Cloud</strong> (free tier, up to 500 resources). Every plan and apply runs remotely with a full audit log.</p>
<hr />
<h2>Phase 1: GCP</h2>
<p>The GCP setup was straightforward. One VM, one static IP, two firewall rules. The interesting part was importing existing resources rather than creating new ones.</p>
<pre><code class="language-hcl">resource "google_compute_instance" "airflow_server" {
  name         = "airflow-server"
  machine_type = "e2-medium"
  zone         = var.zone

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
      size  = 20
    }
  }

  lifecycle {
    prevent_destroy = true
  }
}
</code></pre>
<p>The <code>lifecycle.prevent_destroy = true</code> block is worth highlighting. It's a safety net — Terraform will refuse to destroy this resource even if you accidentally write code that would do so. For a production VM running Airflow, that's non-negotiable.</p>
<p>Importing existing resources is done with <code>terraform import</code>:</p>
<pre><code class="language-bash">terraform import \
  google_compute_instance.airflow_server \
  &lt;GCP_PROJECT_ID&gt;/us-central1-a/airflow-server
</code></pre>
<p>One command, and Terraform now knows about a resource that's been running for weeks.</p>
<hr />
<h2>Phase 2: Okta</h2>
<p>This is where it gets interesting for IAM engineers.</p>
<p>The Okta Terraform provider is officially maintained by Okta and covers nearly everything: apps, groups, users, policies, authorization servers, and more. For our setup, the key resources are:</p>
<pre><code class="language-hcl"># OIDC app for Airflow SSO
resource "okta_app_oauth" "airflow_sso" {
  label          = "Airflow SSO"
  type           = "web"
  grant_types    = ["authorization_code"]
  response_types = ["code"]

  redirect_uris = [
    "https://&lt;your-airflow-domain&gt;/oauth-authorized/okta"
  ]

  lifecycle {
    prevent_destroy = true
    ignore_changes  = [consent_method, hide_web, issuer_mode, login_mode]
  }
}

# Groups for role-based access
resource "okta_group" "airflow_admins" {
  name        = "airflow-admins"
  description = "Airflow administrators — mapped to Admin role"
}

# Assign groups to the app
resource "okta_app_group_assignment" "airflow_admins" {
  app_id   = okta_app_oauth.airflow_sso.id
  group_id = okta_group.airflow_admins.id
}
</code></pre>
<p>The <code>ignore_changes</code> lifecycle block deserves explanation. Some Okta app attributes get set by Okta itself after creation and differ from what you'd specify in code. Without <code>ignore_changes</code>, every <code>terraform plan</code> would show a diff for those attributes even though nothing meaningful has changed. This is a common pattern when importing existing resources into Terraform state.</p>
<p>The most powerful thing about managing Okta with Terraform is the dependency graph. When you write:</p>
<pre><code class="language-hcl">group_id = okta_group.airflow_admins.id
</code></pre>
<p>Terraform automatically knows to create the group before the assignment. You never have to think about order of operations.</p>
<hr />
<h2>Phase 3: Cloudflare</h2>
<p>Every DNS record for my domain is now code:</p>
<pre><code class="language-hcl">resource "cloudflare_record" "airflow" {
  zone_id         = var.zone_id
  name            = "airflow"
  content         = "&lt;VM_IP&gt;"
  type            = "A"
  proxied         = false
  allow_overwrite = false
}
</code></pre>
<p>The import process revealed something interesting: Cloudflare's MX records and DKIM records for Email Routing are marked <code>read_only</code> and cannot be managed via API. Terraform returned a clear error:</p>
<pre><code class="language-plaintext">Error: This record is managed by Email Routing.
Disable Email Routing to modify/remove this record. (1046)
</code></pre>
<p>The right response wasn't to fight it — it was to remove those records from state and document them as comments. Not everything needs to be in Terraform. The goal is to manage what you can, document what you can't, and never let the perfect be the enemy of the good.</p>
<hr />
<h2>The Import Pattern</h2>
<p>The most underrated Terraform skill is importing existing infrastructure. Most Terraform tutorials start from scratch. Real-world IAM engineering never does.</p>
<p>The workflow is:</p>
<ol>
<li><p>Write the resource block in <code>.tf</code> to match what exists</p>
</li>
<li><p>Run <code>terraform import &lt;resource&gt; &lt;id&gt;</code></p>
</li>
<li><p>Run <code>terraform plan</code> — if you see no changes, your code matches reality</p>
</li>
<li><p>If you see changes, adjust <code>ignore_changes</code> or fix the values</p>
</li>
</ol>
<p>This is exactly how you'd onboard an existing Okta org, an existing GCP project, or an existing DNS setup into Terraform management. It's one of the most valuable practical skills for a Senior IAM Engineer or IT Platform Engineer.</p>
<hr />
<h2>What's Next</h2>
<p>The Terraform foundation is in place. Three logical next steps:</p>
<p><strong>Modules</strong> — the current code has duplication. A reusable <code>okta-app</code> module that takes an app name and redirect URI as inputs would make adding new SSO apps a 5-line operation.</p>
<p><strong>for_each</strong> — the Cloudflare A records are nearly identical. Refactoring them into a single <code>for_each</code> block would be cleaner and easier to maintain.</p>
<p><strong>CI/CD</strong> — right now Terraform runs from my Mac. The next step is a GitHub Actions workflow that runs <code>terraform plan</code> on every PR and <code>terraform apply</code> on merge to main. Automated, auditable, and safe.</p>
<p>The code is at <a href="https://github.com/chanderinguva/terraform-inguva">github.com/chanderinguva/terraform-inguva</a> if you want to see the full implementation.</p>
<hr />
<h2>The Bigger Picture</h2>
<p>Managing identity infrastructure manually doesn't scale. As soon as you have more than a handful of Okta apps, more than one engineer touching DNS, or more than one environment to maintain, the lack of version control becomes a liability.</p>
<p>Terraform changes the conversation from "what did we change?" to "what does our infrastructure look like, and here's the commit that shows why."</p>
<p>For IAM engineers specifically, this is the difference between being the person who clicks through the Okta admin console and being the person who owns the identity platform as code.</p>
]]></content:encoded></item><item><title><![CDATA[How I mapped an entire platform stack to one domain using Cloudflare]]></title><description><![CDATA[One domain. Six subdomains. Airflow, Okta, Hashnode, email routing, Atlassian verification, and a redirect rule — all managed through Cloudflare's free DNS for $10/year.


When I set out to build a pe]]></description><link>https://blog.inguva.dev/how-i-mapped-an-entire-platform-stack-to-one-domain-using-cloudflare</link><guid isPermaLink="true">https://blog.inguva.dev/how-i-mapped-an-entire-platform-stack-to-one-domain-using-cloudflare</guid><category><![CDATA[cloudflare]]></category><category><![CDATA[dns]]></category><category><![CDATA[Platform Engineering ]]></category><category><![CDATA[Devops]]></category><category><![CDATA[infrastructure]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Mon, 16 Mar 2026 04:55:50 GMT</pubDate><content:encoded><![CDATA[<blockquote>
<p>One domain. Six subdomains. Airflow, Okta, Hashnode, email routing, Atlassian verification, and a redirect rule — all managed through Cloudflare's free DNS for $10/year.</p>
</blockquote>
<hr />
<p>When I set out to build a personal automation platform, I wanted everything to live under a single professional domain. No more auto-generated vendor subdomains — just clean, memorable URLs that look like real production infrastructure. Here's exactly how I mapped an entire stack to a single custom domain using Cloudflare.</p>
<hr />
<h2>The complete domain map</h2>
<table>
<thead>
<tr>
<th>Subdomain</th>
<th>Points to</th>
<th>Record type</th>
</tr>
</thead>
<tbody><tr>
<td><code>airflow.yourdomain.dev</code></td>
<td>Self-hosted app on GCP VM</td>
<td>A record</td>
</tr>
<tr>
<td><code>blog.yourdomain.dev</code></td>
<td>Hashnode blog</td>
<td>CNAME</td>
</tr>
<tr>
<td><code>login.yourdomain.dev</code></td>
<td>Okta tenant</td>
<td>Redirect Rule</td>
</tr>
<tr>
<td><code>you@yourdomain.dev</code></td>
<td>Forwards to Gmail</td>
<td>MX + Email Routing</td>
</tr>
<tr>
<td><code>admin@yourdomain.dev</code></td>
<td>Service signups (Okta, GCP)</td>
<td>MX + Email Routing</td>
</tr>
<tr>
<td><code>dev@yourdomain.dev</code></td>
<td>Developer tools (GitHub, etc.)</td>
<td>MX + Email Routing</td>
</tr>
</tbody></table>
<hr />
<h2>Why Cloudflare for DNS?</h2>
<p>I registered my <code>.dev</code> domain through Cloudflare Registrar at cost (~$10/year), with no markup. The real value is the feature set that comes completely free: WHOIS privacy, SSL proxying, redirect rules, email routing, and a clean API for automation.</p>
<blockquote>
<p>Cloudflare's free DNS tier is genuinely enterprise-grade. The same infrastructure that protects Fortune 500 companies handles a $10/year personal domain identically.</p>
</blockquote>
<hr />
<h2>DNS record by record</h2>
<h3>1. Self-hosted app — A record</h3>
<p>The simplest record type. An A record maps a hostname directly to an IPv4 address. For any self-hosted service on a cloud VM with a reserved static IP, this is the right record type.</p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Name</th>
<th>Content</th>
<th>Proxy</th>
</tr>
</thead>
<tbody><tr>
<td>A</td>
<td>airflow</td>
<td><code>&lt;your static IP&gt;</code></td>
<td>DNS only</td>
</tr>
</tbody></table>
<p><strong>Key detail:</strong> proxy status must be DNS only (grey cloud), not proxied (orange cloud). When your VM handles its own SSL via Let's Encrypt and Nginx, Cloudflare proxying would cause a double-SSL conflict. Always use DNS only for self-managed certificates.</p>
<hr />
<h3>2. Third-party hosted service — CNAME record</h3>
<p>A CNAME is an alias — it says "resolve this hostname as if it were that other hostname." Use it whenever a SaaS platform gives you a target hostname to point at rather than an IP address.</p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Name</th>
<th>Target</th>
<th>Proxy</th>
</tr>
</thead>
<tbody><tr>
<td>CNAME</td>
<td>blog</td>
<td>hashnode.network</td>
<td>DNS only</td>
</tr>
</tbody></table>
<p>Again DNS only — the hosting provider provisions its own SSL certificate for your custom domain. Cloudflare proxying would intercept that certificate handshake and break it.</p>
<hr />
<h3>3. Vendor tenant redirect — Cloudflare Redirect Rule</h3>
<p>This is where it gets interesting. Some SaaS platforms (like Okta's free tier) don't support custom domains natively. Cloudflare's redirect rules let you create a branded subdomain that redirects to the vendor URL — giving you a clean URL without needing the vendor's paid plan.</p>
<pre><code class="language-plaintext">Rule: If hostname equals login.yourdomain.dev
Then: Static redirect → https://your-tenant.okta.com
Status: 302
</code></pre>
<p><strong>Important:</strong> redirect rules only fire on proxied DNS records. You need a dummy A record pointing to a placeholder IP with the orange cloud enabled. Cloudflare intercepts the request before it ever reaches that IP and fires the redirect.</p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Name</th>
<th>Content</th>
<th>Proxy</th>
</tr>
</thead>
<tbody><tr>
<td>A</td>
<td>login</td>
<td>192.0.2.1 (placeholder)</td>
<td>Proxied (orange cloud)</td>
</tr>
<tr>
<td>Rule</td>
<td>login.*</td>
<td>→ your-tenant.okta.com</td>
<td>—</td>
</tr>
</tbody></table>
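<p>Once the rule is live, it's worth confirming that requests get redirected before they'd ever reach the placeholder IP. A small check using only the Python standard library — the hostname and tenant below are placeholders:</p>
<pre><code class="language-python"># Verify the redirect rule fires and points at the vendor tenant.
import http.client
from typing import Optional

def is_expected_redirect(status: int, location: Optional[str], expected_host: str) -> bool:
    """True if the response is a 301/302 whose Location targets the tenant."""
    return status in (301, 302) and location is not None and expected_host in location

def fetch_redirect(hostname: str):
    """Return (status, Location header) for a bare GET to the hostname."""
    conn = http.client.HTTPSConnection(hostname, timeout=10)
    conn.request("GET", "/")
    resp = conn.getresponse()
    return resp.status, resp.getheader("Location")

# Live check (requires the DNS record and rule to exist):
# status, location = fetch_redirect("login.yourdomain.dev")
# assert is_expected_redirect(status, location, "your-tenant.okta.com")
</code></pre>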
<hr />
<h3>4. Business email — MX + Email Routing</h3>
<p>Cloudflare Email Routing is one of the most underrated free features in DNS management. It lets you create unlimited custom email addresses that forward to any destination inbox — with zero mail server setup.</p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Name</th>
<th>Content</th>
<th>Priority</th>
</tr>
</thead>
<tbody><tr>
<td>MX</td>
<td>@</td>
<td>route1.mx.cloudflare.net</td>
<td>13</td>
</tr>
<tr>
<td>MX</td>
<td>@</td>
<td>route2.mx.cloudflare.net</td>
<td>30</td>
</tr>
<tr>
<td>MX</td>
<td>@</td>
<td>route3.mx.cloudflare.net</td>
<td>19</td>
</tr>
<tr>
<td>TXT</td>
<td>@</td>
<td>v=spf1 include:_spf.mx.cloudflare.net ~all</td>
<td>—</td>
</tr>
</tbody></table>
<p>I created three addresses — a personal one, an admin one for service signups, and a dev one for developer tools — all forwarding to Gmail. Using separate addresses makes filtering and org-level email management trivial.</p>
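<p>It's easy to publish a malformed SPF value and not notice until mail starts soft-failing. A tiny sanity check on the TXT value from the table above:</p>
<pre><code class="language-python"># Parse an SPF TXT value into its mechanisms; reject non-SPF strings.
def parse_spf(txt: str) -> list:
    parts = txt.split()
    if not parts or parts[0] != "v=spf1":
        raise ValueError("not an SPF record")
    return parts[1:]

# The record published for Cloudflare Email Routing:
mechanisms = parse_spf("v=spf1 include:_spf.mx.cloudflare.net ~all")
# mechanisms is ["include:_spf.mx.cloudflare.net", "~all"]
</code></pre>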
<hr />
<h3>5. Third-party domain verification — CNAME + TXT</h3>
<p>Many enterprise platforms (Atlassian, Google Workspace, etc.) require domain ownership verification before they trust your custom domain for email or SSO. They typically give you a set of CNAME records for DKIM email signing and a TXT record for verification.</p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Name</th>
<th>Purpose</th>
</tr>
</thead>
<tbody><tr>
<td>CNAME</td>
<td><code>&lt;vendor-prefix&gt;._domainkey</code></td>
<td>DKIM signing (primary)</td>
</tr>
<tr>
<td>CNAME</td>
<td><code>&lt;vendor-prefix&gt;._domainkey</code></td>
<td>DKIM signing (fallback)</td>
</tr>
<tr>
<td>CNAME</td>
<td><code>&lt;vendor&gt;-bounces</code></td>
<td>Email bounce handling</td>
</tr>
<tr>
<td>TXT</td>
<td>@</td>
<td>Domain ownership proof</td>
</tr>
</tbody></table>
<p>The name format varies by vendor — always copy the exact values from their DNS setup wizard rather than typing them manually.</p>
<hr />
<h2>Key gotchas</h2>
<p><strong>Proxy status is the most important setting to get right.</strong> The orange cloud (proxied) routes traffic through Cloudflare's network — redirect rules fire, DDoS protection activates, but your own SSL won't work. The grey cloud (DNS only) passes traffic straight to your server — your own SSL works, but Cloudflare features don't apply. Know which mode each record needs before you add it.</p>
<p><strong>Multiple TXT records on the root domain are perfectly valid.</strong> SPF, DMARC, vendor verification tokens — they all stack on <code>@</code> without conflict. DNS supports multiple TXT records on the same name.</p>
<p><strong>Never include your apex domain in the Name field.</strong> If your domain is <code>yourdomain.dev</code> and you want <code>airflow.yourdomain.dev</code>, just enter <code>airflow</code> as the Name — Cloudflare appends the domain automatically.</p>
<p><strong>Redirect rules require a proxied DNS record to exist first.</strong> You can't target a hostname in a redirect rule unless there's a proxied record for it. Create a placeholder A record pointing to <code>192.0.2.1</code> with the orange cloud — Cloudflare will intercept before the request ever reaches that IP.</p>
<hr />
<h2>The complete DNS picture</h2>
<table>
<thead>
<tr>
<th>Type</th>
<th>Name</th>
<th>Purpose</th>
<th>Proxy</th>
</tr>
</thead>
<tbody><tr>
<td>A</td>
<td>airflow</td>
<td>Self-hosted app VM</td>
<td>DNS only</td>
</tr>
<tr>
<td>CNAME</td>
<td>blog</td>
<td>Hosted blog platform</td>
<td>DNS only</td>
</tr>
<tr>
<td>A</td>
<td>login</td>
<td>Redirect rule placeholder</td>
<td>Proxied</td>
</tr>
<tr>
<td>MX</td>
<td>@</td>
<td>Email routing (x3)</td>
<td>—</td>
</tr>
<tr>
<td>CNAME</td>
<td>vendor._domainkey (x2)</td>
<td>DKIM email signing</td>
<td>DNS only</td>
</tr>
<tr>
<td>CNAME</td>
<td>vendor-bounces</td>
<td>Email bounce handling</td>
<td>DNS only</td>
</tr>
<tr>
<td>TXT</td>
<td>@</td>
<td>SPF + vendor verification</td>
<td>—</td>
</tr>
<tr>
<td>TXT</td>
<td>_dmarc</td>
<td>DMARC policy</td>
<td>—</td>
</tr>
<tr>
<td>Rule</td>
<td>login.*</td>
<td>Okta tenant redirect</td>
<td>—</td>
</tr>
</tbody></table>
<hr />
<h2>Total cost</h2>
<table>
<thead>
<tr>
<th>Component</th>
<th>Cost</th>
</tr>
</thead>
<tbody><tr>
<td>Domain registration (.dev)</td>
<td>$10/yr</td>
</tr>
<tr>
<td>Cloudflare DNS</td>
<td>Free</td>
</tr>
<tr>
<td>Email routing (unlimited addresses)</td>
<td>Free</td>
</tr>
<tr>
<td>Redirect rules</td>
<td>Free</td>
</tr>
<tr>
<td>SSL proxying</td>
<td>Free</td>
</tr>
<tr>
<td>WHOIS privacy</td>
<td>Free</td>
</tr>
<tr>
<td><strong>Total</strong></td>
<td><strong>$10/yr</strong></td>
</tr>
</tbody></table>
<hr />
<h2>What's next</h2>
<p>A few additions I'm planning: a <code>status.yourdomain.dev</code> uptime page, an <code>api.yourdomain.dev</code> subdomain for internal APIs, and eventually putting the VM behind the Cloudflare proxy with a Cloudflare Origin Certificate on the origin, so public SSL terminates at the edge rather than on the VM directly.</p>
<p>If you're setting up something similar or have questions about any of these DNS patterns, reach out at <a href="mailto:chander@inguva.dev">chander@inguva.dev</a>.</p>
<hr />
<p><em>Built with Cloudflare · Apache Airflow · Okta · Hashnode · Atlassian · GCP</em></p>
]]></content:encoded></item><item><title><![CDATA[How I automated employee onboarding and offboarding with Okta, Jira, and Airflow]]></title><description><![CDATA[Building a real enterprise identity automation pipeline for $10/month


The problem
Every IT and IAM team faces the same painful reality: onboarding a new employee means manually creating accounts acr]]></description><link>https://blog.inguva.dev/how-i-automated-employee-onboarding-and-offboarding-with-okta-jira-and-airflow</link><guid isPermaLink="true">https://blog.inguva.dev/how-i-automated-employee-onboarding-and-offboarding-with-okta-jira-and-airflow</guid><category><![CDATA[okta]]></category><category><![CDATA[airflow]]></category><category><![CDATA[slack]]></category><category><![CDATA[JIRA]]></category><category><![CDATA[IAM]]></category><category><![CDATA[automation]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Sun, 15 Mar 2026 22:38:07 GMT</pubDate><content:encoded><![CDATA[<blockquote>
<p>Building a real enterprise identity automation pipeline for $10/month</p>
</blockquote>
<hr />
<h2>The problem</h2>
<p>Every IT and IAM team faces the same painful reality: onboarding a new employee means manually creating accounts across a dozen systems. Offboarding is even worse — miss one system and you've got a security gap. I wanted to automate this entire lifecycle using the tools I work with every day: Okta, Jira, Apache Airflow, and Slack.</p>
<p>The goal was simple: when a user is provisioned in Okta, everything else should happen automatically. When they leave, everything should be revoked — with a full audit trail in Jira.</p>
<hr />
<h2>The stack</h2>
<table>
<thead>
<tr>
<th>Component</th>
<th>Tool</th>
<th>Cost</th>
</tr>
</thead>
<tbody><tr>
<td>Identity Provider</td>
<td>Okta Integrator Free Plan</td>
<td>Free</td>
</tr>
<tr>
<td>Workflow Orchestration</td>
<td>Apache Airflow 2.9 (self-hosted)</td>
<td>~$10/mo GCP</td>
</tr>
<tr>
<td>Ticketing</td>
<td>Jira Software (Automation Hub)</td>
<td>Free</td>
</tr>
<tr>
<td>Notifications</td>
<td>Slack Webhooks</td>
<td>Free</td>
</tr>
<tr>
<td>Domain</td>
<td><a href="http://inguva.dev">inguva.dev</a></td>
<td>$10/yr</td>
</tr>
</tbody></table>
<hr />
<h2>Architecture</h2>
<p>Here's how the full pipeline works:</p>
<pre><code class="language-plaintext">New hire added in Okta
    ↓
Okta SCIM → Airflow user provisioned
    ↓
Airflow DAG triggered (okta_onboarding)
    ↓
Jira ticket created (AUTO-XX) with full checklist
    ↓
Slack alert sent to IT team with ticket link
</code></pre>
<p>For offboarding it's the reverse:</p>
<pre><code class="language-plaintext">Employee departure confirmed
    ↓
Airflow DAG triggered (okta_offboarding)
    ↓
Jira ticket created with revocation checklist
    ↓
Slack alert with orange warning to IT team
    ↓
Okta account deactivated via SCIM
</code></pre>
<hr />
<h2>Building the onboarding DAG</h2>
<p>The onboarding DAG accepts a JSON config with user details and does three things: creates a Jira ticket with a full onboarding checklist, sends a Slack notification with all the details, and logs everything for audit purposes.</p>
<p>The Jira ticket includes a structured checklist covering every system that needs provisioning:</p>
<ul>
<li><p>Okta account created</p>
</li>
<li><p>Jira/Confluence access granted</p>
</li>
<li><p>Airflow access granted</p>
</li>
<li><p>Slack workspace invited</p>
</li>
<li><p>Laptop provisioned</p>
</li>
<li><p>Equipment shipped</p>
</li>
<li><p>Day 1 schedule sent</p>
</li>
</ul>
<p>Triggering it is as simple as running:</p>
<pre><code class="language-bash">airflow dags trigger okta_onboarding \
  --conf '{
    "username": "john.doe@inguva.dev",
    "full_name": "John Doe",
    "department": "Engineering",
    "start_date": "2026-03-16"
  }'
</code></pre>
<p>Within seconds, a Jira ticket (<code>AUTO-3</code>) is created and the IT team gets a Slack message with all the details and a direct link to the ticket.</p>
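<p>For illustration, here's roughly how the DAG's <code>--conf</code> values could be assembled into the ticket. This is a sketch assuming the Jira REST v2 issue-create shape and a hypothetical <code>AUTO</code> project key; the real DAG differs in detail:</p>
<pre><code class="language-python"># Build the Jira onboarding issue payload from the DAG conf.
ONBOARDING_CHECKLIST = [
    "Okta account created",
    "Jira/Confluence access granted",
    "Airflow access granted",
    "Slack workspace invited",
    "Laptop provisioned",
    "Equipment shipped",
    "Day 1 schedule sent",
]

def build_onboarding_issue(full_name, username, department, start_date):
    checklist = "\n".join("[ ] " + item for item in ONBOARDING_CHECKLIST)
    return {
        "fields": {
            "project": {"key": "AUTO"},      # hypothetical project key
            "issuetype": {"name": "Task"},
            "summary": "Onboarding: {} ({})".format(full_name, department),
            "description": "Username: {}\nStart date: {}\n\n{}".format(
                username, start_date, checklist),
        }
    }
</code></pre>
<p>POSTing a payload like this to Jira's issue-create endpoint is the core of the task; everything else is logging and error handling.</p>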
<hr />
<h2>Building the offboarding DAG</h2>
<p>Offboarding is where security really matters. A missed deprovisioning step means a former employee could still have access to sensitive systems. The offboarding DAG creates a comprehensive revocation checklist in Jira:</p>
<ul>
<li><p>Okta account deactivated</p>
</li>
<li><p>Jira/Confluence access revoked</p>
</li>
<li><p>Airflow access revoked</p>
</li>
<li><p>Slack deactivated</p>
</li>
<li><p>Laptop return scheduled</p>
</li>
<li><p>Data backup completed</p>
</li>
<li><p>Exit interview scheduled</p>
</li>
<li><p>Final paycheck processed</p>
</li>
</ul>
<p>The Slack notification uses an orange color to signal urgency — the IT team knows immediately that action is required.</p>
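<p>A sketch of what that alert could look like, assuming Slack's legacy attachment <code>color</code> field and a hypothetical webhook URL — not the exact payload the DAG sends:</p>
<pre><code class="language-python"># Offboarding alert: orange attachment signals action required.
import json
import urllib.request

def build_offboarding_alert(username: str, ticket_url: str) -> dict:
    return {
        "attachments": [{
            "color": "#FFA500",  # orange: revocation needs immediate attention
            "title": "Offboarding started: " + username,
            "title_link": ticket_url,
            "text": "Revocation checklist created in Jira. Work the ticket today.",
        }]
    }

def post_to_slack(webhook_url: str, payload: dict) -> None:
    """Fire the incoming-webhook request."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
</code></pre>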
<hr />
<h2>The SCIM bridge</h2>
<p>What makes this particularly interesting is the custom SCIM bridge I built. Okta's SCIM 2.0 provisioning protocol sends HTTP requests to provision users, but Airflow has no native SCIM server. I wrote a lightweight Flask app that:</p>
<ol>
<li><p>Receives SCIM requests from Okta</p>
</li>
<li><p>Translates them into Airflow REST API calls</p>
</li>
<li><p>Creates or deactivates users in Airflow automatically</p>
</li>
</ol>
<p>When you assign someone to the Airflow app in Okta, they appear in Airflow within seconds — no manual steps required.</p>
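<p>The heart of the bridge is a small translation function. Here's a sketch assuming a standard SCIM 2.0 user payload coming in and Airflow's stable REST API user shape going out; the default role and random password are my assumptions (the password is never used once SSO is on):</p>
<pre><code class="language-python"># Translate an Okta SCIM 2.0 user into an Airflow REST API user body.
import secrets

def scim_to_airflow_user(scim: dict) -> dict:
    return {
        "username": scim["userName"],
        "email": scim["emails"][0]["value"],
        "first_name": scim["name"]["givenName"],
        "last_name": scim["name"]["familyName"],
        "roles": [{"name": "Viewer"}],          # assumed default role
        "password": secrets.token_urlsafe(24),  # placeholder, SSO handles login
    }
</code></pre>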
<hr />
<h2>What I learned</h2>
<p>The most valuable insight was understanding the difference between authentication and provisioning. SSO handles authentication (can this person log in?) while SCIM handles provisioning (does this person have an account?). Most teams get SSO right but forget about automated provisioning, which means users get created manually and — critically — often don't get cleaned up when they leave.</p>
<p>Building the SCIM bridge forced me to read the actual SCIM 2.0 RFC and understand exactly what Okta sends over the wire. That knowledge transfers to any identity system, not just Airflow.</p>
<p>The other big takeaway: Airflow is an incredibly powerful orchestration engine for IT automation, not just data pipelines. The DAG model — where you define tasks, dependencies, and failure handling — maps perfectly to onboarding and offboarding workflows.</p>
<hr />
<h2>The outcome</h2>
<ul>
<li><p>Onboarding time reduced from manual multi-step process to one triggered DAG</p>
</li>
<li><p>Full audit trail in Jira for every onboarding and offboarding event</p>
</li>
<li><p>IT team gets instant Slack notification with all details</p>
</li>
<li><p>Zero missed deprovisioning steps — the checklist is always generated</p>
</li>
<li><p>Entire stack runs for ~$10/month on a GCP e2-medium VM</p>
</li>
</ul>
<hr />
<h2>What's next</h2>
<p>The natural next step is triggering these DAGs automatically from Okta webhooks — so the moment a user is activated or deactivated in Okta, the DAG fires without any manual trigger. I'm also planning to add an access review DAG that periodically checks for dormant accounts and flags them for review.</p>
<p>If you're building identity automation or want to talk IAM engineering, reach out at <a href="mailto:chander@inguva.dev">chander@inguva.dev</a>.</p>
<hr />
<p><em>Built with Apache Airflow · Okta · Jira · Flask · GCP · Slack ·</em> <a href="http://inguva.dev"><em>inguva.dev</em></a></p>
]]></content:encoded></item><item><title><![CDATA[From Zero to Production: Building an Identity + Automation Stack for $10/mo]]></title><description><![CDATA[Senior Systems Engineer · IAM · IT Automation · Platform Engineering

Tags: IAM Engineering, IT Automation, Okta, Apache Airflow, GCP, Platform Engineering

The problem I was solving
As a Senior Syste]]></description><link>https://blog.inguva.dev/from-zero-to-production-building-an-identity-automation-stack-for-10-mo</link><guid isPermaLink="true">https://blog.inguva.dev/from-zero-to-production-building-an-identity-automation-stack-for-10-mo</guid><category><![CDATA[okta]]></category><category><![CDATA[airflow]]></category><category><![CDATA[IAM]]></category><category><![CDATA[automation]]></category><category><![CDATA[GCP]]></category><dc:creator><![CDATA[Inguva Dev]]></dc:creator><pubDate>Sun, 15 Mar 2026 21:02:04 GMT</pubDate><content:encoded><![CDATA[<blockquote>
<p>Senior Systems Engineer · IAM · IT Automation · Platform Engineering</p>
</blockquote>
<p><strong>Tags:</strong> IAM Engineering, IT Automation, Okta, Apache Airflow, GCP, Platform Engineering</p>
<hr />
<h2>The problem I was solving</h2>
<p>As a Senior Systems Engineer with a decade of experience across IAM, identity governance, and IT automation, I kept running into the same challenge: it's hard to demonstrate hands-on platform skills without a live environment. Reading docs is one thing. Actually wiring Okta SCIM to a custom Flask endpoint that provisions users into Airflow in real time is something else entirely.</p>
<p>I also wanted a personal automation backbone — something I could use to run scheduled jobs, get Slack alerts, and build new workflows without spinning up a paid SaaS tool every time.</p>
<blockquote>
<p>The goal: production-grade identity + automation stack. Constraint: keep it under $15/month, build it myself, own every layer.</p>
</blockquote>
<hr />
<h2>The stack</h2>
<table>
<thead>
<tr>
<th>Component</th>
<th>Tool</th>
<th>Cost</th>
</tr>
</thead>
<tbody><tr>
<td>Compute</td>
<td>GCP e2-medium</td>
<td>~$8/mo with schedule</td>
</tr>
<tr>
<td>Orchestration</td>
<td>Apache Airflow 2.9</td>
<td>Free (self-hosted)</td>
</tr>
<tr>
<td>Identity</td>
<td>Okta Integrator Free Plan</td>
<td>Free</td>
</tr>
<tr>
<td>Domain</td>
<td><a href="http://inguva.dev">inguva.dev</a> (Cloudflare)</td>
<td>$10/yr</td>
</tr>
<tr>
<td>SSL</td>
<td>Let's Encrypt</td>
<td>Free</td>
</tr>
<tr>
<td>Notifications</td>
<td>Slack Webhooks</td>
<td>Free</td>
</tr>
<tr>
<td>SCIM Bridge</td>
<td>Custom Flask app</td>
<td>Runs on same VM</td>
</tr>
<tr>
<td>Job alerts</td>
<td>GitHub Actions</td>
<td>Free (public repo)</td>
</tr>
</tbody></table>
<hr />
<h2>How it came together</h2>
<h3>Step 1 — VM + domain</h3>
<p>Spun up a GCP e2-medium (Ubuntu 22.04), reserved a static IP, bought <a href="http://inguva.dev">inguva.dev</a> on Cloudflare, and set up email routing to forward <code>@inguva.dev</code> addresses to Gmail — all free except the $10/yr domain.</p>
<h3>Step 2 — Airflow in standalone mode</h3>
<p>Installed Airflow 2.9.2 into a Python venv, ran it in standalone mode (single process, SQLite backend), managed by Supervisor for auto-restart. No Docker overhead — uses ~600MB RAM comfortably on the e2-medium.</p>
<h3>Step 3 — HTTPS with Nginx + Let's Encrypt</h3>
<p>Set up Nginx as a reverse proxy, got a free SSL cert via Certbot, and configured auto-renewal. Airflow is now live at <a href="https://airflow.inguva.dev">https://airflow.inguva.dev</a> with a valid cert.</p>
<h3>Step 4 — Okta SSO via OIDC</h3>
<p>Created an Okta OIDC app integration, configured Airflow's <code>webserver_config.py</code> with <code>AUTH_OAUTH</code>, and wired up the OAuth endpoints. Users can now click "Sign in with Okta" — no username/password needed.</p>
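<p>For reference, the shape of that config looks roughly like this. It's a hedged sketch rather than the exact file: the client ID, secret, and tenant URL are placeholders, and the default registration role is an assumption.</p>
<pre><code class="language-python"># webserver_config.py sketch for Airflow 2.x (Flask AppBuilder OAuth).
from flask_appbuilder.security.manager import AUTH_OAUTH

AUTH_TYPE = AUTH_OAUTH
AUTH_USER_REGISTRATION = True            # create users on first login
AUTH_USER_REGISTRATION_ROLE = "Viewer"   # assumed default role

OAUTH_PROVIDERS = [{
    "name": "okta",
    "icon": "fa-circle-o",
    "token_key": "access_token",
    "remote_app": {
        "client_id": "OKTA_CLIENT_ID",        # placeholder
        "client_secret": "OKTA_CLIENT_SECRET",  # placeholder
        "api_base_url": "https://YOUR-TENANT.okta.com/oauth2/v1/",
        "access_token_url": "https://YOUR-TENANT.okta.com/oauth2/v1/token",
        "authorize_url": "https://YOUR-TENANT.okta.com/oauth2/v1/authorize",
        "client_kwargs": {"scope": "openid profile email"},
    },
}]
</code></pre>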
<h3>Step 5 — Custom SCIM provisioning bridge</h3>
<p>This was the most interesting part. Okta's SCIM 2.0 protocol sends HTTP requests to provision users — but Airflow has no native SCIM server. I wrote a lightweight Flask app that translates Okta SCIM calls into Airflow REST API calls, handling user create, update, and deactivation. When you assign someone to the Airflow app in Okta, they appear in Airflow within seconds.</p>
<h3>Step 6 — Slack DAG for daily reports + failure alerts</h3>
<p>Built a Python DAG that sends a daily Airflow health report to Slack every morning — VM CPU, memory, disk, uptime, and run status. Added on_failure_callback so any DAG failure triggers an instant Slack alert with a direct link to the logs.</p>
<h3>Step 7 — LinkedIn job alerts via GitHub Actions</h3>
<p>Wrote a Python scraper that checks LinkedIn every 3 hours for new Senior IAM / IT Automation / Atlassian Engineer roles, deduplicates against a JSON file committed to the repo, and posts new listings to Slack. Runs free on GitHub Actions.</p>
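<p>The dedup logic is the piece that keeps the Slack channel quiet. A sketch, assuming each listing carries a stable <code>id</code> and that seen IDs live in a JSON file committed back to the repo (the filename is hypothetical):</p>
<pre><code class="language-python"># Deduplicate job listings against a JSON file of already-posted IDs.
import json
from pathlib import Path

SEEN_FILE = Path("seen_jobs.json")  # hypothetical filename

def load_seen(path: Path = SEEN_FILE) -> set:
    """Read previously posted job IDs, or start empty on first run."""
    return set(json.loads(path.read_text())) if path.exists() else set()

def new_jobs(listings: list, seen: set) -> list:
    """Keep only listings that haven't been posted to Slack yet."""
    return [job for job in listings if job["id"] not in seen]

def save_seen(seen: set, path: Path = SEEN_FILE) -> None:
    """Persist the updated ID set so the next run can dedupe against it."""
    path.write_text(json.dumps(sorted(seen)))
</code></pre>
<p>Each run is then: load, filter, post the new listings, add their IDs to the set, save, commit.</p>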
<hr />
<h2>What I learned</h2>
<p>The SCIM bridge was the most valuable piece to build. Every enterprise IAM environment has some version of this problem: you have an identity provider and a target application that speaks a slightly different dialect. The real skill is knowing how to read the SCIM spec, intercept the protocol, and adapt it to whatever API the downstream system exposes.</p>
<p>I also deepened my appreciation for how much managed services abstract away. Running Airflow on a raw VM means you own the process management, SSL renewal, log rotation, and restart behavior. Supervisor, Nginx, and Certbot are unglamorous but critical — and knowing how they fit together makes you a much stronger platform engineer.</p>
<blockquote>
<p>The biggest unlock: once you understand what Okta SCIM actually sends over the wire, you can provision users into almost anything — not just apps with native SCIM support.</p>
</blockquote>
<hr />
<h2>The outcomes</h2>
<ul>
<li><p><strong>$10/mo</strong> — total infrastructure cost</p>
</li>
<li><p><strong>&lt;1 second</strong> — Okta → Airflow user provisioning time</p>
</li>
<li><p><strong>Every 3 hours</strong> — LinkedIn job alert cadence</p>
</li>
<li><p><strong>0</strong> — third-party SaaS tools needed</p>
</li>
</ul>
<hr />
<h2>What's next</h2>
<p>A few things I'm planning to add: swapping SQLite for PostgreSQL to make Airflow production-ready, setting up a GitHub Actions pipeline to auto-deploy DAGs on push, and building an Okta user activity digest DAG that pulls from the Okta System Log API and posts a weekly access report to Slack.</p>
<p>If you're in IAM, IT automation, or platform engineering and want to talk about any of this — reach out at <a href="mailto:chander@inguva.dev">chander@inguva.dev</a>.</p>
<hr />
<p><em>Built with Apache Airflow · Okta · GCP · Flask · Nginx · Let's Encrypt · GitHub Actions · Slack · Cloudflare</em></p>
]]></content:encoded></item></channel></rss>