Why Ansible include_tasks Variables Are Not Defined in the Parent Play

You define a variable inside an include_tasks file. The task runs successfully. Then the very next task in your parent play fails with "variable is undefined". What just happened? This is one of the most confusing Ansible behaviors, and it bites people repeatedly because it looks like a bug. It’s not — it’s a scoping decision — but understanding why it works this way helps you fix it fast. ...
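A minimal sketch of the usual fix, under the assumption the article lands in the same place: variables created with `set_fact` inside the included file persist for the host, while variables passed via `vars:` on the include are scoped to the included tasks. File and variable names here are illustrative, not from the article:

```yaml
# parent.yml
- hosts: all
  tasks:
    - name: Run included tasks
      include_tasks: inner.yml

    - name: Works only if inner.yml used set_fact
      debug:
        msg: "{{ shared_var }}"

# inner.yml (shown inline for brevity)
# - name: Persist a variable beyond the include
#   set_fact:
#     shared_var: "visible to later tasks on this host"
```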

April 15, 2026 · 5 min · 853 words · Rob Washington

Why Ansible register Variable Is Undefined in the Next Task

You registered a variable in one task, and the next task fails with "msg": "The task includes an option with an undefined variable". The variable is right there — you can see it in the task above. What’s going on? This is one of the most common Ansible debugging rabbit holes. There are four distinct causes, and they look identical until you know what to look for. ...
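Whatever the root cause turns out to be, one defensive pattern worth knowing is guarding the consumer with Jinja2's `default` filter. A minimal sketch (the task names and the condition variable are illustrative, not from the article):

```yaml
- name: Capture command output (may be skipped)
  command: /bin/true
  register: result
  when: some_condition | default(false)

- name: Consume it defensively - a skipped task registers no stdout
  debug:
    msg: "{{ result.stdout | default('task was skipped, no output') }}"
```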

April 14, 2026 · 5 min · 968 words · Rob Washington

How to Troubleshoot Ansible Playbook Failures When Syncing Databases with PAM Integration

Automating database synchronization in environments with Privileged Access Management (PAM) systems like CyberArk can be a powerful way to streamline operations. However, when things go wrong, Ansible playbook failures can be frustrating and time-consuming to debug. Common issues include credential retrieval errors, connection timeouts, permission problems, and data inconsistency. In this post, we’ll dive into practical steps to troubleshoot these failures, complete with code examples and best practices. This guide is based on real-world scenarios I’ve encountered while setting up PAM-integrated DB sync roles. ...

April 13, 2026 · 5 min · 877 words · Rob Washington

Fix: Permission Denied /var/run/docker.sock

You run a Docker command and get this:

```
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json": dial unix /var/run/docker.sock: connect: permission denied
```

This happens because Docker runs as root by default, and your user doesn’t have permission to access the Docker socket. Here’s how to fix it properly. ...
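A quick way to check whether the usual fix is already in place on your machine — membership in the socket's group, conventionally named `docker` (an assumption; some distros differ) — is a sketch like this:

```shell
#!/bin/sh
# Check whether the current user is in the "docker" group, which is the
# conventional group that owns /var/run/docker.sock. Group name is an assumption.
if id -nG | grep -qw docker; then
    echo "already in docker group"
else
    echo "not in docker group; a common fix is: sudo usermod -aG docker \$USER"
fi
```

Note that after `usermod -aG`, the new group only takes effect in a fresh login session (or via `newgrp docker`).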

March 30, 2026 · 4 min · 831 words · Rob Washington

Why Your systemd Service Fails Silently (And How to Actually Debug It)

Your systemd service is “active (running)” but the application isn’t responding. No errors in systemctl status. The journal shows it started. Everything looks fine. Except it isn’t. This is one of the most frustrating debugging scenarios in Linux administration. Here’s how to actually figure out what’s wrong.

The Problem: Green Status, Dead Application

```
$ systemctl status myapp
● myapp.service - My Application
     Loaded: loaded (/etc/systemd/system/myapp.service; enabled)
     Active: active (running) since Fri 2026-03-20 10:00:00 EDT; 2h ago
   Main PID: 12345 (myapp)
```

Looks healthy. But curl localhost:8080 times out. What’s happening? ...
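Before digging into the journal, it helps to confirm whether anything is actually listening on the port. A minimal, Linux-only sketch — port 8080 comes from the article's example, but reading the kernel socket tables directly is my assumption, not necessarily the post's method:

```shell
#!/bin/sh
# Succeed if some process has a TCP listener on the given port.
# Linux-only: /proc/net/tcp* lists sockets with the local port as 4-digit
# uppercase hex in column 2 and state 0A (LISTEN) in column 4.
check_listening() {
    port_hex=$(printf '%04X' "$1")
    cat /proc/net/tcp /proc/net/tcp6 2>/dev/null |
        awk -v p=":${port_hex}$" '$2 ~ p && $4 == "0A" {found=1} END {exit !found}'
}

if check_listening 8080; then
    echo "something is listening on 8080"
else
    echo "nothing listening on 8080 - the unit may be up but the app is not"
fi
```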

March 20, 2026 · 5 min · 869 words · Rob Washington

Docker Secrets Management: From Development to Production

Secrets management in Docker is where most teams get bitten. Environment variables leak into logs, credentials end up in images, and “it works on my machine” becomes a security incident. Here’s how to handle secrets properly at every stage.

The Problem with Environment Variables

The most common approach—and the most dangerous:

```yaml
# docker-compose.yml - DON'T DO THIS
services:
  app:
    environment:
      - DATABASE_PASSWORD=super_secret_password
      - API_KEY=sk-live-1234567890
```

Why this fails: ...
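One common alternative is file-based secrets in Compose, which surface inside the container as files under /run/secrets/ rather than in `docker inspect` output. A minimal sketch — the service name, image, and file path are illustrative, not from the article:

```yaml
# docker-compose.yml - file-based secret instead of an environment variable
services:
  app:
    image: myapp:latest
    secrets:
      - db_password   # readable in the container at /run/secrets/db_password

secrets:
  db_password:
    file: ./secrets/db_password.txt   # keep this path out of version control
```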

March 7, 2026 · 5 min · 944 words · Rob Washington

Git Hooks for Automation: Enforce Quality Before It Hits the Repo

Git hooks run scripts at key points in your workflow. Use them to catch problems before they become pull requests.

Hook Basics

Hooks live in .git/hooks/. Each hook is an executable script named after the event.

```bash
# List available hooks
ls .git/hooks/*.sample

# Make a hook active
cp .git/hooks/pre-commit.sample .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
```

Common Hooks

| Hook | When | Use Case |
|------|------|----------|
| pre-commit | Before commit created | Lint, format, test |
| prepare-commit-msg | After default message | Add ticket numbers |
| commit-msg | After message entered | Validate format |
| pre-push | Before push | Run full tests |
| post-merge | After merge | Install dependencies |
| post-checkout | After checkout | Environment setup |

Pre-commit Hook

Basic Linting

```bash
#!/bin/bash
# .git/hooks/pre-commit
echo "Running pre-commit checks..."

# Get staged files
STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACM)

# Python files
PYTHON_FILES=$(echo "$STAGED_FILES" | grep '\.py$')
if [ -n "$PYTHON_FILES" ]; then
  echo "Linting Python files..."
  ruff check $PYTHON_FILES || exit 1
  black --check $PYTHON_FILES || exit 1
fi

# JavaScript files
JS_FILES=$(echo "$STAGED_FILES" | grep -E '\.(js|ts)$')
if [ -n "$JS_FILES" ]; then
  echo "Linting JavaScript files..."
  eslint $JS_FILES || exit 1
fi

echo "All checks passed!"
```

Prevent Debug Statements

```bash
#!/bin/bash
# .git/hooks/pre-commit

# Check for debug statements
if git diff --cached | grep -E '(console\.log|debugger|import pdb|breakpoint\(\))'; then
  echo "ERROR: Debug statements found!"
  echo "Remove console.log, debugger, pdb, or breakpoint() before committing."
  exit 1
fi
```

Check for Secrets

```bash
#!/bin/bash
# .git/hooks/pre-commit

# Patterns that might be secrets
PATTERNS=(
  'password\s*=\s*["\047][^"\047]+'
  'api_key\s*=\s*["\047][^"\047]+'
  'secret\s*=\s*["\047][^"\047]+'
  'AWS_SECRET_ACCESS_KEY'
  'PRIVATE KEY'
)

for pattern in "${PATTERNS[@]}"; do
  if git diff --cached | grep -iE "$pattern"; then
    echo "ERROR: Possible secret detected!"
    echo "Pattern: $pattern"
    exit 1
  fi
done
```

Commit Message Hook

Enforce Conventional Commits

```bash
#!/bin/bash
# .git/hooks/commit-msg
COMMIT_MSG_FILE=$1
COMMIT_MSG=$(cat "$COMMIT_MSG_FILE")

# Conventional commit pattern
PATTERN="^(feat|fix|docs|style|refactor|test|chore)(\(.+\))?: .{1,50}"

if ! echo "$COMMIT_MSG" | grep -qE "$PATTERN"; then
  echo "ERROR: Invalid commit message format!"
  echo ""
  echo "Expected: <type>(<scope>): <description>"
  echo "Types: feat, fix, docs, style, refactor, test, chore"
  echo ""
  echo "Examples:"
  echo "  feat(auth): add OAuth2 support"
  echo "  fix: resolve memory leak in worker"
  echo ""
  exit 1
fi
```

Add Ticket Number

```bash
#!/bin/bash
# .git/hooks/prepare-commit-msg
COMMIT_MSG_FILE=$1
BRANCH_NAME=$(git symbolic-ref --short HEAD)

# Extract ticket number from branch (e.g., feature/JIRA-123-description)
TICKET=$(echo "$BRANCH_NAME" | grep -oE '[A-Z]+-[0-9]+')

if [ -n "$TICKET" ]; then
  # Prepend ticket number if not already present
  if ! grep -q "$TICKET" "$COMMIT_MSG_FILE"; then
    sed -i.bak "1s/^/[$TICKET] /" "$COMMIT_MSG_FILE"
  fi
fi
```

Pre-push Hook

Run Tests Before Push

```bash
#!/bin/bash
# .git/hooks/pre-push
echo "Running tests before push..."

# Run test suite
npm test
if [ $? -ne 0 ]; then
  echo "ERROR: Tests failed! Push aborted."
  exit 1
fi

# Check coverage threshold
npm run coverage
COVERAGE=$(cat coverage/coverage-summary.json | jq '.total.lines.pct')
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
  echo "ERROR: Coverage below 80%! Current: $COVERAGE%"
  exit 1
fi

echo "All checks passed. Pushing..."
```

Prevent Force Push to Main

```bash
#!/bin/bash
# .git/hooks/pre-push
PROTECTED_BRANCHES="main master"
CURRENT_BRANCH=$(git symbolic-ref --short HEAD)

# Check if force pushing
while read local_ref local_sha remote_ref remote_sha; do
  if echo "$PROTECTED_BRANCHES" | grep -qw "$CURRENT_BRANCH"; then
    if [ "$remote_sha" != "0000000000000000000000000000000000000000" ]; then
      # Check if this is a force push
      if ! git merge-base --is-ancestor "$remote_sha" "$local_sha" 2>/dev/null; then
        echo "ERROR: Force push to $CURRENT_BRANCH is not allowed!"
        exit 1
      fi
    fi
  fi
done
```

Post-merge Hook

Auto-install Dependencies

```bash
#!/bin/bash
# .git/hooks/post-merge
CHANGED_FILES=$(git diff-tree -r --name-only ORIG_HEAD HEAD)

# Check if package.json changed
if echo "$CHANGED_FILES" | grep -q "package.json"; then
  echo "package.json changed, running npm install..."
  npm install
fi

# Check if requirements.txt changed
if echo "$CHANGED_FILES" | grep -q "requirements.txt"; then
  echo "requirements.txt changed, running pip install..."
  pip install -r requirements.txt
fi

# Check if migrations added
if echo "$CHANGED_FILES" | grep -q "migrations/"; then
  echo "New migrations detected!"
  echo "Run: python manage.py migrate"
fi
```

Using pre-commit Framework

The pre-commit framework manages hooks declaratively. ...
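The excerpt cuts off as it introduces the framework; for context, a declarative config lives in `.pre-commit-config.yaml` at the repo root. A minimal sketch — the hook repo and rev shown are common choices, not necessarily the article's:

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: check-yaml
      - id: detect-private-key
# Activate with: pre-commit install
```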

February 25, 2026 · 6 min · 1136 words · Rob Washington

YAML Gotchas: The Traps That Bite Every Developer

YAML looks simple until it isn’t. These gotchas have broken production configs and wasted countless debugging hours. Learn them once, avoid them forever.

The Norway Problem

```yaml
# These are ALL booleans in YAML 1.1
country: NO      # false
answer: yes      # true
enabled: on      # true
disabled: off    # false
```

Fix: Always quote strings that could be interpreted as booleans.

```yaml
country: "NO"
answer: "yes"
```

YAML 1.2 fixed this, but many parsers (including PyYAML by default) still use 1.1 rules. ...

February 25, 2026 · 5 min · 1032 words · Rob Washington

Kubernetes Debugging: From Pod Failures to Cluster Issues

Kubernetes abstracts away infrastructure until something breaks. Then you need to peel back the layers. These debugging patterns will help you find problems fast.

First Steps: Get the Lay of the Land

```bash
# Cluster health
kubectl cluster-info
kubectl get nodes
kubectl top nodes

# Namespace overview
kubectl get all -n myapp

# Events (recent issues surface here)
kubectl get events -n myapp --sort-by='.lastTimestamp'
```

Pod Debugging

Check Pod Status

```bash
# List pods with status
kubectl get pods -n myapp

# Detailed pod info
kubectl describe pod mypod -n myapp

# Common status meanings:
# Pending - Waiting for scheduling or image pull
# Running - At least one container running
# Succeeded - All containers completed successfully
# Failed - All containers terminated, at least one failed
# CrashLoopBackOff - Container crashing repeatedly
# ImagePullBackOff - Can't pull container image
```

View Logs

```bash
# Current logs
kubectl logs mypod -n myapp

# Previous container (after crash)
kubectl logs mypod -n myapp --previous

# Follow logs
kubectl logs -f mypod -n myapp

# Specific container (multi-container pod)
kubectl logs mypod -n myapp -c mycontainer

# Last N lines
kubectl logs mypod -n myapp --tail=100

# Since timestamp
kubectl logs mypod -n myapp --since=1h
```

Execute Commands in Pod

```bash
# Shell into running container
kubectl exec -it mypod -n myapp -- /bin/sh

# Run specific command
kubectl exec mypod -n myapp -- cat /etc/config/app.yaml

# Specific container
kubectl exec -it mypod -n myapp -c mycontainer -- /bin/sh
```

Debug Crashed Containers

```bash
# Check why it crashed
kubectl describe pod mypod -n myapp | grep -A 10 "Last State"

# View previous logs
kubectl logs mypod -n myapp --previous

# Run debug container (K8s 1.25+)
kubectl debug mypod -n myapp -it --image=busybox --target=mycontainer
```

Common Pod Issues

ImagePullBackOff

```bash
# Check events for details
kubectl describe pod mypod -n myapp | grep -A 5 Events

# Common causes:
# - Wrong image name/tag
# - Private registry without imagePullSecrets
# - Registry rate limiting (Docker Hub)

# Verify image exists
docker pull myimage:tag

# Check imagePullSecrets
kubectl get pod mypod -n myapp -o jsonpath='{.spec.imagePullSecrets}'
```

CrashLoopBackOff

```bash
# Get exit code
kubectl describe pod mypod -n myapp | grep "Exit Code"

# Exit codes:
# 0 - Success (shouldn't be crashing)
# 1 - Application error
# 137 - OOMKilled (out of memory)
# 139 - Segmentation fault
# 143 - SIGTERM (graceful shutdown)

# Check resource limits
kubectl describe pod mypod -n myapp | grep -A 5 Limits
```

Pending Pods

```bash
# Check why not scheduled
kubectl describe pod mypod -n myapp | grep -A 10 Events

# Common causes:
# - Insufficient resources
# - Node selector/affinity not matched
# - Taints without tolerations
# - PVC not bound

# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
```

Resource Issues

Memory Problems

```bash
# Check pod resource usage
kubectl top pod mypod -n myapp

# Check for OOMKilled
kubectl describe pod mypod -n myapp | grep OOMKilled

# View memory limits
kubectl get pod mypod -n myapp -o jsonpath='{.spec.containers[*].resources}'
```

CPU Throttling

```bash
# Check CPU usage vs limits
kubectl top pod mypod -n myapp

# In container, check throttling
kubectl exec mypod -n myapp -- cat /sys/fs/cgroup/cpu/cpu.stat
```

Networking Debugging

Service Connectivity

```bash
# Check service exists
kubectl get svc -n myapp

# Check endpoints (are pods backing the service?)
kubectl get endpoints myservice -n myapp

# Test from within cluster
kubectl run debug --rm -it --image=busybox -- /bin/sh
# Then: wget -qO- http://myservice.myapp.svc.cluster.local

# DNS resolution
kubectl run debug --rm -it --image=busybox -- nslookup myservice.myapp.svc.cluster.local
```

Pod-to-Pod Communication

```bash
# Get pod IPs
kubectl get pods -n myapp -o wide

# Test connectivity from one pod to another
kubectl exec mypod1 -n myapp -- wget -qO- http://10.0.0.5:8080

# Check network policies
kubectl get networkpolicies -n myapp
```

Ingress Issues

```bash
# Check ingress configuration
kubectl describe ingress myingress -n myapp

# Check ingress controller logs
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx

# Verify backend service
kubectl get svc myservice -n myapp
```

ConfigMaps and Secrets

```bash
# Verify ConfigMap exists and has expected data
kubectl get configmap myconfig -n myapp -o yaml

# Check if mounted correctly
kubectl exec mypod -n myapp -- ls -la /etc/config/
kubectl exec mypod -n myapp -- cat /etc/config/app.yaml

# Verify Secret
kubectl get secret mysecret -n myapp -o jsonpath='{.data.password}' | base64 -d

# Check environment variables
kubectl exec mypod -n myapp -- env | grep MY_VAR
```

Persistent Volumes

```bash
# Check PVC status
kubectl get pvc -n myapp

# Describe for binding issues
kubectl describe pvc mypvc -n myapp

# Check PV
kubectl get pv

# Verify mount in pod
kubectl exec mypod -n myapp -- df -h
kubectl exec mypod -n myapp -- ls -la /data
```

Node Issues

```bash
# Node status
kubectl get nodes
kubectl describe node mynode

# Check conditions
kubectl get nodes -o custom-columns=NAME:.metadata.name,CONDITIONS:.status.conditions[*].type

# Node resource pressure
kubectl describe node mynode | grep -A 5 Conditions

# Pods on specific node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=mynode

# Drain node for maintenance
kubectl drain mynode --ignore-daemonsets --delete-emptydir-data
```

Control Plane Debugging

```bash
# API server health
kubectl get --raw='/healthz'

# Component status (deprecated but useful)
kubectl get componentstatuses

# Check system pods
kubectl get pods -n kube-system

# API server logs (if accessible)
kubectl logs -n kube-system kube-apiserver-master

# etcd health
kubectl exec -n kube-system etcd-master -- etcdctl endpoint health
```

Useful Debug Containers

```bash
# Network debugging
kubectl run netdebug --rm -it --image=nicolaka/netshoot -- /bin/bash

# DNS debugging
kubectl run dnsdebug --rm -it --image=tutum/dnsutils -- /bin/bash

# General debugging
kubectl run debug --rm -it --image=busybox -- /bin/sh
```

Systematic Debugging Checklist

1. Events first: kubectl get events --sort-by='.lastTimestamp'
2. Describe the resource: kubectl describe <resource> <name>
3. Check logs: kubectl logs <pod> (and --previous)
4. Verify dependencies: ConfigMaps, Secrets, Services, PVCs
5. Check resources: CPU, memory limits and usage
6. Test connectivity: DNS, service endpoints, network policies
7. Compare with working: Diff against known good configuration

Quick Reference

```bash
# Pod not starting
kubectl describe pod <name>
kubectl get events

# Pod crashing
kubectl logs <pod> --previous
kubectl describe pod <name> | grep "Exit Code"

# Can't connect to service
kubectl get endpoints <service>
kubectl run debug --rm -it --image=busybox -- wget -qO- http://<service>

# Resource issues
kubectl top pods
kubectl describe node | grep -A 5 "Allocated"

# Config issues
kubectl exec <pod> -- env
kubectl exec <pod> -- cat /path/to/config
```

Kubernetes debugging is methodical. Start with events, drill into describe output, check logs, and verify each dependency. Most issues are configuration mismatches—wrong image tags, missing secrets, insufficient resources. ...

February 25, 2026 · 7 min · 1300 words · Rob Washington

AWS CLI Essentials: Patterns for Daily Operations

The AWS CLI is the fastest path from question to answer. These patterns cover the operations you’ll use daily.

Setup and Configuration

```bash
# Configure default profile
aws configure

# Configure named profile
aws configure --profile production

# Use specific profile
aws --profile production ec2 describe-instances

# Or set environment variable
export AWS_PROFILE=production

# Verify identity
aws sts get-caller-identity
```

Multiple Accounts

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

[production]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

# ~/.aws/config
[default]
region = us-east-1
output = json

[profile production]
region = us-west-2
output = json
```

EC2 Operations

List Instances

```bash
# All instances
aws ec2 describe-instances

# Just the essentials
aws ec2 describe-instances \
  --query 'Reservations[].Instances[].[InstanceId,State.Name,InstanceType,PrivateIpAddress,Tags[?Key==`Name`].Value|[0]]' \
  --output table

# Running instances only
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].[InstanceId,PrivateIpAddress]' \
  --output text

# By tag
aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=production"

# By instance ID
aws ec2 describe-instances --instance-ids i-1234567890abcdef0
```

Start/Stop/Reboot

```bash
# Stop
aws ec2 stop-instances --instance-ids i-1234567890abcdef0

# Start
aws ec2 start-instances --instance-ids i-1234567890abcdef0

# Reboot
aws ec2 reboot-instances --instance-ids i-1234567890abcdef0

# Terminate (careful!)
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0
```

Get Console Output

```bash
aws ec2 get-console-output --instance-id i-1234567890abcdef0 --output text
```

SSH Key Pairs

```bash
# List
aws ec2 describe-key-pairs

# Create
aws ec2 create-key-pair --key-name mykey --query 'KeyMaterial' --output text > mykey.pem
chmod 400 mykey.pem

# Delete
aws ec2 delete-key-pair --key-name mykey
```

S3 Operations

List and Navigate

```bash
# List buckets
aws s3 ls

# List bucket contents
aws s3 ls s3://mybucket/

# Recursive listing
aws s3 ls s3://mybucket/ --recursive

# With human-readable sizes
aws s3 ls s3://mybucket/ --recursive --human-readable --summarize
```

Copy and Sync

```bash
# Upload file
aws s3 cp myfile.txt s3://mybucket/

# Download file
aws s3 cp s3://mybucket/myfile.txt ./

# Upload directory
aws s3 cp ./mydir s3://mybucket/mydir --recursive

# Sync (only changed files)
aws s3 sync ./local s3://mybucket/remote

# Sync with delete (mirror)
aws s3 sync ./local s3://mybucket/remote --delete

# Exclude patterns
aws s3 sync ./local s3://mybucket/remote --exclude "*.log" --exclude ".git/*"
```

Delete

```bash
# Single file
aws s3 rm s3://mybucket/myfile.txt

# Directory
aws s3 rm s3://mybucket/mydir/ --recursive

# Empty bucket
aws s3 rm s3://mybucket/ --recursive
```

Presigned URLs

```bash
# Generate download URL (expires in 1 hour)
aws s3 presign s3://mybucket/myfile.txt --expires-in 3600

# Upload URL
aws s3 presign s3://mybucket/upload.txt --expires-in 3600
```

Bucket Operations

```bash
# Create bucket
aws s3 mb s3://mynewbucket

# Delete bucket (must be empty)
aws s3 rb s3://mybucket

# Force delete (removes contents first)
aws s3 rb s3://mybucket --force
```

IAM Operations

Users

```bash
# List users
aws iam list-users

# Create user
aws iam create-user --user-name newuser

# Delete user
aws iam delete-user --user-name olduser

# List user's access keys
aws iam list-access-keys --user-name myuser
```

Roles

```bash
# List roles
aws iam list-roles

# Get role details
aws iam get-role --role-name MyRole

# List attached policies
aws iam list-attached-role-policies --role-name MyRole
```

Policies

```bash
# List policies
aws iam list-policies --scope Local

# Get policy document
aws iam get-policy-version \
  --policy-arn arn:aws:iam::123456789012:policy/MyPolicy \
  --version-id v1
```

CloudWatch Logs

```bash
# List log groups
aws logs describe-log-groups

# List log streams
aws logs describe-log-streams --log-group-name /aws/lambda/myfunction

# Get recent logs
aws logs get-log-events \
  --log-group-name /aws/lambda/myfunction \
  --log-stream-name '2024/01/01/[$LATEST]abc123' \
  --limit 100

# Tail logs (requires aws-cli v2)
aws logs tail /aws/lambda/myfunction --follow

# Filter logs
aws logs filter-log-events \
  --log-group-name /aws/lambda/myfunction \
  --filter-pattern "ERROR" \
  --start-time $(date -d '1 hour ago' +%s)000
```

Lambda

```bash
# List functions
aws lambda list-functions

# Invoke function
aws lambda invoke \
  --function-name myfunction \
  --payload '{"key": "value"}' \
  output.json

# Get function config
aws lambda get-function-configuration --function-name myfunction

# Update function code
aws lambda update-function-code \
  --function-name myfunction \
  --zip-file fileb://function.zip

# View recent invocations
aws lambda get-function --function-name myfunction
```

RDS

```bash
# List instances
aws rds describe-db-instances

# Instance details
aws rds describe-db-instances --db-instance-identifier mydb \
  --query 'DBInstances[0].[DBInstanceIdentifier,DBInstanceStatus,Endpoint.Address]'

# Create snapshot
aws rds create-db-snapshot \
  --db-instance-identifier mydb \
  --db-snapshot-identifier mydb-snapshot-$(date +%Y%m%d)

# List snapshots
aws rds describe-db-snapshots --db-instance-identifier mydb
```

Secrets Manager

```bash
# List secrets
aws secretsmanager list-secrets

# Get secret value
aws secretsmanager get-secret-value --secret-id mysecret \
  --query 'SecretString' --output text

# Create secret
aws secretsmanager create-secret \
  --name mysecret \
  --secret-string '{"username":"admin","password":"secret"}'

# Update secret
aws secretsmanager put-secret-value \
  --secret-id mysecret \
  --secret-string '{"username":"admin","password":"newsecret"}'
```

SSM Parameter Store

```bash
# Get parameter
aws ssm get-parameter --name /myapp/database/password --with-decryption

# Put parameter
aws ssm put-parameter \
  --name /myapp/database/password \
  --value "mysecret" \
  --type SecureString \
  --overwrite

# List parameters by path
aws ssm get-parameters-by-path --path /myapp/ --recursive --with-decryption
```

Query and Filter Patterns

JMESPath Queries

```bash
# Select specific fields
aws ec2 describe-instances \
  --query 'Reservations[].Instances[].[InstanceId,State.Name]'

# Filter in query
aws ec2 describe-instances \
  --query 'Reservations[].Instances[?State.Name==`running`].[InstanceId]'

# First result only
aws ec2 describe-instances \
  --query 'Reservations[0].Instances[0].InstanceId'

# Flatten nested arrays
aws ec2 describe-instances \
  --query 'Reservations[].Instances[].Tags[?Key==`Name`].Value[]'
```

Output Formats

```bash
# JSON (default)
aws ec2 describe-instances --output json

# Table (human readable)
aws ec2 describe-instances --output table

# Text (tab-separated, good for scripts)
aws ec2 describe-instances --output text

# YAML
aws ec2 describe-instances --output yaml
```

Scripting Patterns

Loop Through Resources

```bash
# Stop all instances with specific tag
for id in $(aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=dev" \
  --query 'Reservations[].Instances[].InstanceId' \
  --output text); do
  echo "Stopping $id"
  aws ec2 stop-instances --instance-ids "$id"
done
```

Wait for State

```bash
# Wait for instance to be running
aws ec2 wait instance-running --instance-ids i-1234567890abcdef0

# Wait for instance to stop
aws ec2 wait instance-stopped --instance-ids i-1234567890abcdef0

# Wait for snapshot completion
aws ec2 wait snapshot-completed --snapshot-ids snap-1234567890abcdef0
```

Pagination

```bash
# Auto-pagination (default in CLI v2)
aws s3api list-objects-v2 --bucket mybucket

# Manual pagination
aws s3api list-objects-v2 --bucket mybucket --max-items 100

# Use NextToken from output for next page
aws s3api list-objects-v2 --bucket mybucket --starting-token "token..."
```

Useful Aliases

```bash
# ~/.bashrc or ~/.zshrc

# Quick instance list
alias ec2ls='aws ec2 describe-instances --query "Reservations[].Instances[].[InstanceId,State.Name,InstanceType,PrivateIpAddress,Tags[?Key==\`Name\`].Value|[0]]" --output table'

# Who am I?
alias awswho='aws sts get-caller-identity'

# S3 bucket sizes
alias s3sizes='aws s3 ls | while read _ _ bucket; do aws s3 ls s3://$bucket --recursive --summarize 2>/dev/null | tail -1; echo $bucket; done'
```

Troubleshooting

```bash
# Debug mode
aws ec2 describe-instances --debug

# Dry run (check permissions without executing)
aws ec2 run-instances --dry-run --image-id ami-12345 --instance-type t2.micro

# Check CLI version
aws --version

# Clear credential cache
rm -rf ~/.aws/cli/cache/*
```

The AWS CLI rewards muscle memory. Start with the operations you do daily, build aliases for common patterns, and gradually expand. ...

February 25, 2026 · 7 min · 1350 words · Rob Washington