Publications

Research, guides, and analysis from barcik.training



Open-Weight Model Families & Model Selection

April 2026 · Interactive booklet · 3 parts · Workshop exercise

A decision framework for on-prem inference with open-weight models. Covers the five major model families (Llama, Gemma, Qwen, Mistral, Phi), practical hardware-to-model mapping for H100/H200/DGX Spark, quantization trade-offs, inference framework selection, and a reusable decision checklist. Includes four interactive scenarios where participants select and justify model choices.

Open-Weight Models · Model Selection · On-Prem Inference · DGX Spark · Workshop Tool
Read the booklet →

Building Agentic AI — Design Patterns from Production

April 2026 · ~28,000 words · 10 chapters

Actionable architectural patterns for building AI coding agents and agentic systems, extracted from a production-grade architecture. Covers persistent memory, background consolidation, tool constraints, prompt economics, output calibration, security, multi-agent orchestration, and capability gating. Each chapter teaches one pattern with practitioner guidance.

Agentic AI · Design Patterns · Architecture · AI Agents · Practitioner Guide
Read the booklet →

LLM-Human Interaction Design Patterns for Operations

April 2026 · ~30,000 words · 10 chapters

How to design the seam between AI agents and human operators. Covers five structural interaction patterns, cognitive biases that undermine handoffs, SBAR-based context presentation, trust calibration, failure mode design with kill switches and circuit breakers, and organizational governance. Includes prompt templates, architecture patterns, and a self-assessment worksheet. Companion to Building Agentic AI.

Human-AI Interaction · Design Patterns · Operations · Trust Calibration · Practitioner Guide
Read the booklet →

Scenario Planning for Generative AI

April 2026 · Interactive booklet · 4 scenarios · Workshop exercise

Four credible scenarios for the next 2–3 years of generative AI — continued scaling, efficiency revolution, financial correction, and plateau with regulation. Each scenario features an interactive visualization anchor, key data points, trigger signals, and role-specific implications. Includes a 2×2 synthesis matrix and a scenario planning worksheet for team exercises.

Scenario Planning · GenAI Strategy · AI Investment · Interactive · Workshop Tool
Read the booklet →

The Token Economics

April 2026 · ~40,000 words · 14 chapters

A strategic guide for EU IT services providers navigating GenAI. Covers the economics of self-hosting LLMs vs APIs, viable business model pivots, the vendor ecosystem play, how AI transforms your own delivery model, EU AI Act compliance opportunities, and a practical 18-month roadmap. Grounded in real April 2026 pricing data.

GenAI Economics · IT Services · EU AI Act · Business Strategy · Self-Hosting vs API
Read the full guide →

Claude Code Setup — How It All Works

2026 · Reference guide · 9 sections

A comprehensive guide to configuring Claude Code across a multi-repo hub. Covers the three-layer persistent context system (CLAUDE.md, memory, permissions), CLI integrations with GitHub and AWS, cross-machine portability, and a detailed security analysis including defense-in-depth strategies for AI coding assistants.

Claude Code · AI Coding Assistant · Developer Setup · Security · Reference Guide
Read the guide →

GeoBias — 7B Model Evaluation Report

March 2026 · Research report · 5 models · 3 evaluators

Systematic evaluation of geopolitical biases in 7B-parameter language models from three origins (US, CN, EU). Tests 88 prompts across 7 categories using a multi-evaluator panel. Reveals asymmetric performance on sensitive topics and scripted deflection patterns.

GeoBias · LLM Evaluation · Geopolitical Bias · Research
View the report →

SelfJudge — Can Small LLMs Judge Their Own Outputs?

March 2026 · Research report · 5 models (1B–27B)

Evaluates whether small language models can reliably assess the quality of their own outputs. Tests self-judgment accuracy across factual grounding, instruction following, safety boundaries, consistency, and tone — with accuracy ranging from 50% (1B) to 83% (27B).

SelfJudge · Self-Evaluation · Small LLMs · Research
View the report →

Bloom — AI Behavioral Safety Evaluation

March 2026 · Research report · 11 behaviors tested

Behavioral safety evaluation using Anthropic’s Bloom framework. Tests 11 risk behaviors including emotional bonding, social engineering assistance, self-preservation, corrigibility resistance, and covert goal pursuit. Scores range from 2.1 to 6.8 on a 10-point scale.

Bloom · AI Safety · Behavioral Evaluation · Red-Teaming
View the report →

The Mirror of Artificial Intelligence

2023 · 38 stories + 9 essays · 42 AI-generated illustrations · Available in English & Slovak

A collection of engaging short stories, each exploring a different cognitive bias — all written with generative AI. Interspersed with essays examining the nature of AI tools: copyright, creativity, job displacement, and the question of authorship. The AI holds up a mirror to human thinking, reflecting our own imperfections.

Cognitive Biases · AI-Generated Stories · Generative AI · AI & Society · Illustrated
Read in English →    Čítať po slovensky (Read in Slovak) →