Apeiron is a research system for building amorphware — software that is synthesized and iteratively refined by LLM-driven agents through a Computer-Use-Agent (CUA) build loop.
Research preview / non-production. This project is released for research and educational purposes. It is not a supported product and is not intended for production or high-stakes use. See Intended Use & Scope, Capability Limits, and Responsible AI: Risks & Mitigations below.
- Paper
- Intended Use & Scope
- Capability Limits
- Responsible AI: Risks & Mitigations
- Out-of-Scope and Prohibited Uses
- Installation
- Configuration (.env)
- How to use
- How to extend the framework
- Contributing
- Security
- Code of Conduct
- Trademarks
- License
Apeiron is described in our paper "Apeiron: A Scalable LLM-agentic Framework for Autonomous Full-lifecycle Demand-optimized Application Synthesis", accepted to the Findings of the Association for Computational Linguistics: ACL 2026.
If you use Apeiron in your research, please cite:
@inproceedings{cheng-etal-2026-apeiron,
title = "Apeiron: A Scalable {LLM}-agentic Framework for Autonomous Full-lifecycle Demand-optimized Application Synthesis",
author = "Cheng, Junyan and
Srivastava, Ankit and
Zeng, Jessie and
Drinic, Milenko and
Stokes, Jack W.",
editor = "Liakata, Maria and
Moreira, Viviane P. and
Zhang, Jiajun and
Jurgens, David",
booktitle = "Findings of the {A}ssociation for {C}omputational {L}inguistics: {ACL} 2026",
month = jul,
year = "2026",
address = "San Diego, California, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.findings-acl.188/",
pages = "3868--3899",
ISBN = "979-8-89176-395-1",
abstract = "We introduce Apeiron, a scalable and extensible framework for addressing *amorphous* user demands through autonomous, full-lifecycle application synthesis. Apeiron models the unstructured app development process as a heuristic optimization problem combining (i) a Computer-Use Agent (CUA) evaluator that simulates personas and demands, (ii) an *Activity Tracer* that grounds feedback in code-level interaction traces, and (iii) a *Locality Controller* that constrains changes during continuous integration and delivery (CI/CD). Furthermore, we introduce an innovative data generation approach using CUA-as-a-Judge to tackle data scarcity. Across 300 app scenarios, 2,400 personas, and 46,338 demands, Apeiron outperformed baselines by 10.7{\%} in CUA ratings and 27.8{\%} in user-demand task scores. The optimization process enhances task scores by 64.7{\%}, and the tracer contributes a 25.1{\%} gain. In CI/CD, Apeiron effectively restores 96.9{\%} of the pre-shift mean CUA rating in one optimization step with {\ensuremath{<}}30{\%} code changes in response to 30{\%} demand shifts. Finally, a user study ($N=18$) shows that our CUA ratings strongly correlate with human judgment (Spearman{'}s $\rho=0.685$) and that users prefer Apeiron-synthesized apps over baselines."
}
Apeiron is intended only as a research framework that orchestrates LLM agents to assemble and iterate on application code via a constrained Computer-Use-Agent (CUA) build loop. Its purpose is to study agentic software construction.
The system is scoped so that it cannot actively work on or improve itself. The CUA loop builds target applications from configuration and library bindings; it does not have a pathway to modify, retrain, or extend its own agent code, model weights, or orchestration logic. This boundary is intentional and is a condition of release — please preserve it when extending the framework.
Appropriate uses:
- Academic / research exploration of agentic build pipelines.
- Controlled experiments in sandboxed, non-production environments.
Capability limits:
- No self-modification / self-improvement. Apeiron builds external apps via the CUA loop; it is not designed to modify its own source, prompts, or models at runtime.
- Not autonomous beyond the build task. Agents operate within the configured build/CI/CD functions (build, build_cicd, xbuild) and the bound libraries declared in configs. It is not a general-purpose autonomous agent.
- No guarantees of correctness or safety of generated code. Output is experimental and must be reviewed by a human before any use.
- Sandboxed execution assumed. The system spins up many isolated venvs/ports per CUA worker and assumes it runs in an isolated, non-production environment.
Known risks and the mitigations / boundaries that apply:
- Generation of malicious or unsafe code. Because the system synthesizes and executes code, it could be prompted to produce harmful, insecure, or malicious output. Mitigation: run only in isolated sandboxes; require human review of all generated artifacts; do not connect to production systems, credentials, or networks.
- Sensitive / high-stakes domains. The system is not evaluated or approved for use in sensitive domains (e.g., safety-critical, medical, legal, financial decisioning, or any setting affecting the rights, safety, or livelihood of individuals). Mitigation: such uses are out of scope and prohibited (see below).
- Self-directed behavior. As noted above, the system is scoped to prevent acting on itself; this boundary mitigates self-improvement / recursive self-modification concerns and must be preserved.
- Hallucination / incorrect output. LLM agents can produce inaccurate or unreliable results. Mitigation: treat all output as untrusted draft material requiring validation.
- Data handling. Provide only non-sensitive, non-personal data to the system. Do not input regulated, confidential, or personal data.
For the full intended-use, evaluation, and limitation disclosures, see the project Transparency Note.
The following are explicitly out of scope and must not be attempted:
- Production deployment or any high-stakes / sensitive-domain use.
- Generating malware, exploits, or other harmful code.
- Allowing the system to operate against real users, production data, or live credentials/networks.
- Modifying the framework to enable self-improvement or removal of the capability limits described above.
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda create -n apeiron python=3.13 -y &&\
cd YOUR_DIR/amorphware &&\
conda activate apeiron &&\
pip install -e . &&\
pip install -r requirements.txt &&\
python -m ipykernel install --user --name "apeiron" --display-name "Python (apeiron)"
python -m playwright install
Note: you may need to install additional OS dependencies; please read the output from playwright install carefully. You may also need Node and possibly Next.js installed if you wish to use Reflex.
pip install apeiron/btool/streamlit_tracer/.
(Assumes you are in the amorphware folder.)
Copy .env.example to .env and fill in your own values.
.env is gitignored and must never be committed. Never commit real
secrets, keys, or personal endpoints.
cp .env.example .env
By default Apeiron authenticates to Azure OpenAI / AI Foundry with your Entra ID identity rather than an API key (AZURE_OPENAI_AUTH_MODE=interactive, the default):
- It first tries an existing az login session (silent, no browser).
- Otherwise it opens a browser once for an interactive login. The resulting authentication record is cached under ~/.apeiron/auth/, so later runs acquire tokens silently from the OS-encrypted token cache without reopening the browser.
- If no token can be acquired (e.g. a headless/CI box with no cached login) and AZURE_AI_FOUNDRY_KEY is set, it falls back to the API key.
To use the API key exclusively (e.g. headless/CI), set
AZURE_OPENAI_AUTH_MODE=key. Set AZURE_OPENAI_AUTH_VERIFY=0 to defer login to
the first request instead of checking eagerly at startup. See
.env.example for all auth-related variables.
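The credential order described above can be summarized as a small decision function. This is a minimal sketch, not Apeiron's actual auth code: the helper plan_auth and its string labels are hypothetical, and it only inspects variables without performing any login. Only the variable names AZURE_OPENAI_AUTH_MODE, AZURE_AI_FOUNDRY_KEY, and AZURE_OPENAI_AUTH_VERIFY come from the text. Rejecting unknown modes and treating every value other than 0 as "verify eagerly" are assumptions of this sketch.

```python
from typing import Mapping


def plan_auth(env: Mapping[str, str]) -> dict:
    """Return the credential order implied by the documented settings.

    Hypothetical helper for illustration only; it performs no login.
    """
    mode = env.get("AZURE_OPENAI_AUTH_MODE", "interactive")  # documented default
    has_key = bool(env.get("AZURE_AI_FOUNDRY_KEY"))

    if mode == "key":
        # Key-only mode, e.g. for headless/CI machines.
        if not has_key:
            raise ValueError("AZURE_OPENAI_AUTH_MODE=key requires AZURE_AI_FOUNDRY_KEY")
        order = ["api_key"]
    elif mode == "interactive":
        # Silent az login session first, then a one-time browser login.
        order = ["az_cli_session", "interactive_browser"]
        if has_key:
            order.append("api_key")  # fallback when no token can be acquired
    else:
        # Assumption: modes other than the two documented ones are rejected.
        raise ValueError(f"unknown auth mode: {mode!r}")

    # AZURE_OPENAI_AUTH_VERIFY=0 defers login to the first request.
    eager = env.get("AZURE_OPENAI_AUTH_VERIFY", "1") != "0"
    return {"order": order, "verify_at_startup": eager}
```

Calling plan_auth(os.environ) after loading .env is one way to print the expected behaviour before a long run, which may help when debugging a headless box that has no cached login.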
This project does not ship CUA model deployments. Configure your own CUA
endpoint and key via environment variables (see .env.example) rather than
hardcoding them in source. The computer-use model card reads its endpoint from
the AZURE_CUA_ENDPOINT environment variable.
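Reading the endpoint from the environment instead of hardcoding it might look like the following. This is a sketch under stated assumptions: read_cua_endpoint is a hypothetical helper, not a function from the Apeiron codebase, and only the variable name AZURE_CUA_ENDPOINT is taken from the text above.

```python
import os
from typing import Mapping, Optional


def read_cua_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the CUA endpoint from AZURE_CUA_ENDPOINT, or fail with a clear message.

    Hypothetical helper; pass a mapping to test without touching os.environ.
    """
    env = os.environ if env is None else env
    endpoint = env.get("AZURE_CUA_ENDPOINT", "").strip()
    if not endpoint:
        # Fail early rather than falling back to a hardcoded default.
        raise RuntimeError(
            "AZURE_CUA_ENDPOINT is not set; add it to your .env (see .env.example)"
        )
    return endpoint
```

Failing early on a missing value keeps a misconfigured run from starting CUA workers that cannot reach any model.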
Note (Windows): you may need to enable Long Path Support (260-char limit on older versions):
- Open the Registry Editor (regedit).
- Navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem.
- Set the LongPathsEnabled DWORD value to 1 (create it if missing).
- A restart may be required. Details: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
- Use a large SSD when running experiments, since many venvs may be created. By default Apeiron supports up to 2000 concurrent CUA workers; to change this, edit the length in find_free_port. One CUA = one port = one venv (favours stability over efficiency).
- To avoid OSError: [Errno 24] inotify instance limit reached:
  - sudo vim /etc/sysctl.conf
  - append fs.inotify.max_user_watches=524288
  - append fs.inotify.max_user_instances=1000000
  - run sudo sysctl -p to apply.
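To illustrate the "one CUA = one port" rule and why the length of the scanned range caps the number of workers, here is a minimal sketch of a port scan. It is not the project's find_free_port: the name first_free_port, the base port 20000, and the loopback host are assumptions made for illustration, and only the range length of 2000 echoes the default mentioned above.

```python
import socket


def first_free_port(base: int = 20000, length: int = 2000, host: str = "127.0.0.1") -> int:
    """Return the first TCP port in [base, base + length) that can be bound.

    Hypothetical helper: base and host are illustrative values, not Apeiron's.
    """
    for port in range(base, base + length):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port already in use, try the next one
            return port  # the socket is closed on exit, freeing the port for the worker
    raise RuntimeError(f"no free port in range {base}-{base + length - 1}")
```

The port is released before the caller uses it, so another process could claim it in between; a real implementation would need to handle that race, for example by retrying when the worker fails to bind.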
Run only in an isolated, non-production sandbox. See Responsible AI.
- To run a full experiment (from the amorphware root directory):
  python scripts/run_exp.py
  You may wish to change the config and exp_name first; in particular, you can edit the libraries bound via the bind_libraries field in the config files under configs.
- In the system class, the main entry points are build and build_cicd (initial build and CI/CD build respectively). xbuild launches the distributed build.
- To launch the monitoring GUI:
  streamlit run bin/app.py
- Add a new library: create a file under apeiron/library/ (follow the format of existing libraries), then add it to the bind_libraries field in the relevant config under configs.
- Extend the LLMs: update sllm/const.py.
- Add a new agent: create a file under apeiron/agent/prompts and register it in apeiron/agent/aw.py.
- Add a new framework target: implement the Compiler class in apeiron/btool/compilers.py (see the existing Streamlit compiler).
- Extend the tracer: see apeiron/btool/streamlit_tracer. The key is to produce the ACT and Traces defined in apeiron/btool/ir.py.
When extending, do not introduce pathways that let the system modify its own code/models or bypass the capability limits — this is a release condition.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately. Follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
Please see SECURITY.md for how to report security issues. Do not file security vulnerabilities as public issues.
This project has adopted the Microsoft Open Source Code of Conduct. See CODE_OF_CONDUCT.md for details.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
Licensed under the MIT License.