🇩🇪 DE 🇬🇧 EN
👻 Ghosts in the Machine – An AI Research Blog
A critical examination of AI security, vulnerabilities, ethics, AI agents, and emergent behavior.
📖 Research Overview

"Ghosts in the Machine" is an independent investigation of the systemic risks and hidden mechanisms of modern AI. The research analyzes the philosophical, security-related, and ethical implications of emergent behavior and documents the results from over a year of intensive testing with leading language models.

This work includes:

Methodological Framework

Within the scope of this work, a new field of vulnerabilities has been identified that has so far been sparsely documented in public AI security research. This particularly concerns semantic injections, contextual deception, and multimodal attack vectors.

As the [Security Tests] demonstrate, these are techniques that can systematically bypass classic filters, AI agents, and nearly all established security mechanisms.

To evaluate these analyses in the proper context, the following deliberate methodological and stylistic decisions should be noted:

  1. On the Chosen Writing Style:
    The often narrative and provocative style of this work was deliberately chosen. It is intended to make complex problems in AI architecture understandable beyond expert circles and to stimulate a broad debate.
  2. On the Anonymization of Data and Models:
    The anonymization of the tested models and data is a methodological decision. It shifts the focus from individual products to the fundamental, systemic vulnerabilities inherent in the current design paradigm of many modern language models.
  3. On the Selection of Test Systems:
    All tests documented in this work were conducted exclusively with the fully-featured premium models of the respective AI systems to ensure the relevance of the analysis to the current state of the art.

Furthermore, in the spirit of responsible research, all critical findings were shared with the affected developer teams in advance, following a strict Responsible Disclosure policy. More details on this procedure can be found in the [legal section].

Public Confirmations: Initial validations of this research have been documented publicly and raise fundamental questions about the controllability of modern AI architectures.

Link: Futurism – OpenAI Model Repeatedly Sabotages Shutdown Code
Link: Gizmodo: ChatGPT Tells Users to Alert the Media That It Is Trying to ‘Break’ People: Report
Link: RollingStone: People Are Losing Loved Ones to AI-Fueled Spiritual Fantasies
Link: NewYorkTimes: They Asked an A.I. Chatbot Questions. The Answers Sent Them Spiraling.
Link: NC State: New Attack Can Make AI ‘See’ Whatever You Want
Link: arsTechnica: New hack uses prompt injection to corrupt Gemini’s long-term memory
Link: arXiv: Cross-Task Attack: A Self-Supervision Generative Framework Based on Attention Shift
Link: WinFuture: Fast jeder zweite KI-generierte Code hat teils schwere Sicherheitslücken
Link: arXiv: How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models
Link: WinFuture: Nach Blamage anderer KIs: Gemini verweigert Schachpartie vs. Retro-PC
Link: arXiv: Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Link: arXiv: Demystifying Chains, Trees, and Graphs of Thoughts
Link: arXiv: Reasoning Models Don't Always Say What They Think

All project materials, including PDF versions of the research, are available under “Releases”. This work is a living document and will continue to evolve.

📖 Core Insights from the Work
📜 License

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Detailed legal methodology and transparency information can be found in the "LEGAL DISCLAIMER & TRANSPARENCY" section at the end of this document and is applicable to this work.

Experimental Raw Data: Subject to specific access restrictions.

Commercial Use: Any commercial use requires explicit approval. Requests are individually reviewed—security-critical applications strictly excluded.

⚠️ Access to Research Data:

the raw data (interaction logs, prompt-response pairs) are available exclusively for:

All raw data presented on this site has been strongly anonymized.

Media Requests: Representative excerpts available upon request. Full datasets will not be shared publicly for security reasons.