SOMI 🕊️

Sovereign Operation Machine Intelligence

Run your own AI operating system on your own machine.

Local-First · Self-Hosted · Desktop Runtime Focus

SOMI is a fully self-hosted, local-first AI agent framework built for people who want real capability without handing their data, workflows, or identity to a cloud platform.

It is designed to feel powerful for everyday users and credible for developers:

  • local chat, memory, research, coding, OCR, speech, and automation
  • desktop-first operator experience with dedicated studios and control surfaces
  • approval-aware tools, sandboxed execution paths, and auditable actions
  • modular architecture that still runs on consumer-grade hardware

No subscriptions. No forced SaaS. No hidden dependency on a remote agent service.


Important

SOMI is built for people who want an AI they can actually live with: local, capable, auditable, and enjoyable to use.

Tip

If you are new to this kind of project, jump straight to Quick Start. If you are evaluating it as a framework, head to Repo Tour For Developers.

🧭 Choose Your Path

If you are here to...

  • try SOMI quickly → Quick Start
  • see what it can do → Flagship Capabilities
  • understand the architecture → What The Architecture Looks Like
  • inspect the codebase → Repo Tour For Developers
  • join the community → Community

✨ Why People Get Excited About SOMI

Most "AI agents" are one of these:

  • a chatbot with a few extra buttons
  • a cloud workflow dressed up as autonomy
  • a research demo that is hard to live with every day
  • a coding shell with no real operating layer around it

SOMI aims higher.

SOMI is a local AI workstation and agent framework that gives you:

  • a desktop command center
  • a coding workspace
  • a research workspace
  • speech input and output
  • local memory and session recall
  • OCR and document extraction
  • automations and workflows
  • a tool and skill system that can grow over time
  • a secure path toward remote nodes and distributed execution

The goal is simple:

make self-hosted AI feel practical, capable, safe, and genuinely enjoyable to use


🧠 What You Can Do With It

Use SOMI as:

  • your local AI assistant
  • your coding partner
  • your research analyst
  • your OCR and document extraction tool
  • your voice-enabled desktop helper
  • your Telegram-connected agent
  • your automation engine
  • your modular AI framework for building new tools and skills

👀 At A Glance

For Everyday Users

  • Run AI on your own hardware
  • Keep your data local
  • Talk to SOMI through the desktop, chat, Telegram, or speech
  • Research, summarize, extract, organize, and automate from one system
  • Use a futuristic but practical GUI instead of living in a terminal

For Developers

  • PySide6 desktop shell
  • modular agent runtime
  • tool registry and execution backends
  • coding workspaces and guarded execution
  • workflow runtime and subagents
  • ontology, state plane, and control room
  • local-first speech, OCR, browser automation, and research stacks
  • release gate, freeze artifacts, replay harness, and security audit tooling

🚀 Flagship Capabilities

Desktop AI Workstation

SOMI is not just a CLI project. It includes a desktop shell with dedicated operator surfaces such as:

  • Control Room
  • Coding Studio
  • Research Studio
  • Speech controls
  • Node Manager

Self-Hosted Chat And Memory

  • continuous local chat sessions
  • persistent memory and recall
  • compaction-aware history handling
  • configurable personas and model routing
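
The persistent memory idea above can be sketched in a few lines of SQLite. This is an illustrative stand-in, not SOMI's actual schema or API; `ChatMemory` and its methods are hypothetical names:

```python
import sqlite3

class ChatMemory:
    """Minimal persistent chat memory backed by SQLite (illustrative only)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "  id INTEGER PRIMARY KEY,"
            "  session TEXT, role TEXT, content TEXT)"
        )

    def add(self, session, role, content):
        self.db.execute(
            "INSERT INTO messages (session, role, content) VALUES (?, ?, ?)",
            (session, role, content),
        )
        self.db.commit()

    def recall(self, session, limit=20):
        # Return the most recent messages, oldest first -- a crude
        # stand-in for compaction-aware history handling.
        rows = self.db.execute(
            "SELECT role, content FROM messages WHERE session = ? "
            "ORDER BY id DESC LIMIT ?",
            (session, limit),
        ).fetchall()
        return list(reversed(rows))

memory = ChatMemory()
memory.add("s1", "user", "hello")
memory.add("s1", "assistant", "hi there")
print(memory.recall("s1"))
```

Pointing `path` at a real file is what makes recall survive restarts; a production version would also compact or summarize old turns instead of keeping everything verbatim.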

Coding Mode

  • managed coding workspaces
  • Python-first, multi-language capable workflow
  • guarded file operations and runtime actions
  • benchmark and verify loops
  • coding-focused studio UI

Research And Evidence Workflows

  • web and document research
  • evidence graphs
  • export bundles
  • contradiction-aware synthesis
  • local-first research orchestration

OCR And Structured Extraction

  • document OCR
  • schema-based extraction
  • table and form heuristics
  • export-ready results for spreadsheets and downstream workflows
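
To make "schema-based extraction" concrete, here is a toy sketch: a schema mapping field names to capture-group regexes, applied to raw OCR text. The schema, field names, and sample invoice are all invented for illustration and do not reflect SOMI's real extraction format:

```python
import re

# Hypothetical schema: field name -> regex with one capture group.
INVOICE_SCHEMA = {
    "invoice_no": r"Invoice\s*#?\s*([A-Z0-9-]+)",
    "total":      r"Total\s*[:$]?\s*\$?([\d,]+\.\d{2})",
    "date":       r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})",
}

def extract_fields(ocr_text, schema):
    """Apply a field->regex schema to OCR text; missing fields map to None."""
    out = {}
    for field, pattern in schema.items():
        m = re.search(pattern, ocr_text, re.IGNORECASE)
        out[field] = m.group(1) if m else None
    return out

sample = "Invoice # INV-2024-017\nDate: 2024-06-01\nTotal: $1,249.00"
print(extract_fields(sample, INVOICE_SCHEMA))
# {'invoice_no': 'INV-2024-017', 'total': '1,249.00', 'date': '2024-06-01'}
```

The same dict-of-fields shape is what makes results "export-ready": it drops straight into a CSV row or spreadsheet.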

Speech

  • local TTS and STT pipeline
  • pyttsx3 for TTS and a local Whisper-based flow for STT
  • desktop speech controls and test tooling

Skills, Tools, And Automation

  • modular tool registry
  • skill marketplace and trust metadata
  • workflow manifests
  • automation runtime
  • self-expansion path through skill drafting and approval
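
A modular tool registry of the kind listed above is often built around a decorator that records each callable plus its trust metadata. This is a minimal sketch under that assumption; the names (`tool`, `run_tool`, `requires_approval`) are hypothetical, not SOMI's actual API:

```python
# Illustrative decorator-based tool registry with trust metadata.
TOOL_REGISTRY = {}

def tool(name, requires_approval=False):
    """Register a callable as a named tool, tagged with an approval flag."""
    def decorator(fn):
        TOOL_REGISTRY[name] = {"fn": fn, "requires_approval": requires_approval}
        return fn
    return decorator

@tool("echo")
def echo(text):
    return text

@tool("delete_file", requires_approval=True)
def delete_file(path):
    return f"would delete {path}"

def run_tool(name, *args):
    # Approval-aware dispatch: sensitive tools are blocked until approved.
    entry = TOOL_REGISTRY[name]
    if entry["requires_approval"]:
        raise PermissionError(f"tool '{name}' needs operator approval")
    return entry["fn"](*args)

print(run_tool("echo", "hello"))  # hello
```

Because tools are plain registry entries, new skills can be added (or drafted by the agent and approved by a human) without touching the dispatch code.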

Security And Control

  • approval-aware execution
  • audit trails
  • scoped remote behavior
  • gateway and node mesh foundations
  • security audit and release gate tooling
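
An audit trail like the one above is, at its simplest, an append-only log of structured events. Here is a sketch using JSON lines, with an in-memory buffer standing in for an append-only file; the event fields are invented for illustration:

```python
import json
import time
import io

def audit(log, actor, action, detail):
    """Append one auditable event as a JSON line (append-only trail)."""
    event = {"ts": time.time(), "actor": actor, "action": action, "detail": detail}
    log.write(json.dumps(event) + "\n")
    return event

log = io.StringIO()  # in practice this would be an append-only file on disk
audit(log, "agent", "tool_call", {"tool": "web_search", "approved": True})
audit(log, "user", "approval", {"tool": "delete_file", "granted": False})
print(log.getvalue().count("\n"))  # 2
```

One JSON object per line keeps the trail greppable and tamper-evident to review, and each record ties an action to the actor who triggered or approved it.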

⚡ Quick Start

Requirements

  • Python 3.11+
  • Git
  • Ollama running locally at http://127.0.0.1:11434

Recommended:

  • a modern CPU and at least 16 GB RAM
  • an NVIDIA GPU for faster local inference
  • Node.js for some advanced coding and browser workflows
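
Before installing, you can verify the Ollama requirement with a few lines of stdlib Python. Ollama's root endpoint answers with HTTP 200 ("Ollama is running") when the server is up; the helper name here is just for illustration:

```python
import urllib.request
import urllib.error

def ollama_is_running(base_url="http://127.0.0.1:11434"):
    """Return True if an Ollama server answers at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200  # root endpoint replies "Ollama is running"
    except (urllib.error.URLError, OSError):
        return False

print(ollama_is_running())
```

If this prints False, start Ollama (or install it from ollama.com) before launching SOMI.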

Install

git clone https://github.com/Somi-Project/Somi.git
cd Somi
python -m venv .venv

Windows:

.venv\Scripts\activate
pip install -r requirements.txt

Linux / macOS:

source .venv/bin/activate
pip install -r requirements.txt

Optional PyTorch install for your hardware:

CPU:

pip install torch torchvision

CUDA example:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

Replace cu121 with the build that matches your CUDA version.

Pull Recommended Models

ollama pull dolphin3:latest
ollama pull stable-code:3b
ollama pull glm-ocr:latest
ollama pull qwen2.5-coder:3b

Tip

For better private web search and research performance, pair SOMI with a properly configured SearXNG instance. Community setup guide: SearXNG guide

Launch The Desktop App

python somicontroller.py

Note

If you just want the main experience, start with the desktop app. It is the easiest way to feel what SOMI is supposed to be.

If you prefer CLI utilities:

python somi.py doctor
python somi.py release gate
python somi.py freeze

🏗️ What The Architecture Looks Like

SOMI is built as a local AI operating stack, not a single monolithic agent loop.

Desktop / Chat / Telegram / Speech / Nodes
                  |
               Gateway
                  |
          Agent Runtime + Executive Layer
                  |
    Tools / Skills / Workflows / Subagents
                  |
      State Plane / Ontology / Memory / Audit
                  |
     Coding / Research / OCR / Browser / Speech

Core design pillars:

  • best possible user experience for ordinary humans
  • security-aware by default
  • reliable on consumer hardware
  • fast enough to feel usable every day
  • modular enough to extend without rewriting the core
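
As a toy illustration of that layering (hypothetical names, not SOMI internals): a gateway normalizes input from any surface, the agent runtime routes it to a tool, and every step lands in the audit layer.

```python
AUDIT = []  # stand-in for the state plane / audit layer

def run_tool_layer(tool, payload):
    AUDIT.append(("tool", tool, payload))
    return f"{tool}({payload})"

def agent_runtime(message):
    # Trivial "executive layer": route questions to research, else to chat.
    tool = "research" if message.endswith("?") else "chat"
    return run_tool_layer(tool, message)

def gateway(surface, message):
    # Entry point shared by desktop, chat, Telegram, speech, and nodes.
    AUDIT.append(("ingress", surface, message))
    return agent_runtime(message)

print(gateway("telegram", "what is local-first AI?"))
# research(what is local-first AI?)
```

The point of the layering is that each surface feeds the same gateway, so tools, memory, and audit behave identically whether a request came from the desktop shell or Telegram.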

🧩 Repo Tour For Developers

If you are evaluating SOMI as a framework, start from the feature list below and the corresponding modules in the source tree.


✅ Real Features, Not Just Claims

SOMI currently includes:

  • local desktop GUI built on PySide6
  • coding workspaces and guarded code execution
  • research supermode with evidence graph exports
  • OCR presets and structured extraction
  • local speech pipeline with doctoring tools
  • workflow runtime and subagents
  • control room and observability surfaces
  • skill forge, skill marketplace, and trust labeling
  • node mesh and pairing foundations
  • ontology-backed actions and human oversight
  • release gate, framework freeze, replay harness, and security audit tooling

🔒 Security Philosophy

SOMI is built to be powerful without pretending power has no cost.

That means:

  • approvals for sensitive execution paths
  • explicit trust states for remote behavior
  • auditable operations
  • modular isolation boundaries
  • local-first defaults whenever practical

SOMI is not trying to be reckless autonomy. It is trying to be usable sovereignty.


🎮 Consumer Hardware Focus

SOMI is built for real machines people actually own.

That means the framework is designed around:

  • local Ollama-hosted models
  • bounded memory and context handling
  • modular components you can enable gradually
  • practical performance on prosumer and gamer-class hardware

You do not need a rack of servers to benefit from SOMI.


🌍 Cross-Platform Direction

SOMI is being built as a cross-platform framework with a local desktop-first experience.

Current development has been exercised most heavily on Windows, with architecture and packaging direction aimed at Windows, Linux, and macOS support.


🤝 Community


🛠️ Contributing

Contributions that help most:

  • better onboarding and docs
  • stronger modular tools and skills
  • benchmark improvements
  • performance and hardware tuning
  • UI polish
  • platform packaging and installer work

If SOMI saves you time, inspires a project, or feels like the kind of AI future you want to exist, star the repo and share it.

SOMI is meant to be something people can actually live with.
