For Chinese documentation, see README.zh-CN.md.
OmniSearch is a local-first tool layer for AI agents, now primarily focused on A-share stock data workflows.
It is still not a search engine. Today the primary use case is the stock data layer, while the original web tooling remains available as a supporting capability:
- /search backed by SearXNG
- /extract backed by requests + Trafilatura
- /research as a minimal search + extract orchestration
- GET /company/* as the primary A-share stock data layer
OmniSearch exposes one FastAPI service and keeps responsibilities narrow:
- web search stays behind /search
- content extraction stays behind /extract
- research orchestration stays behind /research
- stock data is normalized into local SQLite and served from /company/*
There is no indexing engine, ranking model, auth, billing, admin dashboard, or LLM summarization here.
The stock layer is the primary contract of this repository.
Storage defaults to SQLite for local use. The repository layer is kept narrow so a future PostgreSQL implementation can replace it without changing the API contract.
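As a rough illustration of what a narrow repository layer can look like, the sketch below defines a storage contract as a `typing.Protocol` and one SQLite implementation of it. The names `PriceRepository`, `SQLitePriceRepository`, `latest_close`, and the column names are hypothetical and are not taken from this repository; only the `price_daily` entity name comes from this README.

```python
import sqlite3
from typing import Optional, Protocol, Tuple


class PriceRepository(Protocol):
    """Hypothetical storage contract; the real interface may differ."""

    def upsert_price(self, ticker: str, trade_date: str, close: float) -> None: ...

    def latest_price(self, ticker: str) -> Optional[Tuple[str, float]]: ...


class SQLitePriceRepository:
    """SQLite-backed implementation of the hypothetical contract above."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        # Column names are assumptions made for this sketch.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS price_daily ("
            "ticker TEXT NOT NULL, trade_date TEXT NOT NULL, close REAL NOT NULL, "
            "PRIMARY KEY (ticker, trade_date))"
        )

    def upsert_price(self, ticker: str, trade_date: str, close: float) -> None:
        # INSERT OR REPLACE overwrites the row with the same primary key.
        self._conn.execute(
            "INSERT OR REPLACE INTO price_daily (ticker, trade_date, close) "
            "VALUES (?, ?, ?)",
            (ticker, trade_date, close),
        )
        self._conn.commit()

    def latest_price(self, ticker: str) -> Optional[Tuple[str, float]]:
        # ISO dates sort correctly as text, so ORDER BY on the string works.
        row = self._conn.execute(
            "SELECT trade_date, close FROM price_daily WHERE ticker = ? "
            "ORDER BY trade_date DESC LIMIT 1",
            (ticker,),
        ).fetchone()
        return (row[0], row[1]) if row else None


def latest_close(repo: PriceRepository, ticker: str) -> Optional[float]:
    """Service-level code depends only on the protocol, not on SQLite."""
    latest = repo.latest_price(ticker)
    return latest[1] if latest else None
```

Under this kind of design, a PostgreSQL-backed class with the same two methods could be passed to `latest_close` without touching the calling code.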
Core internal entities:
- company_profile
- event
- financial_summary
- price_daily
Primary endpoint:
GET /company/{ticker}/overview
Supporting stock endpoints:
- GET /company/{ticker}
- GET /company/{ticker}/events
- GET /company/{ticker}/financials
- GET /company/{ticker}/prices
- GET /company/{ticker}/timeline
- GET /company/{ticker}/risk-flags
Overview is the recommended entry point for agents because it aggregates:
- normalized company profile
- latest financial summary
- latest daily price
- recent events
- risk flags
- derived overview signals
- per-section data_status for company, financials, prices, events, and risk flags
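An agent consuming the overview may want to check the per-section `data_status` before trusting a section. The sketch below is one way to do that on the client side. It assumes each section is an object with a `data_status.status` field, as the `company` section is in the example response later in this README. The other section keys (`financials`, `prices`, `events`, `risk_flags`) and any status value other than `"fresh"` are assumptions, not confirmed by this document.

```python
from typing import Any, Dict

# Section keys are assumptions; only "company" appears in this README's
# example overview response.
SECTIONS = ("company", "financials", "prices", "events", "risk_flags")


def sections_needing_attention(overview: Dict[str, Any]) -> Dict[str, str]:
    """Map each section that is absent or not "fresh" to its reported status."""
    problems: Dict[str, str] = {}
    for name in SECTIONS:
        section = overview.get(name)
        if not isinstance(section, dict):
            # The section is absent or has an unexpected shape.
            problems[name] = "missing"
            continue
        status = (section.get("data_status") or {}).get("status", "unknown")
        if status != "fresh":
            problems[name] = status
    return problems
```

A caller could use the result to decide whether to retry with `refresh=true` or to tell the user which parts of the answer rest on incomplete data.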
Additional docs:
app/
  api/
  collectors/
  core/
  db/
  extractors/
  models/
  normalizers/
  providers/
  research/
  schemas/
  services/
- Python 3.11+
- Docker and Docker Compose for the bundled SearXNG setup
- TUSHARE_TOKEN if you want company profile and financial summary collection
make up
Equivalent:
docker compose up --build
This starts:
- OmniSearch API on http://localhost:8000
- SearXNG on http://localhost:8080
The API container persists local stock cache data under ./data.
- Create env file:
cp .env.example .env
- Install dependencies:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
- Start SearXNG:
docker compose up -d searxng
- Start the API:
uvicorn app.main:app --reload
The repository currently uses pytest directly. There is no dedicated make test target yet.
Run the full suite:
pytest
Useful focused runs:
pytest -x -vv
pytest tests/test_stock_api.py
pytest tests/test_research_api.py
pytest tests/test_content_extractor.py
pytest tests/test_stock_service.py
Current test coverage is split into three practical layers:
- API tests: FastAPI route contract checks with TestClient
- unit tests: normalizers, planners, collectors, extractor helpers, and service logic
- lightweight integration tests: script and data-flow behavior with fakes
Representative test files:
- tests/test_stock_api.py
- tests/test_research_api.py
- tests/test_stock_service.py
- tests/test_stock_normalizers.py
- tests/test_content_extractor.py
- tests/test_sync_stock_script.py
Areas that still need more coverage:
- /search route behavior
- /extract route behavior
- app/providers/searxng.py
- the full extraction path in app/extractors/content.py
- direct SQLite repository tests in app/db/sqlite.py
API_PORT=8000
SEARXNG_BASE_URL=http://localhost:8080
SQLITE_DB_PATH=./data/omnisearch.db
RESEARCH_PLANNER=rule
TUSHARE_TOKEN=
TUSHARE_BASE_URL=
CNINFO_ANNOUNCEMENTS_URL=https://www.cninfo.com.cn/new/hisAnnouncement/query
STOCK_DATA_TTL_HOURS=24
Notes:
- TUSHARE_TOKEN is required for /company/{ticker} and /company/{ticker}/financials.
- If you run a local Tushare-compatible proxy, set TUSHARE_BASE_URL=http://tushare.xyz.
- AKShare is used for daily prices.
- CNInfo is used for announcement and event collection.
- Stock data is cached locally in SQLite and refreshed on demand.
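To make the interaction between the local cache and STOCK_DATA_TTL_HOURS concrete, here is a minimal sketch of a TTL freshness check. It is not the repository's implementation. The function name `is_fresh` is hypothetical, and the sketch assumes timestamps are ISO 8601 strings like the `updated_at` value in the example overview response.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(updated_at: str, ttl_hours: int = 24,
             now: Optional[datetime] = None) -> bool:
    """Return True while a cached record is younger than its TTL."""
    # Replace a trailing "Z" with an explicit offset, because older Python
    # versions of datetime.fromisoformat do not accept "Z".
    stamp = datetime.fromisoformat(updated_at.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        # Assumption: timestamps without an offset are stored in UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return current - stamp < timedelta(hours=ttl_hours)
```

A record that fails this check would be a candidate for an on-demand refresh from the upstream source, while a fresh record can be served straight from SQLite.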
Client / Agent
  -> OmniSearch API (FastAPI)
      -> SearXNG
      -> requests + Trafilatura
      -> SQLite
      -> Tushare / CNInfo / AKShare
curl http://localhost:8000/health
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{"query":"fastapi searxng","top_k":5}'
curl -X POST http://localhost:8000/extract \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com"}'
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query":"000001 业绩","top_k":2}'
/research keeps the existing search + extract flow. If the query contains an A-share ticker such as 000001, the response also includes stock_context when local stock data can be collected.
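The ticker detection described for /research could be approximated as below. This is a sketch under stated assumptions, not the planner this repository uses: the function names are hypothetical, a ticker is taken to be any standalone run of exactly six digits, and the exchange suffix follows the common convention that codes starting with 6 trade in Shanghai (.SH) and codes starting with 0 or 3 trade in Shenzhen (.SZ).

```python
import re
from typing import Optional

# Exactly six digits that are not part of a longer digit run.
_TICKER_RE = re.compile(r"(?<!\d)(\d{6})(?!\d)")


def find_ticker(query: str) -> Optional[str]:
    """Return the first six-digit code in the query, or None."""
    match = _TICKER_RE.search(query)
    return match.group(1) if match else None


def with_exchange_suffix(code: str) -> Optional[str]:
    """Append an exchange suffix using the assumed prefix convention."""
    if code.startswith("6"):
        return f"{code}.SH"
    if code.startswith(("0", "3")):
        return f"{code}.SZ"
    # Other prefixes are left unresolved rather than guessed.
    return None
```

For the query in the curl example above, `find_ticker("000001 业绩")` yields `"000001"`, which `with_exchange_suffix` turns into `"000001.SZ"`, matching the ticker format in the example overview response.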
Recommended first call for agents:
curl "http://localhost:8000/company/000001/overview?refresh=true"
Example overview response shape:
{
  "ticker": "000001.SZ",
  "company": {
    "data": { "ticker": "000001.SZ", "name": "Ping An Bank", "source": "tushare" },
    "data_status": {
      "status": "fresh",
      "updated_at": "2026-03-17T00:00:00Z",
      "source": "tushare",
      "ttl_hours": 24,
      "cache_hit": true,
      "error_message": null
    }
  }
}
curl "http://localhost:8000/company/000001"
curl "http://localhost:8000/company/000001/events?limit=10"
curl "http://localhost:8000/company/000001/financials?limit=4"
curl "http://localhost:8000/company/000001/prices?limit=30"
curl "http://localhost:8000/company/000001/overview"
curl "http://localhost:8000/company/000001/timeline"
curl "http://localhost:8000/company/000001/risk-flags"
curl "http://localhost:8000/company/002837/prices?limit=5&debug=true"
curl "http://localhost:8000/company/002837/events?limit=10&refresh=true&debug=true"
python -m app.scripts.sync_stock --tickers 000001,002837 --refresh --price-limit 60 --event-limit 10
Incremental sync:
python -m app.scripts.sync_stock --tickers 000001,002837 --incremental --json-report out/sync-report.json
Dry run:
python -m app.scripts.sync_stock --tickers 000001 --dry-run --skip-overview --verbose
Or via make:
TICKERS=000001,002837 REFRESH=1 PRICE_LIMIT=60 EVENT_LIMIT=10 make sync
The sync command warms:
- company profile
- financial summaries
- daily prices
- recent events
- aggregate overview
The command also prints:
- source-level debug for prices
- source-level debug for events
- per-ticker failure summary
- This repo remains a tool layer, not a search engine.
- The stock layer is intentionally vertical and A-share focused.
- /company/{ticker}/overview is the main stock entry point.
- Current risk flags are heuristic and deterministic. There is no LLM summarization.
- timeline now emphasizes major financial updates and large price moves.
- risk-flags now surfaces missing data, drawdowns, volatility, and low-margin or negative-growth signals.
- Docker Compose mode keeps the local-first workflow intact.
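To illustrate what a heuristic, deterministic risk flag can look like, the sketch below computes a maximum drawdown over a series of daily closes and raises a flag when it exceeds a threshold. The function names, the flag label, and the 20% threshold are hypothetical and chosen only for illustration. The actual rules behind /company/{ticker}/risk-flags may differ.

```python
from typing import Optional, Sequence


def max_drawdown(closes: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak: Optional[float] = None
    worst = 0.0
    for close in closes:
        if peak is None or close > peak:
            # A new high resets the reference point for later declines.
            peak = close
        elif peak > 0:
            worst = max(worst, (peak - close) / peak)
    return worst


def drawdown_flag(closes: Sequence[float],
                  threshold: float = 0.2) -> Optional[str]:
    """Return a flag label when the drawdown reaches the assumed threshold."""
    if not closes:
        # Missing price data is reported as its own signal.
        return "missing_price_data"
    if max_drawdown(closes) >= threshold:
        return "large_drawdown"
    return None
```

Because the rule depends only on the input prices and a fixed threshold, the same price history always produces the same flag, which is what makes this style of signal deterministic.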