
VowelNet

Copyright (C) 2025 ETH Zurich, Switzerland. SPDX-License-Identifier: Apache-2.0. See LICENSE file for details. Author: Thorir Mar Ingolfsson.

About

VowelNet is the public training and evaluation code accompanying:

Thorir Mar Ingolfsson, Victor Kartsch, Luca Benini, and Andrea Cossettini,
"A Wearable Ultra-Low-Power System for EEG-based Speech-Imagery Interfaces",
IEEE Transactions on Biomedical Circuits and Systems, 2025.

This repository focuses on the offline software pipeline used in the paper: preprocessing private BioGAP EEG recordings into segmented NumPy tensors, training the VowelNet models, and reproducing the non-deployment experiments reported in the publication.

Features

  • Utilities for converting private raw recordings into segmented .npy datasets
  • Per-session, multi-session, continual-learning, channel-ablation, confusion-matrix, and LOSO experiment scripts
  • Paper-oriented orchestration, best-cut selection, result summarization, and plotting helpers

Installation

We recommend using a virtual environment and installing the Python dependencies from requirements.txt.

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Dataset Preparation

This repository does not include EEG data. The code expects private processed data to be stored as NumPy arrays with the following structure:

<data-root>/
  <subject>/
    0_data.npy
    0_labels.npy
    1_data.npy
    1_labels.npy
    ...

Each *_data.npy file is expected to have shape (channels, time_samples, trials). Each *_labels.npy file is expected to have shape (trials,).
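As a sketch, a session could be loaded and shape-checked like this (a hypothetical helper written against the layout above; the repository's own loaders may be organized differently):

```python
from pathlib import Path

import numpy as np


def load_session(data_root, subject, session):
    """Load one session's data/label pair and sanity-check the expected shapes."""
    subj_dir = Path(data_root) / subject
    data = np.load(subj_dir / f"{session}_data.npy")      # (channels, time_samples, trials)
    labels = np.load(subj_dir / f"{session}_labels.npy")  # (trials,)
    assert data.ndim == 3, "expected (channels, time_samples, trials)"
    assert labels.ndim == 1, "expected (trials,)"
    assert data.shape[2] == labels.shape[0], "trial counts must match"
    return data, labels
```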

Acquisition Setup

The journal experiments use EEG acquired at 500 Hz with a BioGAP-equipped headband and eight fully dry electrode channels.

The paper states that the electrodes are positioned approximately at the following sites of the 10-10 reference system:

  • T7
  • TP7
  • P7
  • O1
  • O2
  • P8
  • TP8
  • T8

The public code assumes this eight-channel order for the input tensors:

[T7, TP7, P7, O1, O2, P8, TP8, T8]

This is also the order used by the channel-ablation experiments:

  • temporal subset: T7, TP7, TP8, T8
  • occipital subset: P7, O1, O2, P8
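Given the fixed channel order above, a subset can be selected by name rather than by hard-coded indices. A minimal sketch (the `select_channels` helper is illustrative, not part of the repository's API):

```python
import numpy as np

# Channel order assumed by the public code (10-10 positions).
CHANNELS = ["T7", "TP7", "P7", "O1", "O2", "P8", "TP8", "T8"]

# Subsets used by the channel-ablation experiments.
TEMPORAL = ["T7", "TP7", "TP8", "T8"]
OCCIPITAL = ["P7", "O1", "O2", "P8"]


def select_channels(data, subset):
    """Slice a (channels, time_samples, trials) tensor down to a named subset."""
    idx = [CHANNELS.index(name) for name in subset]
    return data[idx]
```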

Trimmed windows used in the paper are generated into a parallel directory:

<cut-root>/
  <subject>/
    start/
      0_data_05.npy
      ...
    end/
      0_data_05.npy
      ...

The public code uses the trigger mapping from the paper:

  • 50: /a/
  • 60: /e/
  • 70: /i/
  • 80: /o/
  • 90: /u/
  • 100: up
  • 51: down
  • 52: forward
  • 53: back
  • 54: left
  • 55: right
  • 56: center
  • 110: rest

To generate the private processed inputs from the original recordings:

python data_processing/convert_data.py \
  --raw-dir /path/to/private/raw/SubjectA \
  --csv-dir /path/to/private/csv/SubjectA \
  --processed-dir /path/to/private/data/SubjectA

To generate the trimmed windows used by the trimming experiments:

python data_processing/make_cut_data.py \
  --input-dir /path/to/private/data/SubjectA \
  --output-dir /path/to/private/data_cut/SubjectA

How To Run

Run per-session LOOCV:

python training/per_session.py \
  --data-root /path/to/private/data \
  --subject SubjectA \
  --task-group vowel-binary

Run progressive multi-session LOOCV:

python training/multi_session.py \
  --data-root /path/to/private/data \
  --subject SubjectA \
  --task-group all-multiclass

Run continual-learning experiments:

python training/continual_learning.py \
  --data-root /path/to/private/data \
  --subject SubjectA \
  --task all \
  --cl-method baseline

Run the temporal-vs-occipital channel ablation:

python training/channel_ablation.py \
  --data-root /path/to/private/data \
  --cut-root /path/to/private/data_cut \
  --subject SubjectA \
  --reference-results results/SubjectA/multi_session.csv \
  --prefer-trimmed

Generate the 13-class confusion matrix:

python training/confusion_matrix.py \
  --data-root /path/to/private/data \
  --cut-root /path/to/private/data_cut \
  --subject SubjectA \
  --task all \
  --reference-results results/SubjectA/multi_session.csv \
  --prefer-trimmed

Build the continual-learning comparison used in Fig. 9:

python training/continual_learning_comparison.py \
  --subject SubjectA \
  --results-root results

Run leave-one-subject-out evaluation:

python training/loso_train.py \
  --data-dir /path/to/private/data \
  --dataset-type all \
  --classes 13

Run the full offline paper pipeline:

python training/run_paper_experiments.py \
  --data-root /path/to/private/data \
  --cut-root /path/to/private/data_cut \
  --subjects Subject1 Subject2 Subject3 Subject4

Select best cuts from a finished result file:

python analysis/select_best_cuts.py \
  --input results/SubjectA/multi_session.csv \
  --output results/SubjectA/best_cuts.csv \
  --prefer-trimmed

Create summary CSVs and plots for the paper:

python analysis/summarize_results.py --results-root results
python analysis/plot_paper_figures.py --summaries-dir results/summaries

Paper Coverage

The public code covers the non-deployment parts of the paper (the private EEG data itself is not included):

  • Table II signal-trimming comparisons from Session 1
  • Fig. 3 per-session vowel experiments
  • Fig. 4 multi-session vowel experiments with trimming
  • Fig. 5 multi-session HMI experiments with trimming
  • Fig. 6 13-class per-session and multi-session experiments
  • Fig. 7 13-class confusion matrix generation
  • Fig. 8 temporal-vs-occipital channel ablation
  • Fig. 9 continual-learning comparison (No CL (a), CL, LwF, Experience Replay, No CL (b))
  • LOSO adaptation sweeps for the 13-class task

Repository Structure

VowelNet/
├── analysis
├── data_processing
├── training
├── utils
├── README.md
├── LICENSE
├── CITATION.cff
└── requirements.txt

Notes

  • Some experiments require pre-generated trimmed windows produced with data_processing/make_cut_data.py.
  • Channel-ablation and confusion-matrix runs that reuse the paper's best trimmed configurations require both the trimmed data directory and the corresponding reference multi-session results CSV.

Citation

If you use this code, please cite the associated journal article and the software release metadata in CITATION.cff.

License

All source code in this repository is released under the Apache 2.0 license. See the LICENSE file for details.
