Thank you for your excellent work and code. I am interested in using your pretrained models for research purposes.
While pretraining on my own EMG data, however, a few questions came up when I compared the public code against the paper: https://arxiv.org/pdf/2512.15729
-
Question about normalization
The paper describes how input normalization is handled during pretraining, but when I checked the public code, some parts seemed slightly different from how I understood it.
Could you please clarify what the exact normalization method used in the paper was?
In particular, I would like to confirm whether Min-Max normalization was actually used, or whether there were any additional preprocessing or normalization steps that I may have missed.
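For reference, this is the Min-Max scheme I am currently applying per window in my own preprocessing (my own sketch in plain Python, not code from your repository; the function name and the [-1, 1] target range are my assumptions):

```python
def min_max_normalize(window, lo=-1.0, hi=1.0):
    """Rescale one 1-D signal window to [lo, hi], per window."""
    w_min, w_max = min(window), max(window)
    if w_max == w_min:                 # flat window: avoid division by zero
        return [lo] * len(window)
    scale = (hi - lo) / (w_max - w_min)
    return [lo + (x - w_min) * scale for x in window]

print(min_max_normalize([0.0, 2.0, 4.0]))  # [-1.0, 0.0, 1.0]
```

If the paper instead normalizes per channel or over the whole recording rather than per window, that alone could explain the mismatch I am seeing.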
-
Question about RobustQuartile normalization
I also noticed that the codebase contains a RobustQuartile-type normalization.
Could you please clarify whether this was actually used for the TinyMyo pretraining experiments, or whether it was included only for other experiments, an earlier version, or as an alternative option?
In other words, I would like to know which normalization method should be considered the correct reference when reproducing the results reported in the paper.
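To make sure we are talking about the same thing: my reading of a "RobustQuartile" normalization is median-centering with interquartile-range scaling, roughly as below (my own sketch; the name and the exact quantile convention are assumptions, not taken from your code):

```python
import statistics

def robust_quartile_normalize(window):
    """Center by the median and scale by the interquartile range (IQR)."""
    q1, q2, q3 = statistics.quantiles(window, n=4)  # quartiles Q1, median, Q3
    iqr = q3 - q1
    if iqr == 0:                       # degenerate window: only center it
        return [x - q2 for x in window]
    return [(x - q2) / iqr for x in window]
```

Unlike Min-Max, this is insensitive to isolated amplitude spikes, which is why I wondered whether it was the scheme actually used for EMG.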
-
Question about the loss function
From the paper, I understood that both masked and unmasked reconstruction losses are used. However, in the public code, the loss used for optimization and the loss shown in training/validation logging seem potentially different, so I want to make sure I am interpreting this correctly.
When the paper reports the pretraining loss, does it refer to:
- the masked loss only, or
- the total loss consisting of the masked loss plus the weighted unmasked loss?
I would appreciate clarification on which definition is correct.
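To make the question concrete, this is the structure I inferred from the paper (a plain-Python sketch of my understanding; `unmasked_weight` is my placeholder name for whatever the actual hyperparameter is):

```python
def reconstruction_losses(pred, target, mask, unmasked_weight=0.1):
    """MSE split into masked and unmasked parts.

    `mask[i]` is True where the input token/patch was masked out
    before reconstruction. Returns (masked_loss, total_loss).
    """
    masked = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    unmasked = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if not m]
    masked_loss = sum(masked) / max(len(masked), 1)
    unmasked_loss = sum(unmasked) / max(len(unmasked), 1)
    total_loss = masked_loss + unmasked_weight * unmasked_loss
    return masked_loss, total_loss
```

My question is essentially whether the curves in the paper correspond to the first or the second return value, and which one drives the optimizer.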
-
Question about faithful reproduction of the paper
If I want to reproduce the paper as faithfully as possible using the public GitHub code, are there any parts among the following that must be modified separately to match the exact paper setting?
- normalization
- masking strategy
- reconstruction loss
- logging or model selection criterion
It is of course possible that I have misunderstood parts of the paper, so I would greatly appreciate a brief summary of the final pretraining setting actually used in the paper's experiments.