My issue is about: differences in the behavior of the add_task and add_transformer methods implemented in the progressive_learner class. Calling add_task and add_transformer for different tasks leads to drastically different BTE behavior, even though add_task just calls add_transformer.

Figure from @srahul1222.