I noticed that the test sets differ between the single-sentence and cross-sentence settings.
(1) How did you determine the single-sentence instances? Did you select the instances containing only one "." as single-sentence? Could you provide that partial test set?
(2) Did you use the same training data for these two settings?
(3) How did you split the dataset into train/dev/test? Would you mind sharing the train/dev/test sets?
Thanks a lot!