Skip to content

Trim console lines that are longer than limit#2466

Draft
trancexpress wants to merge 1 commit intoeclipse-platform:masterfrom
trancexpress:gh2465
Draft

Trim console lines that are longer than limit#2466
trancexpress wants to merge 1 commit intoeclipse-platform:masterfrom
trancexpress:gh2465

Conversation

@trancexpress
Copy link
Contributor

Fixes: #2465

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR attempts to fix issue #2465, where the console character limit doesn't apply to lines that are longer than the limit. The issue occurs when a single line without newlines exceeds the console's high water mark - the existing trimming logic tries to trim to the start of the line (offset 0), resulting in no content being removed.

Changes:

  • Modified the trim() method in IOConsolePartitioner to handle the case where a single line exceeds the character limit
Comments suppressed due to low confidence (1)

debug/org.eclipse.ui.console/src/org/eclipse/ui/internal/console/IOConsolePartitioner.java:1207

  • This fix lacks test coverage for the specific issue it's attempting to address. The existing testTrim() method only tests trimming with multiple lines (each line is "0123456789\n"). There should be a test that reproduces the scenario from issue #2465: a single very long line (e.g., 100,000 characters without newlines) with the console limit set to a lower value (e.g., 80,000 characters), to verify that the console properly trims the content even when it's all on one line.
						// deal with case of one long line
						cutOffset = Math.max(cutOffset, truncateOffset);
						cutOffset = document.getLineOffset(cutoffLine);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

debug/org.eclipse.ui.console/src/org/eclipse/ui/internal/console/IOConsolePartitioner.java:1206

  • The comment "deal with case of one long line" is misleading. This logic applies to any line that is longer than the trim threshold, not just when there is only one line in the document. Consider updating the comment to clarify that this handles cases where individual lines exceed the buffer limit, for example: "trim within line if line is longer than buffer limit" or "ensure trimming happens even when line start would keep too much data".
						// deal with case of one long line

debug/org.eclipse.ui.console/src/org/eclipse/ui/internal/console/IOConsolePartitioner.java:1207

  • The fix addresses a specific edge case (very long single lines exceeding the console limit), but there is no test coverage added for this scenario. Consider adding a test similar to testTrim() but with a single very long line (e.g., 100,000 characters without newlines) to verify that trimming works correctly when the high water mark is exceeded. This would prevent regression of the bug described in issue #2465.
						// deal with case of one long line
						cutOffset = Math.max(cutOffset, truncateOffset);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@trancexpress
Copy link
Contributor Author

Test fails to look into:


org.eclipse.debug.tests.console.IOConsoleTests.testTrim -- Time elapsed: 0.164 s <<< FAILURE!
org.opentest4j.AssertionFailedError: 
Expected string not found in console document. ==> expected: <0123456789> but was: <
012345678>
	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
	at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
	at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
	at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1156)
	at org.eclipse.debug.tests.console.IOConsoleTestUtil.verifyContentByOffset(IOConsoleTestUtil.java:572)
	at org.eclipse.debug.tests.console.IOConsoleTests.testTrim(IOConsoleTests.java:718)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
 org.eclipse.debug.tests.console.IOConsoleFixedWidthTests.testTrim -- Time elapsed: 0.234 s <<< FAILURE!
org.opentest4j.AssertionFailedError: 
Expected string not found in console document. ==> expected: <0123456789> but was: <
012345678>
	at org.eclipse.debug.tests.console.IOConsoleTestUtil.verifyContentByOffset(IOConsoleTestUtil.java:572)
	at org.eclipse.debug.tests.console.IOConsoleTests.testTrim(IOConsoleTests.java:718)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
 org.eclipse.debug.tests.console.ProcessConsoleManagerTests.testProcessConsoleLifecycle(TestInfo) -- Time elapsed: 0.099 s <<< FAILURE!
java.lang.AssertionError: 
[console has been added] 
Expected size: 1 but was: 3 in:
[org.eclipse.ui.console.IOConsole@128d3d86,
    org.eclipse.ui.console.IOConsole@2ae8a311,
    org.eclipse.debug.internal.ui.views.console.ProcessConsole@1ed3b30]
	at org.eclipse.debug.tests.console.ProcessConsoleManagerTests.testProcessConsoleLifecycle(ProcessConsoleManagerTests.java:88)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)

@github-actions
Copy link
Contributor

github-actions bot commented Feb 24, 2026

Test Results

 1 980 files  ±0   1 980 suites  ±0   1h 34m 11s ⏱️ +29s
 4 746 tests ±0   4 722 ✅ ±0   24 💤 ±0  0 ❌ ±0 
14 238 runs  ±0  14 056 ✅ ±0  182 💤 ±0  0 ❌ ±0 

Results for commit 7be84d8. ± Comparison against base commit 2f050bb.

♻️ This comment has been updated with latest results.

@trancexpress
Copy link
Contributor Author

trancexpress commented Feb 24, 2026

The fail in IOConsoleTests.testTrim is demonstrated by this snippet:

	public static void main(String[] args) {
		String text =
				"""
				first
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				0123456789
				last
				""";
		System.out.println(text.substring(text.length() - 50));
	}

truncateOffset passed to IOConsolePartitioner.trim() is 181 = 231 - 50, 231 being document length and 50 being lowWaterMark. Cutting at this results in:


0123456789
0123456789
0123456789
0123456789
last

(i.e. the output of the snippet above)

While the test expects:

0123456789
0123456789
0123456789
0123456789
0123456789
last

The expectations are due to the code we need to change:

	private void trim(int truncateOffset, boolean truncateToOffsetLineStart) {
		if (document != null) {
			{
				try {
					int length = document.getLength();
					int cutOffset = truncateOffset;
					if (truncateToOffsetLineStart) {
						int cutoffLine = document.getLineOfOffset(truncateOffset);
						cutOffset = document.getLineOffset(cutoffLine);

If truncateToOffsetLineStart is set (and it is in this case), the code will find the line at which truncateOffset is. Then, it will include this entire line in the resulting document. Which is why long lines remain uncut.

IMO the test expectations should be adjusted. @iloveeclipse WDYT?

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

debug/org.eclipse.debug.tests/src/org/eclipse/debug/tests/console/IOConsoleTests.java:719

  • Changing the expected offset to 1 indicates trimming now leaves a leading newline (i.e., trims at an arbitrary offset instead of the start of a line). That seems like a regression from the original “trim to line start” behavior and doesn’t directly validate the long-line limit fix from #2465. Consider keeping the expectation at offset 0 for the normal multi-line case, and add a dedicated assertion/test scenario for a single line longer than the high-water mark (no newline) to ensure the buffer actually shrinks when truncateToOffsetLineStart would otherwise result in cutOffset == 0.
				c.getConsole().setWaterMarks(50, 100);
				c.waitForScheduledJobs();
				c.verifyContentByOffset("0123456789", 1);
				assertTrue(c.getDocument().getNumberOfLines() < 15, "Document not trimmed.");

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@iloveeclipse
Copy link
Member

IMO the test expectations should be adjusted

Yes.

@eclipse-platform-bot
Copy link
Contributor

This pull request changes some projects for the first time in this development cycle.
Therefore the following files need a version increment:

debug/org.eclipse.debug.tests/META-INF/MANIFEST.MF
debug/org.eclipse.ui.console/META-INF/MANIFEST.MF

An additional commit containing all the necessary changes was pushed to the top of this PR's branch. To obtain these changes (for example if you want to push more changes) either fetch from your fork or apply the git patch.

Git patch
From 4a05f884965144009670b365292e1ae026e75ff3 Mon Sep 17 00:00:00 2001
From: Eclipse Platform Bot <platform-bot@eclipse.org>
Date: Sat, 28 Feb 2026 06:42:13 +0000
Subject: [PATCH] Version bump(s) for 4.40 stream


diff --git a/debug/org.eclipse.debug.tests/META-INF/MANIFEST.MF b/debug/org.eclipse.debug.tests/META-INF/MANIFEST.MF
index 9423ff5b0a..aa3897ece1 100644
--- a/debug/org.eclipse.debug.tests/META-INF/MANIFEST.MF
+++ b/debug/org.eclipse.debug.tests/META-INF/MANIFEST.MF
@@ -2,7 +2,7 @@ Manifest-Version: 1.0
 Bundle-ManifestVersion: 2
 Bundle-Name: %pluginName
 Bundle-SymbolicName: org.eclipse.debug.tests;singleton:=true
-Bundle-Version: 3.15.300.qualifier
+Bundle-Version: 3.15.400.qualifier
 Bundle-Localization: plugin
 Require-Bundle: org.eclipse.ui;bundle-version="[3.6.0,4.0.0)",
  org.eclipse.core.runtime;bundle-version="[3.29.0,4.0.0)",
diff --git a/debug/org.eclipse.ui.console/META-INF/MANIFEST.MF b/debug/org.eclipse.ui.console/META-INF/MANIFEST.MF
index d66f083c3a..c63a3299d4 100644
--- a/debug/org.eclipse.ui.console/META-INF/MANIFEST.MF
+++ b/debug/org.eclipse.ui.console/META-INF/MANIFEST.MF
@@ -2,7 +2,7 @@ Manifest-Version: 1.0
 Bundle-ManifestVersion: 2
 Bundle-Name: %pluginName
 Bundle-SymbolicName: org.eclipse.ui.console; singleton:=true
-Bundle-Version: 3.16.0.qualifier
+Bundle-Version: 3.16.100.qualifier
 Bundle-Activator: org.eclipse.ui.console.ConsolePlugin
 Bundle-Vendor: %providerName
 Bundle-Localization: plugin
-- 
2.53.0

Further information are available in Common Build Issues - Missing version increments.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

debug/org.eclipse.debug.tests/src/org/eclipse/debug/tests/console/IOConsoleTests.java:718

  • The updated expectation (verifyContentByOffset(..., 1)) suggests trimming may now leave an extra leading character (often a lone \n from splitting a CRLF delimiter) rather than trimming cleanly at a line boundary. It would be better for the implementation to avoid producing a leading empty line / partial delimiter and keep the test asserting content begins at offset 0; also consider adding a dedicated test that reproduces the reported bug (single output line longer than the high-water mark) so the fix is validated directly.
				c.verifyContentByOffset("0123456789", 1);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@merks merks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic could/would arbitrarily split a Unicode surrogate pair, wouldn't it? The copilot seems to suggest that even a \r\n might be strange split...

@iloveeclipse
Copy link
Member

I think this logic could/would arbitrarily split a Unicode surrogate pair, wouldn't it?

No idea, but an extra test wouldn't hurt, also for \r\n.

@trancexpress
Copy link
Contributor Author

I think this logic could/would arbitrarily split a Unicode surrogate pair, wouldn't it? The copilot seems to suggest that even a \r\n might be strange split...

I assume it will, yes.

Considering the split is at the start of the console contents (oldest output), is that a concern?

Even normal text in the oldest line will likely lose meaning when we cut it off.

I assume the limit is meant to keep the latest important console output and drop the rest.

@iloveeclipse
Copy link
Member

I assume the limit is meant to keep the latest important console output and drop the rest.

I wonder if current code "properly" deals with Unicode surrogate pairs and \r\n 's ? If not, we can ignore that too here.

@merks
Copy link
Contributor

merks commented Feb 28, 2026

I'm not so much concerned about the meaning of the text; obviously arbitrarily clipped text loses contextual meaning. It's the rendering that's a problem. A 1/2 surrogate is invalid and cannot be properly rendered. Yes, it's a corner case, so you might not care, but many languages (e.g., Chinese and Japanese) heavily use the surrogate pairs and they will likely notice such corner cases frequently whereas we might only notice if an emoji is corrupted.

Anyway I didn't create a blocking review. Use your own good judgement. You might use java.lang.Character.isHighSurrogate(char) to move the cutoff point to the right.

What's actually sort of worse is that folks are very likely to expect that the cutoff number is the number of characters and if you only count unicode UTF-8 characters where a pair often represent a codepoint which is a character the user actually sees, the user might be surprised that the number does not represent visible characters. Again these corners you might ignore and are easier to ignore when your language has a simpler alphabet...

@trancexpress
Copy link
Contributor Author

I wonder if current code "properly" deals with Unicode surrogate pairs and \r\n 's ? If not, we can ignore that too here.

Currently the code doesnt trim within a line. So I dont think there are problems with Windows line ends on master. Or composed symbols.

@trancexpress
Copy link
Contributor Author

Anyway I didn't create a blocking review. Use your own good judgement. You might use java.lang.Character.isHighSurrogate(char) to move the cutoff point to the right.

I can try this tomorrow, it might not make things that more complicated.

What's actually sort of worse is that folks are very likely to expect that the cutoff number is the number of characters and if you only count unicode UTF-8 characters where a pair often represent a codepoint which is a character the user actually sees, the user might be surprised that the number does not represent visible characters. Again these corners you might ignore and are easier to ignore when your language has a simpler alphabet...

The code on master already counts UTF-8 characters and truncates accordingly, I think? I can check with Japanese, no problem.

@trancexpress
Copy link
Contributor Author

What's actually sort of worse is that folks are very likely to expect that the cutoff number is the number of characters and if you only count unicode UTF-8 characters where a pair often represent a codepoint which is a character the user actually sees, the user might be surprised that the number does not represent visible characters.

See e.g.:

    public static void main(String[] args) {
//        String t = "😀😀😀😀😀";
        String t = "1212121212";
        int M = 100_000;
        for (int i = 0; i < M; ++i) {
            System.out.println(t);
        }
    }

On master, with a console limit of 80000, both will result in the same number of lines in the console 7273 lines. If each smiley would be a separate symbol for the user, the user would be surprised to see the half number of symbols.

So IMO if this is a concern, best to open a new ticket.

@trancexpress
Copy link
Contributor Author

trancexpress commented Mar 1, 2026

Anyway I didn't create a blocking review. Use your own good judgement. You might use java.lang.Character.isHighSurrogate(char) to move the cutoff point to the right.

This seems to be a big problem actually.

I have this snippet:

    public static void main(String[] args) {
        String s = "1😀".repeat(10000);
        System.out.println(s);
    }

I lower the console limit to 1000 (the lowest possible value).

With the change here, I see no symbols in the console. Splitting the smiley somehow breaks the output, the console becomes empty. Maybe the underlying widget text support on Linux breaks...

Not only that, but for the smiley char we need to truncate at, Character.isHighSurrogate reports false(i.e. for the 2nd char of the smiley)...

This particular smiley is composed of char with hex values 0xD83D, 0xDE00. isHighSurrogate() checks for interval [0xD800, 0xDBFF].

So apparently we must check not only the character at which we want to truncate, but also the one before (if any)... I think if either return Character.isHighSurrogate(), we should give up and not trim the line... Since otherwise we would need to scan potentially the entire line for a safe spot to cut, if e.g. the entire line consists of smileys...

And this is complicated enough to need a test...

For the \r\n case I'm not sure, since the code hear deals with trimming inside a line. I guess if the Windows newline is at the very limit, we could just off everything up until and including \r. Maybe we can add this case to the check for isHighSurrogate(), since we will already be checking the previous character...

@trancexpress trancexpress marked this pull request as draft March 1, 2026 10:46
@merks
Copy link
Contributor

merks commented Mar 1, 2026

Yes, it's kind of horrible. Java characters are a UTF-16 encoding of Unicode which is a 32 bit encoding that in java is a codepoint,i.e., an int; the UTF-8 encoding is bytes. If a Java character is a isLowSurrogate, you should truncate before that character, i.e., the character should not be at the end of the string. If it is a isHighSurrogate, you should truncate after that character, i.e., the character should not be at the start of the string.. Keep in mind that these things come in pairs so it's mostly a matter increasing or decreasing the cutoff point by 1 to not be between the two characters of the pair. And yes, one concern is that if the encoding is invalid that some tools or views might have a problems processing the content, certainly all will have problems rendering the content given it's not a meaningful unicode codepoint.

Anyway, thanks for investigating! 🏆

@trancexpress
Copy link
Contributor Author

Ah, I see. I'll try using isLowSurrogate then... I didn't realize there are two methods for this, I'm not familiar with UTF-16 encoding in UTF-8.

@trancexpress
Copy link
Contributor Author

Ah, I see. I'll try using isLowSurrogate then... I didn't realize there are two methods for this, I'm not familiar with UTF-16 encoding in UTF-8.

Seems to work as expected. I still need to add tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Console character limit doesn't apply to long lines

5 participants