Formalize the Vortex type system by connortsui20 · Pull Request #29 · vortex-data/rfcs

connortsui20 · 2026-03-06T22:19:18Z

I wanted to write this for 2 reasons, the first being that we do not have a formalized definition of the Vortex type system. Note that I'm not saying we don't understand how it works (I think all of us intuitively understand it), but I thought it would be good to map it to actual theory.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

asubiotto

Thank you for writing this RFC. Formalizing the type system a little is very useful for motivating changes to its design.

I think that it would also be useful to spell out the motivation for the existence of separate concepts in the type system in order to inform the decision framework. These are things that we can probably internally/intuitively articulate but again I think it's helpful to spell it out. Specifically:

Why do we define DTypes as logical types separately from physical encodings?
Why do we have the concept of canonical physical representations? What's the goal?
What is the goal of extension types? How are they different from first-class dtypes?

Other than that, I think I mostly agree with the RFC. The conclusion I take away from the FSB (FixedSizeBinary) discussion is that FSB should be one of the possible canonicalization targets of the Binary DType. Similarly, FixedSizeList should not be its own DType, but rather another canonicalization target of the List DType.

One other thing I'm curious about which might be good to add to the RFC is "what amount of gating is required for a data type to be considered an extension type rather than a first-class dtype". Every type could essentially be sugar on a bytes type.

proposed/0029-types.md

connortsui20 · 2026-03-09T13:34:54Z

Thoughts on me splitting this RFC into 2 RFCs? The first can just be the formalization and the second can be the other proposal.

Edit: I am going to split this RFC.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

connortsui20 · 2026-03-09T16:21:17Z

After some offline discussion, I'm going to completely pull out the second part of this RFC, as we need to better understand how execute should work before we think about execution targets.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

connortsui20 · 2026-03-09T17:09:17Z

@asubiotto Note that this RFC doesn't make any claims that FixedSizeList shouldn't be a dtype, nor that FixedSizeBinary should be. It only provides a framework in which we can think about these things.

connortsui20 added 11 commits March 6, 2026 11:50

first commit

b201866

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

add motivation section

b4bfc26

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

add some basic type theory background

a6f20c3

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

add sections and canonical description

6da7707

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

add section on confluence

f2a797a

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

rename

2b5ccfe

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

list vs list view

7df2097

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

add design section

21dd7ce

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

almost done

ab1da6e

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

first draft done

453fe14

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

fix

74a0ac9

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

connortsui20 force-pushed the ct/types branch from 25d9ccc to 74a0ac9 March 6, 2026 22:20

connortsui20 requested review from gatesn and joseph-isaacs March 6, 2026 22:24

asubiotto reviewed Mar 7, 2026
