
Formalize the Vortex type system #29

Open
connortsui20 wants to merge 16 commits into develop from ct/types


connortsui20 commented Mar 6, 2026

Rendered

I wanted to write this for two reasons, the first being that we do not have a formalized definition of the Vortex type system. Note that I'm not saying we don't understand how it works (I think all of us understand it intuitively), but I thought it would be good to map it to actual theory.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

asubiotto left a comment


Thank you for writing this RFC. It's very useful to formalize the type system a little, to help motivate changes to it and guide its design.

I think that it would also be useful to spell out the motivation for the existence of separate concepts in the type system in order to inform the decision framework. These are things that we can probably internally/intuitively articulate but again I think it's helpful to spell it out. Specifically:

  1. Why do we define DTypes as logical types separately from physical encodings?
  2. Why do we have the concept of canonical physical representations? What's the goal?
  3. What is the goal of extension types? How are they different from first-class dtypes?

Other than that, I think I mostly agree with the RFC. The conclusion I take away from the FixedSizeBinary (FSB) discussion is that FSB should be one of the possible canonicalization targets of the Binary DType. Similarly, FixedSizeList should not be its own DType and should instead be another canonicalization target of the List DType.

One other thing I'm curious about which might be good to add to the RFC is "what amount of gating is required for a data type to be considered an extension type rather than a first-class dtype". Every type could essentially be sugar on a bytes type.


connortsui20 commented Mar 9, 2026

Thoughts on me splitting this RFC into 2 RFCs? The first can just be the formalization and the second can be the other proposal.

Edit: I am going to split this RFC.

connortsui20 changed the title from "Formalize the Vortex type system + add CanonicalTarget" to "Formalize the Vortex type system" on Mar 9, 2026
connortsui20 commented

After some offline discussion, I'm going to pull the second part out of this RFC entirely, since we need to better understand how execute should work before we think about execution targets.

connortsui20 commented

@asubiotto Note that this RFC doesn't claim that FixedSizeList shouldn't be a dtype, nor that FixedSizeBinary should be. It only provides a framework for thinking about these questions.
